Bo Wu

Associate Professor, Computer Science

I am an Associate Professor in the Department of Computer Science at Colorado School of Mines. My research lies in the broad field of compilers and programming systems, with an emphasis on program optimizations for heterogeneous computing and emerging architectures. My current focus is on building efficient systems for machine learning and graph processing applications.

Before joining Mines in August 2014, I earned a Ph.D. in Computer Science from The College of William and Mary, where I worked with Professor Xipeng Shen. I received an M.S. in Computer Science and a B.S. in Mathematics, both from Central South University, Hunan, China.

Contact

CTLM 246J
303-384-2135
bwu@mines.edu
Personal web page

Education

  • B.S. Computational Science and Technology Central South University, Changsha, China 2005
  • M.S. Computer Science Central South University, Changsha, China 2008
  • Ph.D. Computer Science The College of William and Mary, Williamsburg, VA 2014

Awards

  • NSF SPX Award, 2018
  • NSF CAREER Award, 2018
  • Best paper nomination, PACT, 2017
  • NSF grant to support research on optimizing applications for heterogeneous memory architectures (PI), 2016-2019
  • Supercomputing best paper award, 2015
  • Sole-PI NSF grant to support research on GPU scheduling, 2015-2017
  • NVIDIA CUDA teaching center, 2014
  • Stephen K. Park award, 2013
  • Fellowship, IBM Center of Advanced Study, 2011-2013

Research

“My research interest lies in the broad field of compilers and programming systems, with an emphasis on program optimizations for heterogeneous computing and emerging architectures. Most of my research activities have centered around data locality enhancement for heterogeneous computing systems. My choice of this area of focus is driven by the importance of heterogeneous processors (e.g., CPU plus GPU) in meeting the needs of the large variety of modern applications.”

Recent Papers:

Publications
  • [NeurIPS’21] “NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM”, Connor Holmes, Minjia Zhang, Yuxiong He, and Bo Wu. Thirty-fifth Conference on Neural Information Processing Systems, 2021.
  • [PACT’21] “Dryadic: Flexible and Fast Graph Pattern Matching at Scale”, Daniel Mawhirter, Samuel Reinehr, Wei Han, Noah Fields, Miles Claver, Connor Holmes, Jedidiah McClurg, Tongping Liu, and Bo Wu. The 30th International Conference on Parallel Architectures and Compilation Techniques, 2021.
  • [OSR’21] “GraphZero: A High-Performance Subgraph Matching System”, Daniel Mawhirter, Sam Reinehr, Connor Holmes, Tongping Liu, and Bo Wu. ACM SIGOPS Operating Systems Review, Volume 55, Issue 1, 2021.
  • [TKDE’21] “Automatic Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures”, Feng Zhang, Jidong Zhai, Bo Wu, Bingsheng He, Wenguang Chen, and Xiaoyong Du. IEEE Transactions on Knowledge and Data Engineering, 2021.
  • [SOSP’19] “AutoMine: Harmonizing High-Level Abstraction and High Performance for Graph Mining”, Daniel Mawhirter and Bo Wu. ACM Symposium on Operating Systems Principles, Huntsville, Ontario, Canada, October, 2019. Acceptance ratio: 13.8% (38/276).
  • [LCPC’19] “FLARE: Flexibly Sharing Commodity GPUs to Enforce QoS and Improve Utilization”, Wei Han, Daniel Mawhirter, Lin Ma, Chen Tian, and Bo Wu. The 32nd Workshop on Languages and Compilers for Parallel Computing, Atlanta, October, 2019.
  • [EuroSys’19] “GRNN: Low-Latency and Scalable RNN Inference on GPUs”, Connor Holmes, Daniel Mawhirter, Yuxiong He, Feng Yan, and Bo Wu. European Conference on Computer Systems, Dresden, Germany, March, 2019. Acceptance ratio: 21.8% (45/206).
  • [ICS’19] “Laius: Towards Latency Awareness and Improved Utilization of Spatial Multitasking Accelerators in Datacenters”, Wei Zhang, Weihao Cui, Kaihua Fu, Quan Chen, Daniel Mawhirter, Bo Wu, Chao Li and Minyi Guo. International Conference on Supercomputing, Phoenix, Arizona, USA, June, 2019. Acceptance ratio: 23.3% (45/193).
  • [PACT’18] “GraphPhi: Efficient Parallel Graph Processing on Emerging Throughput-oriented Architectures”, Zhen Peng, Alexander Powell, Bo Wu, Tekin Bicer and Bin Ren. The 27th International Conference on Parallel Architectures and Compilation Techniques, Limassol, Cyprus, November, 2018. Acceptance ratio: 29% (36/126).
  • [CCGrid’18] “ApproxG: Fast Approximate Parallel Graphlet Counting Through Accuracy Control”, Daniel Mawhirter, Bo Wu, Dinesh Mehta and Chao Ai. The 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Washington DC, May, 2018. Acceptance ratio: 20.8% (52/250).
  • [FCS’18] “Resolving the GPU responsiveness dilemma through program transformations”, Qi Zhu, Bo Wu, Xipeng Shen, Kai Shen, Li Shen, Zhiying Wang, Frontiers of Computer Science, Springer, 2018, 12 (3): 545-559.
  • [ASPLOS’17] “FLEP: Enabling Flexible and Efficient Preemption on GPUs”, Bo Wu, Xu Liu, Xiaobo Zhou, and Changjun Jiang. The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Xi’an, China, April, 2017. Acceptance ratio: 17% (56/321).
  • [PACT’17] “Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU”, Wei Han, Daniel Mawhirter, Matthew Buland, and Bo Wu. The 26th International Conference on Parallel Architectures and Compilation Techniques, Portland, Oregon, Sep. 2017. Acceptance ratio: 23% (25/108). Nominated for best paper award.
  • [ICS’17] “ScalaFSM: Enabling Scalability-Sensitive Speculative Parallelization for FSM Computations”, Junqiao Qiu, Zhijia Zhao, Bo Wu, Abhinav Vishnu and Shuaiwen Leon Song. The International Conference on Supercomputing, Chicago, IL, June, 2017. Acceptance ratio: 16%.
  • [IPDPS’17] “Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems”, Qi Zhu, Bo Wu, Xipeng Shen, Li Shen and Zhiying Wang. The 31st IEEE International Parallel & Distributed Processing Symposium, Orlando, Florida, May, 2017. Acceptance ratio: 23%.
  • [CGO’17] “FinePar: Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures”, Feng Zhang, Bo Wu, Jidong Zhai, Bingsheng He, and Wenguang Chen. The International Symposium on Code Generation and Optimization, Austin, TX, Feb, 2017. Acceptance ratio: 22% (26/114).
  • [Book Chapter] “Data Placement on GPUs”, Xipeng Shen and Bo Wu. To appear as a chapter in “Advances in GPU Research and Practice”, by H. Sarbazi-Azad (editor), Elsevier, 2016.
  • [Book Chapter] “Software-Level Task Scheduling on GPUs”, Bo Wu and Xipeng Shen. To appear as a chapter in “Advances in GPU Research and Practice”, by H. Sarbazi-Azad (editor), Elsevier, 2016.
  • [TC’16] “Optimizing Data Placement on GPU Memory: A Portable Approach”, Guoyang Chen, Xipeng Shen, Bo Wu, and Dong Li. The IEEE Transactions on Computers, 2016. To appear.
  • [FCS’16] “Understanding Co-run Performance on CPU-GPU Integrated Processors: Observations, Insights, Directions”, Qi Zhu, Bo Wu, Kai Shen, and Xipeng Shen. Frontiers of Computer Science, Springer, 2016. To appear.
  • [TACO’16] “Examining and Reducing the Influence of Sampling Errors on Feedback-Driven Optimizations”, Mingzhou Zhou, Bo Wu, Xipeng Shen, Yaoqing Gao, and Graham Yiu. The ACM Transactions on Architecture and Code Optimization, 2016. To appear.
  • [SC’15] “ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs”, Xu Liu and Bo Wu. The International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, Nov, 2015. Acceptance ratio: 22%. Best paper award (1 out of 358 submissions).
  • [IEEE/Micro’15] “Enabling Portable Optimizations of Data Placement on GPU”, Guoyang Chen, Bo Wu, Dong Li and Xipeng Shen. July/August Issue, The Heterogeneous Computing special issue of IEEE Micro, 2015.
  • [HotOS’15] “Software Engagement with Sleeping CPUs”, Qi Zhu, Meng Zhu, Bo Wu, Xipeng Shen, Kai Shen and Zhiying Wang. The 15th Workshop on Hot Topics in Operating Systems, Kartause Ittingen, Switzerland, May, 2015. Acceptance ratio: 32% (29/90).
  • [ICS’15] “Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations”, Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen and Jeffrey Vetter. The 29th International Conference on Supercomputing, Newport Beach, CA, June, 2015. Acceptance ratio: 25%.
  • [MICRO’14] “PORPLE: An Extensible Optimizer for Portable Data Placement on GPU”, Guoyang Chen, Bo Wu, Dong Li and Xipeng Shen. The 47th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, UK, Dec, 2014. Acceptance ratio: 19% (53/273).
  • [LCPC’14] “Understanding Co-Run Degradations on Integrated Heterogeneous Processors”, Qi Zhu, Bo Wu, Xipeng Shen, Li Shen and Zhiying Wang. The 27th International Workshop on Languages and Compilers for Parallel Computing, Hillsboro, OR, Sep, 2014.
  • [PACT’14 poster] “SM-Centric Transformation: Circumventing Hardware Restrictions for Flexible GPU Scheduling”, Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey Vetter. The 23rd International Conference on Parallel Architectures and Compilation Techniques, Edmonton, Alberta, Canada, Aug. 2014.
  • [OOPSLA’14] “Call Sequence Prediction through Probabilistic Calling Automata”, Zhijia Zhao, Bo Wu, Mingzhou Zhou, Yufei Ding, Jianhua Sun, Xipeng Shen, and Youfeng Wu. ACM SIGPLAN conference on Systems, Programming, Languages and Applications, Portland, USA, 2014. Acceptance ratio: 28% (53/186).
  • [ASPLOS’14] “Challenging the “Embarrassingly Sequential”: Parallelizing Finite State Machine-Based Computations through Principled Speculation”, Zhijia Zhao, Bo Wu, Xipeng Shen, The Nineteenth International Conference on Architectural Support for Programming Languages and Operating Systems, Salt Lake City, Utah, Mar, 2014. Acceptance ratio: 23% (49/217).
  • [PACT’13] “Exploring Hybrid Memory for GPU Energy Efficiency through Software-Hardware Co-Design”, Bin Wang, Bo Wu, Dong Li, Xipeng Shen, Weikuan Yu, Yizheng Jiao, Jeffrey Vetter, The 22nd International Conference on Parallel Architectures and Compilation Techniques, Edinburgh, Scotland, Sep, 2013. Acceptance ratio: 17% (36/208).
  • [MSPC’13 poster] “Software-level Scheduling to Exploit Non-uniformly Shared Data Cache”, Bo Wu, Weilin Wang, Xipeng Shen, ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, Seattle, USA, June, 2013. Two-page position paper.
  • [ECOOP’13] “Simple Profile Rectifications Go A Long Way: Demystifying the Influence of Sampling Errors on Feedback Driven Program Optimizations”, Bo Wu, Mingzhou Zhou, Xipeng Shen, Yaoqing Gao, Raul Silvera, Graham Yiu, European Conference on Object-oriented Programming, Montpellier, France, July, 2013. Acceptance ratio: 25%.
  • [PPoPP’13] “Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced GPU Memory Accesses”, Bo Wu, Zhijia Zhao, Eddy Zhang, Yunlian Jiang, Xipeng Shen, 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Shenzhen, China, Feb, 2013. Acceptance ratio: 18%.
  • [CGO’13] “ProfMig: The First Framework for Migrating Program Profiles Across Software Versions”, Mingzhou Zhou, Bo Wu, Yufei Ding, Xipeng Shen, International Symposium on Code Generation and Optimization, Shenzhen, China, Feb, 2013. Acceptance ratio: 28%.
  • [OOPSLA’12] “Exploiting Inter-Sequence Correlations for Program Behavior Prediction”, Bo Wu, Zhijia Zhao, Xipeng Shen, Yunlian Jiang, Yaoqing Gao, Raul Silvera, The 27th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications, Tucson, Arizona, USA, Oct, 2012. Acceptance ratio: 25%.
  • [PACT’12 poster] “Speculative Parallelization Needs Rigor: Probabilistic Analysis for Optimal Speculation of Finite State Machine Applications”, Zhijia Zhao, Bo Wu, Xipeng Shen, The Twenty-first International Conference on Parallel Architectures and Compilation Techniques, two-page poster paper, Minneapolis, MN, USA, Sep, 2012.
  • [ICS’12] “One Stone Two Birds: Synchronization Relaxation and Redundancy Removal in GPU-CPU Translation”, Ziyu Guo, Bo Wu, and Xipeng Shen, ACM International Conference on Supercomputing, Venice, Italy, 2012. Acceptance ratio: 22%.
  • [PACT’11] “Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control”, Bo Wu, Eddy Zhang, Xipeng Shen, The Twentieth International Conference on Parallel Architectures and Compilation Techniques, Galveston Island, Texas, USA, Oct, 2011. Acceptance ratio: 16% (36/221).
  • [PACT’11 SRC] “Probabilistic Models towards Optimal Speculation of DFA Applications”, Zhijia Zhao and Bo Wu, PACT 2011 ACM Student Research Competition, Galveston Island, Texas, USA, Oct, 2011. (Second place among 29 submissions.)

PROFESSIONAL ACTIVITIES

Conferences and workshops:

    • PPoPP’20: ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (Program Committee)
    • SC’19: The International Conference for High Performance Computing, Networking, Storage and Analysis (Program Committee)
    • ICS’19: The International Conference on Supercomputing (Program Committee)
    • ASPLOS’18: The 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Submission Chair)
    • HIPS’17: The 22nd International Workshop on High-Level Parallel Programming Models and Supportive Environments (Program Co-Chair)
    • ICS’17: The International Conference on Supercomputing (External Review Committee)
    • ICPADS’16: The 21st IEEE International Conference on Parallel and Distributed Systems (Program Vice Chair, Track Co-Chair)
    • PLDI’16: The 37th annual ACM SIGPLAN conference on Programming Language Design and Implementation (External Review Committee)
    • NAS’16: The 11th IEEE International Conference on Networking, Architecture and Storage (Publication Chair)
    • IPDRM’16: First Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (Program Committee)
    • HIPS’16: The 21st International Workshop on High-Level Parallel Programming Models and Supportive Environments (Program Committee)
    • CLOUD’15: The 8th IEEE International Conference on Cloud Computing (Program Committee)
    • LCPC’15: The 28th International Workshop on Languages and Compilers for Parallel Computing (Program Committee)
    • APPT’15: The 11th International Conference on Advanced Parallel Processing Technology (Program Committee)
    • IPDPS’15: IEEE International Parallel & Distributed Processing Symposium (Program Committee)
    • OOPSLA’13: ACM SIGPLAN conference on Systems, Programming, Languages and Applications (Artifact Committee) 

Journals:

  • IEEE Transactions on Parallel and Distributed Systems (reviewer)
  • ACM Transactions on Architecture and Code Optimization (reviewer)
  • ACM Transactions on Modeling and Performance Evaluation of Computing Systems (reviewer)
  • ACM Computing Surveys (reviewer)
  • Elsevier Journal of Computer and System Sciences (reviewer)
  • Elsevier Journal of Parallel Computing (reviewer)

STUDENTS

I currently work with the following very bright students.

    • PhD students: Daniel Mawhirter, Connor Holmes, Akshit Sharma, Chang Liu
    • MS students: Alexey Yaremenko, Izaak Sulka, Sam Reinehr
    • Undergrad students: Noah Fields, Benjamin Wagley

Former group members:

  • PhD graduates: Wei Han (AMD)
  • MS graduates: Matt Buland (Salesforce), Brian Fidder, Erol Cornelio, Matthew Berntson, and Fnu Aruna

HOBBIES

I try my best to play basketball three times a week. I play guitar and piano to entertain myself (not sure whether I’m skilled enough to entertain others). I like history, philosophy, and physics. Last but not least, I was once a pretty good Counter-Strike player and played on a team that won a championship in a local league.