Bo Wu
Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced GPU Memory Accesses
B Wu, Z Zhao, E Zhang, Y Jiang, X Shen
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations
B Wu, G Chen, D Li, X Shen, J Vetter
The 29th International Conference on Supercomputing, 2015
Automine: harmonizing high-level abstraction and high performance for graph mining
D Mawhirter, B Wu
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 509-523, 2019
FLEP: Enabling flexible and efficient preemption on GPUs
B Wu, X Liu, X Zhou, C Jiang
ACM SIGPLAN Notices 52 (4), 483-496, 2017
Can PCM Benefit GPU? Reconciling Hybrid Memory Design with GPU Massive Parallelism for Energy Efficiency
B Wang, B Wu, D Li, X Shen, W Yu, Y Jiao, J Vetter
The 22nd International Conference on Parallel Architectures and Compilation …, 2013
PORPLE: An Extensible Optimizer for Portable Data Placement on GPU
G Chen, B Wu, D Li, X Shen
The 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Graphie: Large-scale asynchronous graph traversals on just a GPU
W Han, D Mawhirter, B Wu, M Buland
2017 26th International Conference on Parallel Architectures and Compilation …, 2017
GRNN: Low-latency and scalable RNN inference on GPUs
C Holmes, D Mawhirter, Y He, F Yan, B Wu
Proceedings of the Fourteenth EuroSys Conference 2019, 1-16, 2019
FinePar: Irregularity-aware fine-grained workload partitioning on integrated architectures
F Zhang, B Wu, J Zhai, B He, W Chen
2017 IEEE/ACM International Symposium on Code Generation and Optimization …, 2017
ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs
X Liu, B Wu
The International Conference for High Performance Computing, Networking …, 2015
Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation
Z Zhao, B Wu, X Shen
ACM SIGARCH Computer Architecture News 42 (1), 543-558, 2014
Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters
W Zhang, W Cui, K Fu, Q Chen, DE Mawhirter, B Wu, C Li, M Guo
Proceedings of the ACM international conference on supercomputing, 58-68, 2019
Co-run scheduling with power cap on integrated CPU-GPU systems
Q Zhu, B Wu, X Shen, L Shen, Z Wang
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
Graphzero: A high-performance subgraph matching system
D Mawhirter, S Reinehr, C Holmes, T Liu, B Wu
ACM SIGOPS Operating Systems Review 55 (1), 21-37, 2021
Graphzero: Breaking symmetry for efficient graph mining
D Mawhirter, S Reinehr, C Holmes, T Liu, B Wu
arXiv preprint arXiv:1911.12877, 2019
Automatic irregularity-aware fine-grained workload partitioning on integrated architectures
F Zhang, J Zhai, B Wu, B He, W Chen, X Du
IEEE Transactions on Knowledge and Data Engineering 33 (3), 867-881, 2019
Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control
B Wu, EZ Zhang, X Shen
The Twentieth International Conference on Parallel Architectures and …, 2011
Graphphi: efficient parallel graph processing on emerging throughput-oriented architectures
Z Peng, A Powell, B Wu, T Bicer, B Ren
Proceedings of the 27th International Conference on Parallel Architectures …, 2018
Enabling scalability-sensitive speculative parallelization for FSM computations
J Qiu, Z Zhao, B Wu, A Vishnu, SL Song
Proceedings of the International Conference on Supercomputing, 1-10, 2017
Simple Profile Rectifications Go a Long Way: Statistically Exploring and Alleviating the Effects of Sampling Errors for Program Optimizations
B Wu, M Zhou, X Shen, Y Gao, R Silvera, G Yiu
ECOOP 2013–Object-Oriented Programming: 27th European Conference …, 2013