Richard Vuduc
Title
Cited by
Year
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
S Williams, L Oliker, R Vuduc, J Shalf, K Yelick, J Demmel
SC'07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 1-12, 2007
Cited by: 973; Year: 2007
OSKI: A library of automatically tuned sparse matrix kernels
R Vuduc, JW Demmel, KA Yelick
Journal of Physics: Conference Series 16 (1), 071, 2005
Cited by: 652; Year: 2005
Model-driven autotuning of sparse matrix-vector multiply on GPUs
JW Choi, A Singh, RW Vuduc
ACM SIGPLAN Notices 45 (5), 115-126, 2010
Cited by: 503; Year: 2010
Sparsity: Optimization framework for sparse matrix kernels
EJ Im, K Yelick, R Vuduc
The International Journal of High Performance Computing Applications 18 (1 …, 2004
Cited by: 374; Year: 2004
Automatic performance tuning of sparse matrix kernels
RW Vuduc
University of California, Berkeley, 2003
Cited by: 320; Year: 2003
Self-adapting linear algebra algorithms and software
J Demmel, J Dongarra, V Eijkhout, E Fuentes, A Petitet, R Vuduc, ...
Proceedings of the IEEE 93 (2), 293-312, 2005
Cited by: 263; Year: 2005
A performance analysis framework for identifying potential benefits in GPGPU applications
J Sim, A Dasgupta, H Kim, R Vuduc
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012
Cited by: 239; Year: 2012
A massively parallel adaptive fast-multipole method on heterogeneous architectures
I Lashuk, A Chandramowlishwaran, H Langston, TA Nguyen, R Sampath, ...
Proceedings of the Conference on High Performance Computing Networking …, 2009
Cited by: 231; Year: 2009
Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures
A Rahimian, I Lashuk, S Veerapaneni, A Chandramowlishwaran, ...
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by: 199; Year: 2010
Fast sparse matrix-vector multiplication by exploiting variable block structure
R Vuduc, HJ Moon
High Performance Computing and Communications, 807-816, 2005
Cited by: 190; Year: 2005
Many-thread aware prefetching mechanisms for GPGPU applications
J Lee, NB Lakshminarayana, H Kim, R Vuduc
2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 213-224, 2010
Cited by: 172; Year: 2010
Performance optimizations and bounds for sparse matrix-vector multiply
R Vuduc, JW Demmel, KA Yelick, S Kamil, R Nishtala, B Lee
SC'02: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, 26-26, 2002
Cited by: 170; Year: 2002
A roofline model of energy
JW Choi, D Bedard, R Fowler, R Vuduc
2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013
Cited by: 168; Year: 2013
On the limits of GPU acceleration
R Vuduc, A Chandramowlishwaran, J Choi, M Guney, A Shringarpure
Proceedings of the 2nd USENIX conference on Hot topics in parallelism 13, 2010
Cited by: 167; Year: 2010
Falcon: fault localization in concurrent programs
S Park, RW Vuduc, MJ Harrold
Proceedings of the 32nd ACM/IEEE International Conference on Software …, 2010
Cited by: 166; Year: 2010
When prefetching works, when it doesn’t, and why
J Lee, H Kim, R Vuduc
ACM Transactions on Architecture and Code Optimization (TACO) 9 (1), 1-29, 2012
Cited by: 156; Year: 2012
Statistical models for empirical search-based performance tuning
R Vuduc, JW Demmel, JA Bilmes
International Journal of High Performance Computing Applications 18 (1), 65-94, 2004
Cited by: 147; Year: 2004
POET: Parameterized optimizations for empirical tuning
Q Yi, K Seymour, H You, R Vuduc, D Quinlan
2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007
Cited by: 141; Year: 2007
When cache blocking of sparse matrix vector multiply works and why
R Nishtala, RW Vuduc, JW Demmel, KA Yelick
Applicable Algebra in Engineering, Communication and Computing 18 (3), 297-311, 2007
Cited by: 134; Year: 2007
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
S Venkatasubramanian, RW Vuduc
Proceedings of the 23rd international conference on Supercomputing, 244-255, 2009
Cited by: 113; Year: 2009