Naveen Mellempudi
Verified email at amd.com
Title
Cited by
Year
A study of BFLOAT16 for deep learning training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 349; Year: 2019
Mixed precision training of convolutional neural networks using integer operations
D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...
arXiv preprint arXiv:1802.00930, 2018
Cited by: 200; Year: 2018
Ternary neural networks with fine-grained quantization
N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey
arXiv preprint arXiv:1705.01462, 2017
Cited by: 140; Year: 2017
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,234,930, 2019
Cited by: 129; Year: 2019
FP8 formats for deep learning
P Micikevicius, D Stosic, N Burgess, M Cornea, P Dubey, R Grisenthwaite, ...
arXiv preprint arXiv:2209.05433, 2022
Cited by: 115; Year: 2022
Mixed precision training with 8-bit floating point
N Mellempudi, S Srinivasan, D Das, B Kaul
arXiv preprint arXiv:1905.12334, 2019
Cited by: 77; Year: 2019
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 10,643,297, 2020
Cited by: 50; Year: 2020
Optimized compute hardware for machine learning operations
D Das, R Gramunt, M Smelyanskiy, J Corbal, D Mudigere, NK Mellempudi, ...
US Patent 10,776,699, 2020
Cited by: 47; Year: 2020
Scaling half-precision floating point tensors for training deep neural networks
N Mellempudi, D Das
US Patent 11,501,139, 2022
Cited by: 46; Year: 2022
On scale-out deep learning training for cloud and HPC
S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ...
arXiv preprint arXiv:1801.08030, 2018
Cited by: 35; Year: 2018
Mixed low-precision deep learning inference using dynamic fixed point
N Mellempudi, A Kundu, D Das, D Mudigere, B Kaul
arXiv preprint arXiv:1701.08978, 2017
Cited by: 29; Year: 2017
Performing power management in a multicore processor
VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi
US Patent 9,910,481, 2018
Cited by: 24; Year: 2018
Incremental precision networks using residual inference and fine-grain quantization
A Kundu, N Mellempudi, D Mudigere, D Das
US Patent 11,556,772, 2023
Cited by: 21; Year: 2023
Conversion hardware mechanism
N Mellempudi, D Das, MEI Chunhui, K Wong, DD Kalamkar, HH Jiang, ...
US Patent 11,494,163, 2022
Cited by: 15; Year: 2022
Ternary residual networks
A Kundu, K Banerjee, N Mellempudi, D Mudigere, D Das, B Kaul, ...
arXiv preprint arXiv:1707.04679, 2017
Cited by: 15; Year: 2017
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 11,321,805, 2022
Cited by: 10; Year: 2022
Efficient post-training quantization with FP8 formats
H Shen, N Mellempudi, X He, Q Gao, C Wang, M Wang
Proceedings of Machine Learning and Systems 6, 483-498, 2024
Cited by: 8; Year: 2024
Technologies for scaling deep learning training
NK Mellempudi, S Sridharan, D Mudigere, D Das
US Patent 11,068,780, 2021
Cited by: 8; Year: 2021
High performance scalable FPGA accelerator for deep neural networks
S Srinivasan, P Janedula, S Dhoble, S Avancha, D Das, N Mellempudi, ...
arXiv preprint arXiv:1908.11809, 2019
Cited by: 5; Year: 2019
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,775,873, 2020
Cited by: 4; Year: 2020