Naveen Mellempudi

Cited by

	All	Since 2019
Citations	1170	1104
h-index	13	13
i10-index	14	14

Citations per year
Year	Citations
2017	9
2018	50
2019	107
2020	164
2021	213
2022	255
2023	294
2024	68

Co-authors

Dipankar Das: Intel Parallel Computing Labs, Intel Labs. Verified email at intel.com
Dheevatsa Mudigere: Distinguished Engineer, NVIDIA. Verified email at nvidia.com
Alexander Heinecke: Senior Principal Engineer at Intel Labs. Verified email at intel.com
Evangelos Georganas: Intel Labs, Parallel Computing Lab. Verified email at intel.com
Abhisek Kundu: Research Scientist, Intel Parallel Computing Labs, India. Verified email at intel.com
Bharat Kaul: Intel Labs. Verified email at intel.com
Pradeep Dubey: Intel Corporation. Verified email at intel.com
Srinivas Sridharan, PhD: Distinguished Engineer, NVIDIA. Verified email at nvidia.com
Sasikanth Avancha: Verified email at intel.com
Kunal Banerjee: Data Science Foundation, Walmart. Verified email at walmartlabs.com
Sudarshan Srinivasan: Intel. Verified email at intel.com
Mikhail Smelyanskiy: Facebook. Verified email at intel.com
Nataraj Jammalamadaka: PhD Scholar. Verified email at research.iiit.ac.in
Yuxin Bai: University of Rochester, Apple Inc. Verified email at apple.com
Victor Lee: Google. Verified email at google.com
Jongsoo Park: Research Scientist, Facebook. Verified email at fb.com
Roman Dubtsov: NVIDIA Corporation. Verified email at nvidia.com
Dusan Stosic: DL Architecture @ NVIDIA. Verified email at nvidia.com

Naveen Mellempudi

Fellow, Advanced Micro Devices

Verified email at amd.com

Artificial Intelligence, Computer Architecture


Title	Cited by	Year
A study of BFLOAT16 for deep learning training. D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ... arXiv preprint arXiv:1905.12322, 2019	308	2019
Mixed precision training of convolutional neural networks using integer operations. D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ... arXiv preprint arXiv:1802.00930, 2018	187	2018
Ternary neural networks with fine-grained quantization. N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey. arXiv preprint arXiv:1705.01462, 2017	132	2017
Performing power management in a multicore processor. VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ... US Patent 10,234,930, 2019	126	2019
FP8 formats for deep learning. P Micikevicius, D Stosic, N Burgess, M Cornea, P Dubey, R Grisenthwaite, ... arXiv preprint arXiv:2209.05433, 2022	78	2022
Mixed precision training with 8-bit floating point. N Mellempudi, S Srinivasan, D Das, B Kaul. arXiv preprint arXiv:1905.12334, 2019	69	2019
Dynamic precision management for integer deep learning primitives. N Mellempudi, D Mudigere, D Das, S Sridharan. US Patent 10,643,297, 2020	47	2020
Optimized compute hardware for machine learning operations. D Das, R Gramunt, M Smelyanskiy, J Corbal, D Mudigere, NK Mellempudi, ... US Patent 10,776,699, 2020	45	2020
On scale-out deep learning training for cloud and HPC. S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ... arXiv preprint arXiv:1801.08030, 2018	34	2018
Mixed low-precision deep learning inference using dynamic fixed point. N Mellempudi, A Kundu, D Das, D Mudigere, B Kaul. arXiv preprint arXiv:1701.08978, 2017	28	2017
Performing power management in a multicore processor. VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi. US Patent 9,910,481, 2018	22	2018
Incremental precision networks using residual inference and fine-grain quantization. A Kundu, N Mellempudi, D Mudigere, D Das. US Patent 11,556,772, 2023	18	2023
Ternary residual networks. A Kundu, K Banerjee, N Mellempudi, D Mudigere, D Das, B Kaul, ... arXiv preprint arXiv:1707.04679, 2017	14	2017
Conversion hardware mechanism. N Mellempudi, D Das, MEI Chunhui, K Wong, DD Kalamkar, HH Jiang, ... US Patent 11,494,163, 2022	13	2022
Dynamic precision management for integer deep learning primitives. N Mellempudi, D Mudigere, D Das, S Sridharan. US Patent 11,321,805, 2022	8	2022
Technologies for scaling deep learning training. NK Mellempudi, S Sridharan, D Mudigere, D Das. US Patent 11,068,780, 2021	5	2021
High performance scalable FPGA accelerator for deep neural networks. S Srinivasan, P Janedula, S Dhoble, S Avancha, D Das, N Mellempudi, ... arXiv preprint arXiv:1908.11809, 2019	5	2019
Performing power management in a multicore processor. VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ... US Patent 10,775,873, 2020	4	2020
K-tanh: Hardware efficient activations for deep learning. A Kundu, S Srinivasan, EC Qin, D Kalamkar, NK Mellempudi, D Das, ... arXiv preprint arXiv:1909.07729, 2019	4	2019
Hardware apparatuses and methods relating to elemental register accesses. V Lee, U Echeruo, G Chrysos, N Mellempudi. US Patent 9,996,347, 2018	3	2018
