Multiscale vision transformers H Fan*, B Xiong*, K Mangalam*, Y Li*, Z Yan, J Malik, C Feichtenhofer* IEEE Conference on Computer Vision and Pattern Recognition, 2021 | 241 | 2021 |
It is not the journey but the destination: Endpoint conditioned trajectory prediction K Mangalam, H Girase, S Agarwal, KH Lee, E Adeli, J Malik, A Gaidon European Conference on Computer Vision, 2020 | 154 | 2020 |
Future person localization in first-person videos T Yagi, K Mangalam, R Yonetani, Y Sato IEEE Conference on Computer Vision and Pattern Recognition, 2018 | 133 | 2018 |
Long-term human motion prediction with scene context Z Cao, H Gao, K Mangalam, QZ Cai, M Vo, J Malik European Conference on Computer Vision, 2020 | 76 | 2020 |
Ego4D: Around the world in 3,000 hours of egocentric video K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ... IEEE Conference on Computer Vision and Pattern Recognition, 2022 | 45 | 2022 |
From goals, waypoints & paths to long term human trajectory forecasting K Mangalam, Y An, H Girase, J Malik IEEE International Conference on Computer Vision, 2021 | 44 | 2021 |
Disentangling human dynamics for pedestrian locomotion forecasting with noisy supervision K Mangalam, E Adeli, KH Lee, A Gaidon, JC Niebles IEEE Winter Conference on Applications of Computer Vision, 2020 | 31 | 2020 |
Improved Multiscale Vision Transformers for Classification and Detection Y Li*, CY Wu*, H Fan, K Mangalam, B Xiong, J Malik, C Feichtenhofer* IEEE Conference on Computer Vision and Pattern Recognition, 2022 | 30 | 2022 |
Do deep neural networks learn shallow learnable examples first? K Mangalam, VU Prabhu Understanding Deep Phenomena, International Conference on Machine Learning, 2019 | 24 | 2019 |
Learning spontaneity to improve emotion recognition in speech K Mangalam, T Guha Interspeech, 2018 | 12 | 2018 |
LOKI: Long Term and Key Intentions for Trajectory Prediction H Girase, H Gang, S Malla, J Li, A Kanehara, K Mangalam, C Choi IEEE International Conference on Computer Vision, 2021 | 8 | 2021 |
On compressing U-Net using knowledge distillation K Mangalam, M Salzmann arXiv preprint arXiv:1812.00249, 2018 | 8 | 2018 |
Multiscale vision transformers H Fan, B Xiong, K Mangalam, Y Li, Z Yan, J Malik, C Feichtenhofer arXiv preprint arXiv:2104.11227, 2021 | 8 | 2021 |
Object-region video transformers R Herzig, E Ben-Avraham, K Mangalam, A Bar, G Chechik, A Rohrbach, ... IEEE Conference on Computer Vision and Pattern Recognition, 2022 | 7 | 2022 |
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition CY Wu*, Y Li*, K Mangalam, H Fan, B Xiong, J Malik, C Feichtenhofer* arXiv preprint arXiv:2201.08383, 2022 | 6 | 2022 |
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection Y Li, CY Wu, H Fan, K Mangalam, B Xiong, J Malik, C Feichtenhofer Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 5 | 2022 |
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition S Kim, A Gholami, A Shaw, N Lee, K Mangalam, J Malik, MW Mahoney, ... arXiv preprint arXiv:2206.00888, 2022 | 1 | 2022 |
Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022 E Ben-Avraham, R Herzig, K Mangalam, A Bar, A Rohrbach, L Karlinsky, ... arXiv preprint arXiv:2206.07689, 2022 | | 2022 |
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens E Ben-Avraham, R Herzig, K Mangalam, A Bar, A Rohrbach, L Karlinsky, ... arXiv preprint arXiv:2206.06346, 2022 | | 2022 |
Reversible Vision Transformers K Mangalam, H Fan, Y Li, CY Wu, B Xiong, C Feichtenhofer, J Malik Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | | 2022 |