Title | Authors | Venue | Cited by | Year |
Learning audio-visual speech representation by masked multimodal cluster prediction | B Shi, WN Hsu, K Lakhotia, A Mohamed | arXiv preprint arXiv:2201.02184, 2022 | 82 | 2022 |
Offloading guidelines for augmented reality applications on wearable devices | B Shi, J Yang, Z Huang, P Hui | Proceedings of the 23rd ACM international conference on Multimedia, 1271-1274, 2015 | 71 | 2015 |
American Sign Language fingerspelling recognition in the wild | B Shi, AM Del Rio, J Keane, J Michaux, D Brentari, G Shakhnarovich, ... | 2018 IEEE Spoken Language Technology Workshop (SLT), 145-152, 2018 | 60 | 2018 |
Few-shot acoustic event detection via meta learning | B Shi, M Sun, KC Puvvada, CC Kao, S Matsoukas, C Wang | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 56 | 2020 |
Fingerspelling recognition in the wild with iterative visual attention | B Shi, AM Del Rio, J Keane, D Brentari, G Shakhnarovich, K Livescu | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 45 | 2019 |
Robust self-supervised audio-visual speech recognition | B Shi, WN Hsu, A Mohamed | arXiv preprint arXiv:2201.01763, 2022 | 33 | 2022 |
A cross-task analysis of text span representations | S Toshniwal, H Shi, B Shi, L Gao, K Livescu, K Gimpel | arXiv preprint arXiv:2006.03866, 2020 | 31 | 2020 |
Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition | B Shi, K Livescu | 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017 | 18 | 2017 |
Fingerspelling detection in American Sign Language | B Shi, D Brentari, G Shakhnarovich, K Livescu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 17 | 2021 |
Semi-supervised acoustic event detection based on tri-training | B Shi, M Sun, CC Kao, V Rozgic, S Matsoukas, C Wang | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 16 | 2019 |
Compression of acoustic event detection models with low-rank matrix factorization and quantization training | B Shi, M Sun, CC Kao, V Rozgic, S Matsoukas, C Wang | arXiv preprint arXiv:1905.00855, 2019 | 15 | 2019 |
On the contributions of visual and textual supervision in low-resource semantic speech retrieval | A Pasad, B Shi, H Kamper, K Livescu | arXiv preprint arXiv:1904.10947, 2019 | 11 | 2019 |
Compression of acoustic event detection models with quantized distillation | B Shi, M Sun, CC Kao, V Rozgic, S Matsoukas, C Wang | arXiv preprint arXiv:1907.00873, 2019 | 10 | 2019 |
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality | WN Hsu, B Shi | Advances in Neural Information Processing Systems, 2022 | 6 | 2022 |
A joint framework for audio tagging and weakly supervised acoustic event detection using DenseNet with global average pooling | CC Kao, B Shi, M Sun, C Wang | arXiv preprint arXiv:2008.03350, 2020 | 6 | 2020 |
Open-domain sign language translation learned from online video | B Shi, D Brentari, G Shakhnarovich, K Livescu | arXiv preprint arXiv:2205.12870, 2022 | 5 | 2022 |
Whole-word segmental speech recognition with acoustic word embeddings | B Shi, S Settle, K Livescu | 2021 IEEE Spoken Language Technology Workshop (SLT), 164-171, 2021 | 5 | 2021 |
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | WN Hsu, T Remez, B Shi, J Donley, Y Adi | arXiv preprint arXiv:2212.11377, 2022 | 4 | 2022 |
A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer | WN Hsu, B Shi | arXiv preprint arXiv:2207.07036, 2022 | 4 | 2022 |
Learning lip-based audio-visual speaker embeddings with AV-HuBERT | B Shi, A Mohamed, WN Hsu | arXiv preprint arXiv:2205.07180, 2022 | 4 | 2022 |