Title | Authors | Venue | Cited by | Year |
Exploring the limits of transfer learning with a unified text-to-text transformer | C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... | The Journal of Machine Learning Research 21 (1), 5485-5551, 2020 | 7911 | 2020 |
Exploring the limits of transfer learning with a unified text-to-text transformer (2019) | C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... | arXiv preprint arXiv:1910.10683, 2021 | 80 | 2021 |
Do transformer modifications transfer across implementations and applications? | S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... | arXiv preprint arXiv:2102.11972, 2021 | 54 | 2021 |
Merging models with Fisher-weighted averaging | MS Matena, CA Raffel | Advances in Neural Information Processing Systems 35, 17703-17716, 2022 | 29 | 2022 |
Exploring the limits of transfer learning with a unified text-to-text transformer | A Roberts, C Raffel, K Lee, M Matena, N Shazeer, PJ Liu, S Narang, W Li, ... | | 9 | 2019 |
A Combinatorial Perspective on the Optimization of Shallow ReLU Networks | MS Matena, CA Raffel | Advances in Neural Information Processing Systems 35, 22187-22198, 2022 | | 2022 |