Highly scalable deep learning training system with mixed-precision: Training ImageNet in four minutes X Jia, S Song, W He, Y Wang, H Rong, F Zhou, L Xie, Z Guo, Y Yang, L Yu, ... arXiv preprint arXiv:1807.11205, 2018 | 422 | 2018 |
BigDL: A distributed deep learning framework for big data JJ Dai, Y Wang, X Qiu, D Ding, Y Zhang, Y Wang, X Jia, CL Zhang, Y Wan, ... Proceedings of the ACM Symposium on Cloud Computing, 50-60, 2019 | 172 | 2019 |
M6: A Chinese multimodal pretrainer J Lin, R Men, A Yang, C Zhou, M Ding, Y Zhang, P Wang, A Wang, ... arXiv preprint arXiv:2103.00823, 2021 | 126 | 2021 |
M6-T: Exploring sparse expert models and beyond A Yang, J Lin, R Men, C Zhou, L Jiang, X Jia, A Wang, J Zhang, J Wang, ... arXiv preprint arXiv:2105.15082, 2021 | 48 | 2021 |
Whale: Efficient giant model training over heterogeneous GPUs X Jia, L Jiang, A Wang, W Xiao, Z Shi, J Zhang, X Li, L Chen, Y Li, ... 2022 USENIX Annual Technical Conference (USENIX ATC 22), 673-688, 2022 | 30 | 2022 |
M6-10T: A sharing-delinking paradigm for efficient multi-trillion parameter pretraining J Lin, A Yang, J Bai, C Zhou, L Jiang, X Jia, A Wang, J Zhang, Y Li, W Lin, ... arXiv preprint arXiv:2110.03888, 2021 | 25 | 2021 |
EasyTransfer: A simple and scalable deep transfer learning platform for NLP applications M Qiu, P Li, C Wang, H Pan, A Wang, C Chen, X Jia, Y Li, J Huang, D Cai, ... Proceedings of the 30th ACM international conference on information …, 2021 | 18 | 2021 |
Whale: Scaling deep learning model training to the trillions X Jia, L Jiang, A Wang, J Zhang, X Li, W Xiao, Y Li, Z Zheng, X Liu, W Lin arXiv preprint arXiv:2011.09208, 2020 | 7* | 2020 |
Target-oriented keyword search over temporal databases X Jia, W Hsu, ML Lee Database and Expert Systems Applications: 27th International Conference …, 2016 | 5 | 2016 |
EasyScale: Accuracy-consistent elastic training for deep learning M Li, W Xiao, B Sun, H Zhao, H Yang, S Ren, Z Luan, X Jia, Y Liu, Y Li, ... arXiv preprint arXiv:2208.14228, 2022 | 4 | 2022 |
TAP: Accelerating large-scale DNN training through tensor automatic parallelisation Z Shi, L Jiang, A Wang, J Zhang, X Jia, Y Li, C Wu, J Li, W Lin arXiv preprint arXiv:2302.00247, 2023 | 2 | 2023 |
Sentiment Analysis for Twitter: Going Beyond Tweet Text L Poddar, K Halder, X Jia arXiv preprint arXiv:1611.09441, 2016 | 2 | 2016 |
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs M Li, W Xiao, H Yang, B Sun, H Zhao, S Ren, Z Luan, X Jia, Y Liu, Y Li, ... Proceedings of the International Conference for High Performance Computing …, 2023 | 1 | 2023 |
TAP: Efficient Derivation of Tensor Parallel Plans for Large Neural Networks Z Shi, L Jiang, A Wang, J Zhang, X Jia, Y Li, C Wu, J Li, W Lin Architecture and System Support for Transformer Models (ASSYST@ ISCA 2023), 2023 | | 2023 |