Supervision exists everywhere: A data efficient contrastive language-image pre-training paradigm | Y Li, F Liang, L Zhao, Y Cui, W Ouyang, J Shao, F Yu, J Yan | arXiv preprint arXiv:2110.05208, 2021 | 174 | 2021 |
3DVG-Transformer: Relation modeling for visual grounding on point clouds | L Zhao, D Cai, L Sheng, D Xu | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 52 | 2021 |
Transformer3D-Det: Improving 3D object detection by vote refinement | L Zhao, J Guo, D Xu, L Sheng | IEEE Transactions on Circuits and Systems for Video Technology 31 (12), 4735 …, 2021 | 34 | 2021 |
Democratizing contrastive language-image pre-training: A CLIP benchmark of data, model, and supervision | Y Cui, L Zhao, F Liang, Y Li, J Shao | arXiv preprint arXiv:2203.05796, 2022 | 19 | 2022 |
3DJCG: A unified framework for joint dense captioning and visual grounding on 3D point clouds | D Cai, L Zhao, J Zhang, L Sheng, D Xu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 19 | 2022 |
Towards Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline | L Zhao, D Cai, J Zhang, L Sheng, D Xu, R Zheng, Y Zhao, L Wang, X Fan | IEEE Transactions on Circuits and Systems for Video Technology, 2022 | 3 | 2022 |
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud | Z Wang, B Cheng, L Zhao, D Xu, Y Tang, L Sheng | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 1 | 2023 |