Bootstrapping fitted Q-evaluation for off-policy inference B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang International Conference on Machine Learning, 4074-4084, 2021 | 31 | 2021 |
Sample complexity of nonparametric off-policy evaluation on low-dimensional manifolds using deep networks X Ji, M Chen, M Wang, T Zhao The Eleventh International Conference on Learning Representations, 2023 | 18 | 2023 |
Bootstrapping Statistical Inference for Off-Policy Evaluation B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang arXiv preprint arXiv:2102.03607, 2021 | 16 | 2021 |
Urban bike lane planning with bike trajectories: Models, algorithms, and a real-world case study S Liu, ZJM Shen, X Ji Manufacturing & Service Operations Management 24 (5), 2500-2515, 2022 | 15 | 2022 |
Provable benefits of policy learning from human preferences in contextual bandit problems X Ji, H Wang, M Chen, T Zhao, M Wang arXiv preprint arXiv:2307.12975, 2023 | 5 | 2023 |
Optimal estimation of policy gradient via double fitted iteration C Ni, R Zhang, X Ji, X Zhang, M Wang International Conference on Machine Learning, 16724-16783, 2022 | 5 | 2022 |
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time X Ji, G Li Conference on Neural Information Processing Systems, 2023 | 2 | 2023 |
Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis Z Li, X Ji, M Chen, M Wang International Conference on Artificial Intelligence and Statistics, 2737-2745, 2024 | | 2024 |
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks Z Li, X Ji, M Chen, M Wang arXiv preprint arXiv:2310.10556, 2023 | | 2023 |
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds Z Xu, X Ji, M Chen, M Wang, T Zhao arXiv preprint arXiv:2309.13915, 2023 | | 2023 |