Zhaohan Daniel Guo
DeepMind
Verified email at google.com
Title
Cited by
Year
Joint semantic utterance classification and slot filling with recursive neural networks
D Guo, G Tur, W Yih, G Zweig
2014 IEEE Spoken Language Technology Workshop (SLT), 554-559, 2014
Cited by: 108 | Year: 2014
Bootstrap your own latent: A new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Advances in Neural Information Processing Systems 33, 2020
Cited by: 52 | Year: 2020
Agent57: Outperforming the Atari human benchmark
AP Badia, B Piot, S Kapturowski, P Sprechmann, A Vitvitskyi, D Guo, ...
arXiv preprint arXiv:2003.13350, 2020
Cited by: 40 | Year: 2020
Using options and covariance testing for long horizon off-policy policy evaluation
Z Guo, PS Thomas, E Brunskill
Advances in Neural Information Processing Systems, 2492-2501, 2017
Cited by: 25 | Year: 2017
Neural predictive belief representations
ZD Guo, MG Azar, B Piot, BA Pires, R Munos
arXiv preprint arXiv:1811.06407, 2018
Cited by: 19 | Year: 2018
Concurrent PAC RL.
Z Guo, E Brunskill
AAAI, 2624-2630, 2015
Cited by: 19 | Year: 2015
Never Give Up: Learning Directed Exploration Strategies
AP Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, S Kapturowski, ...
arXiv preprint arXiv:2002.06038, 2020
Cited by: 14 | Year: 2020
A PAC RL algorithm for episodic POMDPs
ZD Guo, S Doroudi, E Brunskill
Artificial Intelligence and Statistics, 510-518, 2016
Cited by: 14 | Year: 2016
PAC continuous state online multitask reinforcement learning with identification
Y Liu, Z Guo, E Brunskill
Proceedings of the 2016 International Conference on Autonomous Agents …, 2016
Cited by: 7 | Year: 2016
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
D Guo, BA Pires, B Piot, J Grill, F Altché, R Munos, MG Azar
arXiv preprint arXiv:2004.14646, 2020
Cited by: 6 | Year: 2020
Sample efficient feature selection for factored MDPs
ZD Guo, E Brunskill
arXiv preprint arXiv:1703.03454, 2017
Cited by: 6 | Year: 2017
Never give up: Learning directed exploration strategies
A Puigdomènech Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, ...
arXiv, arXiv: 2002.06038, 2020
Cited by: 3 | Year: 2020
Sample Efficient Learning with Feature Selection for Factored MDPs
ZD Guo, E Brunskill
EWRL, 2018
Cited by: 2 | Year: 2018
Using options for long-horizon off-policy evaluation
ZD Guo, PS Thomas, E Brunskill
arXiv preprint arXiv:1703.03453, 2017
Cited by: 2 | Year: 2017
Directed exploration for reinforcement learning
ZD Guo, E Brunskill
arXiv preprint arXiv:1906.07805, 2019
Cited by: 1 | Year: 2019
Directed Exploration for Improved Sample Efficiency in Reinforcement Learning
ZD Guo
Google, 2019
Cited by: 1 | Year: 2019
Agent57: Outperforming the Atari Human Benchmark
A Puigdomènech Badia, B Piot, S Kapturowski, P Sprechmann, ...
arXiv, arXiv: 2003.13350, 2020
Year: 2020