Owain Evans

Cited by

	All	Since 2019
Citations	5615	5144
h-index	19	18
i10-index	25	24

Citations per year

Year	Citations
2014	18
2015	14
2016	26
2017	67
2018	247
2019	379
2020	463
2021	544
2022	686
2023	1534
2024	1528

Public access

1 article available, 0 articles not available (based on funding mandates)

Co-authors

Katja Grace	AI Impacts	Verified email at intelligence.org
Andreas Stuhlmüller	Elicit	Verified email at elicit.com
Jacob Hilton	Alignment Research Center	Verified email at alignment.org
Stephanie Lin	Research Scholar, University of Oxford	Verified email at philosophy.ox.ac.uk
Noah D. Goodman	Stanford University	Verified email at stanford.edu
William Saunders	OpenAI	Verified email at cs.toronto.edu
Joshua B. Tenenbaum	MIT	Verified email at mit.edu
Jacob Steinhardt	Stanford University	Verified email at cs.stanford.edu
Andrew Ilyas	Massachusetts Institute of Technology	Verified email at mit.edu
Mihaela Curmei	Berkeley	Verified email at berkeley.edu
Yarin Gal	Associate Professor, University of Oxford	Verified email at cs.ox.ac.uk
David Abel	Research Scientist, DeepMind	Verified email at deepmind.com
Zachary Kenton	Google DeepMind	Verified email at google.com
David Scott Krueger	University Assistant Professor, University of Cambridge	Verified email at cam.ac.uk
Jan Leike	OpenAI	Verified email at openai.com

Owain Evans

Research Associate, University of Oxford

Verified email at philosophy.ox.ac.uk

AI alignment, Artificial Intelligence, Machine Learning, AI safety, Truthful AI


Title	Authors	Venue	Cited by	Year
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation	M Brundage, S Avin, J Clark, H Toner, P Eckersley, B Garfinkel, A Dafoe, ...	arXiv preprint arXiv:1802.07228, 2018	1169*	2018
When will AI exceed human performance? Evidence from AI experts	K Grace, J Salvatier, A Dafoe, B Zhang, O Evans	Journal of Artificial Intelligence Research 62, 729-754, 2018	1083*	2018
TruthfulQA: Measuring how models mimic human falsehoods	S Lin, J Hilton, O Evans	arXiv preprint arXiv:2109.07958, 2021	938	2021
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models	A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...	arXiv preprint arXiv:2206.04615, 2022	905	2022
Trial without error: Towards safe reinforcement learning via human intervention	W Saunders, G Sastry, A Stuhlmueller, O Evans	arXiv preprint arXiv:1707.05173, 2017	304	2017
Help or hinder: Bayesian models of social goal inference	T Ullman, C Baker, O Macindoe, O Evans, N Goodman, J Tenenbaum	Advances in neural information processing systems 22, 2009	222	2009
Teaching models to express their uncertainty in words	S Lin, J Hilton, O Evans	arXiv preprint arXiv:2205.14334, 2022	180	2022
Learning the Preferences of Ignorant, Inconsistent Agents	O Evans, A Stuhlmüller, ND Goodman	Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-2016), 2016	138	2016
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"	L Berglund, M Tong, M Kaufmann, M Balesni, AC Stickland, T Korbak, ...	arXiv preprint arXiv:2309.12288, 2023	125*	2023
Truthful AI: Developing and governing AI that does not lie	O Evans, O Cotton-Barratt, L Finnveden, A Bales, A Balwit, P Wills, ...	arXiv preprint arXiv:2110.06674, 2021	84	2021
Agent-Agnostic Human-in-the-Loop Reinforcement Learning	D Abel, J Salvatier, A Stuhlmüller, O Evans	arXiv:1701.0407, 2017	80	2017
AI progress measurement	P Eckersley, Y Nasser, Y Bayle, O Evans, G Gebhart, D Schwenk	Electronic Frontier Foundation, 2017	51*	2017
Constructing and adjusting estimates for household transmission of SARS-CoV-2 from prior studies, widespread-testing and contact-tracing data	M Curmei, A Ilyas, O Evans, J Steinhardt	International Journal of Epidemiology 50 (5), 1444-1457, 2021	40*	2021
Active Reinforcement Learning: Observing Rewards at a Cost	D Krueger, J Leike, O Evans, J Salvatier	NIPS 2016 Workshop, 2016	36*	2016
Learning the Preferences of Bounded Agents	O Evans, A Stuhlmüller, ND Goodman	Advances in Neural Information Processing Systems (Bounded Optimality Workshop), 2015	36	2015
Taken out of context: On measuring situational awareness in LLMs	L Berglund, AC Stickland, M Balesni, M Kaufmann, M Tong, T Korbak, ...	arXiv preprint arXiv:2309.00667, 2023	29*	2023
Modeling Agents with Probabilistic Programs	O Evans, A Stuhlmüller, J Salvatier, D Filan	agentmodels.org, 2017	28*	2017
How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions	L Pacchiardi, AJ Chan, S Mindermann, I Moscovitz, AY Pan, Y Gal, ...	arXiv preprint arXiv:2309.15840, 2023	25	2023
Learning structured preferences	O Evans, L Bergen, JB Tenenbaum	Proceedings of the 32nd annual conference of the cognitive science society, 2010	21*	2010
Modelling the Health and Economic Impacts of Population-Wide Testing, Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK	T Colbourn, W Waites, J Panovska-Griffiths, D Manheim, S Sturniolo, ...	SSRN, 2020	17*	2020
