Vladimir Mikulik

Cited by

	All	Since 2019
Citations	3017	3014
h-index	14	14
i10-index	15	15

Citations per year: 2020: 56; 2021: 421; 2022: 654; 2023: 915; 2024: 947

Public access

2 articles

0 articles

available

not available

Based on funding mandates

Vladimir Mikulik

DeepMind

Verified email at google.com

AI Safety, Interpretability, NLP


Title	Cited by	Year
Inferring the effectiveness of government interventions against COVID-19 JM Brauner, S Mindermann, M Sharma, D Johnston, J Salvatier, ... Science 371 (6531), eabd9338, 2021	1017	2021
Scaling language models: Methods, analysis & insights from training gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021	776	2021
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023	568	2023
Teaching language models to support answers with verified quotes J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, ... arXiv preprint arXiv:2203.11147, 2022	148	2022
Alignment of language agents Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving arXiv preprint arXiv:2103.14659, 2021	115	2021
Risks from learned optimization in advanced machine learning systems E Hubinger, C van Merwijk, V Mikulik, J Skalse, S Garrabrant arXiv preprint arXiv:1906.01820, 2019	101	2019
Specification gaming: the flip side of AI ingenuity V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ... DeepMind Blog 3, 2020	90	2020
Meta-trained agents implement Bayes-optimal agents V Mikulik, G Delétang, T McGrath, T Genewein, M Martic, S Legg, ... Advances in Neural Information Processing Systems 33, 2020	36	2020
The effectiveness and perceived burden of nonpharmaceutical interventions against COVID-19 transmission: a modelling study with 41 countries JM Brauner, S Mindermann, M Sharma, AB Stephenson, T Gavenčiak, ... medRxiv, 2020.05.28.20116129, 2020	33	2020
Tracr: Compiled transformers as a laboratory for interpretability D Lindner, J Kramár, S Farquhar, M Rahtz, T McGrath, V Mikulik Advances in Neural Information Processing Systems 36, 2024	32	2024
Does circuit analysis interpretability scale? evidence from multiple choice capabilities in chinchilla T Lieberum, M Rahtz, J Kramár, G Irving, R Shah, V Mikulik arXiv preprint arXiv:2307.09458, 2023	27	2023
Neural networks are a priori biased towards boolean functions with low entropy C Mingard, J Skalse, G Valle-Pérez, D Martínez-Rubio, V Mikulik, ... arXiv preprint arXiv:1909.11522, 2019	24	2019
The hydra effect: Emergent self-repair in language model computations T McGrath, M Rahtz, J Kramár, V Mikulik, S Legg arXiv preprint arXiv:2307.15771, 2023	18	2023
Scaling Language Models: Methods, Analysis & Insights from Training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ... arXiv, 2021	18	2021
Causal analysis of agent behavior for AI safety G Delétang, J Grau-Moya, M Martic, T Genewein, T McGrath, V Mikulik, ... arXiv preprint arXiv:2103.03938, 2021	10	2021
Challenges with unsupervised LLM knowledge discovery S Farquhar, V Varma, Z Kenton, J Gasteiger, V Mikulik, R Shah arXiv preprint arXiv:2312.10029, 2023	4	2023
