Hassan Mansoor
Software Engineer, Google DeepMind
Verified email at google.com
Title
Cited by
Year
RLAIF: Scaling reinforcement learning from human feedback with AI feedback
H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard, J Ferret, C Bishop, ...
Cited by: 403 | Year: 2023
LLMs cannot find reasoning errors, but can correct them!
G Tyen, H Mansoor, P Chen, T Mak, V Cărbune
arXiv preprint arXiv:2311.08516, 2023
Cited by: 45 | Year: 2023
ScreenAI: A vision-language model for UI and infographics understanding
G Baechler, S Sunkara, M Wang, F Zubach, H Mansoor, V Etter, ...
arXiv preprint arXiv:2402.04615, 2024
Cited by: 33 | Year: 2024
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, KR Lu, C Bishop, ...
Forty-first International Conference on Machine Learning
Cited by: 23
Methods and systems for predicting conversion rates of content publisher and content provider pairs
R Kirillov, H Mansoor
US Patent 9,246,990, 2016
Cited by: 10 | Year: 2016
RLAIF: Scaling reinforcement learning from human feedback with AI feedback, 2024
H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, K Lu, C Bishop, E Hall, ...
URL https://openreview.net/forum
Cited by: 9
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher
R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard
US Patent 10,067,916, 2018
Cited by: 6 | Year: 2018
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher
R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard
US Patent 9,461,936, 2016
Cited by: 6 | Year: 2016
Chart-based reasoning: Transferring capabilities from LLMs to VLMs
V Carbune, H Mansoor, F Liu, R Aralikatte, G Baechler, J Chen, A Sharma
arXiv preprint arXiv:2403.12596, 2024
Cited by: 3 | Year: 2024
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
H Sidahmed, S Phatale, A Hutcheson, Z Lin, Z Chen, Z Yu, J Jin, ...
arXiv preprint arXiv:2403.10704, 2024
Cited by: 3 | Year: 2024
VQA Training Sets are Self-play Environments for Generating Few-shot Pools
T Misiunas, H Mansoor, J Uijlings, O Riva, V Carbune
arXiv preprint arXiv:2405.19773, 2024
Year: 2024
The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization
S Gooding, H Mansoor
arXiv preprint arXiv:2311.04919, 2023
Year: 2023
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher
R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard
US Patent 10,210,140, 2019
Year: 2019