Max Bain

Biography

I am a Research Scientist at Google DeepMind.
Previously, I was a Member of Technical Staff at Reka.
Before that, I completed my PhD at VGG, University of Oxford, under the supervision of Prof. A. Zisserman.
But above all, I'm just trying to be useful.

Research Artefacts

2025

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities.
Gemini Team, Google
Technical report & product, 2025.
[Paper] [AI Studio]

2024

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
Reka Team
Technical report & product, 2024.
[Paper] [Chat] [Showcase] [Blog]

Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models
Reka Team (Piotr Padlewski*, Max Bain* et al.)
Technical report, 2024.
[Paper] [Code] [Dataset] [Blog]

AutoAD III: The Prequel - Back to the Pixels
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR, 2024.
[Paper] [Code]

AutoAD-Zero: A training-free framework for zero-shot audio description
J. Xie, T. Han, M. Bain, A. Nagrani, G. Varol, W. Xie, A. Zisserman
ACCV, 2024.
[Paper] [Code]

2023

Understanding Video Through the Lens of Language
M. Bain
Doctoral Thesis, 2023.
[Thesis]

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
B. Smith*, M. Farinha*, S. M. Hall, H. R. Kirk†, A. Shtedritski†, M. Bain†
Technical report, 2023.
[Paper] [Code]

AutoAD II: The Sequel – Who, When, and What in Movie Audio Description
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
ICCV, 2023.
[Paper] [Code]

WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain, Jaesung Huh, Tengda Han, Andrew Zisserman
Interspeech, 2023.
[Paper] [Code]

AutoAD: Movie Description in Context
Tengda Han*, Max Bain*, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR, 2023. [Highlight]
[Paper] [Code]

2022

A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning
H. Berg, S. Hall, Y. Bhalgat, W. Yang, H. R. Kirk, A. Shtedritski, M. Bain
AACL, 2022.
[Paper] [Code]

The CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman
Technical report, 2022.
[Paper] [Code]

2021

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman
ICCV, 2021.
[Paper] [Code] [Project] [Dataset] [Demo]

Automated Audiovisual Behaviour Recognition in Wild Primates
M. Bain, A. Nagrani, D. Schofield, S. Berdugo, J. Bessa, J. Owen, K. J. Hockings, T. Matsuzawa, M. Hayashi, D. Biro, S. Carvalho, A. Zisserman
Science Advances, 2021.
[Paper] [Press]

2020

Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman
ACCV, 2020. [Oral]
[Paper] [Code] [Challenge]

2019

Count, Crop and Recognise: Fine-Grained Recognition in the Wild
Max Bain, Arsha Nagrani, Daniel Schofield, Andrew Zisserman
ICCVW, 2019. [Oral]
[Paper]

Mood

Berserk (1997)