Anelia Angelova

Anelia Angelova is a research scientist in the area of computer vision. She leads the Robot Vision research team in Brain Robotics at Google Brain. Her most recent research focuses on deep learning for robotics perception, including semantic and 3D scene understanding and real-time algorithms for pedestrian detection and robot grasp localization. She has integrated her work in production systems, including the first deep neural network models running onboard Google's self-driving car, now Waymo. Anelia received her MS and PhD degrees in Computer Science from California Institute of Technology.
Authored Publications
    Preview abstract We explore the boundaries of scaling up a multilingual vision and language model, both in terms of the size of its components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. Our model advances the state of the art on most vision-and-language benchmarks considered (20+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly in the training mix. View details
    Preview abstract Effective scaling and a flexible task interface enable large-capacity language models to excel at many tasks. PaLI (Pathways Language and Image model) extends these ideas to the joint modeling of language and vision. PaLI is a model that generates text based on visual and textual inputs. Using this interface, PaLI is able to perform many vision, language, and multimodal tasks, across many languages. We train PaLI with two main principles: reuse of pretrained unimodal components, and joint scaling of modalities. Using large-capacity pretrained language models and vision models allows us to capitalize on their existing capabilities, while leveraging the substantial cost of training them. We scale PaLI models across three axes: the language component, the vision component, and the training data that fuses them. For the vision component, we train the largest and best-performing Vision Transformer (ViT) to date. For the data, we build an image-text training set of over 10B images covering over 100 languages. PaLI inherits and enhances language-understanding capabilities, and achieves state-of-the-art results on multiple vision and language tasks (image classification, image captioning, visual question answering, scene-text understanding, etc.), based on a simple, modular, and reuse-friendly platform for modeling and scaling. View details
    Preview abstract We present F-VLM, a simple open-vocabulary object detection method built upon Frozen Vision and Language Models. F-VLM simplifies the current multi-stage training pipeline by eliminating the need for knowledge distillation or detection-tailored pretraining. Surprisingly, we observe that a frozen VLM: 1) retains the locality-sensitive features necessary for detection, and 2) is a strong region classifier. We finetune only the detector head and combine the detector and VLM outputs for each region at inference time. F-VLM shows compelling scaling behavior and achieves a +6.5 mask AP improvement over the previous state of the art on novel categories of the LVIS open-vocabulary detection benchmark. In addition, we demonstrate very competitive results on the COCO open-vocabulary detection benchmark and cross-dataset transfer detection, in addition to significant training speed-up and compute savings. Code will be released at https://sites.google.com/corp/view/f-vlm/home. View details
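The abstract does not give the exact fusion rule for combining the detector head with the frozen VLM; the NumPy sketch below shows one plausible geometric-mean blend of the two score sources, with the blending exponent and softmax temperature as assumed values rather than the paper's settings.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def combine_region_scores(detector_scores, vlm_text_sims, alpha=0.65, temperature=0.01):
    """Fuse detector-head scores with frozen-VLM region/text similarities.

    detector_scores: (R,) scores from the finetuned detector head.
    vlm_text_sims:   (R, C) cosine similarities between pooled region features of
                     the frozen VLM and C category text embeddings.
    alpha, temperature: assumed values for this sketch, not taken from the paper.
    """
    vlm_probs = softmax(vlm_text_sims / temperature, axis=-1)
    det = detector_scores[:, None]                       # broadcast over categories
    return det ** (1.0 - alpha) * vlm_probs ** alpha     # geometric-mean-style blend

# Toy usage: 3 region proposals, 4 open-vocabulary categories.
rng = np.random.default_rng(0)
print(combine_region_scores(rng.random(3), rng.standard_normal((3, 4))).shape)  # (3, 4)
```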
    Mechanical Search on Shelves with Efficient Stacking and Destacking of Objects
    Huang Huang
    Letian Fu
    Michael Danielczuk
    Chung Min Kim
    Zachary Tam
    Jeff Ichnowski
    Brian Ichter
    Ken Goldberg
    The International Symposium of Robotics Research (ISRR) (2023)
    Preview abstract Stacking increases storage efficiency in shelves, but the lack of visibility and accessibility makes the mechanical search problem of revealing and extracting target objects difficult for robots. In this paper, we extend the lateral-access mechanical search problem to shelves with stacked items and introduce two novel policies -- Distribution Area Reduction for Stacked Scenes (DARSS) and Monte Carlo Tree Search for Stacked Scenes (MCTSSS) -- that use destacking and restacking actions. MCTSSS improves on prior lookahead policies by considering future states after each potential action. Experiments in 1200 simulated and 18 physical trials with a Fetch robot equipped with a blade and suction cup suggest that destacking and restacking actions can reveal the target object with 82--100% success in simulation and 66--100% in physical experiments, and are critical for searching densely packed shelves. In the simulation experiments, both policies outperform a baseline and achieve similar success rates but take more steps compared with an oracle policy that has full state information. In simulation and physical experiments, DARSS outperforms MCTSSS in median number of steps to reveal the target, but MCTSSS has a higher success rate in physical experiments, suggesting robustness to perception noise. View details
    Joint Adaptive Representations for Image-Language Learning
    Transformers for Vision (T4V) Workshop at the Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
    Preview abstract Image-language transformer models have achieved tremendous success, but they come at high computational costs. We here propose joint adaptive image-language representation learning, which adaptively and iteratively fuses the multi-modal features. This consistently reduces the model cost and size, allows the model to scale without a large increase in FLOPs or memory, and outperforms bigger and much more expensive models. With only 40M training examples and 39 GFLOPs, our model outperforms models many times larger, some reaching 800 GFLOPs. View details
    Preview abstract We present a simple approach that turns a ViT encoder into an efficient video model and works seamlessly with both image and video inputs. By sparsely sampling the inputs, the model can train and run inference on both. The model is easily scalable and can be adapted to large-scale pre-trained ViTs without requiring full finetuning. It achieves state-of-the-art results. View details
    Dynamic Pre-training of Vision-Language Models
    Wei Li
    ICLR 2023 Workshop on Multimodal Representation Learning (2023)
    Preview abstract Vision-Language pretraining aims to learn universal cross-modal representations and to create models with broad capabilities. In this paper, we propose a novel dynamic pretraining resampling strategy for a variety of pretraining tasks. Unlike recent large-scale vision-language approaches, we show that a set of diverse self- and weakly-supervised pretraining tasks, dynamically sampled according to task difficulty, provides strong performance. Further, the approach is sample-efficient, using much less data and compute to address a range of downstream tasks. We show that a single 330M-parameter model, pretrained using only smaller and publicly accessible datasets, achieves competitive or state-of-the-art performance on three diverse groups of tasks: visual question answering, text-based image localization by referring expressions, and video question answering. View details
    Preview abstract We present Region-aware Open-vocabulary Vision Transformers (RO-ViT) – a contrastive image-text pretraining recipe to bridge the gap between image-level pretraining and open-vocabulary object detection. At the pretraining phase, we propose to randomly crop and resize regions of positional embeddings instead of using the whole image positional embeddings. This better matches the use of positional embeddings at region-level in the detection finetuning phase. In addition, we replace the common softmax cross entropy loss in contrastive learning with focal loss to better learn the informative yet difficult examples. Finally, we leverage recent advances in novel object proposals to improve open-vocabulary detection finetuning. We evaluate our full model on the LVIS and COCO open-vocabulary detection benchmarks and zero-shot transfer. RO-ViT achieves a state-of-the-art 32.1 APr on LVIS, surpassing the best existing approach by +5.8 points in addition to competitive zero-shot transfer detection. Surprisingly, RO-ViT improves the image-level representation as well and achieves the state of the art on 9 out of 12 metrics on COCO and Flickr image-text retrieval benchmarks, outperforming competitive approaches with larger models. View details
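As a rough illustration of the cropped positional embedding idea above, the sketch below randomly crops a square region of a ViT positional-embedding grid and resizes it back to the full token grid; the nearest-neighbor resize and the crop-size range are simplifications, not the paper's exact recipe.

```python
import numpy as np

def crop_resize_pos_embed(pos_embed, out_size, rng):
    """Randomly crop a square region of an (H, W, D) positional-embedding grid and
    resize it back to (out_size, out_size, D). Nearest-neighbor resizing and the
    crop-size range are simplifications for this sketch."""
    H, W, D = pos_embed.shape
    crop = int(rng.integers(max(2, H // 4), H + 1))      # random square crop size
    top = int(rng.integers(0, H - crop + 1))
    left = int(rng.integers(0, W - crop + 1))
    region = pos_embed[top:top + crop, left:left + crop]
    rows = np.linspace(0, crop - 1, out_size).round().astype(int)
    cols = np.linspace(0, crop - 1, out_size).round().astype(int)
    return region[rows][:, cols]

rng = np.random.default_rng(0)
pe = rng.standard_normal((14, 14, 32))              # a ViT-style 14x14 grid of 32-d embeddings
print(crop_resize_pos_embed(pe, 14, rng).shape)     # (14, 14, 32)
```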
    Diversifying Joint Vision-Language Tokenization Learning
    Vardaan Pahuja
    Transformers for Vision (T4V) Workshop at the Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
    Preview abstract Building joint representations across images and text is an essential step for tasks such as Visual Question Answering and Video Question Answering. In this work, we find that the representations must not only jointly capture features from both modalities but should also be diverse for better generalization performance. To this end, we propose joint vision-language representation learning by diversifying the tokenization learning process, enabling tokens which are sufficiently disentangled from each other to be learned from both modalities. We observe that our approach outperforms the baseline models in a majority of settings and is competitive with state-of-the-art methods. View details
    Preview abstract The development of language models has moved from encoder-decoder to decoder-only designs. In addition, conventional wisdom holds that the two most popular multimodal tasks, the generative and contrastive tasks, tend to conflict with one another, are hard to accommodate in one architecture, and further need complex adaptations for downstream tasks. We propose a novel paradigm of training with a decoder-only model for multimodal tasks, which is surprisingly effective in jointly learning these disparate vision-language tasks. This is done with a simple model, called MaMMUT. It consists of a single vision encoder and a text decoder, and is able to accommodate contrastive and generative learning by a novel two-pass approach on the text decoder. We demonstrate that joint learning of these diverse objectives is simple, effective, and maximizes the weight-sharing of the model across these tasks. Furthermore, the same architecture enables straightforward extensions to open-vocabulary object detection and video-language tasks. The model tackles a diverse range of tasks, while being modest in capacity. Our model achieves the state of the art on image-text and text-image retrieval, video question answering and open-vocabulary detection tasks, outperforming much larger and more extensively trained foundational models. It shows very competitive results on VQA and Video Captioning, especially considering its capacity. Ablations confirm the flexibility and advantages of our approach. View details
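A toy NumPy illustration of how a single, weight-shared text decoder could serve both objectives purely by switching the attention mask between two passes; the attention, pooling, and mask details here are assumptions for exposition, not MaMMUT's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, mask):
    """Toy single-head self-attention with an additive mask; the weights (here the
    token features themselves) are shared across both passes, only the mask changes."""
    d = x.shape[-1]
    logits = x @ x.T / np.sqrt(d) + mask
    return softmax(logits) @ x

T, D = 6, 16
x = np.random.default_rng(0).standard_normal((T, D))    # text-token features in the decoder

bidirectional = np.zeros((T, T))                         # pass 1: contrastive text embedding
causal = np.triu(np.full((T, T), -1e9), k=1)             # pass 2: autoregressive generation

text_embedding = attention(x, bidirectional).mean(axis=0)   # pooled, matched to image features
next_token_feats = attention(x, causal)                     # per-token, fed to the LM head
print(text_embedding.shape, next_token_feats.shape)         # (16,) (6, 16)
```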
    Learning Open-World Object Proposals without Learning to Classify
    Tsung-Yi Lin
    In So Kweon
    Robotics and Automation Letters (RA-L) Journal and International Conference on Robotics and Automation (ICRA) (2022)
    Preview abstract Object proposals have become an integral preprocessing step of many vision pipelines including object detection, weakly supervised detection, object discovery, tracking, etc. Compared to learning-free methods, learning-based proposals have become popular recently due to the growing interest in object detection. The common paradigm is to learn object proposals from data labeled with a set of object regions and their corresponding categories. However, this approach often struggles with novel objects in the open world that are absent from the training set. In this paper, we identify the problem: the binary classifiers in existing proposal methods tend to overfit to the training categories. Therefore, we propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlap with any ground-truth object (e.g., centerness and IoU). This strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization on COCO. We further explore more challenging cross-dataset generalization onto the RoboNet and EpicKitchens datasets and demonstrate clear improvement over state-of-the-art object detectors and object proposers. The code is publicly available. View details
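The localization-quality targets the abstract mentions (centerness and IoU) can be illustrated in a few lines of NumPy; this is a toy sketch of the targets themselves, not the OLN training code.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def centerness(point, box):
    """FCOS-style centerness of a point (x, y) with respect to a ground-truth box."""
    l, t = point[0] - box[0], point[1] - box[1]
    r, b = box[2] - point[0], box[3] - point[1]
    if min(l, r) <= 0 or min(t, b) <= 0:
        return 0.0
    return float(np.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b))))

gt = [20, 20, 80, 100]
print(centerness((50, 60), gt))        # point near the box center -> high objectness target
print(iou([25, 30, 75, 90], gt))       # proposal quality from overlap alone, no class label
```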
    Preview abstract We propose FindIt, a simple and versatile framework that unifies a variety of visual grounding and localization tasks including referring expression comprehension, text-based localization, and object detection. Key to our architecture is an efficient multi-scale fusion module that unifies the disparate localization requirements across the tasks. In addition, we discover that a standard object detector is surprisingly effective in unifying these tasks without a need for task-specific design, losses, or precomputed detections. Our end-to-end trainable framework responds flexibly and accurately to a wide range of referring expression, localization or detection queries for zero, one, or multiple objects. Jointly trained on these tasks, FindIt outperforms the state of the art on both referring expression and text-based localization, and shows competitive performance on object detection. Finally, FindIt generalizes better to out-of-distribution data and novel categories compared to strong single-task baselines. All of these are accomplished by a single, unified and efficient model. View details
    Preview abstract We present Answer-Me, a task-aware multi-task framework which unifies multiple question answering tasks, such as visual question answering, visual entailment, and visual reasoning. In contrast to previous works using contrastive or generative captioning training, we propose a novel and simple recipe to pretrain a vision-language joint model, which is multi-task as well, and uses the entire architecture end-to-end. Our results, which are in the challenging open-vocabulary generative setting, show state-of-the-art performance, zero-shot generalization, and robustness to forgetting. View details
    Preview abstract We present a novel efficient image-language learning model for multi-task visual question answering tasks which works at a fraction of the computational cost. New compact features are learned adaptively to jointly represent the image and language modalities according to the data. Our method outperforms the state-of-the-art multi-task approaches on SNLI-VE and GQA, and works competitively on VQA2.0. The model is highly efficient using 7-10 fewer GFLOPs and scales well to more than twice the input image size. View details
    Mechanical Search on Shelves using a Novel “Bluction” Tool
    Huang Huang
    Michael Danielczuk
    Chung Min Kim
    Letian Fu
    Zachary Tam
    Jeff Ichnowski
    Brian Andrew Ichter
    Ken Goldberg
    International Conference on Robotics and Automation (ICRA) (2022) (to appear)
    Preview abstract Shelves are common in homes, warehouses, and commercial settings due to their storage efficiency. However, this efficiency comes at the cost of reduced visibility and accessibility. When looking from a side (lateral) view of a shelf, most objects will be fully occluded, resulting in a constrained lateral-access mechanical search problem. To address this problem, we introduce: (1) a novel bluction tool, which combines a thin pushing blade and suction cup gripper, (2) an improved LAX-RAY simulation pipeline and perception model that combines ray-casting with 2D Minkowski sums to efficiently generate target occupancy distributions, and (3) a novel SLAX-RAY search policy, which optimally reduces target object distribution support area using the bluction tool. Experimental data from 2000 simulated shelf trials and 18 trials with a physical Fetch robot equipped with the bluction tool suggest that using suction grasping actions improves the success rate over the highest performing push-only policy by 26% in simulation and 67% in physical environments. View details
    Preview abstract Video question answering is a challenging task that requires jointly understanding the language input, the visual information in individual video frames, as well as the temporal information about the events occurring in the video. In this paper, we propose a novel multi-stream video encoder for video question answering that uses multiple video inputs and a new video-text iterative co-tokenization approach to answer a variety of questions related to videos. We experimentally evaluate the model on several datasets, such as MSRVTT-QA, MSVD-QA, and IVQA, outperforming the previous state of the art by large margins. At the same time, our model requires only 67 GFLOPs, making it a highly efficient video question answering model. View details
    Preview abstract We present a pre-training approach for vision and language transformer models, which is based on a mixture of diverse tasks. We explore both the use of image-text captioning data in pre-training, which does not need additional supervision, as well as object-aware strategies to pre-train the model. We evaluate the method on a number of text-generative vision+language tasks, such as Visual Question Answering, visual entailment and captioning, and demonstrate large gains over standard pre-training methods. View details
    Preview abstract 3D perception of object shapes from RGB image input is fundamental towards semantic scene understanding, grounding image-based perception in our spatially 3-dimensional real-world environments. To achieve a mapping between image views of objects and 3D shapes, we leverage CAD model priors from existing large-scale databases, and propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion – establishing correspondences between patches of an image view of an object and patches of CAD geometry. This enables part similarity reasoning for retrieving similar CADs to a new image view without exact matches in the database. Our patch embedding provides more robust CAD retrieval for shape estimation in our end-to-end estimation of CAD model shape and pose for detected objects in a single input image. Experiments on in-the-wild, complex imagery from ScanNet show that our approach is more robust than state of the art in real-world scenarios without any exact CAD matches. View details
    4D-Net for Learned Multi-Modal Alignment
    Michael Ryoo
    International Conference on Computer Vision (ICCV) (2021)
    Preview abstract We present 4D-Net, a 3D object detection approach, which utilizes 3D Point Cloud and RGB sensing information, both in time. We are able to incorporate the 4D information by performing a novel dynamic connection learning across various feature representations and levels of abstraction, as well as by observing geometric constraints. Our approach outperforms the state-of-the-art and strong baselines on the Waymo Open Dataset. 4D-Net is better able to use motion cues and dense image information to detect distant objects more successfully. We will open source the code. View details
    Mechanical Search on Shelves using LAX-RAY: Lateral Access X-RAY
    Huang Huang
    Marcus Dominguez-Kuhne
    Vishal Satish
    Michael Danielczuk
    Kate Sanders
    Jeff Ichnowski
    Andrew Lee
    Ken Goldberg
    IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2021)
    Preview abstract Finding an occluded object in a lateral access environment such as a shelf or cabinet is a problem that arises in many contexts such as warehouses, retail, healthcare, shipping, and homes. While this problem, known as mechanical search, is well-studied in overhead access environments, lateral access environments introduce constraints on the poses of objects and on available grasp actions, and pushing actions are preferred to preserve the environment structure. We propose LAXRAY (Lateral Access maXimal Reduction in support Area of occupancY distribution): a system that combines target object occupancy distribution prediction with a mechanical search policy that sequentially pushes occluding objects to reveal a given target object. For scenarios with extruded polygonal objects, we introduce two lateral-access search policies that encode a history of predicted target distributions and can plan up to three actions into the future. We introduce a First-Order Shelf Simulator (FOSS) and use it to evaluate these policies in 800 simulated random shelf environments per policy. We also evaluate in 5 physical shelf environments using a Fetch robot with an embedded PrimeSense RGBD Camera and an attached pushing blade. The policies outperform baselines by up to 25 % in simulation and up to 60% in physical experiments. Additionally, the two-step prediction policy is the highest performing in simulation for 8 objects with a 69 % success rate, suggesting a tradeoff between future information and prediction errors. Code, videos, and supplementary material can be found at https://sites.google.com/berkeley.edu/lax-ray. View details
    Tiny Video Networks
    Michael Ryoo
    Applied AI Letters Journal (2021)
    Preview abstract Automatic video understanding is becoming more important for applications where real-time performance is crucial and compute is limited. Yet, accurate solutions so far have been computationally intensive. We propose efficient models for videos - Tiny Video Networks - which are video architectures, automatically designed to comply with fast runtimes and, at the same time are effective at video recognition tasks. The Tiny Video Networks run at faster-than-real-time speeds and demonstrate strong performance across several video benchmarks. These models not only provide new tools for real-time video applications, but also enable fast research and development in video understanding. Code and models are available. View details
    SMURF: Self-Teaching Multi-Frame Unsupervised RAFT with Full-Image Warping
    Austin Stone
    Daniel Maurer
    Alper Ayvaci
    Rico Jonschkowski
    Computer Vision and Pattern Recognition (CVPR) (2021)
    Preview abstract We present SMURF, a method for unsupervised learning of optical flow that improves state of the art on all benchmarks by 36% to 40% (over the prior best method UFlow) and even outperforms several supervised approaches such as PWC-Net and FlowNet2. Our method integrates architecture improvements from supervised optical flow, i.e. the RAFT model, with new ideas for unsupervised learning that include a sequence-aware self-supervision loss, a technique for handling out-of-frame motion, and an approach for learning effectively from multi-frame video data while still only requiring two frames for inference. View details
    Preview abstract In this paper we address the problem of automatically discovering atomic actions from instructional videos. Instructional videos contain complex activities and are a rich source of information for intelligent agents, such as autonomous robots or virtual assistants, which could, for example, automatically 'read' the steps from an instructional video and execute them. However, videos are rarely annotated with atomic activities, their boundaries or duration. We present an unsupervised approach to learn atomic actions of structured human tasks from a variety of instructional videos. We propose a sequential stochastic autoregressive model for temporal segmentation of videos, which learns to represent and discover the sequential relationship between different atomic actions of the task, and provides automatic and unsupervised self-labeling. View details
    TokenLearner: Adaptive Space-Time Tokenization for Videos
    Michael Ryoo
    Anurag Arnab
    Conference on Neural Information Processing Systems (NeurIPS) (2021)
    Preview abstract In this paper, we present an approach for representation learning from videos. Instead of relying on hand-designed splitting strategies to obtain space-time tokens from videos, our approach learns to mine important tokens in video frames. This results in efficiently and effectively finding a few important visual tokens and enables modeling of pairwise interactions between such tokens over a longer temporal horizon. We introduce a vector transformer to capture such pairwise space-time relations, and a technique to fuse the transformed tokens while learning their spatio-temporal patterns. The proposed approach is designed with the intention to allow the tokenizer to adaptively react to input video frames containing diverse visual content, and then to have the vector transformer and subsequent modules learn the underlying spatio-temporal interactions and long-range dependencies in video inputs. We show the effectiveness of the proposed approach over challenging video classification datasets, outperforming the state-of-the-art, despite using much less compute. We further conduct extensive ablation experiments to study the method. View details
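A minimal NumPy sketch of the adaptive token-mining idea above: learned spatial attention maps select and pool the important positions of a frame into a small set of tokens. The single-matrix attention module and the token budget are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mine_tokens(frame_feats, w_attn):
    """Mine a small set of tokens from one frame via learned spatial attention.

    frame_feats: (H*W, D) flattened per-position features of a frame.
    w_attn:      (D, S) weights producing S spatial attention maps (a stand-in for
                 the learned attention module; a simplification of the method).
    Returns (S, D): each token is an attention-weighted average over positions.
    """
    attn = softmax(frame_feats @ w_attn, axis=0)   # (H*W, S), normalized over space
    return attn.T @ frame_feats                    # (S, D)

rng = np.random.default_rng(0)
H, W, D, S = 16, 16, 64, 8                         # 8 tokens per frame (an assumed budget)
tokens = mine_tokens(rng.standard_normal((H * W, D)), 0.1 * rng.standard_normal((D, S)))
print(tokens.shape)                                # (8, 64)
```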
    Adaptive Intermediate Representations for Video Understanding
    Juhana Kangaspunta
    Rico Jonschkowski
    Michael Ryoo
    MUltimodal Learning and Applications (MULA) Workshop, CVPR (2021)
    Preview abstract A common strategy for video understanding is to incorporate spatial and motion information by fusing features derived from RGB frames and optical flow. In this work, we first introduce a new way to leverage semantic segmentation as an intermediate representation for video understanding and use it in a way that requires no additional labeling. Second, we propose a general framework which learns the intermediate representations (optical flow and semantic segmentation) jointly with the final video understanding task and allows the adaptation of the representations to the end goal. Despite the use of intermediate representations within the network, during inference, no additional data beyond RGB sequences is needed. Finally, we present a way to find the optimal learning configuration by searching for the best loss weighting via evolution. We obtain more powerful visual representations for videos which lead to performance gains over the state of the art. View details
    Taskology: Utilizing Task Relations at Scale
    Yao Lu
    Sören Pirk
    Jan Dlabal
    Anthony Brohan
    Ankita Pasad
    Zhao Chen
    Ariel Gordon
    Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
    Preview abstract Many computer vision tasks address the problem of scene understanding and are naturally interrelated, e.g., object classification, detection, scene segmentation, depth estimation, etc. We show that we can leverage the inherent relationships among collections of tasks, as they are trained jointly, supervising each other through their known relationships via consistency losses. Furthermore, explicitly utilizing the relationships between tasks allows improving their performance while dramatically reducing the need for labeled data, and allows training with additional unsupervised or simulated data. We demonstrate a distributed joint training algorithm with task-level parallelism, which affords a high degree of asynchronicity and robustness. This allows learning across multiple tasks, or with large amounts of input data, at scale. We demonstrate our framework on subsets of the following collection of tasks: depth and normal prediction, semantic segmentation, 3D motion and ego-motion estimation, and object tracking and 3D detection in point clouds. We observe improved performance across these tasks, especially in the low-label regime. View details
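As a concrete example of a consistency loss between two related tasks, the sketch below couples depth and surface-normal predictions through their known geometric relationship, so the two models can supervise each other without labels; the specific task pair and the squared-error form are illustrative assumptions, not the paper's exact losses.

```python
import numpy as np

def normals_from_depth(depth):
    """Derive unit surface normals from a depth map via finite differences."""
    dz_dx = np.gradient(depth, axis=1)
    dz_dy = np.gradient(depth, axis=0)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + 1e-9)

def consistency_loss(pred_depth, pred_normals):
    """Penalize disagreement between the depth task and the normal-prediction task.
    No labels are needed: the known relationship between the tasks supervises both."""
    return float(np.mean((normals_from_depth(pred_depth) - pred_normals) ** 2))

rng = np.random.default_rng(0)
depth = 1.0 + rng.random((32, 32))
normals = normals_from_depth(depth) + 0.01 * rng.standard_normal((32, 32, 3))
print(consistency_loss(depth, normals))   # small value: the two predictions agree
```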
    Visionary: Vision Architecture Discovery for Robot Learning
    Iretiayo Akinola
    Yao Lu
    Yevgen Chebotar
    Dmitry Kalashnikov
    Jake Varley
    Julian Ibarz
    Michael Ryoo
    International Conference on Robotics and Automation (ICRA) (2021)
    Preview abstract We propose a vision-based architecture search algorithm for learning robot manipulation tasks, which discovers interactions between low-dimensional action inputs and high-dimensional visual inputs. The architectures are automatically designed while training for the task itself and are capable of discovering novel ways of combining action and image feature inputs, as well as features from previous stages of learning. The obtained new architectures demonstrated better task success rates, in some cases by a large margin, compared to a recent high-performing baseline. Our real-robot experiments also uncovered architectures which improve grasping performance by 6%. This is the first approach to demonstrate that a tailored architecture can be simultaneously modified and trained for a real-robot task. View details
    Preview abstract In this paper we address the problem of automatically discovering atomic actions in an unsupervised manner from instructional videos, which are rarely annotated with atomic actions. We present an unsupervised approach to learn atomic actions of structured human tasks from a variety of instructional videos, based on a sequential stochastic autoregressive model for temporal segmentation of videos. The model learns to represent and discover the sequential relationship between different atomic actions of the task, and provides automatic and unsupervised self-labeling. View details
    Unsupervised Monocular Depth Learning in Dynamic Scenes
    Hanhan Li
    Ariel Gordon
    Hang Zhao
    Conference on Robot Learning (CoRL) (2020)
    Preview abstract We present a method for jointly training the estimation of depth, ego-motion, and a dense 3D translation field of objects relative to the scene, with monocular photometric consistency being the sole source of supervision. We show that this apparently heavily-underdetermined problem can be regularized by imposing the following prior knowledge about 3D translation fields: they are sparse, since most of the scene is static, and they tend to be constant for rigid moving objects. We show that this regularization alone is sufficient to train monocular depth prediction models that exceed the accuracy achieved in prior work for dynamic scenes, including semantically-aware methods. The code is available at https://github.com/google-research/google-research/tree/master/depth_and_motion_learning. View details
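The two priors described above (sparsity, and constancy within rigid moving objects) can be sketched as simple penalties on a dense translation field; the L1 and total-variation forms and the weights below are assumptions for illustration, not the paper's exact regularizers.

```python
import numpy as np

def translation_field_regularizer(trans_field, sparsity_weight=1.0, constancy_weight=1.0):
    """Regularize a dense per-pixel 3D translation field of shape (H, W, 3).

    Sparsity: most of the scene is static, so most translations should be zero
    (L1 penalty). Constancy: translations should vary little within rigid moving
    objects (total-variation penalty). Functional forms and weights are assumed.
    """
    sparsity = np.abs(trans_field).mean()
    dx = np.abs(np.diff(trans_field, axis=1)).mean()
    dy = np.abs(np.diff(trans_field, axis=0)).mean()
    return sparsity_weight * sparsity + constancy_weight * (dx + dy)

rng = np.random.default_rng(0)
mostly_static = np.zeros((64, 64, 3))
mostly_static[20:30, 20:40] = 0.5          # one rigid object moving coherently
noisy = rng.standard_normal((64, 64, 3))   # implausible, everywhere-moving field
print(translation_field_regularizer(mostly_static) < translation_field_regularizer(noisy))  # True
```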
    Preview abstract We present a new method to learn video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero- or few-shot learning. We formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are shared across different modalities via distillation. Second, we introduce the concept of loss function evolution, using an evolutionary search algorithm to automatically find the optimal combination of loss functions capturing many (self-supervised) tasks and modalities. Third, we propose an unsupervised representation evaluation metric that uses distribution matching to a large unlabeled dataset as a prior constraint, based on Zipf's law. This unsupervised constraint, which is not guided by any labeling, produces results similar to weakly-supervised, task-specific ones. The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods. Notably, it is also more effective than several label-based methods (e.g., ImageNet), with the exception of large, fully labeled video datasets. View details
    Semantically-Agnostic Unsupervised Monocular Depth Learning in Dynamic Scenes
    Hanhan Li
    Ariel Gordon
    Hang Zhao
    Workshop on Perception for Autonomous Driving, ECCV 2020 (2020)
    Preview abstract We present a method for jointly training the estimation of depth, egomotion, and a dense 3D translation field of objects, suitable for dynamic scenes containing multiple moving objects. Monocular photometric consistency is the sole source of supervision. We show that this apparently heavily-underdetermined problem can be regularized by imposing the following prior knowledge about 3D translation fields: They are sparse, since most of the scene is static, and they tend to be constant through rigid moving objects. We show that this regularization alone is sufficient to train monocular depth prediction models that exceed the accuracy achieved in prior work, including methods that require semantic input. View details
    Tiny Video Networks: Architecture Search for Efficient Video Models
    Michael Ryoo
    ICML Workshop on Automated Machine Learning (AutoML) (2020)
    Preview abstract Video understanding is a challenging problem with great impact on real-world applications. Yet, solutions so far have been computationally intensive, with the fastest algorithms running at a few hundred milliseconds per video snippet on powerful GPUs. We use architecture search to build highly efficient models for videos - Tiny Video Networks - which run at unprecedented speeds and, at the same time, are effective at video recognition tasks. The Tiny Video Networks run faster than real time, e.g., at less than 20 milliseconds per video on a GPU, and are much faster than contemporary video models. These models not only provide new tools for real-time applications such as mobile vision and robotics, but also enable fast research and development for video understanding. The project site is available at https://sites.google.com/view/tinyvideonetworks. View details
    AssembleNet++: Assembling Modality Representations via Attention Connectivity
    Michael Ryoo
    Juhana Kangaspunta
    European Conference on Computer Vision (ECCV) (2020)
    Preview abstract We create a family of powerful video models which are able to: (i) learn interactions between semantic object information and raw appearance and motion features, and (ii) deploy attention in order to better learn the importance of features at each convolutional block of the network. A new network component named peer-attention is introduced, which dynamically learns the attention weights using another block or modality. Even without any pre-training, our models outperform previous work on standard public activity recognition datasets with continuous videos, establishing a new state of the art. We also confirm that our findings, namely having neural connectivity from the object modality and using peer-attention, are generally applicable to different existing architectures, improving their performance. View details
    Preview abstract Object recognition has seen significant progress in the image domain, with focus primarily on 2D perception. We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image by constructing a CAD-based representation of the objects and their poses. We present Mask2CAD, which jointly detects objects in real-world images and for each detected object, optimizes for the most similar CAD model and its pose. We construct a joint embedding space between the detected regions of an image corresponding to an object and 3D CAD models, enabling retrieval of CAD models for an input RGB image. This produces a clean, lightweight representation of the objects in an image; this CAD-based representation ensures a valid, efficient shape representation for applications such as content creation or interactive scenarios, and makes a step towards understanding the transformation of real-world imagery to a synthetic domain. Experiments on real-world images from Pix3D demonstrate the advantage of our approach in comparison to state of the art. To facilitate future research, we additionally propose a new image-to-3D baseline on ScanNet which features larger shape diversity, real-world occlusions, and challenging image views. View details
    Differentiable Mapping Networks: Learning Structured Map Representations for Sparse Visual Localization
    Peter Karkus
    Rico Jonschkowski
    International Conference on Robotics and Automation (ICRA) (2020)
    Preview abstract Mapping and localization, preferably from a small number of observations, are fundamental tasks in robotics. We address these tasks by combining spatial structure (differentiable mapping) and end-to-end learning in a novel neural network architecture: the Differentiable Mapping Network (DMN). The DMN constructs a spatially structured view-embedding map and uses it for subsequent visual localization with a particle filter. Since the DMN architecture is end-to-end differentiable, we can jointly learn the map representation and localization using gradient descent. We apply the DMN to sparse visual localization, where a robot needs to localize in a new environment with respect to a small number of images from known viewpoints. We evaluate the DMN using simulated environments and a challenging real-world Street View dataset. We find that the DMN learns effective map representations for visual localization. The benefit of spatial structure increases with larger environments, more viewpoints for mapping, and when training data is scarce. Project website: https://sites.google.com/view/differentiable-mapping. View details
    Preview abstract Learning to represent videos is a very challenging task both algorithmically and computationally. Standard video CNN architectures have been designed by directly extending architectures devised for image understanding to include the time dimension, using modules such as 3D convolutions, or by using two-stream design to capture both appearance and motion in videos. We interpret a video CNN as a collection of multi-stream convolutional blocks connected to each other, and propose the approach of automatically finding neural architectures with better connectivity and spatio-temporal interactions for video understanding. This is done by evolving a population of overly-connected architectures guided by connection weight learning. Architectures combining representations that abstract different input types (i.e., RGB and optical flow) at multiple temporal resolutions are searched for, allowing different types or sources of information to interact with each other. Our method, referred to as AssembleNet, outperforms prior approaches on public video datasets, in some cases by a great margin. We obtain 58.6% mAP on Charades and 34.27% accuracy on Moments-in-Time. View details
    Preview abstract This paper proposes a novel algorithm which learns a formal regular grammar from real-world continuous data, such as videos. Learning latent terminals, non-terminals, and production rules directly from continuous data allows the construction of a generative model capturing sequential structures with multiple possibilities. Our model is fully differentiable, and provides easily interpretable results which are important in order to understand the learned structures. It outperforms the state-of-the-art on several challenging datasets and is more accurate for forecasting future activities in videos. We plan to open-source the code at https://sites.google.com/corp/view/differentiable-grammars. View details
    KeyPose: Multi-View 3D Labeling and Keypoint Estimationfor Transparent Objects
    Xingyu Liu
    Rico Jonschkowski
    Kurt Konolige
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
    Preview abstract Estimating the 3D pose of desktop objects is crucial for applications such as robotic manipulation. Many existing approaches to this problem require a depth map of the object for both training and prediction, which restricts them to opaque, lambertian objects that produce good returns in an RGBD sensor. In this paper we forgo using a depth sensor in favor of raw stereo input. We address two problems: first, we establish an easy method for capturing and labeling 3D keypoints on desktop objects with an RGB camera; and second, we develop a deep neural network, called KeyPose, that learns to accurately predict object poses using 3D keypoints, from stereo input, and works even for transparent objects. To evaluate the performance of our method, we create a dataset of 15 clear objects in five classes, with 48K 3D-keypoint labeled images. We train both instance and category models, and show generalization to new textures, poses, and objects. KeyPose surpasses state-of-the-art performance in 3D pose estimation on this dataset by factors of 1.5 to 3.5, even in cases where the competing method is provided with ground-truth depth. Stereo input is essential for this performance as it improves results compared to using monocular input by a factor of 2. We will release a public version of the data capture and labeling pipeline, the transparent object database, and the KeyPose models and evaluation code. Project website: https://sites.google.com/corp/view/keypose. View details
    X-Ray: Mechanical Search for an Occluded Object by Minimizing Support of Learned Occupancy Distributions
    Michael Danielczuk
    Ken Goldberg
    International Conference on Intelligent Robots and Systems (IROS) (2020)
    Preview abstract For applications in e-commerce, warehouses, healthcare, and home service, robots are often required to search through heaps of objects to grasp a specific target object. For mechanical search, we introduce X-Ray, an algorithm based on learned occupancy distributions. We train a neural network using a synthetic dataset of RGBD heap images labeled for a set of standard bounding box targets with varying aspect ratios. X-Ray minimizes support of the learned distribution as part of a mechanical search policy in both simulated and real environments. We benchmark these policies against two baseline policies on 1,000 heaps of 15 objects in simulation where the target object is partially or fully occluded. Results suggest that X-Ray is significantly more efficient, as it succeeds in extracting the target object 82% of the time, 15% more often than the best-performing baseline. Experiments on an ABB YuMi robot with 20 heaps of 25 household objects suggest that the learned policy transfers easily to a physical system, where it outperforms baseline policies by 15% in success rate with 17% fewer actions. Datasets, videos, and experiments are available at https://sites.google.com/corp/berkeley.edu/x-ray. View details
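A toy NumPy sketch of the core policy idea above: predict a target-occupancy distribution for each candidate action and pick the action whose distribution has the smallest support. The threshold and the hand-made candidate distributions are stand-ins for the learned network's outputs.

```python
import numpy as np

def support_area(occupancy, eps=1e-3):
    """Number of grid cells where the target-occupancy probability is non-negligible."""
    return int((occupancy > eps).sum())

def choose_action(candidate_distributions):
    """Greedy mechanical-search step: pick the action whose predicted post-action
    occupancy distribution has the smallest support (fewest cells where the target
    could still be hiding). Real candidates would come from the learned network."""
    return min(candidate_distributions, key=lambda a: support_area(candidate_distributions[a]))

rng = np.random.default_rng(0)
wide = rng.random((10, 20))
wide /= wide.sum()                          # target could be almost anywhere
narrow = np.zeros((10, 20))
narrow[4:6, 8:12] = 1 / 8.0                 # target confined to a few cells
print(choose_action({"push_left": wide, "push_right": narrow}))  # push_right
```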
    Probabilistic Object Detection: Definition and Evaluation
    David Hall
    Feras Dayoub
    John Skinner
    Haoyang Zhang
    Dimity Miller
    Peter Corke
    Gustavo Carneiro
    Niko Suenderhauf
    WACV (2020)
    Preview abstract We introduce Probabilistic Object Detection, the task of detecting objects in images and accurately quantifying the spatial and semantic uncertainties of the detections. Given the lack of methods capable of assessing such probabilistic object detections, we present the new Probability-based Detection Quality measure (PDQ). Unlike AP-based measures, PDQ has no arbitrary thresholds and rewards spatial and label quality, and foreground/background separation quality while explicitly penalising false positive and false negative detections. We contrast PDQ with existing mAP and moLRP measures by evaluating state-of-the-art detectors and a Bayesian object detector based on Monte Carlo Dropout. Our experiments indicate that conventional object detectors tend to be spatially overconfident and thus perform poorly on the task of probabilistic object detection. Our paper aims to encourage the development of new object detection approaches that provide detections with accurately estimated spatial and label uncertainties and are of critical importance for deployment on robots and embodied AI systems in the real world. View details
    Improving Semantic Segmentation through Spatio-Temporal Consistency Learned from Videos
    Ankita Pasad
    Ariel Gordon
    Tsung-Yi Lin
    CVPR 2020 Workshop on Learning from Unlabeled Videos (2020) (to appear)
    Preview abstract We leverage unsupervised learning of depth, egomotion, and camera intrinsics to improve the performance of single-image semantic segmentation, by enforcing 3D-geometric and temporal consistency of segmentation masks across video frames. The predicted depth, egomotion, and camera intrinsics are used to provide an additional supervision signal to the segmentation model, significantly enhancing its quality, or, alternatively, reducing the number of labels the segmentation model needs. Our experiments were performed on the ScanNet dataset. View details
    AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
    Xiaofang Wang
    Xuehan Xiong
    Maxim Neumann
    Michael Ryoo
    Kris Kitani
    Wei Hua
    European Conference on Computer Vision (ECCV) (2020) (to appear)
    Preview abstract Convolutional operations have two limitations: (1) they do not explicitly model where to focus, as the same filter is applied to all positions, and (2) they are unsuitable for modeling long-range dependencies, as they only operate on a small neighborhood. While both limitations can be alleviated by attention operations, many design choices remain to be determined when using attention, especially when applying attention to videos. Towards a principled way of applying attention to videos, we address the task of spatiotemporal attention cell search. We propose a novel search space for spatiotemporal attention cells, which allows the search algorithm to flexibly explore various design choices in the cell. The discovered attention cells can be seamlessly inserted into existing backbone networks, e.g., I3D or S3D, and improve video classification accuracy by more than 2% on both the Kinetics-600 and MiT datasets. The discovered attention cells outperform non-local blocks on both datasets, and demonstrate strong generalization across different modalities, backbones, and datasets. Inserting our attention cells into I3D-R50 yields state-of-the-art performance on both datasets. View details
    What Matters in Unsupervised Optical Flow
    Rico Jonschkowski
    Austin Stone
    Ariel Gordon
    Kurt Konolige
    ECCV (2020)
    Preview abstract We systematically compare and analyze a set of key components in unsupervised optical flow to identify which photometric loss, occlusion handling, and smoothness regularization is most effective. Alongside this investigation we construct a number of novel improvements to unsupervised flow models, such as cost volume normalization, stopping the gradient at the occlusion mask, encouraging smoothness before upsampling the flow field, and continual self-supervision with image resizing. By combining the results of our investigation with our improved model components, we are able to present a new unsupervised flow technique that significantly outperforms the previous unsupervised state-of-the-art and performs on par with supervised FlowNet2 on the KITTI 2015 dataset, while also being significantly simpler than related approaches. View details
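A minimal NumPy sketch of the loss components discussed above: a Charbonnier photometric loss on non-occluded pixels (with the occlusion mask treated as a constant, mirroring the gradient-stopping recommendation) plus a first-order smoothness term. The nearest-neighbor warp and the specific constants are simplifications.

```python
import numpy as np

def warp(image, flow):
    """Backward-warp image (H, W) by flow (H, W, 2) with nearest-neighbor sampling."""
    H, W = image.shape
    ys, xs = np.mgrid[0:H, 0:W]
    xs2 = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    ys2 = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return image[ys2, xs2]

def photometric_loss(img1, img2, flow, occlusion_mask):
    """Charbonnier photometric loss on non-occluded pixels. In a real training loop
    the occlusion mask would be gradient-stopped, as the paper recommends."""
    diff = img1 - warp(img2, flow)
    per_pixel = np.sqrt(diff ** 2 + 1e-6)
    return (per_pixel * occlusion_mask).sum() / (occlusion_mask.sum() + 1e-9)

def smoothness_loss(flow):
    """First-order smoothness of the flow field (edge-aware weighting omitted)."""
    return np.abs(np.diff(flow, axis=0)).mean() + np.abs(np.diff(flow, axis=1)).mean()

rng = np.random.default_rng(0)
img2 = rng.random((32, 32))
true_flow = np.full((32, 32, 2), 2.0)                 # shift by (2, 2) pixels
img1 = warp(img2, true_flow)
mask = np.ones((32, 32))
print(photometric_loss(img1, img2, true_flow, mask))  # near zero for the correct flow
print(smoothness_loss(true_flow))                     # zero for a constant field
```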
    Adversarial Generative Grammars for Human Activity Prediction
    Alexander Toshev
    Michael Ryoo
    European Conference on Computer Vision (ECCV) (2020)
    Preview abstract In this paper we propose an adversarial generative grammar model for future prediction. The objective is to learn a model that explicitly captures temporal dependencies, providing a capability to forecast multiple, distinct future activities. Our adversarial grammar is designed so that it can learn stochastic production rules from the data distribution, jointly with its latent non-terminal representations. Being able to select multiple production rules during inference leads to different predicted outcomes, thus efficiently modeling many plausible futures. The adversarial generative grammar is evaluated on the Charades, MultiTHUMOS, Human3.6M, and 50 Salads datasets and on two activity prediction tasks: future 3D human pose prediction and future activity prediction. The proposed adversarial grammar outperforms the state-of-the-art approaches, being able to predict much more accurately and further in the future, than prior work. Code will be open sourced. View details
    ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors
    Jitendra Malik
    Tsung-Yi Lin
    International Conference on Computer Vision (ICCV) (2019)
    Preview abstract Instance segmentation aims to detect and segment individual objects in a scene. Most existing methods rely on precise mask annotations of every category. However, it is difficult and costly to segment objects in novel categories because a large number of mask annotations is required. We introduce ShapeMask, which learns the intermediate concept of object shape to address the problem of generalization in instance segmentation to novel categories. ShapeMask starts with a bounding box detection and gradually refines it by first estimating the shape of the detected object through a collection of shape priors. Next, ShapeMask refines the coarse shape into an instance level mask by learning instance embeddings. The shape priors provide a strong cue for object-like prediction, and the instance embeddings model the instance specific appearance information. ShapeMask significantly outperforms the state-of-the-art by 6.4 and 3.8 AP when learning across categories, and obtains competitive performance in the fully supervised setting. It is also robust to inaccurate detections, decreased model capacity, and small training data. Moreover, it runs efficiently with 150ms inference time on a GPU and trains within 11 hours on TPUs. With a larger backbone model, ShapeMask increases the gap with state-of-the-art to 9.4 and 6.2 AP across categories. Code will be publicly available at: https://sites.google.com/view/shapemask/home. View details
    Unsupervised monocular depth and ego-motion learning with structure and semantics
    Soeren Pirk
    CVPR Workshop on Visual Odometry & Computer Vision Applications Based on Location Clues (2019)
    Preview abstract We present an approach which takes advantage of both structure and semantics for unsupervised monocular learning of depth and ego-motion. More specifically we model the motions of individual objects and learn their 3D motion vector jointly with depth and egomotion. We obtain more accurate results, especially for challenging dynamic scenes not addressed by previous approaches. This is an extended version of Casser et al. Code and models have been open sourced at: https://sites.google.com/corp/view/struct2depth. View details
    Preview abstract Learning to predict scene depth from RGB inputs is a challenging task both for indoor and outdoor robot navigation. In this work we address unsupervised learning of scene depth and robot ego-motion where supervision is provided by monocular videos, as cameras are the cheapest, least restrictive and most ubiquitous sensor for robotics. Previous work in unsupervised image-to-depth learning has established strong baselines in the domain. We propose a novel approach which produces higher quality results, is able to model moving objects and is shown to transfer across data domains, e.g. from outdoors to indoor scenes. The main idea is to introduce geometric structure in the learning process, by modeling the scene and the individual objects; camera ego-motion and object motions are learned from monocular videos as input. Furthermore an online refinement method is introduced to adapt learning on the fly to unknown domains. The proposed approach outperforms all state-of-the-art approaches, including those that handle motion e.g. through learned flow. Our results are comparable in quality to the ones which used stereo as supervision and significantly improve depth prediction on scenes and datasets which contain a lot of object motion. The approach is of practical relevance, as it allows transfer across environments, by transferring models trained on data collected for robot navigation in urban scenes to indoor navigation settings. The code associated with this paper can be found at https://sites.google.com/view/struct2depth. View details
    OnboardDepth: Depth Prediction for Onboard Systems
    Devesh Yamparala
    Justin Vincent
    Chris Leger
    European Conference on Mobile Robots (ECMR) (2019)
    Preview abstract Depth sensing is important for robotics systems for both navigation and manipulation tasks. We here present a learning-based system which predicts accurate scene depth and can take advantage of many types of sensor supervision. We develop an algorithm which combines both supervised and unsupervised constraints to produce high quality depth and which is robust to the presence of noise, sparse sensing, and missing information. Our system is running onboard in real-time, is easy to deploy, and is applicable to a variety of robot platforms. View details
    Evolving Space-Time Neural Architectures for Videos
    Alexander Toshev
    Michael Ryoo
    International Conference on Computer Vision (ICCV) (2019)
    Preview abstract We present a new method for finding video CNN architectures that capture rich spatio-temporal information in videos. Previous work, taking advantage of 3D convolutions, obtained promising results by manually designing video CNN architectures. We here develop a novel evolutionary search algorithm that automatically explores models with different types and combinations of layers to jointly learn interactions between spatial and temporal aspects of video representations. We demonstrate the generality of this algorithm by applying it to two meta-architectures, obtaining new architectures superior to manually designed architectures: EvaNet. Further, we propose a new component, the iTGM layer, which more efficiently utilizes its parameters to allow learning of space-time interactions over longer time horizons. The iTGM layer is often preferred by the evolutionary algorithm and allows building cost-efficient networks. The proposed approach discovers new and diverse video architectures that were previously unknown. More importantly they are both more accurate and faster than prior models, and outperform the state-of-the-art results on multiple datasets we test, including HMDB, Kinetics, and Moments in Time. We will open source the code and models, to encourage future model development at https://sites.google.com/corp/view/evanet-video. View details
    Differentiable Mapping Networks: Learning Task-Oriented Latent Maps with Spatial Structure
    Peter Karkus
    Rico Jonschkowski
    Perception as Generative Reasoning Workshop, NeurIPS 2019
    Preview abstract To efficiently operate in previously unseen environments, robots must be able to build a map – an internal representation of the environment – even from a small number of observations. But how should that map be represented and which information should be stored in it, to enable downstream tasks, for example localization? Classic approaches use a fixed map representation with strong spatial structure, such as voxels or point clouds, which makes them applicable to a wide range of robotic tasks. Data-driven approaches, on the other hand, are able to learn rich and robust representations by optimizing them directly for a downstream task. Eslami et al., for example, learn to construct representations of simulated environments from a few images that allow them to generate images from novel viewpoints. The challenge for learning in complex environments is choosing suitable priors that enable generalization while having only a limited amount of data for training. A desirable approach would combine the best of both worlds: retain the spatial structure of the classic approaches, but also leverage the power of deep neural networks to learn a flexible and effective map representation for the downstream task. In this paper we explore how structure and learning can be combined in the context of a sparse visual localization task. View details
    Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
    Ariel Gordon
    Hanhan Li
    Rico Jonschkowski
    The IEEE International Conference on Computer Vision (ICCV) (2019)
    Preview abstract We present a novel method for simultaneously learning depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as a supervision signal. Similarly to prior work, our method learns by applying differentiable warping to frames and comparing the result to adjacent ones, but it provides several improvements: We address occlusions geometrically and differentiably, directly using the depth maps as predicted during training. We introduce randomized layer normalization, a novel regularizer, and we account for object motion relative to the scene. To the best of our knowledge, our work is the first to learn the camera intrinsic parameters, including lens distortion, from video in an unsupervised manner, thereby allowing us to extract accurate depth and motion from arbitrary videos of unknown origin at scale. We evaluate our results on the Cityscapes, KITTI, and EuRoC MAV datasets, establishing new state of the art on depth prediction and odometry, and demonstrate qualitatively that depth prediction can be learned from a collection of YouTube videos. The code is publicly available at github.com/google-research/google-research/tree/master/depth_from_video_in_the_wild. View details
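The differentiable-warping supervision above relies on projecting pixels from one frame into the next using predicted depth, egomotion, and intrinsics; a bare-bones NumPy version of that reprojection (with plain arrays standing in for the network outputs, and lens distortion omitted) looks like this.

```python
import numpy as np

def reproject(pixels, depth, K, R, t):
    """Warp pixel coordinates from one frame into the next, given per-pixel depth,
    camera intrinsics K, and relative camera motion (R, t). In the paper all of
    depth, egomotion, and K itself are network outputs trained only for
    cross-frame consistency; here they are plain arrays for illustration."""
    ones = np.ones((pixels.shape[0], 1))
    rays = np.linalg.inv(K) @ np.concatenate([pixels, ones], axis=1).T   # unproject to rays
    points = rays * depth                                                # 3D points in frame 1
    moved = R @ points + t[:, None]                                      # move into frame 2
    proj = K @ moved
    return (proj[:2] / proj[2:3]).T                                      # pixel coords in frame 2

K = np.array([[200.0, 0, 64], [0, 200.0, 48], [0, 0, 1]])   # toy "learned" intrinsics
R = np.eye(3)                                               # no rotation between frames
t = np.array([0.1, 0.0, 0.0])                               # small sideways egomotion
px = np.array([[64.0, 48.0], [10.0, 20.0]])
print(reproject(px, np.array([2.0, 4.0]), K, R, t))
```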
    Evolving Losses for Unlabeled Video Representation Learning
    Michael Ryoo
    CVPR 2019 Workshop on Learning from Unlabeled Videos (2019)
    Preview abstract We present a new method to learn video representations from large-scale unlabeled video data. We formulate our unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are also shared across different modalities via distillation. Our formulation allows for the distillation of audio, optical flow and temporal information into a single, RGB-based convolutional neural network. We also compare the effects of using additional unlabeled video data and evaluate our representation learning on standard public video datasets. We further introduce the concept of using an evolutionary algorithm to obtain a better multi-modal, multi-task loss function for training the network. AutoML has successfully been applied to architecture search and data augmentation. Here we extend the concept of AutoML to unsupervised representation learning by automatically finding the optimal weighting of tasks for representation learning. View details
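    The evolved loss can be pictured as a weighted sum of per-task, per-modality losses whose weights are themselves searched by an evolutionary loop. The toy sketch below uses placeholder task names and a random stand-in for the expensive fitness evaluation (training with a candidate weighting and measuring representation quality).

import random

# Toy sketch of evolving a weighting over several self-supervised losses.
# Task names, loss values, and the proxy fitness are placeholders.
TASKS = ["rgb_recon", "flow_distill", "audio_distill", "temporal_order"]

def random_weights():
    return {task: random.random() for task in TASKS}

def combined_loss(weights, per_task_losses):
    return sum(weights[t] * per_task_losses[t] for t in TASKS)

def fitness(weights):
    # Placeholder: stands in for "train with these weights, then measure
    # how useful the learned representation is downstream".
    return -combined_loss(weights, {t: random.random() for t in TASKS})

def evolve_weights(pop=16, gens=20):
    population = [random_weights() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop // 4]
        children = []
        for _ in range(pop - len(survivors)):
            child = dict(random.choice(survivors))
            child[random.choice(TASKS)] = random.random()   # mutate one weight
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

print(evolve_weights())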
    Learning Differentiable Grammars for Videos
    Michael Ryoo
    Bay Area Machine Learning Symposium (BayLearn) (2019)
    Preview abstract This paper proposes a novel algorithm which learns a formal regular grammar from real-world continuous data, such as videos. Learning latent terminals, nonterminals, and production rules directly from continuous data allows the construction of a generative model capturing sequential structures with multiple possibilities. Our model is fully differentiable, and provides easily interpretable results which are important in order to understand the learned structures. It outperforms the state-of-the-art on several challenging datasets and is more accurate for forecasting future activities in videos. View details
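    One simple way to make production-rule choices differentiable, shown in the hedged sketch below, is to give each nonterminal trainable logits over its rules and to soften rule selection with a softmax, so the expected terminal distribution at every step is differentiable in the grammar parameters. This illustrates the general idea only, not the paper's exact formulation.

import numpy as np

# Illustrative sketch of a differentiable regular grammar: each nonterminal
# has trainable logits over its production rules (A -> terminal, next
# nonterminal), and rule choices are softened with a softmax so expected
# symbol distributions are differentiable.
rng = np.random.default_rng(0)
NUM_NONTERMINALS, NUM_TERMINALS, RULES_PER_NT = 3, 4, 2

rule_logits = rng.normal(size=(NUM_NONTERMINALS, RULES_PER_NT))       # trainable
rule_terminal = rng.integers(NUM_TERMINALS, size=(NUM_NONTERMINALS, RULES_PER_NT))
rule_next_nt = rng.integers(NUM_NONTERMINALS, size=(NUM_NONTERMINALS, RULES_PER_NT))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rollout(steps=5):
    nt_dist = np.zeros(NUM_NONTERMINALS)
    nt_dist[0] = 1.0                                    # start symbol
    sequence = []
    for _ in range(steps):
        term_dist = np.zeros(NUM_TERMINALS)
        next_nt = np.zeros(NUM_NONTERMINALS)
        for a in range(NUM_NONTERMINALS):
            probs = softmax(rule_logits[a])             # soft rule selection
            for r in range(RULES_PER_NT):
                term_dist[rule_terminal[a, r]] += nt_dist[a] * probs[r]
                next_nt[rule_next_nt[a, r]] += nt_dist[a] * probs[r]
        sequence.append(term_dist)                      # expected terminals at this step
        nt_dist = next_nt
    return sequence

for dist in rollout():
    print(np.round(dist, 3))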
    Evolving Losses for Video Representation Learning
    Michael Ryoo
    Bay Area Machine Learning Symposium (BayLearn) (2019)
    Preview abstract We present a new method to learn video representations from unlabeled data. We formulate our unsupervised representation learning as a multi-modal, multi-task learning problem. We also introduce the concept of finding a better loss function to train such a multi-task, multi-modal representation space using an evolutionary algorithm; our method automatically searches over different combinations of loss functions capturing multiple (self-supervised) tasks and modalities. View details
    Unsupervised Monocular Depth and Ego-motion Learning with Structure and Semantics
    Soeren Pirk
    CVPR Workshop on Visual Odometry & Computer Vision Applications Based on Location Clues (VOCVALC) (2019)
    Preview abstract We present an approach which takes advantage of both structure and semantics for unsupervised monocular learning of depth and ego-motion. More specifically, we model the motion of individual objects and learn their 3D motion vector jointly with depth and ego-motion. We obtain more accurate results, especially for challenging dynamic scenes not addressed by previous approaches. This is an extended version of Casser et al. [AAAI'19]. Code and models have been open sourced at: https://sites.google.com/view/struct2depth. View details
    EvaNet: A Family of Diverse, Fast and Accurate Video Architectures
    Alexander Toshev
    Michael Ryoo
    Bay Area Machine Learning Symposium (BayLearn) (2019)
    Preview abstract We present a novel evolutionary algorithm that automatically constructs architectures of layers exploring space-time interactions for videos. The discovered architectures are accurate, diverse and efficient. Ensembling such models leads to further accuracy gains and yields faster and more accurate solutions than previous state-of-the-art models. Evolved models can be used across datasets and to build more powerful models for video understanding. View details
    Preview abstract In this paper, we present a new method for evolving video CNN models to find architectures that more optimally capture rich spatio-temporal information in videos. Previous work, taking advantage of 3D convolutional layers, obtained promising results by manually designing CNN architectures for videos. We here develop an evolutionary algorithm that automatically explores models with different types and combinations of space-time convolutional layers to jointly capture various spatial and temporal aspects of video representations. We further propose a new key component in video model evolution, the iTGM layer, which more efficiently utilizes its parameters to allow learning of space-time interactions over longer time horizons. The experiments confirm the advantages of our video CNN architecture evolution, with results outperforming previous state-of-the-art models. Our algorithm discovers new and interesting video architecture structures. View details
    Preview abstract We present a novel approach for unsupervised learning of depth and ego-motion from monocular video. Unsupervised learning removes the need for separate supervisory signals (depth or ego-motion ground truth, or multi-view video). Prior work in unsupervised depth learning uses pixel-wise or gradient-based losses, which only consider pixels in small local neighborhoods. Our main contribution is to explicitly consider the inferred 3D geometry of the scene, enforcing consistency of the estimated 3D point clouds and ego-motion across consecutive frames. This is a challenging task and is solved by a novel (approximate) backpropagation algorithm for aligning 3D structures. We combine this novel 3D-based loss with 2D losses based on photometric quality of frame reconstructions using estimated depth and ego-motion from adjacent frames. We also incorporate validity masks to avoid penalizing areas in which no useful information exists. We test our algorithm on the KITTI dataset and on a video dataset captured on an uncalibrated mobile phone camera. Our proposed approach consistently improves depth estimates on both datasets, and outperforms the state-of-the-art for both depth and ego-motion. Because we only require a simple video, learning depth and ego-motion on large and varied datasets becomes possible. We demonstrate this by training on the low quality uncalibrated video dataset and evaluating on KITTI, ranking among top performing prior methods which are trained on KITTI itself. View details
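    The 3D consistency idea can be sketched as follows: back-project both depth maps into point clouds, move one cloud by the estimated egomotion, and penalize the remaining mismatch. The numpy snippet below uses placeholder depths and egomotion and a simple point-to-point error; the paper's approximate backpropagation through an ICP-style alignment is not reproduced here.

import numpy as np

# Sketch of a 3D consistency term: back-project two depth maps into point
# clouds, move frame t's cloud by the estimated egomotion, and measure how
# far it is from frame t+1's cloud. Depths, intrinsics, and egomotion are
# placeholders standing in for network predictions.
K_inv = np.linalg.inv(np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1.0]]))

def backproject(depth):
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([us, vs, np.ones_like(us)], axis=-1).reshape(-1, 3)
    return depth.reshape(-1, 1) * (pixels @ K_inv.T)    # N x 3 points

depth_t = np.full((4, 6), 10.0)
depth_t1 = np.full((4, 6), 10.0)
R_est, t_est = np.eye(3), np.array([0.1, 0.0, 0.0])     # estimated egomotion

cloud_t = backproject(depth_t) @ R_est.T + t_est        # move cloud into frame t+1
cloud_t1 = backproject(depth_t1)
consistency_loss = np.mean(np.linalg.norm(cloud_t - cloud_t1, axis=1))
print(consistency_loss)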
    Preview abstract Predicting the future to anticipate the outcome of events and actions is a critical attribute of autonomous agents; particularly for agents which must rely heavily on real time visual data for decision making. Working towards this capability, we address the task of predicting future frame segmentation from a stream of monocular video by leveraging the 3D structure of the scene. Our framework is based on learnable sub-modules capable of predicting pixel-wise scene semantic labels, depth, and camera ego-motion of adjacent frames. We further propose a recurrent neural network based model capable of predicting future ego-motion trajectory as a function of a series of past ego-motion steps. Ultimately, we observe that leveraging 3D structure in the model facilitates successful prediction, achieving state of the art accuracy in future semantic segmentation. View details
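    The future ego-motion module can be pictured as a small sequence model that reads past ego-motion steps and is rolled forward to produce a future trajectory. The sketch below uses a GRU and placeholder 6-DoF steps; the sizes and the exact recurrent architecture are assumptions for illustration.

import torch

# Sketch of forecasting ego-motion with a recurrent model: read a series
# of past ego-motion steps and predict the next one, then roll forward.
class EgomotionForecaster(torch.nn.Module):
    def __init__(self, dof=6, hidden=32):
        super().__init__()
        self.rnn = torch.nn.GRU(dof, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, dof)

    def forward(self, past_steps):
        out, _ = self.rnn(past_steps)          # B x T x hidden
        return self.head(out[:, -1])           # predicted next ego-motion step

model = EgomotionForecaster()
past = torch.randn(1, 8, 6)                    # 8 past ego-motion steps (placeholder)
trajectory = []
for _ in range(4):                             # roll out 4 future steps
    step = model(past)
    trajectory.append(step)
    past = torch.cat([past[:, 1:], step.unsqueeze(1)], dim=1)
print(torch.stack(trajectory, dim=1).shape)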
    Preview abstract Predicting the future to anticipate the outcome of events and actions is a critical attribute of autonomous agents. In this work, we address the task of predicting future frame segmentation from a stream of monocular video by leveraging the 3D structure of the scene. Our framework is based on learnable sub-modules capable of predicting pixelwise scene semantic labels, depth, and camera ego-motion of adjacent frames. Ultimately, we observe that leveraging 3D structure in the model facilitates successful positioning of objects in the 3D scene, achieving state of the art accuracy in future semantic segmentation. View details
    Preview abstract We consider the problem of retrieving objects from image data and learning to classify them into meaningful semantic categories with minimal supervision. To that end, we propose a fully differentiable unsupervised deep clustering approach to learn semantic classes in an end-to-end fashion without individual class labeling using only unlabeled object proposals. The key contributions of our work are 1) a k-means clustering objective where the clusters are learned as parameters of the network and are represented as memory units, and 2) simultaneously building a feature representation, or embedding, while learning to cluster it. This approach shows promising results on two popular computer vision datasets: on CIFAR10 for clustering objects, and on the more complex and challenging Cityscapes dataset for semantically discovering classes which visually correspond to cars, people, and bicycles. Currently, the only supervision provided is segmentation objectness masks, but this method can be extended to use an unsupervised objectness-based object generation mechanism which will make the approach completely unsupervised. View details
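    A minimal sketch of a differentiable k-means-style objective is given below: the cluster centers are ordinary network parameters (the memory units), assignments are softened so the objective is differentiable, and the embedding is optimized jointly. Dimensions, data, and the optimizer are placeholder assumptions; in practice additional terms are needed to prevent trivial collapse.

import torch

# Minimal sketch of a differentiable k-means-style objective: cluster
# centers are trainable parameters, assignments are softened so the whole
# objective can be minimized jointly with the embedding by SGD.
torch.manual_seed(0)
num_clusters, dim = 5, 16
centers = torch.nn.Parameter(torch.randn(num_clusters, dim))
embedder = torch.nn.Linear(32, dim)                   # stands in for a deep embedding
optimizer = torch.optim.Adam(list(embedder.parameters()) + [centers], lr=1e-2)

features = torch.randn(128, 32)                       # unlabeled object proposals (placeholder)
for step in range(100):
    z = embedder(features)                            # B x dim embeddings
    dists = torch.cdist(z, centers)                   # B x K distances to centers
    assign = torch.softmax(-dists, dim=1)             # soft cluster assignments
    loss = (assign * dists.pow(2)).sum(dim=1).mean()  # soft k-means objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))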
    Preview abstract We consider the problem of next frame prediction from video input. A recurrent convolutional neural network is trained to predict depth from monocular video input, which, along with the current video image and the camera trajectory, can then be used to compute the next frame. Unlike prior next-frame prediction approaches, we take advantage of the scene geometry and use the predicted depth for generating the next frame prediction. Our approach can produce rich next frame predictions which include depth information attached to each pixel. Another novel aspect of our approach is that it predicts depth from a sequence of images (e.g. in a video), rather than from a single still image. We evaluate the proposed approach on the KITTI dataset, a standard dataset for benchmarking tasks relevant to autonomous driving. The proposed method produces results which are visually and numerically superior to existing methods that directly predict the next frame. We show that the accuracy of depth prediction improves as more prior frames are considered. View details
    Preview abstract We approach structured output prediction by learning a deep value network (DVN) that evaluates different output structures for a given input. For example, when applied to image segmentation, the value network takes an image and a segmentation mask as inputs and predicts a scalar score evaluating the mask quality and its correspondence with the image. Once the value network is optimized, at inference, it finds output structures that maximize the score of the value net via gradient descent on continuous relaxations of structured outputs. Thus DVN takes advantage of the joint modeling of the inputs and outputs. Our framework applies to a wide range of structured output prediction problems. We conduct experiments on multi-label classification based on text data and on image segmentation problems. DVN outperforms several strong baselines and the state-of-the-art results on these benchmarks. In addition, on image segmentation, the proposed deep value network learns complex shape priors and effectively combines image information with the prior to obtain competitive segmentation results. View details
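    At inference time the key step is gradient ascent on a continuous relaxation of the output, with the value network frozen. The sketch below shows that loop with a tiny untrained placeholder value network and a relaxed segmentation mask clamped to [0, 1]; it illustrates the procedure rather than the trained model from the paper.

import torch

# Sketch of DVN-style inference: with the value network frozen, run
# gradient ascent on a relaxed output mask to find the structure the
# value net scores highest.
torch.manual_seed(0)
value_net = torch.nn.Sequential(              # scores an (image, mask) pair
    torch.nn.Linear(64 + 64, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
for p in value_net.parameters():
    p.requires_grad_(False)                   # only the output is optimized

image = torch.randn(1, 64)                    # placeholder image features
mask = torch.full((1, 64), 0.5, requires_grad=True)   # relaxed mask in [0, 1]
optimizer = torch.optim.SGD([mask], lr=0.1)

for step in range(50):
    score = value_net(torch.cat([image, mask], dim=1)).mean()
    optimizer.zero_grad()
    (-score).backward()                       # ascend on the predicted score
    optimizer.step()
    with torch.no_grad():
        mask.clamp_(0.0, 1.0)                 # keep the relaxation feasible

print(float(value_net(torch.cat([image, mask], dim=1))))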
    Preview abstract Learning a set of diverse and representative features from a large set of unlabeled data has long been an area of active research. We present a method that separates proposals of potential objects into semantic classes in an unsupervised manner. Our preliminary results show that different object categories emerge and can later be retrieved from test images. We propose a differentiable clustering approach which can be integrated with Deep Neural Networks to learn semantic classes in an end-to-end fashion without manual class labeling. View details
    Learning with Proxy Supervision for End-To-End Visual Learning
    Jiří Čermák
    Deep Learning for Vehicle Perception Workshop, Intelligent Vehicles Symposium (2017)
    Preview abstract Learning with deep neural networks forms the state-of-the-art in many tasks such as image classification, image detection, speech recognition, and text analysis. We here set out to gain an understanding of learning in an ‘end-to-end’ manner for an autonomous vehicle, which refers to directly learning the decision that will result from the perception of the scene. For example, we consider learning a binary ‘stop’/‘go’ decision, with respect to pedestrians, given the input image. In this work we propose to use additional information, referred to as ‘proxy supervision’, for improved learning and study its effects on the overall performance. We show that the proxy labels significantly improve the robustness of learning, while achieving as good, or better, accuracy than in the original task of binary classification. View details
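    The proxy-supervision setup can be sketched as a shared trunk with two heads: the main stop/go decision and an auxiliary head trained on the extra proxy labels, combined into one loss. The network sizes, the proxy target, and the weighting below are illustrative assumptions.

import torch

# Sketch of proxy supervision: a shared trunk feeds both the main binary
# stop/go head and an auxiliary head trained on proxy labels (here a
# placeholder 4-dimensional regression target).
trunk = torch.nn.Sequential(torch.nn.Linear(512, 128), torch.nn.ReLU())
decision_head = torch.nn.Linear(128, 1)       # binary stop/go logit
proxy_head = torch.nn.Linear(128, 4)          # e.g. a coarse pedestrian box

features = torch.randn(16, 512)               # placeholder image features
stop_go_labels = torch.randint(0, 2, (16, 1)).float()
proxy_labels = torch.rand(16, 4)

h = trunk(features)
main_loss = torch.nn.functional.binary_cross_entropy_with_logits(
    decision_head(h), stop_go_labels)
proxy_loss = torch.nn.functional.mse_loss(proxy_head(h), proxy_labels)
total_loss = main_loss + 0.5 * proxy_loss     # proxy weight is illustrative
total_loss.backward()
print(float(total_loss))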
    Improved generator objectives for GANs
    Jascha Sohl-Dickstein
    NIPS Workshop on Adversarial Learning (2016)
    Preview abstract We present a new framework to understand GAN training as alternating density ratio estimation with divergence minimization. This provides a new interpretation for the GAN generator objective used in practice and explains the problem of poor sample diversity. Furthermore, we derive a family of objectives that target arbitrary f-divergences without minimizing a lower bound, and use them to train generative image models that target either improved sample quality or sample diversity. View details
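    The density-ratio view can be illustrated as follows: a logistic-loss discriminator's logit on a sample approximates log(p_data / p_gen), and different functions of that estimated log-ratio yield generator objectives with different divergence-targeting behavior. The sketch below shows three standard examples; the specific family of objectives derived in the paper is not reproduced here.

import torch
import torch.nn.functional as F

# Illustrative sketch of generator objectives expressed through the
# discriminator's estimated log density ratio. With D = sigmoid(logit)
# trained by logistic loss, logit ~ log(p_data / p_gen).
def generator_losses(disc_logits_on_fakes):
    log_ratio = disc_logits_on_fakes          # ~ log(p_data / p_gen)
    return {
        "minimax": (-F.softplus(log_ratio)).mean(),        # E[log(1 - D)]
        "non_saturating": F.softplus(-log_ratio).mean(),   # -E[log D]
        "reverse_kl_like": (-log_ratio).mean(),            # targets KL(p_gen || p_data)
    }

fake_logits = torch.randn(8)                  # placeholder discriminator outputs
for name, loss in generator_losses(fake_logits).items():
    print(name, float(loss))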
    Real-Time Pedestrian Detection With Deep Network Cascades
    Alex Krizhevsky
    Abhijit Ogale
    Dave Ferguson
    Proceedings of BMVC 2015
    Preview abstract We present a new real-time approach to object detection that exploits the efficiency of cascade classifiers with the accuracy of deep neural networks. Deep networks have been shown to excel at classification tasks, and their ability to operate on raw pixel input without the need to design special features is very appealing. However, deep nets are notoriously slow at inference time. In this paper, we propose an approach that cascades deep nets and fast features, and that is both extremely fast and extremely accurate. We apply it to the challenging task of pedestrian detection. Our algorithm runs in real time at 15 frames per second. The resulting approach achieves a 26.2% average miss rate on the Caltech Pedestrian detection benchmark, which is competitive with the very best reported results. It is the first work we are aware of that achieves extremely high accuracy while running in real time. View details
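    The cascade idea can be sketched in a few lines: a very cheap classifier scores every candidate window, and only the small surviving fraction is passed to the expensive deep network. Both classifiers below are placeholders, standing in for the fast and accurate stages described in the abstract.

import numpy as np

# Toy sketch of a detection cascade: score all windows cheaply, keep a
# small fraction, and only run the expensive model on those survivors.
def cheap_score(window):
    return float(window.mean())                  # stand-in for a fast first stage

def deep_net_score(window):
    return float(np.tanh(window).mean())         # stand-in for a slow, accurate deep net

def cascade_detect(windows, keep_fraction=0.05, threshold=0.5):
    scores = np.array([cheap_score(w) for w in windows])
    keep = np.argsort(scores)[-max(1, int(len(windows) * keep_fraction)):]
    return [i for i in keep if deep_net_score(windows[i]) > threshold]

windows = [np.random.rand(64, 32) for _ in range(1000)]   # candidate pedestrian boxes
print(cascade_detect(windows))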
    Real-Time Grasp Detection Using Convolutional Neural Networks
    Joseph Redmon
    International Conference on Robotics and Automation (ICRA), IEEE (2015)
    Preview abstract We present an accurate, real-time approach to robotic grasp detection based on convolutional neural networks. Our network performs single-stage regression to graspable bounding boxes without using standard sliding window or region proposal techniques. The model outperforms state-of-the-art approaches by 14 percentage points and runs at 13 frames per second on a GPU. Our network can simultaneously perform classification so that in a single step it recognizes the object and finds a good grasp rectangle. A modification to this model predicts multiple grasps per object by using a locally constrained prediction mechanism. The locally constrained model performs significantly better, especially on objects that can be grasped in a variety of ways. View details
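    As a hedged illustration of single-stage grasp regression, the sketch below maps pooled image features directly to one grasp rectangle plus a class score, with no sliding windows or proposals. The (x, y, w, h, theta) parametrization, feature dimension, and head sizes are assumptions, not the paper's exact design.

import torch

# Minimal sketch of a single-stage grasp regression head: one pass from
# image features to a grasp rectangle and an object class.
class GraspRegressor(torch.nn.Module):
    def __init__(self, feature_dim=256, num_classes=10):
        super().__init__()
        self.grasp_head = torch.nn.Linear(feature_dim, 5)        # x, y, w, h, theta
        self.class_head = torch.nn.Linear(feature_dim, num_classes)

    def forward(self, features):
        grasp = self.grasp_head(features)
        return {"x": grasp[:, 0], "y": grasp[:, 1],
                "w": grasp[:, 2], "h": grasp[:, 3], "theta": grasp[:, 4],
                "class_logits": self.class_head(features)}

model = GraspRegressor()
features = torch.randn(2, 256)               # placeholder pooled image features
out = model(features)
print(out["theta"], out["class_logits"].argmax(dim=1))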
    Preview abstract Pedestrian detection is of crucial importance to autonomous driving applications. Methods based on deep learning have shown significant improvements in accuracy, which makes them particularly suitable for applications, such as pedestrian detection, where reducing the miss rate is very important. Although they are accurate, their runtime has been at best in seconds per image, which makes them impractical for onboard applications. We present here a Large-Field-Of-View (LFOV) deep network for pedestrian detection, which can achieve high accuracy and is designed to make deep networks work faster for detection problems. The idea of the proposed Large-Field-of-View deep network is to learn to make classification decisions simultaneously and accurately at multiple locations. The LFOV network processes larger image areas at much faster speeds than typical deep networks have been able to do, and can intrinsically reuse computations. Our pedestrian detection solution, which is a combination of an LFOV network and a standard deep network, runs at 280 ms per image on a GPU and achieves a 35.85% average miss rate on the Caltech Pedestrian Detection Benchmark. View details
    Object Recognition from Short Videos for Robotic Perception
    Ivan Bogun
    Navdeep Jaitly
    CoRR, vol. abs/1509.01602 (2015)
    Preview abstract Deep neural networks have become the primary learning technique for object recognition. Videos, unlike still images, are temporally coherent which makes the application of deep networks non-trivial. Here, we investigate how motion can aid object recognition in short videos. Our approach is based on Long Short-Term Memory (LSTM) deep networks. Unlike previous applications of LSTMs, we implement each gate as a convolution. We show that convolutional-based LSTM models are capable of learning motion dependencies and are able to improve the recognition accuracy when more frames in a sequence are available. We evaluate our approach on the Washington RGBD Object dataset and on the Washington RGBD Scenes dataset. Our approach outperforms deep nets applied to still images and sets a new state-of-the-art in this domain. View details
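    Implementing each LSTM gate as a convolution keeps the hidden state spatial across frames. The sketch below is a generic convolutional LSTM cell with placeholder channel counts and kernel size, applied to a short random clip; it illustrates the mechanism rather than the exact model in the paper.

import torch

# Sketch of an LSTM cell whose gates are convolutions rather than matrix
# multiplies, so the hidden state keeps its spatial layout across frames.
class ConvLSTMCell(torch.nn.Module):
    def __init__(self, in_ch, hidden_ch, kernel_size=3):
        super().__init__()
        self.gates = torch.nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                                     kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g                    # update the cell state
        h = o * torch.tanh(c)                # new spatial hidden state
        return h, (h, c)

cell = ConvLSTMCell(in_ch=3, hidden_ch=8)
frames = torch.randn(5, 1, 3, 32, 32)        # a short video clip (placeholder)
h = torch.zeros(1, 8, 32, 32)
c = torch.zeros(1, 8, 32, 32)
for frame in frames:
    out, (h, c) = cell(frame, (h, c))
print(out.shape)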
    Feature combination with Multi-Kernel Learning for Fine-Grained Visual Classification
    Alexandru Niculescu-Mizil
    IEEE Winter Conference on Applications of Computer Vision (WACV) (2014)
    Benchmarking Large-Scale Fine-Grained Categorization
    Phil Long
    IEEE Winter Conference on Applications of Computer Vision (WACV) (2014)
    Efficient object detection and segmentation for fine-grained recognition
    Shenghuo Zhu
    Computer Vision and Pattern Recognition (CVPR), IEEE (2013)
    Development and Deployment of a Large-Scale Flower Recognition Mobile App
    Shenghuo Zhu
    Yuanqing Lin
    Josephine Wong
    Chelsea Specht
    NEC Laboratories America (2012)
    Terrain Adaptive Navigation for Planetary Rovers
    Daniel Helmick
    Larry Matthies
    Journal of Field Robotics (JFR) (2009)
    Characterization of traverse slippage experienced by Spirit rover on Husband Hill at Gusev crater
    Rongxing Li
    Bo Wu
    Kaichang Di
    Raymond E. Arvidson
    I-Chieh Lee
    Mark Maimone
    Larry H. Matthies
    Lutz Richter
    Robert Sullivan
    Michael H. Sims
    Rebecca Greenberger
    Steven W. Squyres
    Journal of Geophysical Research - Planets (2008)
    Visual Prediction of Rover Slip: Learning Algorithms and Field Experiments
    Ph.D. Thesis, California Institute of Technology (2008)
    Experimental results from a terrain adaptive navigation system for planetary rovers
    Daniel Helmick
    Larry Matthies
    Chris Brooks
    Ibrahim Halatci
    Steve Dubowsky
    Karl Iagnemma
    International Symposium on Artificial Intelligence, Robotics and Automation in Space (2008)
    Terrain Adaptive Navigation for a Mars Rover
    Daniel Helmick
    Matthew Livianu
    Larry Matthies
    IEEE Aerospace Conference (2007)
    Learning And Prediction of Slip Using Visual Information
    Larry Matthies
    Daniel Helmick
    Pietro Perona
    Journal of Field Robotics (JFR) (2007)
    Dimensionality Reduction Using Automatic Supervision for Vision-Based Terrain Learning
    Larry Matthies
    Daniel Helmick
    Pietro Perona
    Robotics: Science and Systems (RSS) (2007)
    Learning Slip Behavior Using Automatic Mechanical Supervision
    Larry Matthies
    Daniel Helmick
    Pietro Perona
    IEEE International Conference on Robotics and Automation (ICRA) (2007)
    Fast Terrain Classification Using Variable-Length Representation for Autonomous Navigation
    Larry Matthies
    Daniel Helmick
    Pietro Perona
    Computer Vision and Pattern Recognition (CVPR), IEEE (2007)
    Computer Vision on Mars
    Larry Matthies
    Mark Maimone
    Andrew Johnson
    Yang Cheng
    Reg Willson
    Carlos Villalpando
    Steve Goldberg
    Andres Huertas
    Andrew Stein
    International Journal of Computer Vision (IJCV) (2007)
    Learning to Predict Slip for Ground Robots
    Larry Matthies
    Daniel Helmick
    Gabe Sibley
    Pietro Perona
    IEEE International Conference on Robotics and Automation (ICRA) (2006)
    Towards Learned Traversability for Robot Navigation: From Underfoot to the Far Field
    Andrew Howard
    Michael Turmon
    Larry Matthies
    Benyang Tang
    Eric Mjolsness
    Journal of Field Robotics (JFR) (2006)
    Slip Prediction Using Visual Information
    Larry Matthies
    Daniel Helmick
    Pietro Perona
    Robotics: Science and Systems (RSS) (2006)
    Learning for Autonomous Navigation
    Larry Matthies
    Michael Turmon
    Andrew Howard
    Benyang Tang
    Eric Mjolsness
    NIPS, Workshop on Machine Learning Based Robotics in Unstructured Environments (2005)
    Pruning Training Sets for Learning of Object Categories
    Yaser Abu-Mostafa
    Pietro Perona
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2005)
    Data Pruning