A public good project by the Synthesis Company of California

© 2026

AJ Piergiovanni | doi.page

0 works, 0 citations, 0 h-index

Google Scholar | OpenAlex

AJ Piergiovanni

Google (United States, US); DeepMind (United Kingdom, GB)

Publications by Year

Research Areas

Multimodal Machine Learning Applications, Human Pose and Action Recognition, Domain Adaptation and Few-Shot Learning, Anomaly Detection Techniques and Applications, Advanced Image and Video Retrieval Techniques

Most-Cited Works

→ PaLI: A Jointly-Scaled Multilingual Language-Image Model (2022), 194 cited
→ TokenLearner: Adaptive Space-Time Tokenization for Videos (2021)
→ Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning (2023), 61 cited
→ Learning Latent Subevents in Activity Videos Using Temporal Attention Filters (2017), 40 cited
→ PaLI-X: On Scaling up a Multilingual Vision and Language Model (2023), 38 cited
→ AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification (2020), 37 cited
→ F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models (2022), 37 cited
→ On Scaling Up a Multilingual Vision and Language Model (2024), 34 cited
→ AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures (2019), 34 cited
→ Tiny Video Networks (2021), 32 cited