Learning Human Action Recognition Representations Without Real Humans. Howard Zhong, Samarth Mishra, et al. NeurIPS 2023.
How Transferable are Video Representations Based on Synthetic Data? Yo-whan Kim, Samarth Mishra, et al. NeurIPS 2022.
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions. Mathew Monfort, Souyoung Jin, et al. CVPR 2021.