We propose visual memes, or frequently reposted short video segments, for tracking large-scale video remix in social media. Visual memes are extracted by novel and highly scalable detection algorithms that we develop, with over 96% precision and 80% recall. We monitor real-world events on YouTube, and we model interactions using a graph model over memes, with people and content as nodes and meme postings as links. This allows us to define several measures of inuence. These abstractions, using more than two million video shots from several large-scale event datasets, enable us to quantify and effciently extract several important observations: over half of the videos contain re-mixed content, which appears rapidly; video view counts, particularly high ones, are poorly correlated with the virality of content; the inuence of traditional news media versus citizen journalists varies from event to event; iconic single images of an event are easily extracted; and content that will have long lifespan can be predicted within a day after it first appears. Visual memes can be applied to a number of social media scenarios: brand monitoring, social buzz tracking, ranking content and users, among others. © 2011 ACM.