Title: Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos

Mar 22, 2022

Tomáš Souček, Jean-Baptiste Alayrac, Antoine Miech, Ivan Laptev, Josef Sivic

Abstract: Human actions often induce changes of object states, as in "cutting an apple", "cleaning shoes" or "pouring coffee". In this paper, we seek to temporally localize object states (e.g. "empty" and "full" cup) together with the corresponding state-modifying actions ("pouring coffee") in long uncurated videos with minimal supervision. The contributions of this work are threefold. First, we develop a self-supervised model for jointly learning state-modifying actions together with the corresponding object states from an uncurated set of videos from the Internet. The model is self-supervised by the causal ordering signal, i.e. initial object state $\rightarrow$ manipulating action $\rightarrow$ end state. Second, to cope with noisy uncurated training data, our model incorporates a noise adaptive weighting module, supervised by a small number of annotated still images, which allows irrelevant videos to be filtered out efficiently during training. Third, we collect a new dataset with more than 2600 hours of video and 34 thousand changes of object states, and manually annotate a part of this data to validate our approach. Our results demonstrate substantial improvements over prior work in both action and object state recognition in video.
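The causal ordering signal mentioned in the abstract (initial state, then action, then end state) can be illustrated with a small sketch. This is not the authors' implementation: the function name `best_ordered_triple`, the per-frame score lists, and the rule of summing the three scores are all assumptions made for illustration. Given hypothetical per-frame scores for the initial state, the action, and the end state, the sketch picks the frame triple that respects the ordering and has the highest total score.

```python
from typing import List, Tuple


def best_ordered_triple(
    initial: List[float], action: List[float], end: List[float]
) -> Tuple[int, int, int]:
    """Return frame indices (i, j, k) with i < j < k that maximize
    initial[i] + action[j] + end[k].

    The additive scoring rule is an illustrative assumption, not the
    objective used in the paper.
    """
    n = len(initial)
    if len(action) != n or len(end) != n:
        raise ValueError("all score lists must have the same length")
    if n < 3:
        raise ValueError("need at least three frames")

    # prefix_best[j]: index of the best initial-state score in frames 0..j-1
    prefix_best = [0] * n
    best = 0
    for j in range(1, n):
        if initial[j - 1] > initial[best]:
            best = j - 1
        prefix_best[j] = best

    # suffix_best[j]: index of the best end-state score in frames j+1..n-1
    suffix_best = [n - 1] * n
    best = n - 1
    for j in range(n - 2, -1, -1):
        if end[j + 1] > end[best]:
            best = j + 1
        suffix_best[j] = best

    # Try every action frame j that has a frame before and after it.
    best_triple = (0, 1, 2)
    best_score = float("-inf")
    for j in range(1, n - 1):
        i, k = prefix_best[j], suffix_best[j]
        score = initial[i] + action[j] + end[k]
        if score > best_score:
            best_score = score
            best_triple = (i, j, k)
    return best_triple


# Toy scores for a five-frame video (made-up numbers).
initial_scores = [0.9, 0.8, 0.2, 0.1, 0.1]
action_scores = [0.1, 0.2, 0.9, 0.3, 0.1]
end_scores = [0.0, 0.1, 0.2, 0.8, 0.9]
print(best_ordered_triple(initial_scores, action_scores, end_scores))  # (0, 2, 4)
```

In a self-supervised setup of this kind, the selected frames could serve as pseudo-labels for the next training step; how the paper actually derives and uses its labels is not specified in the abstract.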

* To be published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
