This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Zhihang Ren, University of California, Berkeley; contributed equally to this work (Email: [email protected]);
(2) Jefferson Ortega, University of California, Berkeley; contributed equally to this work (Email: [email protected]);
(3) Yifan Wang, University of California, Berkeley; contributed equally to this work (Email: [email protected]);
(4) Zhimin Chen, University of California, Berkeley (Email: [email protected]);
(5) Yunhui Guo, University of Texas at Dallas (Email: [email protected]);
(6) Stella X. Yu, University of California, Berkeley and University of Michigan, Ann Arbor (Email: [email protected]);
(7) David Whitney, University of California, Berkeley (Email: [email protected]).
In total, 192 participants annotated the videos in the VEATIC dataset. Eighty-four participants annotated video IDs 0-82. One hundred and eight participants annotated video IDs 83-123 before the VEATIC dataset was planned. Specifically, fifty-one participants annotated video IDs 83-94, twenty-five participants annotated video IDs 95-97, and thirty-two participants annotated video IDs 98-123.
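The participant counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the VEATIC tooling; the dictionary layout and the names `annotators`, `total_participants`, and `videos_covered` are illustrative assumptions, and summing the groups assumes that no participant belongs to more than one group, as the stated total implies.

```python
# Participants per block of video IDs, as reported in the text.
# Keys are (first_id, last_id), inclusive. The layout is hypothetical.
annotators = {
    (0, 82): 84,
    (83, 94): 51,
    (95, 97): 25,
    (98, 123): 32,
}


def total_participants(counts):
    """Sum participants across groups (assumes the groups are disjoint)."""
    return sum(counts.values())


def videos_covered(counts):
    """Count the video IDs spanned by all inclusive ID ranges."""
    return sum(last - first + 1 for first, last in counts)


# Participants who annotated video IDs 83-123.
earlier_groups = sum(n for (first, _), n in annotators.items() if first >= 83)

print(total_participants(annotators), earlier_groups, videos_covered(annotators))
```

Under these assumptions the three subgroup counts add up to the 108 participants reported for video IDs 83-123, and all four groups add up to the reported total of 192.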
Another novelty of the VEATIC dataset is that it contains videos with interacting characters, with separate ratings for different characters in the same video. These are the videos with IDs 98-123. In each consecutive video pair, the video frames are exactly the same, but the continuous emotion ratings are annotated for a different selected character. Figure 11 shows an example. To our knowledge, this study is the first to propose this annotation process. It gives future algorithms a way to test whether a model learns the emotion of the selected character when the interactions between characters and the context information are exactly the same. A good emotion recognition algorithm should be able to handle this complicated situation.
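The paired structure described above can be sketched as follows. This is an illustrative sketch only: it assumes that the pairs are formed as (98, 99), (100, 101), and so on up to (122, 123), which the text implies but does not state explicitly. The helper names `character_pairs` and `predictions_differ` are hypothetical and are not part of any released VEATIC code.

```python
def character_pairs(first_id=98, last_id=123):
    """Group an inclusive ID range into consecutive (id, id + 1) pairs.

    Assumption: pairing starts at first_id, so each pair shares the same
    frames but is rated for a different selected character.
    """
    ids = list(range(first_id, last_id + 1))
    if len(ids) % 2 != 0:
        raise ValueError("ID range must contain an even number of videos")
    return [(ids[i], ids[i + 1]) for i in range(0, len(ids), 2)]


def predictions_differ(pred_a, pred_b, tol=1e-6):
    """Return True if two per-frame prediction sequences differ by more than tol.

    A model that ignores the selected character would see identical frames
    in both videos of a pair and return identical predictions.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("paired videos must have the same number of frames")
    return any(abs(a - b) > tol for a, b in zip(pred_a, pred_b))


pairs = character_pairs()
print(len(pairs), pairs[0], pairs[-1])
```

Because both videos in a pair share the same frames, a check such as `predictions_differ` offers a quick sanity test of whether a model conditions on the selected character at all. Whether its predictions are accurate must still be judged against the human ratings for each character.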