Multilevel Profiling of Situation and Dialogue-based Deep Networks: EMTD Dataset
by @kinetograph

Too Long; Didn't Read

In this paper, researchers propose a multi-modality framework for movie genre classification, utilizing situation, dialogue, and metadata features.

Authors:

(1) Dinesh Kumar Vishwakarma, Biometric Research Laboratory, Department of Information Technology, Delhi Technological University, Delhi, India;

(2) Mayank Jindal, Biometric Research Laboratory, Department of Information Technology, Delhi Technological University, Delhi, India;

(3) Ayush Mittal, Biometric Research Laboratory, Department of Information Technology, Delhi Technological University, Delhi, India;

(4) Aditya Sharma, Biometric Research Laboratory, Department of Information Technology, Delhi Technological University, Delhi, India.

3. EMTD Dataset

Datasets used in previous literature lack a uniform composition of movie genres. Hence, we propose EMTD (English Movie Trailer Dataset), consisting of around 2000 unique Hollywood movie trailers downloaded from IMDB and covering 5 genres: action, comedy, horror, romance, and science fiction. The dataset is extracted from IMDB by a web scraping procedure as follows: (1) fetch the list of movie titles available on IMDB that have at least one genre in common with those mentioned above, (2) scrape the metadata corresponding to each movie title, including the trailer link, and (3) download the trailers (.mp4) from these links into a folder, and record the information/metadata about each movie, including trailer name, descriptions, plot, keywords, and genres, in a CSV file. In this work, the dataset is partitioned into a train set (1700 trailers) and a validation set (300 trailers), as shown in Table 1.
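The offline parts of this procedure (genre filtering, the metadata CSV, and the 1700/300 partition) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the CSV column names, the `|` separator for multi-valued fields, and the seeded random shuffle are all assumptions made for illustration, and the text does not state how the actual split was drawn. The scraping and download steps are left out because they depend on site-specific details not given here.

```python
import csv
import io
import random

# The five genres named in the text.
TARGET_GENRES = {"action", "comedy", "horror", "romance", "science fiction"}

# Column names are assumptions based on the metadata fields listed in the text.
FIELDS = ["trailer_name", "description", "plot", "keywords", "genres"]


def has_target_genre(genres):
    """Return True if a title shares at least one genre with TARGET_GENRES."""
    return bool(TARGET_GENRES & {g.strip().lower() for g in genres})


def write_metadata_csv(records, handle):
    """Write one row per trailer; list-valued fields are joined with '|'."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for rec in records:
        row = dict(rec)
        row["keywords"] = "|".join(rec["keywords"])
        row["genres"] = "|".join(rec["genres"])
        writer.writerow(row)


def split_dataset(records, n_train=1700, n_val=300, seed=0):
    """Shuffle a copy of the records and cut it into train and validation sets.

    A plain seeded shuffle is an assumption; the source does not say
    whether the split was random, stratified, or chosen by hand.
    """
    if len(records) < n_train + n_val:
        raise ValueError("not enough records for the requested split")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_val]


if __name__ == "__main__":
    # Synthetic placeholder records, not real EMTD entries.
    demo = [
        {
            "trailer_name": f"trailer_{i}.mp4",
            "description": "placeholder description",
            "plot": "placeholder plot",
            "keywords": ["placeholder"],
            "genres": ["Action", "Drama"],
        }
        for i in range(2000)
    ]
    kept = [rec for rec in demo if has_target_genre(rec["genres"])]
    train, val = split_dataset(kept)
    buffer = io.StringIO()
    write_metadata_csv(kept, buffer)
    print(len(train), len(val))
```

Because a trailer can carry several genres, a split that preserves the per-genre proportions would need a multi-label stratification step, which this sketch does not attempt.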


The study is restricted to the above genres because they are the ones most commonly observed in movies. We also want to evaluate the performance of our architecture on a small set of genres first, so we choose only 5 genres rather than a broad set.


Table 1: Dataset Composition


This paper is available on arXiv under the CC BY-NC-SA 4.0 DEED license.

