Authors:
(1) Lukáš Korel, Faculty of Information Technology, Czech Technical University, Prague, Czech Republic;
(2) Petr Pulc, Faculty of Information Technology, Czech Technical University, Prague, Czech Republic;
(3) Jirí Tumpach, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic;
(4) Martin Holena, Institute of Computer Science, Academy of Sciences of the Czech Republic, Prague, Czech Republic.

Table of Links
Abstract and Introduction
ANN-Based Scene Classification
Methodology
Experiments
Conclusion and Future Research, Acknowledgments and References

Abstract: This paper provides an insight into the possibility of scene recognition from a video sequence with a small set of repeated shooting locations (such as in television series) using artificial neural networks. The basic idea of the presented approach is to select a set of frames from each scene, transform them by a pre-trained single-image pre-processing convolutional network, and classify the scene location with subsequent layers of the neural network. The considered networks have been tested and compared on a dataset obtained from The Big Bang Theory television series. We have investigated different neural network layers to combine individual frames, particularly AveragePooling, MaxPooling, Product, Flatten, LSTM, and Bidirectional LSTM layers. We have observed that only some of the approaches are suitable for the task at hand.

1 Introduction

People watching videos are able to recognize where the current scene is located. When watching a film or a series, they are able to recognize that a new scene takes place in a location they have already seen. Finally, people are able to understand the hierarchy of scenes. All this helps people comprehend videos.

The role of location identification in scene recognition by humans motivated our research into scene location classification by artificial neural networks (ANNs). A more ambitious goal would be to make a system able to remember previously unknown video locations and, using this data, identify other video scenes set in the same location and mark them with the same label. This paper reports work in progress in that direction. It describes the employed methodology and presents first experimental results obtained with six kinds of neural networks.
The rest of the paper is organized as follows. The next section reviews existing approaches to this problem. Section 3 is divided into two parts: the first covers data preparation before the data are used in ANNs, and the second covers the design of the ANNs in our experiments. Finally, Section 4, the last section before the conclusion, presents the results of our experiments with these ANNs.

This paper is available on arXiv under the CC0 1.0 DEED license.
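The abstract names several layers for combining per-frame features into one scene-level representation. As a rough illustration only, the four order-independent or concatenating variants (average, max, product, flatten) can be sketched in plain Python. This is not the authors' implementation: the function name `combine_frames`, the `mode` strings, and the toy feature values are hypothetical, the pre-trained convolutional network and the classification layers are omitted, and the LSTM and Bidirectional LSTM variants are not shown.

```python
from math import prod


def combine_frames(frame_features, mode):
    """Merge per-frame feature vectors into a single scene-level vector.

    frame_features: list of equally long lists of floats, one per frame
                    (stand-ins for the outputs of a pre-trained CNN).
    mode: "average", "max", "product" or "flatten" (hypothetical names).
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    length = len(frame_features[0])
    if any(len(f) != length for f in frame_features):
        raise ValueError("all frames must have the same feature length")

    # Group values by feature position: one tuple per feature, across frames.
    columns = list(zip(*frame_features))

    if mode == "average":
        return [sum(c) / len(c) for c in columns]
    if mode == "max":
        return [max(c) for c in columns]
    if mode == "product":
        return [prod(c) for c in columns]
    if mode == "flatten":
        # Concatenate frames in order; output grows with the frame count.
        return [v for f in frame_features for v in f]
    raise ValueError(f"unknown mode: {mode}")


# Toy, made-up features for three frames of one scene.
scene = [[0.5, 2.0], [1.5, 4.0], [1.0, 0.0]]
for mode in ("average", "max", "product", "flatten"):
    print(mode, combine_frames(scene, mode))
```

Average, max, and product keep the feature length fixed and ignore frame order, whereas flatten preserves order at the cost of an output that scales with the number of frames. In a full pipeline the combined vector would be passed on to the classification layers.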


Video Scene Location Recognition Using AI: Abstract and Introduction
