
CNN Model & Tuning for Global Road Damage Detection: Dataset


II. DATASET

The Road Damage Dataset 2020 [2] was curated and annotated for automated road inspection. This multi-country dataset was released as part of the IEEE Big Data Cup Challenge [23]. The task is to detect road damage at a global scale and report performance on the Test 1 and Test 2 datasets.


TABLE I. ROAD DAMAGE TYPE DEFINITIONS


Damage types vary across countries. To generalize detection across the damage categories in Table I, the classes considered for the analysis are: D00 (Longitudinal Crack), D10 (Transverse Crack), D20 (Alligator Crack), and D40 (Pothole). The Test 1 and Test 2 data are provided by the challenge [23] committee for evaluation and submission. Upon submission, an average F1 score is added to our private leaderboard; it is also posted to the public leaderboard if it exceeds all previous scores on our private leaderboard.


A. Global Road Damage Dataset


The latest dataset adds images collected in the Czech Republic and India to those made available by the GIS Association of Japan. The 2020 dataset provides training images of size 600x600, with each damage instance annotated as a bounding box and an associated damage class. Class labels and bounding box coordinates, defined by four numbers (xmin, ymin, xmax, ymax), are stored in XML format as per PASCAL VOC [12].
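As an illustration of the annotation format described above, the following is a minimal sketch of reading class labels and bounding boxes from a PASCAL VOC style XML file using Python's standard library. The sample annotation, its file name, and its coordinate values are hypothetical and are not taken from the dataset; the helper name `parse_voc` is likewise an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in PASCAL VOC layout (values are illustrative only).
SAMPLE_XML = """<annotation>
  <filename>Japan_000001.jpg</filename>
  <size><width>600</width><height>600</height><depth>3</depth></size>
  <object>
    <name>D20</name>
    <bndbox><xmin>87</xmin><ymin>281</ymin><xmax>226</xmax><ymax>435</ymax></bndbox>
  </object>
  <object>
    <name>D40</name>
    <bndbox><xmin>300</xmin><ymin>350</ymin><xmax>360</xmax><ymax>390</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return a list of (class_label, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text; convert each one to an integer.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, coords))
    return boxes


annotations = parse_voc(SAMPLE_XML)
for label, box in annotations:
    print(label, box)
```

For files on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`; the rest of the function would stay the same.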



The provided training data has 21041 images in total: 2829 from the Czech Republic (CZ), 10506 from Japan (JP), and 7706 from India (IN), with annotations stored in individual XML files. Fig. 1 shows the file structure, the bounding box XML tags, and a corresponding example image.


Fig. 2. Train (T), Validation (V) and Test (T) data split for experiments. Bars are for 4 damage class labels D00, D10, D20, D40 provided in the dataset.


The shared test data are divided into two sets. Test 1 consists of 349 Czech, 969 India, and 1313 Japan road images without annotated ground truth. Test 2 consists of 360 Czech, 990 India, and 1314 Japan road images, also without annotated ground truth. The detection results on these test images are submitted to the challenge [23] for average F1 score evaluation.


To run the experiments, we split the given training dataset proportionally into 80:15:5 :: Train (T):Val (V):Test (T) data. This gives the final image and annotation counts shown in Fig. 2, which are used for training and tuning.
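The 80:15:5 split can be sketched as follows. The text does not say how proportionality was enforced, so this example assumes the split is applied separately to each country so that every subset keeps the same country mix; the image identifiers, the fixed seed, and the function names are hypothetical.

```python
import random


def split_items(items, ratios=(0.80, 0.15, 0.05), seed=0):
    """Shuffle items and cut them into train/val/test parts by the given ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    # Whatever remains after train and val goes to the held-out test part.
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


def split_by_country(ids_by_country, ratios=(0.80, 0.15, 0.05), seed=0):
    """Apply the same ratios within each country (an assumption of this sketch)."""
    train, val, test = [], [], []
    for country in sorted(ids_by_country):
        tr, va, te = split_items(ids_by_country[country], ratios, seed)
        train += tr
        val += va
        test += te
    return train, val, test


# Image counts per country as reported in the text; the ids are placeholders.
COUNTS = {"Czech": 2829, "Japan": 10506, "India": 7706}
ids_by_country = {
    c: [f"{c}_{i:06d}" for i in range(n)] for c, n in COUNTS.items()
}
train, val, test = split_by_country(ids_by_country)
print(len(train), len(val), len(test))
```

Because each part is a plain list, composite sets such as Train+Val can be formed by list concatenation, for example `train + val`.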


As we fine-tune the models, we also create composite datasets with Train+Test (T+T) and Train+Val (T+V) compositions. This lets the model use the entire dataset for learning and evaluation.


B. Evaluation Strategy


A prediction counts as a match when its class label equals that of the ground-truth bounding box and the predicted bounding box has an Intersection over Union (IoU) of more than 50% with it. Precision and recall are both based on IoU, which is defined as the ratio of the area of overlap between the predicted and ground-truth bounding boxes to the area of their union.
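The matching rule above can be sketched in a few lines. This is a minimal illustration, assuming boxes are given as (xmin, ymin, xmax, ymax) tuples as in the annotation format; the function names `iou` and `is_match` are hypothetical, and the challenge's own evaluation code may differ in details such as how ties or duplicate detections are handled.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_label, pred_box, gt_label, gt_box, threshold=0.5):
    """True when the labels agree and the IoU exceeds the threshold."""
    return pred_label == gt_label and iou(pred_box, gt_box) > threshold


print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
print(is_match("D00", (0, 0, 10, 10), "D00", (0, 0, 10, 8)))
```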



The evaluation of the match is done using the Mean F1 Score metric. The F1 score, commonly used in information retrieval, measures accuracy using the statistics of precision p and recall r. Precision is the ratio of true positives (tp) to all predicted positives (tp + fp) while recall is the ratio of true positives to all actual positives (tp + fn). Maximizing the F1-score ensures reasonably high precision and recall.


The F1 score is given by:

$$F_1 = \frac{2\,p\,r}{p + r}, \qquad p = \frac{tp}{tp + fp}, \qquad r = \frac{tp}{tp + fn}$$

Avg F1 score serves as a balanced metric for precision and recall. This is the metric we obtain in our private leaderboard, upon submitting the evaluation results on Test 1 or Test 2 datasets.
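The precision, recall, and F1 definitions above can be checked with a short sketch. The counts in the example are made up for illustration and are not results from the paper; returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from match counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 60 correct detections, 20 false alarms, 40 misses.
p, r, f1 = precision_recall_f1(tp=60, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low when either precision or recall is low, which is why it works as a balanced metric.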


Authors:

(1) Rahul Vishwakarma, Big Data Analytics & Solutions Lab, Hitachi America Ltd. Research & Development, Santa Clara, CA, USA ([email protected]);

(2) Ravigopal Vennelakanti, Big Data Analytics & Solutions Lab, Hitachi America Ltd. Research & Development, Santa Clara, CA, USA ([email protected]).


This paper is available on arXiv under the ATTRIBUTION-SHAREALIKE 4.0 INTERNATIONAL license.

