Authors:
(1) Tinghui Ouyang, National Institute of Informatics, Japan ([email protected]);
(2) Isao Echizen, National Institute of Informatics, Japan ([email protected]);
(3) Yoshiki Seo, Digital Architecture Research Center, National Institute of Advanced Industrial Science and Technology, Japan ([email protected]).
Description and Related Work of OOD Detection
Conclusions, Acknowledgement and References
In this paper, we addressed the issue of out-of-distribution (OOD) data in AI quality management by proposing a framework that combines deep learning and statistical measures for OOD detection. First, we leveraged the strong feature representation and dimensionality reduction capabilities of autoencoders (AE) to extract activation traces from hidden neurons as input features for OOD analysis. We then applied five statistical measures, namely KD, LOF, MD, kNN, and the proposed LCP, to these representative features for OOD detection. Our findings show that the proposed LCP, which incorporates both neighbor information and data reconstruction error, outperforms the other measures and is therefore valuable for describing and detecting OOD data. Furthermore, this research extracted reasonable corner case data with high OOD scores from the given datasets, namely MNIST, CIFAR10, and GTSRB. These corner cases exhibit abnormal characteristics compared to normal data, so detecting them with the proposed method is helpful for future AI quality assurance, particularly for quality analysis related to data security.
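To make the pipeline concrete, the sketch below combines the two ingredients the framework describes: distance to nearest neighbors in a learned feature space, and reconstruction error. It is an illustrative approximation only: a linear PCA projection stands in for the trained autoencoder, the function names (`ood_score`, `knn_distances`) and the weighting parameter `alpha` are our own, and the exact LCP formulation from the paper is not reproduced here.

```python
import numpy as np

def knn_distances(X, x, k):
    # Euclidean distances from a query point x to its k nearest
    # points in the reference set X
    d = np.linalg.norm(X - x, axis=1)
    return np.sort(d)[:k]

def ood_score(X_train, x, k=5, n_components=2, alpha=0.5):
    """Illustrative OOD score mixing neighbor distance with
    reconstruction error (a linear stand-in for the AE)."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    # top principal directions act as a linear "autoencoder" code
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    # encode then decode x; the residual is the reconstruction error
    x_hat = mu + (x - mu) @ W.T @ W
    rec_err = np.linalg.norm(x - x_hat)
    # mean distance to the k nearest in-distribution points
    nn_dist = knn_distances(X_train, x, k).mean()
    # higher score = more OOD-like
    return alpha * nn_dist + (1 - alpha) * rec_err

# usage: a point far from the training cloud scores higher
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 5))
print(ood_score(X, np.zeros(5)) < ood_score(X, np.full(5, 10.0)))
```

In practice the feature matrix `X_train` would hold AE hidden-layer activations rather than raw inputs, and the reconstruction error would come from the trained AE itself.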
This research is supported by the New Energy and Industrial Technology Development Organization (NEDO) project 'JPNP20006' and by a JSPS Grant-in-Aid for Early-Career Scientists (Grant Number 22K17961).
[1] The National Institute of Advanced Industrial Science and Technology (AIST). Machine learning quality management guideline, 2023. https://www.digiarc.aist.go.jp/en/publication/aiqm/.
[2] Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. Deepxplore: Automated whitebox testing of deep learning systems. In proceedings of the 26th Symposium on Operating Systems Principles, pages 1–18, 2017.
[3] Tinghui Ouyang, Yoshinao Isobe, Saima Sultana, Yoshiki Seo, and Yutaka Oiwa. Autonomous driving quality assurance with data uncertainty analysis. In 2022 International Joint Conference on Neural Networks (IJCNN), pages 1–7. IEEE, 2022.
[4] Guansong Pang, Jundong Li, Anton van den Hengel, Longbing Cao, and Thomas G Dietterich. Anomaly and novelty detection, explanation, and accommodation (andea). In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 4145–4146, 2021.
[5] Jingkang Yang, Kaiyang Zhou, Yixuan Li, and Ziwei Liu. Generalized out-of-distribution detection: A survey. arXiv preprint arXiv:2110.11334, 2021.
[6] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey. ACM computing surveys (CSUR), 41(3):1–58, 2009.
[7] Marco AF Pimentel, David A Clifton, Lei Clifton, and Lionel Tarassenko. A review of novelty detection. Signal processing, 99:215–249, 2014.
[8] Chuanxing Geng, Sheng-jun Huang, and Songcan Chen. Recent advances in open set recognition: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(10):3614–3631, 2020.
[9] Rémi Domingues, Maurizio Filippone, Pietro Michiardi, and Jihane Zouaoui. A comparative evaluation of outlier detection algorithms: Experiments and analyses. Pattern recognition, 74:406–421, 2018.
[10] Jing Li, Pengbo Lv, Huijun Li, and Wanghu Chen. Outlier detection based on stacked autoencoder and gaussian mixture model. In 2022 IEEE International Conference on Big Data (Big Data), pages 3763–3769. IEEE, 2022.
[11] Michael E Tipping and Christopher M Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society Series B: Statistical Methodology, 61(3):611–622, 1999.
[12] John A Quinn and Masashi Sugiyama. A least-squares approach to anomaly detection in static and sequential data. Pattern Recognition Letters, 40:36–40, 2014.
[13] Reuben Feinman, Ryan R Curtin, Saurabh Shintre, and Andrew B Gardner. Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017.
[14] Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in neural information processing systems, 31, 2018.
[15] Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, and Patrick McDaniel. On the (statistical) detection of adversarial examples. arXiv preprint arXiv:1702.06280, 2017.
[16] Ahmed Abusnaina, Yuhang Wu, Sunpreet Arora, Yizhen Wang, Fei Wang, Hao Yang, and David Mohaisen. Adversarial example detection using latent neighborhood graph. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7687–7696, 2021.
[17] Bruno Andriamanalimanana, Ali Tekeoglu, Korkut Bekiroglu, Saumendra Sengupta, Chen-Fu Chiang, Michael Reale, and Jorge Novillo. Symmetric kullback-leibler divergence of softmaxed distributions for anomaly scores. In 2019 IEEE Conference on Communications and Network Security (CNS), pages 1–6. IEEE, 2019.
[18] Jianbo Yu, Xiaoyun Zheng, and Shijin Wang. A deep autoencoder feature learning method for process pattern recognition. Journal of Process Control, 79:1–15, 2019.
[19] Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, and Marius Kloft. Deep one-class classification. In International conference on machine learning, pages 4393–4402. PMLR, 2018.
[20] Jinhan Kim, Robert Feldt, and Shin Yoo. Guiding deep learning system testing using surprise adequacy. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pages 1039–1049. IEEE, 2019.
[21] Zenan Li, Xiaoxing Ma, Chang Xu, and Chun Cao. Structural coverage criteria for neural networks could be misleading. In 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pages 89–92. IEEE, 2019.
[22] Shiqing Ma, Yingqi Liu, Guanhong Tao, Wen-Chuan Lee, and Xiangyu Zhang. Nic: Detecting adversarial samples with neural network invariant checking. In 26th Annual Network And Distributed System Security Symposium (NDSS 2019). Internet Soc, 2019.
[23] Tarem Ahmed. Online anomaly detection using kde. In GLOBECOM 2009-2009 IEEE Global Telecommunications Conference, pages 1–8. IEEE, 2009.
[24] Siying Xu, Huiyi Liu, Liting Duan, and Wenjing Wu. An improved lof outlier detection algorithm. In 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), pages 113–117. IEEE, 2021.
[25] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008.
[26] Yann LeCun. The mnist database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998.
[27] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
[28] Tom Fawcett. An introduction to roc analysis. Pattern recognition letters, 27(8):861–874, 2006.
[29] Tinghui Ouyang, Vicent Sanz Marco, Yoshinao Isobe, Hideki Asoh, Yutaka Oiwa, and Yoshiki Seo. Corner case data description and detection. In 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN), pages 19–26. IEEE, 2021.
[30] Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. The German traffic sign recognition benchmark: A multi-class classification competition. In The 2011 International Joint Conference on Neural Networks, pages 1453–1460. IEEE, 2011.
This paper is available on arxiv under CC 4.0 license.