
Quality Assurance of a GPT-Based Sentiment Analysis System


Too Long; Didn't Read

This paper presents a novel approach to out-of-distribution (OOD) data detection using deep learning and local conditional probability. By combining autoencoder-based feature extraction with a new statistical measure, the study offers a superior method for detecting OOD data in AI quality management, demonstrated through comprehensive experiments.

Authors:

(1) Tinghui Ouyang, National Institute of Informatics, Japan;

(2) Isao Echizen, National Institute of Informatics, Japan;

(3) Yoshiki Seo, Digital Architecture Research Center, National Institute of Advanced Industrial Science and Technology, Japan.

Abstract and Introduction

Description and Related Work of OOD Detection

Methodology

Experiments and Results

Conclusions, Acknowledgement and References


Abstract—Data outside the problem domain poses significant threats to the security of AI-based intelligent systems. Aiming to investigate the data domain and out-of-distribution (OOD) data in AI quality management (AIQM) study, this paper proposes to use deep learning techniques for feature representation and develop a novel statistical measure for OOD detection. First, to extract low-dimensional representative features distinguishing normal and OOD data, the proposed research combines the deep auto-encoder (AE) architecture and neuron activation status for feature engineering. Then, using local conditional probability (LCP) in data reconstruction, a novel and superior statistical measure is developed to calculate the score of OOD detection. Experiments and evaluations are conducted on image benchmark datasets and an industrial dataset. Through comparative analysis with other common statistical measures in OOD detection, the proposed research is validated as feasible and effective in OOD and AIQM studies.


Index Terms—out-of-distribution (OOD), AI quality management (AIQM), local conditional probability (LCP), data quality analysis.

I. INTRODUCTION

In AI quality management (AIQM) [1], a reliable AI system is expected to make accurate decisions both on data within the problem domain and on unknown examples, e.g., by rejecting data outside the defined problem domain. A well-known example is a postcode recognition system designed to recognize images of the handwritten digits 0-9. In practical operation, however, things often do not go as planned. An image of a letter may be fed to the system for some reason and still be assigned to one of the classes 0-9. Such a prediction clearly fails to match realistic expectations, and the cause is the absence of a sound problem domain analysis in AIQM. When a similar issue arises in safety-critical scenarios such as autonomous driving [2], [3], the lack of a warning or a hand-over of driving control in an unusual scene may cause serious accidents. Therefore, for safety and security, it is vital to investigate the problem domain and study data quality assurance in AIQM, especially the detection of unusual data outside the problem domain.


For general AI systems, especially classification or recognition systems, the problem domain is usually designed under a closed-world assumption [4], and the training and testing data are assumed to lie in the same problem domain. From the perspective of data quality analysis, the problem domain is usually expressed as a data distribution, and unusual examples are regarded as out-of-distribution (OOD) data [5]. A straightforward way to warn of dangerous scenarios in AIQM is therefore to detect OOD data effectively in advance. OOD detection is a typical task in AI quality research and is closely related to anomaly detection (AD) [6], novelty detection (ND) [7], and open set recognition (OSR) [8]. Its general idea is to use the data distribution or related structural characteristics (e.g., distance, density) to separate normal data from outliers, so the most direct methods are unsupervised learning-based models. Many OOD detection methods using different types of structural characteristics have been reported in the literature. The first type, known as probabilistic methods [9], uses probability or density. These methods leverage the probability density function of a dataset X under a given model parameter Θ and treat the data points with the smallest likelihood P(X|Θ) as outliers; for example, the Gaussian Mixture Model (GMM) [10] fits a number of Gaussian distributions for outlier detection. Probabilistic principal component analysis (PPCA) [11] and least-squares anomaly detection (LSA) [12] are other probabilistic OOD methods. The second type is the density-based method. For example, kernel density (KD) estimators [13] approximate the density function of the dataset via kernel function calculation and identify outliers by their low density. The third type is based on distance and relies on the idea that outliers are distant from the distribution of normal data. For instance, the Mahalanobis distance (MD) is used to detect anomalies under the assumption of a Gaussian-shaped data distribution [14]. Other metrics are also used in OOD detection, e.g., the local outlier factor (LOF) [15] and kNN [16], which leverage neighbors' information, and information-theoretic measures such as the Kullback-Leibler (KL) divergence [17].
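To make three of the conventional measures named above concrete, the following is a minimal sketch of a kNN distance score, a Gaussian kernel density score, and a Mahalanobis distance score on 2-D points. The function names, the bandwidth, the value of k, and the toy grid of "normal" data are illustrative assumptions, not settings taken from the paper.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_score(query, data, k=3):
    """Distance to the k-th nearest neighbour; larger means more outlying."""
    dists = sorted(math.sqrt(sq_dist(query, p)) for p in data)
    return dists[k - 1]

def kernel_density(query, data, bandwidth=0.5):
    """Unnormalised Gaussian kernel density; smaller means more outlying."""
    scale = 2 * bandwidth ** 2
    return sum(math.exp(-sq_dist(query, p) / scale) for p in data) / len(data)

def mahalanobis_2d(query, data):
    """Mahalanobis distance of a 2-D query from the sample mean of data."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dx, dy = query[0] - mx, query[1] - my
    # Quadratic form with the closed-form inverse of the covariance matrix
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

# Hypothetical "normal" data: a 5x5 grid covering [-1, 1] x [-1, 1]
normal_data = [(0.5 * i, 0.5 * j) for i in range(-2, 3) for j in range(-2, 3)]
inlier, outlier = (0.1, 0.1), (4.0, 4.0)

for name, fn in [("kNN", knn_score), ("KD", kernel_density), ("MD", mahalanobis_2d)]:
    print(name, fn(inlier, normal_data), fn(outlier, normal_data))
```

For the far-away point, the kNN and Mahalanobis scores are larger and the kernel density is smaller than for the point inside the grid, which is the behaviour each measure relies on to flag outliers.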


These statistical measures are generally straightforward to apply directly in the original data space, and they can be expected to perform well on simple data. However, on complex and high-dimensional datasets, e.g., image data, direct use of these measures usually fails to detect OOD data effectively because of poor computational scalability and the curse of dimensionality. Therefore, suitable feature engineering is required before applying these statistical measures in an OOD study. To address these issues, this paper proposes a novel OOD method that considers both effective feature engineering and a useful statistical measure. The concrete novelties and contributions of the proposed method are summarized as follows:
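One aspect of the curse of dimensionality mentioned above is that distances concentrate: in high dimensions the nearest and farthest points from a query lie at almost the same distance, which weakens distance- and neighbor-based measures. The sketch below illustrates this with uniformly random points; the function name, sample size, and seed are illustrative assumptions.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((q - p) ** 2 for q, p in zip(query, point))))
    return (max(dists) - min(dists)) / min(dists)

low_dim_contrast = relative_contrast(2)
high_dim_contrast = relative_contrast(1000)
print(low_dim_contrast, high_dim_contrast)
```

The contrast in 1000 dimensions comes out far smaller than in 2 dimensions, which is one reason low-dimensional features are extracted before a statistical measure is applied.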


1) A feature extraction method based on the autoencoder (AE) [18] and neuron activation status is proposed. Deep learning (DL) is well known for its excellent feature learning ability and is widely successful on complex data, e.g., in image-based data processing. As a typical DL algorithm for feature learning, a deep AE attempts to learn low-dimensional salient features from which the original data can be reproduced. Therefore, to preserve the distribution information of image data as much as possible and to reduce computational complexity effectively, the deep AE is selected as the basic architecture for feature learning in this paper. Then, assuming that normal data and outliers may exhibit different neuron behaviors, feature extraction based on the deep AE's neuron activation status is applied to complex and high-dimensional data to generate the data used in the OOD study.


2) A novel statistical measure for OOD detection is proposed based on local conditional probability (LCP) and data reconstruction. Different types of OOD methods have different advantages: kernel-function-based density handles data without a parametric probability density function (PDF), while LOF and kNN make effective use of neighbor information. This paper proposes a new metric that combines these two ideas. The new metric calculates probability via the kernel distance to neighbors, uses the local conditional probability to reconstruct data, and describes OOD data via the reconstruction error. It can achieve superior performance relative to conventional statistical measures in OOD detection. Details are presented in Section 3.


3) The proposed method performs better than conventional OOD detection methods. This paper selects four conventional statistical measures for the OOD study, namely KD, LOF, MD, and kNN. Experiments are then carried out on image benchmark datasets and an industrial dataset. The results illustrate the proposed method's effectiveness and superiority in describing and detecting OOD data.
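The first two contributions can be illustrated with a rough sketch. It is not the paper's formulation, which is given in its Section 3; it only follows the verbal description here. The assumptions are: activation status is taken to be a binary "fired or not" flag per hidden neuron, the kernel is Gaussian, the local conditional probabilities are kernel weights normalised over the k nearest neighbors, and the score is the Euclidean reconstruction error. All names and parameter values are hypothetical.

```python
import math

def activation_status(hidden, threshold=0.0):
    """Binarise one layer's activations: 1.0 if the neuron fired, else 0.0 (assumed form)."""
    return [1.0 if h > threshold else 0.0 for h in hidden]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lcp_style_score(query, reference, k=3, bandwidth=1.0):
    """Reconstruction error of query from its k nearest reference points (assumed form)."""
    neighbours = sorted(reference, key=lambda r: sq_dist(query, r))[:k]
    d2 = [sq_dist(query, n) for n in neighbours]
    d_min = min(d2)  # shifting by d_min leaves the normalised weights unchanged
    kernels = [math.exp(-(d - d_min) / (2 * bandwidth ** 2)) for d in d2]
    total = sum(kernels)
    probs = [kv / total for kv in kernels]  # local conditional probabilities
    recon = [sum(p * n[i] for p, n in zip(probs, neighbours))
             for i in range(len(query))]
    return math.sqrt(sq_dist(query, recon))

# Hypothetical low-dimensional features of normal data: a 5x5 grid in [-1, 1]^2
reference = [(0.5 * i, 0.5 * j) for i in range(-2, 3) for j in range(-2, 3)]
inlier_score = lcp_style_score((0.1, 0.1), reference)
outlier_score = lcp_style_score((5.0, 5.0), reference)
print(inlier_score, outlier_score)
print(activation_status([0.3, -1.0, 0.0, 2.0]))
```

A point inside the reference data is rebuilt almost exactly from its neighbors and gets a small score, whereas a distant point cannot be rebuilt from any local neighborhood and gets a large one. In a full pipeline the reference vectors would come from a trained deep AE rather than from a hand-made grid.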




This paper is available on arXiv under CC 4.0 license.