AI and Signal Processing Unite to Diagnose Machine Faults Faster

Authors:
(1) Jing-Xiao Liao, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China and School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China;
(2) Chao He, School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, China;
(3) Jipu Li, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China;
(4) Jinwei Sun, School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China;
(5) Shiping Zhang (Corresponding author), School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China;
(6) Xiaoge Zhang (Corresponding author), Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China.

Table of Links
Abstract and 1. Introduction
2. Preliminaries and 2.1. Blind deconvolution
2.2. Quadratic neural networks
3. Methodology
3.1. Time domain quadratic convolutional filter
3.2. Superiority of cyclic features extraction by QCNN
3.3. Frequency domain linear filter with envelope spectrum objective function
3.4. Integral optimization with uncertainty-aware weighing scheme
4. Computational experiments
4.1. Experimental configurations
4.2. Case study 1: PU dataset
4.3. Case study 2: JNU dataset
4.4. Case study 3: HIT dataset
5. Computational experiments
5.1. Comparison of BD methods
5.2. Classification results on various noise conditions
5.3. Employing ClassBD to deep learning classifiers
5.4. Employing ClassBD to machine learning classifiers
5.5. Feature extraction ability of quadratic and conventional networks
5.6. Comparison of ClassBD filters
6. Conclusions
Appendix and References

ABSTRACT

Blind deconvolution (BD) has been demonstrated as an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD’s desirable adaptability and mathematical interpretability, a significant challenge persists: How to effectively integrate BD with fault-diagnosing classifiers? This issue arises because the traditional BD method is designed solely for feature extraction, with its own optimizer and objective function. When BD is combined with downstream deep learning classifiers, the different learning objectives come into conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault classification. First, we present a time and frequency neural BD that employs neural networks to implement conventional BD, thereby facilitating the seamless integration of BD and the deep learning classifier for co-optimization of model parameters. Specifically, the neural BD incorporates two filters: i) a time domain quadratic filter that uses quadratic convolutional networks to extract periodic impulses; ii) a frequency domain linear filter composed of a fully-connected neural network to amplify discrete frequency components. Subsequently, we develop a unified framework that uses a deep learning classifier to guide the learning of BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, the l2/l4 norm, and a cross-entropy loss to jointly optimize the BD filters and the deep learning classifier. Consequently, the fault labels provide useful information to direct BD to extract features that distinguish classes amidst strong noise. To the best of our knowledge, this is the first work of its kind in which BD is successfully applied to bearing fault diagnosis.
Experimental results from three datasets demonstrate that ClassBD outperforms other state-of-the-art methods under noisy conditions. We have shared our code at https://github.com/asdvfghg/ClassBD.

1. Introduction

Rotating machinery, such as aero engines, pumps, and wind turbines, plays an indispensable role in various industrial applications. However, the components that support the rotation, particularly rolling bearings, are susceptible to damage due to long working hours under high temperature, high speed, and other harsh conditions [1, 2]. Bearing damage, e.g., cage fracture and race cracks, causes unexpected machinery failures and leads to costly downtime and even catastrophic outcomes. Therefore, timely and accurate diagnosis of bearing faults is of great importance for ensuring the sound and reliable operation of rotating machinery [3].

Nevertheless, one of the major challenges in bearing fault diagnosis is that the measured vibration signals are often contaminated by background noise arising from complex transmission paths (mechanical transmission systems) and environmental sources (coupled vibration sources from multiple machines). This noise substantially obscures and distorts key information that is important for discriminating faults in rotating machinery. Hence, developing effective methods for extracting fault-specific features from noisy signals has emerged as an active research topic.

Presently, methodologies for signal denoising fall into two distinct categories: data-driven methods and signal processing methods. Despite the significant advances made by data-driven methods in recent years, these models fall short in transparency, which hinders their utility in decision-making processes [4–8]. On the other hand, a plethora of signal processing approaches with rigorous mathematical foundations have been proposed for extracting fault-related features.
These include, but are not limited to, wavelet transform (WT) [9–12], variational mode decomposition (VMD) [13, 14], singular value decomposition (SVD) [15, 16], spectral kurtosis (SK) [17–19], cyclostationary analysis [20, 21], and blind deconvolution (BD) [2, 22, 23]. Among them, the BD method possesses some unique advantages due to its adaptability and lack of constraints on bandwidth or center frequency (the filter’s maximum gain frequency), and these features make BD an ideal tool for extracting repeated transient impulses [24]. Hence, we concentrate on exploring BD-based methods for fault diagnosis in this paper.

In essence, the BD method recovers the input signal features from the output signals when both the system and the input signals are unknown [25]. In other words, BD uses only the measured signal to reconstruct the fault source signal by estimating the transmission path function. When processing vibration signals, BD optimizes an adaptive finite impulse response (FIR) filter to recover the repeated transient impulses, which are regarded as the informative features for bearing fault classification [2]. A key issue in BD is how to design an objective function that effectively characterizes the fault impulsive signatures, such as sparsity and cyclic periodicity. For example, Wiggins [26] proposed the first BD method, termed minimum entropy deconvolution (MED), in 1978 for non-stationary signal denoising. MED used kurtosis [27] as the objective function to search for an optimal inverse filter. However, kurtosis is only sensitive to outliers, and it thus fails to distinguish between random impulses and cyclic impulses [28].
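The limitation of kurtosis noted above can be illustrated with a small numerical sketch. It is not taken from the paper: the signal length, noise level, and impulse amplitudes are arbitrary assumptions chosen only to make the effect visible, and `kurtosis` here is the plain normalized fourth moment.

```python
import random

def kurtosis(x):
    """Normalized fourth central moment: m4 / m2**2 (about 3 for Gaussian noise)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)

random.seed(0)
n = 1000
# Background noise (standard deviation is an illustrative assumption).
noise = [random.gauss(0.0, 0.1) for _ in range(n)]

# Case A: one isolated, large spike, e.g. a random impact unrelated to a fault.
single = list(noise)
single[500] += 10.0

# Case B: a periodic train of smaller impulses, a stand-in for cyclic fault impulses.
periodic = list(noise)
for i in range(0, n, 100):
    periodic[i] += 3.0

k_noise = kurtosis(noise)
k_single = kurtosis(single)
k_periodic = kurtosis(periodic)

print("noise only:      ", k_noise)
print("single spike:    ", k_single)
print("periodic impulses:", k_periodic)
```

Under these assumptions, the single spike scores a higher kurtosis than the periodic train, so a filter that maximizes kurtosis alone may be drawn toward an isolated outlier rather than the cyclic pattern of interest.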
As a result, many other objective functions were subsequently developed to capture cyclic information, such as the maximum correlated kurtosis deconvolution (MCKD) [21], the multipoint optimal MED adjusted (MOMEDA) [29], the second-order cyclostationarity blind deconvolution (CYCBD) [23], and the adaptive cyclostationarity blind deconvolution (ACYCBD) [3]. These methods share the common goal of extracting bearing fault-related characteristics from vibration signals in the presence of complex noise.

However, a fundamental challenge of applying BD to bearing fault diagnosis remains unresolved: How to effectively integrate BD with fault-diagnosing classifiers? Existing BD methods often assess BD performance by examining the recovered signals, and only a few works have attempted to integrate BD and convolutional neural networks (CNNs) for end-to-end fault diagnosis [30]. Simply combining BD with a classifier fails to yield optimal performance and can even exert a detrimental effect on the classification task. The main reason is that BD and the classifier operate in two separate optimization spaces, with distinct optimizers (the filter’s optimizer vs. the CNN’s optimizer) and divergent optimization objectives (the BD objective function vs. the cross-entropy loss function). This leads to a lack of consistency during training. For instance, BD may enhance the cyclic impulses of the fault signal while, at the same time, diminishing the differences between various fault severities. Therefore, there is a need for a unified framework that can coherently and efficiently integrate BD and classifiers.

In this paper, we propose a novel framework that uses neural networks to perform both BD and classification. First, we employ a neural BD to process the raw vibration signal. Building upon the multi-task neural network blind deconvolution (MNNBD) [31], we replace the conventional BD filters with neural networks.
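The correspondence between a conventional FIR filter and a convolutional kernel can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `fir_filter` and `conv1d_layer`, the toy signal, and the kernel values are all hypothetical. It assumes the usual CNN convention in which a convolutional layer computes a cross-correlation, so a kernel equal to the reversed FIR coefficients reproduces the FIR output, and several kernels yield several filtered channels.

```python
def fir_filter(x, h):
    """'Valid' FIR filtering: y[n] = sum_k h[k] * x[n + L - 1 - k], with L = len(h)."""
    L = len(h)
    return [sum(h[k] * x[n + L - 1 - k] for k in range(L))
            for n in range(len(x) - L + 1)]

def conv1d_layer(x, kernels):
    """Cross-correlation without padding, one output channel per kernel."""
    out = []
    for w in kernels:
        L = len(w)
        out.append([sum(w[k] * x[n + k] for k in range(L))
                    for n in range(len(x) - L + 1)])
    return out

# Toy signal and FIR coefficients (illustrative values only).
x = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0]
h = [0.5, 0.3, 0.2]

# A conventional single-channel FIR filter.
single_channel = fir_filter(x, h)

# A two-channel "layer": channel 0 uses the reversed FIR taps and matches the
# FIR output; channel 1 is a second, independent kernel (a simple difference).
multi_channel = conv1d_layer(x, [h[::-1], [1.0, -1.0, 0.0]])

print(single_channel)
print(multi_channel[0])
print(multi_channel[1])
```

In a neural BD, the kernel values would be trainable weights updated by the network's optimizer rather than fixed numbers as in this sketch.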
The neural BD gives rise to two advantages: i) it implements multi-channel and multi-layer filters by using convolutional kernels as adaptive filters, while the conventional BD filters are usually single-channel; ii) it employs the optimizer of convolutional neural networks (CNNs) to find the optimal filter coefficients, while the conventional BD methods rely on less-efficient matrix operations [23, 26] or particle swarm optimization (PSO) [32, 33]. Moreover, such a neural BD can be easily integrated with deep learning classifiers to achieve co-optimization of weight parameters.

Our proposed BD framework includes two neural network modules: a time domain quadratic convolutional filter and a frequency domain linear filter (we use the term “filter” to remain consistent with the terminology in the field of signal processing and BD). The former, with two layers of symmetric quadratic convolutional neural networks (QCNN) [34, 35], excels in extracting periodic impulses in the time domain. The latter, composed of a fully connected neural network, filters signals in the frequency domain after the fast Fourier transform (FFT), thus enhancing the capability to filter the signal’s frequency components.

Furthermore, inspired by advances in physics-informed neural networks [36–38], we introduce a unified framework, ClassBD, to integrate BD and deep learning classifiers. ClassBD transforms conventional BD, typically an unsupervised learning problem, into a supervised learning task using fault labels. This guides BD in extracting class-distinguishing features amidst noise. Our threefold pipeline includes neural BD as a plug-and-play module in the first layer of a deep learning classifier, a physics-informed loss function optimizing both BD filters and classifiers, and an uncertainty-aware weighing loss strategy balancing the three loss components during training.

Our contributions are summarized as follows:

1. We introduce a plug-and-play time and frequency neural blind deconvolution module. This module comprises two cascaded components: a quadratic convolutional neural filter and a frequency linear neural filter. From a mathematical perspective, we demonstrate that the quadratic neural filter enhances the filter’s capacity to extract periodic impulses in the time domain. The linear neural filter, on the other hand, offers the ability to filter signals in the frequency domain, which is a crucial enhancement for improving BD performance.

2. We develop a unified framework, ClassBD, to integrate BD and deep learning classifiers. By employing a deep learning classifier to guide the learning of BD filters, we transition from the conventional unsupervised BD optimization to supervised learning. The fault labels supply useful information in guiding the BD to extract class-distinguishing features amidst background noise. To the best of our knowledge, this is the first BD method of its kind to achieve bearing fault diagnosis under heavy noise while providing good interpretability.

The rest of the paper is organized as follows. Section 2 introduces background knowledge of blind deconvolution in signal processing and quadratic neural networks. Section 3 presents the proposed method in detail. In Section 4, we conduct computational experiments on two public datasets and one private dataset. Section 5 analyzes the properties of the proposed method. Section 6 concludes the paper.

This paper is available on arXiv under CC BY 4.0 Deed (Attribution 4.0 International) license.