Understanding the Monotonicity of the Sparsity Objective Function

Table of Links

Abstract and 1. Introduction
2. Preliminaries and 2.1. Blind deconvolution
2.2. Quadratic neural networks
3. Methodology
3.1. Time domain quadratic convolutional filter
3.2. Superiority of cyclic features extraction by QCNN
3.3. Frequency domain linear filter with envelope spectrum objective function
3.4. Integral optimization with uncertainty-aware weighing scheme
4. Computational experiments
4.1. Experimental configurations
4.2. Case study 1: PU dataset
4.3. Case study 2: JNU dataset
4.4. Case study 3: HIT dataset
5. Computational experiments
5.1. Comparison of BD methods
5.2. Classification results on various noise conditions
5.3. Employing ClassBD to deep learning classifiers
5.4. Employing ClassBD to machine learning classifiers
5.5. Feature extraction ability of quadratic and conventional networks
5.6. Comparison of ClassBD filters
6. Conclusions
Appendix and References

Appendix

The monotonicity of the sparsity objective function is derived as follows [31, 39]:

Then, the numerator of (38) is simplified to:

When 𝑝 > 𝑞 > 0, the following inequality holds:

Therefore, substituting (40) into (38), we have:

Similarly, when 0 < 𝑝 < 𝑞, we have:

References

[1] R. B. Randall, V.-b. C. Monitoring, Industrial, aerospace and automotive applications, Vibration-based Condition Monitoring. West Sussex (2011) 13–20.
[2] Y. Miao, B. Zhang, J. Lin, M. Zhao, H. Liu, Z. Liu, H. Li, A review on the application of blind deconvolution in machinery fault diagnosis, Mechanical Systems and Signal Processing 163 (2022) 108202.
[3] Z. Wang, J. Zhou, W. Du, Y. Lei, J. Wang, Bearing fault diagnosis method based on adaptive maximum cyclostationarity blind deconvolution, Mechanical Systems and Signal Processing 162 (2022) 108018.
[4] S. Li, J. Ji, Y. Xu, X. Sun, K. Feng, B. Sun, Y. Wang, F. Gu, K. Zhang, Q. Ni, Ifd-mdcn: Multibranch denoising convolutional networks with improved flow direction strategy for intelligent fault diagnosis of rolling bearings under noisy conditions, Reliability Engineering & System Safety 237 (2023) 109387.
[5] S. Li, J. Ji, Y. Xu, K. Feng, K. Zhang, J. Feng, M. Beer, Q. Ni, Y. Wang, Dconformer: A denoising convolutional transformer with joint learning strategy for intelligent diagnosis of bearing faults, Mechanical Systems and Signal Processing 210 (2024) 111142.
[6] S. Zhang, S. Zhang, B. Wang, T. G. Habetler, Deep learning algorithms for bearing fault diagnostics—a comprehensive review, IEEE Access 8 (2020) 29857–29881.
[7] T. Zuo, K. Zhang, Q. Zheng, X. Li, Z. Li, G. Ding, M. Zhao, A hybrid attention-based multi-wavelet coefficient fusion method in rul prognosis of rolling bearings, Reliability Engineering & System Safety 237 (2023) 109337.
[8] P. Liang, J. Tian, S. Wang, X. Yuan, Multi-source information joint transfer diagnosis for rolling bearing with unknown faults via wavelet transform and an improved domain adaptation network, Reliability Engineering & System Safety 242 (2024) 109788.
[9] X. Lou, K. A. Loparo, Bearing fault diagnosis based on wavelet transform and fuzzy inference, Mechanical Systems and Signal Processing 18 (2004) 1077–1095.
[10] Z. Peng, W. T. Peter, F. Chu, A comparison study of improved hilbert–huang transform and wavelet transform: Application to fault diagnosis for rolling bearing, Mechanical Systems and Signal Processing 19 (2005) 974–988.
[11] P. K. Kankar, S. C. Sharma, S. P. Harsha, Rolling element bearing fault diagnosis using wavelet transform, Neurocomputing 74 (2011) 1638–1645.
[12] Y. Xu, Y. Deng, J. Zhao, W. Tian, C. Ma, A novel rolling bearing fault diagnosis method based on empirical wavelet transform and spectral trend, IEEE Transactions on Instrumentation and Measurement 69 (2019) 2891–2904.
[13] K. Dragomiretskiy, D. Zosso, Variational mode decomposition, IEEE Transactions on Signal Processing 62 (2013) 531–544.
[14] Y. Wang, R. Markert, J. Xiang, W. Zheng, Research on variational mode decomposition and its application in detecting rub-impact fault of the rotor system, Mechanical Systems and Signal Processing 60 (2015) 243–251.
[15] G. W. Stewart, On the early history of the singular value decomposition, SIAM Review 35 (1993) 551–566.
[16] H. Li, T. Liu, X. Wu, Q. Chen, A bearing fault diagnosis method based on enhanced singular value decomposition, IEEE Transactions on Industrial Informatics 17 (2020) 3220–3230.
[17] V. Vrabie, P. Granjon, C. Serviere, Spectral kurtosis: from definition to application, in: 6th IEEE international workshop on Nonlinear Signal and Image Processing (NSIP 2003), 2003, p. xx.
[18] J. Antoni, The spectral kurtosis: a useful tool for characterising non-stationary signals, Mechanical Systems and Signal Processing 20 (2006) 282–307.
[19] Y. Wang, J. Xiang, R. Markert, M. Liang, Spectral kurtosis for fault detection, diagnosis and prognostics of rotating machines: A review with applications, Mechanical Systems and Signal Processing 66 (2016) 679–698.
[20] A. McCormick, A. Nandi, Cyclostationarity in rotating machine vibrations, Mechanical Systems and Signal Processing 12 (1998) 225–242.
[21] G. L. McDonald, Q. Zhao, M. J. Zuo, Maximum correlated kurtosis deconvolution and application on gear tooth chip fault detection, Mechanical Systems and Signal Processing 33 (2012) 237–255.
[22] C. A. Cabrelli, Minimum entropy deconvolution and simplicity: A noniterative algorithm, Geophysics 50 (1985) 394–413.
[23] M. Buzzoni, J. Antoni, G. d’Elia, Blind deconvolution based on cyclostationarity maximization and its application to fault identification, Journal of Sound and Vibration 432 (2018) 569–601.
[24] L. He, C. Yi, D. Wang, F. Wang, J.-h. Lin, Optimized minimum generalized lp/lq deconvolution for recovering repetitive impacts from a vibration mixture, Measurement 168 (2021) 108329.
[25] S. Haykin, Adaptive filter theory, Prentice-Hall, Inc., USA, 1986.
[26] R. A. Wiggins, Minimum entropy deconvolution, Geoexploration 16 (1978) 21–35.
[27] K. Pearson, “das fehlergesetz und seine verallgemeiner-ungen durch fechner und pearson.” a rejoinder, Biometrika 4 (1905) 169–212.
[28] Y. Cheng, B. Chen, W. Zhang, Adaptive multipoint optimal minimum entropy deconvolution adjusted and application to fault diagnosis of rolling element bearings, IEEE Sensors Journal 19 (2019) 12153–12164.
[29] G. L. McDonald, Q. Zhao, Multipoint optimal minimum entropy deconvolution and convolution fix: application to vibration fault detection, Mechanical Systems and Signal Processing 82 (2017) 461–477.
[30] S. Wang, J. Xiang, A minimum entropy deconvolution-enhanced convolutional neural networks for fault diagnosis of axial piston pumps, Soft Computing 24 (2020) 2983–2997.
[31] J.-X. Liao, H.-C. Dong, L. Luo, J. Sun, S. Zhang, Multi-task neural network blind deconvolution and its application to bearing fault feature extraction, Measurement Science and Technology 34 (2023) 075017.
[32] Y. Cheng, N. Zhou, W. Zhang, Z. Wang, Application of an improved minimum entropy deconvolution method for railway rolling element bearing fault diagnosis, Journal of Sound and Vibration 425 (2018) 53–69.
[33] Y. Cheng, B. Chen, G. Mei, Z. Wang, W. Zhang, A novel blind deconvolution method and its application to fault identification, Journal of Sound and Vibration 460 (2019) 114900.
[34] F. Fan, W. Cong, G. Wang, A new type of neurons for machine learning, International journal for numerical methods in biomedical engineering 34 (2018) e2920.
[35] J.-X. Liao, H.-C. Dong, Z.-Q. Sun, J. Sun, S. Zhang, F.-L. Fan, Attention-embedded quadratic network (qttention) for effective and interpretable bearing fault diagnosis, IEEE Transactions on Instrumentation and Measurement 72 (2023) 1–13.
[36] C. He, H. Shi, J. Li, Idsn: A one-stage interpretable and differentiable stft domain adaptation network for traction motor of high-speed trains cross-machine diagnosis, Mechanical Systems and Signal Processing 205 (2023) 110846.
[37] Q. Ni, J. Ji, B. Halkon, K. Feng, A. K. Nandi, Physics-informed residual network (piresnet) for rolling element bearing fault diagnostics, Mechanical Systems and Signal Processing 200 (2023) 110544.
[38] S. Yang, B. Tang, W. Wang, Q. Yang, C. Hu, Physics-informed multi-state temporal frequency network for rul prediction of rolling bearings, Reliability Engineering & System Safety 242 (2024) 109716.
[39] L. He, D. Wang, C. Yi, Q. Zhou, J. Lin, Extracting cyclo-stationarity of repetitive transients from envelope spectrum based on prior-unknown blind deconvolution technique, Signal Processing 183 (2021) 107997.
[40] B. Fang, J. Hu, C. Yang, Y. Cao, M. Jia, A blind deconvolution algorithm based on backward automatic differentiation and its application to rolling bearing fault diagnosis, Measurement Science and Technology 33 (2021) 025009.
[41] B. Fang, J. Hu, C. Yang, X. Chen, Minimum noise amplitude deconvolution and its application in repetitive impact detection, Structural Health Monitoring (2022) 14759217221114527.
[42] A. G. Ivakhnenko, Polynomial theory of complex systems, IEEE Transactions on Systems, Man, and Cybernetics (1971) 364–378.
[43] Y. Shin, J. Ghosh, The pi-sigma network: An efficient higher-order neural network for pattern classification and function approximation, in: IJCNN-91-Seattle International Joint Conference on Neural Networks, volume 1, IEEE, 1991, pp. 13–18.
[44] G. G. Chrysos, B. Wang, J. Deng, V. Cevher, Regularization of polynomial networks for image recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 16123–16132.
[45] G. Chrysos, S. Moschoglou, G. Bouritsas, J. Deng, Y. Panagakis, S. P. Zafeiriou, Deep polynomial neural networks, IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
[46] G. G. Chrysos, M. Georgopoulos, J. Deng, J. Kossaifi, Y. Panagakis, A. Anandkumar, Augmenting deep classifiers with polynomial neural networks, in: European Conference on Computer Vision, Springer, 2022, pp. 692–716.
[47] Z. Xu, F. Yu, J. Xiong, X. Chen, Quadralib: A performant quadratic neural network library for architecture optimization and design exploration, Proceedings of Machine Learning and Systems 4 (2022) 503–514.
[48] G. Zoumpourlis, A. Doumanoglou, N. Vretos, P. Daras, Non-linear convolution filters for cnn-based learning, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 4761–4769.
[49] P. Micikevicius, S. Narang, J. Alben, G. Diamos, E. Elsen, D. Garcia, B. Ginsburg, M. Houston, O. Kuchaiev, G. Venkatesh, et al., Mixed precision training, in: International Conference on Learning Representations, 2018.
[50] Y. Jiang, F. Yang, H. Zhu, D. Zhou, X. Zeng, Nonlinear cnn: improving cnns with quadratic convolutions, Neural Computing and Applications 32 (2020) 8507–8516.
[51] P. Mantini, S. K. Shah, Cqnn: Convolutional quadratic neural networks, in: 2020 25th International Conference on Pattern Recognition (ICPR), IEEE, 2021, pp. 9819–9826.
[52] M. Goyal, R. Goyal, B. Lall, Improved polynomial neural networks with normalised activations, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8.
[53] J. Bu, A. Karpatne, Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes, in: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), SIAM, 2021, pp. 675–683.
[54] W. Zhang, G. Peng, C. Li, Y. Chen, Z. Zhang, A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals, Sensors 17 (2017) 425.
[55] F.-L. Fan, H.-C. Dong, Z. Wu, L. Ruan, T. Zeng, Y. Cui, J.-X. Liao, One neuron saved is one neuron earned: On parametric efficiency of quadratic networks, arXiv preprint arXiv:2303.06316 (2023).
[56] F. Fan, J. Xiong, G. Wang, Universal approximation with quadratic deep networks, Neural Networks 124 (2020) 383–392.
[57] W.-E. Yu, J. Sun, S. Zhang, X. Zhang, J.-X. Liao, A class-weighted supervised contrastive learning long-tailed bearing fault diagnosis approach using quadratic neural network, 2023. arXiv:2309.11717.
[58] Y. Tang, C. Zhang, J. Wu, Y. Xie, W. Shen, J. Wu, Deep learning-based bearing fault diagnosis using a trusted multiscale quadratic attention-embedded convolutional neural network, IEEE Transactions on Instrumentation and Measurement 73 (2024) 1–15.
[59] F.-L. Fan, M. Li, F. Wang, R. Lai, G. Wang, On expressivity and trainability of quadratic networks, IEEE Transactions on Neural Networks and Learning Systems (2023).
[60] R. B. Randall, J. Antoni, S. Chobsaard, The relationship between spectral correlation and envelope analysis in the diagnostics of bearing faults and other cyclostationary machine signals, Mechanical Systems and Signal Processing 15 (2001) 945–962.
[61] J. Antoni, Cyclic spectral analysis of rolling-element bearing signals: Facts and fictions, Journal of Sound and Vibration 304 (2007) 497–529.
[62] R. B. Randall, Vibration-based condition monitoring: industrial, automotive and aerospace applications, John Wiley & Sons, 2021.
[63] L. Chi, B. Jiang, Y. Mu, Fast fourier convolution, Advances in Neural Information Processing Systems 33 (2020) 4479–4488.
[64] H. Yu, J. Huang, L. Li, M. Zhou, F. Zhao, Deep fractional fourier transform, in: Advances in Neural Information Processing Systems, volume 36, Curran Associates, Inc., 2023, pp. 72761–72773.
[65] W. T. Peter, D. Wang, The design of a new sparsogram for fast bearing fault diagnosis: Part 1 of the two related manuscripts that have a joint title as “two automatic vibration-based fault diagnostic methods using the novel sparsity measurement–parts 1 and 2”, Mechanical Systems and Signal Processing 40 (2013) 499–519.
[66] H. Zhang, X. Chen, Z. Du, R. Yan, Kurtosis based weighted sparse model with convex optimization technique for bearing fault diagnosis, Mechanical Systems and Signal Processing 80 (2016) 349–376.
[67] L. Li, Sparsity-promoted blind deconvolution of ground-penetrating radar (gpr) data, IEEE Geoscience and Remote Sensing Letters 11 (2014) 1330–1334.
[68] S. Ruder, An overview of multi-task learning in deep neural networks, arXiv preprint arXiv:1706.05098 (2017).
[69] A. Kendall, Y. Gal, R. Cipolla, Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7482–7491.
[70] B. Lin, Y. Zhang, LibMTL: A Python library for multi-task learning, Journal of Machine Learning Research 24 (2023) 1–7.
[71] Z. Zhao, T. Li, J. Wu, C. Sun, S. Wang, R. Yan, X. Chen, Deep learning algorithms for rotating machinery intelligent diagnosis: An open source benchmark study, ISA Transactions 107 (2020) 224–255.
[72] M. Zhao, S. Zhong, X. Fu, B. Tang, M. Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics 16 (2020) 4681–4690.
[73] T. Li, Z. Zhao, C. Sun, L. Cheng, X. Chen, R. Yan, R. X. Gao, Waveletkernelnet: An interpretable deep neural network for industrial intelligent diagnosis, IEEE Transactions on Systems, Man, and Cybernetics: Systems 52 (2022) 2302–2312.
[74] C. He, H. Shi, J. Si, J. Li, Physics-informed interpretable wavelet weight initialization and balanced dynamic adaptive threshold for intelligent fault diagnosis of rolling bearings, Journal of Manufacturing Systems 70 (2023) 579–592.
[75] L. Jia, T. W. Chow, Y. Yuan, Gtfe-net: A gramian time frequency enhancement cnn for bearing fault diagnosis, Engineering Applications of Artificial Intelligence 119 (2023) 105794.
[76] Q. Chen, X. Dong, G. Tu, D. Wang, C. Cheng, B. Zhao, Z. Peng, Tfn: An interpretable neural network with time-frequency transform embedded for intelligent fault diagnosis, Mechanical Systems and Signal Processing 207 (2024) 110952.
[77] S. Ruder, An overview of gradient descent optimization algorithms, arXiv preprint arXiv:1609.04747 (2016).
[78] I. Loshchilov, F. Hutter, Sgdr: Stochastic gradient descent with warm restarts, arXiv preprint arXiv:1608.03983 (2016).
[79] C. Lessmeier, J. K. Kimotho, D. Zimmer, W. Sextro, Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: A benchmark data set for data-driven classification, in: PHM Society European Conference, volume 3, 2016.
[80] K. Li, X. Ping, H. Wang, P. Chen, Y. Cao, Sequential fuzzy diagnosis method for motor roller bearing in variable operating conditions based on vibration analysis, Sensors 13 (2013) 8013–8041.
[81] G. E. Hinton, Visualizing high-dimensional data using t-sne, Vigiliae Christianae 9 (2008) 2579–2605.
[82] Y. Miao, M. Zhao, J. Lin, X. Xu, Sparse maximum harmonics-to-noise-ratio deconvolution for weak fault signature detection in bearings, Measurement Science and Technology 27 (2016) 105004.
[83] Y. Miao, M. Zhao, J. Lin, Y. Lei, Application of an improved maximum correlated kurtosis deconvolution method for fault diagnosis of rolling element bearings, Mechanical Systems and Signal Processing 92 (2017) 173–195.
[84] J. Antoni, G. Xin, N. Hamzaoui, Fast computation of the spectral correlation, Mechanical Systems and Signal Processing 92 (2017) 248–277.
[85] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[86] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, Q. V. Le, H. Adam, Searching for mobilenetv3, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324.
[87] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017).
[88] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (1995) 273–297.
[89] A. Mucherino, P. J. Papajorgji, P. M. Pardalos, A. Mucherino, P. J. Papajorgji, P. M. Pardalos, K-nearest neighbor classification, Data Mining in Agriculture (2009) 83–106.
[90] T. K. Ho, Random decision forests, in: Proceedings of 3rd International Conference on Document Analysis and Recognition, volume 1, IEEE, 1995, pp. 278–282.
[91] D. R. Cox, The regression analysis of binary sequences, Journal of the Royal Statistical Society Series B: Statistical Methodology 20 (1958) 215–232.
[92] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, Lightgbm: A highly efficient gradient boosting decision tree, Advances in Neural Information Processing Systems 30 (2017).
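
The Appendix states that the sign of the monotonicity flips between the cases 𝑝 > 𝑞 > 0 and 0 < 𝑝 < 𝑞, but its numbered equations are not reproduced in this text. As a rough numerical illustration only, the sketch below assumes the sparsity objective has the generalized lp/lq ratio form sum(a_i^p) / (sum(a_i^q))^(p/q) over positive amplitudes a_i, in the spirit of [24, 39]. The function names, the test amplitudes, and the finite-difference check are all hypothetical and are not taken from the paper. Under that assumption, raising the largest amplitude should increase the ratio when 𝑝 > 𝑞 and decrease it when 𝑝 < 𝑞.

```python
import numpy as np


def sparsity_ratio(a, p, q):
    """Assumed lp/lq-style ratio: sum(a^p) / (sum(a^q))^(p/q), for a > 0."""
    a = np.asarray(a, dtype=float)
    return np.sum(a ** p) / np.sum(a ** q) ** (p / q)


def slope_at_peak(a, p, q, eps=1e-6):
    """Central finite-difference slope of the ratio w.r.t. the largest amplitude."""
    a = np.asarray(a, dtype=float)
    k = int(np.argmax(a))          # index of the dominant amplitude
    up, down = a.copy(), a.copy()
    up[k] += eps
    down[k] -= eps
    return (sparsity_ratio(up, p, q) - sparsity_ratio(down, p, q)) / (2 * eps)


# Hypothetical positive amplitudes with one dominant peak.
amplitudes = np.array([0.2, 0.5, 1.0, 3.0])

slope_p_gt_q = slope_at_peak(amplitudes, p=2.0, q=1.0)  # case p > q > 0
slope_p_lt_q = slope_at_peak(amplitudes, p=1.0, q=2.0)  # case 0 < p < q

print("p > q: slope is positive ->", slope_p_gt_q > 0)
print("p < q: slope is negative ->", slope_p_lt_q < 0)
```

This only probes a single amplitude vector, so it is a sanity check of the sign behaviour under the assumed form rather than a substitute for the derivation in [31, 39].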
Authors: (1) Jing-Xiao Liao, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China and School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (2) Chao He, School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, China; (3) Jipu Li, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China; (4) Jinwei Sun, School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (5) Shiping Zhang (Corresponding author), School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (6) Xiaoge Zhang (Corresponding author), Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China. This paper is available on arxiv under CC by 4.0 Deed (Attribution 4.0 International) license. Table of Links Abstract and 1. Introduction Abstract and 1. Introduction 2. Preliminaries and 2.1. Blind deconvolution 2. Preliminaries and 2.1. Blind deconvolution 2.2. Quadratic neural networks 2.2. Quadratic neural networks 3. Methodology 3. Methodology 3.1. Time domain quadratic convolutional filter 3.1. Time domain quadratic convolutional filter 3.2. Superiority of cyclic features extraction by QCNN 3.2. Superiority of cyclic features extraction by QCNN 3.3. Frequency domain linear filter with envelope spectrum objective function 3.3. Frequency domain linear filter with envelope spectrum objective function 3.4. Integral optimization with uncertainty-aware weighing scheme 3.4. Integral optimization with uncertainty-aware weighing scheme 4. Computational experiments 4.1. Experimental configurations 4.1. Experimental configurations 4.2. Case study 1: PU dataset 4.2. Case study 1: PU dataset 4.3. 
Case study 2: JNU dataset 4.3. Case study 2: JNU dataset 4.4. Case study 3: HIT dataset 4.4. Case study 3: HIT dataset 5. Computational experiments 5.1. Comparison of BD methods 5.1. Comparison of BD methods 5.2. Classification results on various noise conditions 5.2. Classification results on various noise conditions 5.3. Employing ClassBD to deep learning classifiers 5.3. Employing ClassBD to deep learning classifiers 5.4. Employing ClassBD to machine learning classifiers 5.4. Employing ClassBD to machine learning classifiers 5.5. Feature extraction ability of quadratic and conventional networks 5.5. Feature extraction ability of quadratic and conventional networks 5.6. Comparison of ClassBD filters 5.6. Comparison of ClassBD filters 6. Conclusions 6. Conclusions Appendix and References Appendix and References Appendix The monotonicity of the sparsity objective function is derived as follows [31, 39]: Then, the numerator of (38) is simplified to: when 𝑝 > 𝑞 > 0, the following inequality holds: Therefore, substituting (40) into (38), we have: Similarly, when 0 < 𝑝 < 𝑞, we have: References [1] R. B. Randall, V.-b. C. Monitoring, Industrial, aerospace and automotive applications, Vibration-based Condition Monitoring. West Sussex (2011) 13–20. [2] Y. Miao, B. Zhang, J. Lin, M. Zhao, H. Liu, Z. Liu, H. Li, A review on the application of blind deconvolution in machinery fault diagnosis, Mechanical Systems and Signal Processing 163 (2022) 108202. [3] Z. Wang, J. Zhou, W. Du, Y. Lei, J. Wang, Bearing fault diagnosis method based on adaptive maximum cyclostationarity blind deconvolution, Mechanical Systems and Signal Processing 162 (2022) 108018. [4] S. Li, J. Ji, Y. Xu, X. Sun, K. Feng, B. Sun, Y. Wang, F. Gu, K. Zhang, Q. Ni, Ifd-mdcn: Multibranch denoising convolutional networks with improved flow direction strategy for intelligent fault diagnosis of rolling bearings under noisy conditions, Reliability Engineering & System Safety 237 (2023) 109387. [5] S. Li, J. Ji, Y. 
Xu, K. Feng, K. Zhang, J. Feng, M. Beer, Q. Ni, Y. Wang, Dconformer: A denoising convolutional transformer with joint learning strategy for intelligent diagnosis of bearing faults, Mechanical Systems and Signal Processing 210 (2024) 111142. [6] S. Zhang, S. Zhang, B. Wang, T. G. Habetler, Deep learning algorithms for bearing fault diagnostics—a comprehensive review, IEEE Access 8 (2020) 29857–29881. [7] T. Zuo, K. Zhang, Q. Zheng, X. Li, Z. Li, G. Ding, M. Zhao, A hybrid attention-based multi-wavelet coefficient fusion method in rul prognosis of rolling bearings, Reliability Engineering & System Safety 237 (2023) 109337. [8] P. Liang, J. Tian, S. Wang, X. Yuan, Multi-source information joint transfer diagnosis for rolling bearing with unknown faults via wavelet transform and an improved domain adaptation network, Reliability Engineering & System Safety 242 (2024) 109788. [9] X. Lou, K. A. Loparo, Bearing fault diagnosis based on wavelet transform and fuzzy inference, Mechanical Systems and Signal Processing 18 (2004) 1077–1095. [10] Z. Peng, W. T. Peter, F. Chu, A comparison study of improved hilbert–huang transform and wavelet transform: Application to fault diagnosis for rolling bearing, Mechanical Systems and Signal Processing 19 (2005) 974–988. [11] P. K. Kankar, S. C. Sharma, S. P. Harsha, Rolling element bearing fault diagnosis using wavelet transform, Neurocomputing 74 (2011) 1638–1645. [12] Y. Xu, Y. Deng, J. Zhao, W. Tian, C. Ma, A novel rolling bearing fault diagnosis method based on empirical wavelet transform and spectral trend, IEEE Transactions on Instrumentation and Measurement 69 (2019) 2891–2904. [13] K. Dragomiretskiy, D. Zosso, Variational mode decomposition, IEEE Transactions on Signal Processing 62 (2013) 531–544. [14] Y. Wang, R. Markert, J. Xiang, W. Zheng, Research on variational mode decomposition and its application in detecting rub-impact fault of the rotor system, Mechanical Systems and Signal Processing 60 (2015) 243–251. [15] G. W. 
Stewart, On the early history of the singular value decomposition, SIAM Review 35 (1993) 551–566. [16] H. Li, T. Liu, X. Wu, Q. Chen, A bearing fault diagnosis method based on enhanced singular value decomposition, IEEE Transactions on Industrial Informatics 17 (2020) 3220–3230. [17] V. Vrabie, P. Granjon, C. Serviere, Spectral kurtosis: from definition to application, in: 6th IEEE international workshop on Nonlinear Signal and Image Processing (NSIP 2003), 2003, p. xx. [18] J. Antoni, The spectral kurtosis: a useful tool for characterising non-stationary signals, Mechanical Systems and Signal Processing 20 (2006) 282–307. [19] Y. Wang, J. Xiang, R. Markert, M. Liang, Spectral kurtosis for fault detection, diagnosis and prognostics of rotating machines: A review with applications, Mechanical Systems and Signal Processing 66 (2016) 679–698. [20] A. McCormick, A. Nandi, Cyclostationarity in rotating machine vibrations, Mechanical Systems and Signal Processing 12 (1998) 225–242. [21] G. L. McDonald, Q. Zhao, M. J. Zuo, Maximum correlated kurtosis deconvolution and application on gear tooth chip fault detection, Mechanical Systems and Signal Processing 33 (2012) 237–255. [22] C. A. Cabrelli, Minimum entropy deconvolution and simplicity: A noniterative algorithm, Geophysics 50 (1985) 394–413. [23] M. Buzzoni, J. Antoni, G. d’Elia, Blind deconvolution based on cyclostationarity maximization and its application to fault identification, Journal of Sound and Vibration 432 (2018) 569–601. [24] L. He, C. Yi, D. Wang, F. Wang, J.-h. Lin, Optimized minimum generalized lp/lq deconvolution for recovering repetitive impacts from a vibration mixture, Measurement 168 (2021) 108329. [25] S. Haykin, Adaptive filter theory, Prentice-Hall, Inc., USA, 1986. [26] R. A. Wiggins, Minimum entropy deconvolution, Geoexploration 16 (1978) 21–35. [27] K. Pearson, “das fehlergesetz und seine verallgemeiner-ungen durch fechner und pearson.” a rejoinder, Biometrika 4 (1905) 169–212. [28] Y. 
Cheng, B. Chen, W. Zhang, Adaptive multipoint optimal minimum entropy deconvolution adjusted and application to fault diagnosis of rolling element bearings, IEEE Sensors Journal 19 (2019) 12153–12164. [29] G. L. McDonald, Q. Zhao, Multipoint optimal minimum entropy deconvolution and convolution fix: application to vibration fault detection, Mechanical Systems and Signal Processing 82 (2017) 461–477. [30] S. Wang, J. Xiang, A minimum entropy deconvolution-enhanced convolutional neural networks for fault diagnosis of axial piston pumps, Soft Computing 24 (2020) 2983–2997. [31] J.-X. Liao, H.-C. Dong, L. Luo, J. Sun, S. Zhang, Multi-task neural network blind deconvolution and its application to bearing fault feature extraction, Measurement Science and Technology 34 (2023) 075017. [32] Y. Cheng, N. Zhou, W. Zhang, Z. Wang, Application of an improved minimum entropy deconvolution method for railway rolling element bearing fault diagnosis, Journal of Sound and Vibration 425 (2018) 53–69. [33] Y. Cheng, B. Chen, G. Mei, Z. Wang, W. Zhang, A novel blind deconvolution method and its application to fault identification, Journal of Sound and Vibration 460 (2019) 114900. [34] F. Fan, W. Cong, G. Wang, A new type of neurons for machine learning, International journal for numerical methods in biomedical engineering 34 (2018) e2920. [35] J.-X. Liao, H.-C. Dong, Z.-Q. Sun, J. Sun, S. Zhang, F.-L. Fan, Attention-embedded quadratic network (qttention) for effective and interpretable bearing fault diagnosis, IEEE Transactions on Instrumentation and Measurement 72 (2023) 1–13. [36] C. He, H. Shi, J. Li, Idsn: A one-stage interpretable and differentiable stft domain adaptation network for traction motor of high-speed trains cross-machine diagnosis, Mechanical Systems and Signal Processing 205 (2023) 110846. [37] Q. Ni, J. Ji, B. Halkon, K. Feng, A. K. 
Nandi, Physics-informed residual network (piresnet) for rolling element bearing fault diagnostics, Mechanical Systems and Signal Processing 200 (2023) 110544. [38] S. Yang, B. Tang, W. Wang, Q. Yang, C. Hu, Physics-informed multi-state temporal frequency network for rul prediction of rolling bearings, Reliability Engineering & System Safety 242 (2024) 109716. [39] L. He, D. Wang, C. Yi, Q. Zhou, J. Lin, Extracting cyclo-stationarity of repetitive transients from envelope spectrum based on prior-unknown blind deconvolution technique, Signal Processing 183 (2021) 107997. [40] B. Fang, J. Hu, C. Yang, Y. Cao, M. Jia, A blind deconvolution algorithm based on backward automatic differentiation and its application to rolling bearing fault diagnosis, Measurement Science and Technology 33 (2021) 025009. [41] B. Fang, J. Hu, C. Yang, X. Chen, Minimum noise amplitude deconvolution and its application in repetitive impact detection, Structural Health Monitoring (2022) 14759217221114527. [42] A. G. Ivakhnenko, Polynomial theory of complex systems, IEEE Transactions on Systems, Man, and Cybernetics (1971) 364–378. [43] Y. Shin, J. Ghosh, The pi-sigma network: An efficient higher-order neural network for pattern classification and function approximation, in: IJCNN-91-Seattle International Joint Conference on Neural Networks, volume 1, IEEE, 1991, pp. 13–18. [44] G. G. Chrysos, B. Wang, J. Deng, V. Cevher, Regularization of polynomial networks for image recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 16123–16132. [45] G. Chrysos, S. Moschoglou, G. Bouritsas, J. Deng, Y. Panagakis, S. P. Zafeiriou, Deep polynomial neural networks, IEEE Transactions on Pattern Analysis and Machine Intelligence (2021). [46] G. G. Chrysos, M. Georgopoulos, J. Deng, J. Kossaifi, Y. Panagakis, A. Anandkumar, Augmenting deep classifiers with polynomial neural networks, in: European Conference on Computer Vision, Springer, 2022, pp. 692–716. 
[47] Z. Xu, F. Yu, J. Xiong, X. Chen, Quadralib: A performant quadratic neural network library for architecture optimization and design exploration, Proceedings of Machine Learning and Systems 4 (2022) 503–514. [48] G. Zoumpourlis, A. Doumanoglou, N. Vretos, P. Daras, Non-linear convolution filters for cnn-based learning, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 4761–4769. [49] P. Micikevicius, S. Narang, J. Alben, G. Diamos, E. Elsen, D. Garcia, B. Ginsburg, M. Houston, O. Kuchaiev, G. Venkatesh, et al., Mixed precision training, in: International Conference on Learning Representations, 2018. [50] Y. Jiang, F. Yang, H. Zhu, D. Zhou, X. Zeng, Nonlinear cnn: improving cnns with quadratic convolutions, Neural Computing and Applications 32 (2020) 8507–8516. [51] P. Mantini, S. K. Shah, Cqnn: Convolutional quadratic neural networks, in: 2020 25th International Conference on Pattern Recognition (ICPR), IEEE, 2021, pp. 9819–9826. [52] M. Goyal, R. Goyal, B. Lall, Improved polynomial neural networks with normalised activations, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8. [53] J. Bu, A. Karpatne, Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes, in: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), SIAM, 2021, pp. 675–683. [54] W. Zhang, G. Peng, C. Li, Y. Chen, Z. Zhang, A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals, Sensors 17 (2017) 425. [55] F.-L. Fan, H.-C. Dong, Z. Wu, L. Ruan, T. Zeng, Y. Cui, J.-X. Liao, One neuron saved is one neuron earned: On parametric efficiency of quadratic networks, arXiv preprint arXiv:2303.06316 (2023). [56] F. Fan, J. Xiong, G. Wang, Universal approximation with quadratic deep networks, Neural Networks 124 (2020) 383–392. [57] W.-E. Yu, J. Sun, S. Zhang, X. Zhang, J.-X. 
Liao, A class-weighted supervised contrastive learning long-tailed bearing fault diagnosis approach using quadratic neural network, 2023. arXiv:2309.11717. [58] Y. Tang, C. Zhang, J. Wu, Y. Xie, W. Shen, J. Wu, Deep learning-based bearing fault diagnosis using a trusted multiscale quadratic attention-embedded convolutional neural network, IEEE Transactions on Instrumentation and Measurement 73 (2024) 1–15. [59] F.-L. Fan, M. Li, F. Wang, R. Lai, G. Wang, On expressivity and trainability of quadratic networks, IEEE Transactions on Neural Networks and Learning Systems (2023). [60] R. B. Randall, J. Antoni, S. Chobsaard, The relationship between spectral correlation and envelope analysis in the diagnostics of bearing faults and other cyclostationary machine signals, Mechanical Systems and Signal Processing 15 (2001) 945–962. [61] J. Antoni, Cyclic spectral analysis of rolling-element bearing signals: Facts and fictions, Journal of Sound and Vibration 304 (2007) 497–529. [62] R. B. Randall, Vibration-based condition monitoring: industrial, automotive and aerospace applications, John Wiley & Sons, 2021. [63] L. Chi, B. Jiang, Y. Mu, Fast fourier convolution, Advances in Neural Information Processing Systems 33 (2020) 4479–4488. [64] H. Yu, J. Huang, L. Li, M. Zhou, F. Zhao, Deep fractional fourier transform, in: Advances in Neural Information Processing Systems, volume 36, Curran Associates, Inc., 2023, pp. 72761–72773. [65] W. T. Peter, D. Wang, The design of a new sparsogram for fast bearing fault diagnosis: Part 1 of the two related manuscripts that have a joint title as “two automatic vibration-based fault diagnostic methods using the novel sparsity measurement–parts 1 and 2”, Mechanical Systems and Signal Processing 40 (2013) 499–519. [66] H. Zhang, X. Chen, Z. Du, R. Yan, Kurtosis based weighted sparse model with convex optimization technique for bearing fault diagnosis, Mechanical Systems and Signal Processing 80 (2016) 349–376. [67] L.
Li, Sparsity-promoted blind deconvolution of ground-penetrating radar (gpr) data, IEEE Geoscience and Remote Sensing Letters 11 (2014) 1330–1334. [68] S. Ruder, An overview of multi-task learning in deep neural networks, arXiv preprint arXiv:1706.05098 (2017). [69] A. Kendall, Y. Gal, R. Cipolla, Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7482–7491. [70] B. Lin, Y. Zhang, LibMTL: A Python library for multi-task learning, Journal of Machine Learning Research 24 (2023) 1–7. [71] Z. Zhao, T. Li, J. Wu, C. Sun, S. Wang, R. Yan, X. Chen, Deep learning algorithms for rotating machinery intelligent diagnosis: An open source benchmark study, ISA Transactions 107 (2020) 224–255. [72] M. Zhao, S. Zhong, X. Fu, B. Tang, M. Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics 16 (2020) 4681–4690. [73] T. Li, Z. Zhao, C. Sun, L. Cheng, X. Chen, R. Yan, R. X. Gao, Waveletkernelnet: An interpretable deep neural network for industrial intelligent diagnosis, IEEE Transactions on Systems, Man, and Cybernetics: Systems 52 (2022) 2302–2312. [74] C. He, H. Shi, J. Si, J. Li, Physics-informed interpretable wavelet weight initialization and balanced dynamic adaptive threshold for intelligent fault diagnosis of rolling bearings, Journal of Manufacturing Systems 70 (2023) 579–592. [75] L. Jia, T. W. Chow, Y. Yuan, Gtfe-net: A gramian time frequency enhancement cnn for bearing fault diagnosis, Engineering Applications of Artificial Intelligence 119 (2023) 105794. [76] Q. Chen, X. Dong, G. Tu, D. Wang, C. Cheng, B. Zhao, Z. Peng, Tfn: An interpretable neural network with time-frequency transform embedded for intelligent fault diagnosis, Mechanical Systems and Signal Processing 207 (2024) 110952. [77] S. Ruder, An overview of gradient descent optimization algorithms, arXiv preprint arXiv:1609.04747 (2016). 
[78] I. Loshchilov, F. Hutter, Sgdr: Stochastic gradient descent with warm restarts, arXiv preprint arXiv:1608.03983 (2016). [79] C. Lessmeier, J. K. Kimotho, D. Zimmer, W. Sextro, Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: A benchmark data set for data-driven classification, in: PHM Society European Conference, volume 3, 2016. [80] K. Li, X. Ping, H. Wang, P. Chen, Y. Cao, Sequential fuzzy diagnosis method for motor roller bearing in variable operating conditions based on vibration analysis, Sensors 13 (2013) 8013–8041. [81] L. van der Maaten, G. Hinton, Visualizing data using t-sne, Journal of Machine Learning Research 9 (2008) 2579–2605. [82] Y. Miao, M. Zhao, J. Lin, X. Xu, Sparse maximum harmonics-to-noise-ratio deconvolution for weak fault signature detection in bearings, Measurement Science and Technology 27 (2016) 105004. [83] Y. Miao, M. Zhao, J. Lin, Y. Lei, Application of an improved maximum correlated kurtosis deconvolution method for fault diagnosis of rolling element bearings, Mechanical Systems and Signal Processing 92 (2017) 173–195. [84] J. Antoni, G. Xin, N. Hamzaoui, Fast computation of the spectral correlation, Mechanical Systems and Signal Processing 92 (2017) 248–277. [85] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778. [86] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, Q. V. Le, H. Adam, Searching for mobilenetv3, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324. [87] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017). [88] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (1995) 273–297.
[89] A. Mucherino, P. J. Papajorgji, P. M. Pardalos, K-nearest neighbor classification, Data Mining in Agriculture (2009) 83–106. [90] T. K. Ho, Random decision forests, in: Proceedings of 3rd International Conference on Document Analysis and Recognition, volume 1, IEEE, 1995, pp. 278–282. [91] D. R. Cox, The regression analysis of binary sequences, Journal of the Royal Statistical Society Series B: Statistical Methodology 20 (1958) 215–232. [92] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, Lightgbm: A highly efficient gradient boosting decision tree, Advances in Neural Information Processing Systems 30 (2017).

Authors: (1) Jing-Xiao Liao, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China and School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (2) Chao He, School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, China; (3) Jipu Li, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China; (4) Jinwei Sun, School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (5) Shiping Zhang (Corresponding author), School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, China; (6) Xiaoge Zhang (Corresponding author), Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong, Special Administrative Region of China.
This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.