This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Amir Noorizadegan, Department of Civil Engineering, National Taiwan University;
(2) D.L. Young, Core Tech System Co. Ltd, Moldex3D, Department of Civil Engineering, National Taiwan University & [email protected];
(3) Y.C. Hon, Department of Mathematics, City University of Hong Kong;
(4) C.S. Chen, Department of Civil Engineering, National Taiwan University & [email protected].
This paper introduces a novel neural network structure, the Power-Enhancing residual network, designed to improve interpolation capabilities for both smooth and non-smooth functions in 2D and 3D settings. By adding power terms to the residual elements, the architecture boosts the network's expressive power. The study examines the effects of network depth, width, and optimization methods, demonstrating the architecture's adaptability and performance advantages. The results consistently highlight the exceptional accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions. Real-world examples further confirm its superiority over plain neural networks in terms of accuracy, convergence, and efficiency, and the study also examines the impact of deeper networks. Moreover, the proposed architecture is applied to solving the inverse Burgers' equation, where it likewise demonstrates superior performance. In conclusion, the Power-Enhancing residual network offers a versatile solution that significantly enhances neural network capabilities. The implemented codes are available at: https://github.com/CMMAi/ResNet_for_PINN.
Deep neural networks have revolutionized the field of machine learning and artificial intelligence, achieving remarkable success in applications such as image recognition, natural language processing, and reinforcement learning. Their adaptability extends beyond these domains, as evidenced by their effective integration with physics-informed approaches [1]. One significant development in this area was the introduction of residual networks, commonly known as ResNets [2,3], which demonstrated unprecedented performance in constructing deep architectures and mitigating the vanishing-gradient problem. ResNets leverage skip connections to create shortcut paths between layers, resulting in a smoother loss landscape. This permits efficient gradient flow and thus enhances training performance across neural networks of various sizes [4]. Our research aligns closely with that work, particularly in its exploration of how skip connections affect the loss function. In 2016, Veit et al. [5] offered a new perspective on ResNets, showing that residual networks can be viewed as an ensemble of paths of varying lengths. Such networks rely primarily on the shorter paths during training, which resolves the vanishing-gradient problem and facilitates the training of exceptionally deep models. Jastrzębski et al. [6] numerically demonstrated the iterative feature-refinement process of residual networks. Their findings emphasized how residual connections guide features along negative gradients between blocks, and showed that effective sharing of residual layers mitigates overfitting.
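For reference, the sketch below illustrates the skip-connection mechanism at the heart of residual networks. It is a minimal, hypothetical PyTorch example (the `ResidualBlock` name, the `width` parameter, and the tanh activation are illustrative assumptions), not the implementation of any of the cited works.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A single fully-connected block with a skip connection (illustrative only)."""
    def __init__(self, width: int):
        super().__init__()
        self.linear = nn.Linear(width, width)
        self.activation = nn.Tanh()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The skip connection adds the input back onto the transformed output,
        # giving gradients a shortcut path during backpropagation.
        return x + self.activation(self.linear(x))
```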
In related engineering work, Lu et al. [7] leveraged recent neural network progress through a multifidelity neural network (MFNN) strategy, an architecture that combines outputs from multiple models of varying fidelity or accuracy, to extract material properties from instrumented indentation (see [7], Fig. 1(D)). The MFNN proposed in that study incorporates a residual link that connects the low-fidelity output to the high-fidelity output at each iteration, rather than between layers. Wang et al. [8] proposed an improved fully-connected neural architecture whose key innovation is the integration of two transformer networks that project the input variables into a high-dimensional feature space. This architecture combines multiplicative interactions and residuals, resulting in improved predictive accuracy, but at the cost of increased CPU time.
In this paper, we propose a novel architecture called the Power-Enhancing SkipResNet, aimed at advancing the interpolation capabilities of deep neural networks for smooth and non-smooth functions in 2D and 3D domains. The key objectives of this research are as follows:
• Introduces the “Power-Enhancing SkipResNet” architecture.
• Enhances the network’s expressive power for improved accuracy and convergence.
• Outperforms conventional plain neural networks.
• Conducts extensive experiments on diverse interpolation scenarios and the inverse Burgers’ equation.
• Demonstrates the benefits of deeper architectures.
Through rigorous analysis and comparisons, we demonstrate the advantages of the proposed architecture in terms of accuracy and convergence speed.
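To fix ideas before the formal definition in Section 4, the sketch below gives one plausible reading of a power-enhancing residual block, assuming the power terms enter as element-wise powers of the block input added alongside the usual skip connection. This is a hedged illustration only: the class name `PowerEnhancedBlock` and the `power` hyperparameter are assumptions, and the authoritative formulation is given in Section 4 and in the repository https://github.com/CMMAi/ResNet_for_PINN.

```python
import torch
import torch.nn as nn

class PowerEnhancedBlock(nn.Module):
    """Hypothetical sketch: ordinary residual path plus an assumed power term."""
    def __init__(self, width: int, power: int = 2):
        super().__init__()
        self.linear = nn.Linear(width, width)
        self.activation = nn.Tanh()
        self.power = power  # assumed exponent of the power term

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard skip connection (x) plus an assumed element-wise power of the
        # input; the actual placement of the power terms is defined in Section 4.
        return self.activation(self.linear(x)) + x + x ** self.power
```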
The remainder of this paper is organized as follows: Section 2 reviews neural networks and their application to interpolation problems. Section 3 briefly presents the physics-informed neural network for solving the inverse Burgers’ equation. Section 4 discusses the residual network and the proposed Power-Enhancing SkipResNet, explaining the incorporation of power terms and their potential benefits. Section 5 presents the experimental setup, evaluates the results, and discusses the findings. Finally, Section 6 concludes the paper with a summary of contributions and potential future research directions.