
Neural Networks in Dynamics: Approximating Multi-Body ODEs with Deep Learning

by Hezek, March 18th, 2025

Too Long; Didn't Read

This article explores how neural networks approximate ordinary differential equations (ODEs) in multi-body dynamic systems. It covers network structure, activation functions, norms for measuring approximation error, and the key challenge of balancing complexity with accuracy. Classical numerical methods such as Taylor series are compared with deep learning approaches, highlighting the efficiency of the latter in high-dimensional problems.


The application of neural networks—particularly deep learning—is vast and spans numerous fields. One such field is mechanical engineering, where these techniques can be applied to dynamics.


In this article, we introduce the use of neural networks to solve ordinary differential equations (ODEs), a core aspect of dynamic and multi-body systems. We will explore the key concepts and nuances necessary for formulating these neural network models. In the next article, we will discuss a practical example.



A neural network begins with input preprocessing and weight initialization. It then performs feedforward propagation, applying activation functions layer by layer, computes a loss, back-propagates to obtain gradients, and updates the weights to refine the model. Repeating this loop over many iterations ultimately yields the approximated solution.
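To make these stages concrete, here is a minimal training-loop sketch in PyTorch. The architecture, target data, and hyperparameters are illustrative placeholders, not taken from the article:

```python
import torch
import torch.nn as nn

# Placeholder training data: input points x and reference solution values y
# (in an ODE setting these might be time points and reference states).
x = torch.linspace(0.0, 1.0, 100).unsqueeze(1)
y = torch.sin(2 * torch.pi * x)  # stand-in target function

# Small feedforward network: affine maps interleaved with activations.
model = nn.Sequential(
    nn.Linear(1, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(2000):
    optimizer.zero_grad()        # reset gradients from the previous step
    y_hat = model(x)             # feedforward propagation
    loss = loss_fn(y_hat, y)     # loss calculation
    loss.backward()              # back-propagation: compute gradients
    optimizer.step()             # weight update
```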


Before we proceed, let's state the formula describing how the network transforms its input, as shown in the image below:



(Formula from a slide by Tim de Ryck.)

Here, W represents the weights and b the biases; through training, we determine optimal values for both. In this expression, l denotes the layer level, p^{(l)} represents the activation function applied at that level, and A represents the affine transformation, that is, the pre-activation output, at layer l of the neural network.
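For reference, the layer-wise structure behind this formula can be sketched as follows (a standard formulation; the exact notation on the original slide may differ slightly):

A^{(l)}(z) = W^{(l)} z + b^{(l)},    z^{(l)} = p^{(l)}(A^{(l)}(z^{(l-1)})),

with z^{(0)} = x as the network input and the final affine map typically left without an activation.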


It is important to note that the depth of the network corresponds to the number of affine maps, while the maximum dimension, n, is the largest number of neurons in any single layer (i.e., the network width).



When approximating the solution to an ODE in a multi-body system, several norms can be used to measure the approximation error, namely the C⁰-norm, the Cᵏ-norm, and the L²-norm. The C⁰-norm is the maximum absolute difference between the functions, |f(x) – g(x)|, across the domain. The Cᵏ-norm also measures the discrepancies in their derivatives up to the kth order (for each derivative order 0 ≤ l ≤ k). Finally, the L²-norm is the square root of the integral of the squared difference between f and g, giving an overall measure of the approximation error over the entire domain.
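Written out explicitly (standard definitions; conventions such as taking a sum versus a maximum over the derivative orders vary between texts), these norms read:

‖f − g‖_{C⁰} = max_x |f(x) − g(x)|

‖f − g‖_{Cᵏ} = max_{0 ≤ l ≤ k} max_x |f^{(l)}(x) − g^{(l)}(x)|

‖f − g‖_{L²} = ( ∫ |f(x) − g(x)|² dx )^{1/2}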


A key question arises regarding the appropriate size of a neural network: specifically, how many neurons are necessary to achieve a good approximation such that the error ‖F − F̂‖ remains below a given threshold ε. Yarotsky (2017) notes that meeting this condition is not straightforward, and that balancing network complexity with approximation accuracy is a significant challenge.


For a rectified linear unit (ReLU), positive values of x pass through unchanged, essentially acting as an identity mapping, while negative values are set to zero. This behavior is commonly expressed as:

ReLU(x) = max(x, 0) = (x)₊,

where (x)₊ denotes the positive part of x.


Additionally, this concept extends to the formulation of the maximum of two values, which can be written in terms of the positive part as max(x, y) = (x − y)₊ + y.



Both expressions are fundamental to understanding how neural networks built on the ReLU activation function represent and approximate piecewise-linear functions.
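As a quick illustration (a NumPy sketch, not from the article), both identities are easy to verify numerically:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(x, 0) = (x)_+, the positive part of x
    return np.maximum(x, 0.0)

def max_via_relu(x, y):
    # max(x, y) = (x - y)_+ + y, i.e. the maximum expressed through ReLU
    return relu(x - y) + y

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
y = np.array([ 1.0, -1.0, 0.5, 1.5, 2.0])

print(relu(x))             # [0, 0, 0, 1.5, 3]
print(max_via_relu(x, y))  # [1, -0.5, 0.5, 1.5, 3]
print(np.maximum(x, y))    # matches the line above
```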



For smooth approximation functions—such as those used in solving partial differential equations—a Taylor series approximation can serve as an effective classical numerical method for obtaining solutions to dynamic equations. However, these classical methods can become computationally expensive in high-dimensional problems or when rigorous uncertainty quantification is required, often leading to longer solution times. In contrast, the primary challenge with neural networks is selecting the optimal architecture and parameters. Once the right neural network is established, it can perform these estimations in a matter of seconds.
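For reference, the truncated Taylor series of a smooth function f around a point a, which underlies this classical approach, reads:

f(x) ≈ Σ_{l=0}^{k} f^{(l)}(a) / l! · (x − a)^l.

For multivariate inputs, the number of terms up to order k grows rapidly with the input dimension, which is one reason such expansions become expensive in high-dimensional problems.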


In the next article, we will delve deeper into solving neural ordinary differential equations (ODEs) and present a practical example.