Abstract
Using traditional computational fluid dynamics and aeroacoustics methods, the accurate simulation of aeroacoustic sources requires substantial compute resources to resolve all necessary physical phenomena. In contrast, once trained, artificial neural networks such as deep encoder-decoder convolutional networks can predict aeroacoustics at lower cost and, depending on the quality of the employed network, also at high accuracy. The architecture for such a neural network is developed to predict the sound pressure level in a 2D square domain. It is trained on numerical results from up to 20,000 GPU-based lattice-Boltzmann simulations that include randomly distributed rectangular and circular objects, and monopole sources. Types of boundary conditions, the monopole locations, and cell distances for objects and monopoles serve as input to the network. Parameters are studied to tune the predictions and to increase their accuracy. The complexity of the setup is successively increased along three cases and the impact of the number of feature maps, the type of loss function, and the number of training data on the prediction accuracy is investigated. An optimal choice of the parameters leads to network-predicted results that are in good agreement with the simulated findings. This is corroborated by negligible differences of the sound pressure level between the simulated and the network-predicted results along characteristic lines and by small mean errors.
Keywords: Deep convolutional neural networks, Aeroacoustic predictions, Lattice-Boltzmann method
Introduction
State-of-the-art machine learning (ML), e.g., deep learning (DL) techniques that require very large datasets for successful training, can greatly benefit from high-performance computing (HPC) simulations. Such simulations can be used to generate large amounts of training data. They offer the flexibility to obtain datasets for various parameterizations of a task setting, which can then be used to train ML models. In contrast, obtaining data from experiments can be costly, less flexible, and sometimes even impossible. Trained ML models are capable of performing different forms of predictions on variables of interest if novel input is provided. Their knowledge is based on observations of phenomena acquired from the training on simulated data. Such data-driven models are often used as surrogate models to accelerate predictions compared to classical computationally demanding simulators, provided that their accuracy is sufficient.
Especially in the field of computational fluid dynamics (CFD), DL models trained on simulated data are capable of accelerating the prediction of flow fields. Conventional flow solvers need time to reach solutions at which the impact of initial conditions vanishes. Only then can they be used to compute, e.g., averaged results of the flow. In this case, the averaging period also needs to be bridged before the results can be analyzed. To overcome this issue, methods to accelerate the prediction of steady flow fields using convolutional neural networks (CNNs) are studied [3, 7]. In [7], the flow over simplified vehicle bodies is predicted with CNNs. The corresponding surrogate model is considerably faster than traditional flow solvers. In [3], CNNs are successfully applied to predict flow fields around airfoils with varying angles of attack and Reynolds numbers. Lee and You [16] predict the unsteady flow over a circular cylinder using DL methods. They show that large-scale vortex dynamics are well predictable by their models. In [17], CNNs to predict unsteady three-dimensional turbulent flows are investigated. The CNNs correctly learn to transport and integrate wave number information contained in feature maps. Additionally, a method that can optimize the number of feature maps is proposed. Unsteady flow and force coefficients are the main focus of the investigations in [22], in which a data-driven method using a CNN for model reduction of the Navier-Stokes equations is presented. In
[27], a generative adversarial network (GAN) to forecast movements of typhoons is used and satellite images along with velocity information from numerical simulations are incorporated. This allows for 6-hour predictions of typhoons with an averaged error
km. Unlike numerical predictions on HPC systems, the GAN-based method takes only seconds. Bode et al. [4] propose a physics-informed GAN and successfully model flow on subgrid scales in turbulent reactive flows.
To improve quality and robustness of DL models, training is frequently performed on very large data sets obtained from simulations run on HPC systems. In aerodynamic problems, small-scale structures and/or fluid mechanics based perturbations can strongly influence the acoustic field although they might contain only a small amount of total energy. In many engineering applications, modeling flow-induced sound requires interdisciplinary knowledge about fluid mechanics, acoustics, and applied mathematics. Furthermore, the numerical analysis demands high-resolution numerical simulations to accurately determine the various flow phenomena, e.g., turbulent shear layers [24], fluid-structure interactions [6], and combustion processes [29], that determine the acoustic field. The sheer quantity and often high dimensionality of the parameters describing such flow fields complicate post-processing of the simulated data. This poses a challenge to derive new control models and to make progress in design optimizations [13, 33]. The turn-around time between prototyping and manufacturing depends on the complexity of fundamental physical mechanisms. A recent effort to enhance the efficiency of design development employs an ML framework to predict acoustic fields of a variety of fan nozzle and jet configurations [21]. Although the concept has not yet been realized, this ML-based approach illustrates a prospective possibility to reduce design cycle times of new engine configurations.
The main objective of the present study is the prediction of acoustic fields via a robust ML model based on a deep encoder-decoder CNN. The CNN is trained on acoustic fields containing noise sources surrounded by multiple objects. The numerical results are obtained from simulations using a lattice-Boltzmann (LB) method. They include the simulation of wave propagation, reflection, and scattering due to the interaction with sound-hard surfaces.
In the following, the numerical methods to predict room aeroacoustics with CNNs are described in Sect. 2. Subsequently, the sound fields predicted by CNNs are presented and compared with results of LB simulations in Sect. 3. Finally, a summary is given, conclusions are drawn, and an outlook is presented in Sect. 4.
Numerical Methods
To generate training data for the CNN, aeroacoustic simulations are run with an LB method on two-dimensional rectangular meshes. The LB method is described in Sect. 2.1, followed by a presentation of the geometrical setup, and the computational meshes in Sect. 2.2. Section 2.3 explains the imposed boundary and initial conditions. Section 2.4 describes how the acoustic fields are analyzed. Finally, the network architecture for the prediction of aeroacoustic fields is presented in Sect. 2.5.
Lattice-Boltzmann Method
To compute the aeroacoustic pressure field, an LB method is employed. The governing equation is the Boltzmann equation with the simplified right-hand side (RHS) Bhatnagar-Gross-Krook (BGK) collision term [2]
![]() |
1 |
The particle probability density functions (PPDFs)
describe the probability to find a particle of a fluid around a location
with a particle velocity
at time t
[1, 8]. The left-hand side (LHS) of Eq. (1) describes the evolution of fluid particles in space and time, while the RHS describes the collision of particles. The collision process is governed by the relaxation parameter
with relaxation time
to reach the Maxwellian equilibrium state
. The discretized form of Eq. (1) yields the lattice-BGK equation
![]() |
2 |
The quantity
is the time increment and
is a function of the kinematic viscosity
and the speed of sound
, i.e.,
![]() |
3 |
In the LB context, the spatial and temporal spacing are set to
such that
. Table 1 exemplarily lists the LB viscosity for two meshes
and
with different resolutions. Note that these values are derived in Sect. 2.3. The LB viscosity is an artificial parameter simply influencing the time step, i.e., how much physical time
is covered by a single
in the simulation. Using the viscosities listed in Table 1 would lead to extremely small time steps. For this reason and in order to conduct numerically stable simulations,
is set to a feasible value according to
[28]. The indices k in Eq. (2) depend on the discretization scheme and represent the different directions of the PPDFs. In this work, the two-dimensional discretization scheme with 9 PPDFs, i.e., the D2Q9 model
[25] is used. The discretized equilibrium PPDF is given by
![]() |
4 |
where the quantities
are weighting factors for the D2Q9 scheme given by 4/9 for
, 1/9 for
, and 1/36 for
, and
is the fluid velocity. The macroscopic variables can be obtained from the moments of the PPDFs, i.e., the density
. The pressure can be computed using the ideal gas law by
.
Table 1.
Physical quantities of the setup and the non-dimensional viscosity
.
| Mesh | ![]() | ![]() | ![]() | ![]() | ![]() |
|---|---|---|---|---|---|
| ![]() | 0.2 | ![]() | 58.8 | ![]() | ![]() |
| ![]() | 0.1 | ![]() | 117.6 | ![]() | ![]() |
The LB method has been chosen for several reasons [18]: (i) the computations can be performed efficiently in parallel, (ii) it is straightforward to parallelize the code, (iii) boundary conditions can easily be applied in contrast to, e.g., cut-cell methods, and (iv) there is no need to solve a pressure Poisson equation for quasi-incompressible flow as the pressure and hence the acoustic field is an explicit result of the lattice-BGK algorithm. Furthermore, the LB method can be applied for low to high Knudsen numbers Kn. In the continuum limit, i.e., for small Kn, the Navier-Stokes and Euler equations can directly be derived from the Boltzmann equation and the BGK model [8].
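To make the collide-and-stream structure of the lattice-BGK algorithm concrete, the following is a minimal sketch of one D2Q9 update in NumPy. It is not the solver used in this work: it assumes the standard second-order expansion of the Maxwellian for the discrete equilibrium, lattice units with unit spatial and temporal spacing, and periodic boundaries via `np.roll` instead of the NRBCs and WBCs of Sect. 2.3. The names `equilibrium`, `macroscopic`, and `bgk_step` are hypothetical, and `omega` denotes the dimensionless relaxation factor per time step.

```python
import numpy as np

# D2Q9 lattice: rest direction, 4 axis-aligned and 4 diagonal directions.
XI = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
               [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)  # D2Q9 weighting factors
CS2 = 1.0 / 3.0  # squared lattice speed of sound (assumes dx = dt = 1)


def equilibrium(rho, u):
    """Discrete equilibrium PPDFs for density rho[x, y] and velocity u[x, y, 2]."""
    xu = np.einsum("kd,xyd->kxy", XI, u)   # xi_k . u for every direction k
    uu = np.einsum("xyd,xyd->xy", u, u)    # u . u
    return W[:, None, None] * rho * (
        1.0 + xu / CS2 + xu**2 / (2.0 * CS2**2) - uu / (2.0 * CS2)
    )


def macroscopic(f):
    """Density and velocity as zeroth and first moments of the PPDFs f[k, x, y]."""
    rho = f.sum(axis=0)
    u = np.einsum("kd,kxy->xyd", XI, f) / rho[..., None]
    return rho, u


def bgk_step(f, omega):
    """One BGK collision followed by streaming (periodic boundaries, assumed)."""
    rho, u = macroscopic(f)
    f_post = f + omega * (equilibrium(rho, u) - f)
    for k, (cx, cy) in enumerate(XI):
        f_post[k] = np.roll(f_post[k], shift=(cx, cy), axis=(0, 1))
    return f_post


# Usage: a small density pulse in a fluid at rest; pressure follows from p = rho * cs^2.
rho0 = np.ones((32, 32))
rho0[16, 16] += 1e-3
f = equilibrium(rho0, np.zeros((32, 32, 2)))
for _ in range(20):
    f = bgk_step(f, omega=1.0)
pressure = macroscopic(f)[0] * CS2
```

Because collision conserves mass and momentum locally and streaming only moves PPDFs between cells, the total mass `f.sum()` stays constant in this periodic setting, which is a convenient sanity check for such a sketch.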
Geometrical Setup and Computational Meshes
The computational domain has a square shape containing randomly distributed objects. In physical space, denoted in the following by
, the domain has an edge length of
m. Throughout this study, the number of objects varies depending on the complexity of a computation. The domain of the most complex case is shown in Fig. 1. It has two rectangular objects
and
and two circular objects
and
. Their size is a function of the characteristic length
, i.e.,
and
have edge lengths
, and
and
have radii
. All objects have a minimum distance of
from the domain boundaries and may overlap.
Fig. 1.

Computational domain.
Two-dimensional uniformly refined meshes
and
with two distinct resolutions are generated in Cartesian coordinates. In the fine mesh
each cell has an edge length of
m resulting in
cells. The coarse mesh
has a cell length of
m and a total of
cells.
Boundary and Initial Conditions
Two types of boundary conditions are imposed at the four domain boundaries according to
[11], i.e., non-reflecting (NRBCs) and wall boundary conditions (WBCs) are prescribed. As shown for boundaries III and IV in Fig. 1, the NRBCs have a buffer layer thickness of
to ensure a complete dissipation of acoustic waves and to avoid reflective phenomena at the domain boundaries. In the buffer layer, an absorption term
[11]
![]() |
5 |
with weighting factor
and
is added to Eq. (2). The quantity
is the distance to the buffer layer and
is a constant specified as 0.1.
The WBCs are characterized by a no-slip behavior, where the PPDFs are reflectively bounced back. They are imposed as a layer with thickness
as shown for boundaries I and II in Fig. 1, i.e., the computational domain is reduced by this thickness. In computations with WBC, a maximum number of three domain boundaries is specified as WBC in a random process. To prevent strong overlaps of acoustic waves, which may cause numerical instabilities, at least at one domain boundary an NRBC is imposed.
The acoustic fields, which are used to train the CNN model, are generated by a simple source S defined by a sinusoidal function given by
![]() |
6 |
with a frequency
and the amplitude
and
in the LB context. A set of the training data is generated by the computational domains with a noise source restricted by a geometry, i.e., the minimum distance
between the noise source and the sound-hard objects satisfies the condition
where L is a distance between monopoles and domain boundaries. With
, this yields a non-dimensionalized harmonic period of
. One wavelength
is computed from
, with
being the velocity with which information is transported in the LB context. This results in
for computations in this study, if not stated otherwise.
The relationship between
in LB context and the frequency
in physical space is obtained by inserting
![]() |
7 |
with the physical speed of sound
m/s at reference temperature
into the equation for the frequency
. The relationship between
in the LB context and the kinematic viscosity
in physical space is given by
![]() |
8 |
The latter equation is Sutherland’s law
[32] with
,
, and
. Table 1 lists all necessary variables in their dimensional and non-dimensional form for
and
.
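The conversion between LB and physical units can be illustrated with a short sketch. It assumes the usual acoustic scaling, in which one LB time step covers the physical time that sound needs to travel the lattice speed of sound times one cell length. The function names, the source period of 50 time steps, and the speed of sound of 340 m/s are illustrative assumptions and are not taken from Table 1.

```python
import math

CS_LB = 1.0 / math.sqrt(3.0)  # lattice speed of sound for unit spacing (assumed)


def physical_time_step(dx_m, c_phys):
    """Physical time in seconds covered by one LB time step (acoustic scaling)."""
    return CS_LB * dx_m / c_phys


def physical_frequency(period_steps, dx_m, c_phys):
    """Frequency in Hz of a source whose period spans `period_steps` LB steps."""
    return 1.0 / (period_steps * physical_time_step(dx_m, c_phys))


def wavelength_cells(period_steps):
    """Number of cells resolving one wavelength of that source."""
    return CS_LB * period_steps


# Usage with illustrative values (period and speed of sound are assumptions).
for dx in (0.2, 0.1):
    print(dx, physical_frequency(period_steps=50, dx_m=dx, c_phys=340.0))
```

Under this scaling, halving the cell length at a fixed LB period doubles the physical frequency, which is the trend visible between the two meshes in Table 1 (58.8 versus 117.6).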
Evaluation of Acoustic Fields
The acoustic fields are evaluated for a set of computational domains, each of which includes at least one noise source and randomized solid surfaces. For fluid cells at location
, the sound pressure level SPL is defined by
![]() |
9 |
where the maximum number of mesh points m is
for the coarse grid and
for the fine grid configurations. The root-mean-square (rms) values of pressure fluctuations
are calculated by
![]() |
10 |
where
is the mean pressure averaged over the time period N, and
is the instantaneous pressure resulting from the simulation at a time step n within that period. Simulations are carried out for 3,000 time steps. The averaging period
starts after 1,000 time steps when the acoustic field is fully developed.
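The evaluation described above can be sketched as follows: the rms of the pressure fluctuations is computed per cell over the averaging window and then converted to decibels. This is a minimal sketch, assuming the common definition of the SPL as 20 times the base-10 logarithm of the rms pressure over a reference pressure. The function name and the reference pressure `p_ref` are assumptions, since the reference value used in the LB context is not stated here.

```python
import numpy as np


def sound_pressure_level(p, start, p_ref=1.0):
    """SPL in dB per cell from a pressure history p[n, i, j].

    Only time steps n >= start enter the average, mirroring an averaging
    period that begins once the acoustic field is fully developed.
    Cells without any fluctuation yield -inf.
    """
    window = p[start:]
    p_mean = window.mean(axis=0)                                # mean pressure
    p_rms = np.sqrt(np.mean((window - p_mean) ** 2, axis=0))    # rms fluctuation
    return 20.0 * np.log10(p_rms / p_ref)


# Usage: 3,000 steps of a synthetic sinusoidal signal, averaged after step 1,000.
n = np.arange(3000)
amplitude = 1e-3
signal = 1.0 / 3.0 + amplitude * np.sin(2.0 * np.pi * n / 50.0)
p = signal[:, None, None] * np.ones((1, 4, 4))
spl = sound_pressure_level(p, start=1000, p_ref=amplitude / np.sqrt(2.0))
```

For a pure sinusoid averaged over whole periods, the rms equals the amplitude divided by the square root of two, so the usage example returns an SPL of approximately 0 dB in every cell for the chosen `p_ref`.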
Machine Learning Techniques
An encoder-decoder CNN is trained to predict the SPL in a supervised manner using results of the aforementioned LB simulations. The CNN is fed with four types of input data:
- (i) types of boundary condition;
- (ii) location of monopoles;
- (iii) cell distances for objects;
- (iv) cell distances for monopoles.
To correctly predict aeroacoustic fields, the CNN needs to learn the impact of the various boundary conditions and the location of monopoles on the acoustic field. Therefore, considering inputs (i) and (ii), cells at location (i, j) are assigned segmentation values
![]() |
11 |
A sensitivity analysis of the input data has been performed before the training. This analysis revealed that solely using boundary parameters leads to poor predictions of the network, i.e., it is not effective for CNNs learning from flow simulations. This is in line with findings in
[7]. Since acoustic signals propagate with a certain wavelength and amplitude at a certain sound speed, distances are also important parameters for learning. For this purpose, inputs (iii) and (iv) are provided to the CNN in the form of distance functions
for objects and
for monopoles. Such an approach has previously been used for CNNs to predict steady-state flow fields
[3, 7]. The distance functions are defined by
![]() |
12 |
i.e., for each cell
with location (i, j) in a domain the minimal distances
and
to the boundary
of an object
and to a monopole M are determined. Obviously, it is
on the boundary and exactly at the monopole source. For
, an assignment of negative distances for cells inside of an object, as it is usually used by signed-distance functions, turned out to have a negative impact on predictions, which is why
. The distances are computed by the fast marching method
[30] and are normalized by
. Learning from distances like inputs (iii) and (iv) alone results in mispredictions near domain boundaries. A combination of all presented types of inputs has been found to favorably affect predictions.
In the following, the CNN used for predicting the SPL fields is referred to as acoustics field predictor (AFP). The corresponding network architecture is shown in Fig. 2 for a case that uses arrays with the size of
as inputs. Inputs (i) and (ii) are combined to one array. Together with fields (iii) and (iv) they are stacked to form channels of the input. It should be noted that physical quantities such as the pressure distribution are a solution of the acoustic fields computation and constitute the ground truth. They are not known a priori and hence cannot be used for training.
Fig. 2.
Network architecture of the AFP including size and number of feature maps (FMs) as a multiple of Y, kernel size (KS),
maximum pooling layers (MP), dropout layers (DO), convolutional blocks, and deconvolutional layers.
The architecture is inspired by distinct architectures that employ long skip connections between encoder and decoder layers
[19, 26, 34], like for instance U-net architectures, which have been successfully used for medical image segmentation
[26]. Skip connections between encoding and decoding paths allow the re-use and fusion of features on different scales. To preserve information from features on all scales, the activity of each encoder layer is directly fed to the corresponding decoder layer via long skip connections. These connections are chosen to have residual form, adding the activity of encoder layers to the output of decoder layers. This setup is similar to
[19] but differs from the original U-net architecture, where long skip connections have dense form and concatenate layers on the same scale. As depicted in Fig. 2, the residual long skip connections perform identity mapping by adding source encoder layer outputs to target decoder layer outputs
[9, 19]. This kind of connectivity allows for direct gradient flow from higher to lower layers across all hierarchy stages during the backward pass, which prevents common issues with vanishing gradients in deep architectures. In contrast to dense long skip connections, residual skip connections lead to smaller numbers of activations to be handled in the decoding path during the forward and backward passes. As a consequence, they decrease memory consumption and are more efficient and faster in training without sacrificing prediction accuracy. Short residual skip connections are also used in so-called convolutional residual blocks (Conv-Blocks). Here, convolutional layers, batch normalization (BN), and rectified linear unit (ReLU) activation functions are employed. BN acts as a regularizer, shifting activity of the layers to zero mean, unit variance. This leads to faster and more reliable network convergence
[10]. The number of feature maps (FMs) is a multiple of a given factor Y. The output of the first convolutional layer is added to the input of the last ReLU activation, see Fig. 2, which defines residual short skip connections in Conv-Blocks. A combination of long and short skip connections leads to faster convergence and stronger loss reduction
[5]. In the encoder path, downscaling is performed by
maximum pooling layers (MP). To further avoid overfitting, yet another regularization method, dropout (DO)
[31] is used during training, with a DO probability of
. The final layer is fully connected with a linear activation function, which is frequently used for regression outputs
[15]. Weights and biases are initialized from a truncated normal distribution centered around the origin with a standard deviation of
, where f is the number of connections at a layer
[9]. They are updated by an adaptive moments (ADAM) optimizer
[12]. The ADAM optimizer adjusts the learning rate (LR) by considering an exponentially decaying average of gradients computed in previous update steps. The initial learning rate is set to
. The batch size BS represents the number of training data passed to the network in a single training iteration. In Sect. 3 it will be shown that in this context a batch size of
achieves the best results. Therefore, it is used throughout this study, if not stated otherwise. The ground truth GT distribution
is obtained from
![]() |
13 |
where
and
are the mean and the standard deviation of the complete training dataset of the a priori simulations. The predictions need to be denormalized before the SPL can be analyzed.
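The normalization and the subsequent denormalization of predictions can be sketched as below. The sketch assumes a standard z-score transform with a single mean and standard deviation computed over the complete training dataset; the function names are hypothetical.

```python
import numpy as np


def fit_standardizer(spl_train):
    """Mean and standard deviation over all cells of all training samples."""
    return float(spl_train.mean()), float(spl_train.std())


def normalize(spl, mu, sigma):
    """Map SPL fields to the normalized ground truth used for training."""
    return (spl - mu) / sigma


def denormalize(spl_norm, mu, sigma):
    """Map network outputs back to SPL values before analysis."""
    return spl_norm * sigma + mu


# Usage with synthetic SPL fields of shape (samples, i, j).
rng = np.random.default_rng(0)
spl_train = 60.0 + 5.0 * rng.standard_normal((10, 16, 16))
mu, sigma = fit_standardizer(spl_train)
ground_truth = normalize(spl_train, mu, sigma)
restored = denormalize(ground_truth, mu, sigma)
```

The statistics are fitted on the training data only and reused unchanged for validation and test data, so that predictions on unseen samples are denormalized with the same transform.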
Data augmentation is used to increase training data diversity and to encourage learning of useful invariances. Therefore, the coordinate axes i and j are transposed randomly. Furthermore, for inputs (i) and (ii), the segmentation values
are changed to augmented inputs
according to
![]() |
14 |
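The random transposition of the coordinate axes can be sketched as follows. The key point is that the input channels and the target SPL field must be transposed together so that they stay aligned. The transposition probability `p` and the function name are assumptions, and the change of the segmentation values according to Eq. (14) is not reproduced here.

```python
import numpy as np


def random_transpose(inputs, target, rng, p=0.5):
    """Swap the i and j axes of inputs[c, i, j] and target[i, j] with probability p."""
    if rng.random() < p:
        return inputs.transpose(0, 2, 1), target.T
    return inputs, target


# Usage: three input channels and one target field on a non-square test grid.
rng = np.random.default_rng(0)
inputs = np.arange(36.0).reshape(3, 3, 4)
target = np.arange(12.0).reshape(3, 4)
aug_inputs, aug_target = random_transpose(inputs, target, rng, p=1.0)
```

Both returned arrays are views, so the augmentation itself does not copy data.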
The total loss
between simulated (superscript “sim”) and predicted (superscript “pred”) SPL values is defined by
![]() |
15 |
which is a combination of the mean squared error MSE
![]() |
16 |
with
and a gradient difference loss
. Gradient losses GDL in i- and j-directions are considered by
and
, and diagonal gradients by
and
.
Three types of gradient losses are addressed in this work. The four directions indicated by Roman numerals I–IV in Eq. (15) are defined by introducing integer variables k and l, i.e., the four directions are denoted by
,
,
, and
. In the first type,
, the difference between two neighboring cells is considered, inspired by the gradient loss in the work of Mathieu et al.
[20]
![]() |
17 |
In Eq. (17) the gradient losses of four neighboring points are defined by the notations
,
, and
. The gradient loss terms of the first type have a 1st-order accuracy in terms of a forward difference (FD) formulation
[23]. To integrate radial propagation of a point source into the loss function, central difference (CD) schemes are added. The gradient loss
uses a 2nd-order accurate CD formulation that incorporates two neighboring cells. The 2nd-order accurate gradient loss terms in a two-dimensional domain read
![]() |
18 |
The third type of gradient loss,
, is formulated with a 4th-order accurate CD scheme and includes four neighboring cells, i.e., two cells in each direction
![]() |
19 |
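The three gradient loss variants differ only in the difference stencil applied along each of the four directions. The following sketch illustrates this with NumPy. It assumes that the gradient loss is the mean squared difference between simulated and predicted directional differences and that it is added to the MSE with a single weight `lam`; the exact functional form and weighting of Eqs. (15) to (19) may differ. Differences are taken per cell step, also along the diagonals, and all names are hypothetical.

```python
import numpy as np

# Directions (k, l): i-direction, j-direction, and the two diagonals.
DIRECTIONS = [(1, 0), (0, 1), (1, 1), (1, -1)]

# Stencil coefficients {offset along the direction: weight}.
STENCILS = {
    "fd": {1: 1.0, 0: -1.0},                                   # 1st-order forward
    "cd2": {1: 0.5, -1: -0.5},                                 # 2nd-order central
    "cd4": {2: -1 / 12, 1: 8 / 12, -1: -8 / 12, -2: 1 / 12},   # 4th-order central
}


def directional_gradient(a, direction, scheme):
    """Finite difference of a[i, j] along (k, l), evaluated on interior cells."""
    k, l = direction
    g = np.zeros_like(a, dtype=float)
    for offset, weight in STENCILS[scheme].items():
        # np.roll with a negative shift s places a[i + s*k, j + s*l] at (i, j).
        g += weight * np.roll(a, shift=(-offset * k, -offset * l), axis=(0, 1))
    return g[2:-2, 2:-2]  # drop cells affected by the periodic wrap of np.roll


def total_loss(sim, pred, scheme="cd2", lam=1.0):
    """MSE plus weighted gradient difference loss over the four directions."""
    mse = np.mean((sim - pred) ** 2)
    gdl = sum(
        np.mean((directional_gradient(sim, d, scheme)
                 - directional_gradient(pred, d, scheme)) ** 2)
        for d in DIRECTIONS
    )
    return mse + lam * gdl


# Usage: a constant offset is penalized by the MSE only, not by the gradient terms.
i, j = np.indices((12, 12))
sim = 3.0 * i + 2.0 * j
loss = total_loss(sim, sim + 0.5, scheme="cd2")
```

In a training framework, the same stencils would be expressed with differentiable tensor operations; NumPy is used here only to show the arithmetic.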
The cell-wise prediction accuracy is evaluated by the absolute error
![]() |
20 |
between
and
with
and
. From the
distribution of each simulation a mean absolute error
![]() |
21 |
is calculated to evaluate the prediction quality.
Results
In the following, findings of a grid convergence study are discussed in Sect. 3.1. Results of network-predicted acoustic fields are presented for three cases 1–3 in Sects. 3.2, 3.3, and 3.4. The complexity of the cases is continuously increased.
The acoustic simulations are conducted on multiple graphics processing units (GPUs). On average, a solution on
is obtained in
s on a single GPU. Up to ten GPUs are employed to accelerate the process. Once trained, the network predictions take only a fraction of a second on a single modern GPU and only a few seconds on any low-end computer such as a laptop. For all computations the GPU partition of the JURECA system
[14], Forschungszentrum Jülich, is employed. Each GPU node is equipped with two NVIDIA K80 GPUs.
Grid Convergence Study
A grid convergence study is conducted in a free-field domain containing only a single monopole at the center and no walls. The impact of doubling the number of cells used to resolve one wavelength
on the SPL accuracy is investigated. To this end, the wavelength resolution is analyzed at distances of up to 4 wavelengths in radial direction from the source, which corresponds to the maximum distance appearing in the subsequently discussed cases 1–3. In order to obtain results in the far field from the center for
, the domain is extended to
cells. Figure 3a) shows the divergence
from the maximum SPL value
, which appears at a distance of one wavelength from the monopole location, for
,
, and
. From this figure, it is evident that the divergence increases with increasing distance from the monopole. Furthermore, Fig. 3b) shows the error for
compared to
, i.e.,
. Throughout this work a wavelength of
is used, which covers distances up to
in cases 1–2, and up to
in case 3. At distances
and
, errors of
and
are obtained. It should be noted that using
would massively increase the computational effort and hence, as the corresponding error is acceptable, meshes with
are employed in all cases.
Fig. 3.
a)
at a distance from up to
in radial direction from a monopole placed in the center of a free field. Three resolutions for one wavelength are juxtaposed:
,
- - -,
—. b) Error E between
) and
.
Case 1: Simple Setup and Parameter Study
The domain in case 1 contains one monopole
at the center (8C, 8C) and one randomly positioned circular object
. Each computational domain consists of
cells in the two dimensions. The acoustic solutions of 3,000 simulations are split into 2,600 training data, 200 validation data, and 200 test data. Three sub-cases 1A, 1B, and 1C listed in Table 2 are configured with one noise source and one solid object. In case 1A, the number of FMs is investigated by varying the factor Y as shown in Fig. 2. Variations of
,
and
lead to 517,867, 2,066,001, and 8,253,089 trainable parameters. It is evident from comparing Figs. 4b), 4c), and 4d) with the simulation results in Fig. 4a) that
qualitatively reproduces the simulation best. For
, the AFP completely fails to generate a physically meaningful SPL field. In case of
, acoustic waves distant from the object are reproduced well, but superpositions of acoustic waves in the vicinity of the object are too strong, see Fig. 4c). The SPL distribution shown in Fig. 4e) along the characteristic line LP1, see Fig. 4a), substantiates these findings. The valley between
and
, and the decrease of the SPL value in the shadow of
are only captured well for
. Furthermore, the CNN has problems capturing fluctuations at the center of
as non-physical SPL values are found at isolated locations close to the object. The mean error
listed in Table 2 shows
to have the lowest deviation among the three computations. The training time to reach a convergence of the loss function increased from approximately one hour for
up to two and four hours for
and
.
Table 2.
Simulation configurations defined by objects, the number of noise sources (no. noise) and simulations (no. sim) generated by randomized distributions of objects. The number of feature maps (FMs) is defined by Y. The gradient losses GDL are calculated by FD, 2nd-order-accurate, and 4th-order-accurate CD schemes. The quantities BS and
are the batch size and the mean acoustic error.
| Case | Object(s) | No. noise | No. sim | Y | GDL method | BS | Mean error |
|---|---|---|---|---|---|---|---|
| 1A | ![]() | 1 | 3,000 | 8 | FD | 5 | 0.17506 |
| 1A | ![]() | 1 | 3,000 | 16 | FD | 5 | 0.03312 |
| 1A | ![]() | 1 | 3,000 | 32 | FD | 5 | 0.00887 |
| 1B | ![]() | 1 | 3,000 | 32 | FD | 5 | 0.00887 |
| 1B | ![]() | 1 | 3,000 | 32 | 2nd order CD | 5 | 0.00671 |
| 1B | ![]() | 1 | 3,000 | 32 | 4th order CD | 5 | 0.00222 |
| 1C | ![]() | 1 | 3,000 | 32 | 2nd order CD | 5 | 0.00671 |
| 1C | ![]() | 1 | 3,000 | 32 | 2nd order CD | 10 | 0.00626 |
| 1C | ![]() | 1 | 3,000 | 32 | 2nd order CD | 20 | 0.00413 |
| 2 | , | 1 | 3,000 | 32 | 2nd order CD | 5 | 0.00359 |
| 2 | , | 1 | 6,000 | 32 | 2nd order CD | 5 | 0.00280 |
| 3 | , , , | 2 | 6,000 | 32 | 2nd order CD | 5 | 0.02581 |
| 3 | , , , | 2 | 10,000 | 32 | 2nd order CD | 5 | 0.02268 |
| 3 | , , , | 2 | 20,000 | 32 | 2nd order CD | 5 | 0.01937 |
Fig. 4.
Example of SPL fields of case 1A: a) simulation result, b) network prediction with
, c)
, and d)
; e) SPL distribution at
along LP1: simulation result
, network prediction with
- - -,
—, and
-
-.
To overcome inaccurate predictions close to monopoles, the nature of a noise source is incorporated into the loss function of the AFP. A simple FD gradient loss does not consider that monopoles are point sources spreading waves into all directions. In case 1B, two variations of losses are investigated that are based on the CD formulations provided in Sect. 2.5. From Fig. 5 it is obvious that thereby non-physical SPL values vanish near objects. Furthermore, Fig. 5(c) shows an improvement of the SPL distribution at the center and surroundings of
predicted by a 2nd-order accurate CD gradient loss. In contrast, using a 4th-order accurate CD formulation lowers the accuracy of the predictions near
, see Fig. 5(d). It is, however, evident from Table 2 that a slightly lower
is achieved than using a 2nd-order formulation. This is due to the 4th-order accurate CD gradient loss computations reproducing simulations slightly better at locations distant from monopoles and objects, see Fig. 5(e). SPL fluctuations at the center of
are by far closer to the ground truth using the 2nd-order accurate formulation. Since this study focuses on the prediction of complex acoustic fields with multiple noise sources, the advantages of the 2nd-order accurate formulation are considered more valuable, i.e., in the following this type of loss is employed.
Fig. 5.
Example of SPL fields of case 1B: a) simulation result, b) network prediction with FD, c) a 2nd-order accurate CD, and d) a 4th-order accurate CD gradient loss; e) SPL at
along LP2: simulation result
, network prediction with FD - - -, 2nd-order accurate CD —, and 4th-order accurate CD gradient losses -
-.
The impact of BS is investigated in Fig. 6. Figure 6e) plots the SPL distribution along line LP3, see Fig. 6a). Although predictions with
and
show a slight decrease of
, see Table 2, several shortcomings are recognizable in predicted SPL fields. Figures 6c) and e) show that with
non-physical fluctuations near the objects are introduced. These fluctuations are also present for
and are superimposed by inaccuracies appearing in the vicinity of
and at the domain boundaries, i.e.,
delivers the best results.
Fig. 6.
Example for SPL fields of case 1C: a) simulation result, b) network prediction with
, c)
, and d)
; e) SPL distribution at
along line LP3: simulation result
, network prediction with
- - -,
—, and
-
-.
Case 2: Influence of the Number of Training Data
In case 2, the number of training, validation, and test data is analyzed. Compared to case 1, the complexity is increased by adding a rectangular object
to the domain. The training, validation, and test data are composed of 2,600, 200, and 200 simulations for a total of 3,000, and of 5,200, 400, and 400 for a total of 6,000 simulations. The setups for these cases are summarized in Table 2.
Figure 7 compares the results of an LB simulation qualitatively and quantitatively along line LP4, see Fig. 7a), with predictions generated by using 3,000 and 6,000 simulations for learning. When the amount of data is increased, non-physical fluctuations disappear in regions where sound waves propagate towards the surface of
. Furthermore, the predictions of the acoustic field in the vicinity of
improve from 3,000 to 6,000 training datasets.
Fig. 7.
Example of SPL fields of case 2: a) simulation result, b) network prediction with 3,000, and c) 6,000 simulations; SPL distribution at
along LP4: simulation result
, network prediction with 3,000 - - -, and 6,000 simulations —.
Case 3: Complex Setup and Impact of Increasing Training Data
Case 3 builds on the findings from the previous cases to predict SPL fields in a domain containing objects
,
,
, and
, see Fig. 1, on
. From
to
the number of trainable parameters increases from 8,253,089 to 8,256,225. NRBC and WBC boundary conditions are imposed randomly at the domain boundaries. Two monopoles
and
are placed inside of the domain.
is located at (5C, 5C) and
is positioned randomly. For the training, validation, and testing of the AFP, a total number of 20,000 simulations is used. Results of computations with different simulation inputs are compared to the ground truth in Fig. 8. Note that the WBC is imposed at domain boundary IV; however, the complete thickness
is not visualized in the figure. The first case uses 6,000 simulations with a distribution of 5,200, 400, and 400 for training, validation, and testing. The second case employs 10,000 simulations with a distribution of 8,800, 600, and 600 for training, validation, and testing. The last case employs all 20,000 simulations with a distribution of 18,000, 1,000, and 1,000 for training, validation, and testing. For reference, the different setups and the corresponding results are listed in Table 2. As Table 2 shows, the error
decreases when the number of training data is increased. From Figs. 8(c) and (e) it is evident that the AFP trained with 8,800 datasets overpredicts the SPL near
. In general, it can be stated that with an increasing complexity the SPL is more difficult to predict compared to cases 1 and 2. To be more specific, from case 1 to case 3 the error
increases by one order of magnitude, i.e., it is at
in case 3. Nevertheless, complex acoustic fields are reproduced. With 18,000 training simulations, training took 96 hours for the loss function to converge.
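The comparison between simulated and predicted SPL fields can be illustrated with a minimal sketch of a mean absolute error over all cells and of a profile extracted along a horizontal line. The exact error definition and any normalization of the SPL are not given in this passage, so the plain cell-wise mean, the function names, and the toy values below are assumptions.

```python
def mean_absolute_error(simulated, predicted):
    """Mean absolute difference over all cells of two equally shaped 2D fields."""
    if len(simulated) != len(predicted):
        raise ValueError("fields must have the same number of rows")
    total, count = 0.0, 0
    for row_s, row_p in zip(simulated, predicted):
        if len(row_s) != len(row_p):
            raise ValueError("fields must have the same number of columns")
        for s, p in zip(row_s, row_p):
            total += abs(s - p)
            count += 1
    return total / count


def line_profile(field, row):
    """Return the values of a 2D field along one horizontal grid line."""
    return list(field[row])


# Toy 2x2 fields standing in for simulated and network-predicted SPL.
spl_sim = [[0.0, 1.0], [2.0, 3.0]]
spl_net = [[0.5, 1.0], [2.0, 2.5]]

print(mean_absolute_error(spl_sim, spl_net))  # 0.25
print(line_profile(spl_sim, 1), line_profile(spl_net, 1))
```

A field-wide mean can hide local deviations such as the overprediction near an object, which is why the line profiles are inspected alongside the scalar error.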
Fig. 8.
Example of SPL fields of case 3: a) simulation result, b) network prediction with 6,000, c) 10,000, and d) 20,000 simulations; SPL distribution at
along LP5: simulation result
, network prediction with 6,000 - - -, 10,000 —, and 20,000 simulations -
-.
Summary, Conclusions, and Outlook
A deep learning method has been developed to predict the sound pressure level distribution in two-dimensional aeroacoustic setups including multiple randomly distributed rectangular and circular objects as hard reflective surfaces and monopoles as sound sources. The deep learning method is based on an encoder-decoder convolutional neural network, which has been trained with numerical simulations based on a lattice-Boltzmann method. To analyze the accuracy of the network predictions, various learning parameters have been tuned by successively increasing the complexity of the prediction cases and by analyzing different loss functions. A network containing 8,256,225 trainable parameters, a combination of the mean-squared error loss and a gradient loss formulated by a 2nd-order accurate central difference scheme, and a batch size of five positively influenced the predictions. A total of 18,000 datasets has been used to train the deep neural network. A mean absolute error of less than
shows that the neural network is capable of accurately predicting the acoustic fields. The study has been complemented with a grid convergence study, which revealed that a resolution of 50 cells per wavelength is sufficient to yield accurate results.
At present, the method is spatially limited to two-dimensional cases. However, most engineering applications, e.g., design processes to find optimal layouts for low-noise turbojet engines, feature three-dimensional phenomena. Extending the presented deep learning method to learn from three-dimensional simulations will lead to accelerated predictions of three-dimensional aeroacoustic problems. Furthermore, realistic acoustic fields are frequently characterized by interactions of multiple noise sources with various frequencies and amplitudes. Therefore, it is necessary to extend the current setup to monopoles with multiple frequencies and amplitudes. Apart from increasing the domain’s complexity, the level of generalization will be increased. The presented acoustic field predictor has been trained and tested on similar situations. Its capability to generalize will be assessed and enhanced by testing on situations that have not been part of the training process, e.g., training with four objects and testing with five. Instead of strictly separating different gradient losses, the impact of combining them in a single loss and employing individual weights will be analyzed. In addition, physics-informed losses that allow the network to comply with physical laws of aeroacoustics will be integrated. Furthermore, adversarial training will be investigated by adding a discriminator with an adversarial loss to the current architecture. Such GAN-type architectures have the potential to help find a suitable loss from the training data. It is also worth mentioning that the method presented in this study has the potential to support solving noise control problems. It remains to be investigated whether a dedicated acoustic field predictor that can quickly give feedback on the arrangement of multiple monopoles is capable of finding optimal acoustic setups. To this end, the presented acoustic field predictor will be integrated into a reinforcement learning loop.
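The combination of a mean-squared error loss with a gradient loss based on 2nd-order central differences can be sketched as follows. This is only an illustration under stated assumptions: the weight `lam`, the grid spacing `dx`, the restriction to interior cells, and the squared-difference form of the gradient term are hypothetical choices and not necessarily the formulation used in the study.

```python
def central_diff(field, dx=1.0):
    """2nd-order central differences on interior cells of a 2D field.

    Returns two lists of rows: derivatives along the column index (x)
    and along the row index (y).
    """
    ny, nx = len(field), len(field[0])
    ddx = [[(field[j][i + 1] - field[j][i - 1]) / (2.0 * dx)
            for i in range(1, nx - 1)] for j in range(1, ny - 1)]
    ddy = [[(field[j + 1][i] - field[j - 1][i]) / (2.0 * dx)
            for i in range(1, nx - 1)] for j in range(1, ny - 1)]
    return ddx, ddy


def mse(a, b):
    """Mean-squared error between two equally shaped 2D fields."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def combined_loss(pred, target, lam=1.0, dx=1.0):
    """MSE loss plus a weighted gradient loss (lam is a hypothetical weight)."""
    px, py = central_diff(pred, dx)
    tx, ty = central_diff(target, dx)
    gradient_loss = mse(px, tx) + mse(py, ty)
    return mse(pred, target) + lam * gradient_loss


# A prediction that is offset by a constant has the correct gradients,
# so only the MSE term contributes to the loss.
target = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
pred = [[v + 1.0 for v in row] for row in target]
print(combined_loss(pred, target))  # 1.0
```

The example shows the design intent of such a term: the gradient part penalizes errors in the shape of the field, such as spurious fluctuations, while the MSE part penalizes errors in its level.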
Acknowledgments
The authors gratefully acknowledge the computing time granted through the Jülich Aachen Research Alliance (JARA) on the supercomputer JURECA [14] at Forschungszentrum Jülich. Furthermore, the authors would like to thank Forschungszentrum Jülich GmbH, RWTH Aachen University, and the JARA Center for Simulation and Data Science (JARA-CSD) for research funding. This work was performed as part of the Helmholtz School for Data Science in Life, Earth and Energy (HDS-LEE).
Contributor Information
Heike Jagode, Email: jagode@icl.utk.edu.
Hartwig Anzt, Email: hartwig.anzt@kit.edu.
Guido Juckeland, Email: g.juckeland@hzdr.de.
Hatem Ltaief, Email: hatem.ltaief@kaust.edu.sa.
Mario Rüttgers, Email: m.ruettgers@aia.rwth-aachen.de.
References
- 1.Benzi, R., Succi, S., Vergassola, M.: The lattice Boltzmann equation: theory and applications. Phys. Rep. 222(3), 145–197 (1992). 10.1016/0370-1573(92)90090-M
- 2.Bhatnagar, P.L., Gross, E.P., Krook, M.: A Model for collision processes in gases. I. Small amplitude processes in charged and neutral one-component systems. Phys. Rev. 94(3), 511–525 (1954). 10.1103/PhysRev.94.511
- 3.Bhatnagar, S., Afshar, Y., Pan, S., Duraisamy, K., Kaushik, S.: Prediction of aerodynamic flow fields using convolutional neural networks. Comput. Mech. 64(2), 525–545 (2019). 10.1007/s00466-019-01740-0
- 4.Bode, M., et al.: Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows (2019)
- 5.Drozdzal, M., Vorontsov, E., Chartrand, G., Kadoury, S., Pal, C., et al.: The importance of skip connections in biomedical image segmentation. In: Carneiro, G., et al. (eds.) Deep Learning and Data Labeling for Medical Applications, pp. 179–187. Springer, Cham (2016)
- 6.Ewert, R., Schröder, W.: On the simulation of trailing edge noise with a hybrid LES/APE method. J. Sound Vib. 270, 509–524 (2004). 10.1016/j.jsv.2003.09.047
- 7.Guo, X., Li, W., Iorio, F.: Convolutional neural networks for steady flow approximation. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD 2016, pp. 481–490. ACM Press, New York (2016). 10.1145/2939672.2939738
- 8.Hänel, D.: Molekulare Gasdynamik, Einführung in die kinetische Theorie der Gase und Lattice-Boltzmann-Methoden. Springer, Heidelberg (2004)
- 9.He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034. IEEE (2015). 10.1109/ICCV.2015.123
- 10.Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: ICML 2015: Proceedings of the 32nd International Conference on International Conference on Machine Learning, Lille, France, pp. 448–456. W&CP (2015). 10.5555/3045118.3045167
- 11.Kam, E.W.S., So, R.M.C., Leung, R.C.K.: Lattice Boltzmann method simulation of aeroacoustics and nonreflecting boundary conditions. AIAA J. 45(7), 1703–1712 (2007). 10.2514/1.27632
- 12.Kingma, D.P., Ba, J.: Adam: A Method for Stochastic Optimization (2014)
- 13.Koh, S.R., Meinke, M., Schröder, W.: Numerical analysis of the impact of permeability on trailing-edge noise. J. Sound Vib. 421, 348–376 (2018). 10.1016/j.jsv.2018.02.017
- 14.Krause, D., Thörnig, P.: JURECA: modular supercomputer at Jülich supercomputing centre. JLSRF 4, A132 (2018). 10.17815/jlsrf-4-121-1
- 15.Lathuilière, S., Mesejo, P., Alameda-Pineda, X., Horaud, R.: A comprehensive analysis of deep regression (2018)
- 16.Lee, S., You, D.: Data-driven prediction of unsteady flow over a circular cylinder using deep learning. J. Fluid Mech. 879, 217–254 (2019). 10.1017/jfm.2019.700
- 17.Lee, S., You, D.: Mechanisms of a convolutional neural network for learning three-dimensional unsteady wake flow (2019)
- 18.Lintermann, A., Meinke, M., Schröder, W.: Zonal Flow Solver (ZFS): a highly efficient multi-physics simulation framework. Int. J. Comput. Fluid Dyn. 1–28 (2020). 10.1080/10618562.2020.1742328
- 19.Mao, X., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Advances in Neural Information Processing Systems, pp. 2802–2810 (2016)
- 20.Mathieu, M., Couprie, C., LeCun, Y.: Deep multi-scale video prediction beyond mean square error (2015)
- 21.McKee, C., Harmanto, D., Whitbrook, A.: A conceptual framework for combining artificial neural networks with computational aeroacoustics for design development. In: Proceedings of the International Conference on Industrial Engineering and Operations Management (2018)
- 22.Miyanawala, T.P., Jaiman, R.K.: An efficient deep learning technique for the Navier-Stokes equations: application to unsteady wake flow dynamics (2017)
- 23.Moin, P.: Fundamentals of Engineering Numerical Analysis. Cambridge University Press, London (2001)
- 24.Niemöller, A., Schlottke-Lakemper, M., Meinke, M., Schröder, W.: Dynamic load balancing for direct-coupled multiphysics simulations. Comput. Fluids 199, 104437 (2020). 10.1016/j.compfluid.2020.104437
- 25.Qian, Y.H., D’Humières, D., Lallemand, P.: Lattice BGK Models for Navier-Stokes Equation. EPL 17(6), 479–484 (1992). 10.1209/0295-5075/17/6/001
- 26.Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation (2015)
- 27.Rüttgers, M., Lee, S., Jeon, S., You, D.: Prediction of a typhoon track using a generative adversarial network and satellite images. Sci. Rep. 9 (2019). 10.1038/s41598-019-42339-y
- 28.Salomons, E.M., Lohman, W.J.A., Zhou, H.: Simulation of sound waves using the lattice Boltzmann method for fluid flow: benchmark cases for outdoor sound propagation. PLoS ONE 11(1), e0147206 (2016). 10.1371/journal.pone.0147206
- 29.Schlimpert, S., Koh, S.R., Pausch, K., Meinke, M., Schröder, W.: Analysis of combustion noise of a turbulent premixed slot jet flame. Combust. Flame 175, 292–306 (2017). 10.1016/j.combustflame.2016.08.001
- 30.Sethian, J.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science. Cambridge University Press, Cambridge (1999)
- 31.Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56), 1929–1958 (2014)
- 32.Sutherland, W.: The viscosity of gases and molecular force. Philos. Mag. 36(5), 507–531 (1893). 10.1080/14786449308620508
- 33.Zhou, B.Y., Koh, S.R., Gauger, N., Meinke, M., Schröder, W.: A discrete adjoint framework for trailing-edge noise minimization via porous material. Comput. Fluids 172, 97–108 (2018). 10.1016/j.compfluid.2018.06.017
- 34.Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., Liang, J.: Unet++: redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 39(6), 1856–1867 (2019). 10.1109/TMI.2019.2959609