High-dimensional aerodynamic data modeling using a machine learning method based on a convolutional neural network

Modeling high‑dimensional aerodynamic data presents a significant challenge in aero‑loads prediction, aerodynamic shape optimization, flight control, and simulation. This article develops a machine learning approach based on a convolutional neural network (CNN) to address this problem. A CNN can implicitly distill features underlying the data. The number of parameters to be trained can be significantly reduced because of its local connectivity and parameter‑sharing properties, which is favorable for solving high‑dimensional problems in which the training cost can be prohibitive. A hypersonic wing similar to the Sanger aerospace plane carrier wing is employed as the test case to demonstrate the CNN‑based modeling method. First, the wing is parameterized by the free‑form deformation method, and 109 variables incorporating flight status and aerodynamic shape variables are defined as model input. Second, more than 7000 sample points generated by the Latin hypercube sampling method are evaluated by performing computational fluid dynamics simulations using a Reynolds‑averaged Navier–Stokes flow solver to obtain an aerodynamic database, and a CNN model is built based on the observed data. Finally, the well‑trained CNN model considering both flight status and shape variables is applied to aerodynamic shape optimization to demonstrate its capability to achieve fast optimization at multiple flight statuses.


Introduction
Aircraft aerodynamic data modeling is conducted to establish an explicit or implicit relationship between input variables and output responses through a trained physics-informed or data-driven model. It provides a rapid evaluation and prediction of aerodynamic characteristics for an aircraft instead of conducting flight tests, wind tunnel experiments, or numerical simulations such as computational fluid dynamics (CFD).
Although sufficient samples are invariably required to build an adequate model with respect to the independent variables, such as flight status variables (or aircraft shape variables) and the corresponding aerodynamic characteristics, data-driven modeling can still significantly reduce the cost of building a database covering the entire flight envelope compared with relying on tests, experiments, or simulations alone. Most existing aerodynamic models belong to the multi-layer perceptron (MLP) category. The trade-off between the size and learning capability of a standard MLP is pronounced: more complex problems require larger datasets and more complex network structures, yielding wasteful connectivity and a high tendency toward overfitting. It is noted that no more than 60 independent input variables have been introduced into existing aerodynamic modeling using an MLP, limiting more refined designs.
Convolutional neural networks have rapidly replaced standard MLP techniques in many challenging ML tasks (e.g., image recognition [28,29]) because of their local connectivity and parameter-sharing scheme. Aerodynamic modeling research using airfoil images as input shows that they are advantageous for high-dimensional aerodynamic modeling. Thuerey [30] used a CNN to predict airfoil flow fields for Reynolds numbers ranging from 5 × 10^5 to 5 × 10^6 and angles of attack ranging from −22.5° to 22.5°, comparing the results with those calculated from the RANS equations. The prediction error of the pressure and velocity contours was less than 3% when the number of training sample points was 12,800.
Zhang [19] proposed a CNN-based prediction method for airfoil lift coefficients for various shapes at multiple free-stream Mach numbers, Reynolds numbers, and angles of attack. After data augmentation, about 80,000 sample points of the airfoil coordinates were fed to a CNN instead of shape variables from parameterization. Yu [31] proposed a feature-enhanced image approach to perform aerodynamic modeling of an SC1095 airfoil with a CNN trained with 11,550 pairs of normal input/output training data. Chen [32] also adopted a graphical prediction method for multiple airfoil aerodynamic coefficients based on CNN trained by 3360 samples and tested on 840 samples. CNNs have also been used to predict the aerodynamic characteristics of iced airfoils. He [33] proposed a prediction method for the aerodynamic characteristics of iced airfoils based on a CNN, using 11,200 training samples to realize rapid prediction from ice images to aerodynamic characteristics with prediction errors below 8%. The relevant aerodynamic modeling studies using ML-based methods for aerodynamic coefficient and flow field prediction are listed in Table 1.
It is noted that most aerodynamic modeling methods using CNNs adopt the idea of using airfoil graphics as input, which is limited to processing two-dimensional structured image data. For three-dimensional aerodynamic shapes, there is no research on modeling with more than 100 parameterized independent variables, motivating the research described in this article.
The purpose of this study is to develop a CNN-based method for high-dimensional aerodynamic modeling to alleviate the "curse of dimensionality" [34] and apply it to surrogate-based aerodynamic shape optimization [35][36][37]. The convolution operation implicitly extracts feature information underlying the aerodynamic data and slightly squeezes the tensor's dimensionality. Usually, pooling layers are added after the convolutional layers to further reduce the number of neural network parameters (weights and biases) to be trained. With the assistance of local connectivity and parameter sharing, CNNs have better resistance to overfitting.
The remainder of this article is organized as follows. Section 2 presents a detailed introduction to multi-layer perceptrons and convolutional neural networks. Then, the CNN-based aerodynamic data modeling process is described in Section 3. In Section 4, a well-trained CNN is built by investigating the influence of the hyperparameters and applying it to fast aerodynamic shape optimization for a wing. Concluding remarks are provided in Section 5.

Fundamentals of convolutional neural networks
A convolutional neural network (CNN) is a supervised deep learning algorithm developed based on the multi-layer perceptron (MLP), and its basic theory is outlined in this section.

Multi-layer perceptron
The MLP is a machine learning model known as a fully-connected feedforward neural network. The typical MLP, shown in Fig. 1, receives an input (a single vector) and transforms it through a series of hidden layers. Each hidden layer consists of a set of neurons, where each neuron is fully connected to all neurons in the previous layer and where neurons in a single layer function independently and do not share any connections.
For the forward propagation of an MLP, the output of the i th neuron on the k th layer, H_i^{(k)}, can be expressed as follows:

$$H_i^{(k)} = \sigma\Big( \sum_j w_{i,j}^{(k)} x_j^{(k-1)} + b_i^{(k)} \Big)$$

where σ(·) is the activation function, w_{i,j}^{(k)} is a scalar of weights to be determined between the i th neuron on the k th layer and the j th neuron on the previous layer, x_j^{(k−1)} is the input of this layer, i.e., the output of the j th neuron on the (k − 1) th layer, and b_i^{(k)} denotes a bias scalar of the i th neuron on the k th layer.
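This layer-wise forward propagation can be sketched in a few lines of NumPy; the tanh activation and the layer sizes here are illustrative choices, not the ones used in the article.

```python
import numpy as np

def mlp_layer(x_prev, W, b, activation=np.tanh):
    """Forward pass through one fully-connected layer:
    H_i = activation(sum_j W[i, j] * x_prev[j] + b[i])."""
    return activation(W @ x_prev + b)

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # output of layer k-1 (4 neurons)
W = rng.normal(size=(3, 4))   # weights between layers k-1 and k
b = rng.normal(size=3)        # biases of the 3 neurons on layer k
h = mlp_layer(x, W, b)        # output of layer k
```

Stacking such layers, each fully connected to the previous one, yields the MLP of Fig. 1.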

Convolutional neural network
A CNN is a type of ANN specifically designed for large-scale structured data such as images. Its structure makes the implementation more efficient and significantly reduces the number of parameters in the network through convolution and pooling operations [38,39]. The CNN architecture mainly consists of convolutional, pooling, and fully-connected layers. A schematic of a typical CNN architecture for handwritten digit recognition is shown in Fig. 2.
Unlike a fully-connected neural network, the number of free parameters in a CNN, which describe its shared weights, does not depend on the input dimensionality, avoiding the parameter surge caused by high-dimensional inputs. The following sections focus on the convolutional and pooling layers; the structure of the fully-connected layer is consistent with that of the MLP described above.
Inspired by the conventional two-dimensional CNN, we propose a modified structure (Fig. 3) that takes a one-dimensional tensor of parameterized variables as input. These input variables include shape variables (defined by the free-form deformation method, class-function shape-function transformation, and related methods), flight status variables (such as Mach number, angle of attack, Reynolds number, and flight altitude), and wing control surface deflection variables. The outputs are the predicted aerodynamic characteristics such as C_L, C_D, and C_M.
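For illustration, the one-dimensional input tensor for the test case in this article (2 flight status variables, 7 planar shape variables, and 100 FFD control point variables, 109 in total) can be assembled as follows; the particular Mach number, angle of attack, and zero perturbations are placeholder values only.

```python
import numpy as np

# Placeholder flight status: free-stream Mach number and angle of attack
mach, alpha = 5.5, 2.0

planar = np.zeros(7)    # 7 planar shape variables (unperturbed here)
ffd = np.zeros(100)     # 100 FFD control-point Z-perturbations (unperturbed)

# One-dimensional tensor of length 109 fed to the 1D CNN
x = np.concatenate([[mach, alpha], planar, ffd])
```

The 1D convolutional filters then slide along this vector exactly as 2D filters slide over an image.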

Convolutional layer
The convolution layer processes the input data using a convolution filter and distills the local and global information. If two-dimensional input data I with coordinates (m, n) are taken as the input of a convolution layer, the convolution filter is a two-dimensional matrix K, and the output is a two-dimensional matrix S with coordinates (i, j). Therefore, the convolution process can be expressed by the following formula:

$$S(i,j) = (I * K)(i,j) = \sum_m \sum_n I(i+m,\, j+n)\, K(m,n)$$

Unlike the full connectivity of an MLP, only a local connection between the convolutional filter and the neurons covered by its receptive field is established, significantly reducing the number of parameters to be trained in the CNN and contributing to the alleviation of overfitting.
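A direct, unoptimized sketch of this discrete convolution (valid region only, stride 1, no padding; assumptions made for brevity):

```python
import numpy as np

def conv2d(I, K):
    """S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n), over the valid region."""
    H = I.shape[0] - K.shape[0] + 1
    W = I.shape[1] - K.shape[1] + 1
    S = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            # Element-wise product of the receptive field with the filter
            S[i, j] = np.sum(I[i:i + K.shape[0], j:j + K.shape[1]] * K)
    return S

I = np.arange(16.0).reshape(4, 4)
K = np.ones((2, 2))
S = conv2d(I, K)   # each entry is the sum of a 2x2 patch of I
```

The same filter K is reused at every position (i, j), which is the parameter-sharing property discussed above.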

Pooling layer
It is common to periodically insert a pooling layer between successive convolutional layers in a CNN architecture. A pooling layer's function is to progressively reduce the representation's spatial size to reduce the number of parameters and computations in the network and control overfitting [40,41]. Common pooling operations are maximum pooling and average pooling. The pooling layer keeps the maximum or average value in the pooling filter's receptive field and transfers it to the next layer while discarding the other values and moving the filter with a given stride to the next local region to perform the same operation.
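Max and average pooling can be sketched for the one-dimensional case used later in this article; the window size and stride values below are illustrative.

```python
import numpy as np

def pool1d(x, size=2, stride=2, mode="max"):
    """Slide a pooling window over a 1D signal, keeping the max or mean of
    each receptive field and discarding the other values."""
    op = np.max if mode == "max" else np.mean
    n = (len(x) - size) // stride + 1
    return np.array([op(x[i * stride:i * stride + size]) for i in range(n)])

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])
pool1d(x, mode="max")   # -> [3., 5., 4.]
pool1d(x, mode="avg")   # -> [2., 3.5, 2.]
```

Each pooling layer halves the representation length here, reducing the parameter count of the subsequent layers.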

Fully-connected layer
The last pooling layer is usually followed by several fully-connected layers whose structure is the same as the MLP. The depth and width of the fully-connected layers are defined according to the problem's complexity and the data size.

CNN-based aerodynamic data modeling method
The steps of the aerodynamic data modeling method using CNN, shown in Fig. 4, can be described as follows:
a) Design of experiments: Parameterize the shape and sample the design space of flight status and shape variables.
b) CFD simulation: Generate the computational mesh for each sample and obtain the aerodynamic data at the sample points by conducting CFD simulations.
c) Aerodynamic database preparation: Organize the design space and its corresponding aerodynamic characteristics into a database.
d) Model training: Update the model parameters iteratively using an optimization algorithm.
e) Optimum prediction: Output the optimum prediction if the required accuracy is achieved; otherwise, refine the model through sample augmentation and hyperparameter adjustment.
Sufficient samples are required to build a data-driven model with reasonable accuracy. To obtain enough samples to train the model, we use CFD simulations to compute the aerodynamic force coefficients, such as lift coefficient, drag coefficient, and pitching moment coefficient, corresponding to different flight statuses (with a given range for the free-stream Mach number and angle of attack), and different shapes described by the wing planar variables and profile variables.
The Sanger aerospace plane carrier wing (the aircraft's configuration is shown in Fig. 7) is employed as a test case to validate the prediction capability of the CNN model. The following sections detail the five procedures for obtaining the aerodynamic data and building the CNN.

Validation of RANS flow solver
The flow solver must be validated before using CFD simulation to obtain aerodynamic data. The RANS solver is validated by simulating the hypersonic flow over the FDL-5A configuration. An AUSM (advection upstream splitting method) scheme is used for the spatial discretization, and a k-ω SST turbulence model is adopted for turbulence closure. The unstructured computational mesh is shown in Fig. 5 and contains 0.48 million cells. The hypersonic flow over the FDL-5A is simulated at Ma = 7.98 and Re = 3.832 × 10^6. Figure 6 compares the computed force coefficients with the corresponding experimental data. The lift, drag, and pitch moment coefficients accurately match the experimental data at different angles of attack. Although the Mach number used to verify the flow solver (7.98) is above that of the data for aerodynamic modeling, both are in the hypersonic regime, so this case is a sensible validation of the flow solver.

Parameterization
The wing is parameterized by the free-form deformation (FFD) method into 100 control point variables, spread over five profiles at spanwise locations, together with seven planar shape variables describing the wing configuration. As shown in Fig. 7, the five wing profiles are parameterized by the two-dimensional FFD method. New wings are obtained by independently perturbing the planar shape variables and the FFD control point variables, and the five perturbed wing profiles are assembled at the corresponding spanwise locations; thus, the FFD control point variables are decoupled from the planar shape variables. Figure 8 shows the seven design variables used to parameterize the planar shape of the wing: the root chord, leading-edge sweep angle of the inner wing segment, leading-edge sweep angle of the outer wing segment, trailing-edge sweep angle of the inner wing segment, trailing-edge sweep angle of the outer wing segment, span of the inner wing segment, and wingspan.
In total, there are 107 variables describing this configuration and two flight status variables (free-stream Mach number and angle of attack). The boundaries of these variable values are presented in Table 2. In particular, the 100 wing profile variables (Index ∈ [10, 109]) represent the Z-coordinates of the FFD control points, varying from −10% to 10% relative to the benchmark wing.

Design of experiments
Latin hypercube sampling (LHS) is used as the design of experiments (DoE) method to establish the distribution of input variables. LHS is an approximate random sampling method for multivariate parameter distributions belonging to the family of stratified sampling techniques and is often used in DoE. Samples x_j^{(i)} obtained using the LHS method can be expressed as follows:

$$x_j^{(i)} = \frac{\pi_j^{(i)} + U}{N}$$

where i denotes the i th sample, j denotes the j th design variable, U denotes a random number in [0, 1], π_j denotes a random permutation of {0, 1, …, N − 1}, and N is the number of samples. An example in which the LHS method selects 20 sample points in a DoE problem with two-dimensional input is shown in Fig. 9.
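The LHS construction can be sketched directly in NumPy (unit hypercube only; scaling to the bounds of Table 2 would follow):

```python
import numpy as np

def lhs(n_samples, n_dims, seed=None):
    """Latin hypercube sample in [0, 1]^d:
    x_j^(i) = (pi_j(i) + U) / N, where pi_j is a random permutation of
    {0, 1, ..., N-1} and U is uniform on [0, 1)."""
    rng = np.random.default_rng(seed)
    perms = np.stack([rng.permutation(n_samples) for _ in range(n_dims)], axis=1)
    return (perms + rng.uniform(size=(n_samples, n_dims))) / n_samples

# 20 sample points for a two-dimensional input, as in Fig. 9
X = lhs(20, 2, seed=0)
```

Each of the 20 equal-width strata of every variable contains exactly one sample, which is the defining property of LHS.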

Mesh generation and CFD simulation
This work uses the mesh reconstruction method to generate meshes for different shapes. A mesh independence study [42,43] is performed to ensure sample data accuracy. As shown in Figs. 10 and 11, three meshes are generated: coarse, medium, and fine meshes having 0.41 million, 1.34 million, and 2.86 million cells, respectively. The results for the three meshes are in good agreement.
The variation in force coefficients with mesh size is shown in Fig. 12. The lift coefficient, drag coefficient, and moment coefficient calculated with the coarse mesh differ from those calculated with the fine mesh by 0.00016, 1 count (1 count = 0.0001), and 0.000012, respectively. The coarse mesh is therefore chosen (Fig. 13) to build the aerodynamic dataset at minimal computational cost. The same topology and mesh parameters are used for all sample points to keep the physical problem consistent. CFD simulations are conducted using the RANS equations, and a two-equation k-ω SST turbulence model is adopted for turbulence closure. An aerodynamic dataset consisting of 7431 sample points is constructed.

CNN Training
The CNN network weights and biases are optimized using the backpropagation algorithm [44]. For the regression problem, the mean square error (MSE) loss function of the model is expressed as:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$

where N denotes the number of sample points in the training dataset, y_i is the numerical simulation value calculated by the CFD simulation, and ŷ_i denotes the prediction value.
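As a minimal check of the loss definition, the MSE can be computed directly (the coefficient values below are arbitrary illustrations):

```python
import numpy as np

def mse(y_cfd, y_pred):
    """Mean squared error between CFD values y_i and predictions y_hat_i."""
    y_cfd, y_pred = np.asarray(y_cfd), np.asarray(y_pred)
    return np.mean((y_cfd - y_pred) ** 2)

# A single 0.01 error over three samples: 0.01**2 / 3 ~ 3.33e-05
loss = mse([0.30, 0.02, -0.01], [0.31, 0.02, -0.01])
```

This is the quantity minimized by backpropagation during training.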
The dataset is divided into training and test datasets in a 19:1 ratio. A validation dataset is not used because this article does not employ techniques such as early stopping or dynamic learning-rate adjustment. Hence, 7059 wing sample points are used to tune the parameters of the network, and 372 sample points are used to test the prediction accuracy of the CNN. The Adam [45] method is used to optimize the model to approximate the underlying mapping of the input data. The initial learning rate is set to 0.0001, and the initial batch size is set to 128. The training procedure is performed on a GPU (NVIDIA RTX 3080).

Influence of CNN hyperparameters
The hyperparameters directly affect the training and predictive performance of neural networks. In this section, we investigate several CNN hyperparameters. Better hyperparameters can be found by observing the convergence performance of the loss function on the training and test datasets.

Number of convolutional layers
Four CNN models (CNN-1, CNN-2, CNN-3, and CNN-4) were set up, having 1, 2, 3, and 4 convolutional layers, respectively. The network structures and parameters are shown in Table 3. In the first convolutional layer of CNN-1, 16 indicates the number of filters, 3 × 1 indicates the convolutional filter size, and 2 × 1 indicates the pooling filter size. In the fully-connected layer of CNN-1, 3 × 128 indicates 3 layers with 128 neurons each. Without loss of generality, each CNN structure is trained 10 times to exclude the influence of random processes on the training results, and the average and standard deviation of the MSE are taken for comparison. Training is implemented for all four architectures, and the MSE convergence of the CNNs is shown in Fig. 14, which gives the training histories of the four CNN structures with a learning rate of 1 × 10^-5. Although CNN-4 converges rapidly and performs better on the training dataset, significant overfitting occurs, i.e., the MSE on the test dataset increases as the MSE on the training dataset decreases. Because it has the minimum test MSE, CNN-2 is selected as the benchmark architecture for further parameter studies.

Number of convolutional filters
The influence of the number of filters is investigated by increasing them from 8 (NCF8) to 64 (NCF64) for CNN-2's first convolutional layer and from 16 to 128 for its second convolutional layer (see Table 4). Figure 15 shows the effect of the number of filters on the MSE convergence of the model. An increase in the number of filters significantly reduces the training MSE and also accelerates the MSE convergence. However, the MSEs of NCF8 and NCF64 are slightly higher than those of NCF16 and NCF32 on the test dataset. The MSE of NCF16 is the lowest and tends to decrease continuously. Therefore, NCF16 is chosen as the next setting for the study.

Number of fully-connected layers
CNN-2 is modified to have one, three, and four fully-connected layers (CNN-2-FC1, CNN-2-FC3, and CNN-2-FC4) to investigate the effect of the number of fully-connected layers, as shown in Table 5. The MSE convergence histories are given in Fig. 16. More fully-connected layers improve the MSE convergence, and CNN-2-FC4 has the best training MSE convergence. However, CNN-2-FC3 ultimately achieves the minimum MSE on the test dataset, and its MSE continues to trend downward, outperforming CNN-2-FC4.

Learning rate
As a vital hyperparameter in CNN training, the learning rate (LR) largely determines the model's convergence efficiency and final convergence level. In general, an inappropriately high learning rate leads to model convergence failure, while an inappropriately low learning rate results in higher training time costs and the risk of falling into a local optimum. The MSE convergence histories with different learning rates (1 × 10^-2, 1 × 10^-3, 1 × 10^-4, and 1 × 10^-5) are shown in Fig. 17. These results indicate that LR0.01 and LR0.001 exhibit poor MSE convergence and severe oscillations, possibly because the optimization algorithm skips past the optimal path when the initial learning rate is large [46]. Compared with LR0.00001, LR0.0001 achieves faster convergence, but overfitting then occurs. Ultimately, we consider LR0.00001, which converges smoothly and retains a downward trend, to be the superior learning rate, despite its having the longest training time.

Batch size
A study of the effect of batch size (BS) on MSE convergence is carried out for a series of batch sizes decreasing from 2048 (BS2048) to 64 (BS64) by factors of 2. The BS mainly impacts the amount of computation: a larger BS produces faster computation per epoch but requires access to more samples to achieve the same error because there are fewer parameter updates per epoch [47]. From Fig. 18, it is observed that a smaller batch size leads to a smaller MSE on the training dataset. Because BS128 has the minimum MSE on the test dataset and retains a decreasing trend, we use a minibatch of 128 as the optimum for training.
The effect of the convolutional filter size on CNN training is also investigated. The results show that its influence is very small, so the commonly used 3 × 1 filter size is retained.

Validation of CNN prediction performance
Based on the preceding investigations, we adopt a CNN containing two convolutional layers with a 3 × 1 filter size, two pooling layers with a 2 × 1 filter size, and three fully-connected hidden layers with 128 neurons per layer as the model structure. There are 16 convolutional filters in the first convolutional layer and 32 in the second. The learning rate and batch size are set to 0.00001 and 128, respectively. For full convergence of the model, the number of epochs is set to 6000.
The relative error ε, coefficient of determination R², relative root-mean-square error (RRMSE), and relative maximum absolute error (RMAE) are employed as the metrics to validate the CNN's predictive performance, and their equations are

$$\varepsilon_i = \left| \frac{\hat{y}_i - y_i}{y_i} \right|, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$

$$\mathrm{RRMSE} = \frac{1}{\mathrm{STD}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}, \qquad \mathrm{RMAE} = \frac{\max_i |y_i - \hat{y}_i|}{\mathrm{STD}}$$

where y_i is the simulation value of the i th test sample calculated by CFD simulation, ŷ_i denotes the predicted value of the i th test sample, N is the number of test samples, ȳ is the mean of the test-sample values, and STD is their standard deviation. The model is perfectly accurate when R² = 1.0, whereas R² = 0.0 indicates an extremely poor approximation. The RRMSE reflects the model's global accuracy, and the RMAE is a criterion indicating the local prediction performance.

Figure 19 shows the MSE convergence history of the CNN training. The MSEs of both the training and test datasets decrease smoothly, with no oscillation or overfitting. Figure 20 compares predicted versus CFD simulation values for the three aerodynamic coefficients in the test dataset. Nearly all the sample points of the test dataset are clustered near the 45° line, demonstrating the CNN's reliable prediction accuracy. The predictive performance for C_L and C_M is better, while that for C_D is relatively poor. The distribution of the absolute error is shown in Fig. 21, in which the y-axis indicates the number of samples corresponding to the absolute error on the x-axis. The error distribution is approximately Gaussian. From Fig. 21, the prediction errors for C_D in the test dataset are less than 10 counts for most of the samples, and the prediction errors for C_L and C_M are less than 0.002 for most of the samples.
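Assuming RRMSE and RMAE are normalized by the test-set standard deviation, a common convention in surrogate modeling, these metrics can be sketched as:

```python
import numpy as np

def metrics(y, y_hat):
    """R^2, RRMSE, and RMAE for a test set (STD = test-set std. deviation)."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    std = np.sqrt(ss_tot / (len(y) - 1))
    r2 = 1.0 - ss_res / ss_tot
    rrmse = np.sqrt(ss_res / len(y)) / std
    rmae = np.max(np.abs(y - y_hat)) / std
    return r2, rrmse, rmae

# Perfect prediction gives R^2 = 1, RRMSE = 0, RMAE = 0
r2, rrmse, rmae = metrics([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4])
```

R² measures overall fit, RRMSE the global error level, and RMAE the worst-case local error.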
As described in Table 1, the two-dimensional convolution operation is used in most previous works to process the pixel points of airfoils. The proposed architecture (CNN_1D, shown in Fig. 3) is therefore also compared here with a two-dimensional counterpart (CNN_2D) and an MLP. The three architectures are shown in Table 6. "Conv_2D" represents a two-dimensional convolution layer, "10 × 10" denotes a 10-row, 10-column matrix containing the 100 FFD control point variables, and "9 × 1" denotes the remaining 9 variables. The "128" refers to the number of neurons in the fully-connected layer, and "16, 3 × 3, 2 × 2" denotes that the layer contains 16 convolutional filters with a 3 × 3 filter size and a 2 × 2 pooling filter size. The hyperparameters of CNN_2D are the same as those of CNN_1D. Commonly used hyperparameters, with a 0.001 learning rate, a batch size of 128, and 6000 epochs, are adopted for the MLP. Figure 22 shows the scatter plots of the absolute error and the corresponding box plots. The scatter plots at the top provide the absolute errors of the predicted force coefficients for each wing configuration and flight status, and the box plots at the bottom provide the statistics for these absolute errors. The white lines in the box plots indicate the median of these errors, the right and left edges of each box are the upper and lower quartiles, respectively, and the triangle below the box marks the average of these errors. Specifically, the largest CNN error for C_L is 0.0031, and the CNN predicts 82.80% of the sample points with an error of less than 0.001. For C_D, the largest CNN error is 17 counts. A statistical comparison is given in Table 7, from which it is noted that CNN_1D performs significantly better than MLP and slightly better than CNN_2D.
The ability of the trained MLP and CNN models to predict the aerodynamic coefficients of the baseline configuration at different flight statuses, such as Mach number and angle of attack, is also evaluated. 121 additional sample points obtained by uniform sampling are evaluated by CFD simulation. These test sample points are plotted as white spheres in Fig. 23, with Mach numbers ranging from 5 to 6 and angles of attack in the range of 0 to 5 degrees. The response surfaces predicted by the MLP and CNN are shown in Fig. 23(a), (d), and (g), from which several curves are sliced to obtain the variations of the lift, drag, and pitch moment coefficients with respect to Mach number or angle of attack. The CNN produces smoother response surfaces that conform to the flow mechanism and match the CFD simulation values better than the MLP, although the prediction error for the drag coefficient at α = 1° is relatively large. Figure 24 shows contour plots of the absolute error between the CFD simulation values and the values predicted by the MLP and CNN for the three aerodynamic coefficients. The CNN's absolute error is significantly smaller than that of the MLP. Four accuracy metrics are computed for a more detailed comparison (Table 8), from which it is observed that the CNN outperforms the MLP.

Optimization problem statement
The modeling approach proposed in this paper is now combined with a genetic algorithm (GA) operating on the well-trained CNN to optimize the five wing profiles, supporting fast aerodynamic shape optimization. Three aerodynamic shape optimization cases are performed to verify the efficiency and convenience of the modeling approach, which considers both flight status and shape variables, for multiple flight statuses and optimization objectives. The details of the three optimization cases are shown in Table 9, where t represents the maximum profile thickness. All notations with subscript "0" indicate the baseline configuration, and the maximum profile thickness is constrained by the structural requirements. An additional 5290 samples are added to the training dataset to further improve the model accuracy for the validation in this section. Scaling the flow conditions from a three-dimensional swept wing to a two-dimensional profile is not considered here because the scaling rules for small- or medium-swept subsonic or transonic wings may not apply to low-aspect-ratio, highly swept hypersonic wings. Although it is necessary to scale the flow conditions in a real-world design, it is still sensible to validate and demonstrate the applicability of the CNN-based modeling method for efficient aerodynamic shape optimization.
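The surrogate-driven optimization loop can be illustrated with a minimal real-coded GA. This is a generic sketch, not the GA configuration of the article: `ga_minimize`, the operator choices, and the quadratic stand-in objective (playing the role of the cheap CNN surrogate) are all illustrative assumptions.

```python
import numpy as np

def ga_minimize(f, n_var, bounds=(-0.1, 0.1), pop=40, gens=60, seed=0):
    """Minimal real-coded GA: tournament selection, blend crossover,
    Gaussian mutation, and elitism. f is a cheap surrogate objective."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    P = rng.uniform(lo, hi, size=(pop, n_var))
    for _ in range(gens):
        fit = np.apply_along_axis(f, 1, P)
        # Binary tournament selection
        idx = rng.integers(0, pop, size=(pop, 2))
        parents = P[np.where(fit[idx[:, 0]] < fit[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # Blend crossover between paired parents
        a = rng.uniform(size=(pop, n_var))
        children = a * parents + (1 - a) * parents[::-1]
        # Gaussian mutation, clipped to the design-variable bounds
        children += rng.normal(0.0, 0.01 * (hi - lo), size=children.shape)
        children = np.clip(children, lo, hi)
        # Elitism: carry the best individual forward unchanged
        children[0] = P[np.argmin(fit)]
        P = children
    fit = np.apply_along_axis(f, 1, P)
    return P[np.argmin(fit)], fit.min()

# Toy stand-in for the trained surrogate: minimize a quadratic "drag"
# over 10 bounded design variables (the real case would call the CNN).
x_best, f_best = ga_minimize(lambda x: np.sum(x ** 2), n_var=10)
```

Because every objective evaluation is a surrogate call rather than a CFD run, thousands of GA evaluations remain cheap, which is what enables the fast optimization demonstrated below.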
Note that the five wing profiles selected as the baseline for optimization already have good aerodynamic characteristics in the hypersonic regime, so we use an adequate set of design variables incorporating 100 FFD control points for a more refined optimization to further improve the aerodynamic performance.

Optimization results
As seen from Fig. 25 and Table 10, with the CNN providing the correct optimization direction, C_L/C_D reaches 2.8335, which is 15.93% larger than for the baseline wing, although the C_D prediction is not very accurate, resulting in relatively poor C_L/C_D predictions. Figure 26 compares the baseline and optimized geometric profiles; the maximum thickness of all profiles satisfies the constraints. The pressure coefficient contours and pressure distributions at three wing spanwise locations for the baseline and optimized wing profiles are shown in Fig. 27. A larger high-pressure region is observed on the lower surface of the optimized wing, and a slightly larger low-pressure region is observed on the upper surface.
As seen from Fig. 28 and Table 11 for optimization case 2, C_L/C_D = 5.3551, which is 6.59% larger than for the baseline wing. Figure 29 compares the five baseline and optimized wing profiles. Pressure coefficient contours and sectional pressure distributions are presented in Fig. 30.
For optimization case 3, the drag coefficient is 4.88% less than for the baseline wing (Table 12), while C_L is also improved. A comparison of the five baseline and optimized wing profiles is given in Fig. 32; the thicknesses of the five wing profiles are decreased by varying degrees. Figure 33 shows the pressure coefficient contours and the sectional pressure coefficient distributions. Table 13 shows the GA parameters and the time cost for the three wing optimization cases. It is observed that the CNN-based high-dimensional aerodynamic modeling method considering the flight status and shape variables can quickly yield better aerodynamic shapes for all three cases, i.e., different design points and different optimization problems.

Discussion
From the results of the three optimization cases, the aerodynamic coefficient prediction errors of the CNN for the optimized solutions are somewhat larger than the global error, reflecting the fact that the current database size used to model a design space containing up to 100 variables is not sufficient. Therefore, the model cannot provide very accurate global predictions but can give the correct optimization orientation. Hence, it can still demonstrate the advantages of the proposed modeling approach in high-dimensional aerodynamic modeling and the ability to achieve fast aerodynamic shape optimizations for multiple flight statuses.

Conclusions
This article proposes a CNN-based machine learning approach for high-dimensional aerodynamic data modeling to provide fast and reliable aerodynamic performance predictions. The modeling approach is demonstrated on a hypersonic wing similar to the Sanger aerospace plane carrier wing, with a 109-dimensional input incorporating both flight status and aerodynamic shape variables. The following conclusions can be drawn:
a) The MLP adapts to the augmentation of samples by expanding the network scale when dealing with high-dimensional problems, resulting in a tendency to overfit. The CNN, having the advantages of sparse connectivity and weight sharing, can alleviate this problem.
b) The network's convergence depends largely on the CNN's hyperparameters. A learning rate (LR) between 1 × 10^-5 and 1 × 10^-4 is a good choice, with a higher LR leading to oscillation (or even a failure to converge) and a lower LR leading to increased training costs. Increasing the number of convolutional layers enhances the ability to distill information, but networks that are too deep are prone to overfitting. A smaller batch size gives rise to faster MSE convergence but greater computational intensity for the same number of epochs.
c) Compared with the MLP, the CNN-based modeling method is dramatically more accurate, not only for high-dimensional modeling problems with respect to both aerodynamic shape variables and flight status variables but also for response surface prediction at different Mach numbers and angles of attack.
d) A surrogate model considering both the flight status and the shape enables fast aerodynamic shape optimization for multiple flight statuses without the need to conduct expensive CFD simulations to build additional surrogate models.
The CNN-based aerodynamic modeling approach emerges as a gradient-free, fast, and accurate tool that complements other traditional methods in aerodynamic research and provides a novel way to conduct practical design optimization. Furthermore, the targeted augmentation of sample points during modeling can further improve the predictive accuracy and is conducive to enhancing the quality of the optimization solution, which will be the subject of future research work.