Large-eddy simulation of wall-bounded turbulent flow with high-order discrete unified gas-kinetic scheme

*Correspondence: zhongcw@nwpu.edu.cn 1School of Aeronautics, Northwestern Polytechnical University, 127 West Youyi Road, 710072, Xi’an, People’s Republic of China 2National Key Laboratory of Science and Technology on Aerodynamic Design and Research, Northwestern Polytechnical University, 127 West Youyi Road, 710072, Xi’an, People’s Republic of China Abstract In this paper, we introduce the discrete Maxwellian equilibrium distribution function for incompressible flow and force term into the two-stage third-order Discrete Unified GasKinetic Scheme (DUGKS) for simulating low-speed turbulent flows. The Wall-Adapting Local Eddy-viscosity (WALE) and Vreman sub-grid models for Large-Eddy Simulations (LES) of turbulent flows are coupled within the present framework. Meanwhile, the implicit LES are also presented to verify the effect of LES models. A parallel implementation strategy for the present framework is developed, and three canonical wall-bounded turbulent flow cases are investigated, including the fully developed turbulent channel flow at a friction Reynolds number (Re) about 180, the turbulent plane Couette flow at a friction Re number about 93 and lid-driven cubical cavity flow at a Re number of 12000. The turbulence statistics, including mean velocity, the r.m.s. fluctuations velocity, Reynolds stress, etc. are computed by the present approach. Their predictions match precisely with each other, and they are both in reasonable agreement with the benchmark data of DNS. Especially, the predicted flow physics of three-dimensional lid-driven cavity flow are consistent with the description from abundant literature. The present numerical results verify that the present two-stage third-order DUGKS-based LES method is capable for simulating inhomogeneous wall-bounded turbulent flows and getting reliable results with relatively coarse grids.


Introduction
Turbulent flow is one of the most common and complicated problems in fluid dynamics. In the study of turbulent flows, they are usually formulated in terms of the Navier-Stokes (NS) equations for the macroscopic variables that are functions of position and time. Based on NS equations, many pioneers have successively proposed and applied the direct numerical simulation (DNS) [1,2] approaches for numerical simulation of turbulent flow. Theoretically, DNS is designed to resolve all the scales of turbulent motion using very fine grids, which is unrivalled in accuracy and in the level of description provided [3], but the computational costs are very high, especially for high-Reynolds number turbulent flows. In order to balance the application and research requirements with the computational costs, the Reynolds averaged Navier-Stokes (RANS) methods [3,4] have been developed and applied on unresolved grids. The RANS method solves the time-averaged NS equations and the effect of the unsteady turbulent motions on the mean flow-field is approximated by turbulence models. Another alternative method, namely, the largeeddy simulation (LES) method [5][6][7] has also attracted a vast amount of attention of the engineering and scientific communities. The LES directly computes the large-scale turbulent structures and establishes a sub-grid scale (SGS) model to capture the effects of the smaller unresolved scales. Over the past several decades, the LES has become an indispensable fundamental research and engineering tool in the prediction and analysis of unsteady, multi-scale, and multi-physics turbulent flows [8,9].
The conventional approach for analyzing continuum physics is that uses a coarsegrained model in terms of macroscopic variables [10]. In recent years, some kinetic schemes based on the mesoscopic models have been utilized to simulate turbulent flows via combining with large-eddy simulation methods. Among them, because of the simplicity in formulation and multi-functionality, the lattice Boltzmann equation (LBE) methods have been widely and successfully used in the numerical simulation of turbulence. At present, the majority of the existing LBM-LES methods introduce the effect of eddy viscosity from SGS models into the relaxation time [11][12][13]. Hou et al. [14] firstly reported a simple large-eddy simulation based on LBM for two-dimensional turbulent flow. Premnath et al. [11] implemented both the standard Smagorinsky model [5] modified by using the van Driest wall-damping function and the dynamic procedure [15] to simulate a wall bounded turbulent flow. Recently, the wall-adapting local eddy-viscosity (WALE) model [16] and Vreman model [17] were reported within the lattice Boltzmann framework, such as multiple-relaxation time (MRT) [11,12] and filter-matrix lattice Boltzmann (FMLB) [13] for simulating the fully developed turbulent channel flows. Besides, the gas-kinetic scheme (GKS) [18,19] and implicit high-order GKS [20] for solving NS equations have also been extended to turbulent large-eddy simulations.
Recently, a discrete unified gas kinetic scheme (DUGKS) [21,22] in a finite-volume framework has been developed, keeping the advantages of LBM, moreover, possessing a flexible mesh adaptation like unified gas kinetic scheme (UGKS) [23]. The characteristic difference solution of Boltzmann model equation, which couples the molecular advection and collision in numerical flux construction, makes the DUGKS capable of simulating the whole flow region from free molecular flow to continuum flow. In recent years, benefiting from the simulation capability of whole flow region, DUGKS has attracted a vast amount of attention from the CFD communities in many scientific domains. Zhu et al. [24] extended the DUGKS to unstructured meshes. Wu et al. [25] proposed a DUGKS method considering the external force. An implicit DUGKS [26] for simulating of steady flow in all flow regions was constructed with an implicit macroscopic prediction technique [27]. A simplified DUGKS was proposed recently [28]. Up to now, DUGKS has been applied to many fields, such as compressible flow [29,30], multi-phase flow [31,32], multicomponent flow [33], complex motion (with immerse boundary method) [34], radiative transfer [35], etc.
DUGKS has also been used in direct numerical simulation of simple incompressible turbulent flows [36,37], and another potential application of DUGKS is the large-eddy simulation of turbulent flows. Compared with the LBE method, the merits of DUGKS in turbulent simulation are better numerical stability and its support for non-uniform and unstructured grids [24]. However, there are still some issues to be improved. For example, the DUGKS has a relatively large numerical dissipation compared with the LBE method [36], which is one of the reasons why the high-order DUGKS is needed for large-eddy simulation of turbulent flows. In addition, the LES for turbulent flows generally requires higher-order (greater than second-order) numerical methods to minimize the numerical error, because the discretization error is dominant at all wave numbers for the secondorder accurate scheme, whereas the sub-grid stress is dominant at the low frequencies for the high-order accurate scheme [8]. Recently, Wu et al. [38], inspired by two-stage fourth-order time accurate discretization (TFTD) method [39,40], proposed a two-stage third-order DUGKS for low-speed isothermal flow by reducing the requirement of time accuracy. In this paper, for simulating low-speed incompressible flows, the incompressible discrete Maxwellian equilibrium distribution function [41,42] is introduced to reduce the compressibility error. For the channel flow driven by external force, the external force term is introduced into the high-order DUGKS. In addition, the distribution function will be approximated by the Taylor expansion with fourth-order accuracy.
The present work mainly focuses on combining the two-stage third-order DUGKS with the WALE model [16] and Vreman model [17] for large-eddy simulation of wall-bounded turbulent flows. The fully developed turbulent channel flow with a friction Reynolds number Re τ = 180, the turbulent plane Couette flow with a friction Reynolds number Re τ = 93 and the three-dimensional lid-driven cubical cavity flow with Reynolds number Re = 12000 are investigated to validate the capability and accuracy of the present high-order DUGKS (with D3Q19) for turbulent simulations. The rest of this paper is organized as follows. In Section 2, the basic algorithm of two-stage third-order DUGKS with a force term is described, meanwhile, the key elements of DUGKS-based LES method and computation procedure are also given in Section 2. In Section 3, three wall-bounded turbulent flows, respectively, the fully developed turbulent channel flow, the turbulent plane Couette flow and three-dimensional lid-driven cubical cavity flow are performed and discussed. Finally, conclusions and future outlook are given in Section 4.

Two-stage third-order temporal discretization
The Boltzmann-BGK equation with a force term can be written as: where L f , f , S f are transport term, BGK collision operator [43] and external force term, respectively, which can be expressed as follows: where f = f (x, ξ , t) is the distribution function which is defined as the density of particles at time t and in phase space (x, ξ ). Here, x is the physical space coordinates, and ξ is the particle velocity. In Eq. (3), τ = μ/p is the relaxation time where μ is the dynamic viscosity and p is the pressure, and f eq is the Maxwellian equilibrium distribution function defined in the following form: where R is the specific gas constant, D is the spatial dimension, ρ is the density, u is the macroscopic velocity of fluid, and T is the temperature. Two conservative flow variables, namely, the mass and momentum, are considered in the low-speed isothermal flows and defined as the moments of distribution function as follows: where ψ (ξ ) = (1, ξ ) T is the collision invariants. In Eq. (4), F = F (x, t) is external force per unit volume. Under the assumption of nearly incompressible low-speed isothermal flows, the external force term S f can be approximated as [44,45]: where H = F·(ξ −u) ρRT . In addition, the approximation in Eq. (7) can also be derived from Hermite expansion [37,45] for low Mach number flow.
For convenience in formula derivations, a new source term Q f is defined as the summation of BGK collision operator and external force term: Since the collision operator conserves mass and momentum, it is easy to verify that the source term Q f satisfies the following conservative law: Consequently, the Boltzmann-BGK equation with the force term can be rewritten as: The form of Eq. (10) is in accord with the Boltzmann-BGK equation without the force term which is used in the original two-stage third-order DUGKS [38], while the only difference is that the original collision operator f is replaced by a new source term Q f including collision operator and force term.
The present third-order DUGKS algorithm with the force term employs a finite-volume formulation. The flow domain is divided into a set of control volumes V j centered at x j . Then integrating Eq. (10) in control volume V j from time t n to t n + t with third order time discretization method for transport term and source term same [38]. But the difference is that an error term O (τ t) will be included due to the approximation of force term in Eq. (7), which has no effect on the flow without external force term. Finally, the evolution equation Eq. (11) with third-order temporal accuracy can be obtained as follows: where f j and Q j are the cell-averaged values of the distribution function and source term, e.g., and L n+1/6 j and L n+3/4 j are the micro-fluxes across the interface of control volumes V j : where V j and ∂V j are the volume and surface of control volumes V j , n is the outward unit normal vector on the surface, and x cf is the coordinates of point on this control surface (If the control surface is in discrete form, x cf is defined as the center of a discrete control surface). The time step t is determined by the Courant-Friedrichs-Lewy (CFL) condition and stability condition t/τ ≤ 12 [38]: where C is the CFL number, U 0 is the characteristic velocity of flow, and x min is the minimal grid spacing. Like the original DUGKS [21,38], in order to remove the implicit treatment of the source term Q n+1 j in Eq. (11), a new auxiliary distribution function is introduced: where the distribution function f can be expressed as follows from Eq. (17).
Since the source term conserves mass and momentum, the following explicit evolution equation of the distribution functionf is solved instead of the original implicit one: and the conservative variables can be calculated from the auxiliary distribution function f n+1 as follows: According to Eq. (19), it is necessary to evaluate source term Q n+1/3 j and micro-fluxes L n+1/6 j and L n+3/4 j in order to update the auxiliary distribution function. The calculation of the micro-fluxes will be described in the next section, and the second-order DUGKS method is used to update the source term Q n+1/3 j . Integrating Eq. (10) in control volume V j from t n to t n + t/3 with midpoint rule for the convection term and trapezoidal rule for the source term, it can be obtained that, The following new set of auxiliary distribution functions are also introduced to remove the implicit terms in Eq. (21): Then, Eq. (21) can be rewritten as: The relations between auxiliary distribution functionsf ,f + and distribution functions f,f are as follows: The hydrodynamic variables density and fluid macroscopic velocity can be solved from the following equation (Eq. (28)), once the auxiliary distribution functionsf at time t n + t/3 is obtained.
meanwhile, Q n+1/3 j can also be obtained as follows:

Flux evaluation
According to Eqs. (14) and (15), the key point in solving micro-fluxes L n+1/6 j and L n+3/4 j is calculating the distribution function f x cf , ξ , t n + t/6 and f x cf , ξ , t n + 3 t/4 on the control surface. First, the Boltzmann-BGK equation in Eq. (10) is integrated within a half time step h along the characteristic line x + ξ t whose end point x cf is located at the cell interface, which leads to the following characteristic line solution: where t * = t n , h = t/6 and t * = t n + t/3 , h = 5 t/12 can be chosen for calculating f x cf , ξ , t n + t/6 and f x cf , ξ , t n + 3 t/4 , respectively. Similar to the treatment in Eq. (11), another auxiliary distribution functionf is introduced as follows to remove the implicit term in Eq. (30): Then Eq. (30) can be expressed as, where the relationship between f andf is: The distribution functionf + x cf − ξ h, ξ , t * is approximated by second-order Taylor expansion in original third-order DUGKS. In the present work, the distribution functionf + x cf − ξ h, ξ , t * is approximated by the following third-order Taylor expansion to improve the spatial accuracy: where the coefficients are (x c , y c , z c ) and x cf , y cf , z cf are the central points of cell and cell-interface, respectively. V c is the cell whose center is located at (x c , y c , z c ). All derivatives in Eq. (35) are calculated by the central difference method, where the p-order accuracy of central schemes used to approximate q-order derivatives should satisfy the relation p + q ≥ 4. The conservative variables at the control surfaces can be obtained by calculating the moments of the auxiliary distribution functionf with Eq. (36). Once the conservative variables at control surface are known, the equilibrium distribution function f eq x cf , ξ , t * + h can be obtained directly.
Therefore, using Eqs. (35), (33), (36) and (34) in turn, the original distribution functions f x cf , ξ , t n + t/6 and f x cf , ξ , t n + 3 t/4 can be obtained. With the determination of these original distribution functions, the micro-fluxes L n+1/6 j and L n+3/4 j can be fully evaluated according to Eqs. (14) and (15). So far, the two-stage third-order DUGKS with external force is completely established, and the only remaining issue is the velocity space discretization.

Discrete velocity space
In the present two-stage third-order DUGKS, the Maxwellian equilibrium distribution f eq is approximated by its second-order Taylor (or Hermit) expansion for the Ma 1. By discretizing the velocity space and employing the Gauss-Hermite quadrature, the discrete equilibrium distribution function can be obtained, where W α are the weight coefficients corresponding to particle discrete velocity ξ α . For the incompressible flow, the slight compressible effect will cause small numerical error in discrete Maxwellian distribution Eq. (37). To reduce this numerical error, the density can be approximated as ρ = ρ 0 + ρ [41,42]. Then, the following new type of equilibrium distribution function can be adopted in two-stage third-order DUGKS.
Since only the nearly incompressible turbulent flows at constant temperature are considered in this paper, the following efficient D3Q19 model in Eq. (39) for discrete velocity space, which is originally used in the LBM method, can be adopted: where the associated weights are

Sub-grid scale model
Many SGS models have been proposed with the rapid development of LES method, most of them model the sub-grid scale tensor based on an eddy-viscosity assumption as follows [3]: where τ ij = u i u j −ū iūj is the residual-stress tensor, ν t is the eddy viscosity, andS ij is the symmetric part of the resolved velocity gradient. In the present high-order DUGKS, the effective viscosity can be expressed as the sum of molecular viscosity and eddy viscosity, therefore the effective relaxation time τ e can be expressed as: The WALE model [16] and Vreman model [17] will be used to solve the near-wall region problem in the present work, which offers a global coefficient SGS model that is afford-able, easy to implement, and applicable to fully inhomogeneous flows. The turbulent eddy viscosity in the WALE model is calculated by the following equations: with the model constant C w = 0.5 ∼ 0.6. TheS ij and¯ ij are respective symmetric part and anti-symmetric part of the resolved velocity gradient: In the Vreman model, the turbulent eddy viscosity is given by another template as follows: where the symbol In the present simulations, the WALE model employs model coefficient C w = 0.50, and the Vreman model simulation employs C v = 0.07 [16,17]. Moreover, the first-order derivatives of the velocity field in these models are calculated by a fourth-order central difference in the fully fluid cells, and the three-point scheme is used for the cells near wall boundary.

Computational procedure
In this section, the numerical procedure for one time step evolution of two-stage thirdorder DUGKS with force term is summarized as follows: Step1: Compute the microscopic flux L n+1/6 j across the surface of control volumes from time t n to t n + t/3, and the half time step is set as h = t/6; a) Solve the eddy viscosity ν t at time t n by WALE or Vreman models, and the effective relaxation time can be calculated by Eq. (42); b) Solve the auxiliary distribution functionf +,n j according to Eq. (46) and its spatial derivatives; c) Calculate the auxiliary distribution functionf x cf , ξ , t n + h with Eqs.

The fully developed turbulent channel flow
The fully developed turbulent channel flow is a canonical wall-bounded turbulent flow, which has been simulated by many pioneers [1,6,46,47] with macroscopic CFD methods. In addition, this case has also been simulated by several kinetic schemes based on mesoscopic models [11][12][13]37]. The focus of this study is validating the capability and accuracy of the present high-order DUGKS-based LES method for simulating the wall-bounded turbulent flows.
The parameters settings for the turbulent channel flow are as follows. As shown in Fig. 1, the size of the non-dimensional domain is (3πh, 2h, 4πh/3) in the streamwise (X coordinate), wall normal (Y coordinate) and spanwise (Z coordinate) directions, respectively. Re τ = u τ h/ν = 180 with a kinematic viscosity of ν = 4.5 × 10 −5 is the Re number based on the friction velocity u τ = √ τ w /ρ and channel half-height h, where τ w is the wall shear stress. In the simulation the channel half-height is set as h = 1. Meanwhile, the characteristic viscous length scale is defined as δ h = ν/u τ , and the characteristic time scale is defined as t c = h/u τ . The computational domain is divided into 108 × 96 × 108 in X, Y and Z directions, respectively, where the uniform grids are used in streamwise and spanwise directions. In order to resolve the wall layer, a non-uniform distribution of grid points (Fig. 2, the parameter α = 3.8) defined by Eq. (48) is used in the wall-normal direction.
where y 0 and y N are coordinates of the starting and ending points, respectively. The parameter α is a constant that determines the distribution of the grid, and a large value of α leads to a dense distribution of the grid near the walls.
In the present simulations, periodic boundaries are implemented in both the streamwise and spanwise directions, and the bounce-back treatment for distribution function [21] is applied to the solid walls to recover the macroscopic no-slip boundary condition. The initial density field is set to ρ 0 = 1, and the initial velocity is given by a laminar (Poiseuille) parabolic profile with divergence free fluctuating velocity satisfying wall boundary conditions. In the present simulations, the fluctuating velocity field is given by Eq. (49), wherex = 2πx/L x ,ỹ = 2πy/L y ,z = 2πz/L z , u c ≈ 18.20u τ = 0.14742 [1] is the mean velocity in the channel center and β = 0.1 is a constant that determines the magnitude of fluctuating velocity. In addition, the uniform pressure gradient in the streamwise direction is implemented as a driving force F x . Meanwhile, the force term will be adjusted at each time step [48] in order to provide the desired mass-flux as, where the initial force term is F 0 x = −dp/dx = ρ 0 u 2 τ /h. Theṁ n is the average mass-flux at time level n, andṁ 0 = 2hρ 0 u m is the initial mass-flux, where the u m = 1 2h u x,z dy is the bulk mean velocity calculated by u m /u τ = 15.63 [1]. The average mass-flux is given bẏ where ... x,z denotes an average in the X and Z directions. The simulation runs until stationary turbulence statistics are obtained. The initial run is carried out for approximately 55t c to obtain a statistically stationary state. Then, an additional run is carried out for 24t c to gather the average data. The averaging of flow quantities is performed in time as well as in space in the homogeneous directions (streamwise and spanwise directions). Figure 3 illustrates the distribution of the streamwise mean velocity along the wall normal direction with wall-layer scaling law. Besides, the result of direct numerical simulation from Moser et al. [46] is plotted as well. It can be found that the present computed mean velocity profiles satisfy u + = y + in the viscous sublayer region (y + ≤ 5) and logarithm law (y + ≥ 30), i.e., u + = A ln y + + B, where the values of A = 2.5 and B = 5.5 are known to be reasonably accurate for flow over smooth walls at Re τ ≈ 180 [1,3]. The results of the implicit LES and two models almost coincide with each other. Both of them over-predict the mean velocity profile in the log-law region by approximately 3.5%. Such differences are likely due to the error of wall shear stress calculation. The computed friction velocity u τ = 0.0077 ∼ 0.0078 is smaller than setting value that u τ = 0.0081. In addition, they are also influenced by the numerical dissipation of the computational approach. The influence of numerical scheme is dominant for mean velocity.
The profiles of root-mean-square (r.m.s.) velocity fluctuations are illustrated in Fig. 4a, b and c. The computed profiles match well with each other, and they are both in reasonable agreement with the results of DNS [46]. In the buffer layer (15 ≤ y + ≤ 60), the present results in the streamwise and spanwise slightly deviate from the reference data. Meanwhile, the r.m.s. velocity results in the normal direction show larger deviations than those in streamwise and spanwise. The Reynolds stress normalized by the wall-shear stress is Fig. 3 The computed mean velocity profile with wall-layer scaling laws in inner wall coordinates for the fully developed turbulent channel flow presented in Fig. 4d. The computed Reynolds stress profiles also coincide with each other, and follow the results of DNS [46] fairly closely with about 4% difference in the buffer regions. Figure 5 shows the computed r.m.s. pressure fluctuations, which are normalized by the wall shear stress and qualitatively consistent with the results of DNS [46]. However, there are also evident differences between the current results and reference data. The computed maximum pressure fluctuations have a maximum value of 1.75 at y + ≈ 34 for two models and 1.72 at y + ≈ 34 for implicit LES, which is close to the value of 1.75 at y + ≈ 30 [1] and  1.88 at y + ≈ 30 [46], respectively. In addition, the present pressure fluctuation is about 1.46 for two models and 1.44 for implicit LES at the wall boundary. It is smaller than prior data in Refs. [1,46] whose values are about 1.50 and 1.53, respectively. As discussed in Refs. [11,12], such difference could conceivably be due to compressibility effects and filter effects.
The fluctuations of r.m.s. vorticity components are illustrated in Fig. 6. Being same with the above analysis, the present results of LES are almost the same, and all of them are qualitatively consistent with the reference data. Because of the inhomogeneity and anisotropy of turbulence closer to the wall, it is found that there are evident differences among the components of vorticity. In addition, as expected, all the components of vorticity away from the wall tend to be consistent value.
The ratio of eddy viscosity ν t to kinematic viscosity ν in the near wall region is shown in Fig. 7. It can be found that the eddy viscosity keeps an almost constant value in the log-law region, and dramatically decreases near the wall. The eddy viscosity from WALE model drops faster than that from Vreman model as a result of the WALE model reproduces the proper scaling (ν t = O(y 3 )) at the wall, while Vreman model reproduces ν t = O(y) [49].

The turbulent plane Couette flow
The study on turbulent plane Couette flow [50,51] was relatively insufficient compared with the fully developed turbulent channel flow. One of the reasons is the existence of very-large-scale motions in Couette flow [51,52] (especially for large Re number) that is particularly challenging to implement DNS, therefore LES may be a good alternative.
In this sub-section, a turbulent plane Couette flow at low friction Re number (Re τ = 93) will be investigated. The flow domain (4πh in the streamwise, 2h in the wall normal and 2πh in the spanwise directions) and coordinate system are shown in Fig. 8. The half distance between the walls is set as h = 1. Meanwhile, the characteristic viscous length scale is defined as δ h = ν/u τ , and the characteristic time scale is defined as t c = h/u τ , where the ν = 4.95 × 10 −5 is kinematic viscosity and u τ is the friction velocity. The computational domain is divided into 108 × 96 × 108 in the streamwise (X coordinate), wall normal (Y coordinate) and spanwise (Z coordinate) directions, respectively, where the uniform grids are used in streamwise and spanwise directions. In order to resolve the wall layer, a non-uniform grid (Fig. 9) generated by Eq. (48) with parameter a = 3.5 is The streamwise mean velocity profiles along the wall normal direction are shown in Fig. 10 scaled in outer (Fig. 10a) and inner (Fig. 10b) units. The u + = y + in the viscous sublayer (y + ≤ 5) and logarithm law u + = ln y + /0.41+5.1 in the near-wall log-law region (y + ≥ 30) are also plotted on Fig. 10b. The present profiles, computed with two models and without using SGS model, agree well with the DNS results [51], where the maximum relative error in log-law region is 3.96% for implicit LES, 2.3% for Vreman model and 2.0% for WALE model.
The r.m.s. velocity fluctuations are illustrated in Fig. 11a, b and c. The r.m.s. velocity profiles from the Vreman and WALE models match well with each other, and they are in accord with the results of implicit LES. The present r.m.s. velocity fluctuations are both in reasonable agreement with the DNS results by Lee and Moser [51]. In the region of 15 ≤ y + ≤ 50, the v rms and w rms slightly deviate from the reference result, while u rms slightly deviates from the DNS data in the region of y + ≥ 40.
The Reynolds stress is also presented in Fig. 11d. Different from the turbulent channel flow, in the plane Couette flow, the Reynolds stress increases monotonically from the wall to an almost constant level in the center region. The present profiles by Vreman and WALE models are consistent with the result computed by direct numerical simulations [48], export for some slight deviations near the buffer layer and log-law region. These deviations may be due to the following reasons. Firstly, the grids in the streamwise and spanwise directions are relatively coarser than wall normal direction. Secondly, owing to great computational cost, the computational domain size (the streamwise and spanwise directions) for the present work is not long enough to cover the extent of the Fig. 9 The grid on X-Y plane for the turbulent plane Couette flow at Re τ = 93 large-scale motion in Couette flow [50,51]. It should be noted that the selected computational domain is sufficient for the present research purpose, and capturing and studying all the very-large-scale motions is out of scope for the current study. It can be found that the Reynolds stress computed without using SGS model obviously deviates from the DNS data in the log-law region. We think it needs a longer time to reach the statistical steady state compared with the Vreman and WALE models. A longer calculation time may improve the results for implicit LES.
The computed r.m.s. pressure fluctuations normalized by the wall shear stress are illustrated in Fig. 12, which are reasonably consistent with the DNS results [51]. The r.m.s. pressure fluctuation at the wall boundary is about 2.30 for implicit LES, 2.41 for Vreman model and 2.44 for WALE model, which are close to the prior data based DNS in Ref.  [51] whose value is about 2.37. The r.m.s. pressure fluctuations predicted by Vreman and WALE models are better than implicit LES. Figure 13 shows the r.m.s. of vorticity fluctuations components. Being same with the turbulent channel flow, the present results are almost the same, and all of them are qualitatively consistent with the DNS results [51].
The ratio of eddy viscosity ν t to kinematic viscosity ν in the near wall region is shown in Fig. 14. The trend of ν t /ν for Couette flow is consistent with that of channel flow, while the ν t /ν for Couette flow is smaller than channel flow.

Three-dimensional lid-driven cubical cavity flow
The two-dimensional lid-driven cavity flow is a classic benchmark case for testing new numerical methods and studying additional physical effects of flow phenomenon [53]. The three-dimensional cavity flow has also attracted a vast amount of attention and has  been studied both experimentally [54] and numerically [11,[55][56][57]. This section focuses on relatively high Reynolds number (Re = 12000) lid-driven cubical cavity flow. Generally speaking, when the Reynolds number based on the lid-driven velocity and cavity length is less than 2000, the flow is laminar. At Reynolds number higher than the critical value between 2000 and 3000, an instability appears in the vicinity of the downstream-cornereddy [58]. Although the critical Reynolds number is not clear, with the further increase of Re, there is turbulence over some part of the flow field. At Reynolds number higher than 10000, the flow near the downstream-corner-eddy becomes fully turbulent.
The lid-driven cubical cavity flow has a simple geometry, while it contains complex flow phenomena [55]. In addition, there is not any homogeneous flow direction, and the presence of side walls affect the full flow patterns. Due to the complex flow phenomena encountered within such a system at higher Re, it becomes a very challenging case. Especially, it requires accurate time and space discrete schemes to obtain results with long averaging times. Consequently, the present high-order DUGKS-based LES method will be used to study this problem, and its numerical capability is testified.
The fluid enclosed in the cavity is assumed to be nearly incompressible with uniform density and temperature. The flow domain of the lid-driven cubical cavity of length 2h is shown in Fig. 15a with coordinate system. The flow is driven by the top lid that moves in the X-direction with maximum velocity Uw = 0.15 to keep the Ma = Uw/ √ 3RT 1. Consequently, the Reynolds number used in our computation is Re = (2hU w ) /ν = 12000 with a viscosity of ν = 2.5 × 10 −5 . In order to obtain an efficient near-wall resolution, the stretching smooth grid of 64×64×64 in Fig. 15b is adopted. The grid points are generated by Eq. (48) with the parameter a = 3 in all directions and refined grids are used near the wall boundary.
In the present simulations, the time step is chosen as 8.4 × 10 −4 . The statistically stationary state of the flow field is achieved after running for 510t c , where the characteristic time is t c = (2h) /U w . In addition, the computed turbulence statistics, including mean velocity, Reynolds stress profiles, and r.m.s. velocity fluctuations are obtained by a long time averaging about 600t c .
As far as the constant velocity is imposed on the lid, it will lead to severe discontinuities along the top edges. Moreover, the presence of such discontinuities may undermine the accuracy and stability of numerical simulations [55,56]. In order to remove these weaknesses, a high degree polynomial velocity distribution as Eq. (52) on the lid is adopted according to the recommendation of Leriche and Gavrilakis [55].
where the moving boundary is at y = 2h (top wall) and Uw is the maximum velocity. This profile in Eq. (52) makes the velocity zero value at the intersection between the top wall and other walls, meanwhile, the lid-driven velocity grows rapidly to a constant value over a short distance away from side walls. The mean velocity profiles along the horizontal (X-direction) and vertical (Y-direction) symmetry axes are shown in Fig. 16. The predictions by both WALE and Vreman models are in good agreement with DNS data of Leriche and Gavrilakis [55] and LES predictions by Shetty et al. [57] with dynamic Vreman model, except that the v profile along X direction (Fig. 16a) is smaller than DNS data at the negative peak. Besides, the results of implicit LES are also shown in Fig. 16, which are also in accord with the reference data. As discussed in Ref. [55], the narrow peak near the downstream wall indicates that the flow in these regions is laminar but unsteady. It is also obvious from the velocity profiles that the flow parallel to the downstream wall is similar to a wall jet. As shown in Fig. 16b, it also contains a high average velocity gradient in the thin high momentum layer near the lid. The kinetic energy is injected into the flow by the shear stress at the top lid. The present prediction of the distance that the mean velocity u falling by half along the normal direction at the center of the lid is about 0.86% for implicit LES, 0.87% for Vreman model and WALE model of the separation between the cavity walls, which is in agreement with 0.8% [55].
The contour plots of the mean velocity component u and v in the mid-plane Z = 1 are shown in Fig. 17. The present results correctly capture all secondary corner eddies next to the upstream wall and located near the upper wall or bottom wall in the mean flow field. Furthermore, there is almost no difference between the two SGS models and without SGS model in prediction of the mean flow structure. The most dominant feature of the mean flow is the large-scale recirculation which spans the cavity [55]. As shown in Fig. 18, the flow in the thin high-momentum layer next to the lid changes its direction around the upper corner near the downstream. Then, the flow develops into an unsteady wall jet whose thickness varies along the downstream wall, and breaks up into two approximately elliptical free jets (also see Fig. 19). The prediction of the plane wall jet separates from the downstream wall at distances y (from the bottom wall) of about 0.5h − 0.6h, which is in line with the DNS data [55]. As shown in Fig. 19, the flow pattern is that: the flow structures (especially the wall jets) on both sides of the symmetry plane (Z = 1) are similar.
The profiles of r.m.s. velocity fluctuations along two lines (x, 1, 1) and (1, y, 1) are shown in Fig. 20. The present results show that the r.m.s. fluctuations along the line (x, 1, 1) ( Fig. 20a and b) and the line (1, y, 1) ( Fig. 20c and d) are in good agreement with the DNS results [55], except for the slight deviations at the peak. As for the comparison with LES data [57] provided by fifth-order WENO scheme and dynamic Vreman model under the same number of grids, the present results are better in most regions. The components of the Reynolds stress u v on these two central lines are also provided in Fig. 21. They are both consistent to the trend of the DNS data, while their maximums tend to be slightly underestimated, whereas their minimums are somewhat overestimated. In addition, it can be found that the mean velocity field predicted by the two models and implicit LES are almost the same, however, there are some slight differences in the r.m.s. velocity fluctuations and Reynolds stress.
The contours of the ν t /ν computed by two models at the mid-plane Z = 1 are shown in Fig. 22. The present results of both WALE and Vreman models are almost same, while the ν t computed by WALE model is about one order of magnitude larger than that computed by Vreman model. The eddy viscosity ν t near the downstream-corner between with top lid and bottom wall is larger than other regions, which indeed shows that the flow near the downstream-corner-eddy is fully turbulent.
The present results explicitly indicate that turbulence is generated on cavity walls. It is obvious that the turbulent fluctuations are about one order of magnitude larger near the downstream wall than near the upstream wall. Moreover, the fluctuations on the bottom wall are the largest. In the region near the downstream wall where the wall jet separated into two elliptical jets, the high gradients of velocity fluctuations are also well reproduced. In addition, the Reynolds stress below the top lid is small that can be ignored, which indeed shows that the flow under the lid is mainly laminar but transient. These physics seem to be consistent with the descriptions in the abundant literature which studies this problem [55,56].

Summary
In the present work, both the incompressible discrete Maxwellian equilibrium distribution function and the external force are introduced into the two-stage third-order DUGKS for simulating low-speed turbulent flows. In addition, a parallel code with three-dimensional domain decomposition for large-eddy simulation of turbulent flows is developed. Finally, three typical low Reynolds number wall-bounded turbulent flows, the fully developed turbulent channel flow, the turbulent plane Couette flow and lid-driven cubical cavity flow are investigated using the present high-order DUGKSbased LES method (based on D3Q19 model). The implicit LES and two SGS models, The turbulence statistics, including mean velocity, the r.m.s. fluctuations velocity, Reynolds stress, et al., are compared with the results of DNS and LES given in the previous literature. The present results of two SGS models and implicit LES are in reasonable agreement with the results of DNS. It is no surprise that the results from the Vreman and WALE models are in perfect agreement with each other. Besides, the results of implicit LES are almost the same as the explicit LES with Vreman model and WALE model, which indicate that the influence of numerical scheme is dominating, rather than the subgrid scale model in the present simulation. In order to resolve small scale eddies near solid walls, it usually needs the same amount of grid as DNS. As for lid-driven cubical cavity flow, the mean velocity field predicted by the two models and implicit LES are almost the same, while there are some slight differences in the r.m.s. velocity fluctuations and Reynolds stress. In general, the present computed mean velocity field is in good agreement with the results of DNS. Although the results of the r.m.s. fluctuations of velocity field and Reynolds stress slightly deviate from the DNS data, the flow physics are consistent with the descriptions of the abundant literature.
In summary, the LES for wall-bounded turbulent flows using high-order DUGKS shows that its results are reasonably accurate. It clearly proves that the present high-order DUGKS has the potential to be used as a LES tool for simulating turbulent flows. However, we should also note that the flow Reynolds number of the current examples is not too high, and large-eddy simulations of turbulent flow at extremely high flow Reynolds numbers are needed to further verify the capabilities and accuracy of the present method in the recent future.