$2\odot 2=4$: Temporal-Spatial Coupling and Beyond in Computational Fluid Dynamics (CFD)

With increasing engineering demands, there need high order accurate schemes embedded with precise physical information in order to capture delicate small scale structures and strong waves with correct"physics". There are two families of high order methods: One is the method of line, relying on the Runge-Kutta (R-K) time-stepping. The building block is the Riemann solution labeled as the solution element"1". Each step in R-K just has first order accuracy. In order to derive a fourth order accuracy scheme in time, one needs four stages labeled as"$1\odot 1\odot 1\odot 1=4$". The other is the one-stage Lax-Wendroff (L-W) type method, which is more compact but is complicated to design numerical fluxes and hard to use when applied to highly nonlinear problems. In recent years, the pair of solution element and dynamics, labeled as"$2$", are taken as the building black. The direct adoption of the dynamics implies the inherent temporal-spatial coupling. With this type of building blocks, a family of two-stage fourth order accurate schemes, labeled as"$2\odot 2=4$", are designed for the computation of compressible fluid flows. The resulting schemes are compact, robust and efficient. This paper contributes to elucidate how and why high order accurate schemes should be so designed. To some extent, the"$2\odot 2=4$"algorithm extracts the advantages of the method of line and one-stage L-W method. As a core part, the pair"$2$"is expounded and L-W solver is revisited. The generalized Riemann problem (GRP) solver, as the discontinuous and nonlinear version of L-W flow solver, and the gas kinetic scheme (GKS) solver, the microscopic L-W solver, are all reviewed. The compact Hermite-type data reconstruction and high order approximation of boundary conditions are proposed. Besides, the computational performance and prospective discussions are presented.


Introduction
In the simulation of compressible fluid flows or related problems, there are two families of commonly-used high order accurate numerical schemes: One is the family of methods of line, for which the fluid dynamical system is written in semi-discrete form and the Runge-Kutta (RK) temporal iteration is employed for the temporal discretization, such as RK-WENO [26], RK-DG [17] and their variants.The building blocks comprises of the solution element, the associated Riemann solution, which is labeled as "1" in order to be in contrast with the Lax-Wendroff (LW) type flow solvers.The fourth order RK temporal iteration is labeled as "1 ⊙ 1 ⊙ 1 ⊙ 1 = 4.This family of schemes have very favorable properties such as simplicity in time-stepping for complex engineering problems.The limitation is also obvious such as compactness, efficiency and fidelity.The other is the family of one-stage LW type methods, the numerical realization of Cauchy-Kowalevski (CK) approach [20] for the corresponding partial differential equations.This family of methods have the strong temporal-spatial coupling property, leading to very compact numerical schemes.However, when applied to high nonlinear problems, the complex construction of numerical fluxes hampers to develop high (more than two) order accurate schemes.Particularly, as strong waves (discontinuities ) are present in flows (solutions), the CK procedure loses its physical and mathematical meanings, exhibiting the instability of the resulting schemes near discontinuities.
Careful inspection of these two families of methods motivates to combine the merits of both methods: The simplicity of multi-stage RK methods and the temporal-spatial coupling of LW type methods.This straightforward combination immediately yields a two-stage fourth order accurate temporal discretization for the LW type flow solvers [32], which is labeled as "2 ⊙ 2 = 4".Here "4" just represents "fourth" order accurate temporal discretization, but "2" has much deeper implications, some of which are enumerated below.(i) "2" represents a pair.Unlike the methods of line, this method adopts the pair, the conservative variables and their dynamics, e.g., the velocity and the acceleration, as the building block to design numerical schemes.In [32], we call this pair as the Riemann solver and the LW type solver.
(ii) "2" implies the temporal-spatial coupling.The LW flow solver implies the temporal-spatial coupling property of resulting schemes.This is necessary to simulate the temporal-spatial coherent structures of fluid flows.
(iii) "2" stands for second order accuracy in time.Of course,"2" also symbolizes the temporal accuracy of resulting schemes and requires that at each of the two stages the above pair should be the building block.
(iv) "2" indicates the exchange of kinematics and thermodynamics.
The Gibbs relation plays a fundamental role in compressible fluid flows.In the dynamical process, there is always the interaction of kinematics and thermodynamics.The stronger nonlinear waves, e.g.shocks, exist in the fluid flows, the more fundamental role the thermodynamics plays.(v) "2" guarantees the compactness and efficiency.Since only two stages are taken to achieve fourth order temporal accuracy, half amount of spatial discretization treatments are saved and much smaller computational stencils are needed.Hence the resulting schemes are more compact and efficient.(vi) "2" reflects the consistency of mathematical and physical expressions of fluid dynamics.The fluid dynamical systems essentially consist of balance laws, which say the relation between the change rate of physical quantities and the associated fluxes.The form of balance laws always makes sense no matter whether there are discontinuities in the solution.The Lax-Wendroff type flow solvers inherently reflects the consistency between the physical implication of fluid dynamical systems and their mathematical formulation.
In this paper we will elucidate the idea of this new family of schemes by interpreting the philosophy from ordinary differential equations (ODEs) to fluid dynamical systems, reviewing the well-used GRP and GKS solvers as the representatives of the Lax-Wendroff type solvers, building high order temporalspatially coupled high order accurate schemes with favorable computational performance.
We organize this paper in the following sections.In Section 2, we propose this new family of methods and the corresponding "2⊙2" algorithm.In Section 3, we review the generalized Riemann problem (GRP) solver and in Section 4 continue to review a kinetic solver, the gas kinetic scheme (GKS) solver.In Section 5, we introduce the compact Hermite-type interpolation for the data reconstruction.In Section 6, we discuss the approximation of boundary conditions to suit for the 2⊙2 algorithm.In Section 7, we remark the computational performance of this approach in terms of computational efficiency, robustness and fidelity.
This section serves to elucidate the meaning of "2 ⊙ 2 = 4" for hyperbolic problems and particularly compressible fluid flows, and review the two-stage fourth order accurate schemes proposed in [32].We remind that this strategy may not be suitable for incompressible flows or it needs some modifications but certainly awaits for further improvement.

Start with ODEs and philosophic thinking
Let's recall the Runge-Kutta (RK) method for an ordinary differential equation The Runge-Kutta method takes the iteration procedure where h is the time increment, a ij , b i and c i satisfy the Butcher tableau [10].
The building block of RK is the solution element y.In order to devise a s-th order accurate scheme, one needs s-stage iteration, which is parameterdependent.In this paper, we focus on fourth order accurate schemes and therefore label the fourth order RK schemes as 1 ⊙ 1 ⊙ 1 ⊙ 1 = 4.The notation "⊙" is an operation satisfying certain requirement such as stability.
The RK method lays the foundation of numerical approximations to ODEs.Note that this method only uses the solution element "y", but ignores the dynamics element dy/dt.This sounds confusing, however, one may pay his attention to the role of the dynamics element if he is familiar with the symplectic algorithm for Hamitonian system [21] for which the pair of the position and momentum are together used for the computation in order to preserve the symplectic structure.The momentum can be regarded as the dynamics element of the position (solution element).The word "symplectic " itself has the meaning of "pair".
With this philosophical thinking, it is reasonable to construct multi-stage multi-derivative algorithms for dynamical systems (ODEs).Indeed, this was achieved in [13] with many subsequent works [47,43,15,22].The building block is the pair, the solution element and the dynamics element (the derivatives).Specifically, a multi-stage two-derivative algorithm is written as where we suppress the dependence of f on t for simplicity so that f = f (y), the coefficients a ij , âij , b i , and bi can be displayed in an extended Butcher tableau [13].Here the function g(y) = f ′ (y)f (y) is given using the chain rule The dynamical element is implicitly used in the construction of algorithm (3).This is why this method is of multi-stage two-derivative type with the pair (y, dy/dt) as the building block.In particular, as s = 2, we have the two-stage fourth order accurate time-stepping algorithm in the form labelled as the "2 ⊙ 2 = 4" algorithm, which was independently derived in [32] for hyperbolic conservation laws.See the discussion in the subsequent sections.For (5), the first "2" represents the two-stage approach, the second "2" means the pair of the solution element and the dynamics element, and "4" stands for the fourth order accurate approximation to (1).Certainly, the first "2" has more implications when applied to the fluid dynamical systems for compressible flows.Besides, the notation "⊙" is used here to symbolize the mathematical operation currently.Probably in the future, this notation could be replaced by a better one.

Lax-Wendroff flow solvers
The Lax-Wendroff method [28] plays a fundamental role in the development of high order accurate schemes for hyperbolic equations.The corresponding scheme is unique in the class of three-point schemes of second order both in space and time.The feature of uniqueness implies that it is the reference of high order accurate schemes, and the three-point stencil hints at the compactness.
Here we are going to show more fundamentals of this method, which is taken as the building block or higher order accurate schemes.Moreover, we would like to present as many details as possible because it is unusual that it has not been received "enough" attention since its birth.Part of the reason may be the presence of oscillations near discontinuities when used to simulated compressible fluid flows although it was modified, e.g., the flux limiter methods in 1980s ( [24] and its successors), to be suited for the capture of discontinuities; part of the reason is, more possibly, the seeming complexity compared to methods of line.Even more seriously, the misuse in various contexts, such as diffusion equations and (dispersive) KdV type equations, leads to many controversial issues.

The revisit of Lax-Wendroff method
Let us first recall the Lax-Wendroff method [28].Consider the advection equation where a is a constant.The boundary condition remains to be discussed in Section 6.We approximate (6) by assuming that the solution is sufficiently regular, and take the Taylor series expansion at any point (x, t) to obtain, A key step is the temporal-spatial coupling technique by taking use of ( 6) to quantify the differentiation relation between the change rate of u and the spatial variation, Ignoring truncation errors of order more than three, the Lax-Wendroff scheme is derived as (cf.[28]), where central difference approximations are made to guarantee the spatial accuracy, u n j represents the point value u(x j , t n ) at the grid point (x j , t n ), x j = j∆x, t n = n∆t, with the spatial and temporal increments ∆x and ∆t.The Taylor expansion process is the same as that in the Cauchy-Kowaleveski approach (see [20]), and therefore ( 9) is regarded as the numerical realization of the CK approach.Note that this process determines the feature of this scheme, implying its application only in the range of hyperbolic problems (local behavior or finite propagation property).Any other extension needs serious and cautious treatments.
The Taylor expansion also relies on the smoothness of the solution.The successive differentiation (6) gives rise to the risk in the following sense.
(i) Once equation (6) admits discontinuities in the solution, the manipulation for (8) does not make any sense.This is the main reason that (9) produces oscillations near discontinuities [28].
(ii) As this method is applied to highly nonlinear dynamical systems, this manipulation becomes horrible and hampers to develop higher order accurate schemes, due to the successive differentiations.
We will comment on this manipulation appropriately at later sections.Rather now, we reinspect ( 6) and ( 7) from another point of view (after ignoring high order truncation errors), actually in the finite volume framework, where the differentiation relation ∂ ∂t = −a ∂ ∂x is applied.We immediately realize that for any (x, t) u(x, t) + ∆t 2 as long as the solution is smooth in t (temporal direction or flow direction).Viewing (10) in the finite volume framework, we obtain over the control volume ). ( The prediction of the value u * j+ for the given initial data at t = t n for (6).
This pair of values actually provide all quite detailed information along the interface x = x j+ 1  2 of control volume and also the flux 1 ∆t The two formulae (9) and ( 12) are equivalent for smooth flows.However, the new formulation (12) is fundamentally different from (9) in the following sense.
(i) The formulae (12) is actually the finite volume formulation for (6).The formulation is more straightforward for fluid dynamical systems than other formulations because it is just the numerical version of balance laws and allows discontinuities as its solution.(ii) The manipulation (11) is legal because the flow should be smooth in time (but not in space), unlike the difference approximation for LW approach.(iii) The temporal-spatial coupling feature again plays an important role, e.g., This feature is crucial for a numerical scheme to preserve the fluid dynamical properties such as the Galilean invariance.
(iv) The successive differentiation (8) can be avoided, which is extremely important for nonlinear problems when discontinuities are involved because the manipulation (8) makes no sense both mathematically and physically.
For (6), we label "2" for the pair (u, ∂u ∂t ) in the Lax-Wendroff solver (13), which is the building block, as we see, in the Lax-Wendorff scheme.It is interesting to observe that (13) can be approximated in an upwind or central way.The upwind approximation can avoid superfluous information in the scheme.

Lax-Wendroff flow solvers for nonlinear hyperbolic balance laws
We consider hyperbolic conservation laws where the vector u is the conservative variable.The natural formulation of ( 16) is in the finite volume framework, the balance law over any interval or the control volume ( with If one would prefer to other formulations, such as the discontinuous Galerkin method [17], the following statements still hold. We shift (x j , t n ) to (0, 0) due to the invariance of ( 16) with respect to the translation of coordinates.In order to proceed in one of those frameworks, we have to solve (16) approximately subject to the initial data u(x, 0) = P ± (x), for ± x > 0, (20) where P ± (x) are smooth functions, typically polynomials, with a jump at x = 0.The same as in the linear case (13), a Lax-Wendroff flow solver for such a problem is an algorithm approximating In general, we consider hyperbolic balance laws in multi-dimensions, where h is the source term resulting from physics or geometry, x = (x, y, z) is the spatial coordinate.The initial data for ( 22) is set to be u(x, y, z, 0) = P ± (x, y, z), for ± µ where µ is the unit normal of a line or plane L : µ • x = 0 pointing from the negative side to the positive side, corresponding to the outer normal of interfaces of computational volume.The Lax-Wendroff solver for (22) is to find the pair of values with the same form as in (21), where the limit is taken along the spatial-temporal interface L × (0, ∆t).
We want to remark here that the pair (u L,0 , (∂u/∂t) L,0 ) can be modulated to any direction in order to suit for an arbitrary Lagrangian-Eulerian (ALE) method.

Rough comments on the correlation between LW solver and temporal-spatial coupling
The instantaneous temporal derivatives in ( 21) and ( 24) can be roughly using the Lax-Wendroff approach and then −∇ • F(u) and h are approximated using certain technologies such as WENO etc.The same as in (15), the coherent relation of spatial and temporal variations is rooted in this formula.
The intuitive outcome of this coupling is the following.
(i) The multidimensional effect, in particular the transversal effect, is input into the flux directly.Thinking of a single advection problem where a, b are constants.For an interface with the normal in the x-direction, the transversal effect, expressed in the y-direction, is ignored for the standard Riemann solver.This is further verified for the wave system where c 0 is a constant.In Table 1, we use three methods to simulate the periodic wave problem.It is observed that even with the same convergence rate, the RK method produces also ten times of errors than what the second order GRP does for which the transversal effect is included.The solution cannot even converge with the refinement of meshes if only normal flux is used but the transversal effect is not included.See [29].(ii) The source effect h is also reflected through such a process.It is simple to see that for hyperbolic balance law This input is essential and indispensable for the well-balancedness, as verified for the shallow water equations [31].
There are more fundamentals, such as the thermodynamical effect [33], resulting from the temporal-spatial coupling.

2 ⊙ 2 = 4: Two-stage fourth order accurate schemes
In [32], the fourth order accurate method is developed for hyperbolic conservation laws.We start with the review of the dynamical system where L is a linear or nonlinear operator of w.Then we have the following two-stage algorithm for (30).

Stage 1. Define intermediate values
where the second equation follows from (30), using the chain rule.
Stage 2. Advance the solution using the formula This algorithm provides a fourth order accurate approximation to w. Originally, this algorithm was proposed in [13,16], and independently in [32] based on Lax-Wendroff flow solvers.Along this direction, one can derive as high order accurate approximations as what he likes [13,37].
When applied to hyperbolic problems ( 16) and ( 22), one can formulate them in any appropriate framework such as finite volume framework [32] or discontinuous Galerkin (DG) framework [14].Hence we assume that the computational domain Ω is meshed as Ω = ∪ j∈J Ω j and formulate the problem in the form Thus, this problem boils down to the dynamical system in the form (30). Then we have a two-stage fourth order time-stepping method, now symbolized as the "2 ⊙ 2 = 4" method.The intuitive meaning is that we adopt the second order flow solvers as building blocks and use a two-stage time-stepping to achieve fourth order accurate numerical methods for hyperbolic problems or convection-dominated problems.We make a diagram in the following.
"4": A fourth order scheme = "2": Second order Lax-Wendroff type flow solvers + "2": A two-stage time stepping Careful readers may observe the validity of (33) when the above two-stage algorithm applies to the current case, which is why we have to develop the Lax-Wendroff type flow solvers based on hyperbolic balance laws rather than the formal partial differential equations [6].There are at least two points that we should concern: (i) System ( 33) is index-dependent and therefore each equation for fixed j is related to the neighboring equations; (ii) the continuity of L j is crucial when applying the above two-stage algorithm.It is at these points that (30) is substantially different from (1).Therefore, the regularity of flux is a key factor to guarantee the validity of this algorithm.Physically speaking, this regularity is natural by recalling the Lagrangian form of fluid dynamical systems [27].Hence in the computation of instantaneous values (21) or (24), we must be aware of the regularity of the flux that will be further emphasized in the next section about the GRP solver.

The GRP solver: A discontinuous and nonlinear LW flow solver
As is well-known, and also pointed out in the last section, the standard Lax-Wendroff solver results in an algorithm that producing oscillatory solutions if discontinuities are present.The GRP solver, the abbreviation of the generalized Riemann problem (GRP) solver, can be regarded as the discontinuous and nonlinear version of the Lax-Wendroff solver.This solver was originally proposed in [2] for compressible fluid flows and related problems.See [4] for the comprehensive summary of works before 2003.Later on a direct Eulerian version of GRP solver was derived in [7] and further extended to general hyperbolic conservation laws [5,41].The presentation below will follow the direct Eulerian GRP.The application to non-conservative systems is referred to [1].

1-D GRP solver
We first review one-dimensional GRP solver for hyperbolic balance laws subject to the initial data of form (20).An important prototype is the compressible Euler equations with cross section, where the variables ρ, u, p and E are the density, velocity, pressure and the total specific energy.The total specific energy consists of two parts E = u 2 2 +e, e is the internal specific energy.The function A(x) is the area of the duct.When A(x) ≡ 1, the system (35) represents the planar compressible Euler equations.Let T be the temperature.Then the entropy S can be defined, as usual, by Gibbs relation of thermodynamics, The local sound speed c is defined as We will distinguish the linear (acoustic) and nonlinear GRP solvers.Both are related.However, as strong waves are involved, the nonlinear GRP solver becomes crucial.Details can be found first in [3] and later in [5,41] for general versions.
We denote (I) Linear GRP solver.Linear hyperbolic equations describe the propagation of waves linearly.For the single linear case (6), with a damping constant α, we have obviously For the linear or semi-linear system case we need to diagonalize the system and pursue the characteristic decomposition.Denote by Then we decompose A as A = RΛR −1 , and denote again |A| = R|Λ|R −1 .It turns out that the instantaneous values take Therefore, the GRP solver for the linear system case is substantially the same as that for the single equation case.

(II) Acoustic Approximation
For nonlinear cases, if = 0, only linear waves emanate from the singularity point (0, 0).Then we can linearize the system (34), at the value u 0 = u ℓ = u r , as where A(u 0 ) is the Jacobian f ′ (u 0 ).Note that Immediately we have We can proceed to obtain any higher temporal derivatives (∂ m u/∂t m ) 0 , abbreviated as ADER in [46].However, in the framework of 2 ⊙ 2 = 4, we are satisfied with the first order temporal variation in (45).
As u ℓ − u r ≪ 1, weakly nonlinear waves emanate from (0, 0).Then we can carry out the so-called acoustic approximation.To be precise, we can use either exact or approximate Riemann solvers to obtain the intermediate state u 0 and linearize the system (34) to be in the form (43) so that the temporal derivative (∂u/∂t) 0 is calculated as in (45).
For the Euler equations (35), the acoustic approximation is the following.
(i) As u 0 − c 0 > 0 or u 0 + c 0 < 0, the acoustic waves moves to one side of the t-axis.Then (∂u/∂t) 0 is taken upwind.
(ii) As u 0 − c 0 < 0 < u 0 + c 0 , the t-axis is located between two acoustic waves.Then we have The quantity (∂ρ/∂t) 0 is solved according to the direction of the contact discontinuity, and For "cheap" engineering applications, one can use the local Riemann solution u 0 to linearize the nonlinear system and obtain a "linear" system so that the above acoustic approximation strategy can be adopted.

(III) Nonlinear GRP solver
As u ℓ − u r ≫ 1, nonlinear waves emanate from the singularity point (0, 0).The larger the difference between u ℓ and u r is, the stronger the strength of the waves is.In Figure 1 we use the acoustic GRP solver to simulate the big density ratio problem and observe that the numerical solution has large disparity from the exact solution.In Figure 2, we use the nonlinear GRP solver that will be described below, and see that the numerical solution is improved prominently [33].Hence it is essential to develop the nonlinear GRP solver as long as strong waves need resolving.We just illustrate the nonlinear GRP solver for Euler equations with cross section (35).For general hyperbolic balance laws, readers are referred to [5,41].
We just need to assume a typical case, as shown in Figure 3, that a rarefaction wave moves to the left, a shock moves to the right and a contact discontinuity lies in the middle.When the waves move to one side of t-axis, i.e., u ℓ − c ℓ > 0 or u r + c r < 0, it is treated as the acoustic case that (∂u/∂t) 0 is obtained upwind.
We rewrite (35) in terms of (ρ, u, S), where p is regarded as a function of ρ and S. In terms of ρ, u and p, the third equation of ( 49) can be replaced by, In order to resolve strong rarefaction waves, it is particularly essential to introduce the so-called generalized Riemann invariants, as in [5], Together with the entropy variable S, system (35) becomes where Here it is easily seen that the variable section A(x) acts on the dynamical behavior of φ and ψ, and thus on that of (ρ, u, p).The severe change of entropy inevitably leads to the variation of other physical variables.The GRP solver that we will derive tells precisely how the entropy and the cross section affect the dynamics.
The most important ingredient is the application of nonlinear geometric optics for the local expanding of rarefaction waves using local characteristic coordinates (α, β), as shown in Figure 3.With that, we can obtain the instantaneous temporal derivatives ∂S/∂t and ∂ψ/∂t as (restricted to polytropic gases), where T ℓ S ′ ℓ is defined through the Gibbs relation, PSfrag replacements (a) Wave pattern for the GRP.The initial data u(x, 0) = u ℓ + xu ′ ℓ for x < 0 and u(x, 0) = ur + u ′ r x for x > 0.
Several remarks are in order about the role of entropy variation and the cross section on the dynamics.
(i) The source term reflecting geometric variation always plays an important role in the dynamics of flows.Inherently, the critical gas indices are clearly exhibited in (55), which cannot be illustrated in other flow solvers.This is just an evidence that only GRP solver can distinguish different gases.(ii) The entropy change rate is essential and acts on other flow variables as its initial variation is severe.This tells why the GRP solver is indispensable when strong waves are simulated.As the involved waves are weak or T ∂S ∂x is small, ∂S ∂t is negligible and many approximations such as linearization are acceptable.
The shock is resolved by tracking its trajectory described by the Rankine-Hugoniot relation, where (ρ, u, p) and (ρ, u, p) are the states ahead and behind the shock, respectively, and for polytropic gases.
For the contact discontinuity x = x(t), we make use of the continuity property of pressure and velocity on both sides of the trajectory, Then the differentiation along the contact discontinuity gives where D/Dt = ∂/∂t + u∂/∂x is the material derivative.This relation bridges the rarefaction wave and the shock, just like that for the Riemann solver.We just remind that the density and entropy undergo jump across this contact discontinuity.
Thus we come to the nonlinear GRP solver that are distinguished as nonsonic and sonic cases.
Proposition 31 (Non-sonic case.)Assume a typical wave configuration for the generalized Riemann problem for (35) as shown in Figure 3 that the t-axis is located in the intermediate region.Then (∂u/∂t) 0 and (∂p/∂t) 0 satisfies the following pair of linear equations, where a ℓ , b ℓ , d ℓ and a r , b r , d r are specified below.Also the computation of (∂ρ/∂t) 0 are computed by the following two cases.
(i) If u 0 > 0, the contact discontinuity moves to the right and separates two states (ρ 0ℓ , u 0 , p 0 ), (ρ 0r , u 0 , p 0 ).The coefficients a ℓ , b ℓ and d ℓ are given as, The coefficients a r , b r and d r are given by The value (∂ρ/∂t) 0 is computed from the rarefaction side, by using the state equation p = p(ρ, S).
(ii) If u 0 < 0, the contact discontinuity moves to the left.The coefficients a r , b r and d r are given as, (a r , b r , d r ) = (ã r , br , dr ). (64) While the coefficients a ℓ , b ℓ and d ℓ are computed by The value (∂ρ/∂t) 0 is computed from the shock side, where g R ρ , g R p , g R u and f r are constant, explicitly given in the following, and H 1 , H 2 and H 3 are expressed by The other coefficients (ã ℓ , bℓ , dℓ ) and (ã r , br , dr ) are and where we use notations Proposition 32 (Sonic case).Assume that the t−axis is located inside the rarefaction wave associated with u − c.Then we have where dℓ is given in (69), with θ = c 0 /c ℓ , and (u 0 , ρ 0 , c 0 ) is the limiting value of (u, ρ, c) along the t-axis so that u 0 − c 0 = 0. Then density change rate is given as in (63).
The above formulae look complicated, but seem irreplaceable.We can go to [5] for technical derivation of them.

Temporal-spatial coupling and thermodynamical effect
Thermodynamics distinguishes compressible fluid flows from incompressible ones, and the Mach number can be regarded as a parameter of the compressibility.The entropy dissipation is a necessary condition guaranteeing the stability of numerical schemes.Let's consider the compressible Euler equations with uniform cross section (A(x) ≡ constant in (35)).The entropy inequality says (ρS However, it is a well-known open problem whether this inequality is satisfied at discrete level, particularly for high order accurate schemes.There are two origins of discrete errors: the data projection and the flux approximation.In a general setting of finite volume framework, given the initial data u n (x) ∈ P k at t = t n , we have to find the solution u n+1 (x) at next time level t = t n+1 , satisfying where (ρuS) approx j+ 1 2 is the numerical entropy flux, and T ol(∆x, ∆t) is the entropy production that has the maximum tolerance of order three, T ol(∆x, ∆t) = O(∆t 3 + ∆x 3 ).We comment on how to achieve this inequality at the discrete level in the following.
(i) The persistence space P k often consists of piecewise polynomials of degree k.Given the initial data u n (x) ∈ P k , we solve (35) and obtain the (analytic) entropy solution u(x, t) for t n < t < t n+1 , satisfying (ii) The projection of u(x, t) onto P k (reconstruction procedure) is required to satisfy This is an extremely difficult step.For scalar conservation laws, there was a nice discussion on MUSCL-type linear reconstruction [9].In general, the Jensen inequality tells that Hence, there is still some room to make (76) hold, which remains an open problem.
Open problem on data projection: Find an optimal reconstruction strategy so that (76) holds.
(iii) Assume that (76) holds for certain data projection.As shown above, we approximate the interface value in the following way In particular, we make use of the entropy information.It turns out that (79) Summarizing all together yields (74).
It is observed that the precise calculation of entropy in (53) is a direct way to achieve (79).Other ways may at most lead to (80) The error of O( u 2 ) is not tolerated unless for scalar cases or smooth flows, since this type of errors violate the entropy inequality in the limit.
It is no doubt that the achievement of (79) is the outcome of the direct use of the entropy equation in (49), which is actually the Lax-Wendroff procedure, a temporal-spatial coupling procedure.

M-D GRP solver and transversal effects
When computing multidimensional (M-D) problems, M-D GRP solver is necessary to reflect the transversal effect, which is impossible using the exact or approximate normal Riemann solvers.We restrict to two-dimensional hyperbolic balance laws where f and g are flux functions.3-D GRP solver is straightforward.The initial data takes the form where u ± (x, y) are two polynomials of degree k.The x−direction is the normal and the y-direction is the transversal.A particular case is The M-D GRP solver can be classified as M-D linear GRP solver, acoustic GRP solver, nonlinear normal GRP solver with transversal perturbation, and genuinely nonlinear M-D GRP solver.
(I) M-D linear GRP solver.We consider the linear case where A and B are constant matrices, and both of them have their respective real eigenvalues and the complete sets of eigenvectors.We first assume that (84) is subject to the initial data (83).Note that ∇u satisfies the same form of (84), (∇u but subject to the initial data This boils down to the standard Riemann problem for ∇u.Therefore, the gradient ∇u has an explicit expression, where the notations L(x, t) is the maximum value of ℓ such that x − λ ℓ t > 0, λ ℓ is the eigenvalue of A and r ℓ is the associated eigenvector, v is so defined that See [30].In particular, we have Then we immediately obtain As far as the more general initial data (82) is concerned, the solution u consists of piecewise polynomials of the same degree as the initial data since (84) is a linear system.Here we are satisfied with the second order GRP solver and solve (84) at any point (0, y 0 , 0) to obtain where ∇u(0, y 0 , 0 + ) are calculated as the same procedure as above.
For nonlinear cases (81), if the initial data (82) is continuous but discontinuous in its derivatives, acoustic waves emanate from the interface x = 0, just as one-dimensional case.For this case, we might as well take the initial data (83) and assume that u − = u + but ∇u 0,− − ∇u 0,+ = 0. Then we linearize (81) around the state u 0 = u − = u + to obtain Then we can exactly follow the linear case to calculate (∂u/∂t) 0 .
The acoustic approximation applies for the case u − − u + ≪ 1.We linearize (81) around the intermediate state u 0 resulting from the associated Riemann problem.Then the linear GRP solver applies for this case.
(III) M-D genuinely nonlinear GRP solver with transversal description.
As u − − u + ≫ 1, we have to deal with genuinely nonlinear GRP.Thinking of the initial value problem for (81) subject to the initial data (83), the solution is the envelope of Riemann solution along x = 0 locally at t = 0. Hence along x = 0, the associated Riemann solution is known.For example, at two points (0, y 1 , 0) and (0, y 2 , 0), we solve the Riemann problem locally for the normal conservation law, respectively, and obtain the local intermediate values u(0, y 1 , 0 + ) and u(0, y 2 , 0 + ).Then for any point (0, y, 0), y 1 < y < y 2 , we can approximate ∂u ∂y (0, y, 0 + ).Particularly, we approximate Then we regard the transversal term g(u) y and h(x, y, u) as a perturbation locally, and solve the following problem, This boils down to the one-dimensional planar problem locally, for which the GRP solver was proposed in [5].Detailed and complete M-D GRP solver is proposed in [40].

Transversal effects for genuinely M-D schemes
For multidimensional (M-D) problems, the balance law can be always written in the form, where Ω is the control volume, ∂Ω is the boundary and n is the outer unit normal.We ignore the external force just for the clarity of presentation.
Thanks to the Galilean invariance, we always assume that (1, 0) (the direction x-axis) is the normal direction, and (0, 1) (the direction of y-axis) is the transversal direction.The standard Riemann solver just reflects the normal effect.In contrast, the LW procedure can describe the transversal effect precisely.Consider a linear advection problem Then we use the temporal-spatial coupling property to obtain where ( ∂u ∂x ) ∂Ω and ( ∂u ∂y ) ∂Ω can be obtained by solving the associated Riemann problem.Also as remarked in Section 2 for the linear wave system, the transversal effect is substantial even though the convergence rate is formally the same.

The kinetic LW flow solver
The fluid dynamics can be described in various viewpoints, such as the kinetic description.The governing equation is the Boltzmann-type equation where f = f (t, x, ξ) is the density distribution, ξ is the velocity of molecules (particle), and B(f, f ) is the collision term, ǫ is the Knusner number.Ideally, for a given initial distribution, we solve (99) to obtain the solution f (t, x j+ 1 2 , ξ) and define the numerical flux where ψ = (1, ξ, ξ 2 ) ⊤ is the invariant, and the average of macroscopic variables is In general, it is difficult to solve the equation ( 99) analytically.To understand the relation of macroscopic equations (Euler and Navier-Stokes equations) and the kinetic equation, we first take the so-called railroad method in [35] as an example to illustrate how to devise kinetic schemes.

Railroad method
Consider the linear advection equation Introduce a distribution function f (t, x, ξ) and define the macroscopic variable u(x, t) as a moment of f , If we choose f to take the form, then f (t, x, ξ) satisfies subject to initial data The initial value problem (102) and the problem ( 103)-( 106) are equivalent: If one solution is known, then the other is defined.We write the solution of ( 105)-(106) as Then the solution u(x, ∆t) is given as (108) Note that the LW approach for (107) yields Therefore we have (110) which yields a second order approximation to the exact solution u(x, t).
The numerical solution is where We assume the initial data for (102) is piecewise smooth with possible discontinuity at x = x j+ 1  2 .Correspondingly, the initial data ( 106) for (105) consists of two parts.It turns out that the numerical flux F j+ 1  2 in (111) becomes where consist of three parts, respectively, ), (113) As the solution is smooth, the scheme (111) becomes the LW approach immediately.See [35] for details.

The LW type solver for gas kinetic schemes
Let's now work on a simplified model, the Bhatnagar-Gross-Krook (BGK) model [8], where ǫ is the collision time, and g is the equilibrium state, approached by f as ǫ goes to zero, where m and k are constant, ρ and T are density and temperature, respectively.Indeed, all macroscopic variables ρ, u and E are defined as The validity of BGK-model is clearly explained in [8].
Starting with (114), Xu and his collaborators successfully developed gas kinetic scheme (GKS) solver [39,51,[48][49][50].A key ingredient is that the explicit solution formula for (114) is used for the numerical flux approximation, subject to the initial data f 0 (x, ξ), where Here just the case of one-dimension is described.The full information contained in (117) provides "exact" expression of flux across the interface x = x j+ 1  2 , which is of course consistent with the LW flow solver.We can go to [50] for comprehensive description of the GKS solver.For the gas-kinetic scheme, the gas evolution is a relaxation process from kinetic to hydrodynamic scale through the exponential function, and the corresponding flux is a complicated function of time.
In order to obtain the time derivatives of the flux function at t n and t * = t n + ∆t/2 with the correct physics, the flux function should be approximated as a linear function of time within a time interval.Let's first introduce the following notation, In the time interval [t n , t n + ∆t], the flux is expanded as the following linear form The coefficients F n j+1/2 and ∂ t F n j+1/2 can be determined as follows, By solving the linear system, we have for the intermediate state can be constructed.For the two-dimensional computation, the corresponding fluxes in the y-direction can be obtained as well.Readers are referred to [37].
There are huge numbers of references about kinetic solvers, which are beyond the scope of the current paper.We stop to discuss further.

Compact reconstruction using the Hermite interpolation
The compactness is a key factor in the design of high order schemes, determining the dissipation of the schemes near discontinuities and the numerical treatment of boundary conditions.With the increase of time-stepping, the width of computational stencils is inevitably expanded for multi-stage methods.Hence it is very important to construct the data in a compact way.
Unlike WENO using the Lagrangian interpolation, we adopt the Hermitetype interpolation using both the average values of physical (conservative or primitive) variables, and the approximate gradient of the solution.Going back to the original GRP, we construct the data over the computation cell where the gradient is chosen through the procedure, Usually, α is chosen to be as large as possible.As α ∈ (1, 2), u n (x) behaves as sawtooth and implies that σ n j take mostly This is a natural approximation to the gradient.The boundary value u n,− j+ , as a strong solution along the cell interface x = x j+ 1  2 , is calculated from the history, are obtained already from the GRP solver, and no extra efforts need making.Some remarks on (121) are made here, and they can be applied for later high order data interpolations.
(i) Compared to the classical limiter algorithm, (121) takes (122) in smooth regions, and limit the gradient near discontinuities in order to suppress possible oscillations.This sawtooth-type reconstruction can produce sharper profiles of discontinuities.(ii) The piecewise linear data (120) is the embryonic form of Hermite polynomials for high order schemes.Since all values are already given using the GRP solver, no extra effort is made on the calculation of the gradient, unlike in DG methods or other Hermite interpolation [42].If one might argue the freedom of solution elements, he could regard the current treatment as the Lagrangian interpolation using five points values.
(iii) For the data (120), five points values are used for the data reconstruction.
Essentially, these values are defined in three computational cells, rather than in five cell, so that computational stencils are almost half saved.This is one of key factors achieving compactness.Now we extend this to the two-stage fourth order method, by reviewing the result in [18].Given the average ūj and the derivative ∆u j of the function u(x) over the cell we want to construct a polynomial p(x) such that u j+ 1 2 ,− is its left limiting values at x = x j+ 1 2 .Choose three stencils On stencil S (0) , ūj−1 , ūj and ūj+1 are used to construct a polynomial p (0) for the interpolation.Hence at x j+ 1 2 , we have Similarly, p (−1) and p (1) are constructed by using ūj , ūj−1 , ∆u j−1 on S (−1) and by using ūj , ūj+1 , ∆u j+1 on S (1) , respectively, If the solution is smooth on the large stencil I −1 ∪ I 0 ∪ I 1 , we have Thus the linear weights of the three stencils are which ensure The smoothness indicators are defined by in the same way as in the WENO reconstructions where p (r) (x) is the interpolation polynomial on stencil S (r) .Their explicit expressions are (131) Then we compute the nonlinear weights in the same way as the WENO-Z method does where τ z = |β (1) − β (−1) | and ε is a small parameter in order to avoid a zero denominator.Finally we have The right interface value u j− 1 2 ,+ can be reconstructed in a similar way by mirroring the above procedure with respect to x j = 1 2 (x j− 1 2 + x j+ 1 2 ).Since the GRP solver has to use the spatial derivative (∂u/∂x) j+ 1  2 ,± , we approximate them using the interpolation, It is observed in this interpolation does not need the WENO-type stencil selection procedure.We define this procedure as HWENO , terming a Hermite type interpolation using the WENO interpolation strategy.GRP4-HWENO5 refers to the two-stage fourth order scheme based on this Hermite type interpolation using the GRP solver.
For two-dimensional cases, we can develop the similar approach over rectangular meshes.See [18,25].As far as unstructured meshes are concerned, there still remains space to explore.

High Order Boundary Conditions
Approximation to boundary conditions may be one of the most challenging issues in CFD.On one hand, mathematical modelings of fluid flows near physical boundaries are diverse.On the other hand, highly nonlinear behaviors and complex boundaries make the approximation notoriously involved.
We briefly illustrate their idea in the finite difference framework by considering the initial boundary value problem (IBVP) for a scalar conservation law x ∈ (0, 1), u(0, t) = g(t), t > 0. (136) Assume that f ′ (u) > 0 for all u ∈ R so that x = 0 is an inflow boundary and x = 1 is an outflow boundary.We equally distribute M + 1 points {x j = (j + 1/2)h : j = 0, 1, . . ., M } in the computational domain (0, 1), as shown in Figure 5.We use u j to denote the value of u at x = x j and suppress the index for the time levels.Obviously at the inflow boundary, the solution value at the Fig. 5 The computational domain (0, 1).Set x 0 = h/2 and x M = 1−h/2.Then x −1 = −h/2 and x M +1 = 1 + h/2 are ghost points.
ghost point x −1 is required in order to perform a second-order finite difference at x 0 .For this purpose, a polynomial is constructed in the region around the inflow boundary by using point-wise values u −1 , u 0 and u 1 , from which we want to find the value u −1 .The Lagrangian interpolation tells that Then u −1 can be obtained by solving the linear equation u(0, t) = L(0) where u(0, t) = g(t).At the outflow boundary x = 1, we simply use the extrapolation to obtain the value u M+1 since the signal goes out of the computational domain at this end.
The extension to high order is highly nontrivial.Let's review the approach developed in [19].The same as other multi-stage methods (e.g. in [12]), the current two-stage fourth order method needs careful treatment at the intermediate stage for the approximation in order to preserve the accuracy.There are two key points: The construction at ghosts points and the approximation of boundary conditions.The inverse LW approach is applied here [44], but the current treatment is much simpler and easier to be implemented since no higher derivatives need calculating.

Ghost values
We still consider IBVP (136).Then the values ū−1 , ū−2 , ∆u −1 and ∆u −2 , defined over I −1 = (−h, 0) and I −2 = (−2h, −h), ghost cells, are needed in the reconstruction procedure for the values indexed by j = 0 and j = 1.To obtain the values mentioned above, a cubic polynomial, is constructed over With the constraints (141) into (140), we determine the coefficients α 0 , α 1 , α 2 and α 3 as in which ū−1 and ū−2 are yet to be determined and they are obtained by evaluating p(0) and p ′ (0) at the boundary x = 0, (143) Solving (143) in terms of ū−1 and ū−2 yields (by ignoring high order terms) Substituting ( 144) into (142), in turn, gives us the explicit expressions of α i , i = 0, . . ., 3, and then the expression of p(x).Therefore we have (by ignoring high order terms) Thus ( 144) and (145) together provide the values in the ghost cells I −1 and I −2 .Not that in (143) the inverse LW approach is used, As there are discontinuities close to the inflow boundary, a WENO-type stencil selecting procedure can be applied.Assume that there is a discontinuity in either I 0 or I 1 , we shorten the stencil cell by cell.Denote the stencils by Denote by p (r) (x) the interpolation polynomial on S (r) , r = 0, 1, 2, just as the polynomial p(x) constructed before.Then define The expressions of ū −1 and ∆u (r) −2 for r = 0, 1, 2 will be listed in A.1.
The smoothness indicators are defined in the same way as for the classical WENO interpolation, and the values are given where the linear weights of each stencil are (1) . (149)

Inflow boundary condition treatment at intermediate stages
The same as in other multi-stage temporal discretization [11,38], the direct use of exact boundary conditions at intermediate stages in the process of multistage approaches will cause the lose of the numerical accuracy.In order to offset such a defect, our strategy is made as follows.We first focus on the leftmost control volume I 0 and write out the solution advancing formula, ūn+1 The difficulty results from the presence of (∂u/∂x) and (∂u/∂x) evaluated at the intermediate stage t = t n+ 1 2 .In order to restore the fourth-order accuracy of the two-stage fourth-order scheme, we use ∂u ∂x where the exact boundary values g(t n+ 1 2 ) and g ′ (t n+ 1 2 ) are replaced by The detailed analysis is given in [19].

Outflow boundary condition
We set x M+ 1 2 = 1 as an outflow boundary, at which no boundary condition is prescribed theoretically.Numerically, we have to set required values ūM+1 , ūM+2 , ∆u M+1 and ∆u M+2 in ghost cells.Since the signal propagates out of the computational domain through the boundary x = 1, the extrapolation can be used to construct the data in the ghost cells I M+1 and I M+2 .A cubic polynomial is constructed in order to achieve the fourth-order accuracy, This gives the values If there is a discontinuity in either I M−3 , I M−2 , I M−1 or I M , a WENO-type stencil selection can be applied.

Hyperbolic systems
At moment, the boundary treatment for systems of hyperbolic balance laws is basically achieved through the diagonalization process.Then we distinguish various cases such as the solid boundary condition, inflow and outflow boundary conditions for practical applications.Details can be found in [19].

Computational Performance
In our series of papers, we have demonstrated the performance of current temporal-spatially coupled algorithms through many challenging benchmark problems, particularly in [36] using the GKS solver.Here I would like to give some remarks in terms of computational efficiency, robustness and fidelity.

Computational Efficiency
Computational efficiency is always an important issue for practical engineering problems.We have tested and compared the efficiency with the popular WENO algorithm in [37] and with DG in [14].Specifically, in [37] we evaluate the computational costs of the WENOtype reconstruction and the flux evaluation quantitatively.The time for each reconstruction is denoted by T R , the time for second-order gas-kinetic solver is T 2nd , and the time for third-order flux solver is T 3rd .According to the data provided in [37, Table 1, Page 203], we can estimate the time used for the computations of flux and reconstruction with the following relations, where the estimation is based on the characteristic variable reconstruction and each flux is shared by two cells.Thus, the time for reconstruction is T R = 0.71289s, the time for second-order gas-kinetic flux solver is T 2nd = 0.06499s and the time for third-order gas-kinetic flux solver is T 3rd = 0.33445s.For classical fourth-order Runge-Kutta schemes, the computational time for four spatial reconstruction alone will become much higher than the fourth-order gas-kinetic scheme for the update of each cell averaged values 4T R = 2.85156s > 2T R + 12T 2nd = 2.20566s.Similar estimation can be done for the conservative variables reconstruction.Even without counting on the cost of the flux evaluation in the traditional fourth-order Runge-Kutta method, such as those commonly used with the Lax-Friedrichs flux, the current fourth-order time stepping method is still more efficient than the classical methods.
The efficiency is mainly attributed to the half of reconstruction steps compared to that for the same order of other line methods.This is further verified in the framework of DG methods [14].In Table 2 and Figure 6 through simulating shock-vortex interaction problem, demonstrating that nearly 55% CPU time can be saved using the GRP-DG(s2p3) method compared to the same order SSP RKDG(s5p3) method.This result meets the expectation well as compared to the RKDG(s5p3) method which needs five stages of evaluating DoFs and performing reconstruction to achieve fourth order, while the GRP-DG(s2p3) method only takes two stages to provide totally comparable results.Fig. 6 Comparison of CPU time(s) between RKDG(s 5 p 3 ) method and GRP-DG(s 2 p 3 ) method for the shock vortex interaction problem.

Robustness
The robustness is always an important indicator for a practical numerical method.In the framework of multistage multi-derivative algorithm, the strong stability preserving (SSP) property was taken over to show the stability [22].However, SSP seems not work when the Lax-Wendorff type flow solvers are taken as the building block.Therefore new stability framework is worth exploring in the future.
Nevertheless, in practice, the current two-stage fourth order accurate algorithm has the same stability as the second order version: the Courant number is taken above 0.5 except extreme cases such as the large density ration problem.Empirically, the current "2 ⊙ 2" algorithm is more robust than other multi-stage methods.

Fidelity
In the community of CFD, the fidelity is termed for a numerical simulation of very complex problems using a specific algorithm.Since there is no reliable mathematical theory in general supporting the current CFD simulation, the verification of high fidelity appears very valuable.We pursue such studies in the whole process of the current algorithm.For example, we resolve the associated GRP analytically and use the GRP solution for Hermite-type data reconstruction.In [33] we elaborate the so-called large density ratio problem [45] using the GRP solver.When the current "2 ⊙ 2" algorithm is adopted, quite few grids are needed to obtain satisfactory results, as shown in Figure 7. Also readers are recommended to test the benchmark problems in [36], for which all simulations are made in the "2 ⊙ 2" framework using the GKS solver.

Prospective Discussions
It is natural to require the temporal-spatial coupling of a numerical method when simulating compressible fluid flows, for which the GRP solver and the GKS solver are reviewed briefly as the representatives of Lax-Wendroff type flow solvers.The direct embedding into any numerical frameworks, such as the finite volume/DG framework, already results in favorable second order numerical schemes.Interested readers can refer to papers by Jiequan Li and his collaborators for GRP methods (www.ams.org/mathscinet,scholar.google.comor researchgate.net).
As to the "2⊙2" algorithm itself, it is just at the beginning stage, and many issues are awaiting for our study.Below are some immediate doable problems.
P 1 What is a good framework for stability analysis?P 2 It is valuable to compare and develop multi-stage two-derivative algorithms with arbitrary order of accuracy.
P 3 Develop implicit "2 ⊙ 2" algorithm with various applications such as detonation simulation.P 4 Apply this algorithm for the simulation of turbulence flows and other engineering problems.
You are welcome to join this new branch of high order numerical methods for CFD.

A The interpolation results in Subsection 5
This appendix is dedicated to list the interpolation results in Section 5. Recall that we assume x = 0 and x = 1 are the inflow and outflow boundaries for the IBVP (136) of the one-dimensional scalar conservation laws, respectively.The stencils are denoted in (146).

Fig. 1 Fig. 2
Fig.1The numerical solutions computed by the second order acoustic GRP scheme (with the exact Riemann solver) are compared with the exact solution (only 66 cells are shown).

Fig. 3
Fig. 3 Typical wave pattern for the generalized Riemann problem

Fig. 7
Fig.7The comparison of the density profile for the large pressure ratio problem in Example 6.The schemes are GRP4-HWENO5 (squares) and RK4-WENO5 (dots) with m cells.The solid lines are the exact solution.

Table 1 L
1error and convergence order of u for the periodic wave problem at final time t = 2. with the methods GRP2D, RK2 and GRP1D

Table 2
Comparison of CPU time(s) between RKDG and GRP-DG methods for shock vortex interaction problem.