Soil water movement has direct effects on the environment, agriculture and
hydrology. Simulation of soil water movement requires accurate determination of model parameters as well as initial and boundary conditions. However, it is difficult to obtain an accurate initial soil moisture or matric potential profile at the beginning of the simulation, making it necessary to run the
simulation model from an arbitrary initial condition until the uncertainty of
the initial condition (UIC) diminishes, a procedure often known as “warming up”.
In this paper, we compare two commonly used methods for quantifying the UIC
(one is based on running a single simulation recursively across multiple
hydrological years, and the other is based on Monte Carlo simulations with
realization of various initial conditions) and identify the warm-up time

Understanding the movement of soil water is of great importance due to its direct effects across different disciplines, such as environment, agriculture and hydrology (Doussan et al., 2002). However, modeling of flow in variably saturated soil is complicated by many difficulties, including highly variable and nonlinear physical processes as well as limited information about the soil hydraulic properties, initial conditions and boundary conditions (DeChant, 2014; Rodell et al., 2005; Seck et al., 2014; Bauser et al., 2016; Li et al., 2012). Soil hydraulic parameter uncertainty has been identified as a major source of uncertainty in vadose zone hydrology, and many studies have focused on this topic. A closely related research area, inverse modeling, has been developed to reduce parameter uncertainty by incorporating observational data (Erdal et al., 2014; Montzka et al., 2011; Wu and Margulis, 2011, 2013). Boundary conditions also introduce uncertainty into the simulation of soil water flow (Ataie-Ashtiani et al., 1999; Forsyth et al., 1995; Szomolay, 2008). For instance, the uncertainty introduced by flawed or noise-contaminated meteorological data or by a fluctuating groundwater table has been investigated in the past (Freeze, 1969; French et al., 1999; van Genuchten and Parker, 1984; Ji and Unger, 2001; Xie et al., 2011).

Many publications have addressed the issue of the uncertainty of the initial condition (UIC) in modeling soil water movement. For example, Walker and Houser (2001) compared a simulation with a degraded soil moisture initial condition to one with the true initial condition and found that the discrepancy did not fade away even after 1 month. Mumen (2006) concluded that the initial soil water state was one of the most important factors for estimating soil moisture in the case of bare soil. Chanzy et al. (2008) tested three initial water potential profiles and found that initialization had a strong impact on the soil moisture prediction. These studies showed that an incorrect initial condition may lead to erroneous results. Depending on the availability of information, different initialization approaches can be used for constructing initial conditions, e.g., an arbitrary uniform profile (Chanzy et al., 2008; Das and Mohanty, 2006; Varado et al., 2006), a linear interpolation of in situ observations (Bauser et al., 2016) or a steady-state soil moisture profile induced with a constant infiltration flux (Freeze, 1969). All of these approaches involve great uncertainties due to the nonlinearity of the soil moisture profile, observation errors or inaccurate boundary conditions. As a result, it is crucial to explore the effects of the UIC on model outputs and compare the uncertainties inherited from various initialization approaches.

Besides the simple initialization methods referred to above, another common
approach is to obtain the initial condition inherited from the warm-up model
with preceding meteorological data. Starting from an arbitrary initial
condition, this approach runs the model using a certain period (i.e.,
warm-up time

In addition to model predictions, the UIC also has considerable effects on parameter estimation. One of the commonly used inverse methods in the field of vadose zone hydrology is the data assimilation approach (Vereecken et al., 2010; Chirico et al., 2014; Medina et al., 2014a, b). Previous studies showed that a poor initial soil moisture profile can be corrected by assimilating near-surface measurements (Galantowicz et al., 1999; Walker and Houser, 2001; Das and Mohanty, 2006). Oliver and Chen (2009) discussed several possible approaches for improving the performance of data assimilation through improved sampling of the initial ensemble and suggested the use of pseudo-data. Recently, Tran et al. (2013) found that decreasing the assimilation interval could improve soil moisture profile results degraded by a wrong initial condition, and Bauser et al. (2016) addressed the importance of the UIC in the data assimilation framework. However, these preliminary investigations of the influence of the UIC on data assimilation results are limited by the narrow choice of initialization and data assimilation methods and by the lack of a comprehensive assessment of the temporal evolution of state or parameter uncertainty when the UIC and the parameter uncertainty coexist. For instance, during data assimilation, the initial ensemble is often assumed to be known without uncertainty (Shi et al., 2015) or created by adding Gaussian noise to the initial estimate (Huang et al., 2008), both of which may result in erroneous outputs. The studies mentioned above are all based on a sequential data assimilation approach (i.e., ensemble Kalman filter or EnKF; Walker and Houser, 2001; Oliver and Chen, 2009), which incorporates observations in a sequential fashion so that the effect of the UIC can be eliminated quickly.
Compared to EnKF, an iterative ensemble smoother (IES), which assimilates all available data simultaneously, can obtain reasonably good history-matching results and performs better in strongly nonlinear problems (Chen and Oliver, 2013). However, the IES utilizes all the observations simultaneously at every iteration, and the UIC may therefore have a more persistent effect on the IES. Thus, a systematic analysis of the effects of the UIC and initialization methods within various data assimilation frameworks is necessary.

The objectives of this paper, therefore, are to (a) compare the temporal
evolution of the UIC with two common methods (spin-up method and Monte Carlo
method) and identify the warm-up time

Richards' equation can be used to describe the one-dimensional, vertical
soil water movement, which is given as

To obtain the solution of Eq. (1), the knowledge of functions

Initial and boundary conditions are needed to solve the one-dimensional
Richards' equation. The initial condition could be the state of soil
moisture,

The state-dependent, atmospheric boundary condition can be described as
(Šimůnek et al., 2013)

The bottom boundary condition is the free drainage boundary:

The investigation of uncertainty in this study includes model states (e.g.,
soil moisture) and model parameters, where the UIC is a special case of state
uncertainty at

The uncertainty of the initial condition can be measured by the percent change (PC) for the spin-up method (Ajami et al.,
2014; Seck et al., 2014) or the ensemble spread

The ensemble spread (

We employ EnKF and the IES for data assimilation in this study. Figure 1 illustrates the basic ideas and differences of the two methods.

Flow charts of simulation period – or data assimilation period with

The EnKF approach was first proposed by Evensen (1994) and has been widely used in variably saturated flow problems (Huang et al., 2008; De Lannoy et al., 2007). This approach is a sequential data assimilation method (as shown in Fig. 1a) which incorporates observations into the model in order.
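The sequential character of EnKF can be illustrated with a minimal sketch. It is not the formulation used in this study: it assumes a single scalar parameter, a single observation per assimilation step, a stochastic (perturbed-observation) update and a hypothetical linear forward model `forward(p) = 2 p`. The ensemble size of 300 and the observation error of 0.01 follow the values quoted elsewhere in the text; all other names and numbers are illustrative assumptions.

```python
import random
import statistics

def enkf_update(params, predictions, obs, obs_sd, rng):
    """One stochastic EnKF analysis step for a scalar parameter.

    params      : ensemble of parameter values (one per realization)
    predictions : model-predicted observation for each realization
    obs, obs_sd : measured value and its error standard deviation
    """
    n = len(params)
    p_mean = sum(params) / n
    y_mean = sum(predictions) / n
    # Sample cross-covariance (parameter, prediction) and prediction variance.
    cov_py = sum((p - p_mean) * (y - y_mean)
                 for p, y in zip(params, predictions)) / (n - 1)
    var_y = sum((y - y_mean) ** 2 for y in predictions) / (n - 1)
    gain = cov_py / (var_y + obs_sd ** 2)  # scalar Kalman gain
    # Each member assimilates its own perturbed copy of the observation.
    return [p + gain * (obs + rng.gauss(0.0, obs_sd) - y)
            for p, y in zip(params, predictions)]

def forward(p):
    """Hypothetical forward model standing in for the flow simulation."""
    return 2.0 * p

rng = random.Random(42)
# Biased prior ensemble (true value in this toy problem is 1.5).
ensemble = [rng.gauss(1.0, 0.3) for _ in range(300)]
prior_spread = statistics.stdev(ensemble)

# Observations are incorporated one after another, in time order.
for obs in [3.01, 2.99, 3.00]:
    predictions = [forward(p) for p in ensemble]
    ensemble = enkf_update(ensemble, predictions, obs, 0.01, rng)

posterior_mean = sum(ensemble) / len(ensemble)
posterior_spread = statistics.stdev(ensemble)
print(round(posterior_mean, 2), posterior_spread < prior_spread)
```

Because each update shrinks the ensemble spread, later observations receive a smaller gain. This is the mechanism behind the filter inbreeding discussed later in the text.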

In this part, we assume that hydraulic parameters

Compared to EnKF, the IES gives a better estimate by taking all the available
observations into consideration (van Leeuwen and
Evensen, 1996), as presented in Fig. 1b. Thus, it can keep the overall
consistency of parameters and state variables over time effectively and has
been increasingly used to solve the parameter estimation problem in
hydrology (Crestani et al.,
2013; Emerick and Reynolds, 2013). By calculating iteratively, the nonlinear
relationship between observation and parameter is linearized and the
information content of the observations can be fully utilized
(Chen and Oliver, 2013). In
this case, we write the analyzed vector of model parameters

To assess model parameter and state estimations, the root-mean-square error (RMSE) of
estimated parameters (RMSE

Soil hydraulic parameters used in simulation.

A series of synthetic numerical experiments are performed in this section.
In Sect. 3.1, we give a general description of the numerical experiments.
In order to gain a better understanding of the propagation of the UIC, all
the hydraulic parameters (i.e.,

As shown in Table 1, four soils (sand, loam, silt and clay loam) are chosen in this study to explore the impacts of soil hydraulic property on the UIC. The values of hydraulic parameters are determined according to Carsel and Parrish (1988). Besides this, the effects of the meteorological condition are also considered: M-AC, M-SC and M-HC in Fig. 2 represent three sets of precipitation and potential evaporation data from three different regions (arid region, semi-arid region and humid region) in China.

Synthetic rainfall (blue bars) and potential evaporation (red bars)
of three typical climates:

Unless otherwise specified, a uniform soil profile with 50 % relative saturation (a water content of 0.254 for loam) is chosen as the initial condition (IC-HfSatu). The soil profile is set to be 300 cm thick and is filled with loam. The flow domain is discretized into 60 grid cells with a cell size of 5 cm, which has proved to be sufficient for evaluating the UIC in our study (results not shown). The total simulation time of the synthetic simulation is 1 year (365 d). In addition, the default upper and bottom boundaries are set to be M-SC and the free drainage boundary, respectively. Other specifications and assumptions for our model simulation runs are given in Table 2.
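The default discretization and the IC-HfSatu profile can be sketched as follows. The residual and saturated water contents are assumed to be the Carsel and Parrish (1988) class averages for loam (0.078 and 0.43), which reproduce the value of 0.254 quoted above. The function names are illustrative and do not refer to the model code used in this study.

```python
# Assumed loam water contents (Carsel and Parrish, 1988, class averages).
THETA_R = 0.078  # residual water content (-)
THETA_S = 0.430  # saturated water content (-)

def uniform_grid(depth_cm, n_cells):
    """Return cell-centre depths (cm) for an evenly discretized column."""
    dz = depth_cm / n_cells
    return [(i + 0.5) * dz for i in range(n_cells)]

def uniform_profile(n_cells, theta_r, theta_s, rel_sat=0.5):
    """Uniform water-content profile at a given relative saturation."""
    theta = theta_r + rel_sat * (theta_s - theta_r)
    return [theta] * n_cells

z = uniform_grid(300.0, 60)                      # 60 cells of 5 cm
theta0 = uniform_profile(60, THETA_R, THETA_S)   # IC-HfSatu
print(z[0], z[-1], round(theta0[0], 3))
```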

The default model settings used in the simulations.

A synthetic experiment is conducted to compare two methods (i.e., spin-up
method and Monte Carlo method) in quantifying the UIC. Using the spin-up method,
the first case runs a single simulation for 10 years by repeating the
preceding meteorological condition starting with IC-HfSatu (Fig. 3a), and
the percentage cutoff PC is calculated. In the second case, Gaussian noise
with a standard deviation of 3 % (determined according to the observation
error of soil moisture) is added to the IC-HfSatu to generate an ensemble
with different initial soil moisture profiles. Then we run different model
realizations (Fig. 3b). Finally, the PC and

Comparison of spin-up and Monte Carlo methods in determining warm-up
time.
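The Monte Carlo idea can be illustrated with a toy example. The sketch below does not solve Richards' equation. It assumes a hypothetical single-bucket water balance with a power-law drainage term, a constant net infiltration flux, and the ensemble standard deviation as the spread measure. The noise level (0.03) and the threshold (0.005, i.e., 0.5 %) follow the values quoted in the text; the depth, exponent and flux are illustrative assumptions.

```python
import random
import statistics

THETA_R, THETA_S = 0.078, 0.430   # assumed loam water contents (-)
KS = 24.96                        # assumed saturated conductivity (cm/d)
DEPTH = 100.0                     # hypothetical bucket depth (cm)
EXPONENT = 8.0                    # hypothetical drainage exponent
FLUX = 0.1                        # hypothetical net infiltration (cm/d)

def clamp(theta):
    """Keep the water content within its physical bounds."""
    return min(max(theta, THETA_R), THETA_S)

def step(theta, dt=1.0):
    """Advance the toy bucket model by one explicit step of dt days."""
    se = (theta - THETA_R) / (THETA_S - THETA_R)
    drainage = KS * se ** EXPONENT
    return clamp(theta + dt * (FLUX - drainage) / DEPTH)

def warm_up_time(theta_mean, noise_sd, n_members, n_days, threshold, rng):
    """Return (first day with spread below threshold, daily spreads)."""
    ensemble = [clamp(rng.gauss(theta_mean, noise_sd))
                for _ in range(n_members)]
    spreads = [statistics.stdev(ensemble)]
    for _ in range(n_days):
        ensemble = [step(theta) for theta in ensemble]
        spreads.append(statistics.stdev(ensemble))
    day = next((d for d, s in enumerate(spreads) if s < threshold), None)
    return day, spreads

day, spreads = warm_up_time(0.254, 0.03, 300, 365, 0.005, random.Random(0))
print("warm-up time (d):", day)
```

In this toy model the drier members converge more slowly than the wetter ones because drainage is strongly nonlinear in the water content, so the warm-up time is controlled by the dry tail of the initial ensemble.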

As shown in Fig. 3a, there is a visible difference between the
monthly averaged soil moisture at the beginning and the 12th month,
while the difference is much smaller for

The determination of the threshold value when the UIC is regarded to have
negligible effects on modeling has been discussed in previous studies (Ajami
et al., 2014; Lim et al., 2012; Seck et al., 2014). PC or

The Monte Carlo method is used in this part to obtain the warm-up time

Figure 4 plots

The length of warm-up time

In addition, the meteorological condition has a strong impact on the UIC. For
example, for loam, the order of

It should be noted that the

To investigate the effects of soil profile length on warm-up time, we
examine the

The value of the warm-up time

In addition,

Besides IC-HfSatu, two other common methods to prescribe initial conditions
in variably saturated flow models based on the availability of information
are also considered in this study, including a linear interpolation between
observations (at depths of 10, 80, 150, 220 and 290 cm) at the
beginning of simulation (IC-ObsInt) and a steady-state soil moisture profile
by warming up the model with a constant infiltration flux of 1 mm d

Thus, a total of five initialization methods (IC-HfSatu, IC-ObsInt,
IC-NetFlux, IC-WUP and IC-WUE) are assessed to investigate the effect of the UIC
on model state and parameter estimations within two data assimilation
frameworks (EnKF and IES). The initial realizations of soil hydraulic
parameters

Several test cases are conducted to explore the effects of initialization on
parameter estimation under various data assimilation frameworks. Case 1
investigates the effects of five initialization methods on individual
parameter estimation with EnKF and the IES, respectively. Then, we increase the
ensemble size of IC-HfSatu and IC-ObsInt to 500 (hereafter referred to as
IC-HfSatu-500 and IC-ObsInt-500) in Case 2 to demonstrate the impacts of
ensemble size. Case 3 explores the effects of the uncertainty magnitude of
the initial ensemble on the parameter estimations. Gaussian noise with a
standard deviation of 0.017 (calculated from IC-WUP) is added to both
IC-HfSatu-500 and IC-ObsInt-500 (hereafter referred to as IC-HfSatu-500-Un
and IC-ObsInt-500-Un). Furthermore, to find out the role of the initial
condition in multi-parameter inverse problems, Case 4 is conducted to
estimate

The synthetic observations used for data assimilation are generated by
running the model with the true parameters (loam) and the true initial
condition (produced by warming up the model for a sufficiently long time of 10 years). The generated observations are perturbed by Gaussian noise with a
standard deviation of 0.01. A total of 37 observations are
assimilated into the model. The observation depth is at

Case summary for parameter estimation within EnKF and IES.

Note: values that are not given use the default values. The default value of initial uncertainty for IC-ObsInt and IC-HfSatu is 0.

The results of

The results for parameter estimation (

The impacts of increased ensemble size (Case 2) and uncertainty of
initial state (Case 3) on the results of

For the data assimilation problem, the ensemble variance is increasingly
underestimated over time and iteration, which may cause the filter inbreeding
problem (Hendricks Franssen and Kinzelbach, 2008). To determine if our data
assimilation runs are affected by filter inbreeding, the temporal change of
the standard deviation of estimated

Increasing the ensemble size and model uncertainty is an efficient approach
for reducing filter inbreeding (Hendricks Franssen and Kinzelbach, 2008).
To demonstrate the impacts of ensemble size and initial uncertainty on data
assimilation results, the results of

The results of IC-HfSatu-500 and IC-ObsInt-500 with the ensemble size of 500
in Fig. 7 are similar to those of IC-HfSatu and IC-ObsInt (Fig. 6),
indicating that the improvement of the parameter estimation result is slight
when the ensemble size increases from 300 to 500. Hence, the ensemble size
of 300 is sufficient for the data assimilation problem in this study. In
contrast, the influences of adding the uncertainty to the initial state on
parameter estimation are totally different for EnKF and the IES. Compared with
the results of IC-ObsInt-500 and IC-HfSatu-500, the results of

Moreover, the parameter estimation results of IC-WUP are still superior to those of IC-HfSatu-500-Un and IC-ObsInt-500-Un even though they all have a similar computational cost, showing the promising performance of warm-up methods. The results are reasonable, since all ensemble Kalman filter methods are affected by the quality of the auto-covariance matrix and the mean value of the predicted state ensemble (Eqs. 11 and 12 for EnKF; Eqs. 14 and 15 for IES). For the WUP method, the initial condition is constructed by warming up the model with the prior parameters; thus IC-WUP contains useful information about the prior parameters, even if it is biased. Besides this, the state covariance matrix is implicitly inflated due to the introduction of the uncertain prior parameter ensemble during warming up. These two aspects ensure the robust performance of warm-up methods. In contrast, the initial state ensembles of IC-HfSatu-500-Un and IC-ObsInt-500-Un are independent of the prior parameters, which introduces additional uncertainties and makes the data assimilation results worse. Therefore, even when a larger ensemble size and an enlarged state uncertainty (covariance inflation) are used for the other methods, warm-up methods are still the optimal choice for both EnKF and IES algorithms. We also construct another case with larger parameter uncertainty to alleviate the filter inbreeding problem, and the data assimilation results for all cases are improved (not shown). These results also agree with our conclusion that WUP performs the best among the five initialization methods.

The RE results of parameter estimations (

To evaluate the effects of the UIC in the multi-parameter inverse problem, the RE results of

To examine the impact of assimilation time on parameter estimation with the IES,
Case 5 with a shorter assimilation period (60 d) and fewer
observations (i.e., six) is conducted. Figure 9 shows the RE results, which are
inferior to those in Case 4, where the simulation time is 1 year (Fig. 8b). Nevertheless, the effects of assimilation time on parameter
estimation are different for different parameters. For instance, parameter

The RE results of parameter estimations under five initialization methods with IES when the simulation time is 60 d (Case 5).

The synthetic observations in the previous section are generated by running the model with uncertainty sources that are exactly known. By conducting synthetic experiments, we can thoroughly analyze the impact of the UIC during data assimilation, with scenarios having different numbers of observations and/or unknown parameters, and more decisive conclusions can be drawn. In contrast, field observations contain additional uncertainties which are largely unknown (e.g., the calculated evapotranspiration is inaccurate in real-world cases). To examine the real-world applicability of the conclusions drawn from the synthetic cases, field data are necessary for validating our results. A field experiment is conducted at the irrigation-drainage experimental site of Wuhan University (Li et al., 2018; Fig. 10a). Meteorological data, including air temperature, relative humidity, atmospheric pressure, incident solar radiation and precipitation, are continuously monitored by an automatic weather station (LoggerNet 4.0) and can be used as the upper boundary condition after the potential evaporation on the bare soil is calculated with the Penman–Monteith equation (see Fig. 11a). A vertically inserted frequency domain reflectometry (FDR) tube was used to monitor soil moisture (Fig. 10b). The in situ soil moisture was measured every 3 d. The tube gave soil moisture measurements at depths of 10, 20 and 30 cm. From 18 April to 30 May 2017, the measurements were repeated 14 times, and 42 soil moisture data were collected (see Fig. 11b). In addition, the soil moisture at depths of 10, 20, 30, 40, 60 and 80 cm at the beginning of the simulation time is available to construct an initial profile via IC-ObsInt.

The experimental site:

To analyze the experimental data, the 1-D numerical domain is set to a depth of 2 m and discretized into 50 grid cells. The top 40 cells have a size of 2.5 cm, and the remaining 10 cells have a size of 10 cm. The upper boundary is set as an atmospheric boundary using the data shown in Fig. 11a, and the bottom boundary is set to be free drainage, since the groundwater table is much deeper than the bottom of the domain.
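The field-case grid and an IC-ObsInt-type initial profile can be sketched as follows. The observation depths are those listed for the start of the simulation (10, 20, 30, 40, 60 and 80 cm); the soil moisture values are hypothetical placeholders, and holding the profile constant above the shallowest and below the deepest observation is an assumption of this sketch.

```python
def field_grid():
    """Cell-centre depths (cm): 40 cells of 2.5 cm, then 10 cells of 10 cm."""
    sizes = [2.5] * 40 + [10.0] * 10
    centres, top = [], 0.0
    for dz in sizes:
        centres.append(top + 0.5 * dz)
        top += dz
    return centres, top

def interpolate_profile(obs_depths, obs_values, depths):
    """Piecewise-linear interpolation of observations onto cell centres.

    Values are held constant outside the observed depth range (assumption).
    """
    profile = []
    for z in depths:
        if z <= obs_depths[0]:
            profile.append(obs_values[0])
        elif z >= obs_depths[-1]:
            profile.append(obs_values[-1])
        else:
            upper = zip(obs_depths, obs_values)
            lower = zip(obs_depths[1:], obs_values[1:])
            for (z0, v0), (z1, v1) in zip(upper, lower):
                if z0 <= z <= z1:
                    profile.append(v0 + (v1 - v0) * (z - z0) / (z1 - z0))
                    break
    return profile

obs_depths = [10.0, 20.0, 30.0, 40.0, 60.0, 80.0]   # cm, from the text
obs_values = [0.30, 0.32, 0.33, 0.35, 0.36, 0.38]   # hypothetical values

centres, total_depth = field_grid()
theta0 = interpolate_profile(obs_depths, obs_values, centres)
print(len(centres), total_depth)
```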

The meteorological information and observed soil moisture over the
experimental time.

The prior parameter distributions follow the study of Li et al. (2018). The saturated soil moisture

The assimilation results with four different initialization methods
(IC-HfSatu, IC-ObsInt, IC-NetFlux and IC-WUP) are presented in this part. Since
the true hydraulic parameters at the experimental site are unknown, we
assess the estimation by comparing the predicted (using estimated
parameters) and observed soil moisture during the validation period. The
RMSE

RMSE

In order to evaluate the overall performances of the four initialization
methods, the soil moisture observations and predictions at all depths are
plotted in Fig. 12. The coefficients of determination under the four
scenarios are 0.033, 0.599, 0.083 and 0.553, and the RMSE

The comparisons between soil moisture observations and predictions (with estimated parameters from IES combined with different initialization methods) at all observation depths.
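The two goodness-of-fit measures used in this comparison can be computed as in the sketch below. The coefficient of determination is taken here as the squared Pearson correlation, which is one common definition and an assumption of this sketch; the data values are hypothetical and are not the field observations.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)

# Hypothetical soil moisture values for illustration only.
observed = [0.20, 0.30, 0.40, 0.30]
predicted = [0.25, 0.30, 0.35, 0.35]
print(rmse(observed, predicted), r_squared(observed, predicted))
```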

The study investigates the effects of the UIC on variably saturated flow
simulations subject to different soil hydraulic parameters, meteorological
conditions and soil profile lengths. Two common approaches (spin-up and
Monte Carlo methods) are applied to explore the required warm-up time

Under the atmospheric boundary condition, the soil moisture value near the upper
boundary could approach its upper and lower bounds (saturated water content
and residual water content) due to rainfall and evaporation. This
significantly reduces the UIC of soil moisture profile near the soil
surface. Our investigation shows that the coarse-textured soil results in
faster reduction of the soil moisture UIC because of fast redistribution of
water in sandy soil. Regarding the influence of boundary conditions, we find
that heavy rainfall can reduce the UIC significantly, while an initial condition
in a drier status leads to a growth of

Ideally, the initial ensemble should represent the error statistics of the initial guess for the model state during data assimilation (Evensen, 2003). Thus, effort should be invested in reducing the impact of the UIC on data assimilation. Methods which do not consider the UIC (i.e., assuming an arbitrary initial ensemble without uncertainty, as was done in some studies; e.g., Shi et al., 2015) can induce significant bias according to our data assimilation results. By constructing the initial condition using information from observations or the boundary condition (averaged flux), the data assimilation results can be improved. However, these two initialization methods are also suboptimal due to the oversimplification of the complex initial condition. Warming up the model with available meteorological data improves the data assimilation results further. Moreover, EnKF is more sensitive to the filter inbreeding problem than the IES. For EnKF, an initial condition with a larger state uncertainty performs better than one without covariance inflation, while for the IES, this inflated uncertainty cannot decrease over iterations, making the results inferior.

In this study, we only use soil moisture observations rather than the pressure head to construct the initial profile. For a homogeneous soil column, there is a one-to-one relationship between the spread of soil moisture and that of the pressure head (i.e., the UIC in terms of the pressure head can be converted from that of soil moisture). The situation is much more complex if the soil is heterogeneous, since a large number of unknown hydraulic parameters may introduce significant nonlinearity into the transformation between the head and soil moisture. For instance, the soil moisture profile is discontinuous in layered soils. The use of the pressure head instead of soil moisture as the initial condition for heterogeneous soils deserves investigation in our future work.

Our work leads to the following major conclusions:

The spin-up method and Monte Carlo method can both quantify the UIC, and they
agree well with each other after a sufficiently long simulation. A threshold
of 0.5 % for percentage cutoff PC or ensemble spread

Warm-up time varies nonlinearly with soil texture, meteorological conditions and soil profile length. Under most situations (e.g., loam with a soil profile length of less than 5 m under a non-arid climate), a 1-year warm-up time is sufficient for soil water movement modeling, but an extremely long time (exceeding 10 years) is needed to warm up the model for a long, fine-textured soil profile under an arid meteorological condition.

The IES shows better performance than EnKF in the strongly nonlinear problem and is affected less by the UIC with a long period of observations. In addition, covariance inflation of the initial condition could improve the data assimilation results for EnKF but deteriorate them for the IES.

The following procedure is recommended to initialize soil water modeling:
(1) evaluate the approximate warm-up time based on the model settings,
(2) initialize the model using the method of the WUP (if meteorological data are
available) and make sure the warm-up time is larger than the required

All the data used in this study can be requested by email to the corresponding author Yuanyuan Zha at zhayuan87@gmail.com.

YZ and JY developed the new package for soil water flow modeling based on switching the primary variable of the numerical Richards' equation; DY and YZ developed EnKF and the IES codes for data assimilation of variably saturated flow and designed and conducted the numerical cases and field data validation for this study. Seven of the co-authors made non-negligible efforts in preparing the paper.

The authors declare that they have no conflict of interest.

This work is supported by the Natural Science Foundation of China through grant nos. 51609173, 51779179 and 51861125202. The authors thank Michael Tso (Lancaster University, UK) for editing the paper. We thank the three reviewers for their constructive comments and suggestions.

This research has been supported by the National Natural Science Foundation of China (grant nos. 51609173, 51779179 and 51861125202).

This paper was edited by Bill X. Hu and reviewed by Heng Dai and two anonymous referees.