Parameter optimisation for a better representation of drought by LSMs: inverse modelling vs. sequential data assimilation

a representation LSMs: Abstract. Soil maximum available water content (Max-AWC) is a key parameter in land surface models (LSMs). However, being difﬁcult to measure, this parameter is usu-ally uncertain. This study assesses the feasibility of using a 15-year (1999–2013) time series of satellite-derived low-resolution observations of leaf area index (LAI) to estimate MaxAWC for rainfed croplands over France. LAI interannual variability is simulated using the CO 2 -responsive version of the Interactions between Soil, Biosphere and Atmosphere (ISBA) LSM for various values of MaxAWC. Optimal value is then selected by using (1) a simple inverse modelling technique, comparing simulated and observed LAI and (2) a more complex method consisting in integrating observed LAI in ISBA through a land data assimilation system (LDAS) and minimising LAI analysis increments. The evaluation of the MaxAWC estimates from both methods is done using simulated annual maximum above-ground biomass ( B ag ) and straw cereal grain yield (GY) values from the Agreste French agricultural statistics portal, for 45 administrative units presenting a high proportion of straw cereals. Signiﬁcant correlations ( p value < 0.01) between B ag and GY are found for up to 36 and 53 % of the administrative units for the inverse modelling and LDAS tuning methods, respectively. It is found that the LDAS tuning experiment gives more realistic values of MaxAWC and maximum B ag than the inverse modelling experiment. Using undisaggregated LAI observations leads to an underestimation of MaxAWC and maximum B ag in both experiments. Median annual maximum values of disaggregated LAI observations are found to correlate very well with MaxAWC.


Introduction
Extreme weather conditions markedly affect agricultural production. The interannual variability of rainfed crop yields is driven to a large extent by the climate variability. Comparing agricultural statistics to climate data shows the impact of atmospheric conditions on vegetation production. For example, lower temperature in northern Europe tends to shorten the period of crop growth. Conversely, persistent high temperatures as well as droughts in southern Europe are linked to negative anomalies of crop yields (Olesen et al., 2011). Li et al. (2010) showed that air temperature tends to influence mean crop yields at small scales (400 to 600 km), whereas rainfall drives crop yields at larger scales (50 to 300 km). Capa-Morocho et al. (2014) also showed the influence of air temperature on crop yields. They established a link between temperature anomalies related to the El Niño phenomenon and potential crop yield anomalies, obtained from reanalysis data and crop model, respectively.
In the context of climate change and of natural climate variability, there is a need to monitor the impacts of droughts on crops and water resources at continental and global scales (Quiroga et al., 2011;Van der Velde et al., 2012;Crow et al., 2012;Bastos et al., 2014). Modelling of continental surfaces into atmospheric and hydrological models has evolved in recent decades towards land surface models (LSMs) able to simulate the coupling of the water, energy and carbon cycles (Calvet et al., 1998;Krinner et al., 2005;Gibelin et al., 2006). In particular, LSMs are now able to simulate photosynthesis and plant growth. A major source of uncertainty in both LSMs and crop models is the maximum available water content of the soil (MaxAWC). This quantity represents the amount of water stored in the soil available for plant transpi-Published by Copernicus Publications on behalf of the European Geosciences Union. ration along the vegetation growing cycle (Portoghese et al., 2008;Piedallu et al., 2011). MaxAWC is constrained by soil parameters and by the plant rooting depth. In regions vulnerable to drought risk exposure, MaxAWC is a key driver of the plant response to the climate variability.
Soil characteristics influence the vegetation response to climate (Folberth et al., 2016). In a model benchmarking study (Eitzinger et al., 2004) simulated evapotranspiration, soil moisture and biomass were compared with observations. Eitzinger et al. used three crop models that differ in the representation of the available soil water content (AWC): WOFOST (WOrld Food Studies model; Van Diepen et al., 1989), CERES (Crop Environment REsource Synthesis model; Ritchie and Otter, 1985) and SWAP (Statewide Agricultural Production model; van Dam et al., 1997). They showed that a better description of rooting depth and evapotranspiration, taking account of soil type and crop type, could significantly improve these models. Tanaka et al. (2004), Portoghese et al. (2008) and Piedallu et al. (2011) have highlighted the important role of the soil characteristics (soil texture, rooting depth) on MaxAWC. Soylu et al. (2011) and Wang et al. (2012) illustrated the major impact of Max-AWC on evapotranspiration. While soil properties such as soil texture determine the soil water holding capacity (in kg m −3 ), information on rooting depth is needed to determine MaxAWC (in kg m −2 ). A better representation of Max-AWC could improve the simulated interannual variability of both water fluxes and vegetation biomass by LSMs.
The lack of in situ observations of MaxAWC to calibrate and assess LSMs limits the ability of LSMs to represent drought effects on plants. Using satellite observations and data assimilation techniques could be a solution to this problem. A list of atmospheric, oceanic and terrestrial essential climate variables (ECVs) which can be monitored at a global scale from remote sensing observations was proposed by the Global Climate Observing System (GCOS). Leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR) and soil moisture are key ECVs for land surface modelling. The use of these satellite-derived products to verify LSM simulations or to optimise key LSM parameters has been assessed by several authors (e.g. Becker-Reshef et al., 2010;Crow et al., 2012;Ferrant et al., 2014;Ford et al., 2014;Ghilain et al., 2012;Ichii et al., 2009;Kowalik et al., 2014;Szczypta et al., 2012Szczypta et al., , 2014. Data assimilation is a field of active research. Data assimilation techniques allow the integration of different observation types (e.g. in situ or satellite derived) into LSMs in order to optimally combine them with model outputs: the correction applied to the model state is called the increment and the corrected model state is the analysis. Previous works have studied the impact of assimilation of LAI observations and found that it can significantly improve the representation of vegetation growth (e.g. Albergel et al., 2010;Barbu et al., 2011Barbu et al., , 2014. The Interactions between Soil, Biosphere and Atmosphere (ISBA) LSM includes a modelling option able to simulate photosynthesis and plant growth (Calvet et al., 1998;Gibelin et al., 2006). ISBA produces consistent surface energy, water and carbon fluxes, together with key vegetation variables such as LAI and the living above-ground biomass (B ag ). Previous studies showed that this model can represent well the interannual variability of B ag over grassland and straw cereal sites in France provided MaxAWC values are tuned Canal et al., 2014). In these studies, Max-AWC for straw cereals was retrieved by maximising the correlation coefficient between simulated annual maximum B ag (B agX ) and grain yield (GY) observations. The MaxAWC values were obtained for 45 French administrative units ("départements") presenting a large proportion of rainfed straw cereals. For grasslands, dry matter yield observations were used. Significant correlations were found between the simulated B agX value of grassland and dry matter yield of grasslands for up to 90 % of the administrative units. On the other hand, only 27 % of the 45 straw cereals départements (i.e. only 12 départements) had significant correlations. A possible cause of the difficulty to simulate the interannual variability of straw cereals' GY was that the standard deviation of GY represented less than 10 % of the mean GY. This was a relatively weak signal. For grasslands a much larger value, of about 30 % of the mean dry matter yield, was observed (Canal et al., 2014).
The main purpose of this study is to estimate MaxAWC for straw cereals using reverse modelling techniques based on satellite-derived LAI observations disaggregated over separate vegetation types. Simulated and observed LAI are compared for a 15-year period (1999-2013) over the same 45 agricultural spots used in the previous studies of Canal et al. (2014). We use LAI observations instead of GY to estimate MaxAWC. The GY observations are used to verify the interannual variability of the simulated B agX . This can be considered as an indirect validation of the retrieved MaxAWC. In a first experiment, we use a simple inverse modelling technique to estimate MaxAWC together with the mass-based leaf nitrogen content, minimising a cost function based on observed and simulated LAI values. In another experiment, we use a land data assimilation system (LDAS) able to sequentially assimilate LAI observations. In this case, MaxAWC solely is retrieved by minimising the LAI analysis increments. In the following, these two experiments are referred to as inverse modelling and LDAS tuning, respectively.
The main goals of this study are to (1) assess the usefulness of integrating satellite-derived LAI observations into a LSM, (2) compare inverse modelling and LDAS tuning and (3) determine MaxAWC values.
The observation data sets are described in Sect. 2, together with the version of ISBA used in this study and the LDAS. Results obtained from both methods are presented in Sect. 3, and analysed and discussed in Sect. 4. Conclusions and prospects are given in Sect. 5. and inverse modelling only (green downward-facing triangle). The "×" symbol indicates départements where no significant correlation between biomass simulations and GY could be found.

Material and methods
The forcing and validation observations used in this study over the 1999-2013 period are described below. The location of the considered straw cereal spots is presented in Fig. 1.

Satellite LAI product
We use the GEOV1 global LAI product (Baret et al., 2013) provided in near real time (every 10 days) at a spatial resolution of 1 km × 1 km by the European Copernicus Global Land Service (http://land.copernicus.eu/global/). The GEOV1 LAI product is derived from SPOT-VGT satellite observations starting in 1999. The complete 1999-2013 LAI time series comes from SPOT-VGT and is fully homogeneous. The product is well evaluated against ground observations (see the Supplement). The GEOV1 product is a low-resolution product (1 km × 1 km). At this spatial scale, it is not possible to isolate pure straw cereal pixels and it is preferable to disaggregate the LAI (i.e. compute the LAI of each vegetation type) before integrating it into a straw cereal model. We disaggregated the GEOV1 LAI data following the method developed by Carrer et al. (2014), based on a Kalman filtering technique. This method permits separating the individual LAI of different vegetation types that co-exist in a grid pixel and then provides dynamic estimates of LAI for each type of vegetation within the pixel (Munier et al., 2017). The Kalman filter optimally combines satellite LAI data and prior information from the ECOCLIMAP land cover database (Faroux et al., 2007;Masson et al., 2003). ECOCLIMAP prescribes physiographic parameters (fractional vegetation cover, soil depth, etc.) for several vegetation types including grasslands, forests and C3 crops like straw cereals. Mean annual LAI cycles per vegetation type from ECOCLIMAP are used as a first guess to partition the GEOV1 LAI every time a new satellite observation is available.

Atmospheric forcing
The global WFDEI data set (Weedon et al., 2014) is used in this study to drive the ISBA simulations. It provides 3hourly surface atmospheric variables on a 0.5 • × 0.5 • grid: air temperature, air humidity, wind speed, atmospheric pressure, solid and liquid precipitation, incoming shortwave and longwave radiation. WFDEI is based on the ERA-Interim atmospheric reanalysis (Dee et al., 2011). It includes elevation corrections and seasonal monthly bias corrections from ground-based observations.

Agricultural GY statistics
The Agreste portal (http://agreste.agriculture.gouv.fr/) provides annual statistical surveys over France which make it possible to establish a database of yearly GY values. The GY estimates are available per crop type and per administrative unit (département). We use GY values for rainfed straw cereals such as barley, oat, rye, triticale and wheat, for the same 45 départements as in Calvet et al. (2012) and Canal et al. (2014). Calvet et al. (2012) and Canal et al. (2014) used Agreste data for the 1994-2008 and 1994-2010 periods, respectively, for both straw cereals and fodder production. We use Agreste data from 1999 to 2013, only for straw cereal GY.

The ISBA model
The ISBA LSM is included in the SURFEX (SURFace EXternalisée) modelling platform . The newest version of SURFEX (version 8) is used in this study with the "NIT" biomass option for ISBA. The "C3 crop" plant functional type is considered.
ISBA simulates the diurnal course of heat, water and CO 2 fluxes, including gross primary production (GPP). The set of ISBA options we use permits the simulation of LAI and B ag on a daily basis (Calvet et al., 1998(Calvet et al., , 2008. The model includes a soil moisture stress function (F s ) applied to photosynthesis key parameters. For low vegetation such as grass or crops, the parameters related to soil moisture stress (Calvet, 2000) are the mesophyll conductance (g m ) and the maximum leaf-to-air saturation deficit (D max ). Values of g m and D max for straw cereals in well-watered conditions are given in Ta-ble S1 (in the Supplement), together with other model parameters. It must be noted that this value of g m was derived using inverse modelling by Canal et al. (2014) for the same straw cereal sites as those considered in this study. In moderately dry conditions, g m and D max are affected by F s in such a way as to increase the intrinsic water use efficiency (WUE). This corresponds to a drought-avoiding behaviour (Calvet, 2000). The model is also able to represent a drought-tolerant behaviour (stable or decreasing WUE) and Calvet et al. (2012) showed that straw cereals tend to behave as drought avoiding, while grasslands tend to behave as drought tolerant.
The above-ground biomass (B ag ) consists of two components within ISBA: the structural biomass and the active biomass. The latter corresponds to the photosynthetically active leaves and is related to B ag by a nitrogen dilution allometric logarithmic law (Calvet and Soussana, 2001). The mass-based leaf nitrogen concentration (N L ) is a parameter of the model affecting the specific leaf area (SLA), which is the ratio of LAI to leaf biomass (in m 2 kg −1 ). The SLA depends on N L and on plasticity parameters (Gibelin et al., 2006). The N L parameter is key for LAI simulations and has to be included in any inverse modelling experiment involving LAI.
The net assimilation of CO 2 by the leaves (A n ) is driven by environmental factors such as the atmospheric CO 2 concentration, air humidity, the incoming solar radiation and the leaf surface temperature. To upscale the net assimilation of CO 2 and transpiration at the vegetation level, a multilayer radiative transfer scheme is used . The daily canopy-scale accumulated value of A n serves as an input for the vegetation growth and mortality sub-models, and the phenology is completely driven by A n (no growing degree-day parameterisation is used).
The plant transpiration flux is used to calculate the soil water budget through the root water uptake. The soil hydrology scheme used in this study is referred to as "FR-2L" in SUR-FEX. It represents two soil layers: a thin surface layer with a uniform depth of 1 cm and a root-zone layer of depth Z r . The latter is used as a surrogate for MaxAWC in the calibration process. Soil texture parameters such as the gravimetric fraction of sand and clay are extracted from the Harmonized World Soil Database (Nachtergaele et al., 2012). Physical soil parameters such as volumetric soil moisture at field capacity (θ Fc ) and wilting point (θ Wilt ) are calculated thanks to pedotransfer functions based on soil texture. The MaxAWC parameter is given by (1) Parameters are defined in the Appendix and model parameter values are summarised in Table S1 in the Supplement.

Land data assimilation system
We used the LDAS described in Barbu et al. (2011Barbu et al. ( , 2014. It consists of a sequential data assimilation system operated offline (uncoupled with the atmosphere). The assimilation is based on a simplified extended Kalman filter (SEKF), able to integrate observations such as LAI and soil moisture in the ISBA model. In this study, only LAI observations are assimilated and the LDAS produces analysed LAI values.
The key update equation of the SEKF is where x is the analysis increment, x is a control vector of one dimension representing LAI values propagated by the ISBA LSM, and y o is the observation vector representing the GEOV1 LAI observations. The t superscript stands for time. The initial time is denoted by the 0 superscript. The "a", "f" and "o" subscripts denote analysis, forecast and observation, respectively. The y t term of Eq.
(2) represents the model value at the analysis time: where h is the observation operator and The Kalman gain is derived from the background error covariance matrix B and from the observation error covariance matrix R. Matrix H that appears in Eq. (4) represents the Jacobian of potentially nonlinear h function: which gives the following Jacobian matrix: The initial state at the beginning of an assimilation window is analysed via the information provided by an observation at the end of the assimilation window . In this approach, the LAI increments (Eq. 2) are applied at the end of 1-day assimilation intervals. The elements of the Jacobian matrix are estimated by finite differences, individually perturbing each components of the control vector x by a small amount δx: The background error covariance matrix B is assumed to be constant at the start of each analysis cycle. The covariance matrices B and R are assumed to be diagonal. In the simplified version of the EKF used in this study, namely SEKF, the B matrix does not evolve with time. The standard deviation of errors of GEOV1 LAI is assumed to be 20 % of GEOV1 LAI. The same assumption is made for the standard deviation of errors of the modelled LAI (20 % of modelled LAI) for modelled LAI values higher than 2 m 2 m −2 . For modelled LAI Hydrol. Earth Syst. Sci., 21, 4861-4878, 2017 www.hydrol-earth-syst-sci.net/21/4861/2017/ values lower than 2 m 2 m −2 , a constant error of 0.4 m 2 m −2 is assumed. This assumption is based on option 3 presented in Barbu et al. (2011). They showed that this option gives the best simulated LAI over an instrumented grassland site in southwestern France.

Upscaling disaggregated LAI observations to département level
Each agricultural spot shown in Fig. 1 corresponds to the area within a département presenting the highest fraction of straw cereals. These 45 locations were chosen by Calvet et al. (2012) on a 8 km × 8 km grid using fractions of vegetation types derived from ECOCLIMAP . Disaggregated LAI observations have a spatial resolution of 1 km × 1 km. This represents a small area compared to the size of a département (from 2000 to 10 000 km 2 ). Local values of the straw cereal LAI may not be representative of the straw cereal production at the département level described by Agreste. Preliminary tests showed that averaging the disaggregated LAI on the same 8 km × 8 km grid cell used by Calvet et al. (2012) was not sufficient to represent the interannual variability of the GY observations at the département level. Therefore, an analysis of the consistency of the two observation data sets (in situ GY and disaggregated satellite LAI) is performed. The average maximum annual value of the disaggregated GEOV1 LAI observation (LAIo max ) is calculated for various grid cell sizes for this task. In practice, the LAIo max value corresponds to the mean LAI values above a given fraction of the observed maximum annual LAI (LAI max ). We consider five grid cell sizes of 5 km × 5 km, 15 km × 15 km, 25 km × 25 km, 35 km × 35 km and 45 km × 45 km (from 25 to 2025 km 2 ). The five LAIo max time series are compared with the GY time series for each département. The area size corresponding to the largest number of départements presenting a significant correlation between LAIo max and GY is selected.

Model calibration/validation
The feasibility of retrieving MaxAWC from LAI satellite data is explored using two different approaches: inverse modelling and LDAS tuning. For the two approaches, this calibration step is followed by a validation step aiming at demonstrating the relevance of the retrieved MaxAWC values and the added value of the retrieval technique. The satellite LAI observations are available year-round but the sensitivity of straw cereal LAI to MaxAWC may change greatly from one period of the year to another. Prior to calibrating the model, a sensitivity study of the time window used for the MaxAWC retrieval is performed. Three periods are considered: (1) growing period (from 1 March to the date of the observed LAI max ); (2) peak LAI (period for which observed LAI is higher than 50 % of observed LAI max ); (3) senescence (from the date when observed LAI max is reached to 31 July). The ISBA simulations are stopped on 31 July as this date corresponds to the maximum harvest date at most locations.
The validation of the calibrated model consists of comparing the interannual variability of the simulated maximum annual above-ground biomass to the interannual variability of the GY observations. The 1999-2013 period is considered. In drought conditions, modelled B ag can rise to a maximum value and then drop rapidly. Therefore the peak B ag can be dependent on modelling uncertainties and on uncertainties in the atmospheric forcing. In order to limit the impact of model errors, an average value of the simulated B agX is used instead of an instantaneous value. This average value is calculated using all the B ag values above a threshold corresponding to 90 % of the maximum annual B ag . It was checked that this threshold value permits the maximisation of the number of départements presenting a significant correlation with GY. Then, scaled anomalies of the average simulated B agX are compared with scaled anomalies of the GY observations, and the R 2 score is calculated. Scaled anomalies (A s ) are calculated using the mean and standard deviation of the two variables over the 1999-2013 period: The interannual variability of the modelled LAI max is assessed using the coefficient of variation (CV). The CV is given in % and is calculated according the following formula with σ the standard deviation and µ the mean: The MaxAWC retrieval is considered to be successful if the Pearson correlation is significant at 1 % level (F -test p value < 0.01).  77,88,99,110,121,132,154,176,198 and 220 mm. Since N L is a key parameter for LAI simulations, it has to be retrieved together with MaxAWC and this simulation process is repeated 5 times, for the following N L values: 1.05, 1.30, 1.55, 2.05 and 2.55 %. The LAI root mean squared error (RMSE) over the period between the occurrence of the observed LAI max and 31 July for the 15 years is used to select the best simulation. The MaxAWC used in the simulation with the lowest RMSE is selected as the optimal one: where LAI is for simulated LAI, LAIo is for observed LAI and n is the length of the data vector.

LDAS tuning
The LAI observations are integrated into ISBA by the LDAS. The LDAS produces analysed values of LAI and B ag . Therefore, there is no need to estimate N L and the only degree of freedom in this case is the value of MaxAWC. Thirteen analyses are made, corresponding to the same MaxAWC values used in the inverse modelling experiment (Sect. 2.8.1). The median analysis increment (Eq. 2) can present positive or negative values. Small corrections provided by the LDAS indicate that simulation outputs are close to observations and that the dynamics is well represented. The value closest to zero indicates the best simulation and the corresponding MaxAWC value is considered as the retrieved Max-AWC.

Disaggregated satellite LAI vs. grain yield observations
In a first step before integrating the disaggregated LAI observations into the ISBA model, we checked the consistency of the interannual variability of LAIo max (Sect. 2.6) with that of the observed GY from Agreste. We investigated several values of the size of the area around each site coordinate to calculate the average of LAIo max , from 25 to 2025 km 2 . Individual LAIo max values at a spatial resolution of 1 km × 1 km correspond to the mean of LAI values above the LAIo max threshold (Sect. 2.6). Several LAIo max threshold values ranging from 40 to 95 % of LAI max were investigated together with the grid cell size (see Fig. S1 in Supplement). A LAIo max threshold value of 50 % and a grid cell size of 35 km × 35 km (1225 km 2 ) were selected. In this configuration, a significant temporal correlation (F -test p value < 0.01) between the average LAIo max and the observed GY is obtained for 31 départements. The latter are shown in Fig. 1 (empty blue circles). The 45 grid cells of 35 km × 35 km are further used to calculate average 10-day LAI observations to be integrated in the ISBA model through either inverse modelling or LDAS tuning. The fraction of straw cereals derived from ECOCLIMAP for these grid cells ranges from 15 to 100 %, with a median value of 68 % (see Table S2 in Supplement). The temporal correlation between LAIo max and GY is illustrated in Fig. 2. The two 15-year time series correspond to average annual values of LAIo max and GY across the 31 départements where LAIo max is found to correlate with GY. The two time series present a very good correlation, with R 2 = 0.84. This shows that the disaggregated satellitederived LAI is able to capture the interannual variability of GY. Figure 3 presents the impact of ISBA parameters on the simulated annual maximum LAI and on its interannual variability. Two key parameters are considered: MaxAWC and N L . The same parameter values are applied to all 45 départements, and mean modelled LAI max are used to calculate CV values (Eq. 10). The CV values are shown in Fig. 3 as a function of these parameters, together with LAI max .

Sensitivity study
It appears that the interannual variability of the modelled LAI max is governed by MaxAWC. The CV values of more than 12 % are derived from the ISBA simulations at low values of MaxAWC (e.g. 50 mm). On the other hand, high Max-AWC values (> 200 mm) correspond to limited interannual variability of LAI max (CV < 4 %), in relation to a lower sensitivity of plants to drought.
The N L parameter has a limited impact on CV and its impact depends on MaxAWC. For large (small) MaxAWC values above (below) the standard average value of 132 mm used in ISBA, the largest values of N L tend to cause a decrease (increase) of CV. In the inverse modelling experiment, N L mainly impacts the average simulated LAI max value. In the ISBA model, N L is linearly related to the specific leaf area (SLA) and large N L values correspond to large SLA values, i.e. larger LAI values for a given simulated leaf biomass (Gibelin et al., 2006). However, Fig. 3 shows that Max-AWC has a more pronounced impact than N L on LAI max . Increasing MaxAWC from 50 to 250 mm triggers a rise in LAI max , from about 2 m 2 m −2 at low N L values to 3 m 2 m −2 Hydrol. Earth Syst. Sci., 21, 4861-4878, 2017 www.hydrol-earth-syst-sci.net/21/4861/2017/ at high N L values. Switching N L from low to high values at a given MaxAWC level also raises LAI max , but not more than 2 m 2 m −2 . This result confirms that MaxAWC is the key parameter to be retrieved in order to improve the representation of straw cereal biomass, for both inverse modelling and LDAS tuning experiments. The impact of MaxAWC on the cost functions (LAI RMSE and median LAI analysis increments, Eqs. 11 and 2, respectively) may depend on the LAI observation period. We tested the two retrieval methods for three different optimisation periods: start of growing period, peak LAI and senescence (see Sect. 2.7). This is illustrated in Fig. 4, which shows the average cost function across all 45 départements. In both experiments, MaxAWC has little impact on the cost function during the start of the growing season. The most pronounced response of both LAI RMSE and analysis increments is observed during the senescence. For this period of the growing cycle, both cost functions present a minimum value at MaxAWC = 110 mm. Also, the largest RMSE and increment values are observed during senescence, indicating that the processes at stake during this period are more difficult to simulate. For straw cereals, senescence is related to soil moisture stress (Cabelguenne and Debaeke, 1998) and during this period the value of MaxAWC has a marked impact on the representation of the effect of drought by the model. The peak LAI period is less favourable to the integration of LAI observations into the model, with a reduced accuracy on the retrieved MaxAWC.

Outcomes of the optimisation
A direct result of the optimisation procedure is the reduction of the cost function value. This is illustrated in Fig. 5 for all 45 départements. Figure 5 presents the impact of the optimisation on the cost functions of inverse modelling and LDAS tuning during the senescence period: LAI RMSE and LDAS LAI increments, respectively. départements of (a) inverse modelling and (b) LDAS tuning experiments for three different optimisation periods: (red) start of growing period (from 1 March to LAI max date); (green) peak LAI (dates for which LAI > 0.5 LAI max ); (black) senescence (from LAI max date to 31 July). Inverse modelling is based on the minimisation of LAI RMSE (Eq. 11). LDAS tuning is based on the minimisation of the median LAI analysis increment (Eq. 2).
The RMSE values are systematically reduced by the inverse modelling experiment. For all 45 départements, the median value of the LAI RMSE drops from 1.6 to 1.2 m 2 m −2 . While LAI RMSE exceeds 1.5 m 2 m −2 for 29 départements before the optimisation, this RMSE value is exceeded for only three départements after the optimisation. It must be noted that this is a much better result than the RMSE obtained in Fig. 4 (1.6 m 2 m −2 ) for the cost function including all 45 départements, with a MaxAWC value of 110 mm. This shows the impact of the spatial variability of MaxAWC.
For LDAS tuning, most of the median daily increment values are sharply reduced: while 17 values are larger (smaller) than 0.2 (−0.2) m 2 m −2 before the optimisation, all the values range from −0.1 to 0.1 m 2 m −2 after the optimisation. The spatial median value of the LDAS LAI increments varies from −0.03 m 2 m −2 for original LDAS to −0.01 m 2 m −2 for LDAS tuning, for all 45 départements. Table 1 summarises results showing the impact of the optimisation on indicators such as the number of départements presenting a significant correlation of B agX with GY and the median value of the cost functions.  1.2 ± 0.2 1.2 ± 0.2 1.3 ± 0.2 1.4 ± 0.2 1.5 ± 0.2 133 ± 46 mm for significant départements) is close to the standard value used in ISBA (132 ± 2 mm) but the standard deviation is much larger. This shows that LDAS tuning is able to generate spatial variability in MaxAWC values. A similar degree of variability is obtained by inverse modelling, but the retrieved MaxAWC presents much lower values for all 45 départements: 111 ± 44 mm. On the other hand, a much larger value of 153 ± 40 mm is found for the 16 validated départements. The retrieved N L (1.05 ± 0.20) is smaller than the default value of 1.30 %. The role of N L in the optimisation is discussed in Sect. 4.1. Figure 6 shows the impact of optimising MaxAWC on the mean annual LAI cycle, with respect to the observed annual LAI cycle over the 45 départements. Inverse modelling tends to produce a smaller LAI max median value (3.59 m 2 m −2 for all 45 départements) than basic ISBA simulations or LDAS simulations (3.84 and 3.98 m 2 m −2 , respectively). Inverse modelling tends to reduce simulated LAI in May and June, while the LDAS simulations (either original LDAS or LDAS tuning) are much closer to the observations. The two optimisation methods succeed in reducing the LAI RMSE of the basic ISBA simulations (1.6 m 2 m −2 for all 45 départements). With optimised MaxAWC, the tuned LDAS annual mean LAI cycle is closer to the observations than LAI resulting from inverse modelling, with LAI RMSE equal to 1.1 m 2 m −2 for LDAS tuning, as against 1.2 m 2 m −2 for inverse modelling.

Validation
Agricultural GY statistics (Sect. 2.3) are used for validation. The optimisation is considered as successful in départements where the correlation between yearly time series of B agX and GY is significant (p value < 0.01). Table 1 shows that even without tuning MaxAWC, the integration of LAI in ISBA by the original LDAS permits an increase in the number of départements where p value < 0.01 from 18 in basic ISBA simulations to 21. LDAS tuning further increases this number to  Fig. 7 shows that parameter tuning does not significantly improve R 2 values. Basic ISBA and original LDAS simulations present R 2 values of 0.65 and 0.80, as against 0.65 and 0.82 after inverse modelling and LDAS tuning, respectively. Figure 8 further shows that the interannual variability of B agX is markedly better represented using LDAS tuning. The scaled modelled B agX and the scaled GY observations averaged over 45 départements present a R 2 value of 0.82, against 0.65 for inverse modelling. Considering only the successfully validated départements, more similar R 2 values are observed: 0.88 and 0.80, respectively. Figure 9 presents the spatial correlation between the scaled B agX and the scaled GY observations averaged over the 15-year period considered in this study. Considering the 45 départements, R 2 = 0.61 for LDAS tuning and R 2 = 0.58 for inverse modelling. Again, LDAS tuning supersedes inverse modelling, including when the comparison is limited to successfully validated départements, with R 2 values of 0.74 and 0.63, respectively.
It must be noted that all the correlations presented in Figs. 8 and 9 are significant, with all p values smaller than 0.001.
In addition to GY data, we made an attempt to use the satellite-derived GLEAM evapotranspiration product (Mi- ralles et al., 2011) but very poor correlations were obtained for most départements (the median R 2 value was less than 0.06). On the other hand, good correlations were found for photosynthesis using the GPP FLUXNET-MTE product described in Jung et al. (2009). With respect to basic ISBA simulations, GPP RMSE was nearly systematically improved by the original LDAS simulations, and LDAS tuning drastically reduced the largest RMSE values, observed in southwestern France (see Figs. S2 and S3 in the Supplement).

Impact of the optimisation technique on MaxAWC estimates
Differences in validation results can be caused by uncertainties in Agreste GY observations or by the difficulty in upscaling the observations and the simulations (Sect. 2.6).
In order to limit this effect, we further compared the Max-AWC estimates and the simulated vegetation variables for a subset of the départements corresponding to the 15 départements which are validated for both inverse modelling and LDAS tuning. The similarity in MaxAWC estimates and the contrasting simulated LAI values are illustrated in Fig. 10. Analysed LAI from LDAS tuning is closer to the LAI observations than the simulated LAI resulting from inverse modelling. The Max-AWC estimates are slightly smaller for inverse modelling and correlate very well with the MaxAWC estimates from LDAS tuning (R 2 = 0.81). The latter result is also valid when all 45 départements are considered, with R 2 = 0.72. The LDAS approach leads to more realistic simulations of LAI (see Fig. 10) and slightly improves Bag simulations (Figs. 7-10). In addition, N L does not need to be determined because LAI is directly constrained by the LAI observations. Minimising analysis increments to estimate MaxAWC is a much more complex approach than inverse modelling. Overall, MaxAWC estimates from the two methods are relatively consistent (see Sect. 3.5) but inverse modelling tends to produce smaller values. On the other hand, GY observations show that the simulated vegetation variables are more realistically simulated after LDAS tuning than after inverse modelling. This can be explained by a better capability of the LDAS to use the observations to drive the model trajectory: the sequential assimilation of LAI is able to constrain the simulated LAI values. It can be shown that the impact of tuning N L in the inverse modelling method can be significant. Table 2 presents Max-AWC and LAI RMSE values obtained from inverse modelling when only one parameter, MaxAWC, is optimised. Results are shown for five values of N L ranging from 1.05 to 2.55 %. The number of validated départements drops when N L increases, from 16 at N L = 1.05 % to only 3 at N L = 2.55 %. At the same time, the MaxAWC estimates tend to present smaller values, down to 88 ± 40 mm at N L = 2.55 %. This result can be explained by the fact that larger values of either MaxAWC or N L tend to increase LAI max (Fig. 3).
Improving the simulation of vegetation variables has a positive impact on the quality of simulated hydrological variables such as evapotranspiration and soil moisture (Szczypta et al., 2012). Therefore, the larger MaxAWC values obtained from LDAS tuning (129 ± 44 mm) are likely to be more realistic than those obtained from inverse modelling (111 ± 44 mm).
4.2 Are MaxAWC estimates and simulated peak B ag realistic?
Independent MaxAWC estimates confirm that the Max-AWC values obtained from LDAS tuning (129 ± 44 mm) are more realistic than those obtained from inverse modelling (111 ± 44 mm). On the other hand, the two techniques give similar median B agX values (Table 1). In order to verify the MaxAWC values derived from LAI observations, we extracted MaxAWC values from a map produced by l'Institut National de la Recherche Agronomique (INRA) at a spatial resolution of 1 km × 1 km. This map was established using pedotransfer functions based on soil physical properties information such as soil texture, soil depth, bulk density and organic matter (Al Majou et al., 2008). A given local MaxAWC value corresponds to a soil typological unit (STU). The 1 km × 1 km soil mapping units may contain several STUs and the STU fraction is known. We computed weighted-average MaxAWC values for every 35 × 35 km grid cell. The resulting INRA MaxAWC values of the 45 départements present a median value of 151 mm and a standard deviation of 54 mm.
The median peak B ag values are about 1.2 kg m −2 in all simulations. This is consistent with total maximum aboveground biomass values for cereals, which range between 1.1 and 1.7 kg m −2 (e.g. Loubet et al., 2011). Because B agX corresponds to the mean B ag above 90 % of the peak B ag value (Sect. 2.7), median B agX values are smaller than peak B ag and do not exceed 1 kg m −2 (Table 1).

Are LAI satellite data suitable for the optimisation of MaxAWC?
Our results show that using disaggregated LAI observations is key. The optimisation methods used in this study are based on disaggregated LAI satellite data and the quality of the results depends on the reliability of the observation data set. The MaxAWC parameter is a crucial parameter for the senescence period, between LAI max and harvesting (Fig. 4). Because LAI max is related to a large extent to MaxAWC (Fig. 3d), an underestimation of observed maximum LAI values would force the retrieval method to underestimate Max-AWC. Figure 11 compares the mean of annual maximum values of raw LAI and disaggregated LAI for the 45 départements and for 1225 km 2 grid cells. Using disaggregated Hydrol. Earth Syst. Sci., 21, 4861-4878, 2017 www.hydrol-earth-syst-sci.net/21/4861/2017/ Figure 11. Comparison of mean annual LAI max of the raw GEOV1 product and of the disaggregated GEOV1 product.
LAI increases the observed value of maximum LAI by up to 40 % with respect to raw LAI. The mean difference is 0.43 m 2 m −2 . This mitigates a marked underestimation of the MaxAWC estimates. As shown in Table S3, the MaxAWC values obtained from LDAS tuning (110 ± 38 mm) and from inverse modelling (83 ± 30 mm) are much lower (15 to 25 %) than those retrieved using disaggregated LAI observations. Moreover, the number of validated départements using GY observations presenting significant positive correlation is reduced: only 10 and 18 for inverse modelling and LDAS tuning, as against 16 and 24 with disaggregated LAI, respectively. Also, peak B ag values (for all 45 départements) are smaller: 1.01 and 1.08 kg m −2 , against 1.14 and 1.17 kg m −2 with disaggregated LAI, respectively.
4.4 Can model simulations predict the relative gain or loss of agricultural production during extreme years?
The continuous constraint on the model applied by the LDAS on simulated vegetation variables allows the indirect representation of adverse effects. This is illustrated in Fig. 7: the negative anomaly of 2007 is much better represented by the LDAS than by simple ISBA simulations. The observed disaggregated LAI and GY in Fig. 2 show that 2004,2008,2009 and 2012 were favourable years for straw cereal production, while 2001, 2003, 2007 and 2011 were unfavourable years. Unfavourable conditions for straw cereal production were caused by droughts, by excess of water, or by a deficit in solar radiation. For example, the 2000-2001 winter was characterised by extensive floods and by a deficit of solar irradiance until the end of the spring. These climate events markedly affected plant growth especially in northern France (Agreste Bilan, 2001). The 2003 and 2011 years were particularly warm, with a marked precipitation deficit at springtime (Agreste Bilan, 2003. Concerning 2007, although climate conditions were favourable to plant growth during spring, extremely wet conditions occurred at the end of the growing season. This triggered accessibility issues and disease development (Agreste Bilan, 2007). These processes limiting biomass production in response to an excess of water are not represented in the ISBA model.

Can observed LAI characteristics be used to estimate MaxAWC?
We show that satellite-derived LAI observations have potential to map MaxAWC very simply. We investigated the use of a simple statistical analysis of the disaggregated LAI observations to estimate MaxAWC. Figure 3 shows that there is a marked relationship between MaxAWC and the simulated LAI max and LAI CV. To what extent are these relationships observable?
In order to answer this question, we used the LDAS tuning MaxAWC estimates as a reference data set. We compared the observed median annual maximum LAI and LAI CV with MaxAWC. No significant correlation could be shown for LAI CV, with R 2 smaller than 0.2. On the other hand, a very good correlation (R 2 = 0.70 for all 45 départements) was found for median annual maximum LAI (Fig. 12). Using this simple linear regression model, MaxAWC can be estimated with a RMSE of 28.7 mm. A very similar result is obtained considering only the 24 validated départements for LDAS tuning. The modelled MaxAWC values are given in Table S2 (see Supplement).

Conclusions
Satellite data are used to optimise a key parameter of the ISBA land surface model for straw cereals in France: the maximum available soil water content, MaxAWC. Two optimisation methods are used. Inverse modelling consists in minimising the LAI RMSE and LDAS tuning consists in minimising LAI analyses increments. The added value of the optimisation is evaluated using simulated above-ground biomass, through its correlation with in situ grain yield observations.
It is found that disaggregated LAI observations during the senescence are more informative than raw LAI observations and than LAI observations during the growing phase. The best results are obtained using LDAS tuning: the simulated above-ground biomass correlates better with grain yield observations, and the retrieved MaxAWC values are more realistic. It is shown that LDAS simulations can predict the relative gain or loss of agricultural production during extreme years, much better than model simulations even after parameter optimisation.
Finally, it is shown that median annual maximum disaggregated LAI observations correlate with MaxAWC estimates over France. This simple metric derived from LAI observations could be used to map MaxAWC. More research is needed to investigate to what extent this conclusion holds for other regions of the world and other vegetation types.

Appendix A: Nomenclature
List of symbols A s,BagX Scaled anomaly of B agX of a given year (-) A s,GY Scaled anomaly of GY of a given year (z score) (-) AWC Simulated available soil water content (kg m −2 ) B ag Simulated living above-ground biomass (kg of dry matter m −2 ) B agX Maximum of simulated living above-ground biomass (kg of dry matter m −2 ) CV Coefficient of variation (%) D max Maximum leaf-to-air saturation deficit (kg kg −1 ) F s Soil moisture stress function g m Mesophyll conductance in well-watered conditions (mm s −1 ) GY Annual grain yields of crops (kg m −2 ) LAI Leaf area index (m 2 m −2 ) LDAS Land data assimilation system LSM Land surface model MaxAWC Maximum available soil water content (mm or kg m −2 ) NIT Photosynthesis-driven plant growth version of ISBA-A-gs N L Leaf nitrogen concentration (% of leaf dry mass) SLA Specific leaf area (m 2 kg −1 ) WUE Leaf-level water use efficiency (ratio of net assimilation of CO 2 to leaf transpiration) Z r Depth of the root zone layer (m) Greek symbols ρ Water density (kg m −3 ) θ Volumetric soil water content (m 3 m −3 ) θ Fc Volumetric soil water content at field capacity (m 3 m −3 ) θ Wilt Volumetric soil water content at wilting point (m 3 m −3 )