Modelling evapotranspiration during precipitation deficits: identifying critical processes in a land surface model

Surface fluxes from land surface models (LSMs) have traditionally been evaluated against monthly, seasonal or annual mean states. The limited ability of LSMs to reproduce observed evaporative fluxes under water-stressed conditions has been previously noted, but very few studies have systematically evaluated these models during rainfall deficits. We evaluated latent heat fluxes simulated by the Community Atmosphere Biosphere Land Exchange (CABLE) LSM across 20 flux tower sites at sub-annual to interannual timescales, in particular focusing on model performance during seasonal-scale rainfall deficits. The importance of key model processes in capturing the latent heat flux was explored by employing alternative representations of hydrology, leaf area index (LAI), soil properties and stomatal conductance. We found that the representation of hydrological processes was critical for capturing observed declines in latent heat during rainfall deficits. By contrast, the effects of soil properties, LAI and stomatal conductance were highly site-specific. Whilst the standard model performs reasonably well at annual scales as measured by common metrics, it grossly underestimates latent heat during rainfall deficits. A new version of CABLE, with a more physically consistent representation of hydrology, captures the variation in the latent heat flux during seasonal-scale rainfall deficits better than earlier versions, but remaining biases point to future research needs. Our results highlight the importance of evaluating LSMs under water-stressed conditions and across multiple plant functional types and climate regimes.


Introduction
Droughts are expected to increase in frequency and intensity (Allen et al., 2010; Trenberth et al., 2014) in some regions due to the effects of climate change (Collins et al., 2013). This would have profound implications for affected regions and their socio-economic systems. Land surface models (LSMs) are a key tool for understanding the evolution of historical droughts and predicting future water scarcity when coupled to global climate models (Dai, 2012; Prudhomme et al., 2014). LSMs have been extensively evaluated for simulated water, energy and carbon fluxes, typically at seasonal to inter-annual timescales (Abramowitz et al., 2007; Best et al., 2015; Blyth et al., 2011; Dirmeyer, 2011; Zhou et al., 2012), and have been found to perform reasonably well under well-watered conditions (e.g. Best et al., 2015). However, recent studies have indicated that the ability of current LSMs to simulate these fluxes during water-stressed conditions is limited (De Kauwe et al., 2015a; Powell et al., 2013). LSMs have been shown to poorly characterise the magnitude, duration and frequency of droughts when evaluated against site- and catchment-scale observations of latent heat ($Q_E$) and streamflow (De Kauwe et al., 2015a; Li et al., 2012; Powell et al., 2013; Prudhomme et al., 2011; Tallaksen and Stahl, 2014). Similarly, LSM projections of future drought occurrence have been shown to be highly model dependent, with greater uncertainty in future projections arising from differences between LSMs than from the climate model projections used to force them (Prudhomme et al., 2014). Furthermore, changes in soil moisture in the future are also linked to changes in the probabilities and intensities of other extremes including heatwaves (Seneviratne et al., 2010). Clearly, a better understanding of limitations in LSMs under more extreme conditions, and ultimately improved performance by these models, is necessary for improving the future projections of drought and other land-surface-influenced extremes by climate models.
We investigate the performance of the Australian Community Atmosphere-Biosphere Land Exchange (CABLE) model in simulating observed declines in latent heat during rainfall deficits. CABLE is the LSM used within the Australian Community Climate and Earth Systems Simulator (ACCESS; Bi et al., 2013), a global climate model which has participated in the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC, 2013) and is used for numerical weather prediction research in Australia (Puri et al., 2013). In common with other LSMs (Prudhomme et al., 2011; Tallaksen and Stahl, 2014), CABLE has been found to poorly simulate the evolution of droughts, systematically underestimating site-scale $Q_E$ during seasonal-scale droughts (De Kauwe et al., 2015a; Li et al., 2012). There could be many reasons for this systematic error. For example, unrealistic representation of plant drought responses has been identified as a major limitation in LSM simulations of drought (Egea et al., 2011; Powell et al., 2013), including CABLE (De Kauwe et al., 2015a). Recent studies have revised vegetation drought responses in CABLE but not fully resolved existing model biases (De Kauwe et al., 2015a; Li et al., 2012). In this paper, we examine CABLE in the context of another major area of uncertainty: how hydrological processes are parameterised and how associated parameters are selected. A better representation of hydrological processes, particularly soil moisture, has been identified as necessary for improving LSM simulations of drought (Tallaksen and Stahl, 2014), but this has not been widely explored. The parameterisations of soil hydrology and stomatal conductance ($g_s$) have recently been revised in CABLE and shown to improve seasonal- to annual-scale simulations of $Q_E$ (De Kauwe et al., 2015b; Decker, 2015). We explore whether changes to these hydrological processes can also improve simulations of $Q_E$ during dry periods and guide development of more realistic drought mechanisms in LSMs.
We quantify the uncertainty arising from key model parameters: soil properties and leaf area index (LAI) inputs. CABLE-simulated $Q_E$ has been shown to be sensitive to these parameters, but they remain uncertain at both site and large scales (De Kauwe et al., 2011; Kala et al., 2014; Koster et al., 2009; Zhang et al., 2013). Quantifying the sensitivity of CABLE to LAI and soil properties is useful for separating parameter uncertainties from inadequate model parameterisations. Where the LSM cannot capture the observations, despite variations in LAI and soil parameters, systematic errors in the model's representation of physical processes are probable (assuming negligible errors in flux tower data used to drive and evaluate the model). While other parameters, including additional vegetation characteristics such as rooting depth (Li et al., 2012), are also potentially important, soil properties and leaf area index can be constrained from readily available global-scale data sets widely used in large-scale LSM applications.
We therefore explore CABLE performance at 20 flux tower sites distributed globally. We contrast model behaviour at annual to sub-seasonal scales to explore uncertainties in hydrological processes and parameters under conditions ranging from wet to dry. We concentrate on the ability of the model to capture the onset of drought in a drying phase as a prerequisite for capturing the magnitude and intensity of droughts.

Flux tower sites
The flux tower data were collated as part of the Protocol for the Analysis of Land Surface models (PALS; Abramowitz, 2012) Land sUrface Model Benchmarking Evaluation pRoject (PLUMBER; Best et al., 2015), originally obtained through the Fluxnet LaThuile Free Fair-Use subset (fluxnet.ornl.gov). The PLUMBER sites represent a broad range of vegetation and climate types, and were also selected to maximise the length of measurement records (Best et al., 2015). Here, we focus on the results for six sites with a pronounced period of low precipitation (Fig. 1 and Table S1 in the Supplement), each representing a different climate and vegetation type, but provide results for all study sites as supplementary information (Figs. S1-S4 in the Supplement).
The 20 flux tower sites provide meteorological and flux measurements at 30 min resolution. The observed meteorological data (precipitation, short- and long-wave radiation, surface air pressure, air temperature, specific humidity and wind speed) were used to drive CABLE simulations. Observed $Q_E$ was then used to evaluate simulations because it is the variable that links the land surface energy, water and carbon budgets (Pitman, 2003). It is also one of the variables supplied by an LSM to the atmosphere and is therefore important to a climate model. We note that it would also be desirable to evaluate soil moisture outputs from LSMs. Ultimately this is problematic (Koster et al., 2009) as site measurements at depths which reflect the plants' root-zone ac-

Hydrol. Earth Syst. Sci., 20, 2403-2419, 2016 www.hydrol-earth-syst-sci.net/20/2403/2016/

[Figure: map of the flux tower sites. Labelled sites: Amplero, Blodgett, Howard Springs, Palang, Tumbarumba, UniMich. Legend: selected site, supplementary site.]

[...] Bi et al., 2013). It has been used widely in coupled (Cruz et al., 2010; Lorenz and Pitman, 2014) and offline (Haverd et al., 2013; Huang et al., 2015; Zhou et al., 2012) simulations and has been extensively evaluated against flux site (De Kauwe et al., 2015a; Li et al., 2012; Wang et al., 2011; Williams et al., 2009) and regional- to global-scale observations (De Kauwe et al., 2015b; Decker, 2015). Previous model inter-comparisons have shown that its simulated latent and sensible heat fluxes are comparable in quality to those of other LSMs (Best et al., 2015). CABLE consists of sub-models for radiation, canopy, soil and ecosystem carbon. Canopy processes are represented with a two-leaf model, which calculates photosynthesis, stomatal conductance and leaf temperature separately for sunlit and shaded leaves (Leuning, 1995; Wang and Leuning, 1998). The soil module simulates the transfer of water within the soil and snowpack following the Richards equation. CABLE has 11 plant functional types (PFTs). A detailed description of model components can be found in Wang et al. (2011).
We used CABLE version 2.0 (revision 2902; http://trac.nci.org.au/trac/cable/wiki) forced with site-specific meteorological data at 30 min time steps. Site PFT was determined by matching site vegetation (http://fluxnet.ornl.gov) to CABLE PFTs. PFT parameters were taken from a standard lookup table provided with CABLE 2.0 and were not calibrated to match site characteristics. The model was run using two alternative hydrological modules, two stomatal conductance parameterisations, three soil types and three LAI time series. The new hydrological scheme implements a topographic slope parameter, which was varied between two values in additional simulations. This parameter controls the drainage rate and can in principle be constrained from high-resolution elevation data. We vary the slope parameter between 0 and 5°, broadly coinciding with the observed range of 0-6° at the flux sites as derived from the approximately 1 km spatial resolution Global 30 Arc-Second Elevation (GTOPO30) data set (https://lta.cr.usgs.gov/GTOPO30).
CABLE was run with all parameterisation combinations, resulting in 18 simulations using the default and 36 using the new hydrological scheme. This enabled the quantification of individual parameter and/or parameterisation uncertainties on model simulations and accounts for interactions between different parameterisations. The individual model parameterisations varied in this study are detailed below.
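The simulation counts follow from a full factorial design: two stomatal conductance schemes, three soil classes and three LAI series give 18 runs for the default hydrology, and the two slope values double this to 36 for the new scheme. The sketch below enumerates the combinations; the labels and dictionary keys are illustrative assumptions, not CABLE configuration names.

```python
from itertools import product

# Illustrative labels only; these are not CABLE namelist keys.
GS_SCHEMES = ("Leuning", "Medlyn")
SOIL_CLASSES = ("coarse sand", "medium", "fine clay")
LAI_SERIES = ("minimum", "centre", "maximum")
SLOPES_DEG = (0, 5)  # only the new hydrological scheme has a slope parameter


def build_experiments():
    """Return (default_runs, new_runs), one dict per simulation."""
    default_runs = [
        {"hydrology": "default", "gs": gs, "soil": soil, "lai": lai}
        for gs, soil, lai in product(GS_SCHEMES, SOIL_CLASSES, LAI_SERIES)
    ]
    new_runs = [
        {"hydrology": "new", "gs": gs, "soil": soil, "lai": lai,
         "slope_deg": slope}
        for gs, soil, lai, slope in product(
            GS_SCHEMES, SOIL_CLASSES, LAI_SERIES, SLOPES_DEG)
    ]
    return default_runs, new_runs


default_runs, new_runs = build_experiments()
print(len(default_runs), len(new_runs))
```

Because every combination is run, the spread attributable to one factor can be read off while the other factors are held at each of their levels.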

Hydrological parameterisation
We use two different representations of hydrology. The two schemes are fully detailed in Decker (2015), but we will briefly describe the main differences here. The default soil hydrological scheme in CABLE simulates the exchange of water and heat based on six soil layers and up to three snow layers. The default parameterisation for soil moisture processes was developed by Kowalczyk et al. (1994) and later revised by Gordon et al. (2002) and is described in detail in Kowalczyk et al. (2006) and Wang et al. (2011). The default scheme only generates infiltration excess surface runoff when the top three soil layers are ≥ 95 % saturated and otherwise lacks an explicit runoff generation scheme (Decker, 2015). It does not simulate saturated and unsaturated top soil fractions separately or consider groundwater aquifer storage. The default scheme solves the vertical redistribution of soil water using the 1-D Richards equation.
[...] where $q_{sub}$ is the subsurface drainage (mm s⁻¹), $\theta_n$ the soil moisture content of the bottom soil layer (mm³ mm⁻³) and $C_{drain}$ a tunable parameter (mm s⁻¹) (Decker, 2015). The scheme thus assumes a free-draining lower boundary, and water below the model domain (e.g. groundwater) cannot recharge the water content of the soil column above. Soil evaporation ($E_{soil}$; W m⁻²) is given by

$$E_{soil} = \beta_s L_v \rho_a \frac{q_{sat}(T_{srf}) - q_a}{r_g},$$

where $L_v$ is the latent heat of vaporisation (J kg⁻¹), $\rho_a$ the air density (kg m⁻³), $q_{sat}(T_{srf})$ the saturated specific humidity at the surface temperature (kg kg⁻¹), $q_a$ the specific humidity of air (kg kg⁻¹) and $r_g$ (s m⁻¹) the aerodynamic resistance term. $\beta_s$ is a dimensionless scalar (varying between 0 and 1) used to reduce $E_{soil}$ when soil moisture is limiting and is given by the linear function [...] where $\theta_1$ is the soil moisture content of the first soil layer (m³ m⁻³), $\theta_w$ the wilting point (m³ m⁻³) and $\theta_{fc}$ the field capacity (m³ m⁻³).

Decker (2015) developed an improved representation of sub-surface hydrological processes similar to that implemented in the Community Land Model (Lawrence and Chase, 2007; Oleson et al., 2008). The new scheme explicitly simulates saturation- and infiltration-excess runoff generation and has a dynamic groundwater component with aquifer water storage. The scheme solves the vertical redistribution of soil water ($\theta$) following the modified Richards equation (Zeng and Decker, 2009): [...] where $K$ (mm s⁻¹) is the hydraulic conductivity, $\psi$ (mm) the soil matric potential, $\psi_E$ (mm) the equilibrium soil matric potential, $z$ the soil depth (mm) and $F_{soil}$ (mm s⁻¹) the sum of subsurface runoff and transpiration (Decker, 2015). An unconfined aquifer is located below the six soil layers and is represented with a simple water balance model:

$$\frac{dW_{aq}}{dt} = q_{re} - q_{aq,sub},$$

where $W_{aq}$ is the mass of water in the aquifer (mm), $q_{aq,sub}$ the subsurface runoff removed from the aquifer (mm s⁻¹) and $q_{re}$ the water flux between the aquifer and the bottom soil layer, given from Darcy's law as [...] where $z_{wtd}$ is the water table depth (mm), $K_{aq}$ the hydraulic conductivity of the aquifer (mm s⁻¹) and $z_n$ the depth of the lowest soil layer (mm). The bottom boundary condition is given as [...] as the scheme assumes that the groundwater aquifer sits above an impermeable layer of rock (Decker, 2015). Subsurface runoff ($q_{sub}$) is calculated from [...] where $dz/dl$ is the mean slope, $\hat{q}_{sub}$ the maximum rate of subsurface drainage for a fully saturated soil column (mm s⁻¹) and $f_p$ a tunable parameter (Decker, 2015). $q_{sub}$ is removed from the bottom three soil layers (which account for 4.366 m of the total soil thickness of 4.6 m) by weighting the amount of water removed from each layer by the mass of liquid water present in that layer (Decker, 2015).

Sub-grid-scale heterogeneity in soil moisture is permitted and a modified soil evaporation formulation reflects this. At point scales the runoff generation from sub-grid heterogeneity in soil moisture is neglected as the saturated fraction of the grid cell is assumed to be equal to zero. Soil evaporation is given as (here assuming a saturated grid cell fraction ($F_{sat}$) of 0)

$$E_{soil} = \beta_s E^{*}_{soil},$$

where $E^{*}_{soil}$ is the soil evaporation prior to soil moisture limitation (mm s⁻¹). $\beta_s$ is calculated using a non-linear function following Sakaguchi and Zeng (2009):

$$\beta_s = 0.25 \left[ 1 - \cos\left( \pi \frac{\theta_{unsat}}{\theta_{fc}} \right) \right]^2,$$

where $\theta_{unsat}$ is the first layer soil moisture content in the unsaturated portion (the entire soil layer in this study) (m³ m⁻³).

Both schemes calculate transpiration using the same method and, in common with many LSMs (Verhoef and Egea, 2014), limit gas exchange during low soil moisture using a dimensionless scalar ($\beta$) varying between 0 and 1:

$$\beta = \sum_{i=1}^{n} f_{root,i} \frac{\theta_i - \theta_w}{\theta_{fc} - \theta_w},$$

where $\theta_i$ is the soil moisture content of soil layer $i$ (m³ m⁻³) and $f_{root,i}$ the root mass fraction of soil layer $i$.
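The two soil moisture scalars described above can be sketched as follows. This is a minimal illustration rather than CABLE code: the function names are hypothetical, the clipping of each layer's contribution to the range 0-1 and the capping of $\beta_s$ at 1 once the layer reaches field capacity are assumptions made so that both scalars stay within the stated 0-1 range, and the numerical values are invented.

```python
import math


def beta_transpiration(theta, root_frac, theta_w, theta_fc):
    """Root-weighted soil moisture scalar for transpiration.

    theta and root_frac hold one value per soil layer; root_frac is
    assumed to sum to 1. Each layer term is clipped to [0, 1]
    (an assumption, see lead-in).
    """
    total = 0.0
    for th, fr in zip(theta, root_frac):
        available = (th - theta_w) / (theta_fc - theta_w)
        total += fr * min(1.0, max(0.0, available))
    return total


def beta_soil_nonlinear(theta_unsat, theta_fc):
    """Non-linear soil evaporation scalar of the cosine form.

    Capped at 1 at or above field capacity (an assumption).
    """
    if theta_unsat >= theta_fc:
        return 1.0
    ratio = max(0.0, theta_unsat) / theta_fc
    return 0.25 * (1.0 - math.cos(math.pi * ratio)) ** 2


# Invented example: wetter deep layers partly offset a dry surface layer.
theta = [0.12, 0.20, 0.28]
roots = [0.5, 0.3, 0.2]
print(beta_transpiration(theta, roots, theta_w=0.10, theta_fc=0.30))
print(beta_soil_nonlinear(0.12, theta_fc=0.30))
```

With these invented values the surface scalar falls much faster than the root-weighted one, which illustrates why soil evaporation and transpiration can respond differently as the profile dries from the top.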
The default version of CABLE tends to overestimate $Q_E$ at annual to seasonal scales when used coupled with the ACCESS climate model (Lorenz et al., 2014), but significantly underestimates $Q_E$ during soil moisture deficits across six European flux tower sites (De Kauwe et al., 2015a) in uncoupled experiments. Decker (2015) showed that the new model reduced overestimations of $Q_E$ by 50-70 % compared to the default scheme and yielded an improved simulation of seasonal cycles when evaluated against observations from large river basins. The new scheme was also shown to better capture total water storage anomalies (an integral over depth of soil moisture changes) from the Gravity Recovery and Climate Experiment (GRACE; http://grace.jpl.nasa.gov) than the default scheme. It is not known whether these improvements will also allow CABLE to better capture observed $Q_E$ during dry-down, and we explore this here.

Stomatal conductance parameterisation
We use two alternative parameterisations for stomatal conductance ($g_s$). The default CABLE currently implements an empirical $g_s$ formulation following Leuning (1995): [...] where $A$ is the net assimilation rate (µmol m⁻² s⁻¹), $\Gamma$ (µmol mol⁻¹) is the CO₂ compensation point of photosynthesis, and $C_s$ (µmol mol⁻¹) and $D$ (kPa) are the CO₂ concentration and the vapour pressure deficit at the leaf surface, respectively. $g_0$ (mol m⁻² s⁻¹), $D_0$ (kPa) and $\alpha_1$ are fitted constants representing the residual stomatal conductance when $A = 0$, the sensitivity of stomatal conductance to $D$ and the sensitivity of stomatal conductance to assimilation, respectively. Although the $g_s$ formulation following Leuning (1995) (or the equivalent Ball-Berry model; Ball et al., 1987) is widely used in LSMs, the model parameters are empirical. Thus, we cannot attach any theoretical meaning to how parameters vary across data sets or among species (De Kauwe et al., 2015b; Medlyn et al., 2011). Consequently, as is common with many LSMs (e.g. the Community Land Model version 4.5 (Oleson et al., 2013) and the ORganizing Carbon and Hydrology in Dynamic EcosystEms model (Krinner et al., 2005)), the default scheme only varies stomatal conductance parameters between photosynthetic pathways (C₃ vs. C₄), rather than among PFTs.
As an alternative, we also ran CABLE using the $g_s$ model following Medlyn et al. (2011), a theoretical formulation based on the premise of optimal stomatal behaviour: [...] where $g_1$ (kPa⁰·⁵) is a fitted parameter representing the sensitivity of conductance to the assimilation rate. Unlike the $\alpha_1$ parameter in the Leuning model, $g_1$ has biological meaning, representing a plant's water use strategy. Values of $g_1$ were derived previously for each of CABLE's PFTs (De Kauwe et al., 2015b) based on a global synthesis of stomatal behaviour (Lin et al., 2015). Further details and associated parameter values can be found in De Kauwe et al. (2015b).
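To make the contrast between the two formulations concrete, the sketch below codes the forms as published by Leuning (1995) and Medlyn et al. (2011) using the symbols defined above. It is an illustration under stated assumptions: the expressions are the generic published forms, not necessarily the exact expressions implemented in CABLE (which may, for example, also apply the soil moisture scalar $\beta$), and all parameter values are placeholders rather than CABLE lookup-table values.

```python
import math


def gs_leuning(A, Cs, D, g0, a1, D0, gamma):
    """Empirical stomatal conductance, published Leuning (1995) form.

    A: net assimilation (umol m-2 s-1); Cs, gamma: umol mol-1;
    D, D0: kPa; g0: mol m-2 s-1; a1: dimensionless.
    """
    return g0 + a1 * A / ((Cs - gamma) * (1.0 + D / D0))


def gs_medlyn(A, Cs, D, g0, g1):
    """Optimal stomatal conductance, published Medlyn et al. (2011) form.

    g1 is in kPa^0.5; the factor 1.6 converts CO2 to water vapour
    conductance.
    """
    return g0 + 1.6 * (1.0 + g1 / math.sqrt(D)) * A / Cs


# Placeholder inputs, chosen only to show the response to drier air.
for D in (0.5, 1.0, 2.0, 4.0):
    leuning = gs_leuning(A=10.0, Cs=400.0, D=D, g0=0.01,
                         a1=9.0, D0=1.5, gamma=40.0)
    medlyn = gs_medlyn(A=10.0, Cs=400.0, D=D, g0=0.01, g1=4.0)
    print(f"D={D:.1f} kPa  Leuning={leuning:.3f}  Medlyn={medlyn:.3f}")
```

Both forms reduce conductance as $D$ rises at fixed $A$, but through different functional dependences ($1/(1 + D/D_0)$ versus $1/\sqrt{D}$), and only $g_1$ is varied among PFTs in the study.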
The Medlyn $g_s$ model has been shown to improve existing CABLE biases, particularly overestimations of $Q_E$ in evergreen needleleaf and C₄ biomes (De Kauwe et al., 2015b). We will explore whether the re-parameterisation of $g_s$ also improves the simulation of dry-down in CABLE.

Soil parameterisation
The soil parameters were derived from a data set provided by CABLE developers (https://trac.nci.org.au/trac/cable/wiki; Global Soil Data Task Group, 2000; Zobler, 1999). The data set consists of nine soil classes; here the two classes with the highest sand and clay contents were used. The coarse sandy soil has an 83 % sand content and the fine clay soil a 67 % clay content. The soil classes have eight associated parameters for soil hydraulic and thermal capacities, fully detailed in Table S2. In addition, an arbitrary "medium" soil class was created with equal fractions of sand, silt and clay, with other soil parameters set as the median of the coarse sand and fine clay soil classes (Table S2). CABLE was run with these three alternative soil classes, fixing the soil parameters across all sites to generate a range in soil parameter values. The default hydrological scheme uses all soil parameters directly, whereas the new scheme calculates the eight parameters governing hydraulic properties from sand, silt and clay fractions using the Clapp and Hornberger (1978) pedotransfer functions. The soil parameter values used by both schemes are detailed in Table S2.

Leaf area index
Leaf area index (LAI) plays an important role in the surface energy balance in CABLE, scaling sunlit and shaded leaf-level fluxes of photosynthesis, $g_s$ and latent heat flux to the canopy. LAI was obtained from 8-daily gridded Moderate Resolution Imaging Spectroradiometer (MODIS) data at 1 km resolution (Yuan et al., 2011). The data were averaged to monthly time steps to smooth the time series and subsequently three alternative LAI time series were created for each site to take some account of uncertainties in LAI inputs. The first time series was constructed by extracting the grid cells that contained each site ("centre" time series). Two alternative time series were created using the minimum and maximum LAI values of the grid cell and its immediate neighbours ("minimum" and "maximum" time series, respectively). Time-varying LAI was used for years where the flux observations and MODIS data overlap (i.e. after 2000); a monthly climatology of common years was used otherwise. The minimum and maximum time series differ from the centre time series by 30 % on average, but the range varies between sites. The alternative LAI time series are shown in Fig. S5.
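The construction of the three LAI series can be sketched as below, assuming the monthly LAI is held in a NumPy array with dimensions (time, rows, columns) and that "immediate neighbours" means the surrounding 3 × 3 window, truncated at the grid edge. The function name and array layout are assumptions for illustration; handling of missing values is omitted.

```python
import numpy as np


def lai_series_for_site(lai, row, col):
    """Return (centre, minimum, maximum) LAI time series for one site.

    lai: array of shape (time, rows, cols); row and col index the grid
    cell containing the site. Minimum and maximum are taken over the
    cell and its immediate neighbours at each time step.
    """
    r0, r1 = max(row - 1, 0), min(row + 2, lai.shape[1])
    c0, c1 = max(col - 1, 0), min(col + 2, lai.shape[2])
    window = lai[:, r0:r1, c0:c1]
    centre = lai[:, row, col]
    return centre, window.min(axis=(1, 2)), window.max(axis=(1, 2))


# Synthetic example: 12 months on a 5 x 5 grid of random LAI values.
rng = np.random.default_rng(0)
lai = rng.uniform(0.5, 4.0, size=(12, 5, 5))
centre, lai_min, lai_max = lai_series_for_site(lai, row=2, col=2)
print(np.all(lai_min <= centre), np.all(centre <= lai_max))
```

By construction the centre series is bracketed by the other two at every time step, so the three runs span the local spatial variability around each tower.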
A. M. Ukkola et al.: Modelling evapotranspiration during precipitation deficits

Analysis methods
We analyse CABLE's performance across three timescales: the whole observational, annual and sub-annual periods. As the observational records are generally short for characterising hydrological extremes (∼ 5 years on average; Table S1), we have not adopted a formal statistical method for identifying periods of rainfall anomalies and thus do not refer to them as "droughts". We also note that no one definition for droughts exists; instead, various indices have been employed based on, for example, precipitation, streamflow, soil moisture and measures of evaporative demand (Sheffield and Wood, 2011). In this study, the dry periods were defined based on precipitation as this allowed the use of available observations, but we note that the simulated fluxes will also depend on other processes such as soil moisture availability. For the majority of sites (Howard Springs, Palang, and all supplementary sites), we selected the year with the lowest precipitation total as the 1-year period, whilst for Amplero, Blodgett, Tumbarumba and UniMich, we selected a year when the default CABLE significantly underestimated latent heat fluxes during a rainfall deficit ("dry-down") period. The dry-down period generally coincides with the maximum and the following minimum observed latent heat flux during the 1-year period, but has been adjusted using expert judgment for some sites to best demonstrate typical model behaviour (Fig. S6). Observed and simulated data were averaged to 14-day running means for all analyses.
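The smoothing and the default dry-down window can be sketched as follows. The sketch assumes the 30 min fluxes have already been averaged to daily means and uses a running mean that only returns values where a full 14-day window is available; the text does not specify these details, so both are assumptions, and the function names are hypothetical. The expert adjustment applied at some sites is not reproduced.

```python
import numpy as np


def running_mean(daily, window=14):
    """Running mean over `window` days; output has len(daily) - window + 1 values."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(daily, dtype=float), kernel, mode="valid")


def dry_down_bounds(smoothed):
    """Indices of the maximum flux and of the following minimum."""
    start = int(np.argmax(smoothed))
    end = start + int(np.argmin(smoothed[start:]))
    return start, end


# Synthetic year of daily latent heat (W m-2): a peak followed by a decline.
days = np.arange(365)
q_e = 80.0 * np.exp(-(((days - 150) / 60.0) ** 2)) + 10.0
smoothed = running_mean(q_e)
print(dry_down_bounds(smoothed))
```

The returned indices refer to positions in the smoothed series, which is shorter than the daily input by 13 values.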
We follow Abramowitz et al. (2007) and the PALS protocol for calculating model metrics. We use the normalised mean error (NME) to evaluate general model performance:

$$\mathrm{NME} = \frac{\sum_{i=1}^{n} |M_i - O_i|}{\sum_{i=1}^{n} |O_i - \bar{O}|},$$

where $M$ represents the model values, $O$ the observations and an overbar the mean over the $n$ time steps. NME accounts for mean model biases and the temporal coincidence and magnitude of variability, but does not distinguish between them (Best et al., 2015). An NME of 0.0 represents perfect agreement and a value of 1.0 represents model performance equal to that expected from a constant value equal to the mean of all observations. We examine mean bias error (MBE) to estimate absolute biases in CABLE simulations; it is simply the difference between the mean modelled and observed values:

$$\mathrm{MBE} = \bar{M} - \bar{O}.$$

Results

Whole time period
We first evaluated $Q_E$ simulated by CABLE during the whole data period available for each flux site (ranging from 2 to 7 years for selected sites; Table S1). CABLE, using the default hydrological parameterisation, captures the general features, such as the timing and magnitude of seasonal cycles, in observed $Q_E$ across the different sites (Fig. 2, left column panels). CABLE including the new hydrological scheme also captures these general features (Fig. 2, right column panels).
Quantifying the performance of these two versions of CABLE over the full length of record does not indicate that there is a significant difference between the versions in either NME (Fig. 3) or MBE (Fig. 4). The average NME for all sites and parameter choices was 0.90 for the old scheme and 0.75 for the new scheme, and the average MBE was −1 and 6 W m⁻², respectively. The NME metric is < 1.0 for the majority of sites using the new scheme, regardless of the choice of $g_s$, LAI or soil parameterisation. We note that the magnitude of $Q_E$ for the evergreen broadleaf sites (Palang and Tumbarumba and supplementary site Espirra) is poorly captured (Fig. 4). Overall, both hydrological parameterisations tend to overestimate peak $Q_E$ (Fig. 2); this tendency for excessive evapotranspiration has also been demonstrated in global applications of CABLE in both offline (De Kauwe et al., 2015b) and coupled (Lorenz et al., 2014) simulations. Furthermore, both schemes systematically overestimate $Q_E$ in spring, particularly at cooler temperate sites such as UniMich (Fig. 2; also see deciduous broadleaf and needleleaf supplementary sites; Fig. S1), and over-predict the short-term variability in $Q_E$ (see e.g. Amplero in Fig. 2). Despite clear biases in simulated fluxes, the MBE metric approaches zero at most sites when evaluated at inter-annual timescales. While encouraging, this is due to compensating errors, such that early season overestimations in $Q_E$ are counteracted by underestimations during the dry-down periods (see e.g. Blodgett and Tumbarumba in Fig. 5). This is particularly evident with the default hydrology scheme. We therefore focus the remaining analyses on shorter time periods where compensating biases are less likely to hide weaknesses in the model performance.
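The effect of compensating errors on the two metrics can be illustrated with a small synthetic example, using NME and MBE as defined in the analysis methods. The numbers below are invented for illustration and are not flux tower data; the function names are hypothetical.

```python
import numpy as np


def nme(model, obs):
    """Normalised mean error: sum|M - O| / sum|O - mean(O)|."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sum(np.abs(model - obs)) / np.sum(np.abs(obs - obs.mean())))


def mbe(model, obs):
    """Mean bias error: mean(M) - mean(O)."""
    return float(np.mean(model) - np.mean(obs))


# Invented seasonal cycle of latent heat (W m-2).
obs = np.array([40.0, 80.0, 120.0, 100.0, 60.0, 40.0])
# Model overestimates early in the season and underestimates during dry-down.
model = obs + np.array([20.0, 20.0, 20.0, -20.0, -20.0, -20.0])

print("whole period:", nme(model, obs), mbe(model, obs))
print("dry-down only:", mbe(model[3:], obs[3:]))
```

Over the whole synthetic record the opposing biases cancel in MBE while NME remains well above zero; restricting the calculation to the dry-down portion exposes the negative bias, which is the motivation for evaluating shorter periods.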

Annual and dry-down period
CABLE-simulated $Q_E$ was then evaluated during annual and seasonal dry-down periods to explore model performance during rainfall deficits. The default scheme demonstrates a range of major biases (Fig. 5). The model dries down too quickly at the Amplero, Blodgett, Palang, Tumbarumba and UniMich sites. At these sites, and at Howard Springs, $Q_E$ drops too low and drops to that minimum too early in the year. At several sites, including Blodgett, Tumbarumba and UniMich, CABLE systematically overestimates $Q_E$ in spring. These characteristics of CABLE are not dependent on the choice of LAI or soil inputs or $g_s$ parameterisations; the range in $Q_E$ fails to overlap the observations irrespective of how these properties are varied. This suggests parameterisation error as distinct from parameter choices as the cause of the model weaknesses.
The new hydrological scheme demonstrates clear improvements at Amplero, Howard Springs and Palang. At Blodgett, Tumbarumba and UniMich, the observations are within the uncertainty due to the choice of $g_s$, LAI or soil parameters in the second half of the year, but the excessive $Q_E$ during spring and early summer remains a problem. While there are obviously remaining errors, the new hydrological scheme clearly improves the simulation of $Q_E$ over the annual cycles (Fig. 5). Assessing the overall performance at annual timescales also highlights clear improvements with the new hydrology. Figure 3 shows that for NME, the new hydrology scheme in CABLE performs as well as or better than the default at every site, with an average NME across all sites of 0.68 compared to 0.90 for the default scheme. This is true also of MBE (Fig. 4) for all sites except Tumbarumba.

[Figure caption fragment: Observed latent heat is shown in black. The grey shading denotes the selected dry-down period. All time series run from January to December, except Tumbarumba, which runs from July to the following June.]
The performance of the two schemes over the dry-down period is shown in Fig. 5. Using the default hydrology leads to worse performance on this shorter timescale at Amplero, Blodgett, Palang and to a lesser degree at Howard Springs and Tumbarumba compared to annual and inter-annual scales. In contrast, CABLE with the new hydrology performs similarly to the longer (≥ 1 year) timescales at Blodgett, Palang, Howard Springs, Tumbarumba and UniMich and only marginally poorer at Amplero. Comparing NME over this dry-down period shows that the new scheme strongly outperforms the default parameterisation (Fig. 3; the average NME is 0.68 and 1.27 for new and default schemes, respectively). A similar conclusion is reached using MBE (on average −4 and −22 W m⁻² for the new and default schemes, respectively). In short, the new hydrology does not dramatically improve the performance of CABLE in the long term (i.e. inter-annual scales) (Fig. 2) due to compensating biases in the default CABLE. These include overestimated spring and early summer $Q_E$, and consequently, at least in part, underestimated $Q_E$ during the dry-down. Once we focus on shorter, sub-annual timescales that lack these compensating biases, CABLE with the new hydrology strongly outperforms the default version in the simulation of $Q_E$.

Impact of varying LAI, g s and soil parameters
We now explore the individual contributions from soil parameters, $g_s$ and LAI to uncertainties in simulated $Q_E$. Figures 6 and 7 show the uncertainty in model simulations due to soil parameters, $g_s$ and LAI using the default and new hydrological schemes, respectively. Both hydrological schemes are sensitive to soil parameters during the dry-down period but show smaller variations due to soil during other parts of the year (see Amplero, Blodgett, Howard Springs and Palang in Figs. 6 and 7). This transition from low to high sensitivity occurs as soil moisture stores begin to deplete and $Q_E$ becomes increasingly limited by moisture supply (Fig. S8). Both schemes show a similar sensitivity to $g_s$ and LAI variations, which is generally smaller than the sensitivity to soil variations, although the new scheme is more sensitive to $g_s$ at Blodgett, Howard Springs and Palang, and to LAI at Amplero and Palang during dry-down.
While the new hydrological parameterisation systematically improved model performance across most sites (Figs. 3 and 4), the effect of LAI, $g_s$ and soil parameters on the mean magnitude of simulated fluxes is highly site-specific during the annual and dry-down periods (Fig. 8). In agreement with De Kauwe et al. (2015b), the choice of $g_s$ scheme generally has a larger effect in needleleaf (Blodgett) and C₄ grass (Howard Springs) sites. Some sites, such as Howard Springs, are sensitive to multiple parameters, whilst others such as UniMich only respond minimally to parameter perturbations (Fig. 8). Whilst there is no a priori expectation that this should be the case, it highlights the importance of investigating model uncertainties and performance across multiple sites to capture the full range of model sensitivities to parameter perturbations.
The results have so far assessed CABLE with the new hydrology using a 0° slope parameter as this enables a direct comparison with the default hydrology. The slope parameter, which can be derived from high-resolution elevation data, is scale dependent and was introduced by Decker (2015) [...] by landscape geometry. The slope parameter affects the rate of subsurface drainage and represents a key difference between the new and default schemes. With the exception of the UniMich site, Figs. 8 and 9 show that the model is highly sensitive to the choice of the slope parameter across all sites, particularly during the dry-down period. The slope appears more critical for simulation of $Q_E$ than the other parameterisations investigated here and has a strong effect on the magnitude of fluxes primarily during the dry-down (see e.g. Howard Springs and Palang in Fig. 9). Whilst this highlights the need to carefully set the slope parameter, it is unclear how well it can be constrained at the site scale. The surface slope derived from elevation data may not reflect large-scale features, such as subsurface geology, which can affect drainage rates and thus water availability for $Q_E$ in highly site-specific ways.

Simulation of dry-down
We have shown that the default version of CABLE significantly underestimates Q E during rainfall deficits (Fig. 5). We have also shown that it is unlikely that uncertainties in key model soil and vegetation (LAI) inputs account for these biases (Fig. 6). The observations used to drive and evaluate the model themselves include errors, notably lack of energy balance closure (Leuning et al., 2012). However, [...] hydrological processes in the default version of CABLE (Figs. 3 and 4). The default CABLE has been shown to perform similarly to other LSMs in Best et al. (2015) and indeed in other model evaluation studies (Abramowitz et al., 2008). Hence, it is likely that errors of the kind identified here are common among other models, as model benchmarking rarely examines sub-annual behaviour. The poor simulation of dry-down periods is important: if LSMs in general struggle to simulate this period, they will fail to correctly capture water fluxes when serious soil moisture deficits are established.
A model that simulates dry-down too fast will enter drought early and will tend to simulate longer, deeper and more frequent droughts than a model that enters drought too slowly. We suggest that systematic evaluation of LSMs during dry-down periods would lead to the identification of major limitations in some models that are hidden by compensating errors over longer timescales. Resolution of those problems has the potential to improve the simulation of drought in climate models.
We also showed that the effect of individual parameterisations was magnified during dry periods (Figs. 6 and 7). Whilst the new hydrological scheme did not present a significant improvement on the annual and inter-annual timescales analysed here, it had an increasingly large positive impact on shorter timescales and in particular during the dry-down periods (Figs. 3-5). Similarly, the contribution of soil (Figs. 6a and 7a), g s (Figs. 6b and 7b) and LAI (Figs. 6c and 7c) parameterisations to model uncertainties was generally larger during the dry-down. We will discuss each of these points below.

Soil and LAI inputs
We evaluated the uncertainty in Q E simulations arising from inputs of soil properties and LAI. These variables are generally obtained from gridded data sets in LSM simulations and remain uncertain at the site (and larger) scale (De Kauwe et al., 2011; Koster et al., 2009).
Soil parameters feature in many hydrological model components and our results show that the range in simulated Q E due to the choice of soil parameters is largest during the dry-down for both hydrological schemes (Figs. 6 and 7). Parameters for wilting point (θ w ) and field capacity (θ fc ) are particularly important during drying conditions as they determine how evapotranspiration is reduced as soil moisture becomes limiting (following Eqs. 3, 10 and 11). The model is also sensitive to the value of the matric potential at saturation (Table S2). The vertical diffusive flux of soil water between adjacent soil layers is proportional to the saturated matric potential. This control on the rate of vertical water movement alters the vertical soil water profile during the dry-down, affecting the water available for transpiration in a given soil layer. Using the default hydrological scheme, the observed Q E could only be captured by varying the soil properties at Howard Springs. Elsewhere, the model underestimated observed Q E during dry-down regardless of how the soil properties were varied (Fig. 6). This suggests that uncertainties in soil parameters cannot account for the poor simulation of dry-down by the default model.
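The role of θ w and θ fc described above can be illustrated with a minimal sketch. It assumes a simple linear stress factor that rises from 0 at wilting point to 1 at field capacity. This functional form, the function name and all numerical values are illustrative assumptions; they are not taken from Eqs. (3), (10) or (11), which may differ (for example, by weighting soil layers by root fraction).

```python
def water_stress_factor(theta, theta_w, theta_fc):
    """Hypothetical linear stress factor: 0 at wilting point, 1 at field capacity."""
    if theta_fc <= theta_w:
        raise ValueError("field capacity must exceed wilting point")
    beta = (theta - theta_w) / (theta_fc - theta_w)
    return min(1.0, max(0.0, beta))  # clip to the range [0, 1]


# Same volumetric soil moisture, two assumed wilting points (m3 m-3).
theta = 0.20
beta_low_wilt = water_stress_factor(theta, theta_w=0.10, theta_fc=0.30)
beta_high_wilt = water_stress_factor(theta, theta_w=0.15, theta_fc=0.30)
print(beta_low_wilt, beta_high_wilt)
```

Under these assumed values, raising the wilting point at fixed soil moisture lowers the stress factor, which is one way in which uncertain soil parameters can propagate into simulated Q E during drying.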
Similarly, LAI, as it was varied here, could not explain the underestimation of Q E . The range in LAI varied by site (30 % on average) according to the remotely sensed data, but was not lower at the drought sites. The model was generally not sensitive to changes in LAI during dry-down regardless of the choice of hydrological scheme (Figs. 6 and 7). This implies either that the correct characterisation of canopy structure is probably not critical for the simulation of Q E in CABLE during dry-down, or that the errors in the simulations are too large for any more subtle impact of these LAI variations to be seen. Nevertheless, we do note that leaf drop during drought events could lead to an increased or compensatory reflectance signal from deeper in the canopy profile, resulting in erroneous estimates of LAI from optical remote sensing products (cf. Amazon drought studies; Samanta et al., 2010).

Hydrological schemes
The new hydrological scheme was shown to improve CABLE simulations of Q E during dry-down (Figs. 3 and 4). This results from the higher soil moisture content simulated by the new scheme compared to the default model, particularly in the bottom soil layers (Fig. S7). This allows higher ET fluxes to be maintained during dry periods, mainly due to higher transpiration rates (Fig. S8). The alternative hydrological schemes make different assumptions about subsurface drainage and how this is treated upon exiting the bottom boundary of the soil column. The default model assumes a free draining boundary for solving vertical water flow (Eq. 1), such that the bottom soil layer essentially acts as a sink for the rest of the soil column, as it can only remove water from the column. Conversely, the new scheme simulates an unconfined groundwater aquifer below the bottom soil layer that is assumed to sit on an impermeable layer of rock, so that no water is lost from the aquifer through downward flow (Eq. 7). The soil moisture content of the overlying soil layers can then be replenished through recharge from the aquifer to maintain higher soil moisture during dry periods (given a water table depth near the bottom of the soil column; Eq. 6; Fig. S7).
Zeng and Decker (2009) demonstrated that assuming a free draining lower boundary requires an unrealistically high precipitation rate to maintain a relatively wet soil moisture content that allows vegetation to transpire without encountering soil moisture stress. Using a hypothetical example, the authors estimate that a minimum precipitation rate of 17.2 mm day⁻¹ is required to maintain non-water-stressed conditions (a value much higher than is observed in most environments), implying overly dry soil conditions in many cases. Our results therefore suggest that the replacement of a constant drainage assumption in the original model with a physically based, dynamic bottom boundary condition for the soil column is important for improving Q E fluxes in CABLE under water-stressed conditions.
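The argument above can be made concrete with a rough sketch. It assumes that free drainage corresponds to a unit hydraulic gradient at the bottom boundary, so that drainage equals the hydraulic conductivity, and that conductivity follows a Clapp-Hornberger-type power law. The steady-state water balance (precipitation = ET + drainage, with no runoff), the function names and all parameter values are illustrative assumptions; this is not the calculation of Zeng and Decker (2009) and does not reproduce their 17.2 mm day⁻¹ estimate.

```python
def hydraulic_conductivity(theta, theta_sat=0.45, k_sat=100.0, b=6.0):
    """Clapp-Hornberger-type conductivity in mm/day (illustrative defaults)."""
    saturation = min(1.0, max(0.0, theta / theta_sat))
    return k_sat * saturation ** (2.0 * b + 3.0)


def precip_to_sustain(theta, et, free_drainage=True):
    """Steady-state precipitation (mm/day) needed to hold the column at theta."""
    # Free drainage: water leaves the column at a rate equal to K(theta).
    # Impermeable lower boundary: no water is lost through the bottom.
    drainage = hydraulic_conductivity(theta) if free_drainage else 0.0
    return et + drainage


# Assumed ET of 3 mm/day at three soil moisture levels (m3 m-3).
for theta in (0.30, 0.35, 0.40):
    print(theta,
          precip_to_sustain(theta, et=3.0),
          precip_to_sustain(theta, et=3.0, free_drainage=False))
```

In this toy balance the precipitation needed to keep the soil wet rises steeply with soil moisture when the boundary drains freely, whereas with an impermeable lower boundary it equals the evaporative demand alone.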
Whilst it was not possible to evaluate these simulations against soil moisture data due to a lack of observations for soil depths used in CABLE, Decker (2015) showed that the new scheme could better capture total soil column water anomalies and evapotranspiration (two variables that strongly depend on the correct simulation of soil moisture content) in comparison to the default scheme at river basin scales. This gives us confidence that the higher soil moisture levels simulated by the new scheme are supported by some observations. This result should be evaluated in future work against locations where deep soil moisture measurements are made available, or efforts to obtain observed soil moisture coincident with tower measurements of the fluxes should be encouraged.

Stomatal conductance schemes
Our results showed that CABLE is generally not sensitive to the choice of g s scheme during dry-down at most sites, with the exception of Howard Springs (a C 4 grass site) and Blodgett (an evergreen needleleaf site). This result is largely to be expected: during drought both schemes are limited in the same fashion, with β (Eq. 11) reducing the slope that relates g s to photosynthesis. The noted differences between schemes at the C 4 grass and evergreen needleleaf sites are consistent with results from De Kauwe et al. (2015b). At Howard Springs, De Kauwe et al. (2015b) found that the high g 0 value assumed in the Leuning model (0.04 mol m⁻² leaf s⁻¹) accounted for the difference between schemes when g s approached zero (for example during a drought). Differences between schemes at Blodgett stem, at least in part, from the use of a parameterisation representing the conservative water use found in evergreen needleleaf forests (Lin et al., 2015).
We note that the two stomatal schemes have different sensitivities to vapour pressure deficit (see De Kauwe et al., 2015b, for details). However, under current climatic conditions this results in only a small difference between schemes, although the effect could be amplified in the future with the increased vapour pressure deficit expected in a warmer world.
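The behaviour of the two schemes under drought can be sketched as follows, using commonly written forms of the Leuning and Medlyn models in which β scales the slope term. Only the Leuning g 0 of 0.04 mol m⁻² s⁻¹ comes from the text above; the exact functional forms, the Medlyn g 0 of zero and the remaining parameter values (a1, d0, gamma, g1) are assumptions made for illustration and may differ from those used in CABLE.

```python
import math


def gs_leuning(a_net, c_s, vpd, beta, g0=0.04, a1=4.0, d0=1.5, gamma=0.0):
    """Leuning-type conductance (mol m-2 s-1); a1, d0 and gamma are illustrative."""
    return g0 + a1 * beta * a_net / ((c_s - gamma) * (1.0 + vpd / d0))


def gs_medlyn(a_net, c_s, vpd, beta, g0=0.0, g1=1.6):
    """Medlyn-type conductance (mol m-2 s-1); g0 = 0 and g1 are assumptions."""
    return g0 + 1.6 * (1.0 + g1 * beta / math.sqrt(vpd)) * a_net / c_s


# a_net in umol m-2 s-1, c_s in umol mol-1, vpd in kPa.
# Well-watered case (beta = 1) versus severe stress (beta = 0, no assimilation).
for beta, a_net in ((1.0, 20.0), (0.0, 0.0)):
    print(beta,
          gs_leuning(a_net, 400.0, 2.0, beta),
          gs_medlyn(a_net, 400.0, 2.0, beta))
```

With these assumed values, both slope terms vanish as β and assimilation approach zero, so the remaining difference between the schemes is the residual conductance g 0 , in line with the explanation given for Howard Springs.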

Overestimation of soil evaporation
We identified systematic biases in the simulation of peak and spring Q E , particularly at forested sites (e.g. Tumbarumba and Blodgett) (Figs. 2 and S7). The biases in the timing and magnitude of spring and peak fluxes not only have implications for the correct simulation of seasonal cycles, but can also affect the magnitude of dry-down simulated by the model. The excessive spring and early summer Q E may reduce soil moisture levels prior to the dry-down, leading to the simulation of more severe reductions in Q E during dry periods. Both hydrological schemes showed a tendency to significantly overestimate these fluxes. The reason for the overestimation of peak fluxes is not clear, but it is not resolved by the new hydrological scheme, despite this scheme parameterising many of the relevant processes differently. At many sites, the high Q E in spring is associated with excessive soil evaporation and is not linked to transpiration, which closely follows the observed seasonal cycle (see e.g. Bugac, Harvard, Howland and Hyytiälä in Fig. S9).
There are a number of possible causes of and solutions to this excessive soil evaporation. Insufficient drainage, and consequently overestimated surface soil moisture, and/or insufficient reduction of soil evaporation during soil drying may explain the excess spring Q E . The default scheme uses a linear function to reduce soil evaporation when soil moisture is limiting, following Eq. (3). This is replaced with a non-linear function, presented in Eq. (10), in the new scheme. The non-linear function provides a much stronger limitation on soil evaporation as soil moisture declines but, based on these results, this approach is not sufficient for resolving the excessive soil evaporation. Haverd and Cuntz (2010) showed that the inclusion of litter layer dynamics in an earlier version of CABLE improved the simulated timing of spring Q E at Tumbarumba by suppressing soil evaporation, but this was not implemented in the current study. Adding a litter layer may resolve excessive soil evaporation at some sites by adding an additional resistance to evaporation, but it is unclear whether this approach would resolve errors across all PFTs. Errors in the timing of spring green-up at deciduous sites in the LAI inputs (e.g. Fisher and Mustard, 2007) may also contribute to excessive spring evaporation, whereby a delayed green-up would allow excessive radiation to reach the ground surface in early spring, increasing soil evaporation rates. We encourage researchers to make use of the Best et al. (2015) experimental protocol to fully explore this problem. Multi-LSM simulations should help to identify where CABLE is anomalous and, ideally, allow the parameterisations used in other LSMs that do not simulate excessive spring soil evaporation to be implemented.

Further model uncertainties
In this study, we explored and quantified model uncertainties due to LAI, g s , hydrological and soil parameters, limiting our analysis to parameters that can be derived from observationally based global data sets (despite considerable uncertainties). Other model processes, particularly more realistic representations of vegetation drought responses, have been identified as critical for capturing drought processes and shown to improve CABLE performance during droughts, but were not explicitly explored here.
The simulation of the effects of soil moisture limitation on photosynthesis and stomatal conductance remains a key uncertainty for drought responses in LSMs (Zhou et al., 2013). Models rely on differing assumptions about the effects of water stress on photosynthesis and stomatal conductance (Egea et al., 2011; Keenan et al., 2009) but generally assume similar drought responses across different PFTs (including CABLE as employed here) despite experimental evidence pointing to systematic differences in plant adaptations to drought (De Kauwe et al., 2015a; Zhou et al., 2013). In common with many other LSMs (Verhoef and Egea, 2014), CABLE limits gas exchange during low soil moisture using the dimensionless scalar β following Eq. (11). The function is strongly linked to soil properties (through wilting point and field capacity parameters) and does not directly consider vegetation characteristics beyond rooting depth (which varies little by PFT). De Kauwe et al. (2015a) evaluated CABLE against flux site observations during the 2003 European drought using an alternative drought model with experimentally derived drought sensitivities. They showed significant underestimations of Q E using the default CABLE, but these were improved using different vegetation sensitivities to drought (varying from low sensitivity in xeric environments to high in mesic environments, in line with experimental evidence) and a dynamic weighting of water uptake across soil layers. Experimental data to inform the parameterisation of PFT-specific drought responses, however, remain limited (De Kauwe et al., 2015a), complicating the implementation of such responses in LSMs.
Li et al. (2012) showed that the underestimation of CABLE-simulated Q E under water-stressed conditions could be improved by employing an alternative root water uptake scheme. The default root water uptake function in CABLE employed here (Wang et al., 2011) assumes a constant efficiency of water uptake per unit root length (Li et al., 2012). CABLE with the alternative scheme, combining a function allowing variable root-density distribution (Lai and Katul, 2000) with a hydraulic redistribution scheme (which allows roots to move water from wetter to drier soil layers), was shown to correctly capture the magnitude of seasonal-scale droughts across three flux tower sites. The implementation of more realistic vegetation responses and adaptations to droughts should further refine the performance of the new hydrological scheme during dry-down periods.
Furthermore, prescribed monthly MODIS LAI was used in the simulations described here. Whilst CABLE and many other LSMs are capable of simulating LAI dynamically, it is common practice, particularly in coupled online simulations, to rely on a prescribed monthly climatology instead of time-varying LAI. This limits the realistic simulation of reductions in LAI during severe droughts and consequent feedbacks with radiative and evaporative processes such as interception losses. Canopy defoliation may, for example, decrease transpiration and interception but also increase the radiation reaching the soil surface, potentially increasing soil evaporation in the presence of available moisture. As these feedbacks were not considered in this study, the rate of dry-down may have been overestimated at sites that experienced LAI reductions during rainfall deficits which were not captured in the MODIS LAI inputs. However, as only the magnitude of LAI was varied in this study, it is not possible to quantify the effects of temporal errors in LAI on simulated Q E . Since both hydrological models were forced with identical LAI, it is unlikely that uncertainties in the prescribed LAI explain the excessive dry-down in the default hydrological scheme.
We have limited our analysis to short-term, seasonal-scale rainfall deficits. Multi-annual droughts, such as the Millennium drought in eastern Australia (van Dijk et al., 2013), are likely to exhibit different dynamics in terms of vegetation responses and consequent feedbacks with land surface fluxes, soil moisture states and albedo. Prudhomme et al. (2011), for example, showed the JULES LSM to more successfully reproduce long-term hydrological droughts than short-term events in terms of duration and severity. Realistic representations of plant adaptations to drought and dynamically varying LAI are likely to be increasingly important for representing vegetation resilience and coupled land surface processes during long-term droughts. We therefore suggest future studies of LSM performance under water-stressed conditions should evaluate models against drought events at different temporal scales.

Conclusions
This study evaluated the CABLE land surface model for seasonal-scale precipitation deficits using 20 flux tower sites distributed globally. We varied the soil hydrological and stomatal conductance parameterisations, and the inputs for LAI and soil properties. Our goal was to determine whether CABLE could capture dry-down associated with rainfall deficits as these components of the model are varied, or whether the model lacks the physical parameterisations to simulate this phenomenon.
On long timescales (annual and above), compensating biases mean that the two versions of CABLE performed similarly. However, when the analysis focused on periods of rainfall deficit, the new hydrological parameterisation based on Decker (2015) clearly improved the capability of CABLE to simulate Q E . Nevertheless, neither version of CABLE, and no reasonable choice of soil parameter, LAI or stomatal conductance, resolved systematic seasonal-scale biases in excessive spring soil evaporation. The reasons for these biases cannot be determined in isolation and we will next pursue these model limitations using the PLUMBER multi-model benchmarking framework (Best et al., 2015).
Our study highlights some opportunities for land modellers. First, our study again demonstrates the value of freely available flux tower data for identifying systematic biases in LSMs. The value of these data extends well beyond their common use in evaluating means or seasonal cycles. Second, a major role for LSMs is to simulate feedbacks to the atmosphere associated with rainfall deficits. We have demonstrated that there is skill in CABLE in simulating these feedbacks as a landscape dries, but clearly more work needs to be invested in capturing all the elements of a drying soil and its impacts on Q E . While the parameterisation of hydrology has been explored over the years, we remind the community that there are ongoing challenges in modelling soil moisture and links between soil moisture and evaporation that are not yet resolved. Third, we note that CABLE performs reasonably relative to other LSMs (Abramowitz et al., 2007; Best et al., 2015), and yet when we interrogate the model's performance at timescales when compensating biases are limited, CABLE displays some concerning behaviour. It is inevitable that other LSMs, if examined using these periods of precipitation deficit, will also exhibit problems. Clearly, formally testing LSMs against more extreme conditions, and in the context of a specific phenomenon (e.g. drought or heatwave), is a necessary step to build confidence in the projections from climate models that utilise LSMs.

Data availability
The model code is available at the SVN repository as per Sect. 2.
The data are available on request from the author.
The Supplement related to this article is available online at doi:10.5194/hess-20-2403-2016-supplement.

Figure 1. Location of selected and supplementary flux tower sites.

Figure 2. The range in simulated latent heat (red) during the whole observational data period using the default (left panel) and new (right panel) hydrological schemes with alternative LAI, g s and soil parameterisations. Observed latent heat is shown in black. The grey shading denotes the selected 1-year period.

Figure 3. The range in normalised mean error metrics of latent heat simulations using the default (red) and new (blue) hydrological schemes with alternative LAI, g s and soil parameterisations during the whole, annual and dry-down periods. Values closer to 0.0 indicate better model performance.

Figure 4. The range in mean bias error metrics of latent heat simulations using the default (red) and new (blue) hydrological schemes with alternative LAI, g s and soil parameterisations during the whole, annual and dry-down periods. Values closer to 0.0 indicate better model performance.
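For readers unfamiliar with the two metrics shown in Figs. 3 and 4, the sketch below gives the definitions as they are commonly written in LSM benchmarking. Whether these match the exact formulations used to produce the figures is an assumption, and the function names and sample values are hypothetical.

```python
def normalised_mean_error(model, obs):
    """Sum of absolute errors, normalised by the observed deviation from its mean."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and of equal length")
    obs_mean = sum(obs) / len(obs)
    denominator = sum(abs(obs_mean - o) for o in obs)
    if denominator == 0:
        raise ValueError("observations have no variability")
    return sum(abs(m - o) for m, o in zip(model, obs)) / denominator


def mean_bias_error(model, obs):
    """Mean of (model - observation); negative values indicate underestimation."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and of equal length")
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


# Hypothetical latent heat values (W m-2) during a dry-down.
observed = [100.0, 80.0, 60.0, 40.0]
simulated = [90.0, 60.0, 30.0, 10.0]
nme = normalised_mean_error(simulated, observed)
mbe = mean_bias_error(simulated, observed)
print(nme, mbe)
```

In this made-up example the simulated flux declines faster than the observed one, which yields a negative mean bias error and a normalised mean error above 1.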

Figure 5. The range in simulated latent heat (red) during the 1-year period using the default (left panel) and new (right panel) hydrological schemes with alternative LAI, g s and soil parameterisations. Observed latent heat is shown in black. The grey shading denotes the selected dry-down period. All time series run from January to December, except Tumbarumba, which runs from July to the following June.

Figure 6. The range in simulated latent heat (red) arising from the individual effects of soil parameters (left panel), g s (centre panel) and LAI (right panel) using the default hydrological scheme during the 1-year period. Observed latent heat is shown in black. The grey shading denotes the selected dry-down period. The individual effects were determined by fixing the other parameterisations at their default values (medium soil, Medlyn g s and centre LAI).

Figure 7. The range in simulated latent heat (red) arising from the individual effects of soil parameters (left panel), g s (centre panel) and LAI (right panel) using the new hydrological scheme during the 1-year period. Observed latent heat is shown in black. The grey shading denotes the selected dry-down period. The individual effects were determined by fixing the other parameterisations at their default values (medium soil, Medlyn g s , centre LAI and 0° slope).

Figure 8. The range in simulated mean latent heat arising from the individual effects of soil (brown), g s (blue), LAI (green) and slope (red) parameterisations using the new hydrological scheme during the 1-year and dry-down periods. The individual effects were determined by fixing the other parameterisations at their default values (medium soil, Medlyn g s , centre LAI and 0° slope).

Figure 9. The range in simulated latent heat (red) arising from the individual effects of the slope parameter using the new hydrological scheme during the 1-year period. Observed latent heat is shown in black. The grey shading denotes the selected dry-down period. The individual effects were determined by fixing the other parameterisations at their default values (medium soil, Medlyn g s and centre LAI).