Evaluating the value of a network of cosmic-ray probes for improving land surface modelling

Q: GENERAL COMMENTS OVERALL QUALITY The paper by Baatz et al. (2016432) describes an effort to use soil moisture data from nine closely-spaced (2000 km2) cosmic-ray probes (CRPs) with data assimilation scheme to improve the assessment of soil moisture in land-surface models. The goal is worthy and the execution is thorough. The results are significant: (1) the joint state (soil moisture) and parameter (soil properties, like sand percentage) estimation within data assimilation scheme produces better results than just state estimation; (2) in absence of soil data and meteorological data, CRPs alone can improve data assimilation results. On that account, the paper is suitable for publication in HESS.


1
Introduction 20 Soil water content (SWC) is a key variable of land surface hydrology and has a strong control on the partitioning of net radiation between latent and sensible heat flux. Knowledge of SWC is relevant for the assessment of plant water stress and agricultural production, as well as runoff generation as a response to precipitation events Robinson et al., 2008). In atmospheric circulation models, SWC is important as a lower boundary condition and is calculated as a state variable in land surface models. Coupling of atmospheric circulation models and land surface models allows quantifying the 25 role of soil moisture on atmospheric processes such as soil moisture-precipitation feedbacks (Koster et al., 2004;Eltahir, 1998) and summer climate variability and drought (Oglesby and Erickson, 1989;Sheffield and Wood, 2008;Seneviratne et al., 2006;Bell et al., 2015). It is therefore important to improve the modelling and prediction of SWC, but this is hampered by model deficiencies and lack of high quality data (Vereecken et al., 2016). Soil moisture measured by space-born remote sensing technologies provides information over large areas (e.g. Temimi et al., 2014). However, space-born remote sensing 30 supplies only information on the upper few centimeters, and data are not reliable for areas with dense vegetation. Therefore, in this paper an alternative source for soil moisture information is explored. Cosmic-ray probes (CRPs) measure fast neutron intensity which allows estimating SWC at an intermediate scale (Zreda et al., 2008;Desilets et al., 2010;Cosh et al., 2016;Lv et al., 2014) which is closer to the desired application scale of land surface models (Ajami et al., 2014;Chen et al., Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License.
2007; Shrestha et al., 2014). Fast neutrons originate from moderation of secondary cosmic particles from outer space by terrestrial atoms. These particles are mainly fast neutrons, which are moderated most effectively by hydrogen because of the similar atomic mass. Therefore, the corresponding neutron intensity measured by CRPs strongly depends on the amount of hydrogen within the CRP footprint, allowing for a continuous non-invasive soil moisture estimate over an area of ~15 ha (Kohli et al., 2015). The spatial extend of this measurement is desirable as it matches with the desired grid cell size of a land 5 surface model (Crow et al., 2012) and small scale heterogeneities are averaged over a larger area (Franz et al., 2013a;Desilets and Zreda, 2013). Vertical measurement depth ranges from a maximum of ~70 cm under completely dry conditions and decreases to roughly ~12 cm under wet conditions (e.g. 40 vol. % soil moisture) . Worldwide several CRP networks exist, like the North American COSMOS network (Zreda et al., 2012), the German CRP network (Baatz et al., 2014) installed in the context of the TERENO infrastructure measure (Zacharias et al., 2011) and the Australian 10 COSMoZ network (Hawdon et al., 2014).
Soil moisture data assimilation provides a way to improve imperfect land surface model predictions with measured soil moisture data by merging model predictions and data, and can consider the uncertainty of initial conditions, model parameters and model forcings. Ensemble Kalman Filtering (EnKF) is one of the most commonly applied data assimilation 15 methods (Evensen, 1994;Burgers et al., 1998). Soil moisture data assimilation has been the subject of intensive study for more than a decade now. An early contribution was provided by Houser et al. (1998) who assimilated remotely sensed soil moisture observations from a microwave radiometer into a land atmosphere transfer scheme using four-dimensional variational data assimilation. Rhodin et al. (1999) assimilated soil moisture data for a four-day period in order to obtain an improved characterization of the lower boundary condition for an atmospheric circulation model. They also used a 20 variational data assimilation approach. More recently, the Ensemble Kalman Filter (Reichle et al., 2002a;Dunne and Entekhabi, 2005;Crow, 2003), the Extended Kalman Filter (Draper et al., 2009;Reichle et al., 2002b), four-dimensional variational methods (Hurkmans et al., 2006) and the Local Ensemble Transform Kalman Filter (Han et al., 2015;Han et al., 2013) were applied for updating soil moisture states in land surface models. Reichle et al. (2002a) performed a synthetic experiment using L-band microwave observations of the Southern Great Plains Hydrology Experiment (Jackson et al., 1999) 25 to analyze the effect of ensemble size and prediction error. Dunne and Entekhabi (2005) showed that an Ensemble Kalman Smoother approach, where data from multiple time steps was assimilated to update current and past states, can yield a reduced prediction error compared to a pure filtering approach.
More recent work addressed joint state-parameter estimation in hydrologic land surface models with data assimilation 30 methods. Joint state-parameter estimation with EnKF is possible by an augmented state vector approach (Chen and Zhang, 2006), a dual approach (Moradkhani et al., 2005) or an approach with an additional external optimization loop (Vrugt et al., 2005). Pauwels et al. (2009) optimized soil hydraulic parameters of a land surface model with synthetic aperture radar data. Lee (2014) used Synthetic Aperture Radar soil moisture data to estimate soil hydraulic properties at the Tibetan plateau Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License.
using the EnKF and a Soil Vegetation Atmosphere Transfer model. Bateni and Entekhabi (2012) et al., 2007). Shi et al. (2014) used the Ensemble Kalman Filter for a synthetic multivariate data assimilation problem with a 5 land surface model and then applied it to real data (Shi et al., 2015). The cases illustrated a way to use real world data for estimating several parameters in hydrologic land models. (Kurtz et al., 2016) developed a particular CPU-efficient data assimilation framework for the coupled land surface-subsurface model TerrSysMP (Shrestha et al., 2014). They successfully updated 2 × 10 7 states and parameters in a synthetic experiment. Whereas these studies were made with land surface models, also in soil hydrological applications recently data assimilation was used to estimate soil hydraulic parameters. Early work 10 was by Margulis (2011, 2013) in the context of real-time control of waste water reuse in irrigation. Erdal et al. (2014) investigated the role of bias in the conceptual soil model and explored bias aware EnKF as a way to deal with it. Erdal et al. (2015) focused on handling of strong non-Gaussianity of the state variable in EnKF under very dry conditions. Montzka et al. (2013; explored the role of the particle filter for handling non-Gaussianity in soil hydrology data assimilation. They showed that the ability of a data assimilation system to correct the soil moisture state and estimate 15 hydraulic parameters strongly depends on the nonlinear character of the soil moisture retention characteristic. Song et al. (2014) worked on a modified iterative filter to handle the non-linearity and non-Gaussianity of data assimilation for the vadose zone. For a further literature review on data assimilation in the context of hydrological and land surface models we refer to Reichle (2008) and Montzka et al. (2012). 20 Shuttleworth et al. (2013) developed the Cosmic Ray Soil Moisture Interaction Code (COSMIC), which is a forward operator to be applied for assimilating neutron intensity observations from CRPs. The COSMIC code was evaluated for several sites (Baatz et al., 2014;Rosolem et al., 2014). Its capability to propagate surface soil moisture information into the deeper soil column was analyzed by Rosolem et al. (2014). The COSMIC operator was successfully implemented in the Data Assimilation Research Testbed (Rosolem et al., 2014) to allow for state updating by the Ensemble Adjustment Kalman Filter 25 (Anderson, 2001). The COSMIC operator was implemented in a python interface that couples the land surface model CLM and the LETKF for joint state parameter updating (Han et al., 2015). Neutron counts measured by CRP have been used in data assimilation studies to update model states (Han et al., 2015;Rosolem et al., 2014). Soil hydraulic parameters were also updated by assimilation of neutron counts (Han et al., 2016), but only for a synthetic study which showed its feasibility.
CRPs were also used for inverse estimation of soil hydraulic parameters of the Hydrus-1D model (Villarreyes et al., 2014). 30 This work further explores the value of measured neutron intensity by CRPs to improve modelling of terrestrial systems at the catchment scale (Simmer et al., 2015) using a land surface model. Compared to existing work the main novelties are: Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License.
(i) Data from a network of nine CRPs were assimilated in the Community Land Model version 4.5 (CLM) with an evaluation of the information gain by this assimilation at the larger catchment scale. Until now evaluations with CRPs were made for a single location, but not for a complete network of CRPs. It is a very important question whether CRPs can also improve the soil moisture characterization at the larger catchment scale and how dense the CRP network should be. The high variability of soil moisture at a short distance could potentially limit the CRP measurement value and make updating of soil moisture 5 contents further away from the sensor meaningless. On the other hand, soil maps and atmospheric forcings show spatial correlations over larger distances which suggests that CRP measurements potentially carry important information to update soil moisture contents for larger regions. If it is found that CRP networks with a density like in this study (10 stations per 2354 km 2 ) can improve soil moisture content characterization at the larger catchment scale, this is of high relevance and importance for agricultural applications, flood prediction and protection, and regional weather prediction (Whan et al., 10 2015;Koster et al., 2004;Seneviratne et al., 2010). The main research question addressed in this paper is therefore whether a CRP network of the density in this study can improve large scale soil moisture characterization by state and parameter updates.
(ii) Soil hydraulic parameters were updated together with the soil moisture states in a real-world case study at the larger catchment scale. The study in this paper also allows some evaluation of the feasibility of the updated large scale soil 15 hydraulic parameters.
In the following paragraphs are presented the model site and the measurements (2.1), the Community land Model and its parameterization (2.2), the COMIC forward model (2.3) and the data assimilation procedure (2.4). 20 2 Materials and methods

Site description and measurements
The model domain, the Rur catchment (2354 km 2 ), is situated in western Germany and illustrated in Fig and land use follow the topography. Annual precipitation ranges between less than 600 mm in the North to 1200 mm in the hilly South (Montzka et al., 2008). Annual potential evapotranspiration varies between 500 mm in the South and 700 mm in the North (Bogena et al., 2005). The Rur catchment CRP network comprises nine CRPs (CRS1000, HydroInnova LLC, 2009) which were installed in 2011 and 2012 (Baatz et al., 2014). Climate and soil texture of the CRP sites can be found in Table 1. The CRPs were calibrated in the field using gravimetric soil samples. At each site, 18 soil samples were taken along 30 three circles with distances of 25, 75 and 175 meters from the CRP, six samples evenly distributed along each circle. Each sample was extracted with a 50.8 x 300 mm round HUMAX soil corer (Martin Burch AG, Switzerland). The samples were split into 6 sub-samples with 5 cm length each and oven dried at 105 °C for 48 hours to measure dry soil bulk density and soil moisture. Lattice water was determined for each site using a heat conductivity detector. Soil bulk density, soil moisture, Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. lattice water and 12 hour averaged measured neutron intensity were used to determine calibration parameters specific for each CRP and the COSMIC operator.

Community land model and parameterization
The Community Land Model version 4.5 (CLM) was the land surface model of choice for simulating water and energy 5 exchange between the land surface and the atmosphere (Oleson et al., 2013). Some of the key processes which are solved by CLM are radiative transfer in the canopy space, interception of precipitation by the vegetation and evaporation from intercepted water, water uptake by vegetation and transpiration, soil evaporation, photosynthesis, as well as water and energy flow in the subsurface. SWC in CLM is influenced by precipitation, infiltration into the soil, water uptake by vegetation, surface evaporation and surface and subsurface runoff. Oleson et al. (2013) provide further details on CLM4.5. To limit the 10 scope and complexity of this study, CLM was run using satellite phenology, e.g. prescribed leaf area index data and the biogeochemical module turned off.
The spatial domain is discretized by rectangular grid cells by CLM. Each grid cell may have several types of land cover: Lake, urban, vegetated, wetland, and glacier. The vegetated part of the grid cell can be covered by several plant functional 15 types which are all linked to a single soil column. The soil column is vertically discretized by ten soil layers and five bedrock layers. Layer thickness increases exponentially from 0.007 m at the surface to 2.86 m for layer 10. Vertical water flow in soils is modelled by the 1D Richards equation. Soil hydraulic parameters are determined from sand and clay content using pedotransfer functions for the mineral soil fraction (Clapp and Hornberger, 1978;Cosby et al., 1984), and organic matter content for the organic soil fraction (Lawrence and Slater, 2008). 20 The following equations describe how soil texture and organic matter define the soil hydraulic properties in CLM such as porosity, hydraulic conductivity, the empirical exponent B and soil matric potential. Hydraulic conductivity ( [ ] in mm/s) at the depth z between two layers (i and i+1) is a function of soil moisture ( in m 3 /m 3 in layers i and i+1), saturated hydraulic conductivity ( in mm/s at z), saturated soil moisture ( in m 3 /m 3 ) and the empirical exponent B: 25 = } 1 where is the ice impedance factor. The ice impedance factor was implemented to simplify an increased tortuosity of water flow in a partly frozen pore space. It is calculated with = 10 −Ω using the resistance factor Ω = 6 and the frozen fraction of soil porosity = , ⁄ . Soil hydraulic properties are calculated separately for the mineral (min) and organic matter (om) soil components. Total porosity , is calculated using the fraction of organic matter ( , ) with: Hydrol. Earth Syst. Sci. Discuss., doi: 10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci.

2.3
Cosmic-ray forward model SWC retrievals were calculated from neutron intensity observations with the Cosmic-ray Soil Moisture Interaction Code COSMIC (Shuttleworth et al., 2013) following calibration results and the procedure of Baatz et al. (2014). COSMIC parameterizes interactions between neutrons and atoms in the subsurface, relevant for soil moisture estimation. COSMIC was calibrated against the more complex Monte Carlo Neutron Particle model MCNPx (Pelowitz, 2005) and needs 5 considerably less CPU-time than the MCNPx model. The reduced CPU-time need and the physically based parameterization make COSMIC a suitable data assimilation operator. The code was tested at multiple sites for soil moisture determination (Baatz et al., 2014;Rosolem et al., 2014) and analyzed in detail by Rosolem et al. (2014).
COSMIC assumes that a number of high energy neutrons enter the soil. In the soil, high energy neutrons are reduced by 10 interaction with the soil leading to isotropic generation of fast neutrons with less energy in each soil layer. Before resurfacing, fast neutrons are reduced again by soil interaction (Shuttleworth et al., 2013). The number of neutrons that reaches the CRP can be summarized in a single integral as where is an empirical coefficient that is CRP specific and needs to be estimated by calibration, ( ) is the integrated average attenuation of fast neutrons, = 0.404 − 0.101 × is the site specific empirical coefficient for the 15 creation of fast neutrons by soil, is the dry soil bulk density in g/cm 3 , is the total soil water density in g/cm 3 , and are the mass of soil and water, respectively, per area in g/cm 2 . 1 = 162.0 g cm -2 and 2 = 129.1 g cm -2 are empirical coefficients that were estimated using the MCNPx code (Shuttleworth et al., 2013). The integrated average attenuation of fast neutrons ( ) can be found numerically by solving where is the angle along a vertical line below the CRP detector to the element that contributes to the attenuation of fast 20 neutrons, 3 = −31.65 + 99.29 × is determined from soil bulk density and 4 = 3.16 g cm -2 is another empirical coefficient estimated using the MCNPx code (Shuttleworth et al., 2013). The COSMIC operator is discretized into 300 vertical layers of one cm thickness up to a depth of three meters. For each CLM grid cell in the model domain, simulated SWC is used to generate a SWC retrieval using the COSMIC code. Simulated SWC is handed from the CLM simulation history files to the COSMIC operator. Given the vertical SWC distribution of the individual CLM soil column, COSMIC 25 internally calculates the contribution of each layer to the simulated neutron intensity signal at the COSMIC soil surface. In this study, the contribution of each layer was used to calculate the weighted CLM SWC retrieval corresponding to the vertical distribution of simulated SWC in each grid cell. Measured neutron intensity of CRPs was used to inversely Hydrol. Earth Syst. Sci. Discuss., doi: 10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. determine a CRP SWC retrieval as by Baatz et al. (2014) assuming a homogeneous vertical SWC distribution. Then, the weighted CLM SWC retrieval is used in the data assimilation scheme to relate the CRP SWC retrieval to the model state.

Data assimilation
This study uses the local ensemble transform Kalman filter (LETKF) (Hunt et al., 2007) to assimilate SWC retrievals by 5 CRPs into the land surface model CLM. Besides other Ensemble Kalman Filter variants, the LETKF is applied in atmospheric sciences (Liu et al., 2012;Miyoshi and Kunii, 2012), ocean science (Penny et al., 2013) and also in land surface hydrology (Han et al., 2014a;Han et al., 2015). Updates were calculated either for states or jointly for states and parameters.
In case of joint state-parameter updates, a state augmentation approach was followed (Hendricks Franssen and Kinzelbach, 15 2008;Han et al., 2014b). In this case, the augmented model state vector is constructed from the weighted SWC, and the grid cell's sand, clay and organic matter content.
In order to relate the measured neutron intensity with the simulated SWC of CLM, the observation operator H (COSMIC) is applied on the measured neutron intensity in order to obtain the expected weighted SWC retrieval for each of the stochastic 20 realizations: The ensemble realizations of the modelled SWC retrieval 1 to with respect to the ensemble mean ̅ are stored in the vector Y f : The observation error correlation was reduced in space by the factor using the spherical model: where d is the distance to the observation and = 40 is the maximum observation correlation length, about half the 25 size of the catchment. Only SWC retrievals within the maximum observation correlation length were used for assimilation.
In addition, the mean weight vector ̅ is obtained as follows: where 0 is CRP SWC retrieval. In the ensemble space, a perturbation matrix is calculated from the symmetric square root of : 5 The final analysis is obtained from: A more detailed description of the LETKF can be found in (Hunt et al., 2007) and details on the implementation of the LETKF in combination with CLM are given by (Han et al., 2015).

Model Setup
In this study, discretization and parameterization of the hydrological catchment was done on the basis of high resolution data. The Rur catchment domain is spatially discretized by rectangular grid cells of 0.008 degree size (~750 m). The model time step was set to hourly. Land cover was assumed to consist of vegetated land units only, and a single plant functional type (PFT) for each grid cell was defined. The plant functional types were derived from a remotely sensed land use map 15 using RapidEye and ASTER data with 15 m resolution (Waldhoff, 2012). Sand content, clay content and organic matter content were derived from the high resolution regional soil map BK50 (Geologischer Dienst Nordrhein-Westfalen, 2009).
The BK50 soil map provides the high resolution soil texture for the catchment and is the most detailed soil map available for the defined region. As an alternative, simulations were also performed for a biased soil texture distribution with a fixed sand content of 80 % and clay content of 10 % (S80 soil map). This represents a large error with respect to the expected true soil 20 properties. It allows evaluating the joint state-parameter estimation approach because given the expected bias, we can evaluate whether and to what extend the soil properties are modified by the data assimilation to be closer to the available high resolution soil map. In addition, in many regions across the Earth a high resolution soil map is not available and land surface models which are applied for those regions, for example in the context of global simulations, might be strongly affected by the error in soil properties. It was tested how this impacted the simulation results. Maximum saturated fraction, a 25 surface parameter which is used for runoff generation, was calculated from a 10 meter digital elevation model (scilands GmbH, 2010). Leaf area index data were derived from monthly averaged Moderate Resolution Imaging Spectrometer data Hydrol. Earth Syst. Sci. Discuss., doi: 10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License.
(MODIS). CLM was supplied with hourly atmospheric forcing data from a reanalysis data set for the years 2010 to 2013 from the German Weather Service (DWD). The data was downscaled from a resolution of 2.8 km 2 to the CLM resolution using linear interpolation based on Delaunay triangulation. Forcing data include precipitation in mm/s, incident solar and longwave radiation in W/m 2 , air temperature in K, air pressure in hPa, wind speed in m/s and relative humidity in kg/kg at the lowest atmospheric level. 5

Model ensemble
Uncertainty was introduced into the regional CLM model by perturbed soil parameters and forcings. Contents of sand, clay and organic matter were perturbed with spatially correlated noise from a uniform sampling distribution with mean zero and standard deviation 10 % or 30 % (Han et al., 2015). By perturbing texture, soil parameters are also perturbed through the 10 pedotransfer functions used in CLM as specified in Sect. 2.2. Precipitation (σ = 0.5 or 1.0; lognormal distribution) and shortwave radiation (σ = 0.3; lognormal distribution) were perturbed with multiplicative noise with mean equal to one.
Longwave radiation (σ = 20 W m -2 ) and air temperature (σ = 1K) were perturbed with additive noise. The forcing perturbations were imposed with correlations in space (5 km) using a fast Fourier transform. Correlation in time was introduced with an AR(1)-model with autoregressive parameter=0.33. These correlations and standard deviations were 15 chosen based on previous data assimilation experiments (Reichle et al., 2010;Kumar et al., 2012;De Lannoy et al., 2012;Han et al., 2015). In this work, only results for precipitation perturbation with σ = 0.5 will be shown as results for σ = 1.0 were very similar. An ensemble size of 95 realizations was used in the simulations. Based on previous work (Baatz et al., 2015), the SWC retrieval uncertainty for CRPs was estimated to be 0.03 cm 3 /cm 3 .  In total, 26 simulation experiments were carried out using different setups (Table 2). Two open loop simulations were run for the BK50 soil map (OL-BK50) and the S80 soil map (OL-S80), respectively, without data assimilation and soil parameter perturbation of 30 %. These simulations are referred to as reference runs for the respective soil map. Simulation results of data assimilation runs were compared to the reference runs for quantification of data assimilation benefits. Simulations were Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Besides the data assimilation experiments also a larger number of jackknifing simulations was run to evaluate the data assimilation performance. These simulations allow evaluating the impact of the CRP network to improve SWC characterization at other locations, without CRP. In a jackknife experiment, data from eight CRP locations were assimilated and one CRP was excluded from the assimilation for evaluation purpose. At the evaluation location, simulated SWC (which is affected by the assimilation of the other eight probes) was compared to CRP SWC retrievals. For jackknife simulations, 10 the perturbation of soil texture was set to 30 % and precipitation perturbation was done with σ= 0.5. States and parameters at these sites were jointly updated, and simulations were made using the BK50 and the S80 soil maps as input. Therefore, a total of 18 jackknife simulations (jk-S80-* and jk-BK50-*) was performed (two soil maps times nine different simulations leaving away one CRP at a time).

15
Simulation results were evaluated with the root mean square error (E RMS ): where n is the total number of time steps, , is the CLM SWC retrieval at time step t and SWC t,CRP is the CRP SWC retrieval at time step t. In case SWC was assimilated at the corresponding time step, , is SWC prior to assimilation.
In the case the E RMS is estimated at a single point in time over all CRPs available, the number of time steps n can be replaced by the number of CRPs available. The second evaluation measurement in this study is the bias: 20  illustrates that SWC at both sites was lower with the S80 soil map than with the BK50 soil map. In Gevenich and 10

General Results
Merzenhausen, mean open loop SWC in 2011 was 0.17 cm 3 /cm 3 for the S80 soil map at both sites and 0.27 cm 3 /cm 3 for the BK50 soil map at both sites. CRP measurements at Merzenhausen started in May 2011. In the data assimilation runs SWC was immediately affected at the Merzenhausen and Gevenich sites as soon as Merzenhausen CRP SWC retrievals were assimilated. The simulated SWC for the PAR-S80-30 data assimilation run increased as compared to the S80 open loop simulation. The first observation at Gevenich was recorded on July 7 th , 2011. By that date, the simulated CLM SWC 15 retrieval was already close to the CRP SWC retrieval at the Gevenich site (Fig. 2) due to SWC updates which showed to have a beneficial impact. Fig. 2 also shows that the BK50 open loop run was close to the observed SWC at both sites, even without data assimilation. open loop (OL-S80) and CRP SWC retrievals. At Heinsberg, results show that assimilated SWC was closer to the CRP SWC retrieval when both states and parameters were updated (PAR-S80-30) than if only states were updated (Stt-S80). This is the case in the assimilation period and in the evaluation period. At the beginning of the evaluation period, the Stt-S80 simulation shows an increase in bias between modeled CLM SWC retrievals and CRP SWC retrieval within the first few days of 2013. 25 The bias of Stt-S80 remained throughout the evaluation period. In contrast, modeled SWC during the evaluation period was close to the CRP SWC retrieval if parameters were previously updated (PAR-S80-30).
The CRP at Wildenrath started operating on May 7 th , 2012. SWC retrievals at other CRPs were assimilated already from May 2011 onwards and affected SWC at Wildenrath (Fig. 3)

Verification period
The year 2013 was the verification year without data assimilation. E RMS values for the evaluation period 2013 are reported in Table 4. On the one hand, BK50 data assimilation runs with joint state-parameter estimation resulted in improved SWC at Bias calculated on the basis of a comparison of hourly SWC measured by CRP and simulated for 2013 is reported in Table 5. negative at all sites indicating that modeled SWC was higher than measured SWC. Joint state-parameter updates reduced the absolute bias on average to 0.03 cm 3 /cm 3 (PAR-S80-30) and 0.02 cm 3 /cm 3 (PAR-S80-10) for the S80 soil map. In case of the 25 BK50 soil map, the bias in 2013 increased to 0.03 cm 3 /cm 3 by joint state-parameter updates. State updates without parameter updates reduced the biases only marginally to 0.01 cm 3 /cm 3 for the BK50 soil map and to 0.09 cm 3 /cm 3 for the S80 soil map.
This indicates that state updates also can slightly improve SWC-characterization in the verification period due to improved initial conditions.

Jackknife simulations
The jackknife simulations investigated the impact of the network of CRPs for improving estimates of SWC at locations between the CRPs, outside the network. The errors shown in Table 4 (Table 5). Hence, bias in the jk-S80-* simulations improved compared to the open loop run but not in the jk-BK50-* simulations, where bias was already small.

Temporal evolution of parameters 20
The temporal evolution of the percentage sand content during the assimilation period for the nine CRP sites is shown in  in the Northern part of the catchment through data assimilation. The differences between open loop ET and data assimilation ET were larger for the S80 soil map than for the BK50 soil map. This could be related to the larger update in SWC in case of 25 the S80 scenario compared to the BK50 scenario.

Discussion
The applied data assimilation scheme improved soil moisture characterization in the majority of simulation experiments with texture and the soil hydraulic parameters was not stable. Temporal fluctuations imply that there may be multiple or seasonal optimal parameter values. This is also supported by the findings of the temporal behavior of E RMS during the evaluation period e.g. when in the dry summer 2013 the E RMS peaked in the PAR-S80-30 simulation. Many possible error sources were not subject to calibration in this study but they could be crucial for an even better soil moisture and more stable soil parameter estimation. In this study we only considered uncertainty of soil parameters, but also vegetation parameters are 5 uncertain. Also a number of other CLM-specific hydrologic parameters (e.g. decay factor for subsurface runoff and maximum subsurface drainage) strongly influence state variables in CLM and hence show also potential for optimization.
Considering this uncertainty could give a better uncertainty characterization. Precipitation is an important forcing for the model calculations and its estimate could be improved. For this study, precipitation data from the COSMO_DE re-analysis were used. A product which optimally combines gauge measurements and precipitation estimates from radar could give 10 better precipitation estimates. This could improve the soil moisture characterization and also potentially lead to better parameter estimates. Also other error sources like the ones related to the model structure play a significant role. This should be subject of future investigation.
Evaluation simulations for 2013 led to partly improved and partly deteriorated E RMS values when the BK50 soil map was 15 used as prior information on the soil hydraulic properties. The simulations with the S80 soil map on the contrary showed an improved soil moisture characterization in all simulation scenarios and the updated soil hydraulic parameter estimates for those simulations approached the values of the BK50 soil map. These results indicate that the soil hydraulic parameters derived from the BK50 soil map were already well suited for soil moisture predictions and updating soil texture and soil parameters could not improve further the results. E RMS values for simulations with state updates only  in 2013 imply the beneficial role of state updates only. However, the improvements in the evaluation period by state updates (without parameter values) are small compared to the improvements obtained by joint state-parameter estimation.
This illustrates the benefits of joint state-parameter updates compared to state updates only, and that soil moisture states are strongly determined by soil hydraulic parameters. It also illustrates that the improved characterization of soil moisture states in the assimilation period which results in improved initial states for the verification period loses its influence in the 25 verification period fast over time.
The jackknife simulations illustrated that a network of CRPs can improve modeled SWC if the soil map information is not sufficient. Temporal evolution of subsurface parameters of the jackknife simulations (e.g. jk-S80-*) was close to the evolution of parameter estimates by other simulations (e.g. PAR-S80-10). Parameter estimates at jackknife test sites were 30 inferred from multiple surrounding CRP sites, while updates at sites with CRP information were strongly inferred from single site information. A comparison of parameter estimates at the end of the assimilation period indicates that initial soil parameterization has a limited effect on the resulting parameter estimates. Parameter estimates of jk-BK50-30* and jk-S80-30* are close together at the end of the assimilation period. The CRP network led to improved results for the jackknife Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. evaluation simulations in case of the biased soil map. This suggests that assimilation of CRP data is particularly useful for regions with little information on subsurface parameters. We expect a tradeoff between the initial uncertainty on soil moisture content (related to the quality of the soil map and meteorological data) and the density of a CRP network. In case of a large uncertainty, like in regions with limited information about soils and a low density of meteorological stations, a sparse network of probes can already be helpful for improving soil moisture characterization. On the other hand, in regions with a 5 high density of meteorological stations and a high resolution soil map it can be expected that a high resolution CRP network is needed to further lower the error of soil moisture characterization. Further experiments in other regions with networks of CRPs are needed to get more quantitative information about this.
A question that remains to be answered is whether it is more beneficial to assimilate neutron counts measured by CRPs 10 directly or to assimilate CRP SWC retrievals derived from the neutron counts, as done in this study. Fast neutron intensity measured by CRPs is also affected by vegetation. Neutron count rate decreases with increasing biomass because of the hydrogen content in vegetation (Baatz et al., 2015). Seasonal biomass changes at a single site have a rather small impact on neutron intensity compared to differences between grass land site and a forest site (Baatz et al., 2015). Therefore, using measured neutron flux directly in a data assimilation framework in a catchment with different vegetation types would require 15 to account for the effects of vegetation types on neutron intensity. Hence, vegetation estimates for each grid cell would be necessary. At present, there are two methods that include biomass in the CRP calibration process (Baatz et al., 2015;Franz et al., 2013b) but both methods naturally require accurate biomass estimates, which are typically not available. Besides the uncertainty associated with CRP methods using biomass in the calibration process, biomass estimates also come along with high uncertainties. Therefore, in the case of a catchment with different vegetation types, it is desirable to circumvent the use 20 of biomass estimates, and assimilate directly SWC retrievals obtained at the observation sites instead of assimilating neutron intensity. Therefore, this study uses CRP SWC retrievals in the data assimilation scheme assuming that seasonal changes of biomass can be neglected.

Conclusions and Outlook 25
This study demonstrates the benefits of assimilating data from a network of nine cosmic-ray probes (CRP) in the land surface model CLM version 4.5. Although information on neutron flux intensity was only available at few locations in the catchment, the local ensemble transform Kalman filter (LETKF) allows updating of soil water content (SWC) at unmonitored locations in the catchment considering model and observation uncertainties. Joint state-parameter estimates improved soil moisture estimates during the assimilation and during the evaluation period. The E RMS and bias for the soil 30 moisture characterization reduced strongly for simulations initialized with a biased soil map and approached values similar to the ones obtained when the regional soil map was used as input to the simulations. E RMS -values in simulations with a regional soil map were not improved, because open loop simulation results were already close to the observations. The beneficial results of joint state-parameter updates were confirmed by additional jackknife experiments. This real-world case Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. study on assimilating CRP SWC retrievals into a land surface model shows the potential of CRP networks to improve subsurface parameterization in regional land surface models, especially if prior information on soil properties is limited. In many areas of the world, less detailed soil maps are available than the high resolution regional soil map applied in this study.
In these areas, more advanced sub-surface characterization is possible using CRP measurements and the data assimilation framework presented in this study. 5 For now, CRP neutron intensity observations were not assimilated directly. In future studies it would be desirable to use the COSMIC operator for assimilating neutron intensity observations directly. However, in this case the impact of biomass on the CRP measurement signal would have to be taken into account. Therefore, it is desirable to further develop the COSMIC operator to include the impact of biomass on neutron intensities. Using the biogeochemical module of CLM would then 10 allow to characterize local vegetation states as input for the measurement operator. Remotely sensed vegetation states are another option to characterize vegetation states as input for the measurement operator. Both methods require additional field measurements for the verification of vegetation state estimates. The further extension of the data assimilation framework would also enable the estimation of additional sub-surface parameters. The impact of other sub-surface parameters such as subsurface drainage parameters and the surface drainage decay factor on SWC states and radiative surface fluxes has already 15 been shown . Estimation of these parameters is desirable because of the inherent uncertainty of these globally tuned parameters. However, estimation of soil texture and organic matter content was demonstrated to be already beneficial for improved SWC modeling. Hence, this study represents a way forward towards the integration of CRP information in the calibration of large scale weather prediction models.

Data Availabilitiy 20
Most data presented in this study are freely available via the TERENO data portal TEODOOR (http://teodoor.icg.kfajuelich.de/). Atmospheric data were licensed by the German Weather Service (DWD), and the BK50 soil map was licensed by the Geologischer Dienst Nordrhein-Westfalen.

Acknowledgements
We gratefully acknowledge the support by the SFB-TR32 "Pattern in Soil-Vegetation-Atmosphere Systems: Monitoring, 25 Modelling and Data Assimilation" funded by the Deutsche Forschungsgemeinschaft (DFG) and TERENO (Terrestrial    Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. Fig. 1. Map of the Rur catchment and locations of the nine cosmic-ray probes. The hilly South of the catchment is prone to more rainfall, lower average temperatures and less potential evapotranspiration than the North of the catchment.  Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-432, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 August 2016 c Author(s) 2016. CC-BY 3.0 License. Table 2: Overview of simulation scenarios: Open loop (OL-*) with variation in the soil map BK50 or S80, data assimilation run with state update (Stt) or joint state-and parameter update (PAR) with variation in the soil map perturbation (-10 or -30), and jackknife evaluation runs (jk-S80-1 to 9, and jk-BK50-1 to 9