Incorporation of globally available datasets into the cosmic-ray neutron probe method for estimating field scale soil water content

Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-92, 2016. Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 2 March 2016. © Author(s) 2016. CC-BY 3.0 License.

The need for accurate, real-time, reliable, and multi-scale soil water content (SWC) monitoring is critical for a multitude of scientific disciplines trying to understand and predict the earth’s terrestrial energy, water, and nutrient cycles. One promising technique to help meet this demand is fixed and roving cosmic-ray neutron probes (CRNP). However, the relationship between observed low-energy neutrons and SWC is affected by local soil and vegetation calibration parameters. This effect may be accounted for by a calibration equation based on local soil type and the amount of standing biomass. However, determining the calibration parameters for this equation is labor and time intensive, thus limiting the full potential of the roving CRNP in large surveys and long transects, or its use in novel environments. In this work, our objective is to develop and test the accuracy of using globally available datasets (clay weight percent, soil bulk density, and soil organic carbon) to support the operability of the CRNP. Here, we develop a 1 km product of soil lattice water over the continental United States (CONUS) using a database of in-situ calibration samples and globally available soil taxonomy and soil texture data. We then test the accuracy of the global dataset in the CONUS using comparisons from 61 in-situ samples of clay percent (RMSE = 5.45 wt. %, R² = 0.68), soil bulk density (RMSE = 0.173 g/cm³, R² = 0.203), and soil organic carbon (RMSE = 1.47 wt. %, R² = 0.175). Next, we conduct an uncertainty analysis of the global soil calibration parameters using a Monte Carlo error propagation analysis (maximum RMSE ~0.035 cm³/cm³ at a SWC = 0.40 cm³/cm³). In terms of vegetation, fast growing crops (i.e., maize and soybeans) contribute to the CRNP signal primarily through the water within their biomass, and this signal must be minimized for accurate estimation of SWC. We estimated the biomass water signal by using a vegetation index derived from MODIS imagery as a proxy for standing wet biomass (RMSE < 1 kg/m²). Lastly, we make recommendations on the design and validation of future roving CRNP experiments.

By the year 2050, over nine billion people are predicted to inhabit the Earth (United Nations, 2015). The monumental task of feeding the projected global population will require a near doubling of grain production (FAO, 2009).
As of today, the majority (~2/3) of water [...] microwaves lack significant penetration depths (~2-5 cm; Njoku et al., 1996), limiting their effectiveness as a remote sensing input for full root zone coverage in LSMs.

Alternatively, the field of geophysics offers a variety of techniques to help fill the spatial and temporal gaps between point sensors and remote sensing products (Robinson et al., 2008). [...] (Zreda et al., 2012). CRNPs estimate the area-average SWC because neutrons are well mixed within the footprint of the sensor, which typically has a radius of ~300 m and depths of ~12-76 cm ([...] Zreda, 2013; Kohli et al., 2015).

To date, the CRNP method has been mostly used as a fixed system in one location to continuously measure SWC as part of a large monitoring network (Zreda et al., 2012; Hawdon et al., 2014). Recent advancements have allowed the CRNP to be used in mobile systems to monitor transects across Hawaii (Desilets et al., 2010), monitor entire basins in southern Arizona (Chrisman et al., 2013), compare against remote sensing products in central Oklahoma (Dong et al., 2014), and monitor ~140 agricultural fields in eastern Nebraska (Franz et al., 2015). In order to accurately estimate SWC, the CRNP method relies on a calibration function to convert observed low-energy neutron counts into SWC (Desilets et al., 2010; Bogena et al., 2013; see Sec. 2.2 for full details). The calibration procedure requires site-specific sampling of both soil and vegetation data in order to determine the required parameters. While the calibration of a fixed CRNP is fairly standardized (Zreda et al., 2012; Franz et al., 2012; Iwema et al., 2015; Baatz et al., 2015), the heterogeneous nature of soil and vegetation characteristics across a landscape makes the pragmatic calibration of the mobile CRNP a significant challenge.
Specifically, the presence of water within vegetation and the soil minerals may alter the shape of the local calibration function and thus the accuracy of the estimated SWC. Reliable, accurate, depth-dependent, and localized soil and vegetation spatial information for use in the calibration function is critical to fully harness the potential of the CRNP to monitor landscape-scale SWC across the globe.

The objective of this study is to explore the utility and accuracy of currently available global soil and vegetation datasets (soil organic carbon, soil bulk density, soil clay weight percent, and crop biomass) for use in the calibration function. To accomplish our objective, we aimed to answer the following questions:
1) Can global datasets of soil bulk density, soil organic carbon, and soil clay weight percent be used in lieu of in-situ sampling within reasonable error for use in the CRNP calibration function?
2) Can remotely sensed vegetation products, specifically the Green Wide Dynamic Range Vegetation Index (GrWDRVI), be used to quantify fresh biomass with reasonably low error (< 1 kg/m²) for use in the CRNP calibration function?

To answer these questions, we tested the accuracy of these datasets against in-situ sample datasets of the same parameters. Existing in-situ datasets from across the CONUS were then combined with in-situ datasets from eastern Nebraska, which focused on fast growing crops of maize and soybean. Specifically, we tested the accuracy and use of a ~1 km global soil dataset (Shangguan et al., 2014). In addition, we examined the use of the Green Wide Dynamic Range Vegetation Index (GrWDRVI; Gitelson, 2004) derived from NASA's MODIS sensor aboard the Terra satellite for estimating the amount of fresh crop biomass.
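As an illustration of how such an index can be formed from satellite reflectances, the following minimal Python sketch computes a green wide dynamic range index. It assumes the commonly cited WDRVI form, (alpha * NIR - band) / (alpha * NIR + band), with the green band substituted for red. The weighting coefficient `alpha = 0.1`, the function name `gr_wdrvi`, and the reflectance values are illustrative assumptions and are not taken from this study.

```python
def gr_wdrvi(nir, green, alpha=0.1):
    """Green wide dynamic range vegetation index (unitless).

    Assumed form: (alpha * NIR - green) / (alpha * NIR + green).
    nir, green: surface reflectances (0-1) in the near-infrared and green bands.
    alpha: weighting coefficient on NIR; 0.1 is an assumed, illustrative value.
    """
    denominator = alpha * nir + green
    if denominator == 0:
        raise ValueError("alpha * nir + green must be non-zero")
    return (alpha * nir - green) / denominator


if __name__ == "__main__":
    # Hypothetical reflectances for a sparse and a dense crop canopy.
    sparse = gr_wdrvi(nir=0.25, green=0.10)
    dense = gr_wdrvi(nir=0.45, green=0.05)
    print(f"sparse canopy: {sparse:.3f}, dense canopy: {dense:.3f}")
```

In this form the index increases as near-infrared reflectance rises relative to green reflectance, which is why it can serve as a proxy for increasing fresh biomass. Any regression from the index to biomass would need crop-specific coefficients, which are not reproduced here.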

The remainder of the paper is organized as follows: In the Methods section, the CRNP method is first presented, with emphasis on the integration of the calibration function and soil and vegetation parameters to convert observed low-energy neutron counts into SWC. Next, in-situ methods for estimating the soil and vegetation calibration parameters are discussed, which is followed by discussions on the soil and vegetation products available globally at ~1 km resolution. In the Results section, we first compare the in-situ soil sampling against the global datasets. Next, we develop a 1 km CONUS soil lattice water map using in-situ samples. We then compare the GrWDRVI against in-situ samples from Nebraska to estimate the changes in maize and soybean fresh biomass. Lastly, we present an error propagation analysis investigating the potential uncertainty of using the global soil calibration data vs. local in-situ sampling. The paper concludes with a discussion on best practice recommendations for calibrating and validating a roving CRNP experiment.

The CRNP estimates area-averaged SWC by measuring the intensity of epithermal neutrons near the ground surface (Zreda et al., 2008, 2012). A cascade of neutrons with varying energy levels is created in the earth's atmosphere when incoming higher energy particles produced within supernovae interact with atmospheric nuclei (Zreda et al., 2012; Kohli et al., 2015). After fast neutrons are created, they continue to lose energy during numerous collisions with nuclei in air and soil, and become epithermal neutrons (i.e., the neutrons which are primarily measured by the moderated detector).
The abundance of hydrogen atoms in the air and soil largely controls the removal rate of epithermal neutrons from the system (Zreda et al., 2012). [...]

In addition to the fixed CRNP measuring hourly SWC, a roving version of the CRNP has been used to reliably measure SWC at temporal resolutions as fine as 1 minute (Chrisman et al., 2013; Dong et al., 2014), providing the ability to make SWC maps over hundreds of square kilometers in a single day. Moreover, Franz et al. (2015) found that a combination of fixed and roving CRNP data in a statistical framework has the ability to form an accurate, real-time, and [...]

[...] calibration function has been successfully tested against direct sampling and point sensor measurements with RMSE < 0.03 cm³/cm³ across the globe, including arid shrublands in Arizona, USA (Franz et al., 2012), semi-arid forests in Utah, USA (Lv et al., 2014), to humid [...]

where $\theta_p$ (g/g) is the gravimetric pore water content in the soil, $\theta_{LW}$ (g/g) is the soil lattice water, and $\theta_{SOC}$ (g/g) is the soil organic carbon water equivalent. The volumetric soil water content, SWC (cm³/cm³), is found by multiplying $\theta_p$ by $\rho_b / \rho_w$, where $\rho_b$ (g/cm³) is dry soil bulk density and $\rho_w$ = 1 g/cm³ is the density of water.
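The conversion from neutron counts to volumetric SWC described above can be sketched in Python as follows. This is a minimal illustration, not the calibration used in this study: it assumes the functional form and coefficients commonly attributed to Desilets et al. (2010), a0 / (N/N0 - a1) - a2 with a0 = 0.0808, a1 = 0.372, and a2 = 0.115, and it omits any vegetation correction. The function names, the reference count rate `n0`, and the example soil values are hypothetical.

```python
def gravimetric_total_water(n_counts, n0, a0=0.0808, a1=0.372, a2=0.115):
    """Total gravimetric water (pore + lattice + SOC water equivalent), g/g.

    Assumed functional form: a0 / (N/N0 - a1) - a2.
    n_counts: observed (corrected) neutron count rate.
    n0: reference count rate over dry soil (site-specific, assumed known).
    """
    ratio = n_counts / n0
    if ratio <= a1:
        raise ValueError("N/N0 must exceed a1 for the function to be defined")
    return a0 / (ratio - a1) - a2


def volumetric_swc(n_counts, n0, lattice_water, soc_water_eq,
                   bulk_density, water_density=1.0):
    """Volumetric soil water content (cm3/cm3).

    Subtracts lattice water and SOC water equivalent (both g/g) from the
    total gravimetric water, then multiplies by bulk density / water density.
    """
    theta_total = gravimetric_total_water(n_counts, n0)
    theta_pore = theta_total - lattice_water - soc_water_eq
    return theta_pore * bulk_density / water_density


if __name__ == "__main__":
    # Hypothetical site: N0 = 1000 counts, bulk density 1.4 g/cm3.
    swc = volumetric_swc(n_counts=600, n0=1000, lattice_water=0.03,
                         soc_water_eq=0.01, bulk_density=1.4)
    print(f"SWC = {swc:.3f} cm3/cm3")
```

Because bulk density multiplies the gravimetric pore water directly, an error in bulk density scales the SWC estimate proportionally, whereas errors in lattice water and the organic carbon water equivalent enter as additive offsets.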

To account for the effects of time-varying above-ground vegetation on the epithermal neutron [...]

We note the coefficients are less suitable for forest canopies given the need for a neutron geometric efficiency factor described further in the supplemental material of Franz et al. (2013). We also refer the reader to Coopersmith et al. (2014) [...]

[...] composite samples can be analyzed directly for lattice water (g/g), soil total carbon (TC, g/g), and inorganic carbon (TIC, g/g), determined by measuring CO₂ after the sample is acidified (e.g. [...] (Table S1). We note that this procedure could be used globally if in-situ lattice water samples were available for all 25 soil taxonomic groups.

From these relationships, a map of the CONUS lattice water weight percent was developed by using either the mean value of the in-situ lattice water or the linear relationships between clay weight percent (from the GSDE) and the lattice water in-situ samples. Additionally, in-situ samples of soil organic carbon, bulk density, clay weight percent, and lattice water were compared against the same parameters derived from the GSDE. [...]

[...] Using these data, we simulated 100,000 realizations of the "true" (i.e., from the in-situ sampling) and perturbed soil properties using a multivariate normal distribution. Using a range of observed neutron counts and solving equations (1-2) with the true and perturbed soil properties, we also estimated the true and perturbed SWC. In order to provide realistic constraints on the [...]
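The general idea of this kind of error propagation can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the analysis performed in this study: the soil properties are perturbed with independent normal errors rather than a multivariate normal distribution, the "true" properties are held fixed, the calibration function uses the form and coefficients commonly attributed to Desilets et al. (2010), and all property values and error standard deviations are hypothetical.

```python
import math
import random


def swc_from_ratio(ratio, lattice_water, soc_water_eq, bulk_density):
    """Volumetric SWC (cm3/cm3) from N/N0.

    Assumed form: (0.0808 / (N/N0 - 0.372) - 0.115 - lattice - SOC) * rho_b.
    """
    theta_total = 0.0808 / (ratio - 0.372) - 0.115
    return (theta_total - lattice_water - soc_water_eq) * bulk_density


def monte_carlo_rmse(ratio, true_props, error_sd, n_draws=10_000, seed=0):
    """RMSE between SWC from true and from perturbed soil properties.

    true_props: (lattice water g/g, SOC water equivalent g/g, bulk density g/cm3).
    error_sd: standard deviation of the error in each property (same order).
    Errors are drawn independently, which is a simplification.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    swc_true = swc_from_ratio(ratio, *true_props)
    squared_error = 0.0
    for _ in range(n_draws):
        perturbed = [rng.gauss(value, sd)
                     for value, sd in zip(true_props, error_sd)]
        squared_error += (swc_from_ratio(ratio, *perturbed) - swc_true) ** 2
    return math.sqrt(squared_error / n_draws)


if __name__ == "__main__":
    true_props = (0.03, 0.01, 1.4)     # hypothetical in-situ values
    error_sd = (0.01, 0.005, 0.17)     # hypothetical dataset errors
    for ratio in (0.5, 0.6, 0.7, 0.8):  # wet to dry conditions
        rmse = monte_carlo_rmse(ratio, true_props, error_sd)
        print(f"N/N0 = {ratio:.1f}: RMSE = {rmse:.4f} cm3/cm3")
```

Setting two of the three error standard deviations to zero isolates the contribution of the remaining property, which is one simple way to compare sensitivities.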

The comparisons between observed clay weight percent, soil bulk density, soil organic carbon and the GSDE values are summarized in Table S1 and Figure 2 a [...] soils (Greacen, 1981). We found that a significant linear relationship existed between clay wt. % [...] the error propagation analysis described in sections 2.6 and 3.3. We note that each of the mean differences followed a normal distribution (see Table S1 for in-situ and GSDE values).
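For readers who wish to reproduce this type of comparison with their own samples, a minimal Python sketch of the statistics involved is given below: an ordinary least squares line (for example, lattice water against clay weight percent), the RMSE, and the coefficient of determination. The sample values are hypothetical, and computing R² as 1 - SS_res/SS_tot is one common convention that is assumed here rather than taken from this study.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical paired samples: clay (wt. %) and lattice water (wt. %).
    clay = [8.0, 15.0, 22.0, 30.0, 41.0]
    lattice_water = [1.1, 2.0, 2.6, 3.9, 4.8]
    slope, intercept = linear_fit(clay, lattice_water)
    predicted = [slope * c + intercept for c in clay]
    print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
    print(f"RMSE = {rmse(lattice_water, predicted):.3f}, "
          f"R2 = {r_squared(lattice_water, predicted):.3f}")
```

The same `rmse` and `r_squared` functions can be applied directly to paired in-situ and gridded values of a single property, in which case the gridded value plays the role of the prediction.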

Using the 11 years of destructive vegetation sampling from 3 fields near Mead, NE, we found that the GrWDRVI was able to predict standing wet biomass (SWB) when partitioning the data into maize and [...] within the uncertainty of destructive biomass sampling in crops (Franz et al., 2013; 2015).

In order to further assess the accuracy of our datasets, we synthetically altered the parameters via a Monte Carlo error analysis. This was done using the GSDE soil parameters ($\theta_{LW}$, $\theta_{SOC}$, and $\rho_b$) as compared to using local sampling (Figure 6). The analysis revealed that for the given bounds of $\theta_{LW}$, $\theta_{SOC}$, and $\rho_b$, the maximum RMSE was around 0.035 cm³/cm³ at a SWC = 0.40 cm³/cm³. The asymmetric shape of all the curves is expected given the nonlinear calibration function in Eq. (4) and the bounded nature of soil moisture. We found that SWC was by far the most sensitive to $\rho_b$, followed by $\theta_{LW}$ and then $\theta_{SOC}$. We expect the influence of vegetation changes on the overall accuracy of SWC to be small (< 0.01 cm³/cm³) given the low RMSE described in section 3.2 (< 1 kg/m², which is ~1 mm of water or 0.0033 cm³/cm³ for a soil depth of 300 mm). We also note the critical factor in the error propagation analysis is the assumed range of $\rho_b$, given that it is directly multiplied by the gravimetric water content in the [...]

[...] water for the mollisol soil taxonomic group (see Greacen, 1981; Zreda et al., 2012). This strong [...] given the small standard errors of the means (not shown but can be calculated from data in Table 1). The current analysis did not contain enough samples for the soil taxonomic groups of andisol, gelisol, histosol, oxisol, or vertisol to perform a linear regression or assign a mean value. We [...] Table S1). Given the widespread interest in both the fixed and roving cosmic-ray technology, a [...]

While the developed regression relationships for maize and soybean (Table S3) were tested against independent biomass estimates from Waco, NE (Figure 5), we note that further validation is needed. In terms of a strategy for estimating SDB, we suggest that proxies such as crop type and growth stage be used. Franz et al. (2013 and 2015) found that in early stages, [...] compute proxies (e.g. growing degree days) or simulated from crop models (Allen et al. 1998). We note that having a reasonably accurate estimate of SWB and thus BWE (within ~1 kg/m²) is all that is required to have a relatively small impact (< 0.01 cm³/cm³) on the estimated SWC. Finally, we note that this methodology is not applicable to areas with woody biomass. Following [...]
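The unit conversion behind the statement that a biomass water error of 1 kg/m² corresponds to roughly 0.0033 cm³/cm³ can be written out as a short Python sketch. The function names are hypothetical, and the 300 mm soil depth is the value quoted in the text; the sketch only restates that arithmetic and assumes the biomass water is expressed as an equivalent water layer spread evenly over the soil depth.

```python
WATER_DENSITY_KG_M3 = 1000.0  # density of liquid water


def water_depth_mm(biomass_water_kg_m2):
    """Equivalent water layer (mm) for a biomass water load in kg/m2.

    1 kg of water over 1 m2 is 0.001 m3 / 1 m2 = 0.001 m = 1 mm.
    """
    depth_m = biomass_water_kg_m2 / WATER_DENSITY_KG_M3
    return depth_m * 1000.0


def equivalent_swc(biomass_water_kg_m2, soil_depth_mm=300.0):
    """Equivalent volumetric water content (cm3/cm3) over the given soil depth."""
    return water_depth_mm(biomass_water_kg_m2) / soil_depth_mm


if __name__ == "__main__":
    for error in (0.5, 1.0, 3.0):  # hypothetical biomass water errors, kg/m2
        print(f"{error:.1f} kg/m2 -> {equivalent_swc(error):.4f} cm3/cm3")
```

Under this simple scaling, the equivalent SWC error grows linearly with the biomass water error and shrinks as the assumed soil depth increases.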

In this work, we developed a framework using globally available datasets for estimating [...]

[...] (Table S1).

[...] Table 1 for summary of data by taxonomic group, Table S1 for raw data, and Table 2 for statistical summary of differences between in-situ and GSDE product. Note error bars denote +/- 1 standard deviation.

[...] Table 1 [...] bulk density, soil lattice water, soil organic carbon, and clay weight fraction collected over a 12.6 ha circle and averaged over the top 30 cm (Table S1). Missing areas indicate surface water bodies or soil taxonomic groups with no or limited in-situ lattice water sampling (see Table 1).