Estimating unconsolidated sediment cover thickness by using the horizontal distance to a bedrock outcrop as secondary information

Unconsolidated sediment cover thickness (D) above bedrock was estimated by using a publicly available well database from Norway, GRANADA. General challenges associated with such databases typically involve clustering and bias. However, if information about the horizontal distance to the nearest bedrock outcrop (L) is included, does the spatial estimation of D improve? This idea was tested by comparing two cross-validation results: ordinary kriging (OK) where L was disregarded; and co-kriging (CK) where cross-covariance between D and L was included. The analysis showed only minor differences between OK and CK with respect to differences between estimation and true values. However, the CK results gave in general less estimation variance compared to the OK results. All observations were declustered and transformed to standard normal probability density functions before estimation and back-transformed for the cross-validation analysis. The semivariogram analysis gave correlation lengths for D and L of approx. 10 and 6 km. These correlations reduce the estimation variance in the cross-validation analysis because more than 50 % of the data material had two or more observations within a radius of 5 km. The small-scale variance of D, however, was about 50 % of the total variance, which gave an accuracy of less than 60 % for most of the cross-validation cases. Despite the noisy character of the observations, the analysis demonstrated that L can be used as secondary information to reduce the estimation variance of D.


Introduction
Global warming and natural climate fluctuations give rise to urgent calls from water authorities to quantify impacts on the hydrological cycle.These needs are based on numerous indications of expected changes in the pattern of precipitation, temperature and vegetation (Haddeland et al., 2013;Bierkens, 2015;Tang and Oki, 2016).A cardinal question in hydrological modeling is the storage capacity of water in the catchment.Storage capacity determines catchment response to input from rainfall or snowmelt events.Storage volumes are therefore important for river discharge calculations and water balance assessments (Meyles et al., 2003;Tromp-van Meerveld and McDonnell, 2006;Beven, 2006;Skaugen et al., 2015).The primary storage capacity in many catchments is governed by the spatial distribution of sediments in the landscape (Lamb et al., 1997).
Most hydrological models use lumped averages for physical parameters in space, either for large areas or for the entire catchments (Beven and Binley, 1992;Devi et al., 2015).In some of these models, the storage volume is a calibration parameter that may be difficult to assess.In such cases the interpretation of the storage parameter may be misleading or even inconsistent with physics (Skaugen and Onof, 2013).Thus, to increase prediction reliability, calibration parameters should be replaced by physically based estimates as far as possible.
Soil properties have been registered and mapped by national authorities for many years, but the same attention has not been directed towards the sediment thickness and the Observations of D i are indicated in three boreholes (i = 1, 2, 3) and with the associated horizontal distance to the nearest outcrop (L i ).In areas where B is not exposed, B can be estimated by using observations of D as the primary variable and information of L as secondary information.
bedrock topography.Some remarkable exceptions do exist.One example is the bedrock topography map of Iowa, USA (Witzke et al., 2003).This map was constructed by using well data and digital soil maps that also included observations of outcrops and sparse cover ( < 1 m) of sediments (R. R. Anderson, personal communication, 2011).In the study presented below, similar data sources were used: a public well database and geological maps showing exposed bedrock and very thin cover of sediments.The intention with this paper is to test simple geostatistical methods to produce similar maps with less consumption of time.
Monitoring of environmental variables takes place as a response to an increasing awareness of human impact on nature.A large number of such variables are available today in public databases.One example is the Norwegian well database GRANADA (NGU, 2016a).According to Norwegian legislation, new wells, boreholes and probe drillings are reported to the Norwegian Geological Survey (Lovdata, 1996).One of the variables stored in GRANADA is the thickness of unconsolidated sediments at the borehole location D(u i ).The purpose of this study was to explore the possibilities of using recordings of D(u i ) to estimate sediment thickness E[D(u)], and estimation variance Var[D(u)].The number of recorded D(u i ) is increasing for every day, but the average spatial density of D(u i ) is still relatively sparse.Hence, to improve the estimation quality, which in this context means to minimize the estimation variance Var[D(u)], an auxiliary function is attached to D(u), namely the horizontal distance to the nearest outcrop L(u) (Fig. 1).
L(u) is interesting to explore as a secondary variable because it is easy to derive at any location of interest.The statistical relation, however, between D(u) and L(u) is not obvious except when the bedrock is exposed to the atmosphere.If L(u j ) = 0, then by definition D(u j ) = 0.It does not imply, however, that if L(u j ) is small, D(u j ) is also small, because the bedrock topography may be very irregular or even discontinuous in some places.The contrary is also true: if L(u j ) is enormous, then D(u j ) is not necessarily always large.The reason is of course that the bedrock may undulate below a thin cover of sediments.Even though there are local anomalies, there might exist a statistical relation between L(u) and D(u) that could be used to reduce the estimation uncertainty of D(u).
It should be emphasized that the relation between D(u) and L(u) depends on the geological setting.The data used for the current study are taken from an area where the distribution of unconsolidated sediments is determined by the last glaciation period.
Before presenting the data material in more detail, some statistical challenges should be mentioned.In brief, these challenges are related to asymmetric probability density functions (pdfs), clustering, and bias of empirical data.
High-resolution environmental data usually deviate strongly from Gaussian pdfs.The experimental pdfs of D(u i ) and L(u i ) reveal a majority of small values and a few extremely large values.Standard Gaussian statistics can therefore not be applied directly, at least not without modifications.The challenge of non-Gaussian pdfs is relevant for all problems dealing with processes at different scales.Bayesian statistics have given successful contributions to the estimation of non-Gaussian variables by using Markov chain Monte Carlo (MCMC) simulation algorithms and by including independent (a priori) information (Omre and Halvorsen, 1989;Andrieu et al., 2003).Recently, an efficient numerical method was introduced (Rue et al., 2009).In this method the estimation is expressed as a stochastic partial differential equation and the pdfs are derived for heterogeneous stochastic fields.
It is beyond the scope of this article to review the large number of different methods, but it should be kept in mind that there exist numerous methods that are available for exploring environmental data.The present study uses the normal score transform (Deutsch and Journel, 1998), which means that after the transform, standard Gaussian statistics were utilized for estimation and afterwards transformed back to the original sampling domain.
Environmental data are prone to preferential sampling.Preferential sampling usually implies clustering and bias.In this context clustering means inhomogeneous sampling frequency in space, while bias is systematic oversampling (or undersampling) with respect to low (or high) values.Bias and clustering may appear as independent processes, but they may also be related to each other by another (hidden) factor.The data material used for the current study was af-fected by serious clustering.The reason is simply that wells, boreholes and probe drillings are located where people live.Urban areas account for a higher density of observations than rural or remote areas (Fig. 2).Clustering affects the estimation of statistical moments, and the effect of overand under-representation of observations should therefore be suppressed.Omre (1984) suggested calculating Thiessen polygons to control clustering effects.The area of the polygons is proportional to the weight coefficients associated with the different observations.In other studies observations are iteratively removed in the calculations of statistical moments (Olea, 2007).For the current study, a grid-based method was applied where declustering weights were obtained by gridding the sampling domain.The number of observations within each grid cell were used to calculate weight coefficients (Deutsch and Journel, 1998).In this way areas with a high density of observations received less weight than areas with less frequent observations.Biased experimental data are ubiquitous in environmental science.A prominent example is observations of precipitation.Several studies document a systematic deficit in the observations due to wind and turbulence (Wolff et al., 2015).In the context of sediment thickness D(u), there are also reasons for systematic underrepresentation of observations with large D(u i ).In locations where D(u) is large, it is more likely that drilling is terminated before reaching the basement, because of the drilling costs, than in locations with less sediment thickness.Abandoned wells are not recorded in the database, and the result is a systematic overrepresentation of wells with minor D(u i ).The working hypothesis is to use the statistical relation between D(u) and L(u) to improve the estimates of D(u) in a similar way to how wind speed is used as secondary information for better estimates of precipitation (Wolff et al., 2015).

Point observations of sediment thickness
In 1996, Norwegian authorities implemented mandatory reporting of all drillings related to groundwater in mainland Norway (Lovdata, 1996).The purpose of the legislation was to provide the society with relevant groundwater observations.The Geological Survey of Norway (NGU) manages the regulations and stores the data in the GRANADA well database.As a public service, the data are freely accessible for downloading (NGU, 2016a).According to recent statistics, about 44 % of the recorded boreholes were drilled for the purpose of energy extraction (NGU, 2016b).At the startup of this study the total number of recorded observations was 54 194 (Table 1).Of these recordings, 48 628 were boreholes, 3740 wells were in unconsolidated sediments, and 1826 were probe drillings.Explicit documentation of D was not avail-able for all GRANADA recordings.For boreholes, however, it is possible to derive D with quite high precision by using information of the casing length.A casing is necessary in locations with unconsolidated material to prevent sediments from entering the well.Because casing is a considerable cost, the casing length is usually reported.Based on the GRANADA recordings, the casing was on average drilled 2 m into the bedrock.Hence, in cases where only casing length was reported, D was set equal to the casing length minus 2 m.In the following, the GRANADA recordings are referred to as boreholes because this is the vast majority of the data material.

Land cover information
The secondary variable, L, was calculated from digital maps of unconsolidated sediments (NGU, 2016c).The total areal extensions of different sediments are listed in Table 1.The sediments are represented in terms of polygons in a geographical information system (GIS).Sediments covered by water (lakes, rivers, and glaciers) are not included in Table 1.The total sum of land cover polygons is 307 104 km 2 , while the total area of mainland Norway is 323 781 km 2 (Kartverket, 2016).The difference should in principle be identical to the areal extension of lakes, rivers and glaciers.Thus, according to the land cover polygons (Table 1), water covers 5.2 % of mainland Norway.Updated figures from the Norwegian Mapping Authority, however, show that lakes (5.7 %), glaciers (0.8 %) and rivers (0.4 %) constitute together 6.9 % of mainland Norway (Kartverket, 2016).The difference (1.7 %) indicates the irreducible uncertainty for this kind of statistics.The relative uncertainty for individual categories is higher because positive and negative deviations cancel out each other.It is also important to keep in mind that the actual uncertainty, with respect to areal information, increases with decreasing size of the land category.This precaution is relevant when point information from one data source (GRANADA) is combined with areal information from another source (GIS maps).

Geological setting
Before explaining the primary screening of boreholes, a few words on the geological setting are required.The vast bulk volume of unconsolidated sediments in mainland Norway is from the last glaciation (Weichselian).More than 90 % of the glacial erosion products were deposited offshore, and exposed bedrocks or sparse covers of sediments characterize the Norwegian landscape (Olsen et al., 2013).Here, in the current study, the term "exposed bedrock" includes polygons identified as uncovered bedrock (id.130, Table 1).In addition, polygons labeled as "exposed bedrock or very thin cover of soil or organic matter" were included (id.100, 101 and 140, Table 1 thin till material covers about 20 % of the land area (id.12, Table 1), and Olsen et al. (2013) include this category when they define areas classified as exposed bedrock.In that case exposed bedrock makes up 55 % of the land area.Peatlands cover 5 % of the country (id.90, Table 1).According to Olsen et al. (2013) the average thickness of the continuous till is approximately 6 m.They did not include any further discussion on the estimation of sediment thickness based on recorded data.This issue will be elaborated further in the study presented below.

Exploratory data analysis
Figure 2 shows that both D and L deviate strongly from Gaussian (normal) probability density functions (pdfs).The same is also true for the logarithmic values (Fig. 2).The mean value of D = 5.5 m corresponds well to the value reported by Olsen et al. (2013), but 50 % of the recorded data had D <= 2 m, which implies a positively skewed pdf.
The average horizontal distance to the outcrop is L = 832 m, while 50 % of the boreholes had L ≤ 460 m.Clustering of boreholes (Fig. 2) can easily be seen on the GRANADA webpage (NGU, 2016a).This uneven spatial sampling affects the inference of statistical moments and the spatial correlation structure.
The mean and standard deviation of D and L as a function of separation distance h are given in Fig. 3 for h = 20 m and h = 150 m.It should be noted that the highest values of the mean and standard deviation of D occur at small (h < 100 m) separation distances.This is opposite to what is shown for mean and standard deviations of L, which are small for minor separation distances, and which increase to maximum values around h = 2.5 km, and then decay towards h = 10 km.From Fig. 3 it is clear that when the separation distance h to the nearest borehole increases, the number of low values of D increases.This feature might be caused by preferential sampling, which implies that there is a systematic overrepresentation of drillings that has minor D values.Thus, Fig. 3 indicates a bias in the observations of D.

Method
For the current study, multi-Gaussian methods were applied to estimate sediment thickness D(u), where u ∈ and is the geographical domain covered by the database (in this case mainland Norway).Multi-Gaussian methods are well documented in the literature (Isaaks and Srivastava, 1989;Journel and Huijbregts, 1989;Deutsch and Journel, 1998), but to make it easier for interested readers to reproduce and improve the results, the most important equations and algorithm are presented in the following.As mentioned above, the main purpose of the study was to evaluate whether the secondary information L can be used to improve the estimates of the primary variable D or not.This question was addressed by performing a conventional cross-validation of the GRANADA boreholes by successively leaving out information on D (but not L), and estimating D at the locations where observations of D were left out.First, the cross-validation was performed by including the primary variable D only.Then, secondly, the cross-validation was done by including the secondary variable.
More formally expressed, two cumulative density functions (cdfs) were compared to each other for all borehole locations u j where j = 1, . .., N and N is the number of GRANADA boreholes (cf. the section above).If the function of interest is Gaussian Z ∈ N (0, 1), then the complete cdf is described by the first two moments.Thus, the task was to compare estimates based on D alone, with estimates based on D and L: and Here, j = 1, . .., N , and i = 1, . .., j − 1, i = j, j + 1, . .., N , where N is the number of observations (cf.Sect 2.4).For this case study Eq. ( 1) was obtained by ordinary kriging (OK) and Eq. ( 2) by co-kriging (CK).Before solving Eqs. ( 1) and (2), the experimental data need preprocessing to suppress effects of preferential sampling, and since Gaussian estimation methods were applied, the data need to be transformed to a standard normal pdf.

Declustering
The purpose of declustering is to compensate for uneven sampling.This was done by giving less weight to observations in areas of high sampling density and a relative increase in weights in areas of sparse sampling.For this case study, the weights were found by gridding of the sampling domain and counting the number of observations in each grid cell.The weights were set equal to the inverse of the number of boreholes in the corresponding grid cell.These weights, however, are grid dependent.Hence, the following procedure was implemented to minimize the grid dependency.
1. Decide the size for the grid elements u = ( x, y) that constitute a uniform grid.
2. Choose an (arbitrary) origin u 0 and make a regular mesh that covers the estimation area .The mesh consists of u k elements, where k = 1, . .., M, and M is the number of grid elements.
3. Count the number of boreholes n k (u 0 ), and calculate the declustering weights c k (u 0 ), for each well in u k : where M is the number of grid elements in the mesh.
4. Because n k (u 0 ) in Eq. ( 3) depends on the grid origin u 0 , it is necessary to repeat steps (2) to (3) and change the grid origin to where r = 1, . .., p and the lag δ u.The number of iterations p should be large enough to get a stable average.Deutsch and Journel (1998) recommend p ≥ 6.Here, in the current case study, p = 7 and δ = 100 m.

Finally,
where c i denotes the declustering weight for the individual boreholes in the database, i = 1, . .., N , where N is the number of boreholes.
The declustering coefficients c in Eq. ( 5) imply that the total variance of the experimental data is reduced and the correlation length is increased.This effect is called regularization in geostatistical terminology.It means that the declustering coefficients also depend on the grid size u.Thus, the final step is to repeat (1) to (5) above, but with a different grid size.The grid size that minimizes the regularization effect should be employed.

Normal score transform
Application of Gaussian interpolation methods implies that the estimated function Z belongs to a standard normal pdf N(0, 1).In this case, the stochastic function X = (D, L) is not Gaussian ( ∈ N (0, 1)), which means that a transformation is necessary.The normal score transform implies that the quantiles p k in the original cdf, F (X), correspond to the quantiles in a standard normal Gaussian cdf, G(Z), where Z ∈ N (0, 1) (Goovaerts et al., 2005): where p * k is the quantiles in the standard normal cdf, and ϕ denotes the transformation of X corresponding to the inverse Gaussian G −1 cdf of D or L. The transformation in Eq. ( 6) was done by linear interpolation (or extrapolation) from the table of regular sampled Z ∈ N (0, 1) based on the ranked values (percentiles) of X(u i ) ∈ N (0, 1).
The normal score transform requires a monotonic function to be unique.This is a problem if the data are censored (Huang and Wellner, 1997;Deutsch and Journel, 1998;Goovaerts et al., 2005;Saito and Goovaerts, 2000), which means that the true value is only observed within intervals.This is the case for the lower values in the current experimental data (D = [0.1;0.5; 1] m), which indicate that the true depth is only roughly recorded.For the current study, the normal score transform was done on declustered data which "corrected" the observations and thus removed overrepresentation of some observations; thus, the transformation to N (0, 1) was unique.

Experimental semivariogram and cross-semivariogram
The spatial structure of the data Z was described by the experimental semivariogram function: where N (h) is the number of data pairs in the separation interval h, and where Z X is the normal score transform (Eq.6) of either D or L.
In addition to the experimental semivariogram, the mean m(h) and the variance s 2 (h) were calculated as a function of h, and where N (h) is the number of observations for the separation interval h.
The experimental cross-semivariogram was estimated by expressing the two functions Z D (h) and Z L (h) as a sum of each other: This is possible because D and L were sampled in the same locations, and after the normal score transform (Eq.6) we know by definition that E[Z D ] = 0 and E[Z L ] = 0.In that case, the cross-semivariogram can be found by (Myers, 1982) which is valid if Z D (h) and Z L (h) are stationary functions in space with finite variance.These properties are difficult to prove in practice, but Myers (1982) suggests that if then Eq. ( 11) is valid.

Semivariogram and cross-semivariogram maps
Anisotropy structures in the experimental data may be discovered by calculation of semivariogram and crosssemivariogram maps.The same equations (Eqs.7 and 11) are applied, but instead of the separation vector h the intrinsic values are calculated as a function of the north-south and east-west components (h x , h y ) of the separation vector: where Z 1 and Z 2 denote stochastic functions.If Z 1 = Z 2 (i.e., the normal score transform of D or L), then Eq. ( 13) is the semivariogram map for Z D or Z L .If Z 1 = Z D and Z 2 = Z L , then Eq. ( 13) is equivalent to the cross-semivariogram map between Z D and Z L .The semivariogram (or crosssemivariogram) maps are similar to the experimental semivariogram function, but the semivariance is visualized in N.-O.Kitterød: Estimating sediment thickness terms of a separation matrix instead of a separation vector.By calculating the semivariance in terms of a separation matrix, it is possible to reveal large-scale (systematic) directional dependencies -called anisotropy.If anisotropy in the observation material is evident, the next step is to calculate directionally dependent experimental semivariograms, where the direction of the searching sector is taken from the semivariogram map.The directionally dependent properties can be taken into account in the estimation procedure by using the directionally dependent searching directions derived from the semivariogram maps.An alternative is to transform the observation coordinates to an isotropic and orthogonal coordinate system (Langsholt et al., 1998).

Semivariogram -and covariance model
The semivariogram model, which was fit to the experimental semivariogram had the following form: where C 0 , C 1 , a, and α were the fitting parameters.In geostatistical terms C 0 is called the nugget (the variance at h → 0), C 0 +C 1 is the sill (the variance at h → ∞), a is the range, and α is the exponential coefficient (1 ≤ α ≤ 2).The constant β determines the variance at h = a.In this case β = − ln(20), which is equivalent to 95 % of γ (∞).For that reason β is called the practical range in the literature.The model parameters in Eq. ( 14) were obtained by minimizing the objective function ϒ: where K is the number of distance classes in the semivariogram.For the case study, the objective function was minimized by using the simulated annealing algorithm (Matlab, 2015).
The kriging equations below are expressed in terms of the covariance function: where the constant β = − ln(20), and the parameters C 0 , C 1 , a, and α were found by minimizing Eq. ( 15).

Kriging and co-kriging equations
For this project the kriging and co-kriging equations were implemented in Matlab ( 2015), which makes it convenient to express the equations in terms of matrix notation.A thorough mathematical derivation of the equations can be found in Myers (1982).In matrix notation the estimation is expressed as where Ẑ is the estimated variable in location u.If k = 1, . .., m variables are involved, then Ẑ is a row vector with m entries (1 × m matrix), Z obs contains the observations in a 1 × m matrix, and is an m × m matrix where the column vectors are the estimation weights.In this case, m = 2, Eq. ( 17) is written as For the present case study, the observations Z D (obs) and Z L (obs) were available in the same locations u i , i = 1, . .., n.
The weights are found by solving the kriging equations (Myers, 1991): where C −1 denotes the inverse of the matrix C, which in this case reads as where C kk (h) is the covariance model (Eq.16) and where k = D, L. I DD = I LL are row vectors of ones (1 × n), and I DL = I LD are row vectors of either ones or zeros depending on whether all weights should sum up to one or not (I T is the transposed of I ).
The matrix C0 denotes the covariance between the point of estimation (u) and the observations: where C0 kk (h) is given in Eq. ( 16) and where k = D, L. The symbol 0 * indicates that the entry might be one or zero, depending on the Lagrange condition that all weights should sum up to one or only the weights for the single variable estimation problem.Again, zero is the default value.The estimation weights kk (h) and the Lagrange multipliers µ kk (h) (k = D, L) are contained in the X matrix: The estimation variance σ 2 K can then be written (Myers, 1982) as where the total variance is Var The quality of the estimation method depends on the absolute error A E , which is the difference between the observed value and the estimated value: Average values for all estimates are given by the mean absolute error M AE , and the standard deviation of the absolute error S AE : where n is the number of cross-validated observations.In addition, it is necessary to quantify the precision of the estimates.Two concerns are taken into account in this study: first, if the estimate is within a given confidence interval (P R ): where ω depends on the level of confidence.The accuracy A C of the estimates is then given by and the accuracy is given as a fraction of the total number of observations F A C : where n is the number of cross-validated observations.If two methods have the same level of accuracy, then the method that gives the best precision should be preferred.Precision can be taken into account by scaling the absolute error by the estimation uncertainty: and the scaled precision S P is written as and with the mean scaled precision M SP , expressed as and the standard deviation of the mean scaled precision S SP : where n is the number of cross-validated observations.

Declustering and normal score transform
The GRANADA boreholes used in the current study were clustered in urban areas (Fig. 2).To minimize the impact of this uneven spatial sampling, declustering weights were calculated according to the procedure described in Sect 3.1.The window sizes (w = wx = wy) applied to calculate the declustering weights were w = [500; 1000; 2000; 4000] m.Average declustering coefficients were calculated by moving the grid in seven steps p = 7, with an offset δ = 100 m (Eq.4).
The skewness given by the ratio of the median to the mean for the different declustering windows w shows that maximum skewness appears for w = 500 m (Table 2).For w = 1000 m, however, the skewness was more similar to the original (raw) observations; thus, for the cross-validation analysis the declustering weights were calculated with w = 1000 m.The declustering coefficients show that about 13 % of the boreholes had 10 or more boreholes located within a neighborhood of 5 km.More than 50 % of the boreholes had two or more boreholes within a search radius of 5 km, and about 23 % had no other wells within a 5 km neighborhood.The normal score transform (Eq.6) yields by definition a normal pdf of the variables involved.The transform relies, however, on the experimental data, which means that sampling of extreme values has an impact on the results.The dataset used for calculations (N = 19 682 samples) had a minimum observed D = 0.05 m and a maximum D = 229 m (Fig. 2).Some of the extreme high values may represent outliers or recording errors; thus, for the cross-validation, study boreholes with recorded sediment thicknesses of more than 100 m were not included in the calculations.The scatter plot of the raw observations shows the censored character of the data with a high frequency of recordings at even numbers (0.10; 0.20; 0.30 m; etc.).This is very clear from 0.1 to 1 m, and to some degree from 1 to 10 m (Fig. 4a).After declustering, the censored character was less obvious (Fig. 4b).The semivariogram analyses and the kriging procedures were employed on the normal score data (Z D and Z L ).After kriging, the estimation results were transformed back by the inverse normal score transform Eq. ( 6) and divided by the declustering coefficients.

Semivariogram maps
Semivariogram maps (Eq.13) of depth to bedrock Z D and horizontal distance to outcrop Z L were calculated to detect large-scale anisotropy in the data material.Anisotropy might be identified in Fig. 5 for the range (correlation length) of Z D .The range varies apparently as a function of direction with the slowest decay in the north-westerly direction (N35W-N45W) and with a somewhat faster decay in the south-easterly direction.had, however, a similar structure, which indicates that the apparent anisotropy might be an artifact due to the clustering of the observations.This presumption was tested by calculating artificial semivariogram maps based on the same borehole locations but where the observations were substituted by a random number.The artificial semivariogram maps revealed similar structures that can be seen in Fig. 5. Hence, the presumption of an artifact due to clustering cannot be ruled out.
For this reason no directional experimental semivariograms were calculated as part of this case study.

Experimental semivariograms and cross-semivariograms
The results of the semivariogram analysis confirm the existence of a correlation structure in the data (Fig. 6) that might be capitalized when estimating D(u).The model parameters given in Fig. 6 were obtained by minimizing the objective function (Eq.15) by the simulated annealing algorithm (Matlab, 2015).First, all parameters [C 0 , C 1 , a, α] were optimized; and then, secondly, C 0 was fixed and the remaining parameters [C 1 ; a; α] were simulated.This automatic procedure gave the model parameters shown in Fig. 6.The minimum of the objective function is not well defined everywhere and different combinations of model parameters gave almost similar results.The model parameters in Table 3 were evalu- ated in the cross-validation procedure below.The automatic calibration procedure gave an optimal correlation length of about a = 10 km for depth to bedrock Z D (h) (Fig. 6a).The most prominent feature, however, is the large nugget value C 0 , which in this case is about 50 % of the total variance: The experimental semivariogram for the horizontal distance to outcrop Z L (h) had a minor nugget value compared to the total variance (Fig. 6b).At the same time the correlation length (a = 5.9 km) was somewhat shorter compared to Z D (h).The experimental cross-semivariogram between Z D (h) and Z L (h) was calculated according to Eqs. ( 10) and ( 11).The nugget value was about 10 % of the total variance in this case, with a correlation length of a = 2.7 km. Finally, the cross-semivariogram was tested according to Eq. ( 12), but none of the parameter combinations in Table 3 violated the criterion.

Cross-validation
The purpose of the cross-validation was to evaluate the impact of using horizontal distance to outcrop as an additional variable for estimation of sediment thickness above the bedrock.In this case, the cross-validation was performed by leaving one observation out.At the point where the observed value was left out, ordinary kriging (OK) and co-kriging (CK) were performed by using the global model parameters given in Table 3.The differences between the estimation results and the observations left out were used to quantify the quality of the estimation procedure.Three criteria were used to distinguish the two estimation procedures: the mean absolute error (Eqs.24 and 25); the accuracy of the estimation results (Eqs.28 and 29); and the precision of the estimation results (Eqs.31 and 32).
In general, both OK and CK overestimate minor depths to bedrock and underestimate large depths (Fig. 7).The most important estimation criterion is usually considered to be the mean absolute error (Eq.25).With the model parameters tested in Table 3 there are only minor differences in the mean absolute error (Eq.25) between the OK and CK estimates (Table 4).The CK estimates have slightly lower mean absolute errors than the OK estimates unless the nugget value (C 0 ) for the cross-covariance between D and L approaches half of the total variance: C 0 + C 1 (Table 4).
In cases with a minor difference in the mean absolute error, the estimation results might be ranked according to criteria for estimation accuracy (Eq.28) and precision (Eq.31).For the present case study, the definitions of accuracy and precision were both related to the estimation variance (Eq.23), and in this respect, CK was superior compared to OK (Fig. 8).
In Fig. 9 scaled precision (Eq.31) is sorted and given as a function of cumulative accuracy S A C : where n max is the number of estimates where A C = 1.As long as the absolute estimation errors (Eq.24) are similar, OK yields a higher accuracy than CK because CK has a lower estimation variance.This result follows directly from the definitions in Eqs. ( 27) and ( 28).With ω = 1 in Eq. ( 27), the OK estimates gave an accuracy from 60 to 65 %, while CK had an accuracy of 50-60 %.At the same time CK yields an overall higher precision than OK because of lower estimation variances (Fig. 9).
A final result that deserves some attention is the location of estimates that did or did not fulfill the accuracy criteria.This is illustrated for mainland Norway and the Oslo area in Fig. 10.Three categories were visualized: (i) locations with low accuracy (A C = 0, Eq. 28); (ii) locations with good accuracy (A C = 1, Eq. 28) obtained either by OK or CK; and (iii) locations with good accuracy (A C = 1, Eq. 28) obtained only by the CK method.For all cases ω = 1 (Eq.27).

Discussion
Attention has been directed towards sediment thickness, D, in this article.The question has been raised whether information derived from public well databases on D(u i ) can be utilized for continuous estimation of D(u).A motivation for this attention has been the potential application of spatial estimates of D(u) in hydrology and geo-engineering.Combined with available information on soil properties or digital terrain elevation, storage capacity of water or bedrock topography might be estimated within predefined uncertainties and with feasible resources.It should be emphasized, however, that the purpose of the application should be taken into account when choosing the estimation method.In this case study, the normal score transforms and Gaussian estimation methods were applied, but none of these methods provide robust estimates of extreme values.If, for example, maximum D(u) is an important issue, stochastic simulation or non-Gaussian methods should be taken into account.Such topics, however, are left for further studies.

Clustering and bias
For the current case study, D(u) was derived from the GRANADA open-access database (NGU, 2016a).Public databases are prone to preferential sampling.In this context, preferential sampling implies two specific challenges that need to be discussed, namely clustering and bias.Clustering is due to the fact that wells and boreholes are located where people need them; thus, the spatial frequency of boreholes mirrors the population density (Fig. 2).Clustering of observations has an impact on statistical inference regarding statistical moments and semivariograms.Different approaches have been suggested to control the clustering effects.Olea (2007) suggested removal of wells randomly in areas with high density of observations, and then recalculation of the experimental semivariograms based on the remaining observations.The experimental semivariograms, however, turned out to be sensitive to the size of the searching window where clustered observations were removed.Thus, this method was Hydrol.Earth Syst.Sci., 21, 4195-4211, 2017 www.hydrol-earth-syst-sci.net/21/4195/2017/  3 and 4), with ordinary kriging OK and co-kriging CK results.The scatter plot to the left (a) shows that CK estimation variances are lower than OK estimation variances.The black line indicates a 1 : 1 relation.The histograms to the right (b) show the estimation variance for OK and CK.
disregarded in the current case study because the algorithm did not yield robust results.Omre (1984) suggested controlling clustering effects in the semivariogram by calculating weights that were inversely proportional to the Thiessen polygons for each observation.This method provides a set of weights that are mathematically sound, but it is relatively expensive with respect to computer resources, especially if the number of observations is large.Instead of Thiessen polygons a less computer demanding algorithm was employed, namely the moving grid method (Deutsch and Journel, 1998).By this method the declustering weights were inversely proportional to the average number of observations within the moving window (cf.Sect.3.1).The declustering weights depend on the size of the window (Table 2).In general it is recommended to use the window size w that maximizes the skewness of the pdf(s), which in this case was w = 500 m.However, w = 1000 m gave a skewness for D(u) that was closer to the original data; thus, the semivariograms were based on a declustering window w = 1000 m.The mean value from raw (not declustered) data was 5.5 m, but the declustered mean was reduced to 3.4 and 2.8 m for w = 500 and w = 1000 m, respectively (Table 2).
The problem of biased recordings of D(u) in the database is more difficult to assess.There are good reasons to expect that bias exists and that minor sediment thicknesses are overrepresented in the database.One indication is that mean and standard deviation are highest at minor separation distances,  3. Ordinary kriging (OK) yields the highest accuracy for most cases, but co-kriging CK gave the overall highest precision.
5   3 and 4).For this case, 37 % of the locations did not fulfill the accuracy criteria (Eq.28) indicated by the red dots (A C = 0); 63 % of the locations did fulfill the accuracy criteria by either the OK or CK method, indicated by the blue dots (A C = 1 from OK or CK)); for 3.5 % of the locations the accuracy criteria were fulfilled by the CK method and not the OK method, indicated by the green dots (A C = 1 for CK, not OK)).For 9.2 % of the locations the accuracy criteria were met by the OK method only (not shown).Geographical coordinates are given for UTM zone 33.
which indicates that willingness to continue drilling is less if D(u) is large and if there are no other wells in the close neighborhood (Fig. 3).Biased observations are a common problem for datasets sampled in open large-scale environments.The impact of bias may be controlled if there exists independent information on processes related to the variable of interest.Goovaerts et al. (2005) did a case study based on biased observations of arsenic concentration in groundwater.They used geological maps and utilized knowledge of arsenic concentration in specific geological units to control the bias.Wolff et al. (2015) reported biased recordings of precipitation from a meteorological gauge station.In this case the bias was due to turbulence in the wind field around the gauge equipment.They recorded wind speed and temperature together with precipitation and other meteorological variables, and derived func-tions for bias correction by application of Bayesian statistics.A similar token was applied in the current study.Here, horizontal distance to outcrop L(u) was evaluated as secondary information to control the impact of biased observations of sediment thickness D(u).
The cross-validation exercise presented here cannot verify a general relation between D(u) and L(u), but the results show that the estimation uncertainty was reduced by using L(u) as a secondary function.Non-biased relations between D(u) and L(u) ought to be investigated by further research, for example by utilizing datasets from geotechnical probe drillings.Results from such studies would increase the value of the GRANADA database and other similar databases.

Cross-validation
The cross-validation analysis indicates low estimation accuracy in urban areas.One reason for this result might be anthropogenic reallocation of unconsolidated matter, which includes removal of sediments in some places and deposition of unconsolidated matter in others.Similar problems might also be valid for identification of horizontal distance to outcrop.For further studies such locations might be disregarded or given less weights.One option is to allocate a quality tag to the D(u) recordings in the same manner as was done for recordings of geographical coordinates.
Both OK and CK overestimated small D(u) and underestimated large D(u) (Fig. 7).Such results are typical for Gaussian estimation methods applied on observations with positively skewed pdfs.Other case studies report similar results (Goovaerts et al., 2005), but it should be noticed that the double logarithmic scale exaggerates the deviations especially for minor depths.
The observations of D(u) had a high fraction of smallscale noise (C 0 in Fig. 6) relative to the total variance: C 0 + C 1 (Fig. 6).Efforts should be taken to control C 0 .One abatement measure might be achieved by attaching a quality assurance tag to D(u).In this way low-quality recordings could receive less weights or be filtered out.These kinds of measures would increase the quality of the GRANADA database.
Despite these uncertainties the cross-validation shows that the accuracy is higher than 60 % for the model parameters with highest scores (Table 4).For this case study, the estimation accuracy was set equal to one if the absolute estimation error was less than one standard deviation of the estimation uncertainty and zero for all others (Eqs. 27 and 28).By this definition, the accuracy increases by increasing estimation variance, which means that accuracy should be evaluated together with the estimation variance (Eq.23 and Fig. 8).For stochastic simulation the precision of the estimates is of primary interest.In such cases, the probability of extreme realizations may also be quantified.For such applications, the precision is more important than the accuracy of the estimation method.The cross-validation results show that the pre-cision in general is higher if the horizontal distance to the outcrop L(u) was included (Fig. 9).Because precision increases as a function of decreasing estimation variance, the cross-validations show that L(u) should be included despite the uncertainties in the experimental data.

Further studies
These results indicate that more advanced estimation procedures should be considered.In this case study, the total estimation domain (mainland Norway) was considered as homogeneous with respect to variance and correlation length.Methods that take local model parameters and local anisotropy into account may reduce the absolute estimation error but not necessarily the estimation variance.The same is true with respect to estimation methods that are more robust with respect to estimation of extreme realizations.For estimation of most likely minimum and maximum thickness of sediments within a given estimation area, stochastic simulations are recommended.
After initiation of this case study, the number of recorded boreholes, wells, and probe drillings in the GRANADA database has increased (NGU, 2016b).The new recordings might be used as an independent dataset for cross-validation purposes.One interesting candidate for further work is the approach suggested by Rue et al. (2009).They approximate the estimation problem to stochastic partial differential equations.In this method non-stationarity of statistical moments are taken into account, but at the same time less computer resources are spent on matrix inversions which is a challenge for applications with a large number of observations (Lindgren et al., 2011;Ingebrigtsen et al., 2014;Hu and Steinsland, 2016).
Finally, it is appropriate to recall that the primary purpose of the GRANADA database is not the recording of sediment thickness D(u) alone, but to provide information on groundwater resources in general (Lovdata, 1996).Hence, in this context, the present study is a call to explore public data to obtain important estimates for science and society.

Summary and conclusions
The GRANADA open-access database (NGU, 2016a) was used to derive point recordings of sediment thickness above the bedrock D(u).For each D(u) horizontal distance to nearest outcrop L(u) was derived from geological maps.The purpose was to utilize L(u) as a secondary function for estimation of D(u).Two estimation methods were employed: ordinary kriging (OK) and co-kriging (CK).A cross-validation analysis was performed to evaluate the additional information in the secondary function L(u).L(u) was disregarded in OK estimation but included in CK estimation.The crossvalidation results showed that CK provided overall lower mean absolute error compared to the OK results, but the dif-ferences were minor.The estimation uncertainty determines the estimation accuracy and the precision.These quantities might be considered as equally important as the mean absolute error.With respect to the estimation precision, the CK estimates were superior to OK estimates.This result demonstrates the value of using L(u) as a secondary function for estimation of D(u).The problem of clustering of observations can be controlled by calculation of declustering weights, but the relation between D(u) and L(u) should be explored in further studies to control the effect of biased observations.
The semivariogram analysis revealed a correlation length (range) for D of approximately 10 km and about 6 km for L. The cross-semivariogram between D and L gave a corresponding length of 2.7 km (Fig. 6).The recordings of D were affected by an ample small-scale variance (nugget value).Despite this problem, the estimation accuracy was quite high (Table 4).Between 50 and 60 % of the crossvalidation recordings had an accuracy of less than 1 kriging error σ K Eq. ( 23).
Hence, continuous estimates of D(u) might be derived for mainland Norway based on the GRANADA public well database.The challenge, however, is to provide estimates within confined uncertainties.The present case study demonstrates that this goal can be approached by using information embedded in the exposed bedrock.
Data availability.The data material for this study was downloaded in 2010 from public databases managed by the Geological Survey of Norway.Data on unconsolidated sediment thickness D(u) are available from the GRANADA database: http://geo.ngu.no/kart/granada/(NGU, 2016a).Data on horizontal distance to bedrock outcrop L(u) are available at (NGU, 2016d): http://geo.ngu.no/kart/losmasse/.
Competing interests.The author declares that he has no conflict of interest.

Figure 1 .
Figure 1.Surface topography T , sediment thickness D, and bedrock topography B. Observations of D i are indicated in three boreholes (i = 1, 2, 3) and with the associated horizontal distance to the nearest outcrop (L i ).In areas where B is not exposed, B can be estimated by using observations of D as the primary variable and information of L as secondary information.

Figure 3 .
Figure 3. Mean and standard deviation as a function of separation distance h (m), sediment thickness D to the right (a), and horizontal distance to outcrop L to the left (b).

Figure 4 .
Figure 4. Scatter plot of sediment thickness D and horizontal distance to outcrop L. Original (obs) and declustered (dcl) observations to the left (a), and normal score transforms Z D and Z L to the right (b).The black line indicates a "perfect" (1 : 1) relation between Z D and Z L .

Figure 5 .
Figure 5. Semivariogram map ( γ (h x , h y )) of the normal score transformed sediment thickness Z D ; grid cells of 100 × 100 m.

Figure 6 .
Figure 6.Semivariogram and cross-semivariogram functions for normal score data: semivariograms for sediment thickness Z D (a); and horizontal distance to the nearest outcrop Z L (b).Crosssemivariongram for Z D and Z L (c).Dots indicate the experimental data and solid lines are the model functions.

Figure 7 .
Figure 7. Cross-validation results of sediment thickness D. Normal score observations Z D against estimates Z * D to the left (a), and raw observations D and estimates D * to the right (b).OK denotes results from ordinary kriging, and CK from co-kriging.

Figure 9 .
Figure 9. Scaled precision (Eq.31) plotted as a function of cumulative accuracy (Eq.34) for estimation cases [A; B; C] to the left (a), and [D; E; F ] to the right (b).Model parameters are given in Table3.Ordinary kriging (OK) yields the highest accuracy for most cases, but co-kriging CK gave the overall highest precision.

Figure 10 .
Figure 10.Outline of the Scandinavian countries (a).The boxes indicate the subsections of Norway: northern Norway (b), southern Norway (c), and the Oslo region (d).The cross-validation results of GRANADA boreholes (NGU, 2016a) for case F (Tables3 and 4).For this case, 37 % of the locations did not fulfill the accuracy criteria (Eq.28) indicated by the red dots (A C = 0); 63 % of the locations did fulfill the accuracy criteria by either the OK or CK method, indicated by the blue dots (A C = 1 from OK or CK)); for 3.5 % of the locations the accuracy criteria were fulfilled by the CK method and not the OK method, indicated by the green dots (A C = 1 for CK, not OK)).For 9.2 % of the locations the accuracy criteria were met by the OK method only (not shown).Geographical coordinates are given for UTM zone 33.

Table 1 .
Land cover statistics of mainland Norway.
a Land cover identification numbers (NGU, 2016d).b Area of land cover polygons exposed to the atmosphere, A atm = A i = 307 104 km 2 .The total area of mainland Norway is A tot = 323 781 km 2 (Kartverket, 2016).c Number of recorded boreholes, wells, and probe drillings in GRANADA 2010 (NGU, 2016d).d Fraction of land cover polygons relative to A atm .e Fraction of land cover polygons relative to A tot .Mainland Norway covered by water:1

Table 2 .
Median and mean of depth to bedrock D (m), and horizon- tal distance to outcrop L (m), for raw observations (window size = 0 m) and declustered data with window [500; 1000; 2000; 4000] m.The skewness index is skw = median/mean.

Table 3 .
Covariance and cross-covariance model parameters * Eq. (16) used for cross-validation.All models are derived from declustered normal score transformed variables of depth to bedrock D and the horizontal distance to the nearest outcrop L. Practical range β = log(0.05)for all models.
M a -lowest mean absolute error.b -highest accuracy.c -highest precision.