We present a workflow for efficient construction and calibration of large-scale groundwater models that includes the integration of airborne electromagnetic (AEM) data and hydrological data. In the first step, the AEM data are inverted to form a 3-D geophysical model. In the second step, the 3-D geophysical model is translated, using a spatially dependent petrophysical relationship, to form a 3-D hydraulic conductivity distribution. The geophysical models and the hydrological data are used to estimate spatially distributed petrophysical shape factors. The shape factors primarily work as translators between resistivity and hydraulic conductivity, but they can also compensate for structural defects in the geophysical model.

The method is demonstrated for a synthetic case study with sharp transitions among various types of deposits. Besides demonstrating the methodology, we demonstrate the importance of using geophysical regularization constraints that conform well to the depositional environment. This is done by inverting the AEM data using either smoothness (smooth) constraints or minimum gradient support (sharp) constraints, where the use of sharp constraints conforms best to the environment. The dependency on AEM data quality is also tested by inverting the geophysical model using data corrupted with four different levels of background noise. Subsequently, the geophysical models are used to construct competing groundwater models for which the shape factors are calibrated. The performance of each groundwater model is tested with respect to four types of prediction that are beyond the calibration base: a pumping well's recharge area and groundwater age, respectively, are predicted by applying the same stress as for the hydrologic model calibration; and head and stream discharge are predicted for a different stress situation.

As expected, in this case the predictive capability of a groundwater model is better when it is based on a sharp geophysical model instead of a smoothness constraint. This is true for predictions of recharge area, head change, and stream discharge, while we find no improvement for prediction of groundwater age. Furthermore, we show that the model prediction accuracy improves with AEM data quality for predictions of recharge area, head change, and stream discharge, while there appears to be no accuracy improvement for the prediction of groundwater age.

Large-scale geological and groundwater models are used extensively to support aquifer management. (Here “large-scale” refers to an area of tens to thousands of square kilometers.) Determining the distribution of hydraulic properties and the geometry and connectivity of the groundwater system is of significant importance because these features control the flow paths (Desbarats and Srivastava, 1991; Fogg et al., 1999; Weissmann and Fogg, 1999). Incorrect reconstruction of the geological structures has thus been recognized as an important source of uncertainty when a groundwater model is used to make predictions outside its calibration base (Refsgaard et al., 2012; Seifert et al., 2012; Zhou et al., 2014). The data traditionally used for structural mapping include lithological logs from boreholes, hydrological data, and hydraulic testing results, but these data are often sparse and unevenly distributed within an investigated domain. In these (very common) cases, data scarcity becomes a major obstacle for structural mapping in relation to large-scale groundwater modeling (Refsgaard et al., 2012; Zhou et al., 2014).

Ground-based and airborne electromagnetic methods have shown great potential for mapping geological structures (Jørgensen et al., 2003; Thomsen et al., 2004; Abraham et al., 2012; Oldenborger et al., 2013; He et al., 2014; Munday et al., 2015). For large-scale mapping, the airborne electromagnetic method (AEM) is efficient and cost-effective, supplementing traditional data with dense estimates of electrical resistivity which, in some environments, inform about the lithology and thereby about the structure (Robinson et al., 2008; Binley et al., 2015). AEM measurements can be made quickly over large areas, and the resolution can be as fine as 25 m in the horizontal direction and 5 m in the vertical (Schamper et al., 2014), with a penetration depth of up to several hundred meters (Siemon et al., 2009).

Various methods to incorporate resistivity estimates (hereafter referred to as resistivity models) into groundwater model construction have been reported. Manual and knowledge-driven approaches have been used to combine geological, hydrological, and geophysical data with expert knowledge (Jørgensen et al., 2013). However, the manual approach is subjective and can be very time consuming and expensive to use when resistivity models from large AEM surveys are to be incorporated into model construction. Alternatively, more objective and cost-efficient geostatistical modeling approaches (Carle and Fogg, 1996; Deutsch and Journel, 1998; Strebelle, 2002) are available for generating models from a combination of borehole information and AEM-determined resistivity models. For example: He et al. (2014) used a transition probability indicator simulation approach (Carle and Fogg, 1996), while Gunnink and Siemon (2015) used sequential indicator simulation (Deutsch, 2006). Marker et al. (2015) used a deterministic strategy for the integration of AEM resistivity models into the hydrological modeling process.

The studies just mentioned all used sequential hydrogeophysical inversion approaches (SHI, as defined by Ferré et al., 2009). In SHI the geophysical data are inverted first and independently from the later inversion of the hydrological data. For large-scale groundwater modeling, Herckenrath et al. (2013) and Christensen et al. (2016) used both SHI and joint hydrogeophysical inversion approaches (JHI; as defined by Ferré et al., 2009). In JHI, the geophysical and hydrological data are inverted jointly by linking the geophysical and hydrological models directly through some of their parameters. The linking can, for example, be done by using an Archie's law inspired petrophysical relationship (Archie, 1942) to translate between the geophysical and hydrologic parameters.

In general, petrophysical relationships are difficult to establish because such translation tends to be site-, scale- and facies-specific (Chen et al., 2001; Hyndman and Tronicke, 2005; Slater, 2007) and uncertain (Mazáč et al., 1985; Slater, 2007). The studies by Herckenrath et al. (2013) and Christensen et al. (2016) used a fixed petrophysical relationship throughout the model domain. Better results can be obtained by using a spatially variable relationship, which allows for local translation between hydraulic conductivity and electrical resistivity, and by including the spatially dependent petrophysical parameters in the optimization process (Linde et al., 2006).

There are two other challenges for incorporating resistivity models into large-scale groundwater modeling: differences in model discretization and choice of geophysical regularization methodology. Groundwater models are often discretized in a regular voxel grid, while the traditional resistivity models are 1-D and placed at the respective sounding location. For airborne surveys, for example, the resistivity models are normally located along the flight lines (Christiansen et al., 2006). Such resistivity models therefore need to be relocated to conform to the grid of the groundwater model. The relocation will often be a subtle process where information can be lost. To address this issue, Fiandaca et al. (2015) presented a geophysical modeling approach referred to as “voxel inversion”, which decouples the geophysical inversion model space from the geophysical measurement positions. This allows estimation of a 3-D geophysical model that is discretized on the same voxel grid as the groundwater model.

Traditionally, geophysical regularization includes horizontal and vertical smoothing constraints (Constable et al., 1987) or is limited to a few-layer inversion (Auken and Christiansen, 2004), whereas a groundwater system often has sharp-layer or body boundaries. It has therefore been recognized, e.g., by Day-Lewis (2005) and others, that the regularization used to stabilize the geophysical inversion may not reflect the actual hydrologic conditions unless it is chosen carefully. If, for example, smooth regularization is used to estimate resistivity models in a sharply layered system, it will produce a blurred resistivity distribution from which one should be careful with inferring the spatial distribution of hydraulic conductivity to be used in a groundwater model. In this case, it would be better to use minimum gradient support regularization (Portniaguine and Zhdanov, 1999; Blaschek et al., 2008; Vignoli et al., 2015) for the geophysical inversion because the estimated resistivity distribution will tend to consist of fewer, more sharply defined layer boundaries (vertically and horizontally). However, it is often ignored that geophysical data can be inverted using alternative regularization schemes, and to test whether the alternative geophysical models affect the predictive capability of a groundwater model.

Conceptual flowchart for the sequential hydrogeophysical inversion. First step (box 1): geophysical inversion. Second step (box 2): groundwater model calibration where shape factors of the petrophysical relationship are estimated using hydrological data. Third step (box 3): the calibrated groundwater model is used for predictive modeling.

The main objective of the present study is to present a novel sequential hydrogeophysical approach whereby a voxel-based 3-D resistivity model is used to parameterize and calibrate a groundwater model. The model parameterization methodology allows the calibration to compensate for errors in the resistivity model. Furthermore, we will demonstrate that it is important for groundwater flow simulations that the underlying resistivity model is estimated using regularization constraints that conform well to the geological environment. Finally, we analyze how groundwater model prediction accuracy depends on the quality of the geophysical data that were used to estimate the resistivity model. Section 2 of the paper presents the methodology. Section 3 describes the synthetic test case used for our demonstration purposes. Section 4 presents the results, while Sects. 5 and 6 present discussions and conclusions of the work, respectively.

Conceptually, we define a translator function that describes the petrophysical relationship between electrical resistivity and hydraulic conductivity. The petrophysical relationship can vary horizontally and vertically, thereby adapting to the local conditions in translation from the geophysical model space to the hydrological model space. Through inversion, the 3-D spatially dependent optimal parameters of the petrophysical relationship are estimated for each layer interval, thereby covering the entire 3-D model space.

Figure 1 provides a workflow for the method. First, the gathered airborne electromagnetic (AEM) data from the survey area are inverted with smooth or sharp horizontal and vertical constraints (Vignoli et al., 2015). This is done by using a recently developed voxel inversion scheme which decouples the geophysical model from the position of the acquired data (Fiandaca et al., 2015). The geophysical model space thus corresponds to the full 3-D hydrological model grid. Secondly, the geophysical voxel-based resistivity model is used as input for the sequential hydrological inversion. The geophysical model parameter (resistivity) is linked to the main investigated parameter (hydraulic conductivity) through a petrophysical relationship that has unknown shape factor values. The shape factor values are estimated through a hydrological inversion which minimizes an objective function describing the misfit between simulated groundwater model responses and corresponding observed hydrological data. Finally, the calibrated groundwater model can be used to make a set of relevant hydrologic predictions. The various steps of the methodology are explained in more detail in the following.

In the first step (Fig. 1, box 1), the AEM data undergo constrained deterministic inversion using recently developed voxel inversion approaches. This approach allows the geophysical model spaces to be spatially decoupled from the geophysical measurement positions (Fiandaca et al., 2015). In most inversion schemes, the forward and inverse formulations use the same model discretization. In the voxel formulation, the two model discretizations are decoupled. The voxel model space thus defines the geophysical properties on the nodes of a regular 3-D grid.

For calculating the forward responses, a “virtual” 1-D model is built at each sounding position. The “virtual” 1-D model is defined by a number of layers, and layer thicknesses. The geophysical properties are interpolated from the voxel model space into the layer centers of the virtual model that is subsequently used to simulate the forward response for the corresponding sounding.

The voxel inversion approach thus allows for inversion of AEM data into a geophysical model defined on a 3-D regular grid, regardless of the sounding positions. As a result, the geophysical inversion can be conducted using the same grid as that defined for a 3-D groundwater model, thereby minimizing scaling issues in the coupling of geophysical and hydrological models.

The general solution to the non-linear geophysical inversion problem can be
found in Auken et al. (2014). To stabilize the inverse problem, either of two
types of regularization methods can be applied. The first regularization
method is commonly referred to as smoothness-constrained inversion (Constable
et al., 1987). The smoothness-constrained inversion tends to reduce contrasts
and the resulting geophysical model may appear blurred. The reason for this
is found in its minimum-structure L2 norm inversion formalism (Constable et
al., 1987; Menke, 2012). Following the notation used by Vignoli et
al. (2015), this can be expressed as

The second regularization method is the minimum gradient support
(Portniaguine and Zhdanov, 1999; Blaschek et al., 2008; Vignoli et al.,
2015), which allows for large sharp vertical and horizontal model
transitions. The minimum gradient support regularization seeks to minimize
the spatial variations vertically and laterally by penalizing the vertical
and horizontal model gradients through the stabilizer expressed as (Vignoli
et al., 2015)

The geophysical voxel inversion is carried out on the logarithm of the
resistivity values (

In the second step (Fig. 1, box 2), the 3-D distribution of electrical resistivity values is linked to the hydrological parameters (i.e., hydraulic conductivity) through a spatially varying petrophysical relationship. Shape factors of this relationship are calibrated.

Linking hydraulic conductivity and electrical resistivity is not trivial
because the parameter values and the form of the petrophysical relationship
may vary dramatically between different types of environments. In addition,
there can be fundamental questions about how the effective properties
controlling electrical current flow are related to the effective properties
controlling fluid flow (Slater, 2007). The primary factors controlling this
relationship are porosity, pore water conductivity, tortuosity, grain size,
degree of saturation, amount of clay minerals, etc. (McNeill, 1980). The
simplest petrophysical relationship is the empirical relationship known as
Archie's law (Archie, 1942), which relates porosity, pore water conductivity,
and the degree of saturation to bulk electrical conductivity. However, this
type of relationship does not take the electrical surface conductance of clay
minerals into account. The Waxman and Smits model (Waxman and Smits, 1968)
combined with the dual-water model by Clavier et al. (1984) provides a basis
for establishing empirical relationships for shaly sand and sediments
containing clays (Revil and Cathles, 1999; Revil et al., 2012). For glacial
sedimentary environments, it is reported that clay has low electrical
resistivity and also low hydraulic conductivity, and sand has high electrical
resistivity and high hydraulic conductivity (Mazáč et al., 1985). For
these environments, it is common to use a power law relationship, which is
given some theoretical support by Purvance and
Andricevic (2000). The relationship is expressed as

The model parameters,

For illustrative purposes, we use a 3-D synthetic system very similar to that presented by Christensen et al. (2016). The only difference is that the active part of the groundwater system only consists of 5 layers, whereas Christensen et al. (2016) used a 20-layer model.

The groundwater system is intended to mimic a glacial landscape and covers an
area that is 7000 m (N–S) by 5000 m (E–W). The geology of the system was
generated using T-PROGS (Carle, 1999) as having a horizontal discretization
of 25 m

A map of locations of boreholes, a pumping well, pilot points, head recovery prediction, and location of a geophysical cross section.

Thirty-five boreholes are found within the domain (Fig. 2). Each borehole contains a monitoring well that screens the deepest 10 m of sand registered in the borehole. For each system realization, the hydraulic head in the 35 wells and the river discharge at the southern boundary were extracted from a forward simulation made by MODFLOW-2000. The 35 simulated hydraulic head values were contaminated by independent Gaussian error with zero mean and 0.1 m standard deviation. The river discharge was corrupted with independent Gaussian error with zero mean and a standard deviation corresponding to 10 % of the true river discharge. The 36 contaminated values constitute the hydrological data used for groundwater model calibration.

The geophysical reference system was designed so that there is perfect
correlation between hydraulic conductivity and electrical resistivity. This
implies that a relationship between hydraulic conductivity and measured
electrical resistivity is likely to exist. The true relationship is of the
same form as Eq. (3), and it uses constant shape factor values

The AEM data were simulated using AarhusInv (Auken et al., 2014) for a system
setup similar to a typical dual-moment SkyTEM-304 system (Sørensen and
Auken, 2004). The simulated survey consists of 35 E–W flight lines with
200 m spacing between the flight lines. AEM system responses were simulated
for every 25 m along the flight lines, giving a total of 6300 sounding
locations for both the transmitted high and low moments. AarhusInv is a 1-D
modeling code. To mimic the loss of resolution with layer depth we simulated
the responses using the 2-D logarithmic average resistivity of all model
cells inside the radius of the footprint at a given depth. To obtain the
geophysical data set, the simulated data were contaminated with noise
according to the noise model suggested by Auken et al. (2008):

AEM sounding data corrupted by four levels of background noise. The
value on top of each subplot corresponds to the noise level at 1 ms and to
the

The geophysical data were inverted by voxel inversion (Fiandaca et al., 2015) using AarhusInv (Auken et al., 2014). The voxel inversion was conducted in two different ways: by using L2-norm “smooth” constraints, or by using minimum gradient support “sharp” constraints (both implemented in AarhusInv; Auken et al., 2014).

To avoid the influence of numerical discretization errors, the geophysical
voxel inversion uses the same spatial discretization as the reference system
and the groundwater model. For both smooth and sharp inversions, a 40 ohm m
uniform half-space was used as the starting model and spatial regularization
was applied using the same settings throughout all inversions. Considering
the small number of layers and the shallow discretization, it was unnecessary
to apply vertical constraints for any of the inversions. By contrast, depth-
and direction-dependent horizontal constraint factors were used for both
smooth and sharp inversions. The strength given to the horizontal constraints
is based on experience, keeping in mind that the constraint factors should
not prevent data fitting, but promote model consistency. Therefore, a few
experiments were made to “manually” tune the magnitude of the constraint
factors. Different values along the flight lines and perpendicular to them,
respectively, were found to give better results. This is a result of having
higher data density along the flight lines compared to the perpendicular
direction. In these synthetic tests (similar to what is done with field data
with analogous data density) the smooth regularization constraint factors of
CF

In contrast to the conventional inversion of geophysical data, where the vertical discretization of the geophysical model is normally characterized by logarithmically increasing layer thicknesses, in this study fixed layer thicknesses were used in the geophysical models. To account for the loss of resolution with depth without increasing the layer thicknesses, the horizontal constraint factors were set to decrease linearly with depth (tighter bands for the deeper layers), resulting in constraint factors of 1.4 along the flight lines and 1.02 perpendicular to the flight lines for the sixth layer.

The same directional and depth-dependent tuning used for smooth
regularization was also applied to the sharp inversion. In this case
constraint factors of CF

In the following, the groundwater model will be parameterized in two
different ways. Both approaches treat the shape factors between hydraulic
conductivity and resistivity,

The first type of parameterization uses a resistivity model estimated by
smooth voxel inversion of AEM data collected with a background noise level
of 3 nV m

The second type of parameterization uses a resistivity model estimated by
sharp voxel inversion of AEM data collected with a background noise level of
either 1, 3, 5, or 10 nV m

The parameter values are estimated by fitting the available hydrological data
consisting of the 35 observations of the hydraulic head and one river
discharge observation. Calibration is done by minimization of the total
objective function given by Eq. (4), where the measurement objective function
is computed as

Calibration was performed using BeoPEST, a version of PEST (Doherty, 2010) that allows the inversion to run in parallel using multiple cores and computers.

It should be noted that for calibration and model prediction we applied the recharge field and boundary conditions of the reference system.

In step 3 (Fig. 1, box), the calibrated groundwater model is used to make predictions.

In the following synthetic demonstration study, the calibrated SHI-smooth and SHI-sharp groundwater models are evaluated by comparing their simulated model predictions with corresponding predictions simulated for the (synthetic and, therefore, known) reference system. The former are called “model predictions”; the latter are called “reference predictions”.

Prediction types 1 and 2 relate to steady-state flow when groundwater is pumped from the well. This is also the condition for which the hydrologic data used for calibration were sampled. Type 1 is the average age of the groundwater pumped from the well. Type 2 is the size of the recharge area of the pumping well. Both of these predictions differ in type from the calibration data. For these model predictions, we used a homogeneous porosity of 0.2 (the average value of the reference system porosity fields is 0.184).

Prediction types 3 and 4 relate to a new stress situation long after pumping from the well has ceased: type 3 is groundwater discharge into the stream, and type 4 is head recovery for a well screening a layer northeast of the pumping well (the location is shown in Fig. 2).

Reference and model prediction types 3 and 4 were simulated by MODFLOW-2000 (Harbaugh et al., 2000), while types 1 and 2 were simulated by forward particle tracking using MODPATH version 5 (Pollock, 1994) and MODFLOW-2000 results.

The first two types of prediction are interesting from the perspectives of protection and resource management of a well field, while the latter two are relevant in the case of possible change in management practice resulting in a new stress.

As said at the beginning of Sect. 2, steps 1–3 of the framework can be repeated for a number of system realizations to provide consistent statistical interference regarding the model prediction results. Here, 20 different reference system realizations were used. For each prediction, we therefore have 20 corresponding sets of reference predictions and model predictions that can be used to evaluate the performance of a calibrated model with respect to that prediction. The performance is evaluated for the SHI-smooth and SHI-sharp models, respectively, and it is done in the following ways.

Prediction error characteristics are quantified by the mean absolute error
(MAE), the mean error (ME) following

The figure shows an east–west cross section of resistivity for the reference system (realization number 20), and inversion results for smooth and sharp inversion, respectively. The last row shows a histogram of resistivity for each layer. The black curve is the resistivity distribution for the reference system, the red curve shows the resistivity distribution for the smooth inversion, and finally the green curve shows the resistivity distribution for the sharp inversion.

Scatter plot of true and estimated electrical resistivity fields for smooth geophysical inversion and sharp geophysical inversion for different data qualities of the AEM data for model realization number 20. On top of each window a Pearson correlation coefficient (PCC) is calculated.

Figure 4 shows a representative cross section for 1 of the 20 system
realizations. Both geophysical models in Fig. 4 were inverted using data
perturbed with a background noise level of 3 nV m

Figure 4 shows that SHI-sharp-3 resolves the sand body in layer 5 much better than SHI-smooth-3. Moreover, the locations and boundaries of the geological deposits tend to be less smeared out when using the sharp constraints. Inspection of the histograms at the bottom of Fig. 4 shows that the SHI-sharp-3 model tends to produce resistivity distributions that are more similar to the reference distributions than the SHI-smooth-3 model. This improvement could allow for easier translation from electrical resistivity into hydraulic conductivity and correspondingly more faithful representation of hydrogeologic structure and connectivity.

Figure 5 shows voxel-by-voxel density plots of reference versus estimated electrical resistivity for a SHI-smooth model and corresponding SHI-sharp models. Pearson's correlation coefficient (PCC; Cooley and Naff, 1990) is shown on top of the density plot for each layer. A comparison of the density plots and the PCC values of the SHI-smooth-3 and SHI-sharp-3 models shows that using sharp instead of smooth constraints improves the inverted geophysical model. The improvement is seen most clearly for the sand deposits.

For both the SHI-smooth and SHI-sharp models there is a strong correlation between the electrical resistivity estimates and the true electrical resistivities of the first layer, but the SHI-smooth model has weaker correlation than the SHI-sharp models. For both types of models, the correlation weakens with depth and background noise. The former is caused by the resolution limitations of AEM data. However, the depth and resistivity of the low-resistivity clay at the base of the model are well resolved by both the SHI-smooth and SHI-sharp model inversions (results not shown).

The calibration results for the 20 different system realizations are shown in
Fig. 6. The figure shows that the measurement objective function value,

Measurement objective function value obtained for the various
groundwater model calibration cases, while

Figure 7 shows a cross section of the estimated

East–west cross section for model realization number
20.

From Fig. 7a and b it is seen that the estimated

For the SHI-smooth-3 model,

For the SHI-sharp-3 model, the estimated

Scatter plot of true and estimated hydraulic conductivity fields for smooth geophysical inversion and sharp geophysical inversion for different data qualities of the AEM data for model realization number 20. On top of each window the Pearson correlation coefficient (PCC) is calculated.

Figure 8 shows voxel-by-voxel density plots of reference versus estimated
hydraulic conductivity for the SHI-smooth and SHI-sharp models. The results
confirm that the

Model structural accuracy comparison for the groundwater model using both smooth or sharp geophysical models and different background noise levels. The results are averaged over the 20 system realizations. A value of 1.0 means that the model's hydraulic conductivity field is in good agreement with the reference field; a value of 0.0 means no agreement (see the body text for an exact definition of “structural accuracy”).

Model structural accuracy is quantified in Table 1 for both the SHI-smooth
and SHI-sharp models. Structural accuracy is calculated here as the fraction
of the total number of voxels for which the estimated log

Scatter plots of calibrated model prediction versus the reference model prediction using results from the 20 system realizations. The plots in the first and second columns are the average groundwater age and recharge area, respectively, of the pumping well. Column three is for head recovery when pumping has stopped in the observation well shown in Fig. 10, and column four is for groundwater discharge to the river after pumping has ceased; ME and MAE are used to quantify the prediction error on the basis of the 20 realizations.

For each of the 20 system realizations, the calibrated groundwater models were used to make the model predictions described in Sect. 3.5. Figure 9 shows scatter plots of the reference prediction versus the calibrated model prediction; each plotted point corresponds to a particular system realization and the corresponding SHI-smooth-3 or SHI-sharp-3 model. The mean error (ME) and mean absolute error (MAE) of the prediction are also given in Fig. 9. Figure 10 shows a MAE contour map for head recovery predictions.

MAE contour map for head recovery prediction.

The first column of Fig. 9 shows results for prediction of the average age of
the groundwater pumped from the pumping well. The scatter plot illustrates
that SHI-sharp models tend to overpredict average age. This is seen by the
majority of points plotting above the identity line as well as by the value
of ME

The second column of Fig. 9 reports results related to prediction of the recharge area of the pumping well. The scatter plot shows that the SHI-smooth models underpredict the recharge area. This happens because the smooth models lead to estimation of hydraulic conductivities in the deepest layers that are too low. This creates a deep cone of depression around the pumping well that extends upward locally to reach the river bed. This induces a local discharge of water from the stream through the groundwater system to the pumping well. These models thus predict that a significant proportion of the pumping will come from local discharge from the river. (This is compensated for by increased model predicted groundwater discharge to other parts of the river.) For the true, reference system used to generate the data, the river does not lose water, and all water pumped from the well originates from groundwater recharge.

The SHI-sharp models are better predictors of the recharge area, but these models also tend to predict an area that is too small. These models also predict local discharge from the river to the groundwater system, but to a lesser degree than the SHI-smooth models. This is likely because the main features of the reference system are better reconstructed by the SHI-sharp-3 models.

The prediction of head recovery at an observation well (location shown in Fig. 10) is done poorly by SHI-smooth-3 (Fig. 9). The predicted head recovery is very small for most of these models because they tend to have too little hydraulic connectivity between the deepest layers, the estimated hydraulic conductivities are too low in the deep sand layers, and the simulated cone of depression is therefore too deep and too local.

The SHI-sharp-3 models make less biased, fairly reasonable predictions of the head recovery (Fig. 9) because they resolve the variations of hydraulic conductivity at depth better than the SHI-smooth-3 models. The superiority of SHI-sharp-3 models for recovery prediction is also seen from the MAE contour maps in Fig. 10. The MAE is seen to be spatially dependent: it is largest at the pumping well, and smallest at the constant head boundary to the south

The fourth column of Fig. 9 shows that both types of models are good predictors of discharge to the river after cessation of pumping. However, the SHI-sharp-3 model prediction is superior (its points plot closer to the identity line). For SHI-smooth-3, the prediction tends to be positively biased and more spread than for SHI-sharp-3.

In Fig. 11 MAE is used as a metric to evaluate how the prediction performance of SHI-sharp models depends on the level of background noise for the geophysical data. The noise levels were kept unchanged for the hydrological data.

Figure 11 shows that the average age prediction made by SHI-sharp models is nearly unaffected by the quality of the geophysical data. It is speculative, but this result may be because this prediction is highly dependent on small-scale variability in hydraulic conductivity and porosity that cannot be resolved from any of the geophysical data sets. That is, even the highest quality geophysical data are not highly informative, so reducing the data quality further has little effect.

Prediction error as a function of the background noise on the
geophysical data. The black dots are the SHI-smooth models using a background
noise level of 3 nV m

It is different for the recharge area prediction (Fig. 11): MAE increases for
this by approximately 25 % when the level of background noise is
increased from 1 to 10 nV m

The third and fourth rows of Fig. 11 show the
head recovery and river discharge prediction after cessation of the pumping
well. Head recovery and discharge predictions also tend to depend on the
quality of the geophysical data. The MAE increases by 17 % for
recovery prediction and 23 % for discharge prediction when the noise
level of the geophysical data increases from 1 to 10 nV m

Parameterizing the groundwater model by assuming a spatially dependent petrophysical relationship between resistivity and hydraulic conductivity makes it possible to use a resistivity voxel model for construction and calibration of a groundwater model. Assuming that the relationship is spatially dependent can account for two challenges: (i) there may be actual changes in the petrophysical relationship within an investigated domain, and (ii) there may be resolution limitations in the estimated resistivity model.

Challenge (i) is likely to be the rule for many environments, especially sedimentary environments, where the formation resistivity is primarily controlled by the pore water resistivity and the clay content. In the case of spatial changes in pore water resistivity and/or the content of various clay mineral content, the discrimination between clay and sands may be less clear in the estimated resistivity values. For large-scale groundwater systems, the variation of pore water resistivity (e.g., saline pore water) is expected to vary smoothly, which would be accounted for by the spatially varying petrophysical relationship. However, the procedure only works as applied here if the underlying assumption that clay-rich deposits have lower electrical resistivity than sand deposits is valid.

Challenge (ii) concerns the geophysical model resolution of the true formation resistivity. EM methods are, by nature, more sensitive to deposits of low electrical resistivity than to deposits of high resistivity, and their vertical and horizontal resolutions decrease with depth. This challenge affects the resistivity models estimated in the present synthetic study. Spatially dependent shape factors can take a compensatory role for the resolution issues of the estimated geophysical voxel model. The calibrated shape factors may thus no longer have firm physical meaning because they mainly act as correction parameters for absorbing structural errors of the geophysical model. This is acceptable as long as the resulting hydraulic conductivity values are reasonable. The idea of calibrating the shape factors is related to the concept of compensatory parameters in highly parameterized calibration described by Doherty and Welter (2010) and by Doherty and Christensen (2011).

Finally, Auken et al. (2008) showed that using borehole data as a priori information in the geophysical inversion improves the reconstruction of the model features significantly. Estimation of EM-based resistivity models should therefore, wherever possible, be supported by borehole information to improve the decreasing spatial resolution of the EM methods.

Inversion of AEM data using a 1-D geophysical model usually applies smoothness constraints in order to regularize the inversion (Auken and Christiansen, 2004; Viezzoli et al., 2008). Traditionally, the regularization includes both lateral and vertical smoothing constraints (Constable et al., 1987) or a few-layer parameterization (Auken et al., 2008). Inversion using the former type of regularization produces smooth images with blurred formation boundaries which can be problematic when it is important to resolve structural connections in a complex geological system. The latter few-layer inversion may also be prone to producing artifacts when used to map such systems. Day-Lewis (2005) and others therefore recognized that regularization used to stabilize the geophysical inversion can lead to artifacts that do not reflect the actual hydrogeological conditions. Thoughtless use of such results to construct groundwater models can have serious ramifications.

For the present case study, the number of vertical transitions is a great challenge for the AEM method due to the principle of high-resistivity equivalence. That is, it is difficult to resolve a high-resistivity layer between two low-resistivity layers because the energy loss, and therefore the sensitivity, is concentrated in the less resistive layers. This will result in layer suppression, because the data sensitivity to the high-resistivity layer is low (Christiansen et al., 2006). This effect is present for both the smooth and sharp inversions, but in the sharp inversion the effect is less fuzzy, and features, especially for the fifth layer, could be more clearly reconstructed (Fig. 4). When the sensitivity of the AEM method is too low, the regularization may make information migrate from areas with higher measurement sensitivity (Vignoli et al., 2015). In contrast to the smooth regularization scheme, the sharp regularization method is designed to penalize smooth transitions, which eventually improves the reconstruction of the deeper sand bodies in the present study. Therefore, for the studied case, the sharp regularization methodology should be preferred over smooth regularization, because the sharp constraints correspond better to the actual structures of the reference system (sharp transitions between categorical deposits; Fig. 4). Moreover, because the sharp regularization methodology leads to improved reconstruction of subsurface structures, these models lead to greater accuracy and improvement of most groundwater model predictions (Fig. 9).

The groundwater system considered here is relatively shallow, at least as
seen from the perspective of the AEM system used in the demonstration
example. This is evident from the transmitted EM signal (Fig. 3). The
background noise primarily affects the last time gates
(10

We present a workflow for efficient construction and calibration of large-scale groundwater models using a combination of airborne electromagnetic (AEM) data and hydrological data. Other types of data could be integrated as well following the same procedure. First, the AEM data are inverted to form a 3-D geophysical model. Subsequently, the geophysical model is translated into a 3-D model of hydraulic conductivity by using a spatially dependent petrophysical relationship for which the shape parameters are estimated by fitting the groundwater model to hydrological data. The estimated shape factors of the petrophysical relationship primarily work as translators between resistivity and hydraulic conductivity, but they can also compensate for structural defects in the model.

The method is demonstrated for a synthetic case study where the subsurface consists of categorical deposits with different geophysical and hydraulic properties. The AEM data are inverted using both smooth and sharp regularization constraints, resulting in two competitive geophysical models. Furthermore, the influence of the AEM data quality is tested by inverting the sharp geophysical models using data corrupted with four different levels of background noise. The resulting groundwater models are each calibrated on the basis of head and discharge data, and their predictive performance is tested for four types of prediction beyond the calibration base. Predictions of a pumping well's recharge area and groundwater age apply the same stress situation as applied during hydrologic model calibration, while predictions of head and stream discharge are done for a changed stress situation.

It is found that a geophysical model inverted with sharp constraints (SHI-sharp) leads to a more accurate groundwater model than one that is based on a geophysical model inverted with smooth constraints (SHI-smooth). The SHI-sharp model leads to an estimated hydraulic conductivity field of greater accuracy and to improvement of most groundwater model predictions. The explanation is that the reference system (like many real hydrogeologic systems) is characterized by sharp transitions between categorical deposits; this is resolved better by the SHI-sharp resistivity model than by the SHI-smooth model.

Finally, it is shown that prediction accuracy improves with AEM data quality for predictions of recharge area, head change, and stream discharge, while the accuracy appears to be unaffected for prediction of groundwater age, which cannot be predicted accurately even with high-quality geophysical data.

All data are synthetic and are available by request to the corresponding author.

The authors declare that they have no conflict of interest.

The presented work was supported by HyGEM, integrating geophysics, geology, and hydrology for improved groundwater environmental management, project no. 11-15116763. The funding for HyGEM is provided by the Danish Council for Strategic Research. All data are synthetic and are available by request to the corresponding author.

We would like to thank Troels N. Vilhelmsen, Nikolaj Foged, Esben Auken, and Anders V. Christiansen for their participation in discussions and sharing of experiences during the HyGEM project meetings. Finally, we acknowledge the comments from two anonymous reviewers, and editor Bill Hu, whose feedback contributed to improvements of this paper. Edited by: B. Hu Reviewed by: two anonymous referees