Examining the impacts of precipitation isotope input ( δ 18 O ppt ) on distributed , tracer-aided hydrological modelling

Tracer-aided hydrological models are becoming increasingly popular tools as they assist with process understanding and source separation, which facilitates model calibration and diagnosis of model uncertainty (Tetzlaff et al., 2015; Klaus and McDonnell, 2013). Data availability in high-latitude regions, however, proves to be a major challenge associated with this type of application (Tetzlaff et al., 2015). Models require a time series of isotopes in precipitation (δOppt) to drive simulations, and throughout much of the world – particularly in sparsely populated high-latitude regions – these data are not widely available. Here we investigate the impact that choice of precipitation isotope product (δOppt) has on simulations of streamflow, δ18O in streamflow (δOSF), resulting hydrograph separations, and model parameters. In a high-latitude, datasparse, seasonal basin (Fort Simpson, NWT, Canada), we assess three precipitation isotope products of different spatial and temporal resolutions (i.e. semi-annual static, seasonal KPN43, and daily bias-corrected REMOiso), and apply them to force the isoWATFLOOD tracer-aided hydrologic model. Total simulated streamflow is not significantly impacted by choice of δOppt product; however, simulated isotopes in streamflow (δOSF) and the internal apportionment of water (driven by model parameterization) are impacted. The highest-resolution product (REMOiso) was distinct from the two lower-resolution products (KPN43 and static), but could not be verified as correct due to a lack of daily δOppt observations. The resolution of δOppt impacts model parameterization and seasonal hydrograph separations, producing notable differences among simulations following large snowmelt and rainfall events when event compositions differ significantly from δOSF. Capturing and preserving the spatial variability in δOppt using distributed tracer-aided models is important because this variability impacts model parameterization. We achieve an understanding of tracer-aided modelling and its application in high-latitude regions with limited δOppt observations, and the value such models have in defining modelling uncertainty. In this study, application of a tracer-aided model is able to identify simulations with improved internal process representation, reinforcing the fact that tracer-aided modelling approaches assist with resolving hydrograph component contributions and work towards diagnosing equifinality.


Introduction
Hydrological models are critical tools for the planning, development, design, operation, and sustainable management of water resources (Singh and Frevert, 2006).These models provide insight into applications such as the prediction of floods, droughts and water availability, and the effects of climate and land use change on water resources.Problems arise for calibration and validation of hydrological models when there is (1) a lack of available data at sufficient resolutions to force and validate model simulations -especially in remote, high-latitude locations (in Canada: Coulibaly et al., 2013); (2) issues with equifinality affecting model parameterization; and (3) uncertainty in model results (e.g.Beven and Binley, 1992;Kirchner, 2006;Fenicia et al., 2008;Dunn et al., 2008).
It is now widely accepted that calibration and validation of hydrological models based solely on streamflow is not a sufficient evaluation measure (Kuczera, 1983;Beven and Binley, 1992;Kuczera and Mroczkowski, 1998;Seibert and Mc-Donnell, 2002;Kirchner, 2006;Fenicia et al., 2008;Dunn et al., 2008).Modellers are focusing on a model's ability to correctly partition, store, and release water from hydrologic compartments, in addition to adequately simulating total streamflow response.Conservative tracer data provide insights into the dominant hydrological processes and integrated runoff response (in northern catchments: Birks and Gibson, 2009;Tezlaff et al., 2015), and such data assist with constraining model parameter space during calibration, reducing model uncertainty, and assisting with selection of appropriate model structures (e.g.Tetzlaff et al., 2008;Birkel et al., 2010a;Birkel et al., 2014;Smith et al., 2016).An increasing number of studies have investigated the utility of tracer-aided modelling approaches, especially over the past decade (for a comprehensive overview, see Birkel and Soulsby, 2015).
Although greatly informative, previous tracer-aided modelling studies have generally been conducted using lumped conceptual rainfall-runoff models in highly instrumented small-scale experimental catchments (< 10 2 km 2 ).This has resulted in distributed studies at the regional scale (> 10 3 km 2 ) being left largely unexplored, with the exception of a few, select applications (Stadnyk et al., 2013).Modelling at the regional scale typically requires a distributed approach to capture the heterogeneity in meteorological inputs, basin characteristics, and runoff response, resulting in more complex, highly parameterized models (e.g.Michaud and Sorooshian, 1994;Carpenter and Georgakakos, 2006;Her and Chaubey, 2015).Because it is at these larger scales where models are applied operationally and management decisions are based, there is a critical need to understand the abilities, limitations, and uncertainties associated with distributed tracer-aided modelling at the regional scale.
Although there is an identified need, the issue of data availability, particularly input data, proves to be a major challenge associated with this type of application (Birkel and Soulsby, 2015).Tracer-aided hydrological modelling typically requires a time series of isotopes in precipitation (δ 18 O ppt ) to drive model simulations.Unfortunately, throughout much of the world, and particularly in sparsely populated highlatitude regions (such as the vast majority of Canada), these data are not widely available.Although automatic samplers are becoming increasingly common, watersheds in which snow accumulation is substantial will continue to be fraught with difficulties surrounding the collection and characterization of precipitation isotopes, particularly during the winter months (Dietermann and Weiler, 2013;Penna et al., 2014).The lack of spatial and temporal density of δ 18 O ppt observations highlights the need for alternative methods to provide estimates of stable isotopes in precipitation for tracer-aided model input (termed "δ 18 O ppt products").Options include empirically based models generating gridded time series of precipitation isotopes (e.g.Lykoudis et al., 2010;Delavau et al., 2015), and isotope-enabled climate model output (for a comprehensive overview, see Noone and Sturm, 2010;Xi, 2014).
Small-scale catchment studies rely on continuous records of δ 18 O ppt observations at high temporal frequencies (typically daily, and less commonly, weekly) for model input.At the larger scale, tracer-aided modelling completed by Stadnyk et al. (2013) in the remote Fort Simpson region of northern Canada used annual average compositions of rainfall and snowfall δ 18 O to drive model simulations.Their results suggested that utilizing annual, spatially static oxygen-18 in precipitation forcing has the potential to significantly impact simulations and, consequently, model parameterization as well.The assumption that model input is spatially invariant is not preferable, as δ 18 O ppt can vary drastically over small space scales and timescales due to changes in moisture sources and transport processes, rainout history, and seasonality (e.g. in Canada: Gat et al., 1994;Moran et al., 2007;Birks and Edwards, 2009).
This study aims to explore how varying spatial and temporal resolutions of precipitation isotope products, or δ 18 O ppt input, impact regional tracer-aided model simulations and parameterization.Forcing a tracer-aided, distributed hydrological model (isoWATFLOOD) with three precipitation isotope products, we examine how the different δ 18 O ppt products impact the a. simulation of total streamflow and its isotopic variability (δ 18 O SF ); b. internal apportionment of water, namely the seasonality of hydrograph separation; and c. model parameterization and simulation uncertainty.
We explore the impact that varying the resolution of δ 18 O ppt inputs has on the capability of the model to reproduce observed δ 18 O SF variability, and the usefulness of a tracer-aided modelling approach to help inform and quantify simulation equifinality.
2 Study area and data

The Fort Simpson Basin
The Fort Simpson Basin (FSB) is located within the lower Liard River Valley close to the town of Fort Simpson, Northwest Territories, Canada (61 • 45 N, 121 • 14 W; Fig. 1).This region has been the focus of several tracer-aided hydrological studies (e.g.St Amour et al., 2005;Stadnyk et al., 2005Stadnyk et al., , 2013;;Stadnyk-Falcone, 2008).The FSB is selected for this study to build upon previous modelling work conducted within the region, and to follow up on recommendations from Stadnyk et al. (2013) suggesting further analysis and improvement of isoWATFLOOD δ 18 O ppt input.The study period of 1997-1999 is selected based on data availability.
This study considers two sub-basins of the greater Fort Simpson Basin: the Jean-Marie (1310 km 2 ) and Blackstone River (1390 km 2 ) sub-basins (Fig. 1).The basins vary in relief from 0.3 % in the Jean-Marie sub-basin to 0.63 % for the Blackstone sub-basin, on average.Differences in wetland distribution and function, basin physiography, and land cover make-up between the two watersheds (Table 1) are the primary reasons in selecting these sub-basins for this study.These marked differences ensure that watersheds of varying dominant hydrological processes are represented in the modelling, and therefore the impacts of δ 18 O ppt input selection on these processes can be examined.The land cover classification breakdown (Table 1) shows the primary land cover type within the sub-basins as transitional, consisting of shrubs, deciduous varieties, and early generation spruce.The region has a high proportion of wetlands, with the total wetland percentage in Table 1 representing both bogs (disconnected drainage) and fens (connected drainage), although the amount of each type within each respective sub-basin varies.Aylsworth and Kettles (2000) state that Jean-Marie is predominately fen peatlands, while Blackstone is bog-dominated peatlands, with very few or no fen peatlands present.
The Ecoregions Working Group (1989) classifies the FSB as a sub-humid mid-to high-boreal ecoclimatic region (Hbs), classified by cool summers approximately 5 months in length, with moderate (300-500 mm) annual precipitation.Winters are very cold with persistent snow cover.The hydrological response is dominated by snowmelt during late April to early May, while summer and fall runoff events are due to major rainfall, with a return to baseflow occurring during dry summer periods or towards the beginning of the ice-on season in October.

Meteorological and hydrometric data
Daily total precipitation, mean daily temperature, and hourly relative humidity data are obtained from Environment Canada's Fort Simpson Airport weather station.Observed precipitation is supplemented with ANUSPLIN-derived daily precipitation extracted at eight locations throughout the Fort Simpson region (Fig. 1).ANUSPLIN is a multidimensional non-parametric surface fitting method that has been found to be well suited to the interpolation of various cli- mate variables, particularity in data-sparse, high-elevation regions, as the method accounts for spatially varying dependencies on elevation (McKenney et al., 2011).We have validated ANUSPLIN against independent station observations (precipitation and temperature) across the Prairies and Boreal regions of Canada as a precipitation forcing for hydrologic modelling.It has been found to be adequate (r ≥ 0.98) for the purpose of short-term modelling studies.An inversedistance weighting approach is used to spatially distribute the daily ANUSPLIN and observed precipitation time series across the model domain (Kouwen, 2016).Rainfall that occurred over the study period, particularly in 1997, was significantly higher than normal.Additionally, 1998 was above average in temperature, which is especially prevalent in the first portion of the year.Other researchers have attributed the increased rainfall and warmer temperatures to a strong El Niño influence from mid-1997mid- to mid-1998mid- (Petrone et al., 2000;;St Amour et al., 2005).
Hydrometric records are obtained from Water Survey of Canada.Jean-Marie was gauged at Highway No. 1 in 1972 with a period of record of 44 years, whereas Blackstone was gauged at Highway No. 7 in 1991 having a record length of 25 years.Neither sub-basin is regulated; therefore, all flows are considered to be natural.During the study period, mean annual discharge was above normal in both subbasins in 1997, normal in Jean-Marie, slightly below normal in Blackstone in 1998, and below normal in both subbasins in 1999.Winter (ice-on) flows tend to be very low given highly seasonal, high-latitude hydrology, underlying discontinuous permafrost, and the absence of mid-winter melt (St Amour et al., 2005).Averaged winter ice-on flows from 1997 to 1999 were 0.194 and 0.034 m 3 s −1 for the Jean-Marie and Blackstone sub-basins, respectively.A statistical summary of observations used in this study is provided in Table 2.

Isotope data
During 1997 to 1999, intensive sampling took place in the Fort Simpson Basin as part of the Mackenzie Study of the Global Energy and Water Experiment (GEWEX; Stewart et al., 1998).The campaign sampled δ 18 O and δ 2 H of streamflow, rainfall, snowpack, and surface waters (wetlands and lakes) during the open water season (May to October).Dur-ing ice-on conditions, the isotope stratigraphy of river ice extracted during late March in 1998 and 1999 was used to reconstruct the isotopic composition of winter streamflow (Gibson and Prowse, 1999;Prowse et al., 2002;St Amour et al., 2005).This study uses measured δ 18 O compositions in streamflow in the Jean-Marie (n = 71) and Blackstone (n = 69) sub-basins for model calibration.Although δ 18 O ppt compositions (n = 27) were collected as part of the GEWEX sampling campaign, these data are not preferred for traceraided hydrologic model input due to their spatial uniformity and poor temporal resolution.Observations are incorporated into this study as the "static" δ 18 O ppt input, and as a means to validate the KPN43 and REMOiso products and to inform the static precipitation product.The number of measurements and their statistical properties are summarized in Table 2. Isotopic compositions of δ 18 O are expressed in delta (δ) notation as a deviation from VSMOW (Vienna Mean Standard Mean Ocean Water) in units of per mille (‰), such that δ water = (R water /R VSMOW − 1) × 1000 ‰, where R is 18 O/ 16 O in the sample and standard, respectively.Isotope samples were analysed at the Environmental Isotope Laboratory at the University of Waterloo, and St Amour et al. (2005) indicated maximum analytical uncertainties of ±0.1 ‰ for δ 18 O.

Precipitation oxygen-18 input
The precipitation isotope products evaluated in this study represent a variety of spatial and temporal scales, and were selected because they are commonly available for all traceraided hydrologic modelling applications.The first type of input used in this study is annual average δ 18 O ppt compositions of rainfall and snowfall for each year of simulation (i.e.yearly resolution).Values for the FSB were obtained by averaging observations of δ 18 O in rainfall and the snowpack obtained from the GEWEX study (Tables 2 and 3).δ 18 O ppt compositions were assumed constant throughout the study domain (i.e.spatially uniform).Due to a lack of snowfall data collected during this study, we assumed the average annual isotopic composition of the snowpack was representative of the snowfall composition, as has been done in other data sparse, high-latitude tracer-aided modelling studies (Smith et al., 2015(Smith et al., , 2016;;Holmes, 2016;Stadnyk et al., 2013).It is well established in the literature that the isotopic composition of snowfall is not necessarily equal to the average annual composition of the snowpack (due to sublimation and snow metamorphism; Zhou et al., 2008;Taylor et al., 2001Taylor et al., , 2002)).The high latitude of our study site, however, makes freezethaw cycling during the winter rare, making this assumption more reasonable.Due to the averaged values and lack of spatial variability, this product is referred to as "static" throughout the remainder of the paper, and consists of two constant δ 18 O ppt values (rain and snow) for each year.This product is specifically designed and evaluated for remote regions that lack spatially and temporally varying δ 18 O ppt observations.Times series simulations obtained from the KPN43 model created by Delavau et al. (2015) are used as the second type of δ 18 O ppt product in this study.The KPN43 model uses North American Regional Reanalysis (NARR; Mesinger et al., 2006) climate variables, teleconnection indices, and geographic information to produce gridded time series of oxygen-18 in precipitation at a monthly time step (Delavau, 2017).This product is generated at a 10 km resolution (to mirror model set-up), and varies spatially throughout the study domain due to the variation in the climatic predictors and geographic information required to produce simulations.
The third δ 18 O ppt product included in this study is regional climate model output from the isotope-enabled climate model, REMOiso (Sturm et al., 2005(Sturm et al., , 2007)).Raw RE-MOiso δ 18 O ppt output is available at a 55 km spatial resolution and a 6 h time step.REMOiso output is averaged in this study, however, to a daily time step, as the range and variability of sub-daily δ 18 O ppt are erroneously large, and the resolution of streamflow oxygen-18 calibration data does not warrant a temporal frequency of input finer than daily.

Background and set-up
The tracer-aided hydrological model used in this study is isoWATFLOOD (Stadnyk-Falcone, 2008;Stadnyk et al., 2013).isoWATFLOOD is an extension of the WATFLOOD hydrological model, whereby water and oxygen-18 are simultaneously budgeted throughout the modelled hydrologic cycle.WATFLOOD is a distributed model that uses grouped response units (GRUs) to simulate streamflow in hydrologically distinct land cover units (Kouwen et al., 1993;Kouwen, 2016).Process representation within WATFLOOD is considered to be a combination of both conceptually and physically based algorithms, as certain algorithms are conceptually based (e.g.evaporation and snowmelt), while others are more based in physics (e.g.channel routing).Due to the coupling of isotopes to each hydrological process simulated in WATFLOOD, simulation of isotopic composition does not introduce any additional parameters.A more comprehensive description of isoWATFLOOD's model structure and govern-  (Philip, 1954).ing equations can be found in Stadnyk et al. (2013), and selected descriptions are provided in Table 4. isoWATFLOOD requires the δ 18 O of precipitation (either rain and snow separately, or total precipitation) and can utilize (though does not require) distributed relative humidity inputs to force the model.Additionally, δ 18 O compositions for hydrologic storages of river/fen water, soil water, baseflow, and snowpack are needed for model initialization, which can be obtained from field data or estimated.Here, regional isotopic storage initializations are derived from measured data obtained during the GEWEX campaign and reported by St Amour et al. (2005).These include streamflow (−13.52 ‰), interflow (soil water; −14.60 ‰), baseflow (−20.00 ‰), and snowpack (−22.00 ‰) background compo-sitions.Sensitivity analyses have shown that within 1 month of simulation isoWATFLOOD spin-up is complete and, past this point, initialization values have no bearing on model output.All other data required by isoWATFLOOD (e.g.distributed precipitation, temperature, evaporation, inflows) are passed from WATFLOOD forcings or computations.
The isoWATFLOOD model used in this study is based on a previous version reported by Stadnyk et al. (2013).The current version used here is an updated version of isoWAT-FLOOD code, and the watershed set-up incorporates various model improvements made since 2013, independent of this study.Based on findings from Aylsworth and Kettles (2000), we implemented a 90 % bog and 10 % fen split in Blackstone and a 30 % bog and 70 % fen split in Jean-Marie.The entirety of the FSB is modelled at a 10 km spatial resolution, and the model is run continuously from January 1996 to December 1999, whereby 1996 is utilized as a spin-up to set initial hydrologic and isotopic storage conditions.

Calibration and parameter uncertainty
Being a distributed model, WATFLOOD has a large number of parameters requiring calibration.For this reason, a sensitivity analysis is first conducted to identify which parameters have the largest influence on both streamflow and δ 18 O SF .A subset of parameters is identified for inclusion in the calibration based on this sensitivity analysis, including nine hydrological parameters from each of the five most prominent land classes (mixed/deciduous, coniferous, transit, bogs, and fens), and four routing parameters from each of the two modelled sub-basins.This results in 53 parameters that are incorporated into the parameter uncertainty assessment (Tables 4  and S1 in the Supplement).Allowable ranges for each parameter are determined based on published values alongside personal communications with N. Kouwen (Kouwen, 2016) (Table S1).
This study uses a multi-criteria, multi-objective approach to model calibration, with the procedure summarized as follows.
i.A Monte Carlo random sampling approach, assuming uniform parameter distributions, is used to individually select each parameter from its allowable range (Table S1).Random parameter sampling is completed 30 000 times, generating 30 000 unique parameter sets for isoWATFLOOD model evaluation.
ii.For each of the three δ 18 O ppt inputs (KPN43, RE-MOiso, and static), streamflow and δ 18 O SF are simulated from 1996 to 1999 for all 30 000 parameter setsas defined in (i).
iii.Simulated streamflow and δ 18 O SF are assessed statistically over the period of study (1997)(1998)(1999), excluding the 1996 spin-up year), and regionally across the Jean-Marie and Blackstone sub-basins.Simulations are classified as behavioural (or non-behavioural) (Beven and Binley, 1992) based on meeting (or not) the following set of efficiency criteria thresholds, defined in detail below, for simulated streamflow and δ 18 O SF .Behavioural thresholds used in this study are subjectively defined, but are arrived at through a review of methods employed in similar studies (e.g.Moriasi et al., 2007;Birkel et al., 2010aBirkel et al., , b, 2011;;Smith et al., 2016), measurement error, and an iterative process exploring the sensitivity between the set thresholds and resulting behavioural simulations for each input type.Based on this analysis, the Nash-Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970), volumetric error criteria (%Dv), root mean square error (RMSE), and the Kling-Gupta efficiency criterion (KGE; Gupta et al., 2009;Kling et al., 2012) are selected.A multi-criteria model evaluation approach places emphasis on different statistical properties of a simulation.For example, NSE has a documented bias towards peak flow, and conversely, log(% Dv) is a more appropriate evaluation measure for periods of low flow.The NSE, % Dv, and log(% Dv) efficiency are not considered suitable metrics for δ 18 O SF assessment due to the temporal discontinuity of the isotope observations; therefore, RMSE and KGE are used as isotopic simulation statistics.The KGE statistic puts less emphasis on peak flow differences by providing a more balanced approach where error is first summed and then squared at the end, preserving the sign of the error and enabling a trade-off of error throughout the simulation period (Gupta et al., 2009).It should also be noted that δ 18 O SF observations are not equally distributed through time, whereby the highest concentration of observations occurs during snowmelt in the month of May (∼ 25 %), and the fewest observations during the 6-month ice-on period from November to April (∼ 23 %), with the remaining 52 % of observations sampled during summer.The sporadic distribution of observations may result in the calibrations more highly weighted to certain periods of the year and the dominant processes occurring at that time, thereby having the potential to impact model parameterization.

REMOiso bias correction
Due to a lack of published studies evaluating REMOiso performance within Canada, a comparison between REMOiso output and Canadian Network for Isotopes in Precipitation observations (CNIP; Birks and Gibson, 2009)   for direct comparison to CNIP data using the precipitation amount-weighting approach in Eq. ( 1): where P i is the amount of daily precipitation (mm) obtained from the Snare Rapids Canadian Air and Precipitation Monitoring Network (CAPMoN) station operated by Environment Canada, where isotopic compositions are also sampled under the Canadian Network for Isotopes in Precipitation.
Uncorrected REMOiso simulations exhibit a positive bias in this region (Fig. 2), which is expected based on the ECHAM4 mean annual δ 18 O ppt output (Noone and Sturm, 2010) and personal communications with Birks and Sturm (2016).Therefore, a seasonal bias correction is applied to daily REMOiso simulations.The bias correction is calculated as the average seasonal difference between the monthly amount-weighted REMOiso output and the CNIP observations.Corrected monthly and daily REMOiso output at Snare Rapids is displayed in Fig. 2 as the dashed red and solid orange lines, respectively.For the current study, daily REMOiso output for the Fort Simpson region is biascorrected with the seasonal correction values, ranging from −4.5 ‰ (NDJF) to −8.9 ‰ (MAM), with an average of −7.0 ‰.
The statistical properties of the corrected daily REMOiso simulations, alongside the KPN43 monthly simulations and the static seasonal averages, are summarized in Table 2.

Statistical treatment of data
For discussion and analysis purposes (Sects.4.2-4.4),results represent only the behavioural simulations derived from each δ 18 O ppt product.Uncertainty bounds are the 5th and 95th percentiles drawn from the ensembles of behavioural simulations, denoted as the shaded bounds around each model's mean simulation.
Kendall's tau coefficient (τ ) is a non-parametric test used to compare the level of correlation between two variables.We compute Kendall's tau for the mean daily streamflow and δ 18 O SF simulations derived from the three inputs.By computing τ coefficients for pairs of simulated time series (i.e.REMOiso vs. KPN43, REMOiso vs. static, and KPN43 vs. static), we can statistically evaluate the similarity of model output derived from different precipitation isotope products.
Parameter probability distributions (Table 4) are arrived at by first weighting behavioural parameters for each land cover type to their corresponding percent coverage within the modelled sub-basins.Land cover weighted parameter values are then ranked and non-exceedance probabilities determined.Routing parameter distributions for each sub-basin are arrived at using a similar approach, but are not weighted by coverage.The non-parametric Kolmogorov-Smirnov (K-S) test is used to assess whether behavioural parameter distributions are considered to be from the same distribution.
Spatially distributed precipitation isotope product maps (Fig. S1 in the Supplement) represent daily precipitation isotopes averaged across seasons (DJF, MAM, JJA, SON), and are precipitation amount-weighted using WATFLOOD precipitation input (interpolated Environment Canada Canadian Daily Climate Data, housed in WATFLOODs radcl.r2cfiles; Kouwen, 2016).Maps are generated overlapping the model grid (10 k) for the entire FSB domain, which includes the Jean-Marie and Blackstone sub-basins.

Results and discussion
Results of the three calibrations indicate that the choice of δ 18 O ppt input influences the number of simulations that meet behavioural criteria thresholds.The KPN43 product results in more behavioural simulations (n = 321) relative to the RE-MOiso (n = 268) or static (n = 216) products (Table 5).This also implies that choice of δ 18 O ppt input influences the models' internal apportionment of water (i.e.hydrograph separations) via model parameters.Among input types, potentially significant differences in several parameters were noted (Table S1), and are discussed in Sect.4.4.In almost all instances, the ranges of the parameters were not significantly  constrained from the allowable parameter ranges, yielding confidence in our simulated parameter uncertainty envelopes.

Precipitation oxygen-18 input
Of the three δ 18 O ppt products, KPN43 input is on average the most enriched (−20.48 ‰), followed by REMOiso (−21.78 ‰), and static is the most depleted (−22.82‰) (Figs.3a and 4a).The KPN43 and static products show similar variation about their means, with CVs equal to 0.19 and 0.20, respectively.Conversely, REMOiso has a higher CV (0.25) and much larger range, which is, in part, due to the finer daily time step of this input.Spatial variability between Jean-Marie and Blackstone is zero for the static product as annual snow and rainfall compositions are spatially averaged across the domain.each product.Averaged spatial variability is greatest for the KPN43 forcing, followed by REMOiso, and is constant for the static product.REMOiso shows less long-term average variability because its temporal variability is greater, resulting in more chaotic (randomized) signals of δ 18 O ppt that produce weaker long-term signals when averaged over time.KPN43, on the other hand, exhibits more consistent spatial patterning of δ 18 O ppt variability, resulting in stronger signals of long-term variability on a per-grid basis (Fig. S1).REMOiso input is derived on a 55 km grid, meaning that approximately five isoWATFLOOD grids are equivalent to one REMOiso grid, which also results in a terrain (variability) smoothing effect.The static input exhibits seasonal variability caused by the different compositions of rain and snow, and mixed shoulder season compositions (MAM and SON) when both rain and snow occur.
Although there are only 19 rainfall δ 18 O observations collected over the study period in Jean-Marie, and eight within Blackstone (hollow black diamonds in Figs.3a and 4a), these limited data provide some information regarding the accuracy of the products.By visual inspection, each of the three products generates reasonable estimates of δ 18 O ppt .This is expected for the static input, which is derived directly from these observations; however, this provides qualitative valida-tion for KPN43 and REMOiso.REMOiso is the only product that can somewhat replicate event-scale variability in δ 18 O ppt due to its daily time step.The KPN43 product appears to represent the average composition of summer rainfall events, and displays reasonable seasonal variability.There are insufficient observations to statistically validate these products within the Fort Simpson Basin.The semi-annual static input perhaps does a reasonable job of reflecting δ 18 O ppt seasonality because of the high-latitude location of the basin, where much shorter shoulder seasons exist.

Modelling streamflow
All calibrations adequately capture variations in total streamflow in both sub-basins, as emphasized by the regional calibration statistics (Table 5).On average, behavioural streamflow simulations have a NSE of 0.68, and % Dv of 13.8 %.Mean daily streamflow and uncertainty bounds for the KPN43, REMOiso, and static model calibrations are displayed in Figs.3b and 4b.It is worth noting that both basins have similar drainage areas and received comparable precipitation inputs over the study period, which would naturally result in similar streamflow responses.Comparing normalized (by drainage area) observed discharge over the study Hydrol.Earth Syst.Sci., 21, 2595-2614, 2017 www.hydrol-earth-syst-sci.net/21/2595/2017/ period for the basins reveals the Blackstone sub-basin generates nearly twice as much runoff as the Jean-Marie sub-basin, with normalized discharges of 0.56 and 0.31 mm km −2 , respectively.Therefore, differences in hydrograph characteristics (i.e.peak flows, attenuation, etc.) between Jean-Marie and Blackstone result from variations in basin physiography, storage mechanisms, and land cover composition, specifically large differences in average basin slope and surface water and wetland dynamics (St Amour et al., 2005).Namely, the higher energy environment of Blackstone River promotes a quicker runoff response, and the flatter, more surface water dominated Jean-Marie basin yields a damped runoff response, on average.Within Jean-Marie, both the timing and volume of peak flows derived from snowmelt are well captured in 1998; however, volume is underpredicted in 1997 and 1999 for the average streamflow simulation.The parameter uncertainty bounds generally enclose the observed spring melt hydrograph, except in 1999, when the timing of the melt peak is simulated later than was observed.Snowmelt is controlled by a degree-day snowmelt function in WATFLOOD, using temporally constant snowmelt parameters.Parameter stationarity likely results in an inadequate description of the interannual variability in energy balance and snowpack ripening dynamics within the model.All simulations have difficulty capturing the volume of the snowmelt recession limb, which may be caused by the parameterization of baseflow and fen responses in this sub-basin.Based on previous studies (Connon et al., 2015), it has been suggested that bog and fen complexes are likely one of the primary drivers of hydrograph timing and shape due to their ability to dynamically alter drainage pathways, particularly in this region.In 1997, following a significant melt event, all simulations in Jean-Marie exhibited higher than observed recession limb flows, indicating runoff was slow to drain and storages were too high.This could be an indication of WATFLOOD's inability to capture the dynamic flow paths occurring within Jean-Marie's extensive fen network.This same recession limb discrepancy does not occur in Blackstone, where there are much fewer fens, and bogs would remain hydraulically isolated even during wetter conditions (Connon et al., 2015).In Blackstone, the recession limb hydrograph is well simulated across all inputs; however, peak flows (with the exception of the 1997 snowmelt) are generally underestimated.Postfreshet, simulations adequately capture the timing of rainfall events, but (with the exception of 1997 in Jean-Marie) consistently underestimate the magnitude of the peaks.This underestimation is most evident when all simulations generated a very limited streamflow response to an early October rainfall event in 1998, underestimating the observed peak flow by approximately 50 % (Jean-Marie) and 75 % (Blackstone).These results may point to inadequate precipitation forcing due to the climate station proximity, high spatial variability of rainfall, and inadequate soil moisture parameterization, or could be an unintended side-effect of using NSE in our calibration (Gupta et al., 2009).
Most interesting is the similarity of the streamflow simulations among the different δ 18 O ppt products, further assessed by Kendall's tau coefficient (τ ).In Jean-Marie, τ ranges between 0.92 (REMOiso vs. static) and 0.97 (KPN43 vs. static).In Blackstone τ is more tightly constrained, ranging from 0.96 (REMOiso vs. static) to 0.98 (KPN43 vs. static).All τ values are statistically significant.It should be noted that small deviations between mean streamflow simulations occur during spring melt, where REMOiso-derived streamflow consistently results in higher peaks than KPN43 and staticdriven simulations.These differences in mean streamflow, however, fall within overlapping uncertainty bounds and are not significant outside of parameter uncertainty.Despite significant changes to model parameters (Table S1), the resultant efficiency statistics among the mean streamflow simulations remain nearly identical (Table 5).Based on this analysis, we find that the three precipitation isotope products generate statistically similar streamflow simulations.Given the insignificant differences in streamflow response, it is only through analysis of δ 18 O SF that the impact of different model parameterizations is assessed.

Modelling δ 18 O in streamflow
Mean daily δ 18 O SF simulations and uncertainty bounds for the KPN43, REMOiso, and static product model calibrations are displayed in Figs.3c and 4c.Each model calibration produces mean simulations that capture many of the trends, but not particularly the magnitudes, present in the observed δ 18 O SF record.Observed δ 18 O SF show a depletion due to large influxes of snowmelt during the spring freshets, and gradual enrichment over the summer months due to the evaporation of surface and soil waters, occasionally punctuated by rainfall events that may enrich or deplete δ 18 O SF .During late fall and throughout the winter, δ 18 O SF tends toward a more depleted, stable groundwater composition (St Amour et al., 2005).
Though each of the model calibrations results in similar trends relative to the observed δ 18 O SF record, there are notable departures.As simulated δ 18 O SF uncertainty envelopes associated with each δ 18 O ppt product are, at times, nonoverlapping, differences in δ 18 O SF simulations can be attributed to δ 18 O ppt product and, therefore, are not just an artefact of parameter uncertainty (unlike streamflow).The dissimilarities between δ 18 O SF simulations are also reflected in the RMSE statistic (Table 5); the RMSE is larger for staticderived simulations due to increased emphasis on periods with a higher observation density (i.e.spring freshet), where larger offsets between simulated and observed δ 18 O SF exist.The KPN43 and REMOiso calibrations produce comparable RMSE.The KGE statistic shows only minor differences between δ 18 O SF simulations given the statistic puts more emphasis on long-term bias (Gupta et al., 2009), there-fore reflecting the fit of the overall simulation throughout the study period for this highly seasonal basin (Kling and Gupta, 2009).Further research is required to better understand the impacts of sporadic sampling resolution (for δ 18 O SF observations) on efficiency criteria, and consequently the objective functions.It is noted that sampling during peak freshet was, at times, limited by accessibility during the high water stage; therefore, some temporal gaps exist in the observed δ 18 O SF record (particularly in 1999) during the period that streamflow compositions are generally most depleted.
Differences in δ 18 O SF simulations within each sub-basin are due to a combination of (1) markedly different δ 18 O ppt input compositions during large precipitation events amongst precipitation isotope products, and (2) how new water transits through the system via the model's hydrological compartments.For this study area, large precipitation events can be separated into (1) the accumulation of winter snowfall and corresponding spring freshet (approximately 35 to 40 % of annual precipitation), and (2) major rainfall events occurring post-freshet (summer and fall) (with rainfall representing approximately 60 to 65 % of annual precipitation).
No single model calibration produces consistently strong simulations of δ 18 O SF during the snowmelt period.The KPN43 calibration best captures the timing and magnitude of spring freshet, but overestimates δ 18 O SF (i.e. is too enriched) during the 1997 melt in Blackstone.Conversely, the static and REMOiso calibrations capture the large depletion during the 1997 melt in Blackstone, but produce overly depleted simulations during the 1998 and 1999 freshets -most notably within Jean-Marie.There is a tendency for all models to simulate relatively depleted spring freshet δ 18 O SF compositions.This can be attributed to several factors: (1) overly enriched δ 18 O ppt during the winter months, (2) unaccounted for snow metamorphism processes, (3) an overestimation of direct snowmelt runoff (i.e.streamflow volume), and (4) inaccurate antecedent composition of δ 18 O SF simulated by the models just prior to the spring melt.
Post-freshet, δ 18 O SF simulations are impacted by rainfall amount and composition, and the offset between simulated δ 18 O SF and δ 18 O ppt input at the time of rainfall.As rainfall amount and/or this offset increases, the resulting impact on simulated δ 18 O SF increases.This highlights the importance of capturing the spatial and temporal variability in rainfall δ 18 O, particularly for large and isotopically distinct (from streamflow) events.The threshold defining a large rainfall event varies depending on basin physiography, land cover, storage capacity, and antecedent conditions.St Amour et al. (2005) estimate this threshold to be ≥ 40 mm within the Fort Simpson region.Such a large, isotopically distinct rainfall event occurred on 11-12 June 1998, when approximately 7 mm fell over 2 days with an observed bulk event δ 18 O ppt composition of −22.7 ‰.Both the RE-MOiso and static products reasonably capture this event (−20.9 and −20.1 ‰, respectively, across the study domain); however, the KPN43 product predicted an average δ 18 O ppt composition of −17.6 ‰.In Jean-Marie, where large fen networks help to moderate rainfall-runoff response, the observed δ 18 O SF did not deplete in response to this event, but rather maintained a similar pre-event composition of around −19.17 ‰ (Fig. 3c).KPN43-driven simulations most closely match observed δ 18 O SF due to the antecedent composition of δ 18 O SF prior to the event, even though the KPN43 input generated the least accurate estimate of the depleted rainfall δ 18 O ppt .Conversely, in Blackstone the 11-12 June rainfall generated a much different response in observed δ 18 O SF : a sharp depletion from −19.11 to −20.98 ‰ (Fig. 4c).In this instance, the REMOiso and static calibrations most closely match the observed δ 18 O SF due to their closer representations of the rainfall event composition.In Blackstone, this single event results in a significant offset between KPN43-driven δ 18 O SF simulations relative to those driven by REMOiso and static products, maintained throughout 1998 and up until the 1999 freshet resets the δ 18 O SF .
Throughout much of Canada and in other high-latitude climates, the spring freshet generates a substantial portion of annual streamflow (and typically peak annual flow) when the accumulation of solid precipitation from the winter season melts in late spring over a period of a few weeks.It is therefore important to understand how differences among the products impact snowpack (and subsequently snowmelt) isotopic compositions in isoWATFLOOD.Figure 5 shows the evolution of precipitation-weighted snowpack oxygen-18 (δ 18 O SNW ) throughout each winter of the study period relative to the observed snowpack compositions (hollow black diamonds).Not surprisingly, the static snowpack compositions closely match with observed δ 18 O SNW , and we note that KPN43 and REMOiso snowpacks are more enriched.Caution should be used when comparing modelled vs. observed data here as there is little inter-annual consistency in the number of samples and the location where sampling took place, and no information was provided as to how the δ 18 O SNW were collected (i.e.depth-integrated or depth-dependent samples).Comparison of like-forcing pairs between Jean-Marie and Blackstone reveals subtle spatial differences in simulated δ 18 O SNW .Dissimilarities between the three products within each basin are, however, significant.Interestingly, REMOiso and KPN43 end of winter precipitation-weighted δ 18 O SNW compositions differ by less than 0.5 ‰ in 1997-1998 and 1998-1999.REMOiso and KPN43 inputs consistently generate significantly more enriched snowpacks relative to the static δ 18 O SNW product (and much of the observed data).On average, KPN43 is 3.3 ‰ more enriched, and REMOiso is 3.1 ‰ more enriched than end of season static δ 18 O SNW .Differences in simulated δ 18 O SNW among the products are partially attributed to the poor representation of snowpack physics (i.e.fractionation resulting from sublimation and snow metamorphism) in the current version of the isoWATFLOOD model.The static input inadvertently accounts for some of these processes, as the specified compositions are derived from snow- pack observation near end of winter (in late March).Uncertainty in simulated δ 18 O SNW among the products is notable as well, with static δ 18 O SNW uncertainty remaining relatively constant over the winter relative to REMOiso, and particularly KPN43, where uncertainty decreases as snowpack depth increases (Fig. 5).This is an artefact of the parameterization of sublimation in the models.As the winter progresses, the snowpack grows and sublimated volumes become a smaller fraction of the total snowpack, thus decreasing the effect (and uncertainty) that sublimation has on the volume-weighted δ 18 O ppt of the snowpack.This is observed during periods when the simulated snowpack and snow water equivalent (SWE) are larger, for example, 1998 relative to 1999 (Fig. 5).These significant differences in simulated snowpack composition are one of the primary reasons for offsets between KPN43, REMOiso, and static δ 18 O SF simulations (Figs.3c  and 4c).Once a δ 18 O SF simulation has been offset, it is not possible to "reset" the composition in late fall as streamflow decreases to near-zero and mass retained in the system.This can result in compounding isotopic error (if the offset deviates from observed data) during continuous simulation, thus highlighting the sensitivity of the tracer as a calibration tool.Compounding error is also observed for rainfall events, but generally to a lesser extent due to the relatively smaller durations and magnitudes (volume contributions) of most rainfall events (relative to snowmelt) in high-latitude regions.
Since both δ 18 O SF and δ 18 O SNW are significantly different among δ 18 O ppt products, internal water apportionment (determined by model parameterization) is also likely impacted.Differences in hydrograph separations among the calibrated models are explored to determine the impact δ 18 O ppt has on internal water apportionment and simulation uncertainty.

Hydrograph component analysis and parameter distributions
Component contributions to total streamflow from surface runoff, interflow, and baseflow storage in each season (DJF: December-January-February; MAM: March-April-May; JJA: June-July-August; and SON: September-October-November) derived from each δ 18 O ppt product are shown in Fig. 6.Jean-Marie and Blackstone display similar trends in internal water apportionment throughout the year, indicating generally similar model parameterizations and hydrograph separations among the two basins.Some seasonal differences in component separations exist, however, which are linked to variations in basin physiography, land cover, and storage characteristics reflected by differences in the baseflow (lzf and pwr) and wetland parameters (kcond and theta) among basins (Table S1).Freshet and post-freshet percent contributions to total streamflow in this study are in agreement with those reported in previous studies.St Amour et al. (2005) reported significant post-freshet groundwater Comparison of seasonal volume contributions derived from each δ 18 O ppt product reveals that during spring (MAM), REMOiso-driven simulations show more surface flow contribution to total streamflow, with the mean volume lying above the 95th percentile volumes for both the KPN43 and static input simulations (Fig. 6).On average, REMOiso simulations contribute almost twice as much surface runoff to total streamflow as KPN43 and static simulations during MAM (39 % vs. 25 and 22 %, respectively, for Jean-Marie; and similar, yet slightly larger, percent contributions for Blackstone).
From the seasonal analysis, no other significant differences in component contributions outside of parameter uncertainty can be attributed to the δ 18 O ppt product.It is important to note, however, that each δ 18 O ppt product results in different amounts of parameter uncertainty, both seasonally and overall, as represented by the width of the uncertainty bounds (cross symbols in Fig. 6).The variation in uncertainty bounds between δ 18 O ppt products is also visible in Fig. 3 through Fig. 5.The REMOiso input yields the largest amount of uncertainty in total streamflow, also reflected in the relatively larger amounts of uncertainty in surface water and baseflow component contributions (Fig. 6).Conversely, KPN43 and static inputs generate similar or slightly larger uncertainty in interflow (soil water) contributions relative to REMOiso and lower uncertainty surrounding surface and baseflow contributions, and overall total streamflow.These differences in uncertainty are attributed to the number and characteristics of behavioural parameters retained for each δ 18 O ppt input, which originate due to distinctions in magnitude and variability (both spatial and temporal) among δ 18 O ppt products.Further demonstrated by parameter probability distributions (Fig. 7), the three calibrations result in noteworthy differences in behavioural parameters.We do not display these distributions to comment definitively on parameter identifiability because, as previously noted, the number of evaluations was insufficient for that purpose.Rather, we introduce this analysis to further explore how model parameterization is impacted by δ 18 O ppt input.The selected parameters (Table 4) influence evaporation (f -ratio), surface runoff during snowmelt (akfs, base), upper and lower zone storage (retn), interflow (retn), and baseflow (lzf, pwr).REMOiso parameter distributions more often than not differ from KPN43 and static parameter distributions.Although dissimilarities between KPN43 and static parameter distributions exist, these are typically not as prevalent as differences with REMOisoderived distributions.This echoes the findings from Fig. 7 that KPN43 and static-derived component contributions are more similar than those derived from REMOiso, which may very well be due to the increased spatial and temporal variability of the REMOiso δ 18 O ppt product.Though we cannot verify the correctness of the REMOiso δ 18 O ppt given the absence of daily precipitation isotope observations, differences among inputs imply that temporal resolution of δ 18 O ppt plays a role in the parameterization of a model and the resultant hydrograph separation.
Differences in surface water contributions during snowmelt between REMOiso, KPN43, and static inputs are likely derived from differences in the akfs and base parameters.Parameter distributions from REMOiso are significantly different (as verified through Kolmogorov-Smirnov testing of distributions) than the KPN43 and static input distributions for these parameters (Fig. 7b and f).Lower akfs values represent decreased infiltration and increased surface runoff during snowmelt, which corresponds to REMOiso's increased surface water contributions to total streamflow during spring (MAM).Dissimilarities in baseflow contributions among δ 18 O ppt inputs are influenced by differences in the lzf and pwr parameters (Fig. 7c, d, and g-h), which have a large impact on the quantity of baseflow and the slope of the recession limb of the hydrograph.Wider uncertainty bounds for REMOiso relative to KPN43 and static calibrations within Blackstone (Fig. 6f), and for all models during fall and winter within Jean-Marie (Fig. 6c), are likely due to the wider range of behavioural values for the pwr parameter, specifically the inclusion of lower values which results in longer, more drawn-out recession limbs.It appears that choice of precipitation isotope product influences parameter distributions in isoWATFLOOD, which in turn alters internal water apportionment.In the tracer-aided modelling community, this has significant implications for hydrograph separation and any associated transit time analyses, both of which will be influenced by choice (resolution) of δ 18 O ppt product.

Conclusions
This study used three types of precipitation isotope products as δ 18 O ppt input to a tracer-aided hydrological model (isoWATFLOOD) to investigate the impact differing spatial and temporal resolutions have on simulation of streamflow, isotopic composition of streamflow, internal hydrograph separations, and model parameterization and corre-sponding parameter uncertainty.Our study found that choice of precipitation isotope product (δ 18 O ppt ) 1. did not impact simulation of total streamflow, or the achieved efficiencies of streamflow simulation; 2. impacted the internal apportionment of water, thereby impacting hydrograph separations; 3. impacted model parameterization, and therefore simulation uncertainty; and 4. impacted the variability of simulated δ 18 O SF , most noticeably when event compositions differed significantly from streamflow composition (e.g.snowmelt and large rainfall events).
Of the 30 000 simulations performed for each precipitation isotope product forcing, only 10 % or less were behavioural for each input type.Due to the wide range of behavioural parameter values (Table S1), however, we are confident that the approach used was sufficient to characterize parameter uncertainty.Not unexpectedly, this finding also indicates that 30 000 model evaluations were not sufficient to quantify parameter identifiability in this study.Although total simulated streamflow was not significantly affected by choice of δ 18 O ppt input, δ 18 O SF simulations and the internal apportionment of water (surface flow, interflow, and baseflow) were significantly impacted here.Significant differences in internal water apportionment among the models were diagnosed via δ 18 O uncertainty.Variation between models was greatest during the freshet period, where significantly different snowpack compositions were simulated among the different precipitation isotope products.The highest-resolution (REMOiso, daily) input resulted in significantly different parameter distributions and seasonal hydrograph separations than the other two (monthly and semiannual) inputs.These findings have direct implications for hydrograph separation, and simulated water transit times.In this study, we found that choice of δ 18 O ppt input directly impacted model parameterization, and for this reason, studies should account for both input and parameter uncertainty.Also highlighted was the significance of the snowpack and melt dynamics in tracer-aided models applied in high-latitude regions, resulting in high seasonal uncertainty and indicating more research is warranted to improve process representation.Use of a tracer-aided model afforded an examination of internal model dynamics resulting from specific parameterizations, allowing us to assess the realism of individual simulations as opposed to their efficacy alone.
This study demonstrated that direct quantification of model equifinality was possible using tracer-aided models, and furthermore, we demonstrated that this equifinality was not diagnosable via simulation of streamflow.We have achieved an understanding of how tracer-aided models, like isoWATFLOOD, can be used in data-sparse regions, with limited input data (including δ 18 O ppt observa-tions), and that despite these limitations, these models can still be of value.Regarding the usefulness of precipitation isotope products in regions with limited observations, both the static and REMOiso inputs require existing δ 18 O ppt observations (i.e. from CNIP) to either define or bias-correct the input, limiting their use for certain applications.If these data are not available, the KPN43 input provided reasonable results without the need for additional observations.The existence of CNIP (and other precipitation isotope networks) was crucial to the development, validation, and bias correction of existing δ 18 O ppt products.Attaining an understanding of how δ 18 O ppt input uncertainty impacts simulated model output is important when calibrated models are used to assess climatedriven or land-use-driven impacts on streamflow in remote, data-sparse, high-latitude regions.
For use in tracer-aided modelling, precipitation isotope products should capture both the event-based variability and seasonality of precipitation isotopes to reproduce realistic δ 18 O SF variability.Higher-resolution δ 18 O ppt inputs (e.g.REMOiso, daily) were able to capture event-specific compositions that, when significantly different from δ 18 O SF , tended to cause significant deviations from the δ 18 O SF derived from monthly and semi-annual (i.e.static) inputs.Unfortunately, we could not verify the correctness of the higher-resolution product (i.e.REMOiso) in this study due to the sporadic sampling of isotopes in precipitation observations.Static and seasonal precipitation isotope products missed eventspecific isotopic variation occurring as a result of heavy rainfall events, which require increased temporal resolution (e.g.daily δ 18 O ppt inputs from REMOiso; but perhaps weekly input would suffice) to resolve rainfall event compositions.In seasonal environments, precipitation isotope products must capture the transition from rainfall to snowfall, and from snow accumulation to snowmelt to sufficiently model δ 18 O SF .In this study, both static and monthly inputs adequately captured δ 18 O SF variability at the basin outlet, perhaps the result of the unique seasonality in high-latitude regions.Spatial variability was detected across the study region in δ 18 O ppt inputs, and can be represented by distributed tracer-aided models, like isoWATFLOOD.There is reason to suspect that the variability in (both spatial and temporal) precipitation isotope inputs influences model parameterization; therefore, spatial variability should be preserved to derive the most representative model of a given region.This work highlighted the power of tracer-aided modelling to inform and quantify equifinality in hydrological simulation, helping modellers to work towards reducing modelling uncertainty.Although more work is required to assess and understand parameter identifiability, our analysis showed that selection of precipitation isotope (δ 18 O ppt ) product had direct implications for model parameterization, and that input uncertainty should be considered in future studies.
Hydrol.Earth Syst.Sci., 21, 2595Sci., 21, -2614Sci., 21, , 2017 www.hydrol-earth-syst-sci.net/21/2595/2017/ 6 Future directions Distributed hydrological models, such as WATFLOOD, are complex with large numbers of parameters; therefore, it is important as a community to work toward conducting comprehensive studies that focus on input data uncertainty and parameter identifiability.In the tracer-aided modelling community, this includes uncertainty from precipitation isotope products and their varying spatial and temporal resolutions.Ideally, further studies should be conducted in wellinstrumented basins where δ 18 O ppt input can be better characterized using observed data at higher spatial, and most importantly, temporal resolutions.Several key questions warranting more detailed investigation include the following: (1) are precipitation isotope products adequate alternatives in place of δ 18 O ppt observations; (2) are there specific subsets of model parameters that are more sensitive to choice of precipitation isotope product; and (3) how do (if at all) parameters compensate for compounding model error?Unfortunately, at least within Canada, a well-instrumented watershed at the regional scale does not yet exist, pointing to the importance of implementing additional (or enhancing current) iso-hydrometeorological monitoring networks.Not unexpectedly, the RCM-driven precipitation isotope product in this study, REMOiso, exhibited some bias and needed correction prior to application.More studies are needed to better understand the nature of this bias, and the most appropriate bias-correction methods, which can be done using observations from the CNIP database at a monthly resolution.Due to the lack of high-resolution δ 18 O ppt observations in Canada, however, daily or weekly validation is not yet possible.Additionally, the suitability and performance of other isotope-enabled RCMs for use in Canada, and elsewhere, should be explored.
Lastly, as a tracer-aided hydrologic community we need to push for the sustained monitoring of isotopes in precipitation and streamflow that are required to inform our models and improve uncertainty assessment.This study elucidated the impact that discontinuous observations can have on quantifying model uncertainty, which would only be further exasperated by the absence of observations altogether.In Canada, a concerted effort is needed by the government to protect and sustain our observation networks, which are required for improved prediction in remote regions for climate and hydrologic change detection.
by a simple storage-discharge retn Upper zone retention (mm) relation: DUZ = REC • (UZS-RETN) • Si, where UZS = upper zone storage DUZ = depth of upper zone storage released as interflow, and Si = internal land surface slope.ak2 Recharge coefficient (bare ground) Upper zone to lower zone drainage is represented by a simple storage-discharge relation: DRNG = AK2 • (UZS − RETN), where DRNG is the drainage from UZS to LZS. mf Melt factor (mm • C −1 h −1 ) M = MF(T a − base) base Base temperature ( • C) Anderson (1976) sub Sublimation factor Sublimation is modelled by a static sublimation factor.Amount of sublimation is a fraction of the observed snowfall.For new model set-ups, the sublimation factor has been replaced by a static sublimation rate.

Figure 2 .
Figure 2. Comparison of raw and corrected REMOiso δ 18 O ppt output with CNIP monthly compositions at Snare Rapids, NWT.

Figure 3 .
Figure 3. Input and behavioural simulations for Jean-Marie, including (a) KPN43, REMOiso, and static δ 18 O ppt input time series and daily precipitation; and simulated (b) mean daily streamflow and uncertainty bounds and (c) mean daily δ 18 O SF and uncertainty bounds, for KPN43, REMOiso, and static driven model calibrations.δ 18 O ppt input-specific uncertainty bounds are represented as the shaded regions, with shading colour corresponding to δ 18 O ppt type.

Figure 4 .
Figure 4. Input and behavioural simulations for Blackstone, including (a) KPN43, REMOiso, and static δ 18 O ppt input time series and daily precipitation, and simulated (b) mean daily streamflow and uncertainty bounds and (c) mean daily δ 18 O SF and uncertainty bounds, for KPN43, REMOiso, and static driven model calibrations.δ 18 O ppt input-specific uncertainty bounds are represented as the shaded regions, with shading colour corresponding to δ 18 O ppt type.

Figure 5 .
Figure 5. Precipitation-weighted δ 18 O of snowpack (δ 18 O SNW ) for KPN43, REMOiso, and static inputs from January to the end of melt for each year of the study period.Snow water equivalent (SWE), snowfall (gray line), and rainfall (blue line) are also shown.δ 18 O ppt inputspecific uncertainty bounds are represented as the shaded regions.Diamond symbols represent δ 18 O SNW observations sampled within each respective sub-basin during the GEWEX campaign.

Figure 6 .
Figure 6.Percent seasonal volume contributing to total streamflow from surface runoff, interflow, and baseflow storages for each season.Cross symbols represent the 5th and 95th percentiles for each forcing method, and circle symbols signify the mean values.The combined uncertainty bounds representing the 5th and 95th simulations from all three δ 18 O ppt input types are shaded in gray.

Figure 7 .
Figure 7. Probability distributions for selected parameters (Table 5), as indicated in the bottom right corner of each panel.Parameters are from behavioural simulations, and (a), (b), (e), and (f) have been weighted to the land cover distribution within Jean-Marie and Blackstone, as outlined in Table 1.(c) and (d) are river class parameters within Jean-Marie, and (g) and (h) contain river class parameters for Blackstone.

Table 1 .
Basin characteristics, including land cover classification, area, and average basin slope (recreated from data provided in St Amour et al., 2005).
Figure1.Fort Simpson River Basin (all other tributaries of the Liard and Mackenzie rivers have been removed for ease of viewing).

Table 2 .
Data summary for the study period (SP) and period of record (PoR).The coefficient of variation (CV) is calculated as the ratio of the standard deviation to the mean.
* Provided only for the study period, 1997-1999.n/a: not applicable.

Table 3 .
Static δ 18 O ppt input compositions of annual rainfall and snowfall oxygen-18 for isoWATFLOOD.

Table 4 .
Parameters included in the Monte Carlo calibration, alongside a description of what the parameter represents and the algorithm it is used within.
Samani (1982).f-ratio is a multiplier for the interception capacity for each land class.akSurfacepermeability (bare ground) Conceptual infiltration algorithm (similar to Green and Ampt, akfs Surface permeability 1911), but based on Richard's equation, which is physically based

Table 5 .
Average simulation statistics from n behavioural simulations for streamflow and δ 18 O SF for the three model calibrations (using KPN43, REMOiso, and static inputs).
Spatial variation among sub-basins is noted in the KPN43 and REMOiso products.Both the KPN43 and REMOiso products show, on average, more depleted δ 18 O ppt values within Blackstone (−20.79 and −22.01 ‰, respectively)in comparison to Jean-Marie (−20.17 and −21.54 ‰, respectively), likely caused by the higher-elevation headwaters of Blackstone relative to Jean-Marie (a maximum difference of ∼ 215 m).FigureS1provides seasonally averaged, spatially distributed maps for www.hydrol-earth-syst-sci.net/21/2595/2017/