University of Birmingham Can spatial statistical river temperature models be transferred between catchments?

. There has been increasing use of spatial statistical models to understand and predict river temperature ( T w ) from landscape covariates. However, it is not ﬁnancially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses T w data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum T w ( T w max ) within and between catchments. Models were ﬁtted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature ( T a ) metric (LS_ T a models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of T w max . However, the predictions were biased when mean T w max differed between catchments. The RNS was needed to characterise and predict ﬁner-scale spatially correlated variation. Because the RNS was unique to each catchment


Introduction
River temperature (T w ) is a key control on the health of aquatic systems (Webb et al., 2008) and is particularly important for the growth, survival and demographic characteristics of cold-water-adapted species such as salmonids (Elliott and Elliott, 2010;Gurney et al., 2008;Jonsson and Jonsson, 2009;McCullough et al., 2001).Rising T w will influence fish populations by altering the thermal suitability of rivers (Comte et al., 2013;Isaak et al., 2010Isaak et al., , 2012)).Thus, models that can (1) identify areas most affected by thermal extremes, (2) improve understanding of spatiotemporal variability of thermal regimes, (3) predict the potential effects of climate change and (4) illustrate opportunities for thermal moderation, such as riparian tree planting (Hannah et al., 2008;Hrachowitz et al., 2010), are important for fisheries Published by Copernicus Publications on behalf of the European Geosciences Union.F. L. Jackson et al.: Can spatial statistical river temperature models be transferred between catchments?management.Large-scale models are required to provide information at the spatial scales appropriate to management decisions, i.e. catchment (Chang and Psaris, 2013;Hrachowitz et al., 2010;Imholt et al., 2011Imholt et al., , 2013;;Jackson et al., 2017b;Steel et al., 2016), regional (Hill et al., 2013;Isaak et al., 2012;Ruesch et al., 2012) and national scales.
Although process-based models provide important mechanistic understanding at small spatial scales, their intensive data requirements prohibit their use at larger scales (Jackson et al., 2016).In contrast, empirical models of T w rely on T w observations and explanatory covariates (e.g.altitude or air temperature) which can often be derived remotely at relatively low cost.The development of affordable, reliable, accurate T w data loggers has led to a rapid increase in T w monitoring (Sowder and Steel, 2012), to the point that staff time, data storage and quality control are often now the greatest limitations on data collection (Jackson et al., 2016).At the same time, there have been substantial developments in spatial statistical modelling approaches (Ver Hoef et al., 2006, 2014;Ver Hoef and Peterson, 2010;Isaak et al., 2014;Jackson et al., 2017b;O'Donnell et al., 2014;Peterson et al., 2013;Rushworth et al., 2015), monitoring network design (Dobbie et al., 2008;Jackson et al., 2016;Som et al., 2014), spatial datasets (e.g.shapefiles incorporating covariates such as in "The National Stream Internet Project" (Isaak et al., 2011) or gridded air temperature datasets (Perry and Hollis, 2005a, b)) and spatial analysis tools (Isaak et al., 2011(Isaak et al., , 2014;;Peterson et al., 2013;Peterson and Ver Hoef, 2014).
While continuous river temperature data are routinely collected in some areas, resulting in large regional temperature datasets and associated models (e.g.Wehrly et al., 2009;Moore et al., 2013), this is far from universal.In many cases, financial and logistical considerations limit data collection, making it impractical to monitor all rivers.For example, in Scotland, there are 16 006 river catchments (unique rivers running to the sea), including 629 catchments > 10 km 2 , but there was no systematic nationwide quality-controlled river temperature data collection until 2015 (Jackson et al., 2016).For management purposes, it is therefore often necessary to predict T w at unmonitored locations, both within (Hrachowitz et al., 2010;Jackson et al., 2017b;Peterson and Urquhart, 2006) and between catchments (Isaak et al., 2014).In recent years, it has become increasingly common to develop and apply spatial statistical river network models that incorporate network covariance structure to predict spatial variability in river temperature (e.g.Isaak et al., 2014;Jackson et al., 2017b).It is widely acknowledged that these models can dramatically improve predictions of river temperature where sufficient observational data exist, but the covariance component of the predictions cannot typically be transferred between catchments.Despite these important issues, there has not yet been an assessment of the transferability of spatial statistical T w models between catchments; i.e. the ability of a model developed in one catchment to predict T w in another.This paper investigates the ability of spatial statistical T w models to predict T w at unmonitored locations within and between catchments.
The principles explored in this paper are likely to be relevant to other water temperature metrics so, for brevity, this study focuses on maximum summer temperature, a metric which is prevalent in the recent literature, reflecting its importance for the survival of cold-water-adapted fish (Chang and Psaris, 2013;Hrachowitz et al., 2010;Jackson et al., 2017b;Malcolm et al., 2008;Marine and Cech, 2004).
Models are fitted using two sets of covariates.The first set contains landscape covariates which can be generated from readily available spatial datasets and have been the focus of many previous studies of spatial variability in river temperature (e.g.Hrachowitz et al., 2010).Due to increasing interest in the use of air temperature (T a ) to predict spatial variability in water temperature (e.g.Jonkers and Sharkey, 2016), the second set contains a metric of air temperature in addition to landscape covariates.
The paper addresses the objectives to 1. develop statistical models for predicting maximum summer water temperature from landscape covariates in four separate river catchments, 2. determine whether models containing an air temperature metric explain more of the variation in maximum summer water temperature than those only containing landscape covariates, 3. assess the transferability of models containing only landscape covariates or both landscape and air temperature covariates between catchments and 4. produce single models of maximum summer water temperature for all four catchments using both sets of covariates and consider their potential for transferability at larger (e.g.national) scales.

Methodology
2.1 Water temperature data and metric T w data were obtained from monitoring sites in four catchments: the Bladnoch in western Scotland, and the Dee (Aberdeenshire), Spey and Tweed in eastern Scotland (Fig. 1).These catchments are Special Areas of Conservation for Atlantic salmon and form part of the Scotland River Temperature Monitoring Network (SRTMN) (Jackson et al., 2016).Details of the network, including design and quality-control procedures, are given in Jackson et al. (2016).The catchments all contain an adequate number of T w data loggers to develop T w models on a single-catchment basis, with 59, 34, 25 and 19 sites in the Dee, Tweed, Spey and Bladnoch, respectively.The choice of catchments ensured a broad geographic coverage across Scotland with a wide environmental range of landscape covariates (Jackson et al., 2016).Data were collected at 15 min intervals throughout August 2015.The maximum temperature was calculated for each day and used to produce a 7-day rolling mean of maximum temperatures.The metric of maximum temperatures used in this study (T w max ) was the maximum value of this 7-day rolling mean (Jackson et al., 2017a).This metric was preferred to a single observation of T w , as it characterises the occurrence of sustained high temperatures which are thought to be most ecologically damaging.

Model covariates
Detailed discussion of the landscape covariates and their calculation can be found in Jackson et al. (2016).In brief, the covariates were elevation (Elevation), upstream catchment area (UCA), percentage riparian woodland (%RW), hillshading/channel illumination (HS), channel width (Width), channel gradient (Gradient), channel orientation (Orientation), distance to coast (DC) and distance to the sea along the river (RDS).Table 1 summarises how the covariates were calculated.Before model fitting, gradient, UCA and width were log transformed to reduce skewness, and HS was centred by subtracting the median value from all observations.An air temperature metric (T a max ) was calculated for each site from the gridded UKCP09 T a dataset (available from the UK MET Office); see Perry and Hollis (2005a, b) for details of this dataset.Analogous to the calculation of T w max , T a max was given by the maximum of the 7-day rolling mean of daily maximum air temperatures in August 2015.
Figure 2 illustrates the distribution and correlation among covariates included in the single-or multi-catchment models (excluding strongly correlated (> 0.8) covariate pairs; see below for details) for each of the four catchments and for the global (four-catchment) dataset.

Modelling
A total of ten models of T w max were developed: two models for each of the four river catchments using either (1) landscape covariates only (LS models) or (2) landscape covariates and T a max (LS_T a models) and two models for all four catchments combined, again using either (1) landscape covariates only (multi-catchment LS model) or (2) landscape covariates and T a max (multi-catchment LS_T a model).The modelling process differs slightly between the single and multi-catchment models and these are described in turn.All analysis was done in R version 3.2.3(R Core Team, 2015).

Single-catchment models
The set of covariates was first reduced to avoid problems of collinearity.If two covariates were strongly correlated (Pearson correlation coefficient of more than 0.8) in any one catchment, one of the covariates was dropped from the set available for modelling for all catchments.This ensured all the LS models were based on a common set of covariates (UCA, %RW, HS, Orientation, DC) as were the LS_T a models (T a max , UCA, %RW, HS, Orientation, DC).
The relationship between T w max and the covariates was explored using generalised additive models (GAMs) with Gaussian errors and an identity link (Wood, 2001).A "full" model was first fitted, which included all the available covariates from the reduced dataset and a river network smoother (RNS) (see below).The model can be conveniently written in R formula syntax as where n is the number of covariates (n = 5, 6 for LS, LS_T a models, respectively), s(covariate i ) denotes that covariate i was fitted as a smoother, and an intercept is included by default.The amount of smoothing was estimated from the data (Wood, 2001), with each smoother constrained to have at most 2 degrees of freedom (df) based on the expected sim- Gradient "extract" function in the "raster" package (Hijmans, 2015) to get elevations of the node and a location 1 km upstream.The difference in these elevations divided by the length between the two nodes provided gradient.The length upstream was calculated using "SpatialLinesLengths" from "sp" (Pebesma and Bivand, 2005).
OS. Terrain 10 m DTM, CEH DRN (Moore et al., 1994) Orientation Standard trigonometry based on the x and y locations of the node and associated upstream points 1 km upstream lengths.The lengths upstream were calculated using "SpatialLinesLengths" from "sp" (Pebesma and Bivand, 2005).
CEH DRN (Moore et al., 1994) Upstream catchment area (UCA) Arc Hydro Tools (ArcGIS 10.2.1) was used to "burn in" the DRN to the DTM and then calculate a UCA raster.
OS. Terrain 10 m DTM; CEH DRN (Moore et al., 1994) Hillshading/illumination (HS) "terrain" and "hillShade" functions in the "raster" package (Hijmans, 2015) were used to create a hillshade layer for every hour the sun was above the horizon.These layers were then summed to create a single layer of maximum potential exposure.HS values for the nodes were an average of the raster grid cells in the 1 km river polygon.
Raster grid cells were weighted by the proportion of the cell within the buffer.
OS MasterMap, CEH DRN (Moore et al., 1994) Width Width was calculated by finding the area classified as water within the 1 km upstream and dividing this by the distance upstream.Areas were calculated using "gArea" from "rgeos" (Bivand and Rundel, 2016) and lengths the "SpatialLi-nesLengths" from "sp" (Pebesma and Bivand, 2005).
CEH DRN (Moore et al., 1994), OS Panorama coastline Highest 7-day average maximum August T a (T a max ) Take the T a value, from daily maximum predicted T a matrix, for each cell containing a SRTMN site.
Use these daily values to calculate rolling averages, then select the highest for each site.
Gridded UKCP09 predicted T a dataset (UK MET Office, 2015) plicity of T w max responses to the covariates.The RNS is included to account for spatial structure in the data that cannot be explained by the covariates.The RNS is a modified version of that developed by O'Donnell et al. ( 2014), with the amount of smoothness at a confluence controlled by the proportional influence of upstream tributaries weighted by Strahler river order (Strahler, 1957) and fitted using a set of "reduced rank" basis functions; see Jackson et al. (2017b) for full details.The RNS was allowed up to 7 df based on knowledge of RNS complexity for the Spey (Jackson et al., 2017b).
To ensure the RNS did not account for variability that would otherwise be explained by covariates, RNS basis functions were excluded if they were strongly correlated (> 0.8) with any of the covariates.Thus, base 1 was removed from the F. L. Jackson et al.: Can spatial statistical river temperature models be transferred between catchments?
Spey and Dee RNSs due to correlations with DC.In the LS_T a models, base 2 was also removed from the Spey RNS due to correlation with T a max .The model was fitted by maximum likelihood using the "mgcv" package (Wood, 2016) in R.
All possible subsets of the full model were then fitted.The final model was that with the lowest Bayesian information criterion (BIC) or Akaike information criterion corrected for small sample size (AICc) that contained no terms significant at the 5 % level.The choice of information criterion was based on the desire to penalise more complex models that would be unlikely to transfer well (Millidine et al., 2016).Thus, BIC was used for the Dee and Tweed where there were more sites, and AICc was used for the Bladnoch and Spey where there were fewer sites.Terms in the final model with 1 df were replaced by linear terms.

Multi-catchment models
Covariates were excluded if they were strongly correlated (> 0.8) across the entire multi-catchment dataset.The reduced set of covariates was elevation, UCA, %RW, HS, gradient and orientation for the LS model, and T a max , UCA, %RW, HS, gradient and orientation for the LS_T a model.The RNS basis functions were the same as those included in the singlecatchment models.
A "starting" model was fitted of the following form: where Catchment is a categorical variable allowing a different mean level for each catchment and RNS : Catchment denotes a separate RNS for each catchment.The covariate smoothers were given a maximum of 2 df and the RNS a maximum of 7 df for each catchment.The model was then refined in a backwards and forwards stepwise procedure which considered (a) replacing smooth covariate effects by linear terms and then dropping them altogether; (b) dropping the RNS by Catchment term altogether; and (c) adding interactions between the covariates (either linear or smoothed) and Catchment.An interaction between a covariate and Catchment would indicate inter-catchment differences in the relationship between T w max and the covariate, suggesting that the model might not transfer well to new catchments.Interactions between the covariates were not considered.Model selection was based on BIC.Finally, any non-significant terms (p > 0.05) in the final model were removed.

Model performance and transferability of single-catchment models
The ability of single-catchment models to predict T w max within the catchment they were developed (the donor catchment) was assessed using leave-one-out cross validation.Each site was removed in turn, the final model was refitted, and then T w max was predicted at the missing site using (a) using all model terms (i.e. the covariates and the RNS if present) and (b) only covariates (i.e.excluding the columns in the model matrix relating to the RNS).The prediction using all model terms should outperform that using only covariates because it incorporates the extra information about spatial structure that is captured by the RNS.However, a RNS from one catchment cannot be used to predict in another because the river networks will differ.The prediction using only covariates therefore provides a benchmark for assessing the transferability of models between catchments, since it measures how well a model will transfer to a catchment that is identical in all but its river network.
Transferability to another catchment (the target catchment) was assessed by using the model from the donor catchment to predict T w max at the monitoring sites in the target catchment.As RNSs cannot be transferred, only covariates were used in the predictions (i.e. the columns in the model matrix due to the RNS were ignored).

RMSE
where x s and xs are the observed and predicted T w max at site s, x and x are the mean observed and predicted T w max in the catchment and n is the number of sites in the catchment.Standard deviation was used rather than variance, so that all three metrics are on the same scale and can be compared.Model performance was also illustrated by plotting observed T w max against predicted values and comparing this to a 1 : 1 line.Points close to the 1 : 1 line indicate precise unbiased predictions, points consistently displaced above or below the line indicate biased predictions, and high scatter about the line indicates imprecise predictions.The consequences of predicting outside of the environmental range of a given model were shown by coding sites as "in" or "out" of range.Figure 1 shows the spatial variability in T w max across the four catchments and Fig. 2 summarises the distribution of T w max by catchment (bottom left diagonal panel).Median T w max in the Dee (15.1 • C), Tweed (15.6 • C) and Spey (15.6 • C) was broadly similar, but median T w max in the Bladnoch (16.4 • C) was approximately 1 • C higher (Fig. 2).The range of T w max was 5.7, 5.9, 6.0 and 5.5 • C in the Bladnoch, Dee, Spey and Tweed, respectively (Fig. 2).

Single-catchment models
All four LS models were simple (Table 2), explained much of the variance in T w max (76.6-85.6 %) and contained similar positive relationships between T w max and UCA (Fig. 3).This relationship was near linear until approximately 100 km 2 and then levelled off in the Bladnoch (Fig. 3d), and it was smooth, but near linear in the Spey and the Tweed (Fig. 3b, c) and linear in the Dee (Fig. 3a).The magnitude of the effect was similar across catchments at approximately 4 • C. Three models contained a RNS, which explained much of the variance: 61.7, 13.9 and 63.7 % in the Dee, Tweed and Spey, respectively (Table 2).The Tweed model also had a negative linear effect of %RW.
The LS_T a models always had a better BIC/AICc than the corresponding LS models but were typically more complex, always required more df and only explained a greater percent of the variance in the Bladnoch and the Tweed (Table 2).For the Tweed, the LS_T a model used only covariates, whereas the LS model required a RNS to account for unexplained spatial structure.For the Bladnoch, the LS_T a model included UCA and T a max , whereas the LS model only included UCA.
In common with the LS models, UCA was in all the LS_T a models (Table 2) and the direction, shape and magnitude of the effects were consistent with the LS models (Fig. 4, top row).T a max was in all the LS_T a models except the Spey (Table 2).There was a positive linear relationship between T w max and T a max in the Dee and Tweed (Fig. 4e, f) and a U-shaped response in the Bladnoch which is physically implausible, increasingly so when extended beyond the range of T a max observed in the Bladnoch (Fig. 4g).Orientation had a small positive effect on T w max in both the Dee and Tweed (Fig. 4h, i) with higher temperatures for a north-south orientation than an east-west orientation.There was also a negative linear effect of %RW and a positive smoothed effect of HS in the Tweed and a positive linear effect of DC in the Spey (Fig. 4j, k, l, respectively).

Transferability of single-catchment models
The transferability of the LS and LS_T a models is summarised by their RMSE, bias and standard deviation in Table 3 and illustrated in Figs. 5 and 6, respectively.All the models performed well within catchments (i.e. in the catchments where they were developed) when all model terms (i.e. both covariates and the RNS) were used in the predictions, with a bias of < 0.1 • C in absolute value and a RMSE of < 1 • C. The LS_T a models always had a lower RMSE than the LS models.As expected, within-catchment predictions were poorer when only the covariates were used (exclud- The rest of this section focuses on the predictions, both within and between catchments, using only the covariates.For the catchments in eastern Scotland (Dee, Tweed and Spey), the RMSE, bias and standard deviation of any model were broadly similar whether they were used to predict for the donor catchment or for the other two target catchments.The RMSE of the LS models tended to be lower than that of the LS_T a models (median 1.3 and 1.7 • C, respectively).The LS and LS_T a models both had median absolute biases of 0.3 • C and median standard deviations of 1.1 and 1.4 • C, respectively.RMSE is a combination of bias and standard deviation, so the RMSE of both sets of models was generally dominated by the standard deviation.
Predictions involving the Bladnoch, either as a donor or target catchment, tended to be poor.The Bladnoch is in western Scotland and was warmer than the other catchments Hydrol.Earth Syst.Sci., 21, 4727-4745, 2017 www.hydrol-earth-syst-sci.net/21/4727/2017/   (Fig. 2).The Bladnoch models always overpredicted T w max in the other catchments and the Dee, Tweed and Spey models all underpredicted T w max in the Bladnoch (Figs. 5, 6).This often led to substantial bias and hence RMSE.The Bladnoch LS_T a model had the largest biases, which were also due to the implausible relationship with T a max (Fig. 4g).The Dee, Tweed and Spey had reasonable standard deviations when transferred to the Bladnoch (median 1.0 and 1.1 • C for the LS and LS_T a models, respectively) which suggests that, despite having poor RMSE, the models still could be used to predict areas of relatively high or low T w max within the Bladnoch (rather than absolute values of T w max ).The same is true of the Bladnoch LS models when transferred to the Dee, Tweed and Spey (median standard deviation 1.3 • C).However, the Bladnoch LS_T a model had a high standard deviation (median 3.3 • C) when transferred to the Dee, Tweed and Spey, again due to the implausible relationship with T a max .

Multi-catchment models
The multi-catchment LS model included Catchment, UCA, %RW, Elevation and a RNS for each catchment (Table 4).
By fitting a single model to all four catchments, it was possible to assess whether covariate effects were consistent across catchments and thus transferable to new catchments or regions.None of the covariates interacted with Catchment.
The Catchment effect indicates inter-catchment differences in mean T w max having accounted for the landscape covariates; in particular, higher T w max in the Bladnoch (Fig. 7d).
In common with the single-catchment LS models, there was a positive smooth relationship between T w max and UCA with an effect size of approximately 3 • C (Fig. 7a).There was also a negative linear relationship between T w max and both %RW and Elevation, with effect sizes of approximately 1 and 2    explain less of the variance than in the single-catchment models (Tables 3, 4).The multi-catchment LS_T a model explained 83.2 % of the variance and contained Catchment, UCA, %RW, T a max and a RNS for each catchment (Table 4, Fig. 8).None of the landscape covariates interacted with catchment.However, the T a max relationship did interact with catchment (Fig. 8ad), with positive relationships in the Dee and Tweed, and negative (albeit non-significant) relationships in the Spey and Bladnoch.This suggests that relationships with T a max are non-transferable and T a max would not be a good predictor of T w max in new catchments.

Discussion
Even with the introduction of relatively cheap and accurate data loggers, it is not financially or logistically possible to monitor everywhere.Consequently, there is a need to develop models to understand and predict river temperatures at large spatial scales to inform evidence-based management of rivers and fisheries even where extensive local temperature data collection does not exist.Spatial statistical models offer great promise in this respect.However, to date, the transferability of these models has not been considered.This study fitted separate models of T w max to data from four catchments and transferred these models between catchments.Models containing only landscape covariates typically contained similar covariates and covariate responses, and performed better than models containing T a max when transferred between catchments.A physically implausible model transferred particularly poorly.The covariates alone often explained much less of the spatial temperature variability than when a RNS was added but provided the only means of predicting temperature in new catchments with no or limited data (a minimum of 19 loggers was required to fit the full models including covariates and RNS).A single model fitted to all four catchments suggested common responses to landscape covariates but inter-catchment differences in mean temperature and in the relationships between T w max and T a max .These findings are discussed in more detail below.

T w max responses to landscape covariates
The single-catchment LS models contained similar covariates with comparable effect sizes and response shapes which suggested that transferability between catchments could be reasonably successful.This was confirmed by the lack of significant interactions with Catchment in the multi-catchment model.However, when there are inter-catchment differences in mean temperature, the models might only be good predictors of relative values of T w max within a new catchment (i.e.areas of higher or lower T w max ) rather than absolute values.It is also unclear how well the models would perform in years with differing hydroclimatic characteristics.This study was conducted in a single year with relatively low temperatures and high flows.In a hotter, drier year it might be expected that between-site differences would be greater.Under such circumstances, the current models may not provide accurate predictions of absolute temperatures or inter-site differences without refitting.
All of the T w max responses to landscape covariates (across all models) were physically plausible and hence broadly transferable (Smith et al., 2016).UCA (which was in all the models) is a proxy for discharge, water volume and thermal capacity (Chang and Psaris, 2013;Hannah et al., 2008).Higher UCAs are generally associated with larger water volumes which have a greater thermal capacity, taking longer to warm but also retaining heat for longer (Chang and Psaris, 2013;Imholt et al., 2011).Elevation reflects adiabatic lapse rates which reduce temperatures with increasing altitude (Hrachowitz et al., 2010;Jackson et al., 2017b).The negative relationship between T w max and %RW woodland occurs because riparian shading reduces the amount of incident shortwave radiation reaching the river during daylight hours (Garner et al., 2014;Hannah et al., 2008;Moore et al., 2005).The positive relationship between T w and HS is consistent with greater T w in locations with lower shading effects and greater direct shortwave contributions (illumination).T w was greatest in channels characterised by a north-south orientation which typically experience maximum exposure to incoming radiation (Malcolm et al., 2004).Increasing T w with distance from the coast reflected continentality and the differ- ing specific heat capacities of land and sea, specifically thermal buffering of relatively cooler sea during summer months (Chang and Psaris, 2013;Hrachowitz et al., 2010).

T w -T a relationships
In contrast to the LS models, one LS_T a model included a physically implausible relationship that would not be ex-pected to transfer well (Smith et al., 2016).Specifically, this was an inverse modal relationship between T w max and T a max in the Bladnoch model.This relationship could have arisen because of spatial variability in local (temporal) T w max -T a max relationships due to controls that were not captured by our covariates, coupled with a relatively small air temperature range (1.7 • C) and sample size (19 sites).Nevertheless, even where the relationships between T w max and T a max were plausible, they were inconsistent between catchments in terms of effect size and direction, as indicated by the varying responses in the single-catchment models and the interaction with Catchment in the multi-catchment model.For example, in the latter, the relationships were positive in the Dee and Tweed, and negative in the Spey and Bladnoch (where the relationship simplifies to linear).Given the number of previous studies that have predicted T w from T a within sites over time (temporal models), between sites (spatial models) or both (e.g.two-stage spatiotemporal models), it may appear surprising that models containing T a max gave poorer predictions of betweencatchment temperature variability than those containing landscape covariates alone in this study.However, previous spatial models of T w incorporating air temperature as a predictor (e.g.Wehrly et al., 2009;Moore et al., 2013) have focused on the ability of these models to predict within the data space (interpolate), while this study investigated the ability of models to predict outside of the data space (extrapolate).Indeed, within our multi-catchment model, it would have been possible to force a single T w max -T a max relationship that reflected an average response across catchments.However, this would result in biased estimates of T w max within individual catchments.
The ability of T a max to predict spatial variability in T w max is likely to degrade where the temporal relationships between T w and T a vary spatially, within and between catchments (Kelleher et al., 2012;Segura et al., 2015).It is expected that within-catchment (between-site) variability in the temporal relationships between T w and T a would add noise to any spatial relationships, making them harder to detect and reducing the overall precision of any predictions.Systematic differences in T w -T a relationships between catchments would result in biased predictions when models are transferred between rivers or regions.
Many studies have shown that within-year temporal relationships between T w and T a (often termed thermal or climate sensitivity) can be highly variable between sites and catchments, and importantly this variability relates to regional, hydrological and landscape controls (Tague et al., 2007;Kelleher et al., 2012;Krider et al., 2013;Chang and Psaris, 2013;Hilderbrand et al., 2014;Segura et al., 2015;Mauger et al., 2017).For example, Fellman et al. (2014) observed slopes of between −0.180 and 1.282 across nine watersheds in Alaska depending on glacial influence, while Mauger et al. (2017) observed slopes of between 0.32 and 1.51 across 48 non-glacial streams in Alaska which related to mean elevation and catchment area.Similarly, Tague et al. (2007) observed systematic regional differences in T w -T a relationships in western Oregon that depended on local hydrogeology and concluded that under such circumstances air temperature alone (i.e.consistent with a single air temperature coefficient) would be unlikely to explain spatial variation in river temperatures.Given the reported spatial variability in T w -T a relationships and, importantly, that these relationships can vary systematically within and between catchments depending on other controls (e.g.hydrogeology), it is unsurprising that models containing T a max do not substantially improve predictions of the spatial variability in T w max than models containing landscape variables alone, and that transferred models result in biased T w predictions.If T a is to substantially improve predictions of T w in static spatial models (such as those presented in this study), then it is likely that they would need to include greater model complexity, e.g.allowing for interactions between T a max and landscape covariates (e.g.Mayer, 2012).

The importance of RNS
The performance of the single-catchment LS and LS_T a models in this study compared favourably to regional models of T w max (Moore et al., 2013;Roberts et al., 2013;Wehrly et al., 2009) when predictions were made for the catchment in which the models were developed (i.e.interpolation).For the models that included a RNS, RMSE (0.7-0.9 • C) was approximately half that reported by previous studies, although it should be noted that these studies were conducted at considerably larger spatial scales (Moore et al., 2013;Roberts et al., 2013;Wehrly et al., 2009).The RMSE of models without a RNS (0.9-1.8 • C) was generally similar or slightly better than reported by other studies.
The landscape covariates included in the models in this study explained large (catchment)-scale trends in T w max but were less successful at explaining variability at finer spatial scales.For example, the approximately 20 % variance explained by UCA in the Spey and Dee models is consistent with the 18-25 % of T w variability explained by discharge in Arora et al. (2016).Smaller-scale variability tends to reflect drivers such as water residence time (and heat advection), water sources (Brown et al., 2006;Brown and Hannah, 2008), channel incision, gradient (Jackson et al., 2017b) and land use (Imholt et al., 2013) which are harder to accurately characterise from spatial datasets.In the absence of accurate local-scale characterisation of landscape controls, smallerscale spatial variability is modelled by the RNS.However, whilst the RNS improves predictions within catchments, it is not transferable, so it does nothing to help predictions between catchments.

Extending predictions
The inclusion of the Catchment main effect in both multicatchment models showed differences in mean T w max between catchments (that were not accounted for by the covariates).This sometimes led to substantial bias when transferring single-catchment models to new catchments.Accounting for between-catchment differences in mean T w will be necessary to improve between-catchment predictions of T w .The multi-catchment models in this study used a simple cat-F.L. Jackson et al.: Can spatial statistical river temperature models be transferred between catchments?egorical variable to allow the intercept (and hence mean T w max ) to differ between catchments.However, to predict to new catchments, it would be necessary to extend the modelling approach so that the intercept can be predicted from surrounding catchments.One approach could be to allow the intercept to vary smoothly between catchments using a Gaussian Markov random field (Cressie, 1993), so the intercept in unmonitored catchments could be estimated from nearby monitored catchments.This approach has been developed in other contexts (Millar et al., 2015(Millar et al., , 2016) ) and offers promise in the context of large-scale T w modelling.
An alternative approach could involve modelling T w as a function of T a over shorter time periods (days or weeks) and then allowing this relationship to interact with landscape covariates or location.Such an approach could have additional benefits, allowing the inclusion of temporally incomplete data (e.g.Letcher et al., 2016) or data from temporally inconsistent locations.Where sufficient resources were available it may be possible to supplement the existing network with sites that are monitored for shorter time periods to expand spatial coverage, although the consequences of such deployments for assessing inter-annual temperature variability would need to be investigated.Finally, the development of spatiotemporal models, where temporal variability was driven by T a or discharge, could potentially allow for forecasting or hindcasting of river temperature which was not possible using the approaches presented in this paper.

Conclusions and future work
This study demonstrated that landscape covariates can explain broad-scale patterns in T w max and that such relationships are transferable between catchments, at least to predict relative levels of T w max .It was necessary to use a RNS to characterise and predict finer-scale spatially correlated variation, so predictions of spatial temperature variability were better within catchments than between catchments.T a max was not a transferable predictor of T w max and could result in poor predictions when the relationship was implausible or transferred outside the range observed in the donor catchment.It would be unwise to use a T w -T a relationship to predict spatial variability in T w without also including meaningful (process-relevant) interactions between T a and landscape covariates, something that was not possible in this study due to data constraints.
Mean T w max also varied between catchments (having adjusted for the landscape covariates).Future work that looks to predict to new catchments should investigate how to understand and predict these between-catchment differences.A large-scale correlated spatial smoother (e.g.regional effect) offers potential in this respect.Finally, some of the localscale processes represented in this study (e.g.effect of riparian shading) may benefit from improved characterisation using finer-scale spatial datasets or remotely sensed data.
Improved process representation could lead to both better within-and between-catchment model predictions.
Data availability.The Digital River Network (DRN) used in this study is a commercial product licensed from the Centre for Ecology and Hydrology, ©NERC.Data derived from the DRN are therefore also subject to licensing constraints.As such, these data cannot be made available publicly.It is possible to view the DRN through the CEH WMS server: https://catalogue.ceh.ac.uk/maps/a78c90a2-8da4-4f0a-9c6a-c1d1a4a3c2b0?request= getCapabilities&service=WMS& (Moore et al., 1994).Catchment boundaries (SEPA, 2009) are products derived from the CEH DRN and can therefore only be provided to organisations who also hold a license to use the CEH DRN.SEPA catchment boundaries can be viewed here: http://gis.sepa.org.uk/rbmp/.Ordnance Survey datasets (MasterMap land cover and digital terrain model) are also commercially licensed products (©Crown copyright and database right ( 2016), all rights reserved; Ordnance Survey license no.100024655).Information on these datasets can be found here: https: //www.ordnancesurvey.co.uk/business-and-government/licensing/.The gridded T a dataset was from UKCP09: Daily gridded air temperature dataset (2015), UK MET Office (2015, https://www.metoffice.gov.uk/climatechange/science/monitoring/ukcp09/).Summary T w data used in the study are available on the Marine Scotland Science data web pages (https://doi.org/10.7489/1991-1,Jackson et al., 2017a).
Author contributions.IAM and DMH secured funding for the project.The authors conceived the study.FLJ carried out the data analysis with support from IAM and RJF.FLJ, IAM, RJF and DMH interpreted the results and prepared the paper.
Competing interests.The authors declare that they have no conflict of interest.

Figure 1 .
Figure 1.Study catchments and spatial patterns of T w max for August 2015 (Jackson et al., 2017a).(a) Catchment positions in Scotland, (b) the River Bladnoch catchment, (c) River Spey catchment, (d) River Dee catchment and (e) River Tweed catchment.Some features of this map are based on digital spatial data licensed from the Centre for Ecology & Hydrology, © NERC (CEH) and contains Ordnance Survey data © Crown copyright and database right (2016).Catchment boundaries are from SEPA (2009).

Figure 2 .
Figure 2. Distributions and inter-relationships between T w max and covariates.Scatter plots of the relationships are shown below the diagonal, kernel density plots of the individual covariates in the diagonal (scaled to have the same maximum value) and correlation coefficients above the diagonal.Numbers in black indicate the correlation coefficients where data are pooled across all catchments.

Figure 5 .
Figure 5. LS model transferability.Panels (a), (b), (c) and (d) show predicted T w max when the target catchment is the Dee, Tweed, Spey and Bladnoch, respectively.The colours and symbols indicate the donor catchment: Dee (red circles), Tweed (orange triangles), Spey (dark blue squares) and Bladnoch (light blue diamonds).Filled (open) symbols indicate sites in (out) of the environmental range of the donor catchment.When the target and donor catchments are the same, the coloured points are based on predictions using only covariates; the grey symbols show the corresponding predictions based on the covariates and the RNS.The dashed lines are robust regression lines of observed against predicted values.Models which transfer well have points falling close to the 1 : 1 line.

Figure 6 .
Figure 6.LS_T a model transferability.Panels (a), (b), (c) and (d) show predicted T w max when the target catchment is the Dee, Tweed, Spey and Bladnoch, respectively.The colours and symbols indicate the donor catchment: Dee (red circles), Tweed (orange triangles), Spey (dark blue squares) and Bladnoch (light blue diamonds).Filled (open) symbols indicate sites in (out) of the environmental range of the donor catchment.When the target and donor catchments are the same, the coloured points are based on predictions using only covariates; the grey symbols show the corresponding predictions based on the covariates and the RNS.The dashed lines are robust regression lines of observed against predicted values.Models which transfer well have points falling close to the 1 : 1 line.

Table 1 .
Covariate calculations.All calculations were in R version 3.2.3(R Core Team, 2015) except where specified.

Table 2 .
LS model and LS_T a model for each catchment, with the percent variance explained by the model (all terms) and the same model but with the RNS omitted (covariates).
w .Rainfall in eastern Scotland (which covers the Spey, Dee and Tweed) was 107 % of the 1981-2010 mean, whereas rainfall in western Scotland (which covers the Bladnoch) was only 98 % of the 1981-2010 mean (MET Office, 2016).Maximum air temperature was the same in eastern Scotland as the 1981-2010 mean maximum and 0.2 • C cooler in western Scotland over the same period (MET Office, 2016).

Table 3 .
Transferability of LS and LS_T a models.The values in normal font are for predictions using only covariates (any RNS information is ignored).The values in italics are for predictions when the target and donor catchments are the same and all model terms are used (both covariates and the RNS).

Table 4 .
Multi-catchment LS and LS_T a models, with the percent variance explained by the model (all terms) and when the RNS is omitted (covariates).