In this study we develop a method to estimate the spatially averaged rainfall
intensity together with the associated level of uncertainty using geostatistical
upscaling. Rainfall data collected from a cluster of eight paired rain gauges
in a 400 m

Being the process driving runoff, rainfall is arguably the most important input parameter in any hydrological modelling study. However, it is challenging to measure rainfall accurately because it varies strongly in time and space, especially in small urban catchments. Despite recent advances in radar technologies, rain gauge measurements are still considered to be the most accurate way of measuring rainfall, especially at short temporal averaging intervals (< 1 h), which are of most interest in urban hydrology studies (Ochoa-Rodriguez et al., 2015). However, many commonly used urban hydrological models (e.g. SWMM, HBV) are lumped catchment models (LCMs) where time series of areal average rainfall intensity (AARI) are needed as model input. Therefore, point observations of rainfall need to be scaled up using spatial aggregation in order to be fed into an LCM. A number of interpolation methods are available for spatial aggregation and are used in the various LCMs to scale up point rainfall data. The simplest method is to take the arithmetic average (Chow, 1964) of the point observations within the catchment. However, this method accounts for neither the spatial correlation structure of the rainfall nor the spatial organisation of the rain gauge locations. Another commonly used method in hydrological modelling is nearest neighbour interpolation (Chow, 1964; Nalder and Wein, 1998), which leads to Thiessen polygons. In this method the nearest observation is given a weight of one and all other observations are given zero weight during interpolation, thereby ignoring the spatial variability of rainfall to a certain extent. There are also other methods of varying complexity, including inverse distance weighting (Dirks et al., 1998), polynomial interpolation (Tabios III and Salas, 1985) and moving window regression (Lloyd, 2005).
The predictive performance of the above methods has been found to be case dependent, and no single method has been shown to be optimal for all catchments and rainfall conditions (Ly et al., 2013). One common drawback of all the above methods is that, being deterministic, they do not provide any information on the uncertainty of the predicted AARI. The uncertainty in the prediction of AARI mainly comes from two sources: measurement errors and the spatial variability of rainfall. The characteristics of measurement errors can vary depending on the rain gauge type. For example, errors associated with commonly used tipping bucket rain gauges range from errors due to wind, wetting, evaporation, and splashing (Fankhauser, 1998; Sevruk and Hamon, 1984) to errors due to the sampling mechanism (Habib et al., 2001). In addition to measurement errors, since rainfall can vary significantly over space, any spatial aggregation method for scaling up point rainfall measurements introduces further uncertainty (Villarini et al., 2008). The magnitude of the uncertainty depends on many factors including rain gauge density and location, rainfall variability, catchment size, topography, and the spatial interpolation technique used. Quantification of the level of uncertainty is essential for robust interpretation of hydrological model outputs. For instance, the absence of information on uncertainty can lead to force fitting of hydrological model parameters to compensate for the uncertainty in rainfall input data (Schuurmans and Bierkens, 2007).

Left: aerial view of rain gauge network covering an area of 400 m

Geostatistical methods such as kriging present a solution to this problem by providing a measure of prediction error. In addition to this capability, these statistical methods also take into account the spatial dependence structure of the measured rainfall data (Ly et al., 2013; Mair and Fares, 2011). Although these features make geostatistical methods more attractive than deterministic methods, they are rarely used in LCMs due to their inherent complexity and heavy data requirements. Since they are statistical methods encompassing multiple parameters, the amount of spatial data required for model inference is higher than for deterministic methods. In addition, geostatistical approaches typically assume that the data are normally distributed (Isaaks and Srivastava, 1989). In general, catchments, especially those at small urban scales, do not contain as many measurement locations as required by geostatistical methods. Furthermore, rainfall intensity data are almost never normally distributed, especially at smaller averaging intervals (< 1 h) (Glasbey and Nevison, 1997). Despite these challenges, geostatistical methods can provide information on the uncertainty associated with predicted AARI. This capability can be utilised in uncertainty propagation analysis in hydrological models. In the literature, geostatistical methods have been used to analyse the spatial correlation structure of rainfall at various spatial scales (Berne et al., 2004; Ciach and Krajewski, 2006; Emmanuel et al., 2012; Jaffrain and Berne, 2012); however, their application to support uncertainty analyses of rainfall upscaling has not been explored.

In this paper we present a geostatistical approach to derive AARI and the level of uncertainty associated with it from observations obtained from multiple “paired” rain gauges located in a small urban catchment. The proposed approach presents solutions to the above-described challenges of geostatistical methods. First, it uses pooling of sample variograms of rainfall measurements at different times but with similar characteristics to increase the number of paired observations used to fit variogram models. Second, a data transformation method is employed to transform the rainfall data to obtain a normally distributed data set. The level of uncertainty in the prediction of AARI is then quantified for different combinations of temporal averaging intervals and intensity ranges for the studied urban catchment. We focused on a small urban catchment with a spatial extent of less than a kilometre given the findings of recent research on the high significance of unmeasured spatial rainfall variability at such spatial scales, especially for urban hydrological and hydrodynamic modelling applications (Gires et al., 2012, 2014; Ochoa-Rodriguez et al., 2015).

The study area is located in Bradford, a city in West Yorkshire, England.
Bradford has a maritime climate, with an average yearly rainfall of 873 mm
recorded from 1981–2010 (MetOffice UK, 2016). The rain gauge network used
in this study was located at the premises of Bradford University (Fig. 1) and
rainfall data were collected from paired tipping bucket rain gauges placed at
eight locations covering an area of 400 m

Histogram with class interval width of 100 m showing frequency distribution of inter-station distances (m).

All rain gauges are ARG100 tipping bucket type with an orifice diameter of 254 mm and a resolution of 0.2 mm. Dynamic calibration was carried out for each individual gauge before deployment and visual checks were carried out every 4–5 weeks during the measurement period to ensure that the instruments were free of dirt and debris. Data loggers were reset every 4–5 weeks during data collection to avoid any significant time drift. Measurements (number of tips) were taken every minute and recorded on TinyTag data loggers mounted in each rain gauge.
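As an illustration of how the one-minute tip counts relate to intensities at a chosen temporal averaging interval, the conversion can be sketched as follows. This is a minimal sketch, not the processing code used in the study: the function name and the choice to drop a trailing incomplete interval are assumptions, and the nominal 0.2 mm tip depth is used rather than the per-gauge calibration factors.

```python
from typing import List

TIP_DEPTH_MM = 0.2  # nominal gauge resolution stated in the text (assumed for all gauges)


def tips_to_intensity(tips_per_minute: List[int], interval_min: int,
                      tip_depth_mm: float = TIP_DEPTH_MM) -> List[float]:
    """Aggregate 1 min tip counts into mean intensities (mm/h) per interval."""
    intensities = []
    # Only complete intervals are used; a trailing partial interval is dropped.
    for start in range(0, len(tips_per_minute) - interval_min + 1, interval_min):
        depth_mm = sum(tips_per_minute[start:start + interval_min]) * tip_depth_mm
        intensities.append(depth_mm * 60.0 / interval_min)
    return intensities
```

For example, four tips within a 5 min interval correspond to 0.8 mm in 5 min, i.e. an intensity of 9.6 mm/h.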

Quality control procedures were performed prior to statistical analysis, taking advantage of the paired gauge setup to detect gross measurement errors. The paired gauge design provides efficient quality control of the rain gauge data records as it helps to identify instances when one of the gauges fails, and to clearly identify periods of missing or incorrect data (Ciach and Krajewski, 2006). During the dynamic calibration of all rain gauges in the laboratory before deployment, the lowest and highest calibration factors for the tipping bucket size were found to be 0.196 and 0.204 mm, respectively. The gauges were recalibrated in the laboratory after the first period of measurement, and the largest change in calibration factor for any gauge was 4 % of the original calibration factor. Therefore a maximum difference of 4 % in volume per tip was assumed to be caused by inherent instrument error, and this was taken as the maximum acceptable difference between any pair of gauges. Sets of cumulative rainfall data corresponding to specific events from the paired gauges were checked against each other and, if the (absolute) difference in cumulative rainfall was greater than 4 %, the complete set was identified as unreliable and removed from further analysis.
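The paired-gauge check described above can be sketched in a few lines. The text does not state the reference against which the 4 % difference is computed; the sketch below assumes the mean of the two gauges, and the function name is hypothetical.

```python
def pair_is_reliable(cum_a_mm: float, cum_b_mm: float,
                     max_rel_diff: float = 0.04) -> bool:
    """Return True when two co-located gauges agree within the tolerance.

    The relative difference is taken with respect to the pair mean, which is
    an assumption; the 4 % default is the threshold quoted in the text.
    """
    mean_mm = 0.5 * (cum_a_mm + cum_b_mm)
    if mean_mm == 0.0:
        return True  # both gauges dry: nothing to compare
    return abs(cum_a_mm - cum_b_mm) / mean_mm <= max_rel_diff
```

An event for which the two gauges record 10.0 and 10.2 mm would be kept under this rule, whereas 10.0 and 11.0 mm would be flagged as unreliable.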

The total network-average rainfall depths for the summer seasons of 2012 and 2013 are 538 and 207 mm, respectively. Figure 3 shows time series of daily rainfall averaged over the network for 2012 and 2013. There is a significant difference in cumulative rainfall between 2012 and 2013. This is because 2012 was the wettest year recorded in 100 years in the UK (MetOffice UK, 2016) and 558 mm of rainfall during the 2012 summer was unusually high. An average rainfall of only 360 mm was recorded during April to September over the 1981–2010 period at the nearest operational rain gauge station at Bingley, which is around 8 km from the study site and has a similar ground elevation (MetOffice UK, 2016).

Time series of network average daily rainfall in the two seasons of 2012 and 2013 with vertical dashed lines indicating the events presented in Table 1.

The data set for 2012 and 2013 contains 13 events yielding more than 10 mm
network average rainfall depth each and lasting for more than 20 min. A
summary of these events is presented in Table 1. Note that this event
separation is only used for the presentation of results in Sect. 4.2. Hence
it does not leave out any data from the development and calibration of the
geostatistical model as presented in Sect. 3. Table 1 shows that the total
event duration ranges from 1.5 to 11.4 h while the event network average
rainfall intensity varies from 1.79 to 7.96 mm h

Summary of events which yielded more than 10 mm rainfall and lasted for more than 20 min with summary statistics of event peaks (derived at 5 min temporal averaging interval) from all stations.

Figure 4 summarises, step by step, the procedure of geostatistical upscaling of the rainfall data adopted in this study; detailed descriptions of each step follow. This complete procedure was repeated for temporal averaging intervals of 2, 5, 15, and 30 min in order to investigate the effect of temporal aggregation on the prediction of AARI. The entire 10 months of collected data were used for the development and calibration of the geostatistical model.

Step-by-step procedure developed in this study to predict AARI and associated level of uncertainty. Boxes highlighted in dots indicate the steps to resolve the problem of scarcity in measurement locations, grey boxes show the steps introduced to address non-normality of rainfall data.

The rain gauge network contains eight measurement locations. These eight
measurement locations give 28 spatial pairs at a given time instant, which
is far fewer than would normally be used in geostatistical
modelling. For example, Webster and Oliver (2007) recommend
around 100 measurement points to calibrate a geostatistical model. The
procedure adopted in this study increases the number of pairs by pooling
sample variograms for time instants with similar rainfall characteristics.
With

The underlying assumption of this pooling procedure is that the spatial
variability over the pooled time instants is the same. Therefore it is
important to pool sample variograms of rainfall measurements with similar
rainfall characteristics. Since the spatial rainfall variability is often
intensity dependent (Ciach and Krajewski, 2006),
the characteristics of a less intense rainfall event may not be the same as
that of a high-intensity rainfall event. Hence to make the assumption of
consistency of spatial variability, the range of rainfall intensity over the
pooled time instants should be reasonably small. On the other hand, one
should also make sure that there are enough time instants within a pooled
subset to meet the data requirement to calibrate the geostatistical model.
Based on the above two criteria, three rainfall intensity classes were
selected. The maximum threshold value was limited to 10 mm h

Number of time instants for each temporal averaging interval and rainfall intensity class combination.
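The pooling idea can be illustrated with a short sketch in which the squared differences of every station pair are accumulated over all time instants of one intensity class. This is an illustrative sketch under stated assumptions, not the code used in the study: the function name, the array layout and the lag binning are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def pooled_variogram(coords, values, bin_edges):
    """Pooled empirical semivariogram for one intensity class.

    coords: (n_gauges, 2) station coordinates in metres.
    values: (n_times, n_gauges) intensities of the pooled time instants.
    bin_edges: increasing lag-distance bin edges in metres.
    Returns (mean lag, semivariance, pair count) per bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)            # 28 pairs for 8 gauges
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_half = 0.5 * (values[:, i] - values[:, j]) ** 2   # (n_times, n_pairs)
    which = np.digitize(dist, bin_edges) - 1             # bin index of each pair
    n_bins = len(bin_edges) - 1
    lag = np.full(n_bins, np.nan)
    gamma = np.full(n_bins, np.nan)
    count = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        if mask.any():
            lag[b] = dist[mask].mean()
            gamma[b] = sq_half[:, mask].mean()           # average over pairs and times
            count[b] = mask.sum() * values.shape[0]
    return lag, gamma, count
```

With eight gauges and, say, five pooled time instants, the number of pairs contributing to the sample variogram rises from 28 to 140.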

Having chosen the rainfall intensity classes to create pooled time instants,
there can still be inconsistency in spatial variability between time
instants within a class and therefore assuming a single geostatistical model
for the whole subset may not be realistic. To reduce this effect to a
certain extent, all observations within an intensity class were standardised
using the mean and standard deviation of each time instant as follows:

The upper part of Fig. 6 shows the distribution of standardised rainfall
intensity for a temporal averaging interval of 5 min derived using Eq. (1).
From the figure it is clear that the data are not normally distributed.
Distributions for other temporal averaging intervals (i.e. 2, 15, and
30 min) show a similar behaviour. But the geostatistical upscaling method to
be used is based on the normal distribution. This requires the rainfall data
to be normally distributed prior to the calibration of the geostatistical
model. The normal score transformation (NST, also known as normal quantile
transformation; Van der Waerden, 1952) is a widely used method to transform
the distribution of a variable to a Gaussian distribution. It has been
applied in many hydrological studies (Bogner et al., 2012; Montanari and
Brath, 2004; Todini, 2008; Weerts et al., 2011). The concept of NST is to
match the

Distribution of standardised rainfall intensity for different rainfall intensity classes at a temporal averaging interval of 5 min before (upper part) and after (lower part) normal score transformation (NST).
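The standardisation of each time instant and the subsequent normal score transformation can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the sample standard deviation, the plotting position (rank − 0.5)/n and the handling of ties (tied values receive distinct neighbouring scores) are choices made here for illustration and are not taken from the study.

```python
from statistics import NormalDist, mean, stdev


def standardise(obs):
    """Standardise the gauge observations of ONE time instant.

    Uses the sample standard deviation (an assumption); fails if all
    observations are identical, since the standard deviation is then zero.
    """
    m, s = mean(obs), stdev(obs)
    return [(x - m) / s for x in obs]


def normal_score_transform(pooled):
    """Map pooled standardised values to standard-normal quantiles by rank."""
    n = len(pooled)
    order = sorted(range(n), key=lambda k: pooled[k])
    scores = [0.0] * n
    for rank, k in enumerate(order, start=1):
        # (rank - 0.5) / n is one common plotting position (assumed here)
        scores[k] = NormalDist().inv_cdf((rank - 0.5) / n)
    return scores
```

The back-transformation would invert this rank-based mapping; values simulated outside the range of the transformed data need a separate extrapolation rule, which is not covered by this sketch.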

A geostatistical model of (normalised) rainfall intensity

Under the assumption of a constant trend, the spatial interpolation can
be solved using an ordinary kriging system (Isaaks and Srivastava, 1989):
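For reference, the ordinary kriging system in its standard textbook form is written out below. The notation is assumed here for illustration and may differ from that used elsewhere in the paper: $z(x_i)$ are the $n$ (transformed) observations, $\gamma$ is the variogram, $\lambda_i$ are the kriging weights, $\mu$ is a Lagrange multiplier and $x_0$ is the prediction location.

```latex
\begin{aligned}
\sum_{j=1}^{n} \lambda_j \, \gamma(x_i - x_j) + \mu &= \gamma(x_i - x_0),
  \qquad i = 1, \dots, n, \\
\sum_{j=1}^{n} \lambda_j &= 1, \\
\hat{z}(x_0) &= \sum_{i=1}^{n} \lambda_i \, z(x_i), \\
\sigma_{\mathrm{OK}}^{2}(x_0) &= \sum_{i=1}^{n} \lambda_i \, \gamma(x_i - x_0) + \mu .
\end{aligned}
```

The first two lines determine the weights, the third gives the kriging prediction and the last gives the kriging variance, whose square root is the kriging standard deviation.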

(i) Define a prediction grid (a 25 m

(ii) Visit a randomly selected grid cell that has not been visited before and predict the transformed rainfall intensity at the grid cell centre using ordinary kriging; this yields a kriging prediction and a kriging standard deviation.

(iii) Use a pseudo-random number generator to sample from a normal distribution with mean equal to the kriging prediction and standard deviation equal to the kriging standard deviation, and assign this value to the grid cell centre.

(iv) Add the simulated value to the conditioning data set; in other words, treat the simulated value as if it were another observation.

(v) Go back to step (ii) and repeat the procedure until there are no more unvisited grid cells left.
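The simulation steps above can be sketched for a single realisation as follows. This is an illustrative sketch rather than the implementation used in the study: it assumes NumPy, an exponential variogram with hypothetical parameters supplied by the caller, and a global kriging neighbourhood (practical implementations usually restrict the neighbourhood to keep the kriging system small).

```python
import numpy as np


def exp_variogram(h, nugget, psill, rng):
    """Exponential variogram; gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + psill * (1.0 - np.exp(-h / rng)), 0.0)


def ordinary_kriging(xy, z, x0, vario):
    """Ordinary kriging prediction and standard deviation at one point x0."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    a = np.ones((n + 1, n + 1))          # bordered matrix with unbiasedness row
    a[:n, :n] = vario(d)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = vario(np.linalg.norm(xy - x0, axis=1))
    sol = np.linalg.solve(a, b)
    lam, mu = sol[:n], sol[n]
    var = max(float(lam @ b[:n] + mu), 0.0)   # guard against round-off below zero
    return float(lam @ z), var ** 0.5


def sgs(xy_obs, z_obs, grid_xy, vario, seed=0):
    """One sequential Gaussian simulation realisation at the grid cell centres."""
    rng = np.random.default_rng(seed)
    xy = np.array(xy_obs, dtype=float)
    z = np.array(z_obs, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    sim = np.empty(len(grid_xy))
    for k in rng.permutation(len(grid_xy)):   # random visiting order
        pred, sd = ordinary_kriging(xy, z, grid_xy[k], vario)
        sim[k] = rng.normal(pred, sd)         # draw from N(pred, sd^2)
        xy = np.vstack([xy, grid_xy[k]])      # simulated value becomes data
        z = np.append(z, sim[k])
    return sim
```

Calling `sgs` repeatedly with different seeds yields the set of realisations; grid cell centres are assumed not to coincide with gauge locations, since duplicate points make the kriging matrix singular.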

The grid size and number of simulations (i.e., the sample size) were
selected considering the spatial resolution of available measurements and
computational demand. It was observed that neither a finer grid nor more
simulations improved the results significantly. Increasing the resolution to
10 m

Once the realisations have been prepared these are back-transformed by
applying the inverse of Eq. (2) to all grid cells (step 6). Some values
derived from spatial stochastic simulation were outside the transformed data
range. Hence during back transformation (step 6) of these values linear
extrapolation was used. These linear models were derived using a selected
number of head and tail portion of normal

As explained in Sect. 3.4, the geostatistical model of the transformed rainfall data was calibrated using variograms for three different intensity ranges. This procedure was repeated for temporal averaging intervals of 2, 5, 15, and 30 min. Exponential models were fitted to the empirical variograms. The resulting variograms are presented in Fig. 7.
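Fitting an exponential model to an empirical variogram can be sketched as a small least-squares problem. The fitting method used in the study is not stated here, so the approach below is an assumption chosen for simplicity: for each candidate range parameter the nugget and partial sill enter linearly and are solved with NumPy, and the candidate with the smallest squared error is kept. Non-negativity of the nugget and partial sill is not enforced in this sketch.

```python
import numpy as np


def fit_exponential(lags, gammas, range_candidates):
    """Least-squares fit of gamma(h) = nugget + psill * (1 - exp(-h / a))."""
    lags = np.asarray(lags, dtype=float)
    gammas = np.asarray(gammas, dtype=float)
    best = None
    for a in range_candidates:
        # For fixed a the model is linear in (nugget, psill).
        basis = np.column_stack([np.ones_like(lags), 1.0 - np.exp(-lags / a)])
        coef, *_ = np.linalg.lstsq(basis, gammas, rcond=None)
        sse = float(np.sum((basis @ coef - gammas) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(a))
    _, nugget, psill, a = best
    return nugget, psill, a
```

The nugget-to-sill ratio then follows as `nugget / (nugget + psill)`. Note that for the exponential model `a` is a distance parameter; the practical range, at which about 95 % of the sill is reached, is roughly three times `a`.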

Calculated variograms for each intensity class within each temporal averaging interval.

The variograms illustrate two properties of the collected rainfall measurements: spatial variability of rainfall and measurement error. One of the main parameters which characterises these properties is the nugget. Theoretically, the semivariance at zero lag distance should be zero. However, most of the variograms exhibit a positive nugget effect (generally presented as the nugget-to-sill ratio) at zero lag distance. This nugget effect can be due to two reasons: random measurement error and microscale spatial variability of rainfall. Unfortunately, these two causes cannot be quantified individually from the variograms. However, there is a consistent pattern of the nugget against both rainfall intensity class and temporal averaging interval, which helps to interpret the variograms.

Considering the behaviour of nugget-to-sill ratio against rainfall intensity
class, it can be observed that the smaller the intensity the higher the
nugget-to-sill ratio, regardless of temporal averaging interval. For
example, at 2 min averaging interval the nugget-to-sill ratio increases from
zero to almost one (nugget variogram) as the rainfall intensity class
changes from > 10 to < 5 mm h

Regarding the behaviour of the nugget-to-sill ratio against averaging
interval, it is expected that the (microscale) spatial correlation of
rainfall would increase with increasing averaging interval, which partly
explains the observed pattern. The increase in spatial correlation of rainfall intensity
with increasing temporal averaging interval agrees with other similar
studies (e.g. Ciach and Krajewski, 2006; Fiener and Auerswald, 2009; Krajewski et al.,
2003; Peleg et al., 2013; Villarini et al., 2008). For example,
Krajewski et al. (2003) observed in their
study on analysis of spatial correlation structure of small-scale rainfall
in central Oklahoma a similar behaviour using correlogram functions for
different temporal averaging intervals. However, the decreasing
trend of the nugget-to-sill ratio against intensity class cannot be
attributed to an improvement in microscale spatial correlation, as such an
improvement is neither natural nor proven. In fact, in Fig. 7 the behaviour of spatial correlation
against rainfall intensity class does not show a distinctive trend except at
the origin, i.e. the nugget effect. The absence of any consistent trend of
spatial variability against intensity class was also observed in Ciach and Krajewski (2006). Meanwhile this
decreasing trend of nugget-to-sill ratio against rainfall intensity
corresponds well with measurement errors of tipping bucket type rain gauges
caused by its sampling mechanism (hereafter referred to as TB error). This is
due to the rain gauges' inability to capture small temporal variability of
the rainfall time series. The behaviour of TB error against rainfall
intensity as seen from Fig. 7 complements results from previous studies
(Habib et al., 2001; Villarini et al., 2008). These studies also show that the TB error decreases
with temporal averaging interval. Habib et
al. (2001) found similar behaviour of TB error with increasing intensity
(0–100 mm h

In addition to the nugget-to-sill ratio, another parameter that
characterises the variograms is the range, i.e. the distance up to which
there is spatial correlation. At lower temporal averaging intervals (

The fact that the data set covers only 10 months of data from 2 years with
varying climatology is something that needs to be acknowledged. However, for
previous studies using such a dense network the duration of data collection
is similar (e.g. 15 months – Ciach and Krajewski, 2006;
16 months – Jaffrain and Berne, 2012). These time periods are a reflection of the
practical and funding issues involved in keeping such dense networks operating
accurately for extended periods. The characteristics of our data are
comparable with Ciach and Krajewski (2006) and
Fiener and Auerswald (2009) as these studies also used rainfall data from
warm months to investigate the spatial correlation structure. Despite the
fact that the data cover only 10 months all derived variogram models are
stable and reliable. Webster and Oliver (2007) suggested around
100 samples to reliably estimate a variogram model. Even in the case of 30 min
temporal averaging interval and > 10 mm h

One of the assumptions we made during the pooling procedure is that the
spatial variability is reasonably consistent within a pooled intensity class.
We acknowledge that with narrower intervals the assumption of consistency in
spatial variability would be more realistic. But with the available data we
had to find a compromise with the number of time instants. We believe that
using three intensity subclasses is a reasonable compromise. Further we also
introduced step 2 (Sect. 3.2) which standardises the rainfall for each time
instant within a subset. Although variograms are derived only for the whole
subset, step 2 (before geostatistical upscaling) and step 9 (after
geostatistical upscaling) ensure that the probabilistic model is adjusted for
each time instant separately. Effectively, we assume the same correlogram for
time instants of the same subclass, not the same variogram. Although this
does not justify the assumption of similar spatial correlation structure
within the pooled classes, it at least relaxes the assumption of the same
variogram within subclasses. To compare the behaviour of variogram models for
a narrower intensity interval, we produced variograms for narrower intensity
classes ranging from 0 to 14 mm h

Calculated variograms for a narrower range of intensity at 5 min averaging interval.

Having calculated all variograms, the next step is to apply spatial stochastic simulation for the time instants of interest followed by steps 6 to 9 in Fig. 4 to calculate the AARI together with associated uncertainty. This procedure was carried out for all events presented in Table 1. The following sections present and discuss the predicted AARI and associated uncertainty levels derived from step 9.

The scatter plot in Fig. 9 shows the coefficient of variation of the
prediction error (CV; see Eq. 6) plotted against predicted AARI at 5 min
averaging interval for all time instants of all events presented in Table 1:

AARI prediction error CV (%) values against predicted AARI for averaging interval of 5 min.
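Given a set of back-transformed realisations, the predicted AARI, its standard deviation and the CV can be summarised as sketched below. This is an illustration under stated assumptions, not the study's code: the function name and data layout are hypothetical, the CV is taken as the prediction standard deviation divided by the predicted AARI (expressed in %), and the 95 % interval assumes an approximately normal prediction error.

```python
from statistics import NormalDist, mean, stdev


def aari_summary(realisations):
    """Summarise simulated rainfall fields for one time instant.

    realisations: list of simulated fields, each a list of grid-cell
    intensities (mm/h) after back-transformation.
    Returns (predicted AARI, standard deviation, CV in %, 95 % interval).
    """
    aari = [mean(field) for field in realisations]   # one areal average per realisation
    pred, sd = mean(aari), stdev(aari)
    cv = 100.0 * sd / pred if pred > 0 else float("nan")
    z = NormalDist().inv_cdf(0.975)                   # about 1.96
    return pred, sd, cv, (pred - z * sd, pred + z * sd)
```

Because the CV divides by the predicted AARI, very small predicted intensities can produce large CV values even when the absolute prediction error is small.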

The above discussion is based on results from 5 min temporal averaging
interval. The following section discusses the effect of temporal averaging
interval on prediction error. Further, although CV in Fig. 9 gets as high as
80 %, the corresponding AARI is less than 1 mm h

Having analysed the behaviour of the prediction error CV against predicted AARI, this section presents the effect of temporal averaging interval on the prediction error of AARI. Figure 10 shows the kriging predictions with 95 % prediction intervals derived from the prediction standard deviation for temporal averaging intervals of 2, 5, 15, and 30 min for event 11. Event 11 has average conditions in terms of event duration and peak intensity. Prediction errors of other events against the temporal averaging interval follow the same pattern of behaviour.

Predictions of AARI (indicated by points) together with 95 % prediction intervals (indicated by grey ribbon) for rainfall event 11 for different averaging intervals.

Predictions of event peaks of AARI (indicated by points) together with labels indicating corresponding CV (%) values.

While short time intervals are of greater interest in urban hydrology, they
also lead to large uncertainties. Figure 10 shows that the smaller the temporal
averaging interval, the larger the prediction interval and the larger the
level of uncertainty. This is due to the combined effect of higher spatial
variability and larger TB error at lower temporal averaging interval as seen
from Fig. 7. When the averaging interval is larger than 15 min the
prediction interval width becomes negligible. But temporal scales of
interest in urban hydrology of a similar-sized catchment can be as low as 2 min where there is still considerable uncertainty. The 95 % prediction
interval shows around

The decreasing trend of uncertainty in the prediction of AARI with
increasing temporal averaging interval agrees with a previous study by
Villarini et al. (2008). Although the
spatial extent of their study is much larger (360 km

In addition to rainfall event durations, rainfall event peaks are also of
significant interest in urban hydrology as most of the hydraulic structures
in urban drainage systems are designed based on peak discharge which is
often derived from peak rainfall. Hence it is important to consider the
uncertainty in prediction of peaks of AARI. Figure 11 presents predicted
peaks of AARI for all 13 events presented in Table 1, together with labels
indicating corresponding CV (%) values. The peak intensities range from
6 to 92 mm h

As discussed in Sect. 4.2.1, CV decreases with increasing predicted
rainfall peaks and this effect is dominant when the averaging interval is at
the lowest, i.e. 2 min. This is when the TB error is at its highest. When
the temporal averaging interval is 30 min where the TB error is at its
lowest, the difference between CV for lower (< 10 mm h

Geostatistical methods have been used to analyse the spatial correlation
structure of rainfall at various spatial scales, but their application to
estimate the level of uncertainty in rainfall upscaling has not been fully
explored, mainly due to their inherent complexity and demanding data
requirements. In this study we presented a method to overcome these
challenges and predict AARI together with associated uncertainty using
geostatistical upscaling. We used a spatial stochastic simulation approach
to address the combination of change of support (from point to catchment)
and non-normality of rainfall observations for prediction of AARI and the
associated uncertainty. We addressed the issue of scarcity in measurement
points by using repetitive rainfall measurements (pooling) to increase the
number of spatial samples used for variogram estimation. The methods were
illustrated with rainfall data collected from a cluster of eight paired rain
gauges in a 400 m

A summary of the significant findings is listed below:

Several studies (e.g. Berne et al., 2004; Gebremichael and Krajewski, 2004; Krajewski et al., 2003) used a single geostatistical model in the form of variogram/correlogram for the entire range of rainfall intensity. The current study shows that for small time and space scales the use of a single geostatistical model based on a single variogram is not appropriate and a distinction between rainfall intensity classes and length of temporal averaging intervals should be made.

The level of uncertainty in the prediction of AARI using point measurement data essentially comes from two sources: spatial variability of the rainfall and measurement error. The significance and characteristics of the measurement error observed here mainly correspond to the sampling-related error of tipping bucket type rain gauges (TB error) and may vary for other types of rain gauges.

TB error decreases with increasing rainfall intensity. As a result of that,
the prediction error decreases with increasing AARI. At 5 min averaging
interval the CV values are as high as 80 % when the AARI is smaller than 1 mm h

At smaller temporal averaging intervals, the effect of both spatial variability and TB error is high, resulting in higher uncertainty levels in the prediction of AARI. With increasing temporal averaging interval the uncertainty becomes smaller as the spatial correlation increases and the TB error reduces. At 2 min temporal averaging interval the average CV in the prediction of peak AARI is 6.6 % and the maximum CV is 13 % and they are reduced to 1.5 and 3.6 % respectively at 30 min averaging interval.

TB error at averaging intervals of less than 5 min, especially at low-intensity rainfall measurements, is as significant as spatial variability. Hence proper attention to TB error should be given in any application of these measurements, especially in urban hydrology, where averaging intervals are often as small as 2 min.

An urban catchment of this size needs rainfall data at a temporal and spatial resolution which is higher than the resolution of most commonly available radar data (1000 m, 5 min). In addition the level of uncertainty in radar measurements would be much higher than that of point measurements, especially at a small averaging interval (< 5 min, Seo and Krajewski, 2010; Villarini et al., 2008), which are often of interest in urban hydrology. Hence, experimental rain gauge data similar to the ones used in this study are crucial for similar studies focused on small urban catchments.

Results from this study can be used for uncertainty analyses of hydrologic and hydrodynamic modelling of similar-sized urban catchments in similar climates, as they provide information on the uncertainty associated with rainfall estimation, which is arguably the most important input in these models. This information will help to differentiate input uncertainty from total uncertainty, thereby helping to understand other sources of uncertainty due to model parameters and model structure. This estimate of the relative importance of uncertainty sources can help to avoid false calibration and force fitting of model parameters (Vrugt et al., 2008). This study can also help to judge the optimal temporal averaging interval for rainfall estimation in hydrologic and hydrodynamic modelling, especially for small urban catchments.

The rainfall intensity data used in this study are freely available at

The authors declare that they have no conflict of interest.

This research was done as part of the Marie Curie ITN – Quantifying Uncertainty in Integrated Catchment Studies (QUICS) project. This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 607000.

Edited by: P. Molnar

Reviewed by: two anonymous referees