Censored rainfall modelling for estimation of fine–scale extremes

Reliable estimation of rainfall extremes is essential for drainage system design, flood mitigation and risk quantification. However, traditional techniques lack physical realism and extrapolation can be highly uncertain. In a warming climate, the moisture holding capacity of the atmosphere is greater which increases the potential for short duration high intensity storm events. In this study, we improve the physical basis for short duration extreme rainfall estimation by simulating the heavy portion of the rainfall record mechanistically using the Bartlett-Lewis rectangular pulse model. Mechanistic rainfall 10 models have had a tendency to underestimate rainfall extremes at fine temporal scales. Despite this, the simple process representation of rectangular pulse models is appealing in the context of extreme rainfall estimation because it is emulates the known phenomenology of rainfall generation. A censored approach to Bartlett-Lewis model calibration is proposed and performed for single site rainfall from two gauges in the UK and Germany. Extreme rainfall estimation is performed for each gauge at the 5, 15 and 60 minute resolutions, and considerations for censor selection discussed. 15


Introduction
Extreme rainfall estimation is required for numerous applications in diverse disciplines ranging from engineering and hydrology to agriculture, ecology, and insurance.It facilitates the planning, design, and operation of key municipal infrastructure such as drainage and flood defences, as well as scenario analysis for climate impact assessment, and hazard risk modelling.Extremes are usually estimated using frequency techniques and intensity duration frequency curves.However, these methods are highly dependent on the observed rainfall record, which may not be characteristic of the extreme behaviour.
In this study we improve the physical basis of shortduration extreme rainfall estimation by simulating the heavy portion of the observed rainfall time series.Traditional approaches to extreme value estimation rely on sampling extremes from the observed record.However, rainfall observations present various problems for the practitioner.They are often not available at the location of interest, they are typically short in duration, and they may not be available at the temporal scale appropriate for the intended use.These difficulties, together with the necessity to obtain perturbed time series representative of future rainfall, have motivated the development of stochastic rainfall generators since the earliest such statistical models developed by Gabriel and Neumann (1962).The reader is referred to Waymire and Gupta (1981), Wilks and Wilby (1999), and Srikanthan and McMahon (2001) for detailed reviews of early developments in rainfall simulation.
The principle of rainfall simulation is to replicate statistical properties of the observed record such that multiple realizations of statistically identical rainfall may be synthesized (Richardson, 1981).Various methods of simulation exist, and there have been several attempts in the literature to categorize the different approaches.Aside from dynamic methods used in numerical weather prediction models, Cox and Isham (1994) suggest that statistical simulation methods may be broadly categorized as either purely statistical or stochastic, while Onof et al. (2000) further categorize stochastic methods as either multi-scaling or mechanistic.The latter of these differ from other statistical approaches because rainfall synthesis follows a simplified representation of the physical rainfall-generating mechanism.Through the clustering of rain cells in storms, the unobserved continuous-time rainfall is constructed by superposition, enabling the synthetic D. Cross et al.: Censored rainfall modelling for estimation of fine-scale extremes rainfall hyetograph to be aggregated to whatever scale is desired (Kaczmarska et al., 2014).Because of this simplified process representation, mechanistic model parameters have physical meaning, which makes this class of model particularly appealing in the context of extreme value estimation.
When no likelihood function can be formulated (Rodriguez-Iturbe et al., 1988;Chandler, 1997), mechanistic models are typically calibrated using a generalized method of moments (Wheater et al., 2007a) with key summary statistics at a range of temporal scales such as the mean, variance, autocorrelation, and proportion of dry periods.Performance is assessed on the ability of the models to reproduce the calibration statistics as well as others not used in calibration including central moments and extremes.Since their inception in the late 1980s by Rodriguez-Iturbe et al. (1987,1988), numerous studies have demonstrated the ability of these models to satisfactorily reproduce observed summary statistics (see Cowpertwait et al., 1996;Verhoest et al., 1997;Cameron et al., 2000a, b;Kaczmarska et al., 2014;Wasko and Sharma, 2017;and Onof et al., 2000, for a review).However, these studies have also shown that mechanistic models tend to underestimate rainfall extremes at the hourly and sub-hourly scales, which limits their usefulness (see Verhoest et al., 2010, and references therein).
We hypothesize that stochastic mechanistic pulse-based models may be poor at estimating fine-scale extremes because the training data, and calibration method, are dominated by low-intensity observations.Mechanistic stochastic models are fitted to the whole rainfall hyetograph, including zeroes, aggregated to a range of temporal scales.Typically, the range of scales used varies from hourly to daily, although implicit in most studies is the assumption that scales required in simulation should be within the range of scales used in calibration.Hence, if the intention of the model is to simulate 15 min rainfall, the training data should include 15 min observations.As the temporal resolution of rainfall data becomes finer, the distribution of rainfall amounts becomes more positively skewed.Primarily, this is because of the increased proportion of dry periods, but also the higher proportion of lowintensity events characteristic of fine-scale rainfall.Because the calibration method uses central moments to fit model parameters, the greater skewness at finer temporal scales makes it difficult to obtain a good fit to extremes at these scales.
In addition to the dominance of low observations, the estimation of fine-scale extremes may be further undermined by operation and sampling errors.This is particularly true of tipping bucket gauges where measurement precision at fine temporal scales is limited to the bucket volume, typically 0.2 or 0.5 mm.Fine-scale rainfall is highly intermittent (starting and stopping with high frequency), yet a tipping bucket gauge can only make a recording when the bucket is full.The limitations of tipping bucket measurements at fine temporal scales have long been understood (Goldhirsh et al., 1992;Nystuen et al., 1996;Yu et al., 1997), although the first formal estimation of sampling error was performed by Habib et al. (2001).In this study, the authors investigate the ability of tipping bucket gauges to capture the temporal variability of fine-scale rainfall at 1, 5, and 15 min scales using tipping bucket measurements simulated from high-resolution optical rain-gauge observations.The authors show that for the lowest rainfall intensities (< 5 mm h −1 ) the mean relative error of the tipping bucket gauge at the 5 min resolution is +3.5 %, with corresponding SD just under +30 % for a bucket volume of 0.254 mm.Larger errors are obtained for the 1 min resolution.They also show that increasing the bucket volume to 0.5 mm significantly increases the spread of the sampling error for low observations at the 5 min resolution.The observed record comprised mainly convective storm events which are typical for Iowa in the US where the data were collected, although the error estimates are significant and present compelling evidence of the impact of sampling error on fine-scale low-intensity rainfall observations.Significant effort has been made since the late 1980s to improve the performance of mechanistic rainfall models through structural developments, with substantial focus on the improved representation of fine-scale extremes (see Sect. 2 for a review).Despite this, little progress has been achieved.To test our hypothesis, a simple approach is proposed in which low observations for fine-scale data are censored from the models in calibration.For a given temporal resolution, a censor amount is set.Rainfall below the censor is set to zero and rainfall over the censor is reduced by the censor amount.This focusses model fitting on the heavier portion of the rainfall record at fine temporal scales, and reduces rainfall intensity at coarser scales.The aim is to investigate whether existing mechanistic models can be used as simulators of fine-scale storm events by changing the data and not the model, thereby reducing the impact of low observations and sampling error on fine-scale extreme rainfall estimation.
The choice of models is limited to those within the Bartlett-Lewis family of models which conform to the original concept of rectangular pulses developed by Rodriguez-Iturbe et al. (1987).Preference is given to the most parsimonious model variants on the basis that having fewer parameters improves parameter identifiability and reduces uncertainty.The Neyman-Scott family of models is excluded on the understanding that the clustering mechanisms of both model types perform equally well (Wheater et al., 2007a), and there is no evidence that randomization of the Neyman-Scott model (Entekhabi et al., 1989) has any advantage over its Bartlett-Lewis counterpart.
In Sect.2, we outline the main mechanistic model developments for improved representation of extremes.The censored modelling approach for the estimation of fine-scale extremes is described in Sect.3. Model structure and selection are explained in Sect.4, and the data and fitting methodology are presented in Sect. 5. Results are given in Sect.6 together with validation analysis.Discussion on the results and censor selection are given in Sect.7. Section 8 provides further dis-cussion and outlines our main conclusions and thoughts for future research.

Mechanistic model developments
Attempts to improve the estimation of fine-scale extremes for point (single-site) rainfall using mechanistic models have focused on changing the model structure.Several authors have cited significant improvement (Cowpertwait, 1994;Cameron et al., 2000b;Evin and Favre, 2008), although increased parameterization and limited verification with real data have meant that most changes have not been widely adopted.An early criticism of the original mechanistic models presented by Rodriguez-Iturbe et al. (1987) is that the exponential distribution applied to rainfall intensities is light-tailed.This choice is consistent with the observation that rainfall amounts, which in the model are obtained through the superposition of such cells, are approximately gamma distributed (Katz, 1977;Stern and Coe, 1984).
On the basis that the gamma distribution gives more flexibility in generating rain-cell intensities, Onof and Wheater (1994b) reformulate the modified (random η) Bartlett-Lewis (MBL) model (Rodriguez-Iturbe et al., 1987) with the gamma distribution to improve the estimation of extremes.Despite the good fit to hourly extremes cited by Onof and Wheater (1994b), subsequent studies have continued to show underestimation at hourly and sub-hourly scales (Verhoest et al., 1997;Cameron et al., 2000a;Kaczmarska et al., 2014).
In an extension of this approach, Cameron et al. (2000b) replace the exponential distribution in the MBL model with the generalized Pareto (GP) distribution for rain cells over a high threshold.Depending on the value of the shape parameter (ξ ), the GP converges to one of three forms: upperbounded (ξ < 0), exponential (ξ = 0), and Pareto (ξ > 0).In the last case, we have a distribution with a heavier tail than the exponential or the gamma.Because the GP distribution is a model for threshold exceedance, the authors specify a threshold below which the MBL model with exponential intensity distribution is used to simulate rain-cell depth, and above which the Pareto distribution is used.This is justified on the assumption that the rain-cell depth may be of either high or low intensity.
The authors present a calibration strategy in which they first fit the MBL model with exponential cell depths to the whole rainfall record using the method of moments from Onof and Wheater (1994b).Generalized likelihood uncertainty estimation (Beven and Binley, 1992) is then used to find behavioural parameterizations of the Pareto scale and shape parameters for rain-cell depths over the thresholdthe location parameter being fixed at the threshold value.The central assumption of this model is that the Pareto scale and shape parameters for cell depths over the threshold will have "minimal impact on the standard statistics of the simulated continuous rainfall time-series" (Cameron et al., 2000b, p. 206).The validity of this assumption is disputed by Wheater et al. (2007a), who argue that the MBL model should be fitted to rainfall coincident with rain cells below the threshold, but point out that this is "impossible since cell intensities are not observed" (Wheater et al., 2007a, p. 16).
The model framework of Cameron et al. (2000b) differs from that of the MBL gamma model of Onof and Wheater (1994a) and is essentially the nesting of two models.The authors present significant improvement in the estimation of hourly extremes and show good agreement with generalized extreme value (GEV) estimates.However, because the underlying process of continuous-time rainfall is unobserved, the authors are forced to implement a calibration strategy which limits the impact on standard rainfall statistics -an approach which is undesirable (Wheater et al., 2007a).Furthermore, the framework appears to be an analogue of the N-cell rectangular pulse model structure initially developed by Cowpertwait (1994) for the Neyman-Scott model, and later incorporated into the Bartlett-Lewis models by Wheater et al. (2007a).Regardless of their relative performance, the large number of parameters required for these models is undesirable on the basis that more parameters reduce parameter identifiability and increase parameter uncertainty.
In an earlier study, Cowpertwait (1994) differentiates between light and heavy rain cells in a modified version of the original (fixed η) Neyman-Scott rectangular pulse (NSRP) model (Rodriguez-Iturbe et al., 1987) by allowing rain-cell intensity and duration to be drawn from more than one pair of exponential distributions.The new model, termed the Generalised NSRP model (GNSRP), leads to a significant increase in parameterization over the original NSRP model, although the author presents an intelligent way to simplify calibration by relating model parameters to harmonic signals.While improvement is achieved in the fit to hourly extremes, the performance of the model in replicating other important statistics is not presented, in particular autocorrelation and the proportion of dry periods.Both of these properties are addressed by Rodriguez-Iturbe et al. (1987, 1988) for the Bartlett-Lewis model with the inclusion of a "high frequency jitter" and randomization of the rain-cell duration parameter η.Entekhabi et al. (1989) present a randomized version of the Neyman-Scott model with significant improvement in the fit to dry periods.However, because no analytical expression was available for the proportion of dry periods, this statistic was not used in model fitting, and other model parameters were not allowed to vary from storm to storm with randomization.Consequently, while the MBL and the GN-SRP models each allow rain-cell intensity and duration to be drawn from more than one pair of distributions, the MBL structure is preferred because it has fewer parameters.
In a later study, Cowpertwait (1998) hypothesized that including higher-order statistics in the fitting routine for mechanistic rainfall models would give a better fit to the tail of the empirical distribution for rainfall amounts.Focussing on the original (fixed η) NSRP model, analytical equations for skewness of the aggregated rainfall depth are presented and used in fitting the models.Empirical analysis showed that including skewness in the fitting statistics improved the estimation of Gumbel distribution parameters from simulated maxima when compared with parameters obtained from observed annual maxima.
A criticism of the rectangular pulse model structure by Evin and Favre (2008) is that it assumes independence between rain-cell intensity and duration.Following previous attempts to link the two variables (Kakou, 1997;De Michele and Salvadori, 2003;Kim and Kavvas, 2006), Evin and Favre (2008) present a new NSRP model in which the dependence between rain-cell depth and duration is explicitly modelled using a selection of copulas.While the authors are not primarily motivated to improve the estimation of rainfall extremes, good estimation of fine-scale extremes is achieved.However, the manner in which the results are presented makes interpretation and comparison with other studies difficult.In the first instance, the extreme performance of all models is almost entirely indistinguishable, indicating that no overall improvement is achieved.Secondly, monthly annual extremes are presented at hourly and daily scales but without clearly stating which month in the year.Despite this, it is likely that monthly extremes will have lower variability than those taken from the whole year, and hence model performance is likely to be better.On the basis of the results presented, it is not clear that explicitly modelling dependence between rain-cell depth and duration with copulas offers any discernable benefit over the original model structure.
Theoretically, copulas offer an attractive framework for modelling the dependence structure between rainfall intensity and duration.However, the obvious mechanism for building copula dependence into mechanistic rainfall models is at the rain-cell level as per Evin and Favre (2008).This approach draws upon the intuition that, just as for the rainfall amounts of storm events, rain-cell amounts may be correlated with their duration.Such intuition follows earlier studies into the dependence structure between rainfall intensity and duration (Bacchi et al., 1994;Kurothe et al., 1997) -although as stated by Vandenberghe et al. (2011, p. 14), "it is not very clear in which way this modelled dependence at cell level alters the dependence between the duration and mean intensity of the total storm".
In recent years, renewed focus on estimating rainfall extremes at hourly and sub-hourly scales has led to the development of a new type of mechanistic rainfall model based on instantaneous pulses (Cowpertwait et al., 2007;Kaczmarska, 2011).In this model structure, rectangular pulses are replaced with a point process of instantaneous pulses with depth X and zero duration, the summation of pulses giving the aggregated time rainfall intensity.Considered initially to offer a more suitable representation of rainfall at sub-hourly scales than rectangular pulses, Kaczmarska et al. (2014) found that the best-performing Bartlett-Lewis Instantaneous Pulse (BLIP) model effectively generated rectangular pulses when depth X was kept constant and cell duration η was randomized.Because of the very large number of pulses generated within cells, the authors noted that this model structure imposes the "most extreme form of dependence" - Kaczmarska et al. (2014Kaczmarska et al. ( , p. 1977)).Consequently, the authors developed a new rectangular pulse model in which both η and µ x are randomized (BLRPR X ), which was found to perform equally as well as the randomized version of the BLIP model but without additional parameterization.
3 Censored modelling for fine-scale extremes Despite the model improvements outlined in Sect.2, there is an ongoing tendency for stochastic mechanistic models to underestimate extremes at hourly and sub-hourly scales.Consequently, the practitioner is required to employ additional methods for better extreme value performance, including disaggregation (Koutsoyiannis andOnof, 2000, 2001;Onof et al., 2005;Onof and Arnbjerg-Nielsen, 2009;Kossieris et al., 2018) and model fitting with more information about the variability of precipitation (Kim et al., 2013a).We propose a censored approach to mechanistic rainfall modelling for improved estimation of fine-scale extremes by focussing model fitting on the heavy portion of the rainfall time series.The aim of this research is to investigate whether mechanistic models can be used as simulators of fine-scale design storm events to reduce the impact of low observations on the estimation of fine-scale extremes.In this approach, rainfall below a low censor is set to zero and rainfall over the censor is reduced by the censor amount.The effect is to generate a time series of heavy rainfall based on the observed record in which the proportion of dry periods is increased and rainfall amounts are reduced.
Figure 1 shows two arbitrary censors applied to 15 min data at Atherstone in 2005 (refer to Sect.5.1 for a description of the data).The left plot shows the uncensored rainfall, and the two right plots the change in rainfall with increasing censors.The reduced rainfall amounts are shown on the secondary y axes.It can be seen from these plots that the minimum recorded rainfall is 0.2 mm, which corresponds to the tip volume of the tipping bucket rain gauge.Compared with higher rainfall amounts this volume is recorded with very high frequency throughout the year at the 15 min resolution.
Censored rainfall synthesis is a method for estimating subhourly to hourly extremes.Because observations below the censor are omitted from model fitting, censored model parameters are scale-dependent and can only be used to simulate storm profiles above the censor at the same scale as the training data.It is the ability to simulate the heavy portion of storm profiles which enables extreme rainfall estimation.The basic procedure is as follows.
Hydrol 1.For the chosen temporal resolution, select a suitable censor (mm) and apply it to the observed rainfall time series by setting rainfall amounts below the censor to zero, and reducing rainfall amounts over the censor by the censor amount.
2. Fit the mechanistic rainfall model to the censored rainfall by aggregating the censored time series to a range of temporal scales and calculating summary statistics as necessary for model fitting.
3. Simulate synthetic rainfall time series at the same resolution as the training data in Step 1 and sample annual maxima.
4. Restore the censor to the simulated annual maxima and plot against the observed maxima.

Model structure and selection
Mechanistic point process rainfall models, first developed by Rodriguez-Iturbe et al. (1987), exist in various forms, although all models are formulated around two key assumptions about the rainfall-generating process.Firstly, rainfall is assumed to arrive in rain cells following a clustering mechanism within storms.Secondly, the total rainfall within cells is represented by a pre-specified rainfall pattern which describes the rain-cell duration and amount.The continuoustime rainfall is the summation of all rainfall amounts in time t.Most models assume rectangular pulses to describe rainfall amount and duration, although alternative patterns have included a Gaussian distribution (Northrop and Stone, 2005) and instantaneous pulses (Cowpertwait et al., 2007(Cowpertwait et al., , 2011;;Kaczmarska et al., 2014).In this latter formulation, pulses are assumed to arrive according to a Poisson process within cells, with each pulse representing an amount with zero duration.The continuous-time rainfall is therefore the summation of all pulse amounts in time t.
In the original form of the model, storms arrive according to a Poisson process with rate λ, and terminate after an exponentially distributed period with rate γ .The arrival of rain cells within storms follows a clustering mechanism which defines a secondary Poisson process with rate β.Two clustering mechanisms are specified by Rodriguez-Iturbe et al. (1987): the first is the Neyman-Scott mechanism in which the time intervals between storm and cell origins are assumed to be independent and identically distributed random variables; the second is the Bartlett-Lewis mechanism in which the time intervals between successive cell origins are independent and identically distributed random variables.In each case, the time intervals are assumed to be exponentially distributed.Rain-cell profiles are rectangular with heights X for amounts, and lengths L for durations.Both X and L are assumed to be independent of each other and follow exponential distributions with parameters 1/µ x and η respectively.Figure 2 shows a graphical illustration of the continuous-time rainfall generation process.Table 1 sets out the model parameters for the original and two randomized Bartlett-Lewis rectangular pulse models (BLRP, BLRPRη, and BLRPR X ), as well as the original Neyman-Scott rectangular pulse model (NSRP).
The original BLRP and NSRP models with exponential cell depth distributions are the most parsimonious, each having only five parameters (see Table 1).A limitation of these models is that their simplicity implies all rainfall -stratiform, convective, and orographic -has the same statistical properties.On the assumption that rainfall may derive from different storm types, in particular convective and stratiform, it is physically more appealing to allow the statistical composition of rainfall models to vary between storms.

(a)
Index Time Rain cell amount Table 1.Model parameters for the original and two randomized Bartlett-Lewis rectangular pulse models (BLRP, BLRPRη, and BLRPR X ) and the original Neyman-Scott rectangular pulse model (NSRP).
Because the exponential distribution is a special case of the gamma distribution where r is equal to 1, µ x 2 = 2µ2 x and µ x 3 = 6µ3 x .Therefore it is not necessary to parameterize r for the exponential distribution, meaning the exponential versions of these models require one parameter less with r set to 1 in calibration.
Two different approaches have been developed to accommodate the simulation of different rainfall types with rectangular pulses.For the Neyman-Scott model, concurrent and superposed processes have been developed in generalized (Cowpertwait, 1994) and mixed (Cowpertwait, 2004) rectangular pulse models respectively.Both models enable explicit simulation of multiple storm types, although their increased parameterization and consequent impact on parameter identifiability means that it is undesirable to simulate more than two storm types.For the Bartlett-Lewis model, randomization of the rain-cell duration parameter η (Rodriguez-Iturbe et al., 1988;Onof andWheater, 1993, 1994b) with a gamma distribution allows all storms to be drawn from different distributions.Because rain-cell durations are assumed to be ex-ponentially distributed, rain cells with high values of η are more likely to be shorter in duration, and those with low values of η will typically have longer durations.Additionally, the rate at which rain cells arrive, and the storm durations, are defined in proportion to η by keeping the ratios β/η and γ /η constant (equal to κ and ϕ respectively).This means that, typically, shorter storms will comprise shorter rain cells with shorter rates of arrival and the opposite for longer storms, which is characteristic of the differences between convective and stratiform rainfall.
The modified (random η) Bartlett-Lewis model (see BLRPR η in Table 1) of Onof andWheater (1993, 1994b) is the most parsimonious of the model structures able to accommodate multiple storm types comprising a minimum of six parameters for the exponential version.The modified (random η) Neyman-Scott model has the same number of parameters as the modified Bartlett-Lewis model, but because there is no evidence that it has any advantage over the latter, it is excluded from this study.The updated random η Bartlett-Lewis model with mean cell depth µ x also randomized (see BLRPR X in Table 1) requires fewer parameters than its instantaneous pulse counterpart and the same number of parameters as the modified BLRPR η model.Structurally, it is identical to the modified model, although µ x is also allowed to vary randomly between storms by keeping the ratio ι = µ x /η constant.
Because the Neyman-Scott and Bartlett-Lewis clustering mechanisms are considered to perform equally well, model selection is limited to the most parsimonious model structures within the Bartlett-Lewis family of models: the original model (BLRP), the linear random parameter model (BLRPRη), and the linear random parameter model with randomized µ x (BLRPR X ).Hereafter, these models are referred to as BL0, BL1, and BL1M respectively.For the models used in this study, it is assumed that rain cells start at the storm origin to prevent the simulation of empty storms which can occur with the Bartlett-Lewis clustering mechanism if the first rain cell starts after the end of the storm.
5 Data and model fitting

Data selection
Estimation of fine-scale extremes with censored rainfall simulation is performed on two gauges: Atherstone in the UK and Bochum in Germany.Atherstone is a tipping bucket rain gauge (TBR) operated and maintained by the Environment Agency of England.The record duration is 48 years from 1967 to 2015, with one notable period of missing data from January 1974 to March 1975.The reason for the missing data is unknown, although it is not expected to affect model fitting and the estimation of extremes.This site was selected from all TBRs in the Environment Agency's Midlands Region on the basis that the number of Environment Agency quality flags highlighted as "good" in the record is greater than 90 %, and the number of "suspect" flags less than 10 % (92.3 and 6.7 % respectively).Between 8 February 1981 and 20 November 2003 the gauge resolution is 0.5 mm.Before and after this period it is 0.2 mm.In the period before 8 February 1981, the TBR record includes a number of observations of 0.1 mm at precisely 09:00:00.It is assumed that these are manual observations to correct the rain-gauge totals to match with check gauge totals following quality checks of the data.
Bochum is a Hellmann rain gauge operated and maintained by the German Meteorological Service.It uses a floating pen mechanism to record rainfall on a drum or band recorder with a minimum gauge resolution of 0.01 mm.The duration is 69 years from 1931 to 1999, and the data are aggregated to a minimum temporal resolution of 5 min.These sites are selected to represent rainfall in different geographical regions obtained using different measurement techniques.Figure 3 shows the locations of these two gauges.

Parameter estimation
Model fitting is performed in the R programming environment (R Core Team, 2017) using an updated version of the MOMFIT software developed by Chandler et al. (2010) for the UK Government Department for the Environment, Food and Rural Affairs (DEFRA) FD2105 research and development project (Wheater et al., 2007a, b).In this software, parameter estimation is performed using the generalized method of moments (GMM) with a weighted least squares objective function: The reader is referred to Wheater et al. (2007b; Appendix A) for a detailed explanation of the fitting methodology.
The GMM is preferred for mechanistic rainfall models because the complex dependency structure and marginal distribution of aggregated time series make it very difficult to obtain a useful likelihood function (Rodriguez-Iturbe et al., 1988).In this procedure, the difference between observed and expected summary statistics of the rainfall time series at a range of temporal scales is minimized, giving an optimal parameter set θ where t = (t 1 . ..t k ) is a vector of k observed summary statistics, and τ (θ ) = (τ 1 (θ). ..τ k (θ)) is a vector of k expected summary statistics which are functions of θ = (θ 1 . ..θ p ) , i.e. of the vector of p model parameters for which analytical expressions are available.The ith summary statistic is weighted according to the inverse of its observed variance ω i = 1/var(t i ) where var(t i ) is the ith diagonal element of the estimated covariance matrix of the observed summary statistics, ˆ .While this weighting is not optimal, it provides a reasonable approximation to the optimal weights for the GMM giving robust estimation of the parameter standard errors (Chandler et al., 2010).Other weights can be applied allowing the user to influence the dominance of specific rainfall properties, although for unbiased estimates of the sum- mary statistics the weights must be independent of the model parameters and the data (Wheater et al., 2007b).
Typically, the vector of observed summary statistics t comprises the mean, variance, auto-correlation, and proportion of dry periods for temporal scales between 1 and 24 h.Prior to model fitting and to allow for seasonality, summary statistics are calculated for each month over the record length and pooled between months.For each month, the pooled statistics are used to estimate the covariance matrix of model parameters required for parameter uncertainty estimation, and the mean of the monthly statistics.Therefore 12 parameter sets are obtained for the whole year.
Model parameters are estimated using two minimization routines.First, Nelder-Mead optimizations are performed on random perturbations around user-supplied parameter values to identify promising regions of the parameter space.Following a series of heuristics to identify the best-performing parameter set, random perturbations around these values are used as new starting points for subsequent Newton-type optimizations.The parameter set with the lowest objective function is the best-performing and selected for that month.Following the approach employed by Kaczmarska (2013) to obtain smoothly changing parameters throughout the year, this two-step optimization is only applied to one month.Subsequent parameter estimation is based on a single Newtontype optimization using the previous month's estimate as the starting point.Testing of this approach has shown that when the parameters are well identified, the same seasonal variation is achieved regardless of the starting month.The sampling distribution of the estimators resulting from the GMM minimization routine is approximately multivariate normal (MVN).The optimal parameter set is estimated by the mean of this distribution, and the covariance matrix is estimated from the Hessian of the least squares objective function S (Wheater et al., 2007b).The MVN distribution of model parameter estimators is used to estimate 95 % confidence intervals for the parameter estimates.On occasions that the model parameters are poorly identified, it may not be possible to calculate the Hessian of the objective function, preventing the estimation of parameter uncertainty.

Experimental design
To test the effectiveness of censored rainfall modelling for the estimation of fine-scale extremes, the approach is applied to three temporal scales: 5, 15, and 60 min.At each scale, the three selected Bartlett-Lewis models are fitted to both datasets with incrementally increasing censors.The minimum increments applied at each resolution are 0.05, 0.10, and 0.20 mm respectively.Initial experiments with the coefficient of skewness and proportion of dry periods included in model fitting for censored data were limited by the inability to obtain well-identified parameters for some or all months.While good model fits were obtained for some low censors, extreme value estimation continued to be underestimated.On the basis that censoring is a new approach to enhance the estimation of rainfall extremes, skewness is not considered to be an important fitting statistic for censored simulations.Furthermore, because censored models cannot be used to generate continuous time series of the sort which may be used for hydrological modelling, the proportion of dry periods is also considered to be unimportant for censoring.Consequently for censored model calibration, the choice of fitting statistics is reduced to the 1 h mean, the coefficient of variation and lag-1 autocorrelation of the rainfall depths at the censor resolution, and the 6 and 24 h resolutions.Again, to ensure well-identified model parameters for the Atherstone dataset, it was necessary to extend the choice of fitting statistics to include the 1 h statistics for 5 min simulations.This was neither necessary for 15 and 60 min simulations at Atherstone nor the Bochum dataset.
For all simulations the fitting window is widened to 3 months; hence, for any given month the models are fitted to data for that month, together with the preceding and following months.This approach is used to increase the data available for fitting the models when censoring on the basis that censoring removes data which would otherwise be used in fitting.Tests have shown that widening the fitting window from 1 to 3 months has the effect of smoothing the seasonal variation in model parameters and improving parameter identifiability.There is also negligible impact on the estimation of summary statistics and extremes under the model parameters.
For the two randomized models, BL1 and BL1M, the gamma shape parameter α is constrained to a fixed value in calibration and simulation.The gamma shape parameter α is an insensitive model parameter and can take any value within a very large range without significant impact on the estimation of summary statistics or extremes.For the BL1 model, parameterization without an upper bound on α often results in poor identifiability with parameter estimates in the thousands to tens of thousands.For the BL1M model, α is typically better identified than for BL1, with a tendency to move towards the lower boundary.In order to avoid having infinite skewness, α must be greater than 4 for the BL1 model and 1 for the BL1M model (see Kaczmarska et al., 2014, and references therein for a discussion of these criteria).Therefore, by fixing α at 100 for the BL1 model and 5 for the BL1M model, the number of parameters to be identified for these models is reduced by one.All models are fitted using the exponential distribution for mean cell depth.This further reduces the number of model parameters to be fitted for both uncensored and censored models; therefore, in all cases the ratio of SD to the mean cell depth (r = σ x /µ x ) is fixed at 1. Sensitivity of the gamma shape parameter (α) and its impact on extreme estimation is investigated in Appendix A. Fitted model parameters for the BL1M model are presented in Appendix B for 5 and 15 min rainfall at both sites for uncensored and censored rainfall using censors selected in Sect.6.2 (Table 2).

Extreme value estimation
Rainfall extremes are estimated from the models by sampling annual maxima directly from simulations.For each model fitted to uncensored data, 100 realizations of 100 years in dura-tion are simulated using parameters randomly sampled from the multivariate normal (MVN) distribution of model parameter estimators.This allows model parameter uncertainty to be represented in the spread of the MVN extreme value estimates (hereafter referred to as MVN realizations), covering the full range of observations.Extreme value estimation up to the 1000-year return level is also provided to indicate the potential magnitude of rarer events.For this extrapolation, extremes are estimated from one realization using the mean of the MVN distribution of parameter estimators (hereafter referred to as the optimal estimates).To ensure stability of the extreme value estimates up to approximately the 1000-year return level, simulations have been extended to 10 000 years.
Extreme value estimation for the censored calibrations is shown in Figs.4-6 for the 5, 15, and 60 min temporal resolutions respectively.The top three plots in each figure show the results for Bochum, and the bottom three plots the results for Atherstone, with observed and simulated annual maxima plotted using the Gringorten plotting positions.All plots show the equivalent extreme value estimates obtained without censoring by simulating one realization of 10 000 years in duration with the optimal parameter set.Upper limits on censoring were identified when model parameterization noticeably deteriorated, resulting in the means of the MVN realizations deviating away from the optimal.Results presented are limited to the four highest censors with well-identified model parameters, together with 95 % simulation bands.The simulation bands show the range of extreme value estimation between the 2.5th and 97.5th percentiles of the 100 MVN realizations for each simulated data point.
All censored models have significantly improved the estimation of extremes at each site and scale with very good estimation by all three model variants, particularly at the 5 and 15 min scales.At these scales, the estimation of extremes with the four censors presented has approximately converged on the observations.At the 60 min scale there is notable improvement in the estimation of extremes, with some convergence in estimation with increasing censors, although there is continued underestimation of the observed.The 95 % simulation bands for all censored models broadly bracket the observations and are largely unvaried with increasing censors, other than with the BL1M model at the 60 min resolution.
At the 5 min scale, estimation has converged on the observations with censors between 0.5 and 0.65 mm at Bochum, and between 0.6 and 0.75 mm at Atherstone.For all three models there is slight underestimation of extremes higher than approximately the 10-year return period, although the BL1M model accurately estimates the highest observed extreme at both sites.At the 15 min scale, convergence at Bochum has occurred for censors between 1.0 and 1.3 mm, while at Atherstone convergence has occurred for censors between 0.6 and 0.9 mm.As for the 5 min resolution models, the BLIM model appears to perform slightly better than the BL0 and BL1 models, resulting in improved estimation of the highest observed extremes and elevated estimates of the 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 Bochum 5 minute extremes 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000  1000-year return period rainfall at both sites.At the 60 min resolution, there is good convergence in estimation for all three models at Bochum, and the BL1M model at Atherstone.However, extreme value estimation with the BL0 and BL1 models at Atherstone is more widely spread across the applied censors.For the BL0 and BL1 models, the 0.2 mm censor results in much lower estimates than the three higher censors, although the means of the MVN realizations for the 0.6 and 0.8 mm censors are starting to deviate away from the optimum realization.For the BL1M model, there is good convergence between the optimal realizations with each censor, although the means of the MVN estimates for the 0.6 and 0.8 mm censors have significantly deviated from the optimum.
The means of the MVN realizations for the BL1M model at Atherstone with the 0.6 and 0.8 mm censors (see Fig. 6) diverge from the optimum because of the generation of unrealistic extremes.This divergence is also observable in the larger spread of 95 % simulation bands over 100 realizations.
While it has been possible to fit the model, Fig. 7 shows that as censoring has increased to 0.8 mm, confidence intervals on model parameters have widened for several months of the year, notably January, February, and June.When sampling from the MVN distribution in simulation, these large confidence intervals mean that there is a high chance of sampling parameters which deviate significantly from the mean of the distribution, thereby giving rise to a wide spread in extreme value estimates.These large confidence intervals indicate that the confidence in parameter estimation decreases with higher censors, and consequently the model error is too large for the reliable simulation of extremes.

Validation
The rainfall extremes presented in Sect.6.1 have been generated mechanistically using model parameters derived from central moments of the censored rainfall time series.While censored models cannot be used to simulate the whole rainfall hyetograph, it is important to ensure that the process by which the extremes are estimated is reliable.Therefore, model performance is validated in the usual way for this class of model by comparing the analytical summary statistics under the model parameters with the observations -here the observations are censored.The lowest censors presented in Figs.4-6 are selected for validation.No distinction is made between models in this choice, although it is recognized that there is some variation in the extreme value performance of specific censors between model types.See Table 2 for censor selection at each site and scale.

Replication of fitting statistics
Figure 8 shows the seasonal variation in mean, coefficient of variation, and lag-1 autocorrelation for all three models at Atherstone with the selected censors in Table 2. Comparable Table 2. Censor selection for model validation.

min
15 min 60 min Bochum 0.5 mm 1.0 mm 1.0 mm Atherstone 0.6 mm 0.6 mm 0.2 mm performance is achieved with the models for Bochum and hence these results are not presented.The plots show the estimated summary statistics calculated using the optimum parameter estimates, together with 95 % simulation bands obtained by randomly sampling 100 parameter sets from the multivariate normal distribution of model parameters.Summary statistics are estimated under the model for all 100 parameter sets and simulation bands generated for the range of estimates between the 2.5th and 97.5th percentiles.Because models are fitted over 3-monthly moving windows, estimated summary statistics are compared with summary statistics for censored observations for the same periods.Fitting statistics for the 6 and 24 h scales are not shown.The limits on the vertical Y axes are optimized at each site and scale; therefore, the reader is advised to pay careful attention to the scales when comparing summary statistics.
All models perform very well with respect to replicating the summary statistics used in fitting, with the 95 % simulation bands comfortably bracketing the observations.The estimated summary statistics are very close to the observed ones, with all models performing equally well.The seasonal variation in mean monthly rainfall varies between scales because there is a higher proportion of low observations at short temporal scales removed by the censors.The greater prominence in seasonal variation shown in plots a and b indicates that the summer months (approx.April-October) are more prone to short intense bursts of rain, and the winter months longer periods of low-rainfall intensity.This is consistent with there being more convective rainfall in the summer, and stratiform rainfall in the winter.The plots in Fig. 8 demonstrate that the models are able to reproduce the censored fitting statistics, confirming the reliability of the process.

Replication of statistics not used in fitting
A consequence of censoring is that it truncates the thin tail of the rainfall amount distribution, which significantly changes its shape.Because this truncation is not replicated in the analytical equations of the models used in this study, the models are not expected to be able to reproduce skewness well.Therefore this statistic is excluded from validation.Conversely, censoring is not expected to significantly impact the ability of the models to estimate the proportion of wet periods.Despite this, censoring significantly changes this statistic at fine temporal scales.Figure 9 shows the seasonal variation in the proportion of wet periods for all three models at both sites with the selected censors in Table 2.The ability of the models to reproduce the proportion of wet periods is generally good, although there is a tendency for all models to overestimate this statistic at both sites.At the 5 min resolution for Bochum, the 95 % simulation bands comfortably bracket the observations between the months of May and October, although there is overestimation in the other months and for all months at the 15 and 60 min scales.At Atherstone, there is good representation of the proportion of wet periods at the 15 min scale, but overestimation at the 5 and 60 min scales.Generally, there is very slightly better agreement in the summer months which, as highlighted in Sect.6.2.1, may be more prone to short intense downpours at fine temporal scales.This suggests that the censored models may be more effective at simulating the heavier shortduration rainfall characteristic of summer convective storms than the longer-duration low-intensity rainfall characteristic of winter storms.

Discussion on censor selection
The censors selected for validation in Table 2 were chosen based on their extreme value performance.For the estimation of extremes at other locations, it would be preferable to have a set of heuristics to guide censor selection.The following discussion of extreme value estimation performed in this study is intended to guide practitioners in the application of censored modelling.

Stability of confidence intervals
Upper limits on censoring were identified where model parameters were either poorly identified or the means of the MVN realizations deviated significantly from the observations.The onset of this effect was observed in Fig. 6 for estimation of hourly extremes at Atherstone with the BL1M model.Figure 10 shows the change in 95 % simulation bands and the means of the MVN realizations obtained from censored models with well-identified and poorly identified parameters for 15 min data at Bochum and Atherstone.The comparison is made between extremes for the selected censors given in Table 2 (1.0 and 0.6 mm respectively) and extremes from higher censors (1.5 and 1.0 mm respectively).
Simulation bands on extreme value estimates for Bochum 15 min rainfall obtained with censors from 1.0 to 1.3 mm, and for Atherstone with censors from 0.6 to 0.9 mm (Fig. 5), are broadly stable and unchanging.This is indicative that parameterization across each model variant and censor is  good, enabling robust estimation of extremes.As the censor at Bochum is increased to 1.5 mm (Fig. 10a-c), there is a noticeable increase in the upper limit on the simulation bands and the means of the MVN realizations have di-verged, leading to overestimation of the extremes.Increasing the censor at Atherstone to 1.0 mm has resulted in very significant widening of the simulation bands and divergence of the means of the MVN realizations (Fig. 10d-f).In each case, D. Cross et al.: Censored rainfall modelling for estimation of fine-scale extremes (a) BL0 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000 2 5 10 25 100 300 1000  Hydrol.Earth Syst.Sci., 22, 727-756, 2018 www.hydrol-earth-syst-sci.net/22/727/2018/ this divergence results from the generation of unrealistic extreme value realizations which are shown in Fig. 10 (light grey lines).
While it has been possible to fit models to data with these high censors, examination of the parameter estimates and associated uncertainty reveals that parameter identifiability is reducing.Figure 11 shows the seasonal variation in estimates for the BL1M model parameters α/ν, κ and ϕ fitted to Bochum 15 min data with a 1.5 mm censor.Parameters λ and ι are well identified with tight confidence brackets around the optimum, while r and α are fixed; therefore, these parameters are not shown.Confidence intervals on α/ν, κ and ϕ are very large in the winter months, indicating that the identifiability of these parameters has deteriorated.When sampling from the MVN distribution for model parameter estimators in simulation, these large uncertainties give rise to poor extreme value estimation.The same behaviour was observed for the BL1M model at Atherstone for 60 min data, as shown in Fig. 7.
With the upper bound on censoring identified, the obvious question is how to identify a lower bound.The results presented in Figs.4-6 suggest that there is convergence in the estimation of extremes with increasing censors.If so, when is the onset of convergence?Figure 12 shows the change in extreme value estimation with censor for 15 min rainfall at Bochum (top plots) and Atherstone (bottom plots) for 10and 25-year return periods.
At both locations, divergence in the means and spread of the MVN realizations shown in Fig. 12 is easily identified with the very large box-plot whiskers at 1.5 and 1.0 mm censors for Bochum and Atherstone respectively.The plots for Bochum also show a large spread in the extreme realizations, with a 1.4 mm censor for the BL1M model, suggesting that parameter identifiability is deteriorating at this censor.
At Atherstone, there is clear evidence of convergence in estimation between censors 0.5-0.9mm.However, convergence is less obvious at Bochum.At Bochum, there is continual improvement in extreme value estimation with the increasing censors, although there is a perceptible reduction in improvement with each successive increase in censor.For censors of 0.7 mm and above, all model realizations bracket the observed extremes (horizontal dashed blue line), which is also true for censors above 0.5 mm at Atherstone.Therefore, ranges may be identified at both sites for censors, which may be considered to give satisfactory estimation of extremes: 0.7-1.3mm at Bochum and 0.5-0.9mm at Atherstone.

How much rainfall to censor?
In Sect.7.1 we identify plausible censor ranges based on parameter stability and convergence of extreme value estimation.However, this does not address the question of how much rainfall to censor.Because extremes are generated mechanistically, we want to simulate the storm event hyetograph; therefore, it is in our interest to keep the cen-sor low in relation to the rainfall depth profile.The most basic check is that the minimum observed extreme (here designated as the smallest annual maxima) is greater than the censor being used.This is true for all the sites and scales investigated in this study, with the lowest observed annual maxima of 1.6 mm occurring at the 5 min scale in Atherstone.However, this significantly exceeds the maximum censor applied to 5 min data at Atherstone, 0.75 mm (see Fig. 4); therefore, it is unlikely that a well-parameterized model would be achieved.
Figure 13 shows the empirical cumulative distribution function (ECDF) plots for the above zero rainfall records at Bochum and Atherstone aggregated to 5 and 15 min resolutions.All the censors used for the estimation of fine-scale extremes in Figs.4-6 are shown, with the top three censors highlighted in magenta.The censors selected for model validation (Table 2) are highlighted in blue, and the lower limits on censors identified in Sect.7.1 for 15 min rainfall are shown and highlighted in green.The ECDF plots are truncated at the 99th percentile to aid comparison of the applied censors; therefore, the maximum rainfall is highlighted in red text on the right of each plot.For all censors, their rainfall percentile values are shown with the colour matching the plotted lines.
It can be seen from Fig. 13 that a substantial proportion of the above zero rainfall record is masked from the models with censoring.At the 5 min scale, the selected censor of 0.5 and 0.6 mm removes in excess of 98 and 96 % of the above zero rainfall from Bochum and Atherstone respectively.At the 15 min scale, the selected censors of 1.0 and 0.6 mm remove in excess of 96 and 81 % respectively.These percentiles are high and support the hypothesis that mechanistic models may be poor at estimating fine-scale extremes because the training data are dominated by low observations.A striking difference in the ECDF plots for the two locations is the smoothness of the curves.The stepped nature of the Atherstone plots is very pronounced and reflects the resolution of the gauge: 0.5 mm from February 1981 to November 2003, and 0.2 mm before and after these dates.The stepped nature of the plots at Atherstone highlights that the selected censor quantiles (blue) are just greater than the 0.5 mm quantiles.We also know from Fig. 12 that a censor of 0.5 mm for 15 min rainfall at Atherstone would give very similar extreme value estimation to the selected 0.6 mm censor (highlighted in green in the ECDF plot, Fig. 13).This implies that to improve the estimation of fine-scale extremes at Atherstone, it has been necessary to remove all observations which correspond to the gauge resolution.
While the proportion of rainfall observations removed prior to model fitting is large -over 90 and 80 % for 5 and 15 min rainfall from Bochum and Atherstone respectivelycomparison with the maximum rainfall amounts and an assessment of the number of independent peaks over the censor demonstrate that the censors are low.Table 3 shows the proportion of maximum rainfall and the number of indepen- Table 3. Proportion of maximum rainfall and number of independent peaks per year for the selected censors given in Table 2.
The number of peaks over the censors are estimated using a temporal separation of 48 h to define independence.The proportion of the maximum observed rainfall is less than 6 % in all cases, which is very low considering that the maximum recorded rainfall across both sites and scales is just 27.9 mm for Bochum 15 min rainfall.For a standard peaksover-threshold extreme value analysis, the threshold is typically set so that between three and five independent peaks per year remain in the partial duration series.Using a temporal separation of 48 h to define independence, the number of peaks per year retained after censoring is between 27 and 65 (Table 3).The actual number of peaks retained for fitting the Bartlett-Lewis models is greater than this because serial dependence in the rainfall time series is simulated with mechanistic modelling.While it is possible to estimate return levels for serially dependent extremes using extreme value theory, the analysis set out in Fawcett and Walshaw (2012) demonstrates that estimating the extremal index is non-trivial and can be subjective.

Further discussion and conclusions
The estimation of rainfall extremes presented in this study using censored rainfall simulation is highly promising and offers an alternative to frequency techniques.The estimation of extremes at sub-hourly scales has far exceeded expectations, with all three models giving a very high level of accuracy across a range of censors.However, censoring uses rainfall models in a way they were never previously intended.Rainfall models have invariably been used for simulation of long-duration time series across a range of scales for input into hydrological and hydrodynamic models.Censored rainfall synthesis cannot be used in this way because only the heavy portion of the hyetograph is simulated.
The success of this research is to broaden the scope of mechanistic rainfall modelling and ask new questions of it.Mechanistic models and related weather generators are very powerful at simulating key summary statistics for a range of environmental variables.An area where these models have consistently underperformed is the estimation of fine-scale extremes.Efforts to improve extreme value estimation at fine temporal scales have focussed on structural developments.But those developments have always been undertaken in the context of rainfall time-series generation.Continued underestimation at fine temporal scales has given rise to the notion that rectangular pulse models are potentially "unsuitable for fine-scale data" (Kaczmarska et al., 2014(Kaczmarska et al., , p. 1985)).
For effective scenario planning with hydrological models, good reproduction of rainfall time series is necessary, with accurate estimation of key summary statistics.However, for assessment of extremes and estimation of storm O bs. 25 year event 19.2 mm O bs. 25 year event 15.5 mm • 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (l) BL1M  profiles, good replication of rainfall central moments is arguably less important.The ability of the censored models to adequately reproduce the central moments used in calibration was checked to ensure that the process by which the extremes are constructed is reliable.Because rainfall over the censor is by definition coincident with rainfall below the censor, the censored models can be used to estimate uncensored extremes by simply restoring the censor to the estimates.Extreme rainfall estimation with censoring across all models, scales, and sites is significantly improved on that without censoring as shown in Figs.4-6.Up to approximately the 25year return period, estimation is broadly equivalent across all models.For rarer events, the BL1M model appears to perform better than the other two at the 5 and 15 min scales at Bochum and Atherstone by accurately estimating the highest observations at those scales.This improvement over the BL0 and BL1 models is significant in the event that extreme rainfall estimation is required beyond the range of observations.This is demonstrated in all four cases (5 and 15 min scales at Bochum and Atherstone) with the higher estimation of extremes at the 1000-year return level by the BL1M model compared with the other two.Below approximately the 25year return period the differences in extreme rainfall estimation are so small that it is not possible to single out any one Hydrol.Earth Syst.Sci., 22, 727-756, 2018 www.hydrol-earth-syst-sci.net/22/727/2018/ model as having the best overall performance, although for increasingly rare events the results suggest a preference for the BL1M model.This result supports the findings reported by Kaczmarska et al. (2014) that the dependence structure between rain-cell amounts and duration in the BL1M model is beneficial in estimating fine-scale extremes.In all three models, there is a slight upward curvature in the Gumbel plotting of extremes, which is consistent with the GEV and GP distributions taking a positive shape parameter (ξ > 0).This curvature is more pronounced for the BL1M model, which would be consistent with a higher positive shape parameter.While extreme value theory encompasses a range of distributions characterized by the sign of the shape parameter, Koutsoyiannis (2004a) argues that rainfall extremes naturally follow the Fréchet distribution for annual maxima (equivalent to the GEV with ξ > 0), supported by empirical evidence in Koutsoyiannis (2004b).The positive growth in extremes observed in our results is consistent with this hypothesis, and suggests that important information about the distribution of extremes is captured in the full storm profile hyetograph over the low censor.Further research is required to investigate the theoretical link between mechanistic model parameters and their extreme value performance.
The results presented in this paper show that censored rainfall modelling has worked for single-site data from two very different locations, and recorded using different gauging techniques.Consistency in the relative magnitude of selected censors identified at each location, and the stability of estimation across a range of censors, give confidence in the approach and support the original hypothesis.It is an obvious limitation of censoring that it cannot be used to obtain time series of synthetic rainfall, as is the principal application of mechanistic rainfall models.However, the intention of this research was to investigate whether mechanistic models could be used for estimation of fine-scale extremes as an alternative to frequency techniques.The accuracy of estimates for sub-hourly rainfall extremes using all three model variants is very good, although the BL1M model appears to outperform the other two models at both sites for the 5 and 15 min scales by accurately predicting the highest observed extreme.
Reducing parameterization by fixing the gamma shape parameter α in the randomized models, and increasing the data for parameterization by widening the fitting window to 3 months has enabled the models to be fitted successfully to censored observations.It is likely that these aides to parameterization are necessary because censoring truncates the statistical distribution of the training data.The analytical solutions in the models do not make this assumption; therefore, a mismatch between the training data and the models arises with censoring.At low censors, truncation is minor and the analytical solutions in the models are able to make reasonable estimates of the fitting statistics.However, as the censor increases and the mismatch grows, a point is reached at which the analytical solutions are no longer able to estimate the fitting statistics, causing deterioration in parameter identifiability.
A principal goal of this research was to improve the physical basis of short-duration extreme rainfall estimation.This has been achieved by simulating storm profiles mechanistically in a way which mimics the phenomenology of rainfall generation.This has given rise to extreme rainfall estimation which may be described as a function of a set of model parameters with physical meaning, e.g. the extreme rainfall quantile z = F {λ, µ x , δ x , δ c , µ c , δ s } for the original Bartlett-Lewis model (see Appendix B for definitions of mechanistic model parameters).Future research is required to link model parameters to environmental covariates and spatial information.The latter may follow the regionalization methodology of Kim et al. (2013b).
Further research is also required to investigate the potential for incorporating censored modelling into a multi-model approach for synthetic rainfall generation.This may take the form of simulating the rainfall below the censor using a bootstrapping approach similar to that in Costa et al. (2015), or continuous simulation of uncensored rainfall with replacement of storms simulated using the censoring approach.
Data availability.The Atherstone tipping bucket rain-gauge dataset was obtained directly from the Environment Agency for England, UK.The data are not publicly accessible because they are used by the Environment Agency for operational purposes, but can be obtained for non-commercial purposes on request.The Bochum dataset was obtained directly from Deutsche Montan Technologie and was recorded by the Emschergenossenschaft/Lippeverband in Germany.The data are not publicly accessible because they belong to the Emschergenossenschaft and Lippeverband public German water boards and are used for operational purposes.To demonstrate the insensitivity of α for the randomized Bartlett-Lewis models, the BL1 and BL1M models were fitted to Bochum 15 min rainfall with changing constraints on α.The models were fitted using the 1 h mean and the 0.25, 6, and 24 h coefficient of variation, skewness coefficient, and lag-1 autocorrelation.For the BL1 model, α is constrained between 4.1 (lower bound) and 5, 10, 25, 50, 75, and 100 (upper bounds).For the BL1M model, α is constrained between 5, 10, 25, 50, 75, and 100 (lower bounds) and infinity (upper bound).For the BL1 and BL1M models, α converges on the upper and lower bounds respectively, although because α is not held fixed, parameter uncertainty is estimated.Parameter ranges are presented in the parallel coordinate plots in Fig. A1 by sampling 1000 parameter sets from the multivariate normal distribution of model parameter estimators for 4.1 < α < 10 6 .The parameter sets corresponding to α = 100 and α = 5 are shown for the BL1 and BL1M models respectively with dashed magenta lines.
The parallel coordinate plots clearly show the insensitivity of α compared with the other model parameters.When α is constrained with upper and lower bounds of between 25 and 100 for the BL1 and BL1M models respectively, α is poorly identified and can take any value over a very large range (see Fig. A1).When α is constrained with upper and lower bounds of less than 25 for the BL1 and BL1M models respectively, the identifiability of α is improved.This insensitivity results from the shape of the fitted gamma distribution used to sample η shown in Fig. A2.
As α increases the gamma distribution converges on the normal distribution and becomes increasingly flat.Therefore, for high values of α, the probability of randomly sampling anywhere within the distribution is greater compared with low values of α.For α ≥ 50, the gamma distribution is approximately normal and the range of η values which may be randomly sampled by both models is always large, resulting in a narrow range of potential exponential distributions from which to sample L where L is the cell duration.This impacts the estimation of extremes as shown in Fig. A3. Figure A3 shows extreme rainfall estimates from the BL1 and BL1M models with α fixed at 5, 50, and 100.For α ≥ 50, extreme rainfall estimation by both models is identical.For α = 5, the BL1 model estimates lower extremes than with higher α values, while the BL1M model gives improved estimation of the growth curve of extremes.Because of this combination of parameter insensitivity and relative performance in the extremes, α is fixed at 100 and 5 for the BL1 and BL1M models respectively.

Figure 2 .
Figure 2. Rainfall generation mechanism for mechanistic stochastic models with rectangular pulses.Panel (a) shows the arrival of storms and cells.Rain-cell intensity defines the height of each cell (X), and duration the length (L).Panel (b) shows the unobserved continuous-time rainfall time series derived from the superposition of cells shown in (a).

Figure 3 .
Figure 3. Plan showing the location of the UK and German rain gauges used in this study.

Figure 4 .
Figure 4. Extreme value estimation at 5 min resolution.Optimal realizations (opt.AM) are shown with solid lines and the means of the MVN realization (mvn.AM) are shown with dashed lines.Simulation bands (SBs) are shown with filled polygons.

Figure 6 .
Figure 6.Extreme value estimation at 60 min resolution.Optimal realizations (opt.AM) are shown with solid lines and the means of the MVN realization (mvn.AM) are shown with dashed lines.Simulation bands (SBs) are shown with filled polygons.

DFigure 7 .
Figure 7.Comparison of censored BL1M model parameters for Atherstone 60 min data.Optimal parameter estimates (params.) are shown with dot-dashed lines, and parameter uncertainty is represented with 95 % confidence intervals (CIs).

Figure 9 .
Figure 9. Seasonal variation in the proportion of wet periods for selected censors, observed vs. estimated.

Figure 10 .
Figure 10.Change in 95 % simulation bands (SBs) and means of the MVN realizations (mvn.AM) for Bochum and Atherstone 15 min data with well-identified (> 1.0 and > 0.6 mm) and poorly identified (> 1.5 and > 1.0 mm) censored model parameters.The spread of the individual realizations is also shown with solid lines (mvn.RL).

DFigure 11 .
Figure 11.Fitted model parameters for the BL1M model with a 1.5 mm censor applied to Bochum 15 min data.

Figure 13 .
Figure 13.Empirical cumulative distribution function plots for Bochum and Atherstone rainfall aggregated to 5 and 15 min temporal resolutions.The plots are limited to the 99th percentile rainfall and show the rainfall quantiles corresponding to the optimum censors used in the estimation of extremes in Figs.4-6.

D
.Cross et al.:  Censored rainfall modelling for estimation of fine-scale extremes Appendix A: Bartlett-Lewis model parameter sensitivity and impact on extreme value estimation

Figure A1 .
Figure A1.Parallel coordinate plots for the two randomized Bartlett-Lewis rectangular pulse models, BL1 and BL1M.Plots show the range of Jan parameter values for uncensored models fitted to Bochum 15 min rainfall.The dashed magenta lines show the parameter sets corresponding to α = 100 and α = 5 for the BL1 and BL1M models respectively.

Figure A2 .
Figure A2.Fitted gamma distributions for the cell duration parameter η for the BL1 and BL1M models with α = 5, 50, 100, and 1000.Plots show the equivalent normal distributions fitted to the mean and SD of the gamma distributions.The range of exponential distributions for the cell duration parameter η is obtained by sampling 500 η values from the fitted gamma distributions.The exponential distributions for the mean of the fitted gamma distributions are also shown.

(c) Censor 2 [0.5 mm]
minute extremes minute extremes Figure 5. Extreme value estimation at 15 min resolution.Optimal realizations (opt.AM) are shown with solid lines and the means of the MVN realization (mvn.AM) are shown with dashed lines.Simulation bands (SBs) are shown with filled polygons.www.hydrol-earth-syst-sci.net/22/727/2018/ Hydrol.Earth Syst.Sci., 22, 727-756, 2018 738 D. Cross et al.: Censored rainfall modelling for estimation of fine-scale extremes minute extremes Hydrol.Earth Syst.Sci., 22, 727-756, 2018 www.hydrol-earth-syst-sci.net/22/727/2018/ D. Cross et al.: Censored rainfall modelling for estimation of fine-scale extremes Variation in extreme value estimation with censors for 15 min data at Bochum and Atherstone for two annual return periods: and 25 years.Plots show the optimal realizations (opt.AM), the means of the MVN realization (mvn.AM), and the spread of the individual realizations (mvn.RL) with box plots.

Table B2 .
BL1M model parameters for the Atherstone 5 min data.

Table B4 .
BL1M model parameters for the Atherstone 15 min data.