Much of our knowledge about future changes in precipitation relies on
global (GCMs) and/or regional climate models (RCMs) whose resolutions are
much coarser than the typical spatial scales of precipitation,
particularly of extremes. The two major problems with these projections are
climate model biases and the gap between the gridbox and the point scale.

To assess the impacts of hydrometeorological extremes in a changing climate,
high-quality precipitation projections on the point scale are often required.
Much of our knowledge about future changes in precipitation is based on
global (GCMs) and/or regional climate models (RCMs). Their resolutions are
much coarser than the typical spatial scales of the processes relevant for
precipitation. This particularly concerns extreme precipitation, which is far
more sensitive to resolution than mean precipitation

Different approaches have been employed to downscale and/or reduce biases of
simulated precipitation, particularly extremes: (a) high-resolution GCMs,
(b) dynamical downscaling using RCMs that are nested in the GCMs

Quantile mapping

Here we present a modification of the

In Sect.

Schematic of (black) our combined statistical bias correction and
stochastic downscaling model, and (grey) the

We separate bias correction from downscaling into two steps to overcome the
shortcomings of each method and to combine their respective strengths. Our
concept is illustrated schematically in Fig.

With this concept in place, any reasonable distribution-wise MOS approach
can be employed in the first step and any adequate stochastic model in the
second step. A strength of this concept is its flexibility; i.e., the most
suitable combination of statistical models for a given location and season
can be determined. In this study, we employ a
quantile mapping (QM

To evaluate and illustrate our method, we adopt the perfect predictor
experimental setup of the VALUE framework

The method is evaluated by 5-fold cross-validation for the time
period 1979–2008; i.e., each of five 6-year periods is predicted by a model
that was fitted to the remaining 24 years. Because the predicted period is
never part of the training period, artificial predictive skill is avoided.
The model is fitted and evaluated for each season separately; 86 stations
across Europe are studied (as selected for the VALUE experiment; see
Fig.
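The cross-validation split described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the five 6-year test periods are contiguous, consecutive blocks of years, and the function name `blocked_folds` is hypothetical.

```python
def blocked_folds(first_year=1979, last_year=2008, n_folds=5):
    """Split a range of years into contiguous test blocks (assumed layout).

    Returns a list of (train_years, test_years) pairs; each test block is
    excluded from its own training set.
    """
    years = list(range(first_year, last_year + 1))
    if len(years) % n_folds != 0:
        raise ValueError("number of years must be divisible by n_folds")
    size = len(years) // n_folds
    folds = []
    for k in range(n_folds):
        test = years[k * size:(k + 1) * size]
        held_out = set(test)
        train = [y for y in years if y not in held_out]
        folds.append((train, test))
    return folds


folds = blocked_folds()
for train, test in folds:
    # Each model is fitted on 24 years and predicts the remaining 6 years.
    print(f"test {test[0]}-{test[-1]}: {len(train)} training years")
```

In a seasonal setup such as the one used here, the same split would be applied separately to the days of each season.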

Location and IDs of the ECA&D rain gauges used. IDs of the red-marked
stations from left to right: 244, 243, 4002, 58, and 13. Stations for
detailed analysis are marked in blue. Dashed lines represent European
subdomains for analysis as defined by the PRUDENCE project

As prescribed by the perfect predictor experiment within the VALUE framework,
we use the RACMO2 RCM from the KNMI

As the gridded observational dataset, E-OBS version 10

For the station density of current E-OBS versions, refer to the
ECA&D website:

The E-OBS reference gridbox for both steps (bias correction and downscaling) is generally the gridbox closest to the respective station. If the closest gridbox is an ocean gridbox (i.e., for coastal and island stations) and contains only missing values, we select, among the five closest E-OBS gridboxes, the one whose daily winter precipitation has the highest correlation with that at the given station. In winter the spatial decorrelation length of precipitation is generally large, implying that several gridboxes are often affected by the same weather system; thus, the gridbox with the most similar climate can be reliably identified.
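The fallback selection for coastal and island stations can be sketched as below. This is an illustration under stated assumptions: the text only says "correlation", so the Pearson correlation is assumed here; missing values are represented as `None`; and the names `pearson` and `select_reference_gridbox`, as well as the toy data, are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation over days where both series are available."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def select_reference_gridbox(station_winter, candidates):
    """Pick the candidate gridbox best correlated with the station.

    candidates maps a gridbox id to its winter daily precipitation series,
    aligned day by day with station_winter. Gridboxes without usable data
    (e.g., ocean gridboxes containing only missing values) are skipped.
    """
    best_id, best_r = None, -math.inf
    for gridbox_id, series in candidates.items():
        r = pearson(station_winter, series)
        if not math.isnan(r) and r > best_r:
            best_id, best_r = gridbox_id, r
    return best_id, best_r


# Toy example with three of the (up to five) closest gridboxes.
station = [0.0, 1.0, 5.0, 0.0, 2.0, 8.0]
neighbours = {
    "ocean": [None] * 6,                      # only missing values
    "a": [0.0, 1.0, 4.0, 0.0, 2.0, 7.0],      # similar weather
    "b": [3.0, 0.0, 0.0, 2.0, 1.0, 0.0],      # dissimilar weather
}
best_id, best_r = select_reference_gridbox(station, neighbours)
```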

The RCM gridbox that is bias-corrected and downscaled is generally the
gridbox closest to the E-OBS reference gridbox. This also holds for coastal
and island stations, where the chosen RCM gridbox might thus differ from the
RCM gridbox closest to the final reference (i.e., the rain gauge). For
locations in rain shadows we choose the RCM gridbox which best represents the
climate at the given location, in order to correct overly low precipitation
values caused by too few windward air masses crossing the mountain range

For local-scale observations we used 86 stations across Europe from
ECA&D

In our model we correct several biases. In a first step, the “location
bias” is corrected by gridbox selection (see Sect.

To model precipitation intensities the gamma distribution is commonly used

Since the mixture model is a complex model with six free parameters, a
thorough statistical model selection is necessary. We select between the
mixture model and the simpler gamma-only model separately for the
observed (

To strictly prevent the bias correction from deteriorating the predictor and
introducing biases, both the complete cross-validated corrected time series
and the raw RCM output are compared to gridded observations as a reference
using the Cramér–von Mises (CvM) criterion. The CvM is a measure of the
distance between two empirical cdfs
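A distance of this kind can be sketched as follows. The exact variant used in the study is not given in this section, so the code assumes the standard two-sample Cramér–von Mises form, i.e., the squared difference between the two empirical cdfs summed over the pooled sample and scaled by nm/(n+m)². The selection rule in `choose_predictor` (keep whichever series is closer to the reference) and all function names are likewise assumptions for illustration.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical cdf of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def cvm_distance(sample_a, sample_b):
    """Two-sample Cramér–von Mises distance (assumed standard form)."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    squared_diffs = sum((ecdf(a, z) - ecdf(b, z)) ** 2 for z in a + b)
    return n * m / (n + m) ** 2 * squared_diffs


def choose_predictor(raw, corrected, reference):
    """Return the label of the series closer to the reference (assumed rule)."""
    if cvm_distance(corrected, reference) < cvm_distance(raw, reference):
        return "corrected"
    return "raw"


# Toy wet-day intensities (mm per day), purely illustrative.
gridded_obs = [1.0, 2.0, 3.0, 4.0]
raw_rcm = [3.0, 4.0, 5.0, 6.0]
bias_corrected = [1.0, 2.0, 3.0, 5.0]
selected = choose_predictor(raw_rcm, bias_corrected, gridded_obs)
```

Identical samples give a distance of zero, and the distance grows as the two empirical cdfs drift apart, which is what makes it usable as a guard against a correction that worsens the predictor.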

To bridge the scale gap we apply the regression model developed by

Subsequently, precipitation intensity on wet days is modeled using a vector
generalized linear model (VGLM) as a regression model

Combining the probability of wet day occurrence and the gamma model
distribution defining the precipitation intensities, we get the probability
that observed precipitation on a given day (

Mean bias.

We evaluate our combined model based on the following metrics.

%sim

We first evaluate the mean bias of our combined model (selected predictor and
VGLM) against station observations and compare it to the raw uncorrected RCM
and to classical QM

Step 1: bias correction to grid scale.

Figure

First, both steps of the combined model are evaluated individually. Second, the combination of both steps is evaluated. In this combined model the predictor selected in the first step is used for the regression model in the second step.

Figure

Step 1: bias correction to grid scale. Boxplots of

The CvM values of the selected predictor (Fig.

The representation of heavy precipitation by the selected predictor is
evaluated by the percentage of simulated values that are higher than the 95th
percentile of the observations on wet days
(%sim
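The heavy-precipitation metric can be sketched as below. Several details are assumptions not stated in this section: the wet day threshold (passed explicitly as the hypothetical parameter `wet_threshold`), the use of simulated wet days as the denominator, and a linearly interpolated percentile. The function names are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated quantile (0 <= q <= 1) of a sample."""
    s = sorted(values)
    pos = (len(s) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def pct_sim_above_obs_p95(sim, obs, wet_threshold):
    """Percentage of simulated wet-day values above the observed wet-day P95."""
    obs_wet = [v for v in obs if v > wet_threshold]
    sim_wet = [v for v in sim if v > wet_threshold]
    p95 = percentile(obs_wet, 0.95)
    n_exceed = sum(1 for v in sim_wet if v > p95)
    return 100.0 * n_exceed / len(sim_wet)


# Toy data: 100 observed wet days with intensities 1..100 (illustrative units).
obs = [float(v) for v in range(1, 101)]
perfect_sim = [0.0] * 20 + obs          # same wet-day distribution plus dry days
too_wet_sim = [2.0 * v for v in obs]    # intensities doubled

pct_perfect = pct_sim_above_obs_p95(perfect_sim, obs, wet_threshold=0.5)
pct_too_wet = pct_sim_above_obs_p95(too_wet_sim, obs, wet_threshold=0.5)
```

By construction, a simulation with the same wet-day distribution as the observations yields a value near 5 %; larger values indicate too many heavy events and smaller values too few.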

Here we present some examples to illustrate the performance of the VGLM gamma
for different climates, calibrated between gridded (E-OBS) and point-scale
(station) observations. All results that are shown for the evaluation of the
downscaling step (step 2, Figs.

Step 2: downscaling. QQ plots for example stations in DJF. VGLM
gamma standardized to the stationary gamma distribution fitted to observed
wet day intensities between gridded and point-scale precipitation
observations (mm day

Step 2: downscaling. Estimated relation between gridded and
point-scale precipitation observations for example stations in DJF. VGLM
gamma where both parameters depend on the predictor fitted to observed wet
day intensities. The predictor is E-OBS. Circles: observed precipitation
intensities (mm day

To evaluate the goodness-of-fit, we use residual QQ plots (Fig.

Standardization is
performed in three steps: (1) compute probabilities for the reference values
(here: station observations) from the estimated non-stationary gamma
distribution (i.e., the gamma parameters depend on the predictor and thus
vary from day to day); (2) compute the quantiles of a gamma distribution with
stationary parameters for these probabilities of the non-stationary
distribution; (3) plot these quantiles against the quantiles of the
stationary gamma distribution for theoretical
probabilities: (1 :
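Steps (1) and (2) of the standardization can be written compactly as follows. The notation is an assumption introduced here for illustration: y_i is the station observation on wet day i, x_i the predictor value on that day, k(x_i) and θ(x_i) the predictor-dependent shape and scale of the fitted gamma distribution, k_0 and θ_0 the parameters of the stationary gamma distribution, and F_Γ the gamma cdf.

```latex
p_i = F_{\Gamma}\bigl(y_i;\, k(x_i),\, \theta(x_i)\bigr),
\qquad
z_i = F_{\Gamma}^{-1}\bigl(p_i;\, k_0,\, \theta_0\bigr)
```

If the non-stationary model is adequate, the probabilities p_i are approximately uniformly distributed, so the standardized values z_i follow the stationary gamma distribution and the points in the residual QQ plot lie close to the 1:1 line.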

Improvements by the VGLM gamma compared to the predictor can be seen in most
examples ranging from Scandinavia to the Mediterranean and from the Atlantic
coast to eastern Europe in both seasons. However, in some locations the
quantiles modeled by the VGLM gamma compare well to station observations (at
least in Malaga, better than the predictor) up to a certain quantile (e.g.,
Sibiu,


In both DJF (Fig.

In the combined model the VGLM gamma, calibrated against E-OBS, is applied to
the predictor selected in Sect.

Steps 1 and 2: combined model.

To evaluate the predictor and VGLM combined model, we apply the same criteria
as for the first step (bias correction, Sect.

Steps 1 and 2: combined model. Boxplots of

The CvM values of the selected model (Fig.

The occurrence of heavy precipitation in the CvM-selected model is slightly
overestimated in most subregions in DJF (Figs.

Ideal performance of our combined model is illustrated in the example QQ plot
of Malaga in DJF (Fig.

QQ plots for example stations of different models (cross-validated)
against station observations for DJF (mm day

Intercomparison of all cross-validated models (not only the selected
best model). Models: uncorrected RCM, QM

In this section an intercomparison of all models (not only the selected best
model from Sect.

To infer the performance of all studied models in estimating the occurrence
of heavy precipitation, boxplots for the percentage of simulated values that
are higher than the 95th percentile of the observations on wet days
(%sim

As in Fig.

To infer whether our model has predictive power, we cannot assess temporal
correspondence compared to observations as in

Spatial autocorrelation (cross-validated). Correlogram (circles) and
smoothed spline fitted to the correlogram (lines) for

We introduced the concept of a combined statistical bias correction and
stochastic downscaling method for precipitation. We thereby extend the
stochastic model output statistics (MOS) approach developed by

The proposed parametric model structure appears not to be the optimal choice
for all considered stations. Yet given that the aim of our study is a proof
of concept, the identification of an optimal model for all individual cases
would be beyond the scope of this work. Nevertheless, where our
implementation is not adequate we provide suggestions for improvements within
the presented framework. Our specific implementation for the QM bias
correction (first step) of wet day intensities employs the mixture
distribution of a gamma distribution for the precipitation mass and a
generalized Pareto (GP) distribution for the extreme tail

Our combined method, or parts of it, improved the representation of
precipitation in most cases across different European climates; the extent of
the improvement, however, depends on region and season. The method generally
performs better in JJA than in DJF. In DJF it performs best in the
Mediterranean region, with its mild winter climate, and worst for the
continental winter climate in Mid- and eastern Europe or
Scandinavia. Seasonal and regional differences depending on the underlying
mechanism have already been reported for resolution dependence of extreme
precipitation in GCMs

Although our bias correction (first step) improved simulated precipitation
for many locations in both seasons, wet biases may remain even after bias
correction, particularly for continental winter. In agreement with our
results, large improvements by bias correction over the Alps, Spain, and
France have been reported by

The stochastic downscaling (second step) improves the estimated occurrence of
heavy precipitation in many regions, but introduces biases in continental
winter climate. Furthermore, spatial autocorrelation in JJA is improved by
the VGLM, showing the importance of randomization in the framework of
downscaling as already pointed out by, e.g.,

The varying performance of our specific implementation clearly shows that bias correction and downscaling methods should be reevaluated when transferring them to locations with different climatic conditions. In some regions a specific implementation different from the one we used is required. We recommend our model in summer for all studied regions. However, in winter it should only be used for the British Isles, the Alps, the Mediterranean region, and the Iberian Peninsula, but not for continental winter climates (Scandinavia, Mid-Europe, and eastern Europe) and France. While the stochastic downscaling step (VGLM) is very important for representing spatial autocorrelation in summer, it is less important in winter, where the application of solely the bias-correction step might be sufficient. The concept can generally be extended to a wide range of method combinations. Transferring this concept to other climate variables should in principle be possible. Our specific implementation should be applicable to any gamma-distributed variable. However, our approach has so far only been evaluated for precipitation. Thus, users need to evaluate the model for the particular variable at the chosen location when transferring it.

We developed our model in the present-day climate. In a climate change
context the model does not explicitly modify climate trends on a physical
basis. Our model is thus only applicable where changes are correctly
simulated by the GCM/RCM. For instance, changes in the dynamics of local
extreme convective events in summer that need even higher resolution up to
convection-permitting simulations

The general concept of combining two methods and thereby separating bias correction (MOS) and downscaling (PP) into two steps is a powerful approach as it benefits from the respective methodological advantages. Additionally, the strength of this two-step method is that the best combination of methods can be selected. This implies that the concept can be extended to a wide range of method combinations.

The RCM output from the KNMI that was used in this study is
available within the CORDEX framework from the Earth System Grid Federation
(e.g.,

A non-zero wet day threshold assigns zero probability density to all
intensities between zero and the threshold, resulting in a misfit of the
gamma distribution

Numerical instabilities in the estimation of the mixture cdf may in rare
cases result in a discontinuous cdf (Fig.

The AIC performs best for the part of the distribution where most of the
values are. Hence, a good fit for the bulk of the distribution might include
large biases in the extremes and still have the lowest AIC (example:
Fig.

Examples of problems with the mixture model.

Step 1: bias correction to grid scale. QQ plots of RCM-simulated and
QM

Step 2: downscaling. QQ plots for example stations in JJA. VGLM
gamma standardized to the stationary gamma distribution fitted to observed
wet day intensities between gridded and point-scale precipitation
observations (mm day

Step 2: downscaling. Estimated relation between gridded and
point-scale precipitation observations for example stations in JJA. VGLM
gamma where both parameters depend on the predictor fitted to observed wet
day intensities. The predictor is E-OBS. Circles: observed precipitation
intensities (mm day

QQ plots for example stations of different models (cross-validated)
against station observations for JJA (mm day

Douglas Maraun had the initial idea for this combined method. Claudia Volosciuk implemented the method and performed the evaluation with help from Mathieu Vrac and Douglas Maraun. All authors discussed details of the implementation and the results. Claudia Volosciuk prepared the manuscript with contributions from all co-authors.

The authors declare that they have no conflict of interest.

We thank the KNMI for producing and making available their model output. We
acknowledge the E-OBS dataset from EU-FP6 project ENSEMBLES
(