An adaptive two-stage analog/regression model for probabilistic prediction of small-scale precipitation in France

Statistical downscaling models (SDMs) are often used to produce local weather scenarios from large-scale atmospheric information. SDMs include transfer functions which are based on a statistical link identified from observations between local weather and a set of large-scale predictors. As physical processes driving surface weather vary in time, the most relevant predictors and the regression link are likely to vary in time too. This is well known for precipitation for instance and the link is thus often estimated after some seasonal stratification of the data. In this study, we present a two-stage analog/regression model where the regression link is estimated from atmospheric analogs of the current prediction day. Atmospheric analogs are identified from fields of geopotential heights at 1000 and 500 hPa. For the regression stage, two generalized linear models are further used to model the probability of precipitation occurrence and the distribution of non-zero precipitation amounts, respectively. The two-stage model is evaluated for the probabilistic prediction of small-scale precipitation over France. It noticeably improves the skill of the prediction for both precipitation occurrence and amount. As the analog days vary from one prediction day to another, the atmospheric predictors selected in the regression stage and the value of the corresponding regression coefficients can vary from one prediction day to another. The model allows thus for a day-to-day adaptive and tailored downscaling. It can also reveal specific predictors for peculiar and non-frequent weather configurations.

Among the different SDM approaches presented over the last decades (see Maraun et al., 2010, for a review), perfect prognosis SDMs make use of the physical relationships that exist between some large-scale atmospheric parameters and local weather variables. Local weather scenarios can then be produced for any prediction day, conditional on the largescale atmospheric configuration observed or simulated for this day, where the "prediction day" refers here to some future, past, or present simulation day, depending on the application context at hand (e.g., forecasting, simulation, reconstruction). Perfect prognosis SDMs include transfer functions, weather-type-based models, and methods based on atmospheric analogs. In the latter case, atmospheric analog days of the current prediction day are searched for on the basis of some atmospheric similarity criterion in the historical database. The weather variables observed for the most similar day, for one similar day chosen randomly, or for a selection of the k-most similar days are then used as a weather scenario for the prediction day of interest (e.g., Dayon et al., 2015;Raynaud et al., 2016).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Transfer functions mainly consist of regression models where the expected value of the predictand for time t is expressed as a linear or non-linear function of a set of predictors. For precipitation, the regression can be achieved with multiple linear regressions or generalized linear models (GLMs) which extend the linear regression to non-Gaussian data (e.g., Nelder and Wedderburn, 1972;Stern and Coe, 1984;Asong et al., 2016). Transfer functions can also make use of classification and regression tree (CART) (e.g., Gaitan et al., 2014) artificial neural networks or least squares support vector machines (e.g., Campozano et al., 2016).
The downscaling relationship used in transfer functions is usually established empirically between a selection of largescale predictors and the predictand (e.g., precipitation occurrence) from a set of observations available for recent decades. As physical processes driving surface weather vary in time, the most relevant predictors and the downscaling link are however expected to vary in time too. When inferred from all observations available for a given period, the downscaling relationship -which is thus likely inferred from a heterogeneous ensemble of weather configurations -is consequently likely to be sub-optimal. To reduce this potential limitation, the parameterization of the relationship is often estimated after some data stratification. In the usual calendar stratification, one parameter set is for instance optimized for each calendar month or season (e.g., Nasseri et al., 2013). The stratification can also be based on some weather-type information. In this case, a set of parameters is usually estimated for each weather type of a given pre-established weather-type classification (e.g., Enke and Spegat, 1997). Often applied, this weather-type-based approach is expected to allow for a better identification of the most important driving large-scale variables and consequently for a more relevant downscaling. An obvious limitation however remains for prediction days that do not clearly belong to one specific weather type (e.g., prediction days that are close to the "weather frontiers" delimiting two or more weather types). Those days are indeed likely to be rather dissimilar to the weather configurations that each weather type is expected to characterize, making the downscaling relationships to be used not suited anymore or, at least, sub-optimal.
A smoother weather-type-like approach consists in defining the weather type from all atmospheric situations that are similar to the situation of the prediction day. The ensemble of days from which the downscaling link can be identified is thus expected to be rather homogeneous and to rather well inform the large-to small-scale link sought for the considered prediction day. This is in turn expected to make the link stronger and to improve the prediction (e.g., Woodcock, 1980). Such an approach can actually be achieved by identifying for each prediction day the transfer function from atmospheric analogs of that day. To our knowledge, this twostage downscaling approach, combining in turn the two popular analog and transfer function methods, has only been explored in a few previous studies. In Ribalaygua et al. (2013), it was found to improve the probabilistic prediction of local surface temperature in the Spanish Iberian Peninsula. The multiple linear regression of the regression stage, estimated from the 150 most similar atmospheric analogs of the prediction day of interest, uses forward and backward stepwise selection of predictors from a set of four potential predictors (thickness of the air column and three temperature indexes of previous days). For precipitation, the authors did not test the potential of the two-stage combination, building directly the predictions from the precipitation observations of the 30 most similar atmospheric analogs. In the deterministic approach presented by Ibarra-Berastegi et al. (2011), incorporating the regression stage (with 79 potential atmospheric predictors) was found to allow a clear though not overwhelming improvement of precipitation prediction over the simple analog-based predictions. A multiple linear regression model was also applied here for the regression stage.
In the present study, we present a two-stage analog/regression downscaling model for the probabilistic prediction of small-scale daily precipitation: for each prediction day, the statistical downscaling link between some largescale atmospheric predictors and small-scale precipitation is estimated from large-scale and local-scale observations available from an ensemble of days which are atmospheric analogs to the prediction day. The analog model (AM) used for the analog stage is based on developments from different studies initially focusing on the probabilistic quantitative precipitation forecasts in southern France (e.g., Bontron and Obled, 2005;Marty et al., 2012) and extended to the prediction of precipitation on larger spatial domains (e.g., Chardon et al., 2014). The statistical distribution of daily precipitation is strongly non-Gaussian with a non-negligible mass in zero (corresponding to the probability of a dry day), and a skewed distribution for non-zero daily amounts. For the regression stage, we thus do not use a multiple linear regression model as in previous studies, but a two-part GLM approach where the probability of precipitation occurrence and the distribution of wet-day amounts are modeled separately following Chandler and Wheater (2002) and Mezghani and Hingray (2009). Conversely to the work of Ibarra-Berastegi et al. (2011), this allows prediction of the full distribution of precipitation, including the probability of a wet day. In this twostage analog/regression approach, the analogs change from one prediction day to the other. This makes the statistical downscaling link potentially adaptive; i.e., the predictors and the regressions parameters are likely to vary from one day to the other.
As mentioned above, SDMs are used for the simulation of local weather scenarios in different contexts, e.g., local weather forecasts, reconstructions, or climate impact studies. No specific context is considered here and the two-stage model could be further considered for either forecasting, reconstruction, or future projections. Depending on its intended use, some specific issues would obviously apply, calling for specific focused analyses and developments. For in-Hydrol. Earth Syst. Sci., 22, 265-286, 2018 www.hydrol-earth-syst-sci.net/22/265/2018/ stance, the large-scale atmospheric parameters to be considered as predictors would depend on the dataset considered (e.g., atmospheric reanalyses, climate models, or numerical weather prediction models) as a result of their intrinsic quality (e.g., Caillouet et al., 2016). The development of climate projections would require one to check the temporal transferability of the model in a modified climate context and would thus likely also condition the selection of the predictors as highlighted by Dayon et al. (2015). These context-specific issues are not considered here. Our main objectives are to present the principles of the two-stage analog/regression approach developed for the prediction of small-scale precipitation, to assess its predictive power for both precipitation occurrence and amount, and to give some insight into its adaptive behavior and thus into the temporal variability of the downscaling link. For this, we explore the model skill and behavior for the prediction of daily precipitation for a large number of sites in France. The paper is structured as follows: Sect. 2 describes the data and Sect. 3 the two-stage downscaling model. Section 4 presents the skill of the model for the prediction of both precipitation occurrence and amount. The adaptive behavior of the model is considered in Sects. 5 and 6.

Data
The predictand is the daily small-scale precipitation estimated for the 1982-2001 period over 8981 grid cells of 8 × 8 km 2 covering the continental French territory. The predictand is "local" precipitation, i.e., precipitation at a given grid cell. Each of the 8981 grid cells is thus considered in turn in the following independently of the other cells. In other words, the predictions do not target precipitation fields. Small-scale precipitation data are obtained from the SAFRAN analysis produced for several surface variables at an hourly time step by MeteoFrance (Quintana-Segui et al., 2008;Vidal et al., 2010). SAFRAN precipitation estimates are obtained each day from the closest measurement stations. They are considered as pseudo-observations. Atmospheric predictors are taken from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40, Uppala et al., 2005). This global meteorological re-analysis is available on a 1.125 • × 1.125 • grid with a 6-hourly temporal resolution.
For the analog stage, predictors are the 1000 and 500 hPa geopotential height fields over a large spatial domain (roughly lat = 10 • , lon = 8 • ) centered on the target location. These predictors have been found to be the most informative large-scale predictors to be used in this context for France (e.g., Guilbaud and Obled, 1998;Obled et al., 2002;Radanovics et al., 2013). They also correspond to the best large-scale predictors of daily precipitation for different regions in Europe with contrasted meteorological regimes (Raynaud et al., 2016). For the regression stage, 22 other predictors were also considered. The selection gathers most predictors considered in previous studies over Europe (e.g., Hanssen-Bauer et al., 2005;Wetterhall et al., 2009;Horton et al., 2012;Raynaud et al., 2016). They include predictors characterizing the thermal state of the atmosphere, its dynamics, the atmospheric water content, and its thermo-dynamical instability (see Table 1). As potential predictor, we also consider the occurrence of precipitation on the previous day. All predictors here are scalar variables. Atmospheric predictors are estimated on a daily time step (mean of the four values available at 06:00, 12:00, 18:00, and 24:00 UTC) from the four ERA-40 grid cells surrounding the prediction grid cell (inverse distance interpolation).
To avoid the multi-colinearity in the predictors for the regression, we identified a subset of uncorrelated predictors. The cross-correlations between all predictor pairs were first estimated on an annual basis from all available data. The cor-relation structure can however differ from one atmospheric configuration to the other. The set of uncorrelated predictors could thus differ from one prediction day to the other. We thus repeated the correlation analysis for each prediction day, using for this estimation the predictor values observed for the 100 nearest atmospheric analogs identified for this day. The main features of the inter-variable correlations were found to be roughly independent of the day (not shown). The final subset of uncorrelated predictors is highlighted in Table 1. These predictors are tested for the prediction of both precipitation occurrence and amount. A large number of different possible predictor sets can be built from these predictors. In the present work, for the sake of robustness, we consider that a maximum of four predictors can be integrated into a given regression model. Predictors are obviously expected to be both day and location specific. In the present work, for the sake of simplicity and readability, we select them from a unique set of four potential predictors. This allows us to reduce the degrees of freedom in the model and to better highlight its skill and adaptive behavior.
For each predictand, the set of the four potential predictors was selected as follows. For 12 SAFRAN grid cells uniformly distributed over the French territory, we first identified with a standard iterative forward/backward algorithm the four-predictor set which leads to the best prediction skill for the all-days configuration. From the 12 different sets obtained, respectively, for the 12 grid cells, we finally retained the set which leads on average to the best prediction skill for the 8981 SAFRAN grid cells.
For precipitation occurrence, this best four-predictor set is constituted from the relative humidity R 700 , the helicity H , the vertical velocity of the air at 700 hPa, W 700 , and the precipitation occurrence Occ − 1 of the day before the prediction day. For precipitation amount, the best four-predictor set is similar except that the occurrence of the previous day Occ−1 is replaced by the 700 hPa air temperature T 700 . Note that the selection of predictors R 700 , W 700 , and T 700 is consistent with results of several past studies in the region (e.g., Ben Daoud et al., 2016).
Predictors considered for the analog and regression stages obviously inform about different features of the atmosphere state for different scales. Geopotential fields, by their spatial extent, characterize the large-scale atmospheric circulation configuration (the spatial domain of several thousands of kilometers includes a part of the northeastern Atlantic and covers France and a part of the neighboring countries), whereas scalar predictors used in the regression stage are descriptive of a more local (and mostly thermodynamic) state of the atmosphere (the spatial domain of several hundreds of kilometers is roughly centered above the target location).

The two-stage analog/regression model (SCAMP)
As illustrated in Fig. 1, the cumulative distribution function (cdf) F Y of precipitation Y at a given site (grid cell) can be expressed for any given day as the composition of the noprecipitation occurrence probability 1 − π and the cdf F Q of the precipitation amount Q for non-zero precipitation: where π is the precipitation occurrence probability, and y and q correspond to the precipitation value with regard to the whole precipitation distribution and to the non-zero precipitation distribution, respectively. In the present work, the cdf of precipitation is modeled for each grid cell and each prediction day with GLMs (Coe and Stern, 1982;Stern and Coe, 1984), estimated for this specific day from atmospheric analogs of the day. The probability of precipitation occurrence and the cdf of the non-zero precipitation amount are modeled separately.
In the following, we first describe the AM used to identify atmospheric analog days (Sect. 3.1) and the GLMs applied in the regression stage (Sect. 3.2).
As discussed later, one can face prediction days where the regression stage fails, i.e., where the regression parameters are not significantly different from zero at the chosen significance level (α = 5 %). For such days, we use the analog model as a backup prediction model. The backup model can Hydrol. Earth Syst. Sci., 22, 265-286, 2018 www.hydrol-earth-syst-sci.net/22/265/2018/ be used for precipitation occurrence probability, for non-zero precipitation amount, or for both predictands simultaneously. The way these different models are combined to finally give, for the current prediction day, a probabilistic prediction of precipitation, is presented in Sect. 3.3. In the following, this two-stage analog/regression model is further referred to as SCAMP (SCAMP stands for Sequential Constructive atmospheric Analogs for Multivariate weather Prediction and refers to the model presented by Raynaud et al., 2016, for the multivariate prediction of precipitation/temperature/radiation/wind).

Atmospheric analogs
The atmospheric analog days retained for the regression stage are identified with an analog model defined from the developments of several past studies in France (e.g., Obled et al., 2002;Marty et al., 2012;Chardon et al., 2014).
For any given prediction day (e.g., 31 May 2018), the analog days retained for the regression are the N d days that are most similar to that day in terms of large-scale atmospheric circulation. The similarity is assessed using the Teweless-Wobus score (TWS, Teweless and Wobus, 1954) applied to the geopotential height at 1000 and 500 hPa at 12:00 and 24:00 UTC, respectively. The TWS compares the shapes of geopotential fields, and thus informs on the localization of low-and high-pressure systems and on the origin of air masses. Note that the N d analog days are identified within a restricted pool of candidate days, namely all days of the archive that are included in a calendar window of ±30 calendar days centered on the prediction day (for the prediction of 31 May 2018, the candidates are all 1 May to all 30 June from all years of the archive). The prediction day (e.g., 31 May 2018) and its 5 preceding and following days are excluded from the candidates. In the present work, the archive period corresponds to 1982-2001 (20 years), and we used the 100 nearest atmospheric analog days to estimate the GLMs in the regression stage.
Following Chardon et al. (2014), the domain considered to estimate the atmospheric similarity was optimized for each target location. A different analog model was thus considered for each of the 8981 SAFRAN grid cells. For each prediction day, the analog days thus likely differ from one SAFRAN grid cell to the next (see Chardon et al., 2014, for an illustration).

Regression stage with GLMs
The cdf of precipitation is then modeled for each prediction day with GLMs estimated for this specific day from the atmospheric analogs of the day. GLMs make the cdf, depending on some covariates, atmospheric predictors in the present case. For each prediction day, the probability of precipitation occurrence π was modeled with a GLM in the form of a lo-gistic regression as . For the non-zero precipitation amount, we used a GLM with the gamma distribution and the log link function. The expected amount µ of non-zero precipitation is therefore here expressed as where x q denotes the scalar vector of the K q predictors (x q 1 , x q 2 , ..x q K q ) and β q the scalar vector of the corresponding regression coefficients (β q 1 , β q 2 , ..β q K q ). The shape parameter ν of the gamma distribution is computed from the variance σ 2 of non-zero precipitation amounts estimated from Pearson's residuals (McCullagh and Nelder, 1989) as where N q is the number of non-zero precipitation data q i considered in the analysis. As the shape parameter ν equals the inverse of the variance 1/σ 2 , the gamma distribution F Q modeling the precipitation amount thus follows a gamma distribution of this type: (ν, α = µ/ν). For any given prediction day, the estimation of both GLM models practically proceeds as follows.
-The precipitation state (wet or dry), the precipitation amount, and the values of the different potential predictors are extracted for the N d nearest analogs of the day. The precipitation state of a given day is considered to be wet if the precipitation amount for this day is higher than or equal to 0.1 mm. It is described with a binary precipitation occurrence variable O, set to 1 for the wet case, and 0 for the dry case.
-For occurrence probability, different sets of predictors are considered in turn. For each set, the parameters of the occurrence GLM are estimated from the predictors/occurrence values available for the N d analogs.
-For precipitation amount, different sets of predictors are again considered in turn. For each set, the parameters of the GLM are estimated from the predictors/amount values available from the analog days which are wet (N q , the number of days considered here for the regression, is therefore smaller than or equal to N d , and varies a priori from one target day to another).
For the considered prediction day, the different sets of predictors considered in turn are built from the four potential predictors identified in the preliminary work (cf. Sect. 2). For occurrence probability (or precipitation amount), the four potential predictors actually allow us to build 15 different sets of predictors, further denoted as "regressive structures" in the following (cf. list in Table 2). For each regressive structure, the regression coefficients of corresponding GLMs are estimated using the iterative re-weighted least squares algorithm (IRLS, Nelder and Wedderburn, 1972). The prediction skills of the different regressive structures are then compared and the regressive structure (predictor set) which minimizes the Bayesian information criterion is retained for the prediction (Schwarz, 1978;Akaike, 1974) (only the regressive structures for which all coefficients are significant at a 5 % level are compared; the significance is estimated with the Z test (or the Student's t test)). The prediction of the occurrence probability (or the expected precipitation amount) for the prediction day is finally obtained from the best occurrence (or amount) GLM, using the values of the predictors observed for that prediction day. The final distribution of precipitation F Y is obtained by combining the issued occurrence probability π and the amount distribution F Q according to Eq. (1).

The analog model as a benchmark and backup prediction model
The N d nearest analog days identified with the AM can also be directly used, without a further regression stage, for a probabilistic prediction. In the following, we also consider predictions obtained with the 25 nearest analog days (for the AM considered here, 25 was found to give the best prediction skill for France by Chardon et al., 2014). In this case, the precipitation cdf for the prediction day is simply the empirical distribution of the precipitation values observed for these 25 analogs. The predictions obtained with this analog model, further called AM 25 , are used as a benchmark to assess the prediction skill of the two-stage analog/regression approach. In addition, they were used as a backup prediction for days for which the regression stage failed in the two-stage approach. One can actually face the situation where no GLM satisfies the significance conditions required for the regression coefficients. This can occur for precipitation occurrence probability, for non-zero precipitation amount, or for both predictands simultaneously. In such cases, AM 25 is applied as a backup prediction model. If the significance conditions cannot be satisfied for the precipitation occurrence GLM, the occurrence probability π is set to that obtained with AM 25 . It thus simply corresponds to the empirical probability π AM 25 of precipitation occurrence derived from the 25 analog days of AM 25 as Similarly, if the significance conditions cannot be satisfied for the precipitation amount GLM, the distribution F Q is estimated with the empirical distribution F Q,AM 25 derived with AM 25 as where F AM 25 corresponds to the empirical cdf estimated from all precipitations (null and positive) related to the 25 analog days. Note also that if the number N q of humid analog days is low (N q < 10), the estimation of a GLM is not expected to be robust. When this case appears, F Q is also set to the cdf obtained with AM 25 . As illustrated in Fig. 2, four prediction cases are thus achieved with the two-stage approach. They correspond, respectively, to cases where AM 25 is used to back up the prediction of the whole precipitation distribution (case 1), where AM 25 is applied to back up the prediction of the amount cdf (case 2), where AM 25 is used to back up the occurrence probability prediction (case 3), and where the regression stage could be activated for both occurrence and amount (case 4). Note that the regression stage achieved with GLMs can also be seen as a way to refine the estimation of the cdf that could have been obtained directly with the backup (and benchmark) AM 25 analog model. The refinement leads to an update of the occurrence probability and/or the cdf of a nonzero precipitation amount.
As described previously, the two-stage analog/regression prediction process is repeated for each prediction day in turn. As the analog days vary from one prediction day to another, the predictors selected in the regression stage and the value of the corresponding regression coefficients are expected to vary from one prediction day to the other. The two-stage model SCAMP allows thus for a day-to-day adaptive and tailored downscaling.

Model evaluation
The prediction skill of the downscaling model is assessed with probabilistic scores usually used to evaluate ensemble prediction systems (EPSs). Let us consider a given EPS, denoted as P.
The Brier score (Brier, 1950;Murphy, 1973) first evaluates the ability of EPS P to predict precipitation occurrence. When estimated over M prediction days, the mean Brier score BS reads as where, for a given prediction day i, p i is the occurrence probability issued by EPS P and o i is the effective precipitation occurrence for this day (o i = 1 for a wet day, = 0 otherwise). The ability of EPS P to estimate the precipitation amount is evaluated with the continuous ranked probability score (CRPS, Brown, 1974;Matheson and Winkler, 1976). When estimated over M prediction days, the mean CRPS reads as where, for a given prediction day i, H y i and F i denote, respectively, the cdf of the observation y i and the cdf derived from EPS P. x denotes the predictand quantiles of the cdfs. Note that H y i corresponds to the Heaviside function where H y i = 1 if x ≥ y i and H y i = 0 otherwise. For this evaluation, the probabilistic prediction of the predictand y is described here, for each prediction day, with a discretized cdf composed of N values, with N = 25. When AM 25 is used as a backup model, the N values are the precipitation observations of the 25th analog days. When the prediction is issued with SCAMP, the N values are those of the 25 percentiles (k − 0.5)/25, k in [1, N ], of the predicted cdf F Y .
In the following, we discuss the prediction skill for precipitation occurrence and amount with the Brier skill score (BSS) and the continuous ranked probability skill score (CRPSS), respectively. Both scores normalize the prediction skill of EPS P with that obtained with a reference EPS P ϕ . P ϕ is here a climatological EPS based on a calendar climatology defined for each prediction day by the precipitation distribution of all days belonging to a seasonal window (±30 days) centered on the corresponding calendar day. In this context, the BSS and CRPSS, respectively, read as and CRPSS = 1 − CRPS where BS ϕ and CRPS P ϕ correspond to the mean BS and the mean CRPS obtained with EPS P ϕ . For both scores, a negative value indicates that the prediction obtained with EPS P is worse than the prediction obtained with the climatological EPS P ϕ . A score of 1 conversely denotes a perfect EPS P. In the following, to assess the added value of the two-stage SCAMP model when compared to the benchmark AM 25 analog model, we additionally estimate the gain in prediction skill as S = S SCAMP − S AM 25 , where S corresponds either to the BSS or the CRPSS.

Results
The two-stage model is used for the probabilistic prediction of small-scale precipitation over the continental French territory for each day of the 1982-2001 period. We here present the prediction skill obtained for occurrence and amount with the two predictors sets presented in Sect. 2. As discussed later in Sect. 5, the four predictors of each set are not necessarily all used; the predictors which have some predictive power for the considered predictand vary from one day to the other. The BSS gain obtained with SCAMP over AM 25 is rather important (up to 0.1 BSS points) and presents a high space variability (Fig. 3b). The gain (between 0.05 and 0.1 BSS points) is high in the mountainous areas (Pyrenees, Massif Central, Alps, Vosges). The highest gains are found along the Mediterranean coast and in the southern Alps, where the BSS of SCAMP was lowest. This highlights the weakness of the AM 25 in these regions -characterized by more frequent convective precipitation and thus a weaker link with large-scale atmospheric circulation -and the interest for thermodynamic and more local predictors. Conversely, lower gains are observed in the western part of France characterized by more frontal precipitation and thus a stronger link with large-scale circulation. Note also that the spatial distribution of BSS is very close (even if it has higher values) to the one obtained by SCAMP with R 700 as a unique predictor (not shown here). Figure 4a shows the CRPSS obtained with SCAMP. The CRPSS values also depend on topography. The highest values, up to 0.45, are obtained in the western part of the Massif Central, the northern Alps, the Jura and the Vosges massifs. Lower values, between 0.32 and 0.45, are obtained in lowlands. The lowest skill (below 0.30) is again obtained along the Mediterranean coast.

Performance of SCAMP
The CRPSS gain obtained over AM 25 is significant for most grid cells, with the highest value (up to 0.10 CRPSS points) obtained in the Rhône Valley and northeastern France (Fig. 4b). Similarly to the BSS gain, a lower CRPSS gain is also obtained here in lowlands and western France. The spatial distribution of CRPSS is also very close here (even if it has higher values) to the one obtained by SCAMP with W 700 as a unique predictor for amount (not shown here).
Despite the large dependency on regional features such as topography or proximity to the sea, adding local and thermodynamic information in SCAMP greatly improves the prediction skill over that of AM 25 , for both precipitation occurrence and amount.

Characterization of SCAMP's behavior
As described in Sect. 3.3, the regression stage of SCAMP is equivalent to update the empirical distribution obtained from the atmospheric analogs directly. For some prediction days, the regression stage can be however only partly activated, for either occurrence or amount. It can be even not activated at all. In these cases, the prediction is fully or partly obtained from the backup model AM 25 .
The frequency with which each activation case (cases 1 to 4) is obtained over the simulation period is given in Fig. 5. The situation where both precipitation occurrence and amount GLMs are activated (case 4) is very frequently observed. It corresponds to more than 85 % of the days except in south-eastern France, where only 60 % of the days are concerned. All in all, the regression stage of SCAMP is very often activated (more than 97 % of the days) to predict the occurrence probability (cases 2+4). In the failing full-updating cases, AM 25 is usually applied to back up the precipitation amount prediction (cases 1 + 2). Case 1, where the whole prediction is backed up with AM 25 , is finally very rare. For a large majority of the grid cells, it occurs less than 35 times in the 20-year period considered (corresponding to around 5 ‰). Figure 6 presents the mean precipitation anomaly for each of the previous cases, i.e., the ratio between the mean amount obtained for all days belonging to the considered case and the overall mean precipitation amount. An anomaly greater (lower) than 1 indicates days that are rainier (drier) than usual. Cases 1 to 4 correspond clearly to different precipitation configurations. The mean precipitation amount of days in case 4 is close to the overall mean. Days in cases 1 and 2 are very dry. Days in case 3 are very wet, with a mean precipitation 3 times larger than the overall mean.
For a given prediction day, the precipitation state of its analog days is actually expected to be roughly similar to that of the day. This thus explains SCAMP's behavior described above. In cases 1 and 2, analog days of the prediction day are likely very dry. The number of humid analog days is thus likely small to very small, and likely too small to allow for a robust estimation of the precipitation amount GLM.  log days are conversely likely humid in case 4 or even very humid in case 3. The number of humid days in those cases is thus likely large enough to allow for a robust estimation of the precipitation amount GLM. The very humid configuration of case 3 suggests that prediction days are characterized by a very large number of humid analog days, which can in turn prevent a robust estimation of the occurrence GLM (e.g., the occurrence GLM cannot be estimated in configurations where all days are wet).
This can also explain the specific results obtained in the southeast. Case 2 is indeed activated much more often in this region (increase of 30 % percentage points) than elsewhere and, in a symmetric way, case 4 is activated much less often in this region (decrease of 30 % percentage points). The reason underlying this result is to be related to the much higher proportion of dry days in the southeast (see Fig. S1 in the Supplement). In this region, the number of wet analog days is thus likely small for a large number of prediction days. As suggested above, this is obviously not a difficulty for the estimation of the occurrence GLM. This is conversely likely for the estimation of the amount GLM. A small number of wet analogs likely prevents a robust estimation of the precipitation amount GLM. This likely explains the much lower (higher) frequency of case 4 (case 2) in the southeast.
The CRPSS gain achieved with SCAMP's results from the updated prediction of both precipitation occurrence and amount. To assess the relative effects of these updates on the gain, we further compared the following four prediction experiments.
-Exp. 1. The prediction of both the occurrence and the amount is achieved with AM 25 for all prediction days. This corresponds to the results given by Chardon et al. (2014, cf . Fig. 3).
-Exp. 2. When possible, the precipitation occurrence probability is updated with the occurrence GLM. The non-zero precipitation amount is always predicted with AM 25 .
-Exp. 3. When possible, the precipitation amount is updated with the amount GLM. The precipitation occurrence probability is always predicted with AM 25 .
-Exp. 4. When possible, both precipitation occurrence probability and amount are updated with the occurrence and amount GLMs. This corresponds to the two-stage configuration already evaluated previously.
The CRPSS gains obtained between Exps. 1 and 2, between Exps. 1 and 3, and between Exps. 1 and 4 are presented in Fig. 7 (the results for Exp. 4, already presented in Fig. 4, are presented again for ease of comparison).
For a large majority of grid cells, the CRPSS gain obtained with an updated prediction of the occurrence probability (from 0 to 0.05 CRPSS points) is significantly lower than that obtained with an updated prediction of amount (from 0.03 to 0.1 CRPSS points). The CRPSS gain obtained in the latter case is additionally close to that obtained with the full two-stage model. The CRPSS gain obtained by SCAMP in Fig. 7c is thus explained in most cases by the updated prediction of precipitation amount. The scheme is somehow different in the south of France along the Mediterranean coast and in the Cevennes-Vivarais mountains. In those regions, the CRPSS gain obtained by SCAMP is mostly explained by the updated prediction of the occurrence probability. Updating only the precipitation amount leads to fairly no CRPSS gain.

Discussion
The sets of potential predictors used in SCAMP for the prediction of precipitation occurrence and amount have been listed in Sect. 2. For each variable, the number of potential predictors is here equal to four. All four predictors are not necessary retained for the GLM. For a given prediction day, a GLM with a single predictor or a combination of several predictors among the four can be selected. Fifteen regressive structures plus the backup AM 25 model are possible in our context (Table 2).
For a given prediction day, the regressive structures selected by SCAMP for precipitation occurrence or for precipitation amount are supposed to include the best information for the prediction. In the following, we assess how often each structure has been selected. This allows for some insight into the atmospheric information really used for the regression stage and how this information varies in time. Figures 8 and 9 present the percentage of times that the 15 regressive structures and the backup AM 25 are used for the prediction of precipitation occurrence and amount, respectively. As in Fig. 5, gray cells indicate that the regression structure has been retained less than 35 times over the 20year evaluation period. For both occurrence and amount, the selection frequency of the structures is also rather region dependent and strongly influenced by topography.
For occurrence (Fig. 8), the most often selected structure is Str. no. 1, which is only based on R 700 (more than 25 % of the time for the whole of France). R 700 was actually found to give the highest predictive power when used in a single predictor configuration. Another structure which is also often selected (more than 15 % for a high number of grids) is Str. no. 7 which combines R 700 with Occ − 1. Secondary structures -as for example Str. no. 6 and no. 13 combining R 700 to W 700 and Occ − 1 -can be selected more than 10 % of the days for some given regions. Other structures are seldom selected and some of them (Str. no. 8,no. 11,no. 14,and no. 15) are almost never selected.
For precipitation amount, the most frequently selected structures are Str. no. 3 and Str. no. 1, both based on one single predictor, W 700 or R 700 , respectively. These structures are selected more than 25 % of the time (or more than 15 %) for a high number of grids. The secondary structures (Str. no. 6,no. 8,and no. 13) are selected from 5 up to 20 %, depending on the region. They always include W 700 in combination with some other predictor (R 700 , H , or R 700 + T 700 ). Str. no. 9, no. 12, no. 14, and no. 15 are almost never selected. The others are selected less than 10 to 5 % of the time.
Note that for the selection of the best regression structure for a given prediction day, all 15 of these regressive structures have been tested in turn. The results above suggest that this systematic test is not necessary and that it could be reasonable to consider only the few structures which are frequently retained or which are retained a "reasonable" fraction of the days. However, the selection frequency of a given structure actually varies with the seasons and/or the encountered synoptic situation, and some secondary regressive structures can be retained frequently for specific situations. This is illustrated in Fig. 10  patterns (WP, defined in Table 3, Garavaglia et al., 2010) from the selection frequency obtained for the all-days situation. For precipitation occurrence (Fig. 10a), the selection of the main regressive structures (i.e., Str. no. 1 and no. 7, respectively, based on R 700 and R 700 + Occ − 1) is up to 15 % more frequent (less frequent) for WP3 (WP5) compared to the alldays situation. For precipitation amount (Fig. 10b), the selection frequency of the main regressive structures (Str. no. 1 and no. 3 based on R 700 and W 700 , respectively) can similarly change up to ±10 %. The reduced selection of a main regressive structure for a given season or WP can lead to preferential retention of some secondary regressive structure. For instance, the regressive Str. no. 8 based on W 700 + H is selected 10 % more frequently for WP2 than for the all-days situation (Fig. 10b).
The preferential (or conversely reduced) selection of some regression structures for given WTs or seasons was estimated for all grid cells of France. In most cases, the preferential (or reduced) selection was found to present a noticeable spa-tial coherency. Different configurations are observed as illustrated in Fig. 11 and discussed below.
The preferential selection of some regression structures can first be observed over large to very large regions. As an example, the preferential selection of Str. no. 3 for the prediction of precipitation amount for days in WP7 (more than +15 % compared to usual) is obtained for all grid cells in France. Whatever the location, the vertical velocity W 700 seems thus required in this specific weather pattern. Another example is that of WP8 which corresponds to an Anticyclonic situation. Whatever the location, no precipitation is really expected for this configuration. No predictor is thus required in addition to geopotential heights used in the analog stage. This configuration logically leads to a large preferential selection of the backup AM 25 model.
For a given weather pattern, the preferential selection of a regressive structure can also vary from one region to the other. For WP2 for instance, the structures based on W 700 or on W 700 and H are selected much more often along the Atlantic coast and in the north of France. The backup AM 25 model is conversely more selected in the southeast, on the  (b) Figure 10. For each season and weather type, difference (%) in selection frequency with the all-days case for different regression structures. Results for the prediction of (a) occurrence and (b) amount. A positive difference indicates that the considered regressive structure is selected more often than for the all-day situation. Results are displayed for a grid cell located in the northwest of France. For a clearer illustration, the three or four regressive structures that are almost never selected are not displayed. Mediterranean coast especially. For this weather regime, the southeast is actually protected by the Massif Central mountain and thus usually does not receive precipitation (cf. Fig. 3 of Garavaglia et al., 2010). The preferential selection of a regressive structure can also be obtained for rather small and specific regions. In Fig. 11b, the regressive Str. no. 8 based on W 700 +H is more frequently selected for WP7 (around +15 %) in the Cevennes-Vivarais region (southeastern part of the Massif Central) and in the pre-Alpine mountains (western part of the Alps). The combination of W 700 and H seems thus to be very informative in those configurations for this really rare WP (4 % of the 20-year period).
Whatever the configuration, the preferential selection of regression structures presents some spatial coherency, at small or large regional scales. This obviously also suggests the spatial robustness of the informative predictors to be retained for given large-scale weather configurations.

Conclusions
The relevance of a two-stage analog/regression model has been explored in this study for the probabilistic prediction of precipitation over France. Atmospheric analogs of the prediction day are identified to estimate the parameters of a two-part regression model further applied for the prediction. The regression model consists of a logistic GLM for the prediction of precipitation occurrence and a logarithmic GLM for the prediction of precipitation amount. The prediction obtained with this two-stage approach updates the predictive distribution that would have been achieved directly from a one-stage analog model based on atmospheric circulation analogs. The two-stage approach makes the downscaling model adaptive: as the analog days are identified for each prediction day, the predictors and regression coefficients of the regression models can vary from one day to the other.
The regression stage allows a non-negligible prediction skill gain compared to the reference analog model (gain up to 0.1 skill score points for both the BSS and the CRPSS). The CRPSS gain is mainly achieved due to the regression model estimated for the precipitation amount. The introduction of local-scale predictors such as relative humidity is obviously crucial there. The adaptive nature of the model and thus the possibility of tailoring the downscaling relationship (both predictors and regression coefficients) to the current prediction day seems to be decisive as well. The CRPSS gain obtained with the two-stage approach is actually 2 times larger than the one obtained by Chardon et al. (2014) with a two-level analog model where a unique and same secondlevel analogy variable (namely humidity) is considered for all days.
The prediction skill and adaptability of this two-stage approach was illustrated for the prediction of both the precipitation occurrence and amount in a simplified configuration where four predictors, selected in a preliminary analysis from a large ensemble of potential predictors, are used in the regression stage. The predictors used for precipitation occurrence are the relative humidity and vertical velocity at 700 hPa, the helicity integrated from 1000 to 500 hPa, and the occurrence of the previous day. A similar set of predictors is used for the precipitation amount (the occurrence of the previous day is replaced by the 700 hPa temperature). Most of the time, the final regression model only includes one or two predictors. It also very often includes the relative humidity R 700 for precipitation occurrence and R 700 or the vertical velocity W 700 for precipitation amount. Some combinations of predictors, almost never used in general, appear to be more frequently retained for some specific weather patterns and/or locations in France, revealing their potential interest for these situations.
For the sake of simplicity and to limit the degrees of freedom in our analysis, we considered a unique set of four potential predictors for all SAFRAN grid cells. This obviously leads to a sub-optimal prediction configuration. The main meteorological processes driving precipitation in France obviously differ from one region to the other. The most informative predictors are thus expected to be region-dependent and the set of predictors to be considered in the regression stage could be refined on a regional basis. This is expected to improve the skill of the prediction. The same would apply for an application of SCAMP to other regions worldwide.
A number of atmospheric variables have been considered as potential predictors in similar downscaling studies. The predictors found to be of interest are most often few. They are roughly the same than those considered in the preliminary analysis of the present work. However, as in the present work, the analyses usually carried out to identify these informative predictors are potentially misleading. The selection of a variable is indeed often based on its predictive power, estimated with some prediction skill score in an all-days evaluation framework. As highlighted in the present work however, some predictors are likely to be informative for very few meteorological situations. An all-days evaluation is expected to reveal robust predictors. It however very likely misses important situation-specific predictors. The two-stage approach here estimates the statistical downscaling link from a homogeneous set of days, with respect to their large-scale atmospheric circulation configuration. Those days are moreover atmospheric analogs to the prediction day. This two-stage approach has thus the potential to reveal the predictive power of very specific predictors, suited for very specific meteorological configurations. It leaves very likely room for significant improvements of the prediction skill for such unusual configurations. It gives likely also the opportunity to better understand the atmospheric factors under play in a number of non-frequent and atypical meteorological situations. Notwithstanding the technical limitations that may hamper such analyses, a broader exploration of a much larger diversity of predictors, possibly non-conventional ones, would be thus definitively worth in this context.
Both the predictors and the regression coefficients were shown in our work to depend on the analog days identified in the analog stage. This is the reason for the adaptability of the downscaling discussed above. Besides the adaptability, we ideally expect that for a given prediction day the predictor selection and the associated regression coefficients will be robust. Further analyses should explore this issue. An interesting work would be for instance to check that the predictors and their related coefficients do not significantly change when the set of analog days considered for the estimation is modified as a result of a different setup of the analog model (e.g., when one changes the archive period or the archive length). Results of our work depend on a number of choices and assumptions. They for instance likely depend on the database used for the large-scale atmospheric predictors. The day-today behavior of such an analog/regression approach (and the skill of the prediction) likely depends on the database and especially on the quality of the predictors. An atmospheric reanalysis with a higher spatial resolution would for instance likely allow for a better description of the shapes of geopotential fields and for a more relevant simulation of regional/local thermodynamic processes. It would likely lead in turn to higher-quality variables for some atmospheric parameters such as air instability. This may allow for a better identification of the daily specificity in the downscaling relationship and for the most informative predictors to be used each day. The reverse may occur when using lower-quality predictors, for instance lower-quality data from reanalyses available for the 20th century or lower-quality data from climate or numerical weather forecasting models. The quality of the predictors is thus obviously also an important issue to be further considered. It may lead to different informative predictors, depending on the intended use of the model (forecast, simulation, or climate impact studies).
SCAMP was used here for the prediction of small-scale precipitation at individual grid cells. The prediction of precipitation fields, obviously required for a number of impact studies, is also a challenging issue (e.g., Clark et al., 2004;Yang et al., 2005;Thober et al., 2014). Different adaptations of SCAMP would be worth investigating in this context. SCAMP could be for instance applied for the prediction of mean areal precipitation over the whole targeted spatial domain and some spatial disaggregation process could be further used to generate the required fields (e.g., Mezghani and Hingray, 2009;Rupp et al., 2012). As highlighted by Chardon et al. (2016), the prediction skill of SCAMP is expected to increase with the size of the spatial domain targeted for the prediction, which makes such an approach rather appealing. Another possible strategy for spatial predictions would be to rely on the advantages introduced by the analog stage of SCAMP. Chardon et al. (2014) indeed showed that for a given prediction day the same set of analog days can be used over rather large domains (up to a few 100s of kilometers) for a quasi-optimal prediction of local-scale precipitation. The precipitation field of each analog day (which is thus spatially coherent because already observed) could thus be used as a first-guess precipitation field scenario for the considered region. The field could be next updated at each location with day-and location-specific coefficients obtained from the regression stage of SCAMP. This spatial prediction issue will be considered in future works.
Data availability. Data used for this work are data described in the section "data" of Quintana-Segui et al. (2008) and Vidal et al. (2010). Atmospheric predictors are taken from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40, Uppala et al., 2005).