Development and comparative evaluation of a stochastic analog method to downscale daily GCM precipitation

There are a number of statistical techniques that downscale coarse climate information from general circulation models (GCMs). However, many of them do not reproduce the small-scale spatial variability of precipitation exhibited by the observed meteorological data, which is an important factor for predicting hydrologic response to climatic forcing. In this study a new downscaling technique (BiasCorrection and Stochastic Analog method; BCSA) was developed to produce stochastic realizations of bias-corrected daily GCM precipitation fields that preserve both the spatial autocorrelation structure of observed daily precipitation sequences and the observed temporal frequency distribution of daily rainfall over space. We used the BCSA method to downscale 4 different daily GCM precipitation predictions from 1961 to 1999 over the state of Florida, and compared the skill of the method to results obtained with the commonly used bias-correction and spatial disaggregation (BCSD) approach, a modified version of BCSD which reverses the order of spatial disaggregation and bias-correction (SDBC), and the bias-correction and constructed analog (BCCA) method. Spatial and temporal statistics, transition probabilities, wet/dry spell lengths, spatial correlation indices, and variograms for wet (June through September) and dry (October through May) seasons were calculated for each method. Results showed that (1) BCCA underestimated mean daily precipitation for both wet and dry seasons while the BCSD, SDBC and BCSA methods accurately reproduced these characteristics, (2) the BCSD and BCCA methods underestimated temporal variability of daily precipitation and thus did not reproduce daily precipitation standard deviations, transition probabilities or wet/dry spell lengths as well as the SDBC and BCSA methods, and (3) the BCSD, BCCA and SDBC methods underestimated spatial variability in daily precipitation resulting in underprediction of spatial variance and overprediction of spatial correlation, whereas the new stochastic technique (BCSA) replicated observed spatial statistics for both the wet and dry seasons. This study underscores the need to carefully select a downscaling method that reproduces all precipitation characteristics important for the hydrologic system under consideration if local hydrologic impacts of climate variability and change are going to be reasonably predicted. For low-relief, rainfall-dominated watersheds, where reproducing small-scale spatiotemporal precipitation variability is important, the BCSA method is recommended for use over the BCSD, BCCA, or SDBC methods.


4482
S. Hwang and W. D. Graham: Development and comparative evaluation of a stochastic analog method variability of climate variables (Christensen and Christensen, 2003;Giorgi et al., 2001;Johns et al., 2004;Lettenmaier, 1999;Wood et al., 2002). Furthermore, mismatch of the spatial resolution between GCMs and hydrologic models generally precludes the direct use of GCM outputs to predict hydrologic impacts.
To overcome this limitation of GCMs, a number of downscaling methods have been developed. It has been shown that fine-scale downscaled results provide better skill for hydrologic modeling (Andréasson et al., 2004;Graham et al., 2007;Wood et al., 2004) and agricultural crop modeling (Mearns et al., 1999) than using the coarse-resolution GCM output directly. Downscaling techniques are categorized by two approaches: (1) statistical downscaling using empirical relations between features simulated by GCMs at grid scales and surface observations at subgrid scales and (2) dynamic downscaling using regional climate models (RCMs) based on physical links between the climate at large and smaller scale. While dynamical downscaling provides physically consistent local climate simulations, it is computationally expensive. Furthermore current RCMs' predictions typically include systematic biases which require bias-correction after the dynamic downscaling, calling into question the usefulness of the additional computational burden (Hwang et al., 2011. As a result, RCM experiments for large ensembles of GCM simulations over multiple future scenarios are relatively scarce (Chen et al., 2012). To overcome these limitations statistical downscaling methods are often preferred (Hay et al., 2002;Wilby and Wigley, 1997). The primary advantage of statistical downscaling techniques is that they are computationally inexpensive, and thus can be easily applied to large ensembles of GCM simulations. Additionally statistical downscaling can provide local climate information at any space or time resolution of interest that observations are available to be used for bias correction. Thus they can be used to generate data specifically over existing hydrologic and agricultural model grids for climate change impact studies (Fowler et al., 2007;Murphy, 1999;Wilby et al., 2004).
Although much progress on downscaling precipitation predictions has been made, current challenges include the need to represent realistic levels of temporal and spatial variability at multiple scales (e.g., daily, seasonal and interannual variability, Timbal et al., 2009); the simultaneous downscaling of correlated climate variables (i.e. precipitation and temperature, Zhang and Georgakakos, 2012); and the representation of extreme events (Yang et al., 2012;Katz and Zheng, 1999). In particular, accurately representing the spatial patterns of daily precipitation can be an important factor for predicting hydrologic response to climatic forcing at the watershed scale (Bacchi and Kottegoda, 1995). For example, spatially uniform rainfall over large regions may result in higher evapotranspiration losses and lower surface runoff and recharge than spatially variable rainfall with same areal mean precipitation (Smith et al., 2004).
Statistical downscaling approaches are often applied at a temporally aggregated scale (e.g., monthly or seasonally) rather than daily or sub-daily timescales because of high data-handing costs and deficiencies in GCM daily results (Wood et al., 2002;Maurer and Hidalgo, 2008). When applied at a daily timescale, the direct use of GCM results makes them quite susceptible to model biases (Ines and Hansen, 2006). Means of addressing the problem include aggregating GCM predictions into seasonal or sub-seasonal means, downscaling to the target grid scale or station network, and then using a weather generator (Wilks, 2002;Wood et al., 2004;Feddersen and Andersen, 2005) or using methods which re-sample the historic data to disaggregate in space and time (Salathe et al., 2007;Maurer et al., 2010;Zhang and Georgakakos, 2012). Generally using a weather generator to generate daily climate sequences exhibits no skill at reproducing spatial correlation (Fowler et al., 2007). The use of historic analogs is constrained by the requirement that a sufficiently long observation record exists so that reasonable analogs can be found (Zorita and Storch, 1999).
Bias-Corrected Spatial Downscaling (BCSD; Wood et al., 2004;Maurer, 2007) is a widely used technique to downscale GCM results and it has been extensively applied to assess hydrologic impacts of climate change in the US (Christensen et al., 2004;Wood et al., 2004;Salathe et al., 2007;Maurer and Hidalgo, 2008). BCSD generally preserves relationships between large-scale GCM results and local-scale observed mean precipitation trends. Although this method was originally developed for downscaling monthly precipitation and temperature, in principle, daily GCM output can also be downscaled directly using this method. However realistic spatial variability of daily precipitation events may not be reproduced by this method because it is designed to preserve only the observed temporal statistics at the timescale chosen for downscaling and the spatial disaggregation process is essentially a simple interpolation scheme.
The constructed analog method (CA; Hidalgo et al., 2008) is a technique developed to directly downscale daily GCM products to assess hydrologic implications of climate scenarios. Hidalgo et al. (2008) showed that CA exhibited considerable skill in reproducing observed daily precipitation and temperature statistics but underestimated the mean and standard deviation of daily precipitation over the southeast US. Maurer and Hidalgo (2008) compared CA and BCSD method and demonstrated that CA showed better skill than BCSD, particularly in reproducing extreme temperature events. However both methods showed limited skill in reproducing daily precipitation extremes. Subsequently, Maurer et al. (2010) introduced the Bias-correction and Constructed Analog (BCCA) method which improved the CA method by removing the biases attributed to GCMs and showed better accuracy in simulating hydrologic extremes. Abatzoglou and Brown (2012) modified the BCSD method by changing the order of the bias-correction and spatial disaggregation procedures. That is, they interpolated GCM outputs onto a fine grid first and then the fields were bias-corrected using the CDF mapping approach for each fine-scale grid cell (i.e. the target resolution of downscaling). This simple modification (hereafter referred to as SDBC) improved the downscaling skill for reproducing local-scale temporal statistics. However the SDBC method does little to improve skill in reproducing spatial variability because the same approach (interpolation) as used in BCSD is employed for spatial disaggregation.
The methods mentioned above have been widely used for hydrologic, natural resource and agricultural applications and they are available online for the entire United States (see e.g., http://gdo-dcp.ucllnl.org/downscaled_cmip_ projections/dcpInterface.html#Welcome). Furthermore the BCSD method was adopted for use in the recent US Global Change Research Program's National Climate Assessment Report (http://ncadac.globalchange.gov/). However the ability of these methods to predict regional hydrologic response in rainfall-dominated watersheds should be carefully examined since they are not designed to reproduce the small-scale spatial variability of daily rainfall that is known to be important for accurately partitioning rainfall into evapotranspiration, surface runoff and groundwater recharge in these systems. This paper presents a new statistical downscaling technique (Bias-Correction and Stochastic Analog Method, hereafter BCSA) that preserves both the temporal and the spatial statistics of daily precipitation. The BCSA method is used to downscale daily precipitation predictions from 4 retrospective GCM simulations over Florida and the skill of the method is compared to downscaled results obtained using the BCSD, BCCA, and SDBC techniques.

Data
Daily gridded climate observations at 1/8 degree spatial resolution (∼ 12km) over Florida were obtained from Maurer et al. (2002) for the 1950-1999 study period. The Maurer et al. (2002) data include daily and monthly precipitation, maximum, minimum, and average temperature, and wind speed and are archived in netCDF format at http://hydro.engr. scu.edu/files/gridded_obs/daily/ncfiles/. These data represent spatially averaged values over each 12 km grid cell, and were derived directly from observations. Maurer et al. (2010) previously demonstrated the utility of these data to bias-correct and downscale GCMs using the BCSD and BCCA methods. In this study, these gridded observation data were used to both bias-correct daily GCM results and to estimate the observed spatial correlation structure for use in the BCSA method.
Retrospective daily predictions for four different GCMs (i.e. BCCR-BCM2.0, GFDL-CM2.0, CGCM3.1, and CCSM3) from the World Climate Research Programme's (WCRP's) Coupled Model Inter-comparison Project phase 3 (CMIP3) multi-model data set were selected for downscaling using the BCSD, SDBC, and BCSA methods, based on availability and previous use in both statistical and dynamical downscaling experiments (e.g., Maurer et al., 2007;Mearns et al., 2012). GCM results downscaled on a daily basis using BCCA were obtained directly from "Downscaled CMIP3 and CMIP5 Climate and Hydrology Projections" archive at http://gdo-dcp.ucllnl.org/downscaled_cmip_projections/ . Only two GCMs consistent with the models used for BCSD, SDBC, and BCSA (i.e., GFDL-CM2.0 and CGCM3.1) were available for BCCA and thus two additional GCMs (i.e., CNRM-CM3, and MIROC3.2) were randomly selected for comparison. The GCMs selected for this study are shown in Table 1. The grid resolutions for the GCMs range from 1.4 • to 2.8 • . Figure 1 shows how each model grid configuration of GCMs and gridded observation covers the study domain over Florida. As will be shown in Sect. 5 differences in skill among GCM data were found to be insignificant compared to differences in skill among statistical downscaling techniques. Thus use of consistent GCMs for the BCCA method does not affect the major findings and conclusion of the study.

Bias-Correction and Spatial Downscaling at daily scale (BCSD_daily) method
The BCSD method is an empirical statistical technique that was developed by Wood et al. (2002Wood et al. ( , 2004 and has been used by Ines and Hansen (2006), Salathe et al. (2007), and Maurer and Hidalgo (2008). As described above, the method was originally designed to downscale monthly precipitation and temperature. However in this study we employed the methodology at a daily timescale and evaluated its skills for reproducing the spatial and temporal statistics of daily precipitation. The technique will be referred to as the BCSD_daily hereafter. BCSD_daily consists of two separate steps for biascorrection and spatial downscaling. In the first step raw GCM predictions are bias-corrected at the large GCM grid scale using the CDF mapping approach (Panofsky and Brier, 1968). In order to apply this approach to bias-corrected daily precipitation, data corrections for precipitation amount and frequency (i.e., the number or percentage of rainy events) are often separately conducted (e.g., Ines et al., 2011;Teutschbein and Seibert, 2012). In particular this is necessary when using parametric distributions of rain events for the CDF mapping process. However nonparametric transformation using empirical distributions has also been used for bias-correction, often with better skill in reducing biases in than parametric distribution mapping approaches (Gudmundsson et al., 2012). Empirical CDF mapping was conducted in the study as follows: (1) CDFs of observed daily precipitation data (including "0" data) were created individually for each month at the coarse GCM scale using the spatial average of available observed data from Maurer et al. (2002) within each GCM grid. Thus 12 observed monthly CDFs were created for each GCM grid cell; (2) CDFs of simulated daily precipitation were created for each GCM grid cell for each month; (3) daily grid cell predictions were bias-corrected at the largescale GCM resolution using CDF mapping that preserves the probability of exceedance of the simulated precipitation over the grid cell, but corrects the precipitation to the value that corresponds to the same probability of exceedance from the spatially averaged observation over the GCM grid. Thus biascorrected rainfall x t,i on day t at grid i was calculated as where F (·) and F −1 (·) denote the empirical CDF of daily precipitation data and its inverse, and subscripts "sim" and "obs" indicate GCM simulation and observed daily rainfall, respectively. Because the observed CDFs include "0" values the procedure reproduces the probability of occurrence for all magnitudes of precipitation events, including zero rainfall events. Thus rainfall frequency (number of rainy days) is reproduced. The bias-correction procedure is schematically represented in Fig. 2. The examples of daily raw and biascorrected precipitation provided in the figure illustrate that the bias-correction process removes both bias in the precipitation predictions and the tendency of the climate model to underpredict dry days and overpredict the number of low volume rainfall days (Hwang et al., 2011).
In the next step of the BCSD_daily process anomalies (i.e., the ratio of simulated precipitation field to observed temporal mean precipitation field) of the bias-corrected GCM output were calculated for each grid cell. These anomalies were then spatially interpolated to the local-scale resolution using an inverse distance weighting technique (Shepard, 1984). Finally these fine-scale anomalies were re-scaled with the mean precipitation field at the fine grid scale resolution. Any prediction x t,i of which CDF sim,i x t,i is less than Prob(G obs = 0) will be substituted with "0" and thus frequency of al daily rainfall events is corrected in the process.

Bias-Correction and Constructed Analog (BCCA) method
The constructed analog (CA) technique creates a library of observed daily coarse-resolution climate anomaly patterns for the variable to be downscaled, then selects a set of observed coarse-resolution analogs with patterns that closely match the simulated anomaly pattern that must be downscaled. A linear combination of the selected, observed daily coarse-resolution climate anomalies' patterns is used to estimate a coarse resolution analog to the simulated anomaly. A downscaled anomaly is then generated by applying the same linear combination to the corresponding set of highresolution observed climate anomaly patterns. The CA approach retains daily sequencing of weather events from the GCM results and various alternative climate variables (e.g., geopotential heights, sea level pressure) can be considered as predictors to construct the best analog. A significant limitation of the CA approach, as originally developed, is that the biases exhibited by the GCM (resulting from imperfect model parameterization of physical processes or inad-equate topographic representation in the model) are reconstructed in the downscaled fields Maurer and Hidalgo, 2008). In order to overcome this drawback, Maurer et al. (2010) suggested a hybrid method, BCCA combining statistical bias-correction at the coarse scale (as used in BCSD) prior to applying the constructed analog method. However, BCCA may not accurately reproduce the mean and variance of precipitation at the downscaled resolution. This is because anomaly patterns of the bias-corrected GCM (instead of the bias-corrected GCM, itself) are used to choose analogs and historical records corresponding to the analogs are combined using linear regression without further bias-correction at the fine resolution. In this study, we used previously developed BCCA results available over the entire US from http://gdo-dcp.ucllnl.org/downscaled_cmip3_ projections/. As mentioned in Sect. 2, the BCCA results are not available for BCCR-BCM2.0 and CCSM3 that we used for other statistical methods, thus GFDL, CGCM, CNRM-CM3, and MIROC3.2) were used from this data set (Table 1).

4486
S. Hwang and W. D. Graham: Development and comparative evaluation of a stochastic analog method

Spatial Downscaling and Bias-Correction (SDBC) method
The SDBC method developed by Abatzoglou and Brown (2012) was the third previously published methodology evaluated in this study. As described above, the SDBC method is a modified version of the BCSD method in which the order of bias-correction and spatial disaggregation is reversed. That is, GCM outputs are interpolated to the fine grid scale using inverse distance weighting first and then the interpolated precipitation fields are bias-corrected using the CDF mapping approach described above but using observations at the local grid scale. This modification improves the downscaling skill in reproducing local temporal statistics since bias-correction is conducted at the local grid scale.

Bias-Correction and Stochastic Analog (BCSA) method
In this study a new spatial downscaling technique was developed to generate spatially correlated downscaled precipitation predictions which preserve both the temporal statistical characteristics as well as the small-scale spatial correlation structure of observed precipitation fields. The technique will be referred to as the BCSA method hereafter. Because the spatiotemporal features (e.g., frequency, spatial patterns, and correlation) of precipitation events may change monthly or seasonally, the BCSA process was performed using temporal and spatial statistics calculated separately for each month.
i. The first step in the BCSA procedure was to generate an ensemble of synthetic precipitation fields for each month that honor the observed spatiotemporal statistics as follows: gridded precipitation observations were transformed from their observed empirical (non-Gaussian) distributions into standard normal variables using the normal score transformation approach (Goovaerts, 1997;Deutsch and Journel, 1998): where x * t,i is the normal score transform of x t,i (i.e., observed daily precipitation on day t at grid i), G −1 (·) is the inverse transform function of the standard Gaussian CDF and F obs,i (x) denotes the empirical CDF of daily gridded observation for grid i.
ii. Pearson's correlation coefficients ρ for the normal score transform variables for all pairs of grid cell observations over the study domain were calculated for each month using the following equation: where N is the number of data points (days) available for each grid cell,x * i and σ * i denote the temporal mean and standard deviation of normal scores for grid i, respectively. The full correlation matrix that consists of all the calculated pair-wise correlations was then assembled: where n is the number of grid cells.
iii. The symmetric positive-definite correlation matrix ρ was factored using the Cholesky decomposition method (Taussky and Todd, 2006) that decomposes the matrix into the product of a lower triangular matrix and its conjugate transpose: where L is a lower triangular matrix with strictly positive diagonal entries, and L * denotes the conjugate transpose of L.
iv. Vectors with elements corresponding to each grid cell were randomly generated from independent Gaussian distributions for each day t (r t ) then transformed into pair-wise correlated vectors (r ϕ t ) by multiplying with the calculated factorization matrix L * . The random vector for each day, r t , contains n elements corresponding to each grid cell.
The elements of r ϕ t generated by this process honor the observed spatial correlation but have zero mean and unit variance.
v. Spatially correlated normal score variables r ϕ t were back-transformed to their observed empirical distributions using the CDF of the corresponding gridded observations using the following equation: where x ϕ t,i is the element of r ϕ t for grid i, F norm,i (·) denotes the empirical CDF (approximately normal) of the generated normal scores for grid i, andx t,i is the precipitation estimation for day t and grid i. This procedure was repeated for every grid cell to get ensembles of daily precipitation fields that preserve the empirical daily precipitation CDFs for each grid and spatial correlation structure of the observed precipitation field as well. vi.
Step (iv) and step (v) are repeated to create an ensemble of 3000 replicates of spatially distributed precipitation fields for each month.
Next the raw daily GCM predictions were bias-corrected at the large GCM grid scale using the same empirical CDF mapping approach (Eq. 1) as used in BCSD_daily method. Finally, for each day that the coarse-scale bias-corrected GCM results predicted non-zero rainfall, a realization from the appropriate monthly ensemble was selected for which the spatial mean of the generated precipitation field most closely matched the coarse-scale bias-corrected GCM result. Any difference between the spatial mean precipitation of the best-fit generated precipitation field and the coarsescale bias-corrected GCM precipitation (generally < 0.1 mm) was removed by multiplying the generated field by a scaling factor (i.e., spatial mean of bias-corrected GCM field/spatial mean of precipitation field chosen from the ensemble). For days that the coarse-scale bias-corrected GCM results predict zero rainfall over the domain each local-scale grid was assigned zero rainfall.

Assessment of downscaling skill
The temporal mean, 50th percentile, 90th percentile, and standard deviation of the daily precipitation time series for observed and downscaled predictions were calculated for each grid cell and mapped over the state of Florida to evaluate the spatial distribution of these temporal statistics for both the wet season (June through September) and the dry season (October through May). Mean error (ME), root mean square error (RMSE), correlation (R) of these predicted statistics were calculated over the state of Florida for each of these quantities. In addition to these daily precipitation statistics, day-to-day precipitation patterns and persistence/intermittence of events are important for most hydrologic applications. Daily transitions between wet and dry states were thus calculated for both the observed data and predictions (e.g., raw GCM data, bias-corrected GCM results, and downscaled results) using the first-order transition probability (Haan, 1977) and the numbers of events per year with specific wet/dry spell durations were also estimated over the study area for both the wet and dry seasons to investigate daily precipitation occurrence patterns.
In terms of spatial features, observations and predictions were evaluated using several indices indicating spatial standard deviation, correlation, and variability (Hubert et al., 1981). The Moran's I (Moran, 1950;Thomas and Huggett, 1980) index, a commonly used statistical index for identifying spatial dependence, was calculated using the following formula: where x t,i and x t,j refer to the precipitation in station i and j on day t, respectively.x t is the overall spatial mean precipitation on day t. w ij is an adjacency weight based on inverse distance weighting. The I values are between −1 and 1. Like the correlation coefficient, I is positive if both x t,i and x t,j lie on the same side of the mean (above or below), while it is negative if one is above the mean and the other is below the mean (O'Sullivan and Unwin, 2003). Geary's C (Griffith, 2003) was calculated as a measure of spatial variance of precipitation among grid cells, as follows: (9) C values range between 0 and 2. The spatial autocorrelation is positive if C is lower than 1, negative if C is between 1 and 2, and zero if C is equal to 1. In this research average I and C indices were calculated for the wet and dry season over the study period from 1961 to 1999. Moran's I t and Geary's C t represent measures of spatial autocorrelation for each spatial field at day t, however the relationship between the geographical distance and correlation are not measured by these statistics. We used the variogram, defined as the expected value of the squared difference of the values of the random field separated by distance vector h, to describe the degree of spatial variability exhibited by each spatial random field. The experimental variogram 2γ (h) for the observed and simulated precipitation data was calculated for both the wet and dry seasons using the following formula (Goovaerts, 1997): where N (h) denotes the number of pairs of observations (or predictions) separated by distance h available on the same day over the season, and x(u α ) and x (u α + h) are the observed (or predicted) precipitation at locations u α and u α +h, respectively, on the same day in that season.

Evaluation of temporal variability
Gridded annual total precipitation observations, spatially averaged over the state of Florida, ranged from 1048 mm to 1657 mm with a mean of 1343 mm over the study period from 1961 to 1999. The standard deviation of the spatially averaged annual total observation time series was 152 mm. Figure 3 compares the spatially averaged annual total precipitation time series and mean monthly precipitation of raw GCM outputs, bias-corrected GCM results at the GCM scale, and gridded observation (Gobs) over the study period. Biascorrection was conducted at the GCM grid scale using Gobs spatially averaged to each GCM resolution. Recall that biascorrection at the GCM scale is conducted only for BCSD, BCCA, and BCSA. The SDBC method interpolates the raw GCM results to the local scale first and then bias-corrects the interpolated results at the fine resolution. Figure 3 indicates that the GCM outputs are significantly biased in terms of mean precipitation amount (ME of annual total precipitation from −263 mm for CCSM3 to 521 mm for BCCR) but reproduce the observed seasonality of precipitation (i.e., annual cycle of mean precipitation) with high correlation (from 0.83 for BCCR to 0.98 for CGCM). Bias-correction significantly improves the accuracy of monthly mean precipitation. However, the temporal correlation of the time series was not improved because the CDF mapping approach does not change the temporal pattern or timing of precipitation events. Note that predicted annual time series from GCM simulations in retrospective mode (i.e., "hindcast") are not expected to reproduce the actual annual time series for the study period since they do not use actual observed initial conditions or boundary conditions in the simulations. As a result the correlation between the observed and raw GCM annual time series ranges from −0.16 to 0.35 (see Fig. 3). Table 2 compares the mean and standard deviation of observation, raw GCMs, bias-corrected GCMs, and downscaled bias-corrected GCM spatially averaged annual precipitation over the state of Florida. The BCCA method underestimated the observed mean annual precipitation over the study period by 8 % (CGCM3) to 11 % (CNRM-CM3) while the rest of methods reproduced the mean annual precipitation, with errors less than ±20 mm (< 2 % of observed mean annual precipitation). The temporal standard deviation was slightly underestimated by the BCSD results (114 mm to 147 mm over the GCMs) and BCCA (128 mm to 147 mm), and overestimated by SDBC results (153 mm to 247 mm). The SDBC method overestimates the temporal standard deviation of spatially averaged annual total precipitation because the largescale daily GCM precipitation predictions are spatially disaggregated by interpolation and then bias-correction at the downscaled grid resolution. Thus each fine-scale grid cell preserves the precipitation percentile event predicted by the large-scale GCM, exaggerating the spatial extent of high and low percentile events. Figures 4 and 5 compare the spatial distribution of mean precipitation for the wet (June to September) and dry seasons (October through May) over the study period and show that mean climatology was accurately reproduced over the state of Florida by the BCSD_daily, SDBC, and BCSA methods (ME < 0.1 mm). These results are expected since the CDF mapping bias-correction technique employed in these methods is designed to fit the predictions to historic mean climatology. Meanwhile, the BCCA results closely reproduced the spatial pattern of observed mean precipitation for both seasons (R about 0.9), but slightly overestimated mean precipitation in the southern part of the state and underestimated in the central/northern part of the state in the wet season (ME from −0.8 mm to −1.0 mm), and underestimated mean precipitation over the entire state in the dry season (ME from −0.5 mm to −0.6 mm). The spatial distribution of the temporal standard deviation of precipitation showed significant differences among the downscaling methods. Figures 6 and 7 compare the spatial distribution of the temporal standard deviation of the daily precipitation time series over the state of Florida for the wet and dry seasons over the study period, respectively. While the SDBC and BCSA results accurately reproduced the standard deviation for both the wet and dry seasons (ME ≤ 0.1 mm), the BCSD_daily results significantly underestimated the standard deviation for both seasons (average ME over the GCMs: −4.4 mm for wet season and −2.7 mm for dry season). The BCCA results improved over the BCSD_daily results but still underpredicted the daily precipitation standard deviation (average ME: −3.7 mm for wet season and −2.1 mm for dry season) because the linear regression scheme used to construct the analogs in BCCA attenuates extreme events and thus decreases temporal variance. Figures 8 and 9 show the spatial distributions of 90th percentile (5-20 mm) and 50th percentile (< 3 mm) of total daily precipitation for the observation data and downscaled estimates for the wet season, respectively. The results show that the BCSD_daily and BCCA method underestimated the observed 90th percentile daily precipitation amount (average ME over the GCMs: −4.5 mm for both methods) and overestimated the 50th percentile of daily precipitation (average ME: 2.3 mm for BCSD_daily and 0.9 mm for BCCA) because of their tendency to overestimate the occurrence of small rainfall events. On the other hand, the SDBC and BCSA method reasonably reproduce both the 90th percentile and 50th percentile daily precipitation (ME < ±0.2 mm for all cases). Figure 10 compares the full CDFs of daily precipitation for an arbitrarily selected grid located in west central Florida. This figures indicates that errors in the frequency distribution of BCSD_daily and BCCA daily precipitation (under/overestimation shown in Fig. 8 and Fig. 9) tend to be more severe for more extreme events (e.g., < 50th percentile and > 95th percentile; note that the 50th percentile of the all data corresponds to the 5th to 20th percentile of rain events, see Fig. 10 for example). The full CDFs of all GCM results downscaled using the SDBC and BCCA methods accurately fit the observed CDF.
The inaccuracies in the temporal variability produced by the BCSD_daily method are caused by the interpolation scheme used to disaggregate the bias-corrected GCM predictions which produces smooth downscaled results. The temporal standard deviation at downscaled locations corresponding to the center point of the GCM grid produces slightly higher temporal variability (Figs. 6 and 7) because the interpolation procedure produces less smoothing at these locations. This weakness of the BCSD_daily method is improved by exchanging the order of the bias-correction and interpolation procedures (i.e. SDBC) as shown in Fig. 6 through Fig. 9. When the interpolated GCM results are bias-corrected using fine-scale gridded observations at the last step of the downscaling process, the final results reproduce the full observed CDF and thus both the observed temporal mean and temporal standard deviation. Although SDBC has been recently introduced for downscaling daily GCM products (Abatzoglou and Brown, 2012), explicit insight into these distinctions between the BCSD_daily and SDBC downscaling frameworks was not provided by the previous studies.
In addition to reproducing temporal statistics of daily rainfall, day-to-day precipitation patterns are also important for most hydrologic applications. Daily transitions between wet and dry states were estimated for the observed gridded data, the raw GCMs, bias-corrected GCMs and the downscaled bias-corrected GCM predictions obtained using the BCSD_daily, BCCA, SDBC, and BCSA methods. Fig. 11 ME: -1.         9. Spatial distribution of the 50th percentile daily precipitation of Gobs, BCSD_daily, BCCA, SDBC, and BCSA GCMs for each grid cell for wet season (June through September), units in mm. ME, RMSE, and R are reported on each map for the bias-corrected predictions. Note that the percentile is calculated from all data including "0" precipitation data for each grid. Thus the 50th percentile indicates a low precipitation (typically < 1 mm) that ranges from 5th to 20th percentile of rainy events over the grids.  Fig. 11. Comparison of monthly first-order dry to wet (TP_{01}, upper raw) and wet to wet (TP_{11}, bottom row) transition probabilities for raw GCM data (first column) and bias-corrected GCM results (second column). Averaged transition probabilities for all grids over the study area (i.e., the state of Florida) were plotted for each GCM. Transition probabilities of the gridded observation were calculated both at 1/8 • resolution (original resolution of Gobs) and 2 • (aggregated up to approximate average grid scale of GCMs, see Table 1).
compares dry to wet (TP_{01}) and wet to wet (TP_{11}) transition probabilities of raw GCM data and bias-corrected GCM results (using Gobs spatially averaged to the GCM grid scale) to the transition probabilities of gridded observations over the study area both at the original resolu-tion (1/8 • ) and spatially averaged to the grid resolution i.e., The results show that all the raw GCM results tend to overestimate both TP_{11} and TP_{01} for both seasons (TP_{11} > 0.91 and TP_{01} > 0.66 for the dry season, and TP_{11} > 0.98 and TP_{01} > 0.78 for the wet Comparisons of monthly first-order dry to wet transition probability (TP_{01}) for observations (first row), BCSD_daily results (second row), BCCA results (third row), SDBC (fourth row), and BCSA results (fifth row) for 4 GCM products over all grids in the study area. Box plot presents minimum, 10th percentile, median, 90th percentile, and maximum over the grids. season) and bias-correction significantly improves the skill in reproducing the observed transition probabilities at the GCM grid resolution. Note that at the coarse resolution observations had higher transition probabilities over the annual cycle compared to fine-scale observations due to the spatial averaging process. Similarly, for the raw GCMs the probability of precipitation occurrence over the coarse grid cell area is larger than the probability of occurrence at any point or sub-grid within the coarse grid cell. Figures 12 and 13 compare transition probability of downscaled GCMs to gridded observations at 1/8 • resolution. After downscaling the BCSD_daily results still overestimated both TP_{11} and TP_{01} for both seasons compared to observations. The accuracy of bias-corrected downscaled transition probabilities were worse than the accuracy of bias-corrected GCMscale results especially in the wet season likely because of the interpolation scheme used in BCSD downscaling process (see Fig. 11). TP_{11} and TP_{01} for the BCCA results are closer to the observed transition probabilities than the BCSD_daily results but are not as accurate as the SDBC and BCSA results. Differences in transition probabilities among the GCMs were not significant for either the raw or any of the downscaled results.
The frequency and duration of consecutive wet and dry days reflect dynamic properties of precipitation that have important implications for producing extreme hydrologic behavior (i.e., flood and drought events). For evaluation purposes the number of consecutive wet and dry events that persist for more than 5 days was calculated for each downscaled GCM. Figures 14 and 15 show the spatial distribution of the number of events of wet spell length > 5 days in the wet season and dry spell length > 5 days in the dry season, respectively. The results show that BCSD_daily and BCCA produce fewer events of spell length > 5 days compared to observations and show lower correlations with observations (i.e., < 0.1 for BCSD_daily and ≈ 0.5 for BCCA). This is because both methods produce too many wet days (> 0.1 mm) and thus produce longer duration and fewer total number of events. In contrast, the SDBC and BCSA methods reproduce the spatial pattern of the observed frequency of wet and dry spell lengths much more closely for all GCMs (R: 0.71-0.91 for SDBC and 0.60-0.90 for BCSA). Overall, the differences in the results obtained by different downscaling techniques are larger than the differences obtained from different GCMs using the same downscaling technique. For additional insight, the average number of specific wet and dry spell events (i.e., > 5 days, > 10 days, and > 5 days) over the study period and study area for gridded observation and each downscaled GCM prediction are provided in the Supplement (available online). Figure 16 compares the relationship between the spatial standard deviation and mean of daily precipitation events for observations and predictions downscaled using the four methods. The results indicate that the observed relationship between spatial variability and event size was reproduced fairly well by all the methods, but that the BCSA method reproduced the relationship more correctly than the other meth-ods. The spatial variability of daily observations and downscaled GCMs were also quantified by calculating the average Moran's I and Geary's C for each month (Fig. 17). In general the BCSD_daily and SDBC results produced precipitation fields with overestimated spatial correlation (high Moran's I, i.e. ≈ 0.4 and 0.3, respectively, compared to ≈ 0.2 for observations) and underestimated spatial variance (low Geary's C, i.e. ≈ 0.4-0.5 compared to 0.6-0.8 for observations). The BCCA results showed better skills than the BCSD_daily and SDBC results for both the Moran's I and Geary's C indices, but was not as accurate as the BCSA method. In all cases the spatial variance of precipitation (Geary's C index) was found to show strong seasonality, i.e. higher in the wet season and lower in the dry season. No significant seasonality in spatial correlation (Moran's I) was found. Figure 18 compares wet season and dry season variograms calculated for each downscaled result to the variograms of the gridded observations. These figures indicate that the BCSD method significantly underestimated the observed variogram at all separation distances for both wet (June through September) and dry (October through May) seasons. The BCCA and SDBC variogram improved over the BCSD results, but still underestimated the observed variogram. As designed, the BCSA results reproduced the observed variograms correctly for both seasons.

Discussion
Overall, the existing interpolation-based statistical downscaling methods (i.e., BCSD_daily and SDBC) and the constructed analog method (i.e., BCCA) showed limited skills in reproducing the spatial and temporal variability of daily precipitation, which is important for determining hydrologic behavior in low-relief rainfall-dominated watersheds (e.g., . The skill of the BCSA method improved over these methods because BCSA preserves the spatial correlation structure of the observations while also taking the advantage of the CDF mapping bias-correction employed in the other downscaling methods. We used daily GCM precipitation predictions to develop and test the BCSA method in this study. Statistical downscaling on a daily basis should be adequate for many hydrologic modeling applications concerned with predicting spatially distributed streamflow and groundwater levels for water supply purposes (e.g., Xu et al., 1996;Middelkoop et al., 2001). However the BCSA method can be applied to downscale coarse resolution climate data into any temporal (e.g., hourly, daily, monthly) and spatial scale (e.g., gridded or irregularly distributed points) needed for a particular application, as long as observations are available to estimate the cumulative distribution functions and spatial correlation structure of precipitation over the required spacetime grid. Furthermore, because it generates an ensemble of possible local-scale precipitation patterns the uncertainty due to the downscaling process could be examined using a collection of equally probably downscaled climate fields. The procedure can also be applied to temperature and other surfaceweather variables.
One drawback of using the BCSA technique is that spatial disaggregation of coarse scale precipitation predictions is conducted independently on a daily basis, not taking into account day-to-day, week-to-week or seasonal temporal relationships at the local scale. Thus the temporal trends and persistence of downscaled precipitation results depend on the large scale bias-corrected GCMs' skill to reproduce the temporal correlation of precipitation patterns. We found that the observed transition probabilities and the frequency of wet and dry spells of greater than 5, 10 and 15 days duration were reasonably reproduced by the BCSA method, with similar accuracy to the SDBC method and better accuracy than the BCSD or BCCA methods. These results indicate that the bias-corrected GCM outputs have acceptable skill in representing plausible temporal precipitation patterns from a statistical point of view (e.g., average frequency) and this skill is preserved through the BCSA downscaling process.
However bias-corrected GCMs have been previously shown to produce unrealistically long dry spell lengths (e.g., Ines et al., 2011). Similarly, in this study we found that the maximum dry spell length produced by all of the downscaling methods (> 50 days of dry spell length) overpredicted the observed maximum dry spell length of approximately 40 days for the study area and period. Thus long temporal persistence errors are not effectively improved by the simple bias-correction used here and may reduce the utility of using the climate model results for applications (e.g., agricultural crop yield estimation, Ines and Hansen, 2006;Ines et al., 2011). This limitation may possibly be reduced by employing alternative bias-correction methods developed to replicate observed auto-correlation at multiple timescales (Johnson and Sharma, 2012;Mehrotra and Sharma, 2012) or stochastically redistributing temporal structure of climate model output (Ines et al., 2011).
The BCSA method is more computationally expensive than the BCSD and SDBC methods because it requires that an ensemble of stochastic spatial precipitation fields be generated from which to match the bias-corrected daily GCM on a daily basis. However generation of this ensemble is a relatively minor one-time cost that, for example, took approximately 3 h on a common personal computer (e.g., 64 bit, Intel Core i5 CPU, 3.3 GHz, 3.25 GB of RAM) for the resolution (12 km) and domain size (state of Florida) demonstrated here. The BCCA method is also more computationally expensive than the BCSD and SDBC methods because it may include processes for searching analogs and requires linear regression to construct analogs on daily basis. If due to the computational limitations, interpolation-based methods must be considered for downscaling over regions exhibiting high spatial variability of precipitation, other advanced statistical methods for spatial disaggregation (e.g., multivariate geostatistical methods using multiple factors -such as humidity, cloud, or elevation -relevant to spatial variability of precipitation, Haberlandt, 2007;Goovaerts, 2000) could be considered instead of simple univariate interpolation methods. Accurately reproducing the spatial variability of precipitation is generally accepted to be an important factor for predicting hydrologic behavior.  showed that retrospective precipitation fields produced using BCSA predicted streamflow in the Tampa Bay region of Florida more accurately than precipitation fields produced from interpolation-based methods such as BCSD and SDBC when used to drive a previously calibrated integrated hydrologic model. However the significance of errors in representing spatial structure of precipitation will vary from region to region, depending on topographic, geologic and climate characteristics. Therefore hydrologic modeling efforts testing various GCM downscaling techniques are recommended to quantitatively evaluate the hydrologic implications of alternative downscaling techniques, and to select the most appropriate technique, for particular regions and applications of interest.

Summary and conclusions
This study developed a new technique, the bias-correction stochastic analog method (BCSA), to downscale daily GCM precipitation predictions. Four GCM results were used to compare the skill of BCSA in reproducing observed spatial and temporal statistics of daily precipitation to the skills of the BCSD_daily, BCCA, and SDBC downscaling techniques. Downscaled GCM results using BCSD_daily, SDBC, and BCSA correctly reproduced the observed temporal mean of the daily precipitation as well as the annual cycle of monthly mean precipitation, while the BCCA results underestimated the mean daily precipitation. The temporal standard deviation and the magnitude of 90th percentile daily precipitation were underestimated by the BCSD_daily method especially for the wet season. Furthermore BCSD_daily overestimated low precipitation frequency, wet to wet transition probabilities, and dry to wet transition probabilities as well. These inaccuracies of the BCSD_daily method were improved by the BCCA and SDBC methods. However the BCCA method underestimated, and the SDBC method overestimated, the temporal standard deviation of spatially averaged precipitation. The BCSA reproduced the observed temporal standard deviation, magnitudes of both high (90th percentile) and low (50th percentile) rainfall amounts and wet to wet transition probabilities more accurately than the BCSD_daily or the BCCA method.
More significantly, the interpolation-based downscaling methods (both BCSD_daily and SDBC) and the BCCA method were unable to reproduce the observed spatial correlation structure of daily precipitation, which may have important implications for predicting hydrologic behavior in raindominated watersheds. The BCSA technique was designed to generate daily precipitation fields that reproduce observed spatial correlation of daily rainfall. Analysis of spatial standard deviation, Moran's I, Geary's C, and variograms showed quantitatively that BCSA is superior in reproducing the spatial variance and spatial correlation of observed daily precipitation compared to the other methods.
Results of this study underscore the need to carefully select a downscaling method that reproduces all precipitation characteristics important for the hydrologic system under consideration if local hydrologic impacts of climate variability and change are going to be accurately predicted. For low-relief, rainfall-dominated watersheds, where reproducing small-scale spatiotemporal precipitation variability is important, the BCSA method should produce superior results over the BCSD, BCCA, or SDBC methods. A follow-on phase of this work quantitatively evaluated the relative abilities of these statistical methods to reproduce historic hydrologic behavior using an integrated hydrologic model with retrospective GCM simulations in the Tampa Bay region of Florida. This study showed that the BCSA method outperformed other downscaling methods . In future work, the BCSA technique will be used to downscale future GCM climate projections to assess potential climate change impacts on regional hydrology in the Tampa Bay region.