Comparing CFSR and conventional weather data for discharge and soil loss modelling with SWAT in small catchments in the Ethiopian Highlands

Accurate rainfall data are the key input parameter for modelling river discharge and soil loss. Remote areas of Ethiopia often lack adequate precipitation data and where these data are available, there might be substantial temporal or spatial gaps. To counter this challenge, the Climate Forecast System Reanalysis (CFSR) of the National Centers for Environmental Prediction (NCEP) readily provides weather data for any geographic location on earth between 1979 and 2014. This study assesses the applicability of CFSR weather data to three watersheds in the Blue Nile Basin in Ethiopia. To this end, the Soil and Water Assessment Tool (SWAT) was set up to simulate discharge and soil loss, using CFSR and conventional weather data, in three small-scale watersheds ranging from 112 to 477 ha. Calibrated simulation results were compared to observed river discharge and observed soil loss over a period of 32 years. The conventional weather data resulted in very good discharge outputs for all three watersheds, while the CFSR weather data resulted in unsatisfactory discharge outputs for all of the three gauging stations. Soil loss simulation with conventional weather inputs yielded satisfactory outputs for two of three watersheds, while the CFSR weather input resulted in three unsatisfactory results. Overall, the simulations with the conventional data resulted in far better results for discharge and soil loss than simulations with CFSR data. The simulations with CFSR data were unable to adequately represent the specific regional climate for the three watersheds, performing even worse in climatic areas with two rainy seasons. Hence, CFSR data should not be used lightly in remote areas with no conventional weather data where no prior analysis is possible.


Introduction
Accurately represented, spatially distributed hydrometeorological and hydro-climatic data are the most important input parameters for hydrological modelling with the Soil and Water Assessment Tool, called SWAT hereafter (Arnold et al., 1998, 2012; Douglas-Mankin et al., 2010). Although a great deal of effort is being invested into rainfall and climatic data collection, many areas of Ethiopia have no adequate precipitation data, and where such data are available, the monitoring network contains substantial temporal and spatial gaps. This makes it necessary to use other sources of modelled rainfall data for SWAT modelling. The Climate Forecast System Reanalysis (CFSR, 2016) readily provides, for any coordinate on the globe, a climate data set adapted to SWAT. This data set is the result of the close cooperation between two US organizations, the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR), which have completed a global climate data reanalysis over 36 years from 1979 through to 2014. The CFSR data are based on a spectral model which includes the parametrization of all major physical processes as described in detail in Kalnay et al. (1996), Kistler et al. (2001), and Saha et al. (2010).
Published by Copernicus Publications on behalf of the European Geosciences Union.
V. Roth and T. Lemann: Comparing CFSR and conventional weather data with SWAT
However, the applicability of the CFSR data for small-scale catchments in the Ethiopian Highlands has not been adequately investigated yet. The aforementioned studies focused on large basins with numerous CFSR stations, which tend to balance errors in rainfall patterns. So far, few studies have been conducted in the Ethiopian context on the impact of rainfall data on streamflow simulations. Fuka et al. (2014) used CFSR data with SWAT in a 1200 km² watershed in Ethiopia, suggesting that CFSR data perform as well as or even better than conventional precipitation data. Worqlul et al. (2014) correlated conventionally recorded rainfall with CFSR data over the Lake Tana basin (15 000 km²). They suggested that seasonal patterns could adequately be captured although the CFSR data did uniformly overestimate and underestimate measured rainfall. A recent study from Dile and Srinivasan (2014) evaluated the use of CFSR data for hydrological prediction using SWAT in the Lake Tana basin, Ethiopia. The study achieved satisfactory results in its simulations for both CFSR and conventional data. While the outcome was better with conventional weather data, the study concludes that CFSR could be a valuable option in data-scarce regions. Other studies using CFSR data outside the Ethiopian context (De Almeida Bressiani et al., 2015; Alemayehu et al., 2015) and with large to very large catchments (13 750 to 73 000 km²) concluded that CFSR data yielded good to very good results and the SWAT model responded reasonably to the data set. One CFSR application in China (Yang et al., 2014) with meso-scale watersheds (366 to 1098 km²) concluded that CFSR data were significantly different and that the CFSR data spatial distribution might be the cause for the weak performance.
The impact of spatial variability of precipitation on model run-off showed that standard uniform rainfall assumptions can lead to large uncertainties in run-off estimation (Faurès et al., 2000). Several studies evaluating the CFSR data set have suggested that climatic models tended to overestimate interannual variability but underestimate spatial and seasonal variability (Diro et al., 2009). In another study, Cavazos and Hewitson (2005) performed statistical downscaling of daily CFSR data with Artificial Neural Networks, and their predictions showed low performance in near-equatorial and tropical locations, which led them to conclude that the CFSR data are most deficient in locations where convective processes dominate. Another study found the CFSR data set performed well on a continental scale but that it failed to adequately reproduce some regional features (Poccard et al., 2000). A study in China performed streamflow simulations by SWAT using different precipitation sources in a large arid basin using rain gauge data combined with Tropical Rainfall Measuring Mission (TRMM) data (Yu et al., 2011). The study established that streamflow modelling performed better using a combination of TRMM and rain gauge data, as opposed to data from rain gauges only. Different interpolation schemes with the use of univariate and covariate methods showed that Kriging and inverse distance weighting performed similarly well when used with the SWAT model (Wagner et al., 2012).
In this paper, WLRC (Water and Land Resource Centre) and SCRP (Soil Conservation Research Programme) rainfall data (hereafter called WLRC data) are compared to CFSR data over a maximum period of 34 years from 1981 to 2014 (Maybar; 33 years for Andit Tid and 32 years for Anjeni). The main objective of this paper is to compare the two data sets for annual, interannual, and seasonal cycles and subsequently to compare the effects on discharge and soil loss modelling when using these data sets in three locations in the Ethiopian Highlands (see Fig. 1). Calibrated CFSR-modelled discharge and soil loss are then compared to calibrated WLRC-modelled discharge and soil loss, and the applicability of the CFSR data in small-scale catchments for hydrological predictions is statistically evaluated.

Methods and materials
The effects of spatial and temporal variability in the CFSR rainfall data set for the study area were examined in several steps. First, the CFSR data were statistically compared to measured WLRC rainfall data for accurate representation of annual, interannual, and seasonal cycles. This is important because the temporal occurrence of rainfall has a great impact not only on discharge, but even more so on sediment yield generation. Many crop types are sown at the beginning of the rainy season(s), which implies extensive ploughing beforehand, which leaves fields unprotected for the first few rainfall events. Hence, the temporal occurrence of annual, interannual, and seasonal cycles plays a crucial role for the validation of a data set like the CFSR climatic data. Second, the impact of spatial and temporal variability of rainfall on hydrology and soil loss was assessed by modelling discharge and soil loss with the SWAT model. The SWAT model was calibrated for discharge once using WLRC climatic data and once using the CFSR climatic data set. Afterwards soil loss was calibrated for each catchment. In a last step, discharge and soil loss on a monthly basis were statistically and visually compared using performance ratings established by Moriasi et al. (2007). Each model was calibrated with one to five iterations using 500 simulations each.

Study area
The study areas of the three micro-scale catchments are located in the eastern and central part of the Blue Nile Basin. The Anjeni (AJ) and the Andit Tid (AT) are sub-basins of the Blue Nile Basin, which drains towards the west into the main Nile at Khartoum. The Maybar (MA) catchment drains into the Awash River to the east of the Ethiopian Highlands. The catchment sizes range from 112 to 477 ha and their altitudinal ranges extend from 2406 to 3538 m a.s.l. (see Table 1 for details). The catchments have a subhumid to humid climate with an annual temperature ranging from 12 to 16 °C and a mean annual rainfall ranging from 1211 to 1690 mm. Anjeni has a unimodal rainfall pattern with a main rainy season from June to September, while Andit Tid and Maybar have a bimodal rainfall regime with a small rainy season from April to May (belg) and a main rainy season from June to September (kremt) followed by a long dry season from October to March. Land use is dominated by smallholder rain-fed farming systems with grain-oriented production, ox-plough farming, and uncontrolled grazing practices.

Hydrometeorological data
The hydrometeorological data consist of two sets. The conventional or measured data contain daily rainfall and maximum and minimum temperature from one climatic station for each watershed (Lambrecht Rain Gauge Hellman type with chart recorder, and thermometers). These climatic stations were installed in the early 1980s and span the period until 2014 with some larger gaps (see

Hydrologic model
SWAT (SWAT2012 rev. 620) was used to assess the impact of different rainfall patterns on run-off and soil loss dynamics through the ArcSWAT interface (Version 2012.10_1.14). Here, we present the SWAT model only briefly, as it has been widely used in the past, with an extensive review of its performance and parametrization in the United States, China, Switzerland, Kenya, Ethiopia, and other countries (Gessesse et al., 2014; Mbonimpa, 2012; Betrie et al., 2011; Tibebe and Bewket, 2011; Lin et al., 2010; Stehr et al., 2008; Schuol and Abbaspour, 2007). SWAT is a physically-based river basin or watershed modelling tool. The SWAT model requires specific information about weather, soil properties, topography, vegetation, and land management practices occurring in the watershed (Arnold et al., 2012). ArcSWAT divides the catchment into hydrological response units (HRUs) based on unique combinations of soil type, land use, and slope classes that allow for a high level of spatial detail simulation. Run-off is predicted separately for each HRU and routed at sub-basin level to obtain the total run-off for the watershed (Neitsch et al., 2011). The surface run-off is estimated in the model using one of two options: (1) the Green and Ampt method (Green and Ampt, 1911) or (2) the Natural Resources Con-
Peak rate adjustment factor for sediment routing in the main channel: 0 to 2; 0.9 to 1.1; 1.2 to 1.6; 0.89 to 1.2
* a__ means a given value is added to the existing parameter value.
** v__ means the existing parameter value is to be replaced by a given value.
veys carried out by SCRP and WLRC through land use mapping and interviews and by our own surveys in 2008 and 2012. To adapt to annually changing land use patterns, a generic map was adapted from the WLRC land use maps of 2008, 2012, 2014 (Anjeni), and 2010, 2012, 2014 (Andit Tid, Maybar).

SWAT model setup
The watersheds were delineated using the ArcSWAT delineation tool, and the resulting stream network was checked against the stream network visible on satellite images (one satellite image for each watershed). SWAT compiled 1038 HRUs for Anjeni, 1139 HRUs for Maybar, and 728 HRUs for Andit Tid. Applying a threshold to such small catchments with highly detailed land use maps would have decreased the available level of information and increased modelling uncertainty. Therefore, HRUs were defined using a zero percentage threshold area, which means that all land use, soil, and slope classes were used in the process.
The CFSR time series were complete from 1979 to 2014. The WLRC data had substantial gaps in the time series, mostly in the early 1990s and after 2000 (see Table 1 for details). The SWAT weather generator was used to fill the gaps in the WLRC data set for rainfall and temperature. Otherwise daily precipitation and minimum and maximum temperature data were used to run the model. Potential evapotranspiration was estimated using the Hargreaves method (Hargreaves and Samani, 1985). Daily river flow and sediment concentration data were measured at the outlet of the three WLRC watersheds. The flow observations are available throughout the entire year, while calculated sediment concentrations from grab samples are only available during rainstorm events and are extrapolated over the whole time period. Personnel at the research station are instructed to take grab samples only during rainfall events, when the river is turning brown. Grab samples are taken by hand with 1 L bottles, which are then filtered through ashless filter papers (retention capacity 12-25 µm). The filtered sediment samples are later transported to their respective research centres, which oven-dry and weigh them. Sampling frequency is every 10 min at rising water levels and every 30 min after peak water level. Planting and harvesting dates were averaged over the entire period and applied at similar dates throughout the simulation. To simulate crop growth, we used the heat unit function in ArcSWAT. Teff (Eragrostis tef), a widely cultivated and highly nutritional crop native to Ethiopia, was planted at the beginning of July and harvested at the beginning of December with several tillage operations preceding planting. Tillage operations were adapted to the usage of the traditional Ethiopian plough called Maresha according to Temesgen et al. (2008), with a tillage depth of 20 cm and a mixing efficiency of 0.3.
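As a rough illustration of the Hargreaves estimate mentioned above, the following sketch uses the commonly cited Hargreaves and Samani (1985) form, ET0 = 0.0023 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin), with the extraterrestrial radiation Ra expressed as equivalent evaporation in mm per day. The function name and the example inputs are illustrative assumptions, and the implementation inside SWAT may differ in detail.

```python
import math


def hargreaves_pet(t_max_c, t_min_c, ra_mm_day):
    """Potential evapotranspiration (mm/day), Hargreaves-Samani form.

    t_max_c, t_min_c: daily maximum and minimum air temperature (deg C).
    ra_mm_day: extraterrestrial radiation as equivalent evaporation (mm/day);
               it is passed in here rather than derived from latitude and date.
    """
    if t_max_c < t_min_c:
        raise ValueError("t_max_c must not be smaller than t_min_c")
    t_mean_c = (t_max_c + t_min_c) / 2.0
    return 0.0023 * ra_mm_day * (t_mean_c + 17.8) * math.sqrt(t_max_c - t_min_c)


# Hypothetical highland day: 20 deg C maximum, 8 deg C minimum, Ra = 15 mm/day.
example_pet = hargreaves_pet(20.0, 8.0, 15.0)
print(round(example_pet, 2))
```

Because the method needs only daily temperature extremes, it can be driven by the same minimum and maximum temperature series that are fed into the model.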
Run-off was estimated using the SCS-CN method and flow routing was estimated using the variable storage coefficient method. The model was run for 32 years from 1983 to 2014 with daily data inputs but monthly outputs. Calibration and validation periods were chosen equally balanced regarding high-flow and low-flow years in all three catchments. The model was first calibrated and validated for discharge and then calibrated and validated for soil loss (see Table 1 for details).
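For readers unfamiliar with the SCS-CN method, a minimal sketch of the standard curve number equation in metric units is given here: the retention parameter is S = 25.4 · (1000 / CN − 10) mm, the initial abstraction is taken as 0.2 · S, and run-off is Q = (P − Ia)² / (P − Ia + S) whenever rainfall P exceeds Ia. The curve number of 80 and the 50 mm storm are hypothetical values, and the daily curve number adjustments that SWAT applies are not reproduced.

```python
def scs_cn_runoff(rain_mm, curve_number, ia_ratio=0.2):
    """Daily surface run-off depth (mm) from the SCS curve number equation.

    rain_mm: daily rainfall depth (mm).
    curve_number: dimensionless CN, 0 < CN <= 100 (hypothetical in the example).
    ia_ratio: initial abstraction as a fraction of the retention parameter S.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in the interval (0, 100]")
    retention_mm = 25.4 * (1000.0 / curve_number - 10.0)
    initial_abstraction_mm = ia_ratio * retention_mm
    if rain_mm <= initial_abstraction_mm:
        return 0.0  # all rainfall is abstracted, no surface run-off
    excess_mm = rain_mm - initial_abstraction_mm
    return excess_mm ** 2 / (excess_mm + retention_mm)


# Hypothetical storm: 50 mm of rain on land with CN = 80.
print(round(scs_cn_runoff(50.0, 80.0), 2))
```

The threshold behaviour (no run-off until rainfall exceeds the initial abstraction) is one reason why errors in daily rainfall amounts do not translate linearly into errors in simulated discharge.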

Calibration setup, validation, and sensitivity analysis
The SUFI-2 algorithm in SWAT-Cup (Abbaspour et al., 2004, 2007) was used for the calibration and validation procedure and for sensitivity and uncertainty analysis. SWAT-Cup calculates the 95 % prediction uncertainty band (95 PPU) in an iterative process. For the goodness of fit, two indices called p factor and r factor are used. The p factor is the fraction of measured data inside the 95 PPU band, and varies from 0 to 1, where 1 indicates perfect model simulation. The r factor is the ratio of the average width of the 95 PPU band and the standard deviation of the measured variable. There are different approaches regarding the balance of the p factor and r factor. The p factor should preferably be above 0.7 for discharge and the r factor value should be below 1.5 (Abbaspour, 2015), but when measured data are of lower quality, other values apply. Once an acceptable p factor and r factor are reached, statistical parameters for time series analysis are compared.
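The two indices defined above can be sketched in a few lines. The example assumes that the 95 PPU band is bounded by the 2.5th and 97.5th percentiles of an ensemble of simulations at each time step, and that the sample standard deviation of the observations is used; the function name, the percentile interpolation, and the toy ensemble are illustrative assumptions rather than a reproduction of SWAT-Cup internals.

```python
from statistics import quantiles, stdev


def p_and_r_factor(observed, simulations):
    """Return (p_factor, r_factor) for one observed series and an ensemble.

    observed: list of n measured values.
    simulations: list of runs, each a list of n simulated values.
    """
    lower, upper = [], []
    for step_values in zip(*simulations):
        # 39 cut points at 2.5 % steps; first is 2.5 %, last is 97.5 %.
        cuts = quantiles(step_values, n=40, method="inclusive")
        lower.append(cuts[0])
        upper.append(cuts[-1])

    # p factor: fraction of observations bracketed by the band.
    inside = sum(lo <= obs <= up for obs, lo, up in zip(observed, lower, upper))
    p_factor = inside / len(observed)

    # r factor: mean band width relative to the spread of the observations.
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(observed)
    r_factor = mean_width / stdev(observed)
    return p_factor, r_factor


# Toy ensemble of two runs that trivially brackets every observation.
p, r = p_and_r_factor([1.0, 2.0, 3.0, 4.0],
                      [[0.0, 0.0, 0.0, 0.0], [10.0, 10.0, 10.0, 10.0]])
print(p, round(r, 2))
```

The toy case illustrates the trade-off between the two indices: a very wide band brackets all observations (high p factor) but at the cost of a large r factor.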
For this study we used the Nash-Sutcliffe efficiency (NSE), the Root Mean Square Error-observations standard deviation ratio (RSR), and the percent bias (PBIAS). These are well-known statistical parameters, which are often used for comparison of time series, especially in hydrological modelling (Starks and Moriasi, 2009; Gebremicael et al., 2013; Dile and Srinivasan, 2014; Abbaspour, 2015; De Almeida Bressiani et al., 2015), and therefore help others to compare our modelling results to previous studies. This study refers to the model evaluation techniques described by Moriasi et al. (2007), who established guidelines for the proposed statistical parameters (see Table 3 for details). The NSE is a normalized statistic that indicates how well a plot of observed versus simulated data fits the 1 : 1 line and determines the relative magnitude of the residual variance compared to the measured data variance (Nash and Sutcliffe, 1970). NSE ranges from −∞ (negative infinity) to 1, with a perfect concordance of modelled to observed data at 1; a value of 0 indicates that the model is as accurate as the mean of the observed data, and values below zero indicate that the observed mean is a better predictor than the model. The RSR is a standardized root mean square error (RMSE, standard deviation of the model prediction error), which is calculated from the ratio of the RMSE and the standard deviation of measured data. RSR incorporates the benefits of error index statistics and includes a scaling factor. RSR varies from the optimal value of 0, which indicates zero RMSE or residual variation, to a large positive value, which indicates a large residual value and therefore worse model simulation performance (Moriasi et al., 2007).
The PBIAS measures the average tendency of the simulated values to be larger or smaller than their observed counterparts. The optimal value of PBIAS is zero. PBIAS is the deviation of data being evaluated, expressed as a percentage. A positive PBIAS value indicates the model is under-predicting measured values, whereas negative values indicate over-prediction.
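The three statistics described above can be written compactly. The sketch below follows the definitions given in Moriasi et al. (2007), with PBIAS computed from observed minus simulated values so that positive values indicate under-prediction; the function names and the toy series are illustrative only.

```python
import math


def _sum_sq_residuals(observed, simulated):
    """Sum of squared differences between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))


def _sum_sq_about_mean(observed):
    """Sum of squared deviations of the observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    return sum((o - mean_obs) ** 2 for o in observed)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 equals the observed mean."""
    return 1.0 - _sum_sq_residuals(observed, simulated) / _sum_sq_about_mean(observed)


def rsr(observed, simulated):
    """RMSE divided by the standard deviation of the observations."""
    return math.sqrt(_sum_sq_residuals(observed, simulated)) / math.sqrt(
        _sum_sq_about_mean(observed))


def pbias(observed, simulated):
    """Percent bias: positive means the model under-predicts."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


# Toy series in which the simulation is exactly half of the observation.
observed = [1.0, 2.0, 3.0, 4.0]
halved = [0.5, 1.0, 1.5, 2.0]
print(nse(observed, halved), round(rsr(observed, halved), 3), pbias(observed, halved))
```

With these definitions RSR equals the square root of (1 − NSE), so the two statistics rank simulations identically and differ only in scale, whereas PBIAS adds information on the direction of the error.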
For this article, the recommendations for reported values were strictly applied for discharge calibration and lowered for soil loss calibration.
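To make the rating step concrete, the following sketch applies the general monthly performance thresholds published by Moriasi et al. (2007) for streamflow and sediment. It is assumed here that these correspond to the ratings referred to in Table 3, and combining the three statistics by taking the worst individual rating is an illustrative choice, not necessarily the rule applied in this study.

```python
RATING_LABELS = ["very good", "good", "satisfactory", "unsatisfactory"]

# Absolute PBIAS limits (%) separating the four classes, per variable.
PBIAS_LIMITS = {"discharge": (10.0, 15.0, 25.0), "sediment": (15.0, 30.0, 55.0)}


def classify(nse_value, rsr_value, pbias_value, variable="discharge"):
    """Return the worst of the three individual ratings as an overall label."""
    # NSE: > 0.75 very good, > 0.65 good, > 0.50 satisfactory.
    nse_rank = sum(nse_value <= limit for limit in (0.75, 0.65, 0.50))
    # RSR: <= 0.50 very good, <= 0.60 good, <= 0.70 satisfactory.
    rsr_rank = sum(rsr_value > limit for limit in (0.50, 0.60, 0.70))
    # PBIAS: the absolute value is compared against variable-specific limits.
    pbias_rank = sum(abs(pbias_value) >= limit for limit in PBIAS_LIMITS[variable])
    return RATING_LABELS[max(nse_rank, rsr_rank, pbias_rank)]


# Values reported in this paper for the Andit Tid discharge calibration.
print(classify(0.79, 0.46, 3.1))    # WLRC input
print(classify(0.36, 0.80, 31.4))   # CFSR input
```

Because the worst individual rating decides the overall label, a single poor statistic (for instance a large PBIAS) is enough to mark a simulation as unsatisfactory under this sketch.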
The model performance was also evaluated using the hydrograph visual technique, which allows a visual model evaluation overview to be made. As suggested by Legates and McCabe (1999), this should typically be one of the first steps in model evaluation. Visual agreement between observed and simulated data was assessed on discharge and soil loss plots on a monthly basis.

General comparison of CFSR and WLRC rainfall data
The raw CFSR and WLRC rainfall input data showed significantly different patterns and rainfall amounts. For Andit Tid, situated on the eastern escarpment of the Blue Nile Basin, the belg and kremt rainfall seasons were temporally adequately represented; i.e. the timing of the rainy seasons was correctly represented by the CFSR data. However, total CFSR rainfall amounts were far from measured values: while the belg rainfall season in the CFSR data showed some overestimation, the total rainfall and length of the kremt rainy season were strongly underestimated. The CFSR data for Anjeni highly overestimated rainfall in the region. While WLRC data showed a clear trend towards only one main rainy season from May/June to September with average monthly rainfall ranging from 100 mm (May) to 380 mm (July), the CFSR data showed a pronounced main rainy season with monthly averages ranging from 400 to 1000 mm from June to September and a distinct small rainy season from March to May with monthly averages 3 times as high as the WLRC rainfall data. The total annual CFSR rainfall was 3 times the WLRC annual rainfall.
WLRC Maybar data showed a clear seasonality, with two rainy seasons, one in March and April, and one from July to August. The belg rainy season showed only a mild increase of average rainfall to around 75 mm month⁻¹ and the kremt rainy season showed a distinct increase of rainfall to an average of 270 mm month⁻¹. From the CFSR rainfall data, no clear distinction could be made between the belg and the kremt rainy season; both showed a rainfall increase to around 150 mm month⁻¹ and the total annual rainfall was strongly underestimated.
In general, all CFSR rainfall patterns showed a similar composition; data variability was more uniformly distributed and the distinct seasonality of the WLRC data was not well represented. CFSR data underestimated the bimodal rainfall climates and strongly overestimated the unimodal rainfall climate. The WLRC data have a highly variable rainfall range in the bimodal rainfall locations, which is not reflected by the CFSR data. In general, the CFSR rainfall data do not represent the high variability of rainfall measured by WLRC data.

Seasonal comparison of rainfall data
The seasonal components of the CFSR rainfall were assessed for the three stations by breaking the monthly data into seasons (dry season from October to March, small rainy season (belg) from April to May, and large rainy season (kremt) from June to September) and by comparing these only. The comparison of measured rainfall to modelled rainfall for the dry season from October to March was unsatisfactory (NSE < 0.50) with negative NSEs for all three stations (AT: −1.92, AJ: −12.19, MA: −0.77). The PBIAS indicated model underestimation for Anjeni and Maybar (AJ: 134.2, MA: 30.7) and an overestimation of the rainfall for Andit Tid (AT: −55.2). The RSR showed large positive values (AT: 1.68, AJ: 3.55, MA: 1.3), indicating a low model simulation performance and again an unsatisfactory rating (see table ).
For the belg rainy season from April to May, the model performed badly. Surprisingly, the model performed worst in Anjeni, where no small rainy season occurs. The CFSR model performance for Anjeni was unsatisfactory, with an NSE of −5.42, a PBIAS of 106.1, and an RSR of 2.48. The CFSR model overestimated the monthly rainfall in all but 5 out of 22 years. Andit Tid and Maybar were slightly more adequate but still unsatisfactory. NSE was −0.79 and −0.24 respectively, indicating unsatisfactory performance. PBIAS was −39.4 and 24.3, respectively. RSR was 1.31 and 0.85, which again indicates an unsatisfactory result.
The kremt rainy season from June to September is the season with the heaviest rainfall throughout the year. On average some 77 % of the yearly rain falls within this time period. This is also the time period when the heaviest rainfall-induced soil erosion occurs. For Anjeni, Andit Tid, and Maybar, the CFSR model performed unsatisfactorily (see Table 4) with NSEs below 0.50 (AT: −9.79, AJ: −50.09, MA: −3.28), RSRs above 0.70 (AT: 3.23, AJ: 7.0, MA: 2.03), and PBIAS values ranging from −69.2 (AT) and −47.1 (MA) to +128 (AJ).
The kremt rainy season was underestimated by the CFSR model for the bimodal rainfall pattern in Andit Tid and Maybar, while the unimodal rainfall pattern was heavily overestimated by the CFSR model.
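The seasonal split used in this section (dry season from October to March, belg from April to May, kremt from June to September) can be sketched as a simple month-to-season mapping. The function names and the sample rainfall values are hypothetical; note that the dry season spans two calendar years, which this minimal version does not treat specially.

```python
from collections import defaultdict


def season_of(month):
    """Map a calendar month (1-12) to the season labels used in the text."""
    if month in (4, 5):
        return "belg"
    if 6 <= month <= 9:
        return "kremt"
    if 1 <= month <= 12:
        return "dry"
    raise ValueError("month must be between 1 and 12")


def seasonal_totals(monthly_rain):
    """Sum rainfall per season.

    monthly_rain: iterable of (year, month, rainfall_mm) tuples.
    Returns a dict mapping season label to total rainfall in mm.
    """
    totals = defaultdict(float)
    for _year, month, rain_mm in monthly_rain:
        totals[season_of(month)] += rain_mm
    return dict(totals)


# Hypothetical monthly values for a single year.
sample = [(2000, 1, 10.0), (2000, 4, 50.0), (2000, 7, 300.0), (2000, 11, 5.0)]
print(seasonal_totals(sample))
```

Applying the same split to the measured and the reanalysis series yields paired seasonal series to which the NSE, RSR, and PBIAS statistics can then be applied season by season.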

Discharge modelling with WLRC and CFSR data
The performance ratings for each of the three catchments including SWAT-Cup p factor and r factor are summarized in Table 5. The table is divided into discharge comparison and soil loss comparison. Final parameter ranges are presented in Table 2.

Andit Tid
Calibration of Andit Tid with WLRC rainfall data yielded very good results. With a p factor of 0.71 and an r factor of 0.53 (see Sect. 2.6 for performance rating) the statistical parameters RSR, NSE, and PBIAS yielded very good results (0.46, 0.79, 3.1 respectively). The CFSR rainfall data, which underestimated the WLRC rainfall pattern, yielded unsatisfactory results with RSR, NSE, and PBIAS of 0.80, 0.36, and 31.4. The hydrograph shows that the underestimation of rainfall amounts for Andit Tid resulted in a constant underestimation of peak flows and of base flows throughout the whole time period. Validation of discharge for Andit Tid with WLRC data showed very good results with RSR: 0.46, NSE: 0.79, and PBIAS: 9.6 and marginally unsatisfactory results for the CFSR data set (RSR: 0.74, NSE: 0.45, PBIAS: 37.9).

Anjeni
Anjeni showed very good results for calibration with WLRC rainfall data. RSR, NSE, and PBIAS were well inside the optimal performance ratings (0.39, 0.85, and 3.7 respectively); see Table 3 and Fig. Satisfactory calibration could not be reached with CFSR data, and neither base flow nor peaks could be adequately represented. With a p factor of 0.49 and an r factor of 1.91 the statistical parameters were unsatisfactory (RSR: 2.70, NSE: −6.27, and PBIAS: −226.0). The hydrograph (Fig. 3) shows that the strong overestimation of CFSR rainfall data during belg led to a modelled discharge with extreme peaks during kremt, which do not correspond to the discharge regime of measured WLRC data.
Validation of discharge for Anjeni with WLRC data showed very good results with RSR: 0.41, NSE: 0.83, and PBIAS: −6.7 and unsatisfactory results for the CFSR data set with RSR: 1.24, NSE: −0.53, and very good PBIAS: 8.1.

Maybar
Calibration of Maybar with WLRC rainfall data proved to be less straightforward than for Anjeni and Andit Tid. The rugged topography of Maybar combined with an inadequate cross section proved challenging to model. Nonetheless, satisfactory results were achieved for discharge with RSR, NSE, and PBIAS of 0.63, 0.60, and −23.4 respectively.
The CFSR rainfall data yielded an unsatisfactory discharge simulation result with RSR, NSE, and PBIAS. As the CFSR-modelled rainfall shows two similar rainy seasons where WLRC rainfall data have distinct belg and kremt rainy seasons, SWAT-modelled discharge showed similar trends. Figure 3 shows regular discharge peaks from February to March, in accordance with the rainfall pattern deviation seen in Fig. 2, when no increase of discharge was measured at the research station. The SWAT model reflected the input rainfall pattern adequately, which led to discharge peaks during belg, when there are none in the measured data. At the same time it led to reduced discharge peaks during kremt, when the peaks in the measured WLRC data are clearly pronounced.
Validation of discharge for Maybar with WLRC data showed good results with RSR: 0.56, NSE: 0.74, and PBIAS: 17.3 and unsatisfactory results for the CFSR data set with RSR: 0.98, NSE: 0.04, and very good PBIAS: −1.9.

Soil loss modelling with WLRC and CFSR data
Soil loss modelling was calibrated using the same set of nine parameters for each catchment (see Table 2 for description). Calibration of soil loss was conducted using the parameter ranges for discharge calibration, and adapting the sediment parameters while leaving discharge parameters untouched. Performance ratings for each of the three catchments including SWAT-Cup p factor and r factor are summarized in Table 5 and visually represented in Fig. 4. Performance rating levels were considerably lowered for soil loss modelling. The threshold for the p factor was set at 0.40 with an r factor below 1.80 and standard performance ratings for RSR, NSE, and PBIAS.

Andit Tid
The good results from WLRC discharge modelling facilitated soil loss calibration and resulted in satisfactory performance ratings for RSR and NSE (0.69, 0.65), and an unsatisfactory PBIAS, which was slightly below the threshold at −56.3. The graphic representation showed good visual results (see Fig. 4) in general, but also showed constant overestimation of the modelled data except for the 3 years of 1988, 1989, and 1994. Validation of sediment yield for Andit Tid with WLRC data showed a marginally satisfactory result with RSR: 0.68, NSE: 0.51, and unsatisfactory PBIAS: −64.3, indicating a general overestimation, and unsatisfactory results for the CFSR data set with RSR: 1.39, NSE: −0.94, and satisfactory PBIAS: −11.9, indicating underestimation.

Anjeni
Soil loss modelling with WLRC rainfall data and calibrated discharge yielded satisfactory results. With a p factor of 0.40 and an r factor of 0.65, and statistical parameters RSR: 0.67, NSE: 0.55, and PBIAS: −19.9, the model was just satisfactory. The graphic showed adequate results with a constant overestimation of the model except for 2 years in the early 1990s. Modelling with CFSR data resulted in strongly unsatisfactory results (RSR: 1.01, NSE: −0.02, and PBIAS: −33.9), which can easily be explained by the strong model overestimation of rainfall and, subsequently, discharge. Parameters could not be adapted further to achieve better results as they were already set to the edge of the possible ranges.
Calibration in Maybar with CFSR rainfall data yielded unsatisfactory results (RSR: 1.02, NSE: −0.03, PBIAS: 54.4). As described in the discharge calibration section (Sect.3. charge during belg and underestimation during kremt. This trend was reproduced in the sediment calibration, resulting in small but distinct peaks during belg and smaller peaks than measured during kremt. No satisfactory calibration was possible with CFSR rainfall data. Validation of sediment yield for Maybar with WLRC data showed satisfactory results for both data sets with a very strong overestimation from the CFSR data set and an equally strong overestimation from the WLRC data set.

Conclusions
In this paper we studied the applicability of CFSR weather data to three small-scale watersheds in the Ethiopian Highlands with the goal of assessing the usability for future modelling in data-scarce regions. First, we compared CFSR and WLRC rainfall data at three stations in the Ethiopian Highlands on a monthly basis using box plots. Second, we modelled discharge with the SWAT model, once with WLRC data and once with CFSR rainfall data. Third, we modelled soil loss for the three stations with the SWAT model and compared calibrated results to measured data. The WLRC rainfall data set resulted in three calibrated and validated discharge models, while the CFSR data resulted in none. For the soil loss modelling the WLRC rainfall data resulted in two out of three calibrated and validated models while none could be adequately calibrated or validated for the CFSR data set. The SWAT modelling showed that the deviations in CFSR rainfall pattern and total yearly rainfall amount were so large that SWAT model calibration could not adequately represent measured discharge and sediment yield.
Our results clearly show that adequate discharge and soil loss modelling was not possible in the present case with the CFSR data. This suggests that SWAT simulations in small-scale watersheds in the Ethiopian Highlands do not perform well with CFSR data in every case, and that sometimes there is no substitute for high-quality conventional weather data. Such weather data, with high spatial and temporal climatic data resolution, were available for the three small-scale catchments used in the study but are not in many other cases. In these other cases, one should carefully check CFSR data against similar climatic stations with conventionally measured data. In addition, discharge and soil loss modelling showed that usage of CFSR weather data not only resulted in substantial deviation in both total discharge and total soil loss, but also in the seasonal rainfall pattern. The seasonal weather pattern is one of the major drivers of soil loss and is especially pronounced in the Blue Nile Basin, with one long rainy season occurring as fields are ploughed and sown. Thus, contrary to previous studies for the Ethiopian Highlands, this study suggests that CFSR data may not be applicable in every case for small-scale modelling in data-scarce regions; the authors even suggest that SWAT modelling with CFSR data for small-scale catchments may yield erroneous results which cannot be verified and may lead to wrong conclusions. Nonetheless, the advantage of CFSR data is their completeness over time, which would allow for comprehensive watershed modelling in regions with no conventional weather data or with longer gaps in conventionally recorded rainfall records.

Figure 1. Map overview of Blue Nile (Abbay) Basin with the WLRC research stations, agro-ecological zones according to Hurni (1998), and locations of CFSR stations.

Figure 3. Modelled SWAT discharge compared to measured discharge (blue) for WLRC (violet) and CFSR (pink) input data and 95 % prediction uncertainty (light blue). Each sub-figure contains the calibration and the validation period. Results are given in m³ s⁻¹.

Figure 4. Modelled SWAT soil loss compared to measured soil loss (blue) for WLRC (red) and CFSR (green) input data and 95 % prediction uncertainty (light blue). Each sub-figure contains the calibration and the validation period. Results are given in tons per month (t month⁻¹).

Table 1. Description of study sites, data sources, and time series and gaps. The subdivision of data relates to calibration and validation periods.

Table 2. SWAT parameters used for discharge and soil loss calibration with initial ranges and fitted final parameter ranges.

show a main rainy season from July to September and a light rainy season from March to May, while the CFSR data only show mildly increased rainfall in March, April, July, and August but no distinct rainy season (see Fig. 2 for comparison).

Figure 2. Precipitation distribution (1979-2014): monthly CFSR and WLRC rainfall distribution of all stations as box plots. CFSR data from 1979 to 2014 and WLRC data from 1981/1982/1984 to 2014 are shown. See Table 1 for details.

Table 4. Seasonal comparison of rainfall time series of daily rainfall amounts. Satisfactory performance ratings are highlighted in bold. Details for duration and gaps can be found in Table 1.

Table 5. Calibration and validation results of monthly CFSR- and WLRC-modelled discharge and soil loss. Values that meet at least the satisfactory criteria are highlighted in bold.