Scenario approach for the seasonal forecast of Kharif flows from Upper Indus Basin

Snow and glacial melt runoff are the major sources of water contribution from the high mountainous terrain of Indus river upstream of the Tarbela reservoir. A reliable forecast of seasonal water availability for the Kharif cropping season (April – September) can pave the way towards the better water management and subsequently boost the agro-economy of Pakistan. The use of degree-day models in conjunction with the satellite based remote sensing data for the forecasting of seasonal snow and ice melt runoff has proved to be a suitable 10 approach for the data scarce regions. In the present research, Snowmelt Runoff Model (SRM) has not only been enhanced by incorporating the “glacier (G)” component but also applied for the forecast of seasonal water availability from the Upper Indus Basin (UIB). Excel based SRM+G takes into account of separate degree-day factors for snow and glacier melt processes. All year simulation runs with SRM+G for the period 2003 – 2014 result in an average flow component distribution of 53%, 21%, and 26% for snow, glacier and rain respectively. 15 The UIB has been divided into Upper and Lower part because of the different climatic conditions in the Tibetan plateau. The scenario approach for seasonal forecasting, which like Ensemble Streamflow Prediction uses historic meteorology as model forcings, has proven to be adequate for long-term water availability forecasts. The accuracy of the forecast with a MAPE of 9.5% could be slightly improved compared to two existing operation forecasts for the UIB and the bias could be reduced to -2.0%. However, the association between forecasts and observations as 20 well as the skill in predicting extreme conditions is rather weak for all three models, which motivates further research on the selection of a subset of ensemble members according to forecasted seasonal anomalies.


Introduction
Mountains are the water towers of the world.They are the biggest resource of freshwater to half of the world's population fulfilling their needs for irrigation, industry, domestic and hydropower applications (Viviroli et al., 2007).The Indus River on which Pakistan's socio-economic development depends, can be termed as the bread basket of Pakistan (Clarke, 2015).Due to an agrarian economy, Pakistan's agriculture share in water usage is about 97 %, which is well above the global average of about 70 % (Akram, 2009).In Pakistan, the Indus River System Authority (IRSA) decides the provincial water shares according to the Water Apportionment Accord (WAA) of 1991 and provincial irrigation departments subsequently determine the seasonal water allocation to the different canal command areas depending upon the water availability forecast carried out at the end of March for the forthcoming Kharif cropping season (April-September).A reliable seasonal forecast of the water availability from snow and glacial melt is therefore of utmost importance for agricultural production and efficient water management.
On the other hand, snowmelt runoff modelling in mountainous regions faces the challenge of data scarcity as well as the uncertainty in parameter calibration (Pellicciotti et al., 2012).The need of the hour is to not only develop such a hydrological model which has the capability to cater for both snow and glacial melt components but also a reliable forecast technique which could help water managers and policy makers to enhance water resources management in the future.The present paper focuses on the implementation of the Snowmelt Runoff Model (SRM) including the glacier melt component (+G) based on the methodology proposed by Schaper et al. (1999), which is an important value addition to the existing ExcelSRM version (Bogacki and Hashmi, 2013) of the WinSRM (Martinec et al., 2011) model.In the earlier studies on the Upper Indus Basin (UIB) and its subcatchments, e.g.Immerzeel et al. (2010a), Tahir et al. (2011), Butt and Bilal (2011), and Adnan et al. (2016), they have only used the SRM standard version, while glaciers are dealt with by taking them as a part of the snow covered area.The underestimation of flows in periods associated with the glacier melt contribution, as pointed out by Tahir et al. (2011), has now been dealt with by incorporation of a glacier melt component.A unique methodology has been adopted to deal with the early fading of snow cover area from the Tibetan Plateau by separating the whole UIB into two sub-catchments, which is not implemented in the original WinSRM model.
Ensemble Streamflow Prediction (ESP), developed at the U.S. National Weather Service (Day, 1985), is widely used to generate probabilistic long-term stream-flow forecasts.As already successfully applied in the Upper Jhelum basin (Bogacki and Ismail, 2016), a scenario approach is used for seasonal flow forecasting in the UIB, which has much similarity to ESP.It also uses historical meteorology as model forcings; however, like the other operational forecast models for UIB, it is mainly focussed on a deterministic forecast of total Kharif inflow to the Tarbela reservoir.
2 Materials and methods

Study area
The upper catchments of the Indus River basin (Fig. 1) primarily feed the Tarbela reservoir, which is the larger of the only two major reservoirs in Pakistan.The UIB has an area of about 173 345 km 2 , of which approx.11.5 % is covered by perennial glacial ice (Tahir et al., 2011).At the end of most winters, nearly the entire UIB above 2200 m a.s.l. is covered with snow, resulting in more than 60 % of annual flow in the Upper Indus River to originate from snowmelt (Bookhagen and Burbank, 2010).The distribution of monthly inflows into the Tarbela reservoir (see Fig. 2) shows that these flows tend to rise progressively as melting temperatures advance into areas of higher snowpack at the higher elevations.The Indus River starts rising gradually in March reaching its maximum in July, while peak flood events usually occur during the monsoon season during July-September.By the end of July, the flows reduce due to the diminished snow cover in the lower catchment and glacier melt becomes an important flow component in the late summer months.This is due to the first melting of their seasonal snow cover and, when the snow has vanished, melting of the glacier ice.According to Tahir et al. (2011), glacial melt dominates the flows of the largest tributaries of the Indus River, i.e. the Chitral, Gilgit, Hunza, Braldu, and Shyok rivers.

Model structure
The Snowmelt Runoff Model (SRM; Martinec, 1975) is a semi-distributed, lumped temperature-index model which is specifically designed to simulate the runoff in snowdominated catchments that has been successfully applied in more than one hundred snow-driven basins around the globe (Martinec et al., 2011).Input variables of the SRM are daily values of air temperature, precipitation, and snow covered area.The catchment is usually subdivided into elevation zones of about 500 m each and the input variables are distributed accordingly.The total daily amount of water produced from snowmelt and rainfall in the catchment is superimposed on the calculated recession flow according to Eq. ( 1): where Q is the average daily discharge (m 3 s −1 ), M and R are the daily runoff depths originating from snowmelt and rainfall (cm d −1 ), A is the total area of the elevation zone (km 2 ), k is the recession coefficient (-), n is the index of the simulation day, and i and m are the indices and total number of elevation zones, respectively.Daily runoff from snowmelt and rainfall is calculated by Eqs. ( 2) and (3): where c S and c R are the runoff coefficients (-) for snowmelt and rain, respectively, a S is the degree-day factor for snow (cm • C −1 d −1 ), T the degree-days ( • C d) for each elevation zone, S the ratio of the snow covered area to the total area (-), and P is the daily precipitation (cm d −1 ).Schaper et al. (1999) introduced an enhancement in the original SRM approach by incorporating the separate glacial melt component in the model.In addition to the variables used by SRM, it also considers the area covered by exposed glaciers, i.e. not snow covered.An additional melt component is added to Eq. ( 1) that takes into account the specific degree-day factors for glaciers according to Eq. (4): where G is the daily melt (cm d −1 ) from exposed glaciers in each elevation zone, c G is the runoff coefficient (-), a G is the degree-day factor (cm • C −1 d −1 ) for glaciers, and S G is the ratio of the exposed glacier area to the total area (-).This model was tested in several basins and was found to be highly accurate even in basins with 67 % glacier area.The three alpine basins were Rhine-Felsberg, Rhône-Sion, and Ticino-Bellinzona in Switzerland (Schaper and Seidel, 2000).Apart from the improvement in the runoff modelling, the independent computation of glacier melt is an important step towards evaluations of glacier behaviour with regard to climate change.
The glacier melt component according to Eq. ( 4) was incorporated into the existing ExcelSRM (further referred to as SRM+G).This extension requires the glacier exposed area as an additional daily input variable and respective model parameters as given in Eq. ( 4).
An additional enhancement is the possibility to split the watershed into different sub-catchments.This feature is realised by adding the pre-calculated outflow of a subcatchment obtained by a separate simulation to the discharge of the downstream sub-catchment.The travel time can be considered by applying a time lag to the daily discharge time series.

Splitting the UIB into two sub-catchments
In the Karakorum-Western-Himalayas region, snow accumulates during winter and reaches its maximum extent in February or March.Higher altitudes typically have a 90-100 % snow cover that stays more or less constant until melting starts in spring.There is, however, a characteristic bias between the north-western part of the UIB, where at altitudes above 4000 m a.s.l. the snow covered area usually starts gradually decreasing in March, and the south-eastern part, namely the Tibetan Plateau, where at the same altitudes snow cover is fading away rapidly.This bias leads to an inevitable underestimation in forecasting the snowmelt-dominated early Kharif flows (see Sect. 3.1), which motivates the splitting of the UIB into two sub-catchments.
Ideally, the catchment should be split directly downstream of the Tibetan Plateau.However, because the first gauging station where daily flow data were available is the Kharmong gauging station (when the Upper Indus River has entered into Pakistan), this location was chosen to split the UIB into upstream and downstream sub-catchments (namely the Upper and Lower UIB; Fig. 1).The hypsometric characteristics including the number of elevation zones and their corresponding areas of both sub-catchments are shown in Fig. 3.
According to the two sub-catchments, two separate SRM+G models were created.For each simulation, first the Upper UIB model is run in order to simulate flows at Kharmong.These flows are then superimposed onto the flows calculated by the Lower UIB model using a time lag between Kharmong and Tarbela that was estimated by Kirpich (Eq. 5;Kirpich, 1940;USDA, 2010).
In this empirical equation, the time of concentration t (min) is only related to the length of the main channel L (m) and the slope of the longest hydraulic length S (-).Given the altitudes of Kharmong and Darband (upstream Tarbela reservoir) gauging stations as 2542 and 436 m a.s.l., respectively, and a channel length 1 of about 617 km, the approximated time lag of 5000 min was finally rounded to 3 days.

Data sources
There are a number of high-elevation climate stations in the Pakistani part of the UIB operated by WAPDA's 2 Glacier Monitoring and Research Centre (GMRC) and the Pakistan Meteorological Department (PMD).However, they are concentrated on the western part of the UIB and data is not available online.In order to have the most recent data for operational flow forecasting, the World Meteorological Organization (WMO) climate station at Srinagar airport located at an altitude of 1587 m a.s.l. was chosen as temperature base station, which already had proven to give representative temperatures for that region in the SRM model of the Upper Jhelum catchment (Bogacki and Ismail, 2016) and a full set of climatic data can be obtained online from the GSOD 3 database with a time lag of about 2 days only.Based on the daily air temperature data, degree-days in each elevation zone were calculated using a constant temperature lapse rate of −6 • C km −1 .The MODIS/Terra Snow Cover Daily L3 Global 500 m Grid (MOD10A1) product 4 has been used to determine the daily snow covered area in the elevation zones.The compatibility of using MODIS data in conjunction with SRM in the Himalayas and its surroundings has already been investigated by Immerzeel et al. (2009Immerzeel et al. ( , 2010b)).As the MODIS sensor cannot detect snow below clouds, a cloud elimination algorithm is applied using temporal interpolation between two cloud-free days for each pixel.Afterwards the daily percentage of snow cover area in each elevation zone is calculated and smoothed by moving average.
At the beginning of the melting season, glaciers are usually completely covered by fresh snow.As the melting season progresses, the snow cover will fade away and glacier exposed area will increase.The actual glacier extent was derived from two data sources.As a major source on global glacier distribution, the Global Land Ice Measurements from Space (GLIMS) data archive was used (Raup et al., 2007).This data was complemented by interpretation of Landsat 8 scenes (30 m spatial resolution) from late summer to early fall 2013, in order to identify the maximum of the glacier exposed area.The merged data were mapped on the 500 m 1 Digitised from Esri's World Imagery.Sources: Esri, Digital-Globe, GeoEye, i-cubed, USDA, USGS, AEX, Getmapping, Aerogrid, IGN, IGP, swisstopo, and the GIS user community MODIS grid.On a daily basis, the glacier exposed area is determined by all pixels that are classified as glacier but not identified as snow by the MODIS sensor.
A spatial interpolation of in situ (station) precipitation data in mountainous regions is particularly difficult and often biased towards lower values (Archer and Fowler, 2004) as the rain gauge network is usually sparse and mainly located at the valley floors, while maximum precipitation occurs on mountain slopes and increases with altitude in general.A promising alternative to station data are gridded, remotesensing-based precipitation products.However, regional and temporal patterns as well as multiannual means of these products differ significantly in the Himalayas (Palazzi et al., 2013).In particular, the widely used TRMM dataset is known to underestimate the precipitation at high altitudes, as found in the UIB (Forsythe et al., 2011) or the Andes (Ward et al., 2011).
Based on our own precipitation product comparisons for the Upper Chenab catchment, the gridded RFE 2.0 central Asia5 daily rainfall product (Xie et al., 2002) is used in the present model.According to SRM's elevation band approach, the gridded data, having a spatial resolution of 0.1 • latitude and longitude, is mapped to the respective elevation zones.For the period 2003-2015, the product yields a mean annual precipitation of 854 and 482 mm year −1 for the Lower and the Upper UIB, respectively, which reflects the significantly lower annual precipitation on the Tibetan Plateau compared to the western Himalayas (e.g.Bookhagen and Burbank, 2010;Ménégoz et al., 2013).The RFE basinwide annual mean of 701 mm year −1 lies well in the range of 675 ± 100 mm year −1 derived for the whole UIB by Reggiani and Rientjes (2015).

Model parameters
The most important parameter of a temperature-index model that is controlling daily snow and glacial melt is the degreeday factor (cm • C −1 d −1 ), which transforms the index variable degree-day ( • C d) into actual melt (cm d −1 ).
In the case of glaciers, a constant degree-day factor of 0.70 cm • C −1 d −1 , as proposed by Schaper et al. (2000), was chosen, which also corresponds to degree-day factors reported from glaciers in the Himalayas at a comparable latitude (Hock, 2003).The approach for degree-day factors for snow is more elaborate.In the first step, optimal degreeday factors were obtained for each elevation zone and year by diagnostic calibration, i.e. by achieving the best possible fit between simulated and observed hydrographs for each year.From this calibration exercise, it appears that degreeday factors are increasing by the time melting has started in a particular elevation zone (Figs. 4 and 5).Because a generalised rule is needed in the forecasting procedure, zone-  wise degree-day factor functions, as suggested by Ismail et al. (2015), were developed by linear regression between the calibrated degree-day factors and time.The increase in the degree-day factors with the passage of time is because the snow absorbs energy due to physical conditions such as increasing temperatures and solar radiation intensities.This process of energy storage plays a pivotal role in the ripening of the snowpack, which melts rapidly as the snow melting season progresses.The extent to which degree-day factors increase is related to the calibration procedure because it was observed during the model calibration that in a certain elevation zone when the degree-day factors attain a certain value, e.g (0.80 cm • C −1 d −1 ), the snow cover area in that very elevation zone has almost completely faded away so there is no advantage in further increasing the values of degree-day factors.The limit of the degree-day factors increase at a certain spatio-temporal region depends upon various physiographic and climatic parameters and research is on-going to evaluate the trend of degree-day factors in response to the aforementioned parameters.
The start of snowmelt and the corresponding application of the developed degree-day factor generalised rule is correlated with a certain threshold temperature (T th ) for each elevation zone (see Tables 1 and 2).The other model parameters required by SRM like temperature lapse rate, recession coefficient, runoff coefficient for snow, lag time, etc., were applied basin-wide and kept constant for all years (see Table 3).The values of these parameters were determined according to the methods described by Martinec et al. (2011) and slightly adjusted to achieve a good fit over the whole calibration period.It has to be noted that these parameter values will differ for other catchments.

Scenario approach for forecasting
In the forecasting period which starts from 1 April, the four model variables temperature, precipitation, snow covered area, and glacier exposed area have to be predicted for the  forthcoming 6 months of the Kharif cropping season (April-September).As the level of skill of seasonal climate forecasts for the Hindukush-Karakoram-Western-Himalaya region for such a lead time is still not sufficient, a scenario approach already successfully applied in the Upper Jhelum catchment (Bogacki and Ismail, 2016) is used.
This scenario approach has a lot in common with traditional ESP methods (developed at the U.S. National Weather Service as a method for generating long-term probabilistic streamflow outlooks; Day, 1985).Based on the assumption that past meteorology is representative of possible future events, ESP uses historical temperature and precipitation time series as forcings for the hydrological model to produce an ensemble of streamflow traces.A probabilistic forecast is created by statistical analysis of the multiple streamflow scenarios produced (Franz et al., 2008).Initial basin conditions are usually estimated by forcing the hydrological model with observed meteorology in a "warm-up" phase up to the time of forecast (Wood and Lettenmaier, 2008).
The seasonal scenario approach also uses historical temperature and precipitation as forcings for the SRM+G model.In contrast to ESP, however, this approach is, like the other operational forecast models for the UIB, primarily focussed on a deterministic forecast of total Kharif flow volume.Besides the "most likely" (median) flow, SRM+G forecasts only give an indication of the bandwidth of expected flows by the dry (20 %) and wet (80 %) quantiles as limits of the "likely" range.
The other notable differences are the initial basin conditions.SRM and SRM+G do not use any initial conditions, like soil moisture state of snow-water equivalent as used in other hydrological models.Instead, however, the snow cover area and the glacier exposed area are input variables to the model.For reasons of simplicity, the glacier exposed area is treated like the meteorological variables, i.e. the historical time series are used.However, the depletion of the snow covered area during the forecast period, which is the decisive factor for each forecast, is predicted by so-called "modified depletion curves".These modified depletion curves are derived from the conventional depletion curves of each elevation zone by replacing the timescale with the cumulative daily snowmelt depth (Martinec et al., 2011).The decline of the modified depletion curves depends on the initial accumulation of snow and represents the actual snow-water equivalent.When initial snow depth is low, the modified depletion curve declines faster than in years when a lot of snow has accumulated.At the end of March, when the seasonal forecast is carried out, an elevation zone already showing some decline in snow covered area, and hence having also some cumulated degree-days, is chosen as a "key zone".Comparing the relationship of decline in snow covered area versus cumulated degree-days with a statistical analysis of the mod- ified depletion curves of previous years, the actual amount of snow is estimated and the future depletion anticipated accordingly, while assuming similar snow conditions for all elevation zones.
The ESP approach usually has the advantage that errors in the initial conditions are progressively superseded by the meteorological forcings.In SRM+G, however, if an erroneous depletion estimate is in effect, then it will persist during the whole forecast period.As all ensemble traces are based on the chosen depletion curves, the initial estimate is crucially influencing each trace of the ensemble in the same direction.

Verification methods
Model verification comprises the simulation model as well as the forecasting model.The accuracy of the simulation model was evaluated by the two standard criteria used in the SRM (Martinec et al., 2011), namely the relative volume difference Eq. ( 6) and the coefficient of determination R 2 Eq. ( 7) where V and V * are the observed and the simulated annual flow volumes,Q i and Q * i are the observed and the simulated daily discharge values, and Q is the average observed daily discharge.
The skill of the forecasting model was assessed in comparison with IRSA's forecasts that are based on a statistical model and with forecasts from the UBC6 watershed model (Quick and Pipes, 1977) that is used by WAPDA's Glacier Monitoring Research Centre.The set of verification metrics was chosen taking into account that the existing operational forecasts for Kharif flows are traditionally issued in the form of deterministic forecasts, thus only the 'most likely' values forecasted by these models are available.
The accuracy of a forecast is a measure of the error between predicted and observed values.The root mean squared error (RMSE; Eq. 8), mean percentage error (MPE; Eq. 9), and mean absolute percentage error (MAPE; Eq. 10) were used as deterministic metrics to assess the accuracy of the predicted mean Kharif flow volumes.
In the above equations, f i is the forecasted and o i the observed flow volume, and n the total number of considered forecasts.Both, RMSE and MAPE measure the average magnitude of the forecast errors, where RMSE penalises larger errors more than MAPE.The mean percentage error measures the deviation between average forecasted and average observed flows, i.e. a positive MPE indicates overforecasting and a negative under-forecasting.As a commonly used deterministic measure of association, the correlation coefficient, Eq. ( 11), was applied to assess the correspondence between forecasted and observed values.
In addition, the uncentered anomaly correlation AC u (Eq.12; Wilks, 2006) was used as another measure of association.
where c is the climatological average value.The anomaly correlation is designed to measure similarities in the patterns of anomalies from the climatological average between forecasted and observed values.An AC ≥ 0.6 is usually regarded as an indication of some forecasting skill (Wilks, 2006).In the present context, the climatology average c is equivalent to the average observed flows o.
The ability of a non-probabilistic forecast to predict extreme conditions is usually assessed by defining discrete categories like below normal, normal, and above normal.The Heidke and Peirce skill scores for multi-categorical forecasts measure the fraction of correct forecasts in each category in relation to those forecasts which would be correct due purely to random chance.The Peirce skill score, Eq. ( 13), is unbiased in the sense that it assigns a marginal distribution to the reference random forecast which is equal to the (sample) climatology (Wilks, 2006).In the above equation, m is the number of categories, p(f j o j ) the joint distribution of forecasts and observations, where p(f j ) and p o j are the respective marginal distributions.
As the existing operational forecasts are primarily designed as point estimates of the mean flow volume and hence a comparison of probabilistic metrics between these models is not possible, only a basic probabilistic evaluation of the SRM+G scenario ensembles was carried out.The ranked probability score RPS was used, which is essentially an extension of the Brier score to the many-event situation (Wilks, 2006).It reflects the overall performance of a multi-category probabilistic forecast (Franz et al., 2003).In order to calculate the RPS, first the quantiles of m categories have to be determined based on given non-exceedance probabilities of the observed values.Then, for each forecast, the ensemble members as well as the observed flow are assigned to these categories and the respective cumulative distributions, Eq. ( 14), are calculated where F i is the cumulative ensemble distribution of forecast i and p j (f i ) is the relative frequency of an ensemble member falling into category j .For each forecast i, there is only one observation o i , hence the category j the observation falling in is given a relative frequency of p j (o i ) = 1 while all others are set to 0. Finally, the RPS (Eq.15) for n forecasts is the average of the sum of the squared differences in the cumulative distributions.

RPS = (15)
The RPS penalises forecasts more severely when their probabilities are further from the actual observations.The relative improvement or skill of a probability forecast over climatology as a reference forecast is assessed by the ranked probability skill score (RPSS; Eq. 16) where RPS ref is the RPS calculated with a constant forecast, e.g. the average of the observed series.
Besides the RPSS as a single-number score for the forecast performance, the reliability diagram (Wilks, 2006) is used to show the full joint distribution p f i , o j of forecasts and observations of a binary predictand in terms of its calibrationrefinement factorisation (Murphy and Winkler, 1987) . . ., n, (17) where the m conditional distributions p o j |f i specify how often each possible observation o j occurred when the particular forecast f i was issued, or in other words how well each forecast f i is calibrated.Forecasts f i that fall near to the 1 : 1 line in the reliability diagram result in a small (good) reliability term of the algebraic decomposition of the Brier score (Murphy, 1973), which is the weighted average of the squared vertical distances.
The other part of the above factorisation is the unconditional (marginal) distribution p(f i ) that specifies how often each of the m possible forecast values occurred.This socalled refinement distribution is visualised by a probability histogram that is also referred to as a sharpness diagram.A distribution with a large spread indicates that different forecasts are issued relatively frequently, and so have the potential to discern a broad range of conditions.Conversely, a narrow distribution, i.e. if most of the forecasts f i are the same or in a similar range, indicates a lack of sharpness (Wilks, 2006).
While the calibration-refinement factorisation relates to a binary predictand, the SRM+G scenario forecasts are grouped into three categories: less than normal, near normal (most likely), and higher than normal.According to Murphy (1972), this N-state situation is handled as a collection of N • m scalar forecasts, thus treating each category as a separate binary forecast f i that either meets or does not meet the observation o j .

Results and discussion
The development of the SRM+G forecasting model for the UIB has been an iterative process with the focus on creating an operational forecasting tool for Kharif flow volumes to the Tarbela reservoir.Thus, not all improvements have been tested individually while not changing the other components, which would allow an independent assessment of the individual effects.Nevertheless, the results are discussed below for each component.

Splitting of the UIB catchment
While the simulation results using a sole model for the whole UIB showed an acceptable agreement between simulated and observed flows in terms of R 2 and D v , initial hindcast results proved to be unsatisfactory, especially for the early Kharif (1 April-10 June) season, which is the major snowmelt contribution period.The mean percentage error MPE between hindcasts and observations of −21.0 % for the years 2003-2014 indicated a severe bias towards underestimating the actual flows and the respective MAPE of 25.9 % was also unexpectedly large.An analysis of MODIS snow cover data indicates that in the south-eastern part of the UIB, namely the Tibetan Plateau, already in March the snow cover is fading away rapidly.On the other hand, in the north-western part of the catchment, the same elevation zone is still widely covered with snow (Fig. 6).In Table 4 the snow cover area of the relevant elevation zones for the south-eastern (Upper) and north-western (Lower) part of the UIB is given on 1 March and 1 April as an example for the year 2003.While at an elevation of 4000 m a.s.l. the snow cover area reduces from 82 to 71 % in the Lower UIB, in the Upper UIB the snow cover area shrinks sharply from 79 to 50 %.A similar behaviour can be observed for most of the other years as well.
As in forecasting mode, the depletion of the snow cover area during the whole forecasting period is predicted depending on the reduction in the "key zone" in March (see Sect. 2.6).The relatively larger depletion in the Upper UIB leads to an underestimation of the available snow-water equivalent for the whole catchment, which explains the subsequent underestimation of early Kharif flows by the initial hindcasts and in turn motivates the splitting of the UIB into two sub-catchments and separate models (see Sect. 2.3).As a result of this splitting, the MPE of the hindcasts for early Kharif changed to a modest overestimation of 4.2 %, while the MAPE could be reduced to 15.8 %.

Glacier melt component
During the first diagnostic calibration of the degree-day factors, it became obvious that in late summer, even with extreme high degree-day factors, it was usually not possible to reproduce the observed hydrograph.The analysis of the snow cover depletion showed that in most of the years, snow has vanished from areas below 4000 m a.s.l.already in June and elevation zones below 5000 m a.s.l.usually become snowfree in July.Thus, the snowmelt contribution to the flow is rapidly diminishing in August and September (see Fig. 7).
Although the monsoon season usually starts in July, bringing the highest monthly precipitation depth to the UIB in July and August, the resulting flow component from rain is not sufficient to create the necessary discharge.Therefore, many studies postulate a substantial contribution from glacier melt to the annual flow in the UIB.For example, Immerzeel et al. (2010b) estimate a contribution of 40 % from snow and 32 % from glacier melt with the remaining 28 % from rain, Charles (2016) mentioned that the contribution is 50 % from snow and 20 % from glacial melt.
Figure 7 shows the monthly distribution of the three flow components as calculated by SRM+G before subjecting to the recession flow calculation according to Eq. (1).SRM's simple recession flow approach is not mass conservative and does also not allow a direct attribution, at what day these flow components actually occur in the daily discharge Q n+1 .However, an overall water balance shows that the difference between water going into the (virtual) storage and water taken out by the recession flow term Q n is about 7 %, which seems acceptable in relation to the uncertainty associated with the input data.Having in mind the above limitation, the average (2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014) flow component distribution as simulated by SRM+G is 53, 21, and 26 % for snow, glacier, and  Figures 8 and 9 compare the hydrographs of simulation runs with and without the glacier component for the Upper and Lower UIB.The effect is more visible in the Lower UIB as about 10.5 % of the catchment is glaciated, while for the Upper UIB the glaciated area is merely 1.7 %.

Simulation model verification
The simulation model was verified by comparing full year (1 January-31 December) simulation runs using the actual temperature, precipitation, and snow cover data versus observed daily flows.Examples of respective hydrographs for the Upper and Lower UIB are given in Figs. 10 and 11.The resulting coefficients of determination R 2 and relative volume differences D v for each year and both the Upper and Lower UIB model are given in Table 5.Although the years 2003-2012 were used to calibrate certain model parameters (see Sect. 2.5), they are also used to validate the model in "forecasting mode", as in particular the degree-day factor functions (Tables 1 and 2) were applied as during a real forecasting procedure, i.e. the starting point was chosen according to the melting start threshold temperatures as given in Tables 1 and 2. The years 2013 and 2014, on the other hand, were not used at the time of model calibration; thus they represent a fully independent verification of the simulation model.
The average R 2 of 0.86 and 0.88 for the Upper and Lower UIB, respectively, indicate that in general the two models simulate the variations in the observed hydrographs quite acceptably.Values for the years 2013 and 2014 are even above the average for Lower UIB, which is finally essential for Tarbela inflows, but extreme floods, like 2010 during monsoon  season, are not reproduced that well.The average relative volume differences D v of −1.89 and 0.03 % for the Upper and Lower UIB, respectively, show that the simulation models, although not mass-conservative due to the recession approach (Sect.2.2) and volume differences vary from year to year, are in terms of total flow volume not biased over the long run.

Evaluation of forecasting skills
In order to evaluate the skills of the forecasting model, hindcasts were carried out for the years 2003-2014 always using all years, i.e. also the hindcasted year, as scenario members.
For the years 2015 and 2016, real forecasts have been determined before 1 April of these years, thus without using the particular year as a scenario.In all cases, the expected depletion of snow cover area was predicted for each scenario member based on the respective situation in March of the specific year.
In Table 6, the ensemble medians of hind-and forecasts by SRM+G are compared with observed flows, with IRSA's forecasts that are based on a statistical model, and with forecasts from the UBC watershed model (Quick and Pipes, 1977) that is used by WAPDA's Glacier Monitoring Research Centre.All values are total Kharif (1 April-30 September)  flow volumes in 10 9 m 3 .Figure 12 presents all model results and observed flows and shows also the 20 and 80 % quantiles of the SRM+G scenario ensemble.Table 7 summarises the metrics that are used to compare the forecast skills of the three models.The RMSE, MAE, and MAPE show an improvement in accuracy by SRM+G and the MPE shows a reduction of bias where SRM+G tends to slightly underestimate, while the two other forecasts moderately overestimate the total Kharif flows.However, both the correlation coefficients R and the anomaly coefficients AC u indicate however, that the association between forecasts and observations is weak for all three models.Here, the UBC forecasts show the best correlation followed by SRM+G and IRSA.The above aspects of model performance are synoptically visualised in the Taylor diagram (Taylor, 2001) in Fig. 13 that was plotted using the R-package Plotrix (Lemon, 2006).All models are comparable far away from the point of observations given on the x-axis, with SRM+G having the smallest centred root mean square difference and UBC the best correlation coefficient.
In order to evaluate the model skills in forecasting extreme conditions, i.e. dry or wet, the Peirce skill score was applied.The limits between the categories dry, normal, and wet conditions were defined as 20 and 80 % non-exceedance of the observed historic Kharif flow series 2003-2016, which corresponds to quantiles of 56.8 and 67.9 km 3 , respectively.Obviously IRSA and SRM+G forecasts have no skill in this respect, while UBC shows some, although limited, skill compared to purely random chance.
As only point estimates of IRSA and UBC forecasts were available, the assessment of probabilistic skill only applies to the SRM+G scenario ensembles.Figures 14 and 15 show typical traces of ensemble members as well as the observed, mean, and historic trace.In most years, the ensemble mean is closer to the observed value than the historic trace.
The RPSS was used to assess the overall performance of the probabilistic forecast.The same category limits as for the Peirce skill score were chosen, i.e. the 20 and 80 % quantiles of the observed flow series.As no other probabilistic forecasts were available as reference forecast, the ranked probability score RPS of the scenario forecast ensembles was compared to the climatology, i.e. the average Kharif flow of the observed series of 62.7 km 3 .The RPS of scenario forecast ensembles and climatology were 0.370 and 0.462, respectively.The resulting RPSS of 0.20 indicates that the scenario  ensemble shows some, however limited, skill and improvement over a constant forecast of the historic average.
The reliability diagram (Fig. 16) was constructed with the aforementioned three forecast categories: dry (≤ 56.8 km 3 ), near normal, and wet (> 67.9 km 3 ) conditions based on the historic observations and using for each forecast the frequency of ensemble members falling into the respective category.The resulting reliability diagram is relatively rugged and has gaps in several classes, although each forecast category was treated as an individual forecast resulting in 42 scalar forecasts.However, in total there are only 14 observations, which causes the outlier at the forecast probability class 0.6.There were 3 forecasts that fell into this very class but none of the 14 observations.As can be seen from the sharpness diagram, there are also empty forecast classes (0.4 and 0.5), meaning these probabilities have never occurred in any forecast category.Besides the shortcoming of the limited sample size and in particular the outlier, most points are located around the 1 : 1 line indicating that there is no dry or wet bias and that they fall within an acceptable distance.Res- olution, although difficult to assess taking the limited number of points, seems to be a bit weak, i.e. with a tendency to over-confidence, which may be caused by the fact that the available scenario years comprise mostly near normal conditions.

Conclusions
The SRM was applied to the UIB in order to forecast the total Kharif inflow to the Tarbela reservoir.Several improvements had to be introduced to SRM in order to meet the specific requirements of the UIB.
Not surprisingly, a separate component had to be added to SRM in order to consider the flow component originating from glacier melt.Without this component, especially in the late summer months, there is a lack of water to meet the observed hydrograph.All-year simulation runs with SRM+G for the period 2003-2014 result in an average flow component distribution of 53, 21, or 26 % for snow, glacier, or rain, respectively, which fits well to the values found in a number of other studies.
It is well known, that the Tibetan Plateau receives significantly lesser precipitation than the western parts of the UIB.In addition, MODIS data shows that the snow cover is fading away in early spring much faster than in the other parts.In the present study, SRM's modified depletion curve approach for predicting the snow cover depletion during the forecast period has proven to be very sensitive to errors in the estimation of the actual snow-water equivalent.In such cases, it is in-evitable to split the catchment into more homogeneous units.Therefore, the superposition of flows from sub-catchments by using a time lag was introduced to SRM+G, which leads to a significant improvement in the forecasts of snowmeltdominated early Kharif flows in particular.
The scenario approach is a step towards probabilistic forecasting of seasonal flows in the UIB.As the accuracy of existing forecasts with a mean volume error of 10.9 and 11.4 % is already quite high, the improvement by SRM+G having a MAPE of 9.5 % is only limited.The bias, however, could be reduced to −2.0 %.Association between forecasts and observations is rather weak for all three models, just as none of the models has significant skill in predicting extreme dry or wet conditions.
Regarding the scenario approach, it is obvious that as far as the variables precipitation and temperature are concerned, these tend towards the climatology, i.e. the long-term averages.A variance in the forecasts is only introduced by the different estimates of the snow-cover depletion curves for each forecast.Thus, a promising way to improve the association and sharpness of the scenario approach would be a selection of a subset of ensemble members according to forecasted seasonal anomalies in temperature and precipitation.A quick test using only the five lowest, middle, or highest ensemble members selected according to the (known) relative flow frequency of the forecasted year gives promising results, e.g.not only a MAPE of 4.9 % but also an AC u of 0.78 and a PSS of 0.41.The challenge of course is to forecast the seasonal anomaly in temperature and precipitation.In this respect, further research is needed on how today's global forecast systems may allow a more specific selection of ensemble members particularly in the UIB, where the correlation to common teleconnections like the ENSO 7 status is known to be weak.Special issue statement.This article is part of the special issue "Sub-seasonal to seasonal hydrological forecasting".It is not associated with a conference.

Figure 1 .
Figure 1.Map of the Upper Indus Basin (UIB) showing different elevations and splitting of the UIB at the Kharmong gauging station into Upper and Lower UIB.

Figure 3 .
Figure 3. Hypsometric curves and the distribution of area under 500 m elevation bands for the Upper and Lower UIB.Eleven and seven elevation zones were made for the Lower and Upper UIB, respectively, and the elevation of the weather stations in the western portion of the UIB are presented on the right hand side y-axis.

Figure 4 .
Figure 4. Increase in degree-day factors with time after start of melting for elevation zones 7 and 8 for the Lower UIB.Degree-day factors are obtained by diagnostic calibration.

Figure 5 .
Figure 5. Increase in degree-day factors with time after start of melting for elevation zones 5 and 6 for the Upper UIB.Degree-day factors are obtained by diagnostic calibration.

Figure 6 .
Figure 6.Snow cover variation in the months of March (Left) and April (Right) 2003 in UIB.

Figure 9 .
Figure 9.Comparison of SRM+G (with glaciers) and SRM (without glaciers) for the Upper UIB in 2008.

Figure 10 .
Figure 10.Results of the validation of the final Upper UIB flow forecast model (dashed line) compared to observed flows at Kharmong (solid line) for the year 2014.

Figure 11 .
Figure 11.Results of the validation of the final Lower UIB flow forecast model (dashed line) compared to observed inflows at Tarbela (solid line) for the year 2014.

Figure 12 .
Figure 12.Comparison of Kharif flow forecasts with 20 and 80 % quantiles of the SRM+G scenario ensembles for the years 2003-2016.

Figure 14 .
Figure 14.Plume diagram of ensemble member traces in 2003.

Figure 15 .
Figure 15.Plume diagram of ensemble member traces in 2008.

Table 1 .
Zone-wise melting start threshold temperatures and time-dependent degree-day factors for the Lower UIB.
* 10-day average temperature in • C in each elevation zone.

Table 2 .
Zone-wise melting start threshold temperatures and time-dependent degree-day factors for the Upper UIB.
* 10-day average temperature in • C in each elevation zone.

Table 3 .
SRM+G model parameters for both the Upper and Lower UIB.

Table 4 .
Percentage depletion of snow cover area for the Upper and Lower UIB during March 2003.

Table 5 .
Coefficient of determination R 2 and relative volume difference D v for the Upper and Lower UIB.

Table 7 .
Comparison of forecast skills between IRSA, UBC, and SRM+G.