Defining high-flow seasons using temporal streamflow patterns from a global model

Globally, flood catastrophes lead all natural hazards in terms of impacts on society, causing billions of dollars of damages annually. Here, a novel approach to defining high-flow seasons (3-month) globally is presented by identifying temporal patterns of streamflow. The main high-flow season is identified using a volume-based threshold technique and the PCR-GLOBWB model. In comparison with observations, 40 % (50 %) of locations at a station (subbasin) scale have identical peak months and 81 % (89 %) are within 1 month, indicating fair agreement between modeled and observed high-flow seasons. Minor high-flow seasons are also defined for bi-modal flow regimes. Identified major and minor high-flow seasons together are found to well represent actual flood records from the Dartmouth Flood Observatory, further substantiating the model’s ability to reproduce the appropriate high-flow season. These high-spatial-resolution high-flow seasons and associated performance metrics allow for an improved understanding of temporal characterization of streamflow and flood potential, causation, and management. This is especially attractive for regions with limited observations and/or little capacity to develop early warning flood systems.


Introduction
Flood disasters rank as one of the most destructive natural hazards in terms of economic damage, causing billions of dollars of damage each year (Munich Re, 2012). These flood damages have risen starkly over the past half-century given the rapid increase in global exposure (Bouwer, 2011; UNISDR, 2011; Visser et al., 2014). To specifically address flood disasters from a global perspective, understanding of global-scale flood processes and streamflow variability is important (Dettinger and Diaz, 2000; Ward et al., 2014). In recent decades, studies have investigated global-scale streamflow characteristics using observed streamflow from around the world (Beck et al., 2013; McMahon, 1992; McMahon et al., 2007; Peel et al., 2001, 2004; Poff et al., 2006; Probst and Tardy, 1987) and modeled streamflow from global hydrological models (Beck et al., 2015; van Dijk et al., 2013; McCabe and Wolock, 2008; Milly et al., 2005; Ward et al., 2013, 2014) to investigate ungauged and poorly gauged basins (Fekete and Vörösmarty, 2007). Despite this broad attention to annual streamflow and its connections to global climate processes and precursors, there has been relatively little attention paid to the intra-annual timing of streamflow, emphasizing the need for analysis of seasonal streamflow patterns to further improve understanding of large-scale hydrology and atmospheric behaviors in the main (flood) streamflow season globally (Dettinger and Diaz, 2000). Moreover, better assessment of streamflow timing and seasonality is important for addressing frequency and trend analyses, flood protection and preparedness, climate-related changes, and other hydrological applications that possess important sub-annual characteristics (Burn and Arnell, 1993; Burn and Hag Elnur, 2002; Cunderlik and Ouarda, 2009; Hodgkins et al., 2003). This motivates further investigation of intra-annual temporal streamflow patterns globally.
Only a small number of studies have investigated global-scale seasonality and temporal patterns of streamflow, with minimal focus on objective streamflow timing. Haines et al. (1988) present one of the first maps providing a global classification. Burn and Arnell (1993) aggregate 200 streamflow stations into 44 similar climatic regions and subsequently combine these into 13 groups using hierarchical clustering based on similarity of the annual maximum flow index, providing spatial and temporal coincidences of flood response. Dettinger and Diaz (2000) aggregate 1345 sites into 10 clusters based on seasonality using climatological fractional monthly flows (CFMFs) to identify peak months and linkages with large-scale climate drivers.
In general, these studies define high streamflow or flood seasons subjectively based on the relationship between dominant streamflow amplitude patterns and large-scale climate drivers/patterns, and delineate large-scale homogeneous regions correspondingly. Defining high-flow season timing is essentially a by-product of these analyses, and may be problematic due to varying seasonal patterns (e.g., bi-modal distribution, constant or low-flow areas, etc.) not captured at the large-scale delineation. There is also typically no distinction between major and minor high-flow seasons. In some cases, these minor seasons (e.g., resulting from bi-modal precipitation distribution) can produce high-flow or flood conditions, and are thus of interest to identify. Here we identify high-flow seasons by capturing annual peak timing using a volumetric technique at the cell and sub-basin scale, presenting an approach focused on streamflow temporal patterns rather than patterns of amplitude. The new measure of peak month (PM) and high-flow season (HS) coupled with the model grid scale provides much higher-resolution peak timings globally than previously presented (often at large basin scale or subcontinental scale). The performance measure introduced here, the percentage of annual maximum flow (PAMF), is also a new contribution, quantifying the model's ability to capture high-flow season timing. These advantages are also helpful for identifying less-dominant but important seasons (minor high-flow seasons) that possess similar characteristics to the high-flow season (e.g., a bi-modal annual cycle), another unique contribution of this work. This leads to better temporal characterization and understanding of flood potential, causation, and management, particularly in ungauged or limited-gauged basins.

Streamflow stations
Daily streamflow observations utilized in this study are from the Global Runoff Data Centre (GRDC, 2007), specifically those stations located along the global hydrology model's drainage network. Since station records that are missing even short periods may affect how a high-flow season is defined, we have excluded years with any daily missing values. In this study, a minimum of 20 hydrological years is required for a station to be retained, leaving 691 stations from all continents except Antarctica, with upstream basin areas ranging from 9539 to 4 680 000 km² and periods of record between 20 and 43 years across 1958-2000 (Fig. 1). Although this criterion is admittedly quite strict (no missing 20-year daily data), including stations with missing records does not add a significant number. These stations are mostly located on large rivers; the annual streamflow of 75 % of stations is larger than 100 m³ s⁻¹, 35 % of stations are larger than 500 m³ s⁻¹, 20 % of stations are larger than 1000 m³ s⁻¹, and 5 % of stations are larger than 5000 m³ s⁻¹.
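The record-selection rule described above can be sketched as follows. This is only an illustrative sketch: the data layout (a mapping from hydrological year to a list of daily flows, with `None` marking a missing day) and the function names `complete_years` and `retain_station` are assumptions made for the example, not the GRDC file format or the authors' code.

```python
from typing import Dict, List, Optional

MIN_YEARS = 20  # minimum number of complete hydrological years (from the text)


def complete_years(daily: Dict[int, List[Optional[float]]]) -> Dict[int, List[Optional[float]]]:
    """Keep only years that have a full set of daily values and no gaps.

    Assumption: a year is complete if it holds at least 365 values and
    none of them is None.
    """
    return {
        year: flows
        for year, flows in daily.items()
        if len(flows) >= 365 and None not in flows
    }


def retain_station(daily: Dict[int, List[Optional[float]]]) -> bool:
    """A station is retained if at least MIN_YEARS complete years remain."""
    return len(complete_years(daily)) >= MIN_YEARS
```

A station with 21 years of data, one of which has a single missing day, would still be retained under this rule, since 20 complete years remain.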

PCR-GLOBWB
In this study, we evaluate simulations of daily streamflow over the period 1958-2000 taken from Ward et al. (2013), produced with the PCR-GLOBWB global hydrological model (Van Beek and Bierkens, 2009; Van Beek et al., 2011). Although the PCR-GLOBWB model is not calibrated, and simulations may contain biases and uncertainty at coarse spatial resolution, the long time series of streamflow provided globally has been deemed sufficient to estimate long-term flow characteristics with spatial consistency (Winsemius et al., 2013). Additionally, this model has been validated in previous studies in terms of streamflow (Van Beek et al., 2011) and terrestrial water storage (Wada et al., 2011) at stations along major rivers in the world. The model's extreme discharges are also evaluated by Ward et al. (2013) with fair to good performance at stations with large drainage area (≥ 125 000 km²), corresponding to 24 % of GRDC stations used in this study, excepting overestimation in several arid regions. Note that for the simulations used in this study, the maximum storage within the river channel is based on geomorphological laws that do not account for existing flood protection measures such as dikes and levees.
For the simulations used in this study, the PCR-GLOBWB model was forced with daily meteorological data from the WATCH (Water and Global Change) project (Weedon et al., 2011), namely precipitation, temperature, and global radiation data. These data are available at the same resolution as the hydrological model (0.5° × 0.5°). The WATCH forcing data were originally derived from the ERA-40 reanalysis product (Uppala et al., 2005), and were subjected to a number of corrections including elevation, precipitation gauges, timescale adjustments of daily values to reflect monthly observations, and varying atmospheric aerosol loading. It is possible that this may have some minor effect on streamflow simulation, likely providing more realistic outcomes. Full details of corrections are described in Weedon et al. (2011).

Defining high-flow seasons
To identify spatial and temporal patterns of dominant streamflow uniformly, we design a fixed time window for representing high-flow seasons globally. Here we define major high-flow seasons as the 3-month period most likely to contain dominant streamflow and the annual maximum flow. The central month is referred to as the peak month (PM) and the full 3-month period is referred to as the high-flow season (HS). Specifically, we define PM first, and then define HS as the period also containing the month before and after the PM. This approach is performed for both observed (station) and simulated (model) streamflow to gauge performance.

Methodology for defining grid-cell-scale high-flow seasons
In the last few decades, a number of studies have investigated the timing of peak flows in the context of analyzing flood seasonality, frequency and trends. Generally, two main properties are emphasized regarding flood timing: peak volume and peak timing. Considering peak volume, the occurrence dates are commonly recorded for a fixed time period or specific amount of peak volume, often in the context of trend analysis. For example, Hodgkins and Dudley (2006) use winter-spring center of volume (WSCV) dates to analyze trends in snowmelt-induced floods, and Burn (2008) uses percentiles of annual streamflow volume dates as indicators of flood timing, also for trend analysis. For peak timing, two sampling methods are frequently applied in hydrology. The first and most common is the annual-maximum (AM) method, which samples the largest streamflow in each year. The second method is the peaks-over-threshold (POT) method (Smith, 1984, 1987; Todorovic and Zelenhasic, 1970), in which all distinct, independent dominant peak flows greater than a fixed threshold are counted. In contrast to the AM method, POT can capture multiple large independent floods within a single year, including the annual maximum flow, but may not capture the annual maximum flow in years in which streamflow is less than the pre-defined threshold; this threshold can either be defined based on a specific average number of floods or a specific mean exceedance level over the entire period (Cunderlik et al., 2004a; Institute of Hydrology, 1999; Lang et al., 1999). The PM selected, therefore, is dependent on the peak properties (volume, timing) considered.
For a local study, selecting the PM can be based on well-defined climatic or hydrologic characteristics (e.g., rainy season, snowmelt, etc.); however, no single global method can be uniformly applied to define the PM everywhere. Thus, to define the HS, and specifically the PM, globally, both peak volume and peak timing aspects need to be considered (Javelle et al., 2003). To do this, we adopt a volume-based threshold (VBT) technique. This technique is similar to a streamflow volume-based technique in terms of capturing the days (Julian dates) when streamflow exceeds the predefined threshold (percentile of flows) and associated volume (Burn, 2008). The major difference, however, is that the VBT applies the threshold over the entire time series (available record) concurrently instead of on a year-by-year basis. In other words, for the 95th percentile, instead of annually calculating the 95th percentile, it is calculated using the entire period of record. The common volume-based technique thus records events every year surpassing the threshold; however, for the VBT approach, every year need not have a peak above the threshold. This approach emphasizes capturing the key peaks across the entire available time series (as in a peaks-over-threshold approach). VBT thus contains both volume and timing characteristics for defining the peak month (PM). Here, the month containing the greatest number of occurrences over the specified percentage of flows across all years is defined as the PM, and subsequently the HS is designated as the period containing the PM plus the month before and after the PM.
The number of days surpassing the 5 % threshold is listed for each month. In this example, August has the largest number of days over the threshold (105 days); thus, August is defined as PM and July-September is defined as HS.
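The VBT procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `peak_month_vbt`, the handling of ties (earliest month wins), and the synthetic two-year record are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def peak_month_vbt(dates, flows, exceedance=0.05):
    """Peak month (PM) and 3-month high-flow season (HS) from a record-wide threshold.

    dates, flows: parallel sequences of daily dates and daily streamflow.
    exceedance:   fraction of all days in the record allowed above the threshold.
    """
    ranked = sorted(flows, reverse=True)
    n_top = max(1, int(len(ranked) * exceedance))
    threshold = ranked[n_top - 1]  # one threshold for the whole record, not per year

    # Count days at or above the threshold in each calendar month, over all years.
    days_over = Counter(d.month for d, q in zip(dates, flows) if q >= threshold)

    # Ties between months go to the earliest month (an arbitrary choice here).
    pm = max(range(1, 13), key=lambda m: days_over[m])
    # HS = month before, PM, month after, wrapping December <-> January.
    hs = tuple((pm - 1 + k) % 12 + 1 for k in (-1, 0, 1))
    return pm, hs, days_over


# Synthetic two-year daily record (1999-2000) with elevated flow every August.
start = date(1999, 1, 1)
dates = [start + timedelta(days=k) for k in range(731)]
flows = [100.0 if d.month == 8 else 10.0 for d in dates]
pm, hs, days_over = peak_month_vbt(dates, flows)
```

Because the threshold is taken over the whole record, a year whose largest flow stays below it contributes no days to the count, which is the behavior that distinguishes VBT from the year-by-year volume-based technique.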
To evaluate the defined HS objectively, by evaluating the number of annual maximum flows captured, we develop a simple evaluating statistic called the percentage of annual maximum flow (PAMF). PAMF is computed as shown in Eq. (1):

$$\mathrm{PAMF}(i) = \frac{\sum_{j=i-1}^{i+1} \mathrm{nAMF}(j)}{\sum_{j=1}^{12} \mathrm{nAMF}(j)} \times 100\,\%, \quad (1)$$

where nAMF(i) denotes the number of annual maximum flows that occur in month i across the full record. In Eq. (1), when i is 1 (January), i − 1 in the summation is 12 (December), and when i is 12 (December), i + 1 is 1 (January). Here the PAMF provides the percentage of annual maximum flows occurring in the defined HS across the evaluation period. The PAMF is relatively simple, yet provides a clear indication of how well the PM selected represents the occurrence of annual peaks across the time series. For example, a high PAMF indicates that the HS is highly likely to contain the annual maximum flood each year. In contrast, a low PAMF indicates that the timing of the annual maximum flow is more likely to vary temporally, and may be a result of bi-modal seasonality, consistently high or low streamflow throughout the year, streamflow regulated by infrastructure or natural variation. In this study, we subjectively classify HS PAMF values as high (80-100 %), moderate (60-80 %), low (40-60 %) and poor (0-40 %). The PAMF is calculated for both the observed streamflow at the 691 selected GRDC stations and the simulated streamflow at the associated 691 grid locations.
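As an illustration of the PAMF statistic, the sketch below computes it for every candidate month from a list of the calendar months in which each year's annual maximum occurred. The function names and the sample record are hypothetical. The class boundaries in the text overlap at 40, 60 and 80 %, so treating each lower bound as inclusive is an assumption of this sketch.

```python
def monthly_pamf(am_months):
    """PAMF (%) for each candidate peak month i = 1..12.

    am_months: one entry per year, the month (1-12) of that year's annual maximum.
    """
    n_amf = [0] * 13  # index 0 unused; n_amf[m] = annual maxima falling in month m
    for m in am_months:
        n_amf[m] += 1
    total = len(am_months)

    pamf = {}
    for i in range(1, 13):
        before = 12 if i == 1 else i - 1  # January wraps back to December
        after = 1 if i == 12 else i + 1   # December wraps forward to January
        pamf[i] = 100.0 * (n_amf[before] + n_amf[i] + n_amf[after]) / total
    return pamf


def classify_pamf(value):
    """Classes from the text; lower bounds taken as inclusive (assumption)."""
    if value >= 80.0:
        return "high"
    if value >= 60.0:
        return "moderate"
    if value >= 40.0:
        return "low"
    return "poor"


# Hypothetical 10-year record: six maxima in August, two in July,
# one in September and one in March.
am_months = [8] * 6 + [7] * 2 + [9] + [3]
pamf = monthly_pamf(am_months)
```

For this record, an HS centered on August contains 9 of the 10 annual maxima, so its PAMF falls in the high class.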
The VBT technique is compared with the common volume-based technique and POT technique to gauge performance. Four volume-based durations, namely V01 %, V03 %, V05 % and V10 %, and three POT techniques averaging 1, 2, and 3 peaks per year (POT1, POT2 and POT3, respectively), are selected. For the V01 % technique, the HS is simply centered on the PM containing the largest number of occurrences of the top 1 % of annual streamflow volume across the total years available. The V03 %, V05 % and V10 % techniques are similar to the V01 % approach, respectively using 3, 5 and 10 % of annual streamflow volume. Comparatively, techniques with a shorter time component (1-3 % of annual volume) favor identifying the PM by peak timing, since the top 1-4 days of streamflow tend to be located near the peak, while techniques with longer time components (5-10 % of annual volume) favor identifying the PM based on duration and peak volume, since the top 19-33 days of streamflow tend to be located near the volumetric centroid of the hydrograph, rather than the peak, if they differ. The VBT technique is an attempt to bridge these two criteria. For the POT techniques, independence criteria are applied to avoid counting multiple peaks from the same event (Institute of Hydrology, 1999). For example, two peaks must be separated by at least 3 times the average rising time to peak, and minimum flow between two peaks must be less than two-thirds of the higher one of the two peaks. More details of independence criteria are described in Lang et al. (1999).
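The two example independence criteria quoted above can be written as a small predicate. This is a sketch of those two conditions only, not the full procedure of Institute of Hydrology (1999) or Lang et al. (1999); the function name, argument names and the use of days as the time unit are assumptions.

```python
def peaks_independent(t1, q1, t2, q2, q_min_between, mean_rise_time):
    """Check the two example POT independence criteria for a pair of peaks.

    t1, t2:          times of the two peaks (e.g., day numbers)
    q1, q2:          peak flows
    q_min_between:   minimum flow recorded between the two peaks
    mean_rise_time:  average rising time to peak, in the same unit as t1 and t2
    """
    # Criterion 1: separation of at least 3 times the average rising time.
    separated = abs(t2 - t1) >= 3 * mean_rise_time
    # Criterion 2: the trough must drop below two-thirds of the higher peak.
    trough_low_enough = q_min_between < (2.0 / 3.0) * max(q1, q2)
    return separated and trough_low_enough
```

Peaks that fail either test would be treated as part of the same flood event, with only the larger one counted.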
An analysis examining sensitivity of selected threshold levels to the VBT technique is also undertaken. Performances of thresholds representing 1, 3, 5 and 10 % exceedance across the entire period of record, named VBT1 %, VBT3 %, VBT5 % and VBT10 %, respectively, are compared.
To compare techniques and thresholds, the PMs are defined at the 691 selected stations and associated model grids. The locations where the PMs differ (by at least one technique) are of most interest. This occurs at 61 % of stations and 54 % of associated grids. Cross-correlations of PM between the four common volume-based techniques clearly indicate the tendency of the defined PM to shift from peak timing dominated to peak volume dominated as the time component increases (Table 1). Correlations between VBT techniques and volume-based techniques are quite similar and consistent (0.82-0.86 and 0.84-0.86 for observed and simulated streamflow, respectively, using VBT5 %; Table 1), preliminarily indicating some success in capturing both timing and volume properties, while correlations between the VBT techniques and POT are less strong (0.78-0.81 and 0.79-0.83 for observed and simulated streamflow, respectively, using VBT5 %; Table 1). The PAMF is also useful for comparing techniques, such that the technique having the highest average PAMF typically contains more annual maximum flow events in its defined HSs. The VBT5 % is superior to other VBT and POT techniques for both observed and modeled streamflow, having the highest PAMF values; however, the volume-based techniques indicate similar or even slightly better performance than VBT5 % (Table 2). This is not unexpected as the volume-based techniques are designed to capture annual peak flows on a year-by-year basis, whereas the POT and VBT record significant peaks across the full time series, and may not capture annual peaks in some years in which that peak is small relative to all peaks throughout the available record. Thus VBT tends to select PMs that contain the most significant peaks overall, and subsequently have the highest potential for capturing probable flood seasons for flood-prone basins, a desirable outcome for this study. To illustrate this in the context of the PAMF, if all years are ranked for each location based on the annual peak flow, and the top 50 % (half) are retained, the PAMF actually favors the VBT approach, surpassing the volume-based approach by 5-6 % for PMs and 2-3 % for HSs. Finally, techniques may be evaluated by comparing the temporal difference (number of months) between model-based and observed PMs; closer is clearly superior. The VBT3 % and VBT5 % techniques produce the greatest degree of similarity between model-based and observed PMs (81 % of stations having ±1 month difference; Table 3). Overall, the VBT technique demonstrates superior performance as compared with the POT techniques by all comparisons. The VBT technique is also on par with or slightly superior to the common volume-based technique, especially considering the 5 % threshold; thus, the remainder of the analysis is carried out utilizing the VBT5 % technique only.

Methodology for defining sub-basin-scale high-flow seasons
In addition to evaluating the HS at the 691 grid cells based on model outputs, the PM and HS can also be defined at the sub-basin scale globally where observations are present. Previous studies have investigated flood seasonality as it relates to basin characteristics; for example, basins are delineated/regionalized and grouped according to similarity/dissimilarity of streamflow seasonality (Burn, 1997; Cunderlik et al., 2004a), or conversely, flood seasonality is occasionally used to assess the hydrological homogeneity of a group of regions (Cunderlik and Burn, 2002; Cunderlik et al., 2004b); thus, evaluating at the sub-basin scale is warranted.
While defining a single PM for a large-scale basin may be convenient, it may be difficult to justify given the potentially long travel times and varying climate, topography, vegetation, etc. Additionally, infrastructure may be present to regulate flow for flood control, water supply, irrigation, recreation, navigation, and hydropower (WCD, 2000), causing managed and natural flow regimes to differ drastically. This becomes important, as globally more than 33 000 records of large dams and reservoirs are listed (ICOLD, 2009), with geo-referencing available for 6862 of them (Lehner et al., 2011). Nearly 50 % of large rivers with average streamflow in excess of 1000 m³ s⁻¹ are significantly modulated by dams (Lehner et al., 2011), often significantly attenuating flow hydrographs and flood volumes (20 % of GRDC stations fall into this category). The PAMF, as previously defined, can aid in identifying stations affected by upstream reservoirs through low PAMF values. This is applied with the assumption that reservoir flood control disperses the annual maximum flows across months rather than concentrating them within a few months (e.g., akin to natural flow). In this study, we used the global sub-basins from the 30′ global drainage direction map (DDM30) data set (Döll and Lehner, 2002) with separation of large basins (Ward et al., 2014).
To define a sub-basin's PM, the maximum PAMF and associated PM for each station within the sub-basin are considered according to the following:
- if multiple stations exist within the sub-basin, the PM is defined as the PM occurring for the largest number of stations;
- if there is a tie between months, their average PAMF values are compared, and the month having the higher average PAMF is defined as the PM;
- if there is a tie between months and equivalent average PAMF values, the month having the higher average annual streamflow is defined as the PM.
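The three selection rules above amount to a lexicographic ranking of candidate months, which can be sketched as follows. The function name, the tuple layout and the station values are hypothetical. Averaging the annual streamflow over the stations that share a PM is an assumption of this sketch about how the third rule is applied.

```python
from collections import defaultdict


def subbasin_peak_month(stations):
    """Select a sub-basin PM from station-level results.

    stations: list of (pm, pamf, mean_annual_flow) tuples, one per station
              (or per grid cell when applied to model output).
    """
    groups = defaultdict(list)
    for pm, pamf, flow in stations:
        groups[pm].append((pamf, flow))

    def rank(pm):
        members = groups[pm]
        n = len(members)
        mean_pamf = sum(p for p, _ in members) / n
        mean_flow = sum(f for _, f in members) / n
        # Rule 1: most stations; rule 2: higher mean PAMF; rule 3: higher mean flow.
        return (n, mean_pamf, mean_flow)

    return max(groups, key=rank)


# Hypothetical sub-basin: two stations select February, two select March.
stations = [(2, 97.0, 300.0), (2, 95.0, 100.0), (3, 92.0, 500.0), (3, 90.0, 200.0)]
selected = subbasin_peak_month(stations)
```

With these made-up values the station count is tied, so the higher mean PAMF (96 % for February against 91 % for March) decides the outcome.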
The sub-basin's PM is defined based on the occurrence of station or grid-level PMs rather than the PAMF values to diminish the chance of results being skewed by biased simulations or varying climate effects in small parts of the sub-basin. When there are an equal number of occurrences for different PMs, the average PAMF values are used to determine which PM is selected. In this case, the effect of stations downstream of reservoirs will be minimized given their typically low average PAMF values, assuming operational rules relatively evenly distribute the annual flow across all months; however, if operational rules instead concentrate releases to a few months, PAMF values may actually be high. This procedure is applied for both stations (observations) and corresponding grid cells (model) in each sub-basin. To illustrate this, consider the six GRDC stations in the Zambezi River basin (Fig. 3). For most of the stations, the observed PM is defined as a month later than the model-based PM (Table 4), an apparent bias in the model. The PAMF of STA06 observations is noticeably lower than for other stations (36 %; Table 4) given its location downstream of the Itezhi-Tezhi dam (STA05) (Fig. 3). Otherwise, PAMF values are consistently high across all stations. March is the PM identified most often; thus, the final sub-basin PM selected is March. In contrast, the model-based simulated streamflow produces a high PAMF at STA06 (97 %), as the Itezhi-Tezhi dam is not represented in the simulations used for this study, and subsequently does not account for modulated streamflow. Across other stations, the PAMF is also high; however, an equal number of stations select February and March. In this case, February is selected as the final basin PM given its higher average PAMF value (96 % vs. 91 %).
By this approach, all 691 GRDC stations are grouped into 223 sub-basins to define the PM (Fig. 6); 58 % of sub-basins are defined by a single station, only 7.6 % (observations) and 8.1 % (model) of sub-basins have ties when defining PMs, and only one sub-basin has a tie between PMs and average PAMF values.

Verification of selected high-flow seasons
Model-based PMs are verified by comparing with observation-based PMs at station and sub-basin scales. Additionally, historic flood records from the Dartmouth Flood Observatory (DFO) are used to compare basin-level PMs to actual flooded areas spatially and temporally. Specifically, we apply the following information from DFO: start time, end time, duration and geographically estimated area for 3486 flood records across 1985-2008.

Observed versus modeled high-flow seasons
Ideally the model-based and observed GRDC stations have fully or partially overlapping HS periods. If so, this builds confidence in interpreting HSs at locations where no observed data are available. For comparing modeled PMs to observations, the defined PMs and calculated PAMF are represented globally at the station scale (Figs. 4-5) and sub-basin scale (Fig. 6) with temporal differences of PMs (modeled PM − observed PM). In the southeastern United States, GRDC stations express relatively lower PAMF values for observations (40-60 %) than model outputs (60-80 %), due to the high level of managed infrastructure. In the central-southern US and Europe, low PAMF values are computed for both observations and modeled output (Fig. 5) with notable temporal differences (Fig. 4c). For observations, this is attributable, at least in part, to reservoirs and dams along the Mississippi, Missouri and Danube rivers. Additionally, relatively constant streamflow patterns are identified in both observations and modeled output, consistent with previous studies reporting these flow regimes as uniform or perpetually wet (Burn and Arnell, 1993; Dettinger and Diaz, 2000; Haines et al., 1988). Minor high-flow seasons may also play a role. Model biases also affect PM selection; for northwestern North America, PMs for many points are defined on average 1 month earlier than with observations, producing moderate PAMF values (60 % and higher). In northern Europe, especially southern Finland, this becomes much more pronounced, with large differences between PMs from observations and the model, on the order of 4 months (Figs. 4c, 6c, and 8a). In western and northern Australia, PMs are modeled 1 month later on average than observations, except for two occurrences in the west (5-month difference) due to both observed and modeled low-flow conditions. Such low-flow regimes are also apparent in southeastern Australia, causing large differences between PMs (4-5 months).
The differences in PMs between observations and modeled outputs are also compared at the continental scale (Fig. 7). In North America, 38 % of stations and 51 % of sub-basins produce identical PMs, growing to 82 % of stations and 93 % of sub-basins when considering a ±1 month temporal difference (e.g., HS; Fig. 7). In Asia, 65 % of stations and 70 % of sub-basins have identical PMs, growing to 90 % of stations and 92 % of sub-basins with ±1 month temporal difference (Fig. 7). In central Russia, a large difference between PMs (±3 months) is attributable to reservoirs on the Yenisei and Angara rivers and model bias (Fig. 4c). In Africa, 48 % of stations and 60 % of sub-basins produce identical PMs (Fig. 7), 30 % of stations and 27 % of sub-basins are modeled 1 month earlier, and 7.4 % of stations and 6.7 % of sub-basins are modeled 1 month later than observation (Fig. 7). In South America, with only five stations, 40 % have the same month, 40 % are modeled 1 month earlier, and 20 % of stations are modeled 2 months earlier than observations.
Comparing observations and modeled output globally, 40 % of the locations share the same PM. The model's bias is one of the main reasons for this moderate performance; other important contributors include minor high-flow seasons, perpetually wet or dry regions, and anthropogenic effects such as reservoir regulation. Considering a difference of ±1 month, the share of matching locations rises to 81 %, and to 91 % for ±2 months (Fig. 7). From a sub-basin perspective, the similarities are even stronger (50 % identical PM, 88 % ±1 month and 92 % ±2 month), indicating a relatively high level of agreement. For locations having dissimilar PMs (≥ ±3 months, 9 % of locations and 8 % of sub-basins), a substantial number are located downstream of reservoirs directly, such as STA06 in the Zambezi example (Table 4), or are low-flow (dry) or constant-flow locations, both producing exceedingly low PAMF values. Differences in PMs are not unexpected for low-flow and constant-flow locations, given the propensity of the annual streamflow maximum to potentially occur in a wide number of months. Overall, however, as more than 80 % of both stations and sub-basins have similar PMs, the model performs appropriately well in defining high-flow seasons globally at locations where observations are available. This may be subsequently extended to defining PMs and PAMF at all grid cells (Fig. 8). Generally, low and poor PAMF values (0-60 %) indicate a naturally unstable annual maximum flow (no clear high-flow season), which occurs in cases of constant flow, low flow, bi-modal flow and regulated flow. All cases, except regulated flow, are simulated within the PCR-GLOBWB simulations used; thus, the cell-based PAMF values (Fig. 8b) can provide a sense of confidence for the defined PM (Fig. 8a). Examples of low-flow regions include the central United States and Australia, having low PAMF regional values (Fig. 8b). Bi-modal regions, such as much of eastern Africa and southern South America with their two rainy seasons, and constant-flow regions, such as Europe, also indicate low PAMF values (Fig. 8b). These flow regimes are further investigated as minor HS in Sect. 5.
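The agreement percentages reported above rest on a month difference between modeled and observed PMs. One way to compute it is sketched below. Wrapping the difference around the calendar (so that a modeled January against an observed December counts as 1 month later, not 11 months earlier) is an assumption of this sketch, as are the function names and the sample pairs.

```python
def month_difference(modeled_pm, observed_pm):
    """Signed difference (modeled - observed) in months, wrapped to -5..+6."""
    return (modeled_pm - observed_pm + 5) % 12 - 5


def agreement_share(pairs, tolerance):
    """Percentage of (modeled, observed) PM pairs within +/- tolerance months."""
    hits = sum(1 for m, o in pairs if abs(month_difference(m, o)) <= tolerance)
    return 100.0 * hits / len(pairs)


# Hypothetical (modeled PM, observed PM) pairs for four locations.
pairs = [(3, 3), (2, 3), (1, 12), (7, 3)]
identical = agreement_share(pairs, 0)
within_one = agreement_share(pairs, 1)
```

A negative difference means the model peaks earlier than the observations, matching the sign convention (modeled PM minus observed PM) used in the text.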

Modeled high-flow seasons versus actual flood records
Model-based PMs may also be verified (subjectively) by surveying historic flood records. One such source is the Dartmouth Flood Observatory (DFO), a large, publicly accessible repository of major flood events globally over 1985-2008, based on media and governmental reports and instrumental and remote-sensing sources. Delineations of affected areas are the best estimates (Brakenridge, 2011). The DFO records provide the start time, end time and duration of each flooding event, as defined by the report or source, and represented as the occurrence (start) month (Fig. 9). DFO flood events and grid-cell-based PMs (Fig. 8a) may be compared outright; however, their characteristics differ slightly. The DFO covers 1985-2008, while the model represents 1958-2000. Also, the model-based PM represents the month most likely for a flood to occur; the DFO is simply a reporting of when the event did occur, regardless of whether it fell in the expected high-flow season or not. Nevertheless, model-based PMs and historic flood records illustrate similarity (compare Figs. 8a and 9), particularly when both the major and minor high-flow seasons are considered, further indicating merit in the ability of the proposed approach to identify the PM. Consistently, regions with high model-based PAMF (80-100 %), such as eastern South America, central Africa and central Asia, tend to agree well with DFO records, while low or poor PAMF (0-60 %) regions, such as central North America, Europe, and eastern Africa, tend not to be in agreement with DFO records. In these low PAMF regions, however, DFO records also illustrate floods occurring sporadically throughout the year, further supporting accordance between cell-based PAMF and DFO records (Figs. 8b and 9).

Defining minor high-flow seasons
In some climatic regions, there is no one single, well-defined flood season. For example, eastern Africa has two rainy seasons, the major season from June to September and the minor season from January to April/May. These two seasons are induced by northward and southward shifts of the Inter-tropical Convergence Zone (ITCZ) (Seleshi and Zanke, 2004). This bi-modal eastern African pattern allows for potential flooding in either season. In Canada, as another example, the dominant spring snowmelt season (March-May) and fall rainy season (August-October) allow for flood occurrences in either period (Cunderlik and Ouarda, 2009).
Previous studies have investigated techniques to differentiate seasonality from uni-, bi- and multi-modal streamflow climatologies and evaluate trends in the timing and magnitude of streamflow, including the POT method, directional statistics method, and relative flood frequency method (Cunderlik and Ouarda, 2009; Cunderlik et al., 2004a). These methods may perform well at the local (case-specific) scale to define minor high-flow seasons; however, applying them uniformly at the global scale can be problematic, given spatial heterogeneity. Additionally, even though bi-modal streamflow climatology may be detected, the magnitude of streamflow in the minor season may or may not be negligible with regard to flooding potential as compared with the major season.
To detect noteworthy minor high-flow seasons globally, we classify streamflow regimes by climatology and monthly PAMF value, calculated using Eq. (1) at each month (Fig. 10). Classifications include uni-modal, bi-modal, constant, and low-flow. The uni-modal streamflow climatology has high values of PAMF around the PM; the bi-modal classification is represented by two peaks of PAMF (and may therefore contain a minor season); both the constant and low-flow classifications have low values of PAMF across months. Distinguishing between bi-modal and other classifications is nontrivial. For example, on initial inspection the constant streamflow classification (both climatology and monthly PAMF, Fig. 10c) could be mistaken for a nondominant bi-modal distribution. We adopt the following criteria to differentiate bi-modal streamflow from uni-modal, constant, and low-flow conditions.
- The low-flow classification is defined for annual average streamflow less than 1 m³ s⁻¹.
- The major and minor PMs must be separated by at least 2 months in order to prevent an overlap of the two HSs (3 months each).
- If there is a peak in the monthly PAMF values outside the major HS, it is regarded as a potential minor PM. The potential minor PM is confirmed as a minor PM if the sum of the major and potential minor PMs' PAMF is greater than 60 % (a minimum of 29 out of 43 annual maximums fall into one of the HSs) and the major PM's PAMF does not exceed 80 %.
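The criteria above can be sketched in code. The following is a minimal illustration, not the implementation used in this study. It assumes that the 12 monthly PAMF values (in percent, index 0 = January) have already been computed with Eq. (1), and that "separated by at least 2 months" means a circular distance of at least 3 months between the two PMs, so that the two 3-month HSs cannot share a month. The function names, the local-peak test, and the "uni-modal or constant" label are hypothetical choices made for illustration only.

```python
def circular_month_distance(a, b):
    """Distance in months between two month indices, wrapping December to January."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def classify_regime(monthly_pamf, annual_mean_flow,
                    low_flow_limit=1.0, joint_min=60.0, major_max=80.0):
    """Classify a streamflow regime from 12 monthly PAMF values (percent).

    Assumptions (not from the paper): index 0 is January, and a potential
    minor PM is a local peak of PAMF at least 3 months away from the major PM.
    """
    if len(monthly_pamf) != 12:
        raise ValueError("expected 12 monthly PAMF values")
    result = {"regime": None, "major_pm": None, "minor_pm": None, "joint_pamf": None}

    # Criterion 1: low-flow if annual average streamflow is below 1 m3/s.
    if annual_mean_flow < low_flow_limit:
        result["regime"] = "low-flow"
        return result

    # The major PM is taken as the month with the highest PAMF.
    major = max(range(12), key=lambda m: monthly_pamf[m])
    result["major_pm"] = major

    # No minor HS is defined when the major PM's PAMF exceeds 80 %.
    if monthly_pamf[major] > major_max:
        result["regime"] = "uni-modal"
        return result

    # Criteria 2 and 3: local PAMF peaks whose HS cannot overlap the major HS.
    candidates = [
        m for m in range(12)
        if circular_month_distance(m, major) >= 3
        and monthly_pamf[m] > monthly_pamf[(m - 1) % 12]
        and monthly_pamf[m] >= monthly_pamf[(m + 1) % 12]
    ]
    if candidates:
        minor = max(candidates, key=lambda m: monthly_pamf[m])
        joint = monthly_pamf[major] + monthly_pamf[minor]
        if joint > joint_min:  # joint PAMF must be greater than 60 %
            result.update(regime="bi-modal", minor_pm=minor, joint_pamf=joint)
            return result

    # The listed criteria alone do not separate these two classes.
    result["regime"] = "uni-modal or constant"
    return result
```

For a hypothetical cell with a major peak in August and a secondary peak in April, `classify_regime([10, 15, 20, 30, 20, 25, 40, 55, 40, 20, 10, 5], 120.0)` would return a bi-modal regime with a joint PAMF of 85 %.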
A potential minor PM is identified by a secondary peak in the monthly PAMF rather than by the magnitude or shape of streamflow. A minor HS is not defined when the major PM's PAMF is greater than 80 % (minimum of 35 out of 43 annual maximums), indicating a robust uni-modal streamflow character (Fig. 10a). The sum of the major and minor PMs' PAMF (joint PAMF) is used to determine the likelihood that one of the HSs contains the annual maximum flow: a high joint PAMF (80-100 %) indicates strong likelihood (Fig. 10b); moderate values (60-80 %) imply moderate likelihood, with some probability of being classified as constant streamflow (Fig. 10c); and low values (40-60 %) are likely constant or low streamflow (Fig. 10d). Minor HSs are defined in the same way as major HSs, containing the minor PM and the months before and after.
Minor HSs are evident in the tropics and sub-tropics and are spatially consistent with the bi-modal rainfall regimes identified by Wang (1994) (Fig. 11). Examples include eastern Africa (second rainy season in winter) and Canada (rainfall-dominated runoff in fall), both having high joint PAMF values (80-100 %). Additional examples include the major HS (NDJ) and minor HS (MAM) in central Africa, consistent with the latitudinal movement of the ITCZ; the intra-Americas' major HS (ASON) and minor HS (AMJJ) (Chen and Taylor, 2002); and the minor HS (SOND) in the coastal regions of British Columbia in Canada and in southern Alaska, due to wintertime migration of the Aleutian low from the central North Pacific (Fig. 11). Distinct runoff processes controlled by different climate and hydrology systems can induce a bi-modal peak within a large-scale basin, such as the upstream sections of the Yenisey and Lena river systems in Russia, where the major HS (AMJ) is dominated by snowmelt and the minor HS (JAS) is spurred on by the Asian monsoon. The same mechanism produces minor HSs around the extents of the Asian summer monsoon (90-100 % of the sum of PAMFs) (Figs. 8b and 11).
Moderate minor HSs include, for example, the southern United States' (Texas and Oklahoma) bi-modal rainfall pattern (AMJ and SON) and the southwestern United States (Arizona), where the summertime major HS (JJA) is produced by the North American monsoon and the wintertime minor HS (DJF) is affected by the regional large-scale low-pressure system (Woodhouse, 1997). Southeastern Brazil's summertime major HS (NDJF) and post-summer minor HS (AMJ) are dominated by formation and migration of the South Atlantic Convergence Zone (Herdies, 2002; Lima and Satyamurty, 2010). In central and eastern Europe, the major HS (FMAM) and minor HS (JJA) are defined as moderate (60-80 % of joint PAMF values for central Europe and 70-90 % for eastern Europe), indicating that a minor HS is not overly pronounced; for northeastern Europe the major HS (MAM) and minor HS (NDJ) contain high joint PAMF values (80-100 %).
Major and minor HSs with joint PAMF values exceeding 60 % are compared with monthly DFO flood records (Fig. 12); flood records spanning more than 1 month are counted in each month based on the reported duration. Although one distinct flood event may dominate a monthly DFO record, strong similarity is evident between the HSs and monthly flood records (Fig. 12). Minor HSs with high PAMF values that correspond well to observed DFO flood records include eastern Africa (bi-modal streamflow), the intra-Americas, and northern Asia; only a few reported flood records occur in the minor HSs at high latitudes.
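The monthly counting of multi-month flood records can be illustrated with a short sketch. It assumes that each record is reduced to a hypothetical (start date, end date) pair and that an event is counted once in every calendar month it touches; the actual DFO record format and the exact counting rule applied in this study may differ.

```python
from collections import Counter
from datetime import date


def months_spanned(start, end):
    """Yield the month number (1-12) of each calendar month touched by an event."""
    if end < start:
        raise ValueError("event ends before it starts")
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield month
        month += 1
        if month > 12:  # roll over from December to January of the next year
            month = 1
            year += 1


def monthly_flood_counts(events):
    """Return a 12-element list (January first) of accumulated flood counts.

    `events` is an iterable of (start, end) date pairs; this input layout is
    an assumption made for illustration.
    """
    counts = Counter()
    for start, end in events:
        counts.update(months_spanned(start, end))
    return [counts.get(m, 0) for m in range(1, 13)]
```

Under this rule, a flood reported from 20 December 1998 to 3 February 1999 adds one count each to December, January and February, whereas a flood confined to July adds one count to July only.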

Conclusions and discussion
In this study, a novel approach to defining high-flow seasons globally is presented by objectively identifying temporal patterns of streamflow. Simulations of daily streamflow from the PCR-GLOBWB model are evaluated to define the dominant and minor high-flow seasons globally. In order to consider both peak volume and peak timing, a volume-based threshold technique is applied to define the high-flow season, which is subsequently evaluated by the PAMF. To verify model-defined high-flow seasons, we compare them with observations at both station and sub-basin scales. As a result, 40 % of stations and 50 % of sub-basins have identical peak months, and 81 % of stations and 89 % of sub-basins are within 1 month. Considering anthropogenic effects and bi-modal or perpetually wet/dry flow regions, these results indicate fair agreement between modeled and observed high-flow seasons. Regions expressing bi-modal streamflow climatology are also defined to illustrate the potential for noteworthy secondary (minor) high-flow seasons. Model-defined major and minor high-flow seasons are additionally found to represent actual flood records from the Dartmouth Flood Observatory, further substantiating the model's ability to reproduce the appropriate high-flow season.
Large-scale temporal phenomena associated with the defined major and minor high-flow seasons are also identified. For example, global monsoon systems are clearly evident, as driven by the ITCZ, in central and eastern Africa, Asia and northern South America (Fig. 8). Latitudinal patterns in the extra-tropics are also quite distinct, with high-flow seasons often occurring across similar months in the year. These broad temporal patterns are consistent with previous findings (e.g., Burn and Arnell, 1993; Dettinger and Diaz, 2000; Haines et al., 1988); however, this analysis goes further by not being constrained to large-scale patterns for seasonal definition (via clustering) and by also providing a sense of the reliability of the defined high-flow seasons. Specifically, the defined PM (Fig. 8a) extends the peak months of Dettinger and Diaz (2000) by focusing on basin- and grid-scale streamflow volumes and by providing likelihood-type maps using the PAMF metric developed here (e.g., Fig. 8b) to represent the reliability of the defined PM. This can provide a clear sense of whether the identified high-flow season is pronounced or vague. The identification of minor high-flow seasons and the deciphering of bi-modal from constant streamflow regimes is another notable contribution of this study; minor seasons have not been well identified in previous studies. These identified high-flow seasons are also consistent with DFO flood records both spatially and temporally, further substantiating their appropriateness.
Although biased simulations may theoretically contribute to a misidentified high-flow season, this study highlights the global hydrological model's acceptable ability to define major and minor high-flow seasons at high resolution. Although results indicate relatively positive performance overall, regional performance varies spatially. The high-resolution definition is advantageous for many reasons, including hydrologic assessment in ungauged and poorly gauged basins and the investigation of flood season timing within large basins having diverse physical processes, for example, how the PM may shift along long rivers (e.g., Congo River) or in basins with both snowmelt- and rain-dominated processes. These spatially heterogeneous high-flow seasons at high resolution have the potential to characterize streamflow regimes better than previous studies (e.g., Dettinger and Diaz, 2000; Haines et al., 1988). Additional analysis to include upstream management and regulations is required to further classify global streamflow regimes and major high-flow seasons (or the elimination of them) for specific sub-basin-level hydrologic applications.

Figure 1. Location of 691 selected GRDC stations with the corresponding number of years per station. Background polygons are world sub-basins based on 30′ drainage direction maps (Döll and Lehner, 2002) with separation of large basins (Ward et al., 2014).
Figure 2 provides an example based on 7 years of synthetic streamflow with the volumetric threshold set at the top 5 % of flows.

Figure 2. Seven years of synthetic streamflow data. The dotted line represents the 5 % streamflow threshold. Numbers indicate the total days above the threshold for each month.

Figure 3. Map of the Zambezi River basin; the solid black line delineates the basin and the green points are the six GRDC stations (STA01-06), with STA06 downstream of the Itezhi-Tezhi dam (STA05).

Figure 4. Peak month (PM) for flooding as defined by (a) 691 GRDC observation stations, (b) simulated streamflow at associated locations and (c) temporal difference in PM between observations and simulation (simulation-observation, in number of months; a negative (positive) value indicates that the simulated PM is earlier (later) than the observed PM).

Figure 6. Peak month (PM) for flooding by sub-basin as defined by (a) 691 GRDC observation stations, (b) simulated streamflow at associated sub-basins and (c) temporal difference in PM between observations and simulation (simulation-observation, in number of months; a negative (positive) value indicates that the simulated PM is earlier (later) than the observed PM).

Figure 7. Percentage of stations (top panel) and sub-basins (bottom panel) according to the temporal difference of PM between observations and model outputs (SM-OB, number of months) in each continent.

Figure 12. Defined major HS and minor HS where joint PAMF is greater than 60 % (left panels); peak month of major and minor HSs (dense color) and pre- and post-month of major and minor HSs (light color). Monthly accumulated actual flood records (DFO) during 1985-2008 (right panels).

Table 1. Cross-correlations of peak month (PM) at locations where the PMs differ by at least one classification technique (this occurs at 61 % of stations and 54 % of associated grids).

Table 2. Average PAMF of each classification technique for modeled and observed streamflow where stations have different PMs.

Table 3. Percentage of stations according to the difference in PMs between modeled and observed streamflow for each classification technique.

Table 4. Comparison of peak month (PM) for flooding and calculated PAMF at six GRDC stations in the Zambezi River basin.