A consistent set of trans-basin floods in Germany between 1952–2002

Abstract. Floods that affect many sites simultaneously can pose great challenges in the co-ordination of flood disaster management actions, as well as for the insurance and re-insurance industry, since this type of flooding leads to an accumulation of losses and the risk assessment needs to be extended to a concept representing the spatial risk of flooding. The assessment of the accumulated risk, especially over large domains, requires an analysis of the spatial and temporal coherence of flooding. For Germany the extent of spatial dependence of flooding is largely unknown and no systematic analysis has been performed so far. In this paper, we present a methodology that is capable of capturing the simultaneous occurrence of flooding using multiple series of mean daily discharge. For the first time we present a complete and consistent set of trans-basin floods in Germany for the period between 1952 and 2002. Each flood is characterised by a specific value for the timing, the location and the magnitude of discharges within the entire river network. We propose a measure for quantifying the overall event severity considering both the heterogeneous spatial extent as well as the locally varying magnitudes of a trans-basin flood. In total, we identify 80 trans-basin floods in the entire time period. The set is dominated by events that were recorded in the hydrological winter (64%); 36% occurred during the summer months. 32 events affected more than one third of the entire river network. These most severe events are predominantly winter events. Dividing the study period into two sub-periods, we find an increase in the percentage of winter events from 58% in the first to 70.5% in the second sub-period. Accordingly, we find a significant increase in the number of extreme trans-basin floods in the second sub-period. A natural extension of this study is the quantification of the spatial and temporal dependencies in a multivariate framework. 
This framework needs to be supported by a flood typology based on the analysis of the physical processes relevant in the genesis of trans-basin floods.


Introduction
Flood events extend over a period of time, with durations lasting from a few hours to several weeks, and affect a certain space, ranging from single catchments to several basins. In this study we focus on river floods that, caused by the same hydrometeorological processes, affect multiple basins and that consequently last longer than several days. We term them trans-basin floods. Extreme river floods such as the well-known events of May 1999, January 1995 and December 1993 have demonstrated that flooding in Germany can simultaneously affect communities in more than one federal state and often in more than one river basin. This poses problems in the development of flood disaster management strategies, since the co-ordination of flood activities in case of emergencies is the responsibility of the federal state ministries, and inter-state agencies are largely endowed with advisory competences only. Further, in the insurance and re-insurance industry this type of flooding leads to an accumulation of losses. Therefore, the risk concept needs to be extended to a concept representing the spatial risk of flooding, i.e. scenarios have to be developed that capture a large number of individual risks during a single event.
Any flood risk assessment has to answer three questions (Merz and Thieken, 2004): (1) What are the possible flood scenarios? (2) How likely is their occurrence? (3) What are the consequences in case a scenario occurs? The risk to a particular site can be estimated by using a wide range of well-established methods. However, the assessment of the accumulated risk, especially over large domains like that of entire countries, requires not only studying the local extremes, but also their spatial and temporal coherence. Methods to derive the spatial dependence and the accumulated risk of flooding are still in their infancy, and for Germany the extent of spatial dependence of flooding is largely unknown. This paper starts filling this gap by providing answers to the first question of risk assessment, the possible flood scenarios, and by laying the foundation for answering the second question, namely through an analysis of the past occurrences of trans-basin floods and their overall severity.
Past floods provide a range of scenarios and through their systematic analysis inferences can be drawn on the inherent spatial and temporal dependencies. Rodda (2005) demonstrated that past floods can be employed to derive synthetic trans-basin flood scenarios for the main rivers in the Czech Republic. He assembled a set consisting of the 11 most severe historical floods from 1935 to 2002 using reports and local knowledge. For each flood, series of daily mean discharge were acquired within a temporal envelope around the days of the flood peaks for 25 gauges. The author identified three different patterns of flooding, based on which 30 synthetic flood events were derived through a qualitative analysis of the historical events and knowledge of the hydrology of the river basins.
The question arises though, whether a chronology of floods assembled from documented data sufficiently reflects the range of flooding needed to infer future flood scenarios that do not only replicate a limited set of possible combinations. Further, if the frequency of simultaneous flooding is to be assessed, the frequencies of past occurrences need to be known. In the central European context numerous studies have presented collections of flood events, e.g. the works of Glaser (2001), Stanescu (2002), Glaser and Stangl (2003), Jacobeit et al. (2003), Pohl (2004), Barnolas and Llasat (2007), Barredo (2007), and Müller et al. (2009) to name a few. However, uncertainties remain about the completeness with which flood events are identified and especially the issue of geographical referencing and the overall magnitude of floods can be addressed qualitatively only (e.g. Sturm et al., 2001).
From a frequentist's point of view, a series of observed extreme events provides the basis for an extreme value analysis by choosing either block maxima or peaks-over-threshold to define the sample within the observation period. Distribution functions can then be fitted to these samples, allowing an extrapolation beyond the range of the observed values, i.e. to more extreme events. The problem with trans-basin floods is that no straightforward measure can be used to quantify the overall magnitude of the event. The only regionally unbiased measure would be the total area of inundation or the total damage caused by a flood. But no consistent information on the actual inundation area or damages is available for historical events. Historical data on flood losses are neither comprehensive nor standardised throughout Europe (Mitchell, 2003). Blong (2003) presents a list of damage-related indices, and recently Barredo (2009) presented an attempt to normalise flood losses in Europe, using events listed in either the Natural Hazards Assessment Network (NATHAN, http://mrnathan.munichre.com/) of the Munich Re or the Emergency Events Database (EM-DAT, http://www.emdat.be/), maintained at Université catholique de Louvain. These archives contain selections of floods that exceeded certain thresholds of impact (monetary damage and number of casualties). Damage in turn is a product of the hazard itself and the vulnerability of people and assets towards flooding. Over the course of the centuries, changes in vulnerability, and especially in exposure, are evident. This leads to a bias in the archives, as floods of the same intensity may have caused little damage in the past but led to severe damages (and therefore to their inclusion in the archives) in later times.
Only recently has a reasonably complete catalogue of floods begun to be maintained (Dartmouth Flood Observatory, available at http://www.dartmouth.edu/~floods), dating back to 1985 and mainly based on satellite data.
It is the aim of this study to develop both a method that allows the derivation of a consistent set of past trans-basin floods and that provides an indicator to compare the severity of these floods based on their spatial pattern of flood magnitude. We deduce trans-basin floods by jointly analysing series of mean daily discharge at many sites for the simultaneous occurrence of peak discharges. Discharge measurements are integrals of the meteorological, catchment, and channel processes and are therefore suitable to capture the temporal and spatial evolution of flood events. The method and the severity indicator are developed based on data of catchments in Germany; however, they are transferable to other regions.
The first important issue to address is the identification of suitable events in the multiple series of discharge. To allow flood peaks at different locations and at particular time lags to be identified as being mutually related, the identification procedure requires an appropriate definition of thresholds for the flood magnitude and dynamic. So far, only few studies have investigated the spatial coherence of flooding, making use of different combinations of these thresholds, dependent on the aim of the study and the data available.
Besides the aforementioned study of Rodda (2005), Merz and Blöschl (2003) identified mutually dependent annual maxima in Austria for the purpose of developing a flood typology. A maximum time lag of one day and a maximum distance between adjacent catchments of 50 km are applied to determine those annual maxima that belong to the same flood event. The spatial spread of a flood is then expressed as an ellipsoid that is built around the centroids of the affected catchments. However, not all rivers that are affected during the event may have experienced their annual maximum flood, and in many cases flooding below the annual maximum occurred at more sites. The method favours floods which propagate along one particular river or affect only neighbouring catchments. Flood occurrences which affect multiple basins at a time lag of more than one day or floods of spatially dispersed origins cannot be captured. This is especially the case when the entire basin cannot be used for analysis due to data unavailability. For example, in the case of the severe summer floods of 1954 and 2002 the flood-triggering Vb cyclone first led to severe flooding in parts of the Danube basin and thereafter to flooding in the Elbe basin. Both occurrences are evidently related; however, for the German context the missing data for the upper Elbe basin (Czech Republic) prevent a causal correlation through particular distance measures. Keef et al. (2009) develop a measure to capture the spatial dependence in extreme river discharges using a dense network of time series of mean daily discharge in Great Britain. For a range of return levels T they estimate the expected proportion of sites at a range of distances d from any gauged site that exceed the pth quantile during an event. Even though a range of distances is tested, the overall extent of a flood is limited to the proposed radii d. Also, the choice of the quantile largely influences how many sites are identified as responding simultaneously. Naturally, for very high quantiles the spatial coherence is rather low. No indication can be given on whether all causally related flood peaks have been captured.
Our study extends the previous approaches in various ways: First of all, we aim at identifying all trans-basin floods in a period between 1952 and 2002 using as many sites as possible. We take a holistic approach on each flood event, meaning that rather than applying a strict quantile approach we are more interested in the system response. Therefore, we aim at identifying all peaks at all sites which are mutually related. For that purpose we relax the stringent thresholds of magnitude by defining that only one site needs to exceed the pth quantile. All other sites are checked for the simultaneous occurrence of flood peaks irrespective of the quantile reached.
In our approach we do not impose any distance measure to infer the mutual relation of flood peaks or set any a priori restrictions on flood extent. Since the spatial spread of a flood is confined to the river network, the flood characteristics derived from discharge measurements are not a space-filling phenomenon (Gottschalk et al., 2006). For the German context the experience from recent floods shows that flooding can develop over long distances and in a spatially asymmetric manner. This is largely due to the complexity and form of the river network, which is very diverse for the German river basins, and due to spatially varying and often multiple origins of floods. Therefore, distance measures like the Euclidean distance between catchment centroids or gauging stations cannot be employed in a straightforward manner to infer the dependence of flood peaks on a trans-basin scale. Rather, we will make use of the time-space correlation in the evolution of a flood event. Therefore, in this study the spatial distances encountered in the study area are treated implicitly by considering the timing of the peaks as a function of space. This also allows capturing dependencies amongst peak recordings at spatially remote locations that are not directly connected by a river network and/or belong to catchments which are not adjacent.
As outlined earlier, floods evolve over a period in time and extend over a certain area. That means flood peaks which can be attributed to the same flood event will be recorded at a time lag τ at the various stations. This time lag corresponds to the drift velocity of the weather system, the concentration time in the catchments and also the spatial distances between sites expressed as length and complexity of the river network including phenomena such as superposition of flood waves at confluences or retention of the flood wave due to dike breaches. In this study, flood peaks are considered as simultaneous when they occur less than a predefined time apart. This time is based on the physical understanding of the processes during flood development and it implicitly considers the spatial evolution of a flood event within as well as across basins. Hence, floods will be defined strictly in terms of the timing of their peak discharges.
A second central aspect of this work is the development of a measure of event severity for trans-basin floods. The translation of discharges into inundation area on a large scale using hydraulic approaches still poses severe demands on data and computing power and was hence not an option for this study. Alternatively, the measure of Keef et al. (2009) gives a statistical indication on the spatial dependence in extreme river discharges. Mapping this measure for all stations provides an overview on where flooding tends to occur spatially coherently at various levels of recurrence but it cannot be used for an event based assessment.
In this study we draw on the concept of flood impact by assessing the length of the river network potentially inundated during the event. We thereby define the potential for inundation as the exceedance of bankfull flow at any particular site. The events identified in the first step are characterised by information on the flood peaks per site. To derive an indicator that captures the spatial pattern of heterogeneous flood magnitudes, both the spatial extent and the magnitude have to be taken into account. Further, to ensure generic applicability, this measure must not depend on the particular set of gauged sites. We therefore introduce a simple scheme to regionalise the site-specific discharge peaks to the entire river network and normalise the discharge values by a threshold indicating whether a river stretch has actually been "in flood", i.e. has exceeded the bankfull discharge. Conditional on this criterion, the measure is formed as a cumulative weighted function of the dimensionless, normalised flood magnitude and spatial extent.
We have chosen not to express flood magnitude through return periods. During many events peak discharges of very high magnitude can be expected. Return periods are estimates of the flood magnitude, and the quality of that estimate at any site depends on the quality of the fit to an extreme value distribution and on the length of the underlying annual maximum series, here 51 years. For large quantiles these estimates are associated with high uncertainties due to an extrapolation beyond the range of the data. Further, expressing the event severity as the mean over all return periods from the affected gauges (even if weighted) would too easily be interpreted as the "true" return period of the entire event. This is certainly not the case. The estimation of the return period of a trans-basin flood must be based on a frequency analysis over the entire event set and, moving away from an empirical estimate, on an assessment of many (a thousand or more) synthetically generated flood scenarios that take into account the spatial dependencies amongst the flood peaks within the events.
The paper is structured as follows: Sect. 2 gives an overview of the data available for this study. Section 3 provides a detailed description of the methods developed for the identification of trans-basin floods and the indicator for flood severity. The resulting event set is presented in Sect. 4 together with an analysis of the main characteristics of trans-basin floods in Germany. To demonstrate that the thresholds applied for event identification are reasonable, and to allow an adaptation of the methodology to different objectives or data, a sensitivity analysis is presented in Sect. 5. Section 6 discusses the results and concludes on the main findings of this study.

Study area and data
Series of mean daily discharge were obtained for the German parts of the river basins Rhine, Ems, Weser, Danube, Elbe, and Odra from various water authorities in Germany and the Global Runoff Data Centre (GRDC). The basins are shown in Fig. 1.
The gauges were selected based on catchment size, whereby the catchment had to exceed a drainage area of 500 km². This threshold was chosen to exclude local floods from the study. The choice of gauges was further constrained by balancing the best possible spatial coverage of the investigation area against the longest coherent time period. The period of the water years from 1952 to 2002 (a water year ranging from 1 November to 31 October) was chosen, since for this period a maximum number of gauges with continuous measurements could be identified. All time series were checked for data errors and missing values. Series with more than two complete water years missing were excluded from the analysis.
A total of 162 gauges were selected, with a mean catchment area of 16 880 km² and a maximum area of 159 300 km² (Rhine). A high percentage of nested catchments are included in the dataset. The stations are not evenly distributed across the basins, with less dense coverage in the Rhine basin and dense networks in the Danube and Weser basins (see Table 1). Figure 1 illustrates the spatial distribution of all gauges and the relevant basins.
The river network used in this study is the pan-European River and Catchment database developed under the CCM2 activity (Catchment Characterisation and Modelling) of the Joint Research Centre (Vogt et al., 2007). The CCM2 dataset offers the stream network for Europe and explicitly allows the deduction of the river topology, also making reference to hierarchical structures like the Strahler system. It is therefore an ideal basis for any regionalisation of point discharge data to the entire river network. The resolution and quality of the data are sufficient for the scale of this study.

Method
Let F denote a trans-basin flood event and $Q_i(t)$ the discharge series at the sites i, with i = 1, ..., N and N the total number of sites available, here 162, and t in daily time steps over the 51-year period from 1952 to 2002. As outlined in the introduction, each event can be fully described by its spatial pattern of flood magnitude. We treat the spatial distances encountered in the study area implicitly by considering the timing of the peaks. This also allows capturing dependencies amongst peak recordings at distant locations that are not directly connected by a river network and/or belong to catchments which are not adjacent. We take a holistic view of each event, meaning that we are interested in the system response at each site given an identified extreme discharge at any other site. An extreme value is defined as at least one gauge exceeding the pth quantile of its annual maxima series during the event. The system response at any other site is given by a peak discharge that significantly deviates from the ordinary variance of $Q_i(t)$ around its running mean. Consider the set of all sites i that have exhibited a significant peak within the time interval of the event. Each flood F can then be described by the set of time-dependent discharge peaks $Q_i^P(t^P)$ at these sites (the peak value denoted by the superscript P), with E the total number of identified floods. The time interval expresses the overall duration of the event within which the maximum observed peaks at sites i are defined as mutually related or, in other words, as having occurred simultaneously. The duration lasts from the first identified peak to the last.
The identification procedure requires an appropriate definition of thresholds for the desired flood magnitude and for the flood dynamic, allowing flood peaks at different locations and at particular time lags to be identified as mutually related. The procedure comprises the following four steps, each described in more detail in a subsection:
1. In each series of mean daily discharge those days are identified where a peak above or equal to the pth quantile (peaks over threshold, POT) has been observed. This step is introduced by an explanation of the general procedure for identifying peaks in series of daily mean discharge.
2. Subsequently, all peaks that belong to the same event need to be pooled. For each day during which a POT was recorded at any site, the discharge series of all N stations are interrogated in a temporal envelope around that day to find significant peak discharges. The steps involved are:
- The definition of an appropriate temporal envelope.
- The evaluation of the significance of the peaks found.
- The definition of an inter-event time criterion which ensures independence between consecutive trans-basin floods.
3. After pooling, by applying a simple regionalisation scheme, we translate the point values of discharge peaks into a spatial extent variable that allows us to identify those events that are of a trans-basin extent. The share of the network that is potentially affected by inundation is used as an indicator for the spatial extent of each event.
4. Using the regionalisation scheme introduced in point 3, we formulate a weighted cumulative indicator that is a function of the spatial pattern of maximum observed discharges in the river network.

Peaks over threshold
As outlined, we are interested in the system response at any particular site towards a given extreme event at any other site. We adopt the POT approach to identify those days in the spatial series of discharge during which at least one gauge exceeded a discharge threshold u. Starting from these days we apply a pooling procedure to identify any peak discharge that can be mutually related to this event.
Discharge peaks for series of mean daily discharge Q(t) at any observation site i can be identified by evaluating the increments between the preceding and following discharge per day.
Let z(t) depict the series of increments between the daily values of Q(t) simplifying it to positive, zero or negative differences.
Let g denote the index to the daily time series over the 51-year period (a total of 18 628 days). Then peaks can be identified according to three cases (see Eq. 4). Firstly, a clearly peaking hydrograph where in the course of a day the discharge reaches its maximum and immediately recedes thereafter (case 1, Eq. 4). These peaks are typical for fast-reacting catchments. For slowly reacting catchments or downstream observation sites these peaks might be prolonged and the flood crest may persist over a day or two until the water level falls again (cases 2 and 3, Eq. 4).
$Q^P(t^P)$ then contains the set of discharge peaks contained in the time series Q(t). For cases 2 and 3 in Eq. (4), the first day of the sequence of increments is used as the day of the peak occurrence.
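The increment-based peak search can be illustrated with a minimal Python sketch. It is not the original implementation: the function names `sign` and `find_peaks` are hypothetical, and the encoding of the three cases as the increment patterns (+, -), (+, 0, -) and (+, 0, 0, -) is an assumption derived from the verbal description above, with the first crest day taken as the peak day.

```python
def sign(x):
    """Reduce an increment to +1, 0 or -1."""
    return (x > 0) - (x < 0)


def find_peaks(q):
    """Return the day indices of discharge peaks in a daily series q.

    z[g] is the sign of the increment from day g to day g + 1.
    Assumed cases (peak day = first day of the crest):
      case 1: rise, fall            (+, -)
      case 2: rise, flat, fall      (+, 0, -)
      case 3: rise, flat, flat, fall (+, 0, 0, -)
    """
    z = [sign(q[g + 1] - q[g]) for g in range(len(q) - 1)]
    peaks = []
    for g in range(1, len(z)):
        if z[g - 1] != 1:
            continue  # a peak must be preceded by rising discharge
        after = z[g:g + 3]
        if after[:1] == [-1] or after[:2] == [0, -1] or after[:3] == [0, 0, -1]:
            peaks.append(g)
    return peaks


# Hypothetical series with one sharp peak, one one-day crest and one two-day crest
series = [1, 2, 5, 3, 2, 4, 6, 6, 4, 3, 5, 7, 7, 7, 2]
peak_days = find_peaks(series)
```

Crests that persist longer than the three assumed patterns are ignored in this sketch; whether that matches the original Eq. (4) cannot be verified from the text alone.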
We choose the 10-year flood (Q10) as threshold u to define a minimum event severity above which a flood impact can be expected. The Q10 is commonly requested for risk maps as the first out of three or four zones in risk mapping (as for example proposed by the EU-Flood directive, or e.g. realised in the Rhineatlas (IKSR, 2001)), delineating areas of frequent flooding. In the insurance industry, objects which are situated within the exposure zone of Q10 are usually not considered as insurable (Kron, 2005).
We estimate the 10-year flood as the 90th percentile of the distribution of annual maxima. The annual maximum series (AMS) are extracted from each series of Q(t) by choosing the annual maximum peak per water year. Then, the generalised extreme value distribution (GEV) is fitted by the method of L-moments to each of the AMS. From the fitted distribution the discharge threshold u of the 10-year flood is estimated. Considering the length of the time series used, this estimate is assumed to be reliable.
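As an illustration of this step, the following Python sketch estimates Q10 from an annual maximum series by fitting a GEV with L-moments. It assumes Hosking's parametrisation (location xi, scale alpha, shape k) and his standard approximation for k; the function names are hypothetical and the sketch is not taken from the original study.

```python
import math


def sample_l_moments(x):
    """Return (l1, l2, t3) using unbiased probability-weighted moments."""
    xs = sorted(x)
    n = len(xs)
    b0 = sum(xs) / n
    b1 = sum(i * xs[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * xs[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return b0, l2, l3 / l2


def fit_gev_lmom(x):
    """Return GEV parameters (xi, alpha, k) in Hosking's convention."""
    l1, l2, t3 = sample_l_moments(x)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking's approximation for the shape
    if abs(k) < 1e-6:  # Gumbel limit
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def gev_quantile(p, xi, alpha, k):
    """Quantile of the GEV for non-exceedance probability p."""
    y = -math.log(p)
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha * (1.0 - y ** k) / k


def q10_from_ams(ams):
    """Discharge threshold u: the 0.9 quantile of the fitted GEV."""
    return gev_quantile(0.9, *fit_gev_lmom(ams))
```

In this convention a positive k implies a bounded upper tail; other software may use the opposite sign for the shape parameter, so the convention should be checked before comparing fitted values.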
Using u, we identified all POT in each series of mean daily discharge. To ensure independence of the selected events, the minimum time lag between flood peaks is set according to Svensson et al. (2005). There, dependent on the catchment size, time lags are set to 5 days for catchments <45 000 km², to 10 days for catchments between 45 000 and 100 000 km², and to 20 days for catchments >100 000 km², respectively.
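The size-dependent independence criterion can be sketched in Python as follows. The lag classes are those stated above; the function names are hypothetical, and the rule of keeping the larger of two peaks that fall within the minimum lag is an assumption made for this sketch, since the text does not specify how such conflicts are resolved.

```python
def min_lag_days(area_km2):
    """Minimum time lag between independent peaks, by catchment size."""
    if area_km2 < 45_000:
        return 5
    if area_km2 <= 100_000:
        return 10
    return 20


def independent_pot(peaks, u, area_km2):
    """Select independent peaks over threshold.

    peaks: list of (day_index, discharge) tuples sorted by day.
    Peaks below u are dropped. Of two peaks closer than the minimum lag,
    the larger one is kept (assumption of this sketch).
    """
    lag = min_lag_days(area_km2)
    kept = []
    for day, q in peaks:
        if q < u:
            continue
        if kept and day - kept[-1][0] < lag:
            if q > kept[-1][1]:
                kept[-1] = (day, q)
            continue
        kept.append((day, q))
    return kept
```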
Evaluating the POT of all N gauging stations, those days in the time period during which at least one POT was recorded were compiled into a set of dates D. Each of the days in D is then used as a starting point for identifying trans-basin events. Let j denote the index to D, with j = 1, ..., M. Based on the Q10 thresholds, a pooling method has to be applied such that mutually dependent peaks at all sites i can be identified and grouped together.
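Compiling the set of dates amounts to taking the union of the POT days over all sites. A minimal Python sketch, with the hypothetical function name `pot_dates` and an assumed input of day indices per site:

```python
def pot_dates(pot_days_by_site):
    """Return the sorted list of days on which at least one site recorded a POT.

    pot_days_by_site: dict mapping a site identifier to an iterable of
    day indices (hypothetical input format for this sketch).
    """
    return sorted(set().union(*pot_days_by_site.values()))


# Three hypothetical gauges; day 9 is shared and appears only once in the result
dates = pot_dates({"gauge_a": [5, 9], "gauge_b": [9, 20], "gauge_c": []})
```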

Temporal envelope W
Flood events that affect numerous catchments do not lead to peak discharges on the same day at all observation sites. For example, a low pressure system passes through the study area on a south-west to north-easterly track, leading to rainfall at its front and an influx of warm air. The movement may take a couple of days from its first appearance in the study area until it finally leaves the area or the precipitation field has rained out. This rain field meets specific catchment conditions, e.g. the presence of a snow cover or saturated soils. The processes of runoff concentration and the respective concentration time determine the time lag from the initial precipitation and/or snowmelt to the recording of a flood peak at the respective gauges. The flood wave in turn propagates downstream at a particular speed, leading to lagged flood peaks downstream.
In this study the design of the temporal envelope W is intended to reflect the flood dynamic at a trans-basin scale. Several features in the flood generation of large events are common to most types of floods and allow the definition of a general time window for flood peak detection. For example, Rodda (2005) uses a 10-day envelope around the date when the peak discharge of a historical flood event was recorded. This time window is then interrogated to find the maximum mean daily discharge for each station for each event. Keef et al. (2009) test the bivariate temporal dependence for a selection of sites in Great Britain on a range of lags up to a maximum of 50 days. They find that 96% of all pairs have estimates of extremal dependence within a lag of |τ| ≤ 3 days. Larger lags occur if either site of a pair is a slowly responding catchment.
Here, a more differentiated approach is applied. We infer the mutual dependence of peaks starting from any day $D_j$. Since this is likely not the point in time when flooding started, we check in both directions in time to find mutually related peaks. Therefore, the time window W around each day j is composed of a pre-POT and a post-POT time lag. The pre-POT time lag reflects the drift velocity of a flood-producing weather system over the study area and the time of concentration within the catchments, mostly resulting in time lags of a few days. The speed (which may include stationary conditions) and direction of the triggering weather system determine the point in time and the spatial order (succession) at which runoff generation is induced in the catchments. Runoff can thereby result from snowmelt, rainfall, or both (rain on snow). Typically, frontal systems that are embedded in the westerlies pass over the study area on a west-easterly track in less than 24 h. In case of quasi-stationary conditions a frontal system may persist, and lasting precipitation (with varying intensities) over a couple of days can occur. Due to the wide spatial coverage of these systems, at the beginning of an event a number of catchments that are far apart will react simultaneously or within a few days. The same holds for sequences of disturbances which cross the study area in short intervals. The time of concentration, i.e. in this study the time until the first gauge reports a peak discharge, is mainly determined by the catchment size, the catchment characteristics, and the initial catchment state.
Depending on the catchment size, various studies have used a time lag of one (catchments between 500 and 5000 km²) to three days (catchments >20 000 km²) to link the flood-triggering circulation pattern with discharges (Duckstein et al., 1993; Frei et al., 2000; Bárdossy and Filiz, 2005; Petrow et al., 2007). We choose a pre-POT interval of 3 days.
The time for the propagation of a flood wave in the channel to the most downstream location requires a much longer post-POT time lag to be considered. In the process of propagation, the flood wave can be amplified at confluences due to the simultaneous arrival of flood waves from tributaries, be maintained, or be dampened. In the first two cases, the flood peak can be monitored over long distances and the travel time of the flood wave can take several days; e.g. in the Elbe a flood wave recorded in Dresden (i.e. flooding originating in the Czech Republic) reaches the outlet of the basin (Neu Darchau) approximately 8 to 10 days later. For the flood event in March 1988 the time lag between the flood crest at the lower Elbe and the preceding Q10 was exactly 10 days. Consequently, we set the post-POT interval of the temporal envelope to 10 days, so that W = {−3, −2, ..., 0, ..., +9, +10}.
For any day $D_j$, all of the N discharge series are checked for the presence of distinct discharge peaks at any time lag τ ∈ W, using the increment-based approach as described in Eqs. (2) to (4). To also capture peaks at the very first or last day of the temporal envelope (i.e. in case 3 of Eq. 4), it is computationally necessary to extend the time window by three days at the beginning and end of the interval. Nonetheless, peaks will only be considered if they fall within W.
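The construction of the envelope and its padded search interval can be sketched in Python. The constants follow the values given in the text (3 days pre-POT, 10 days post-POT, 3 days of padding); the function name `envelope` and the clipping at the ends of the record are assumptions of this sketch.

```python
PRE_POT = 3    # days before the POT day that belong to W
POST_POT = 10  # days after the POT day that belong to W
PAD = 3        # extra days on both sides, used for peak detection only


def envelope(d_j, n_days):
    """Return (search, valid) ranges of day indices around POT day d_j.

    valid  : the days of W; only peaks on these days are accepted.
    search : W extended by PAD days on each side, so that crests starting
             on the first or last day of W can still be recognised.
    Both ranges are clipped to the record [0, n_days).
    """
    valid = range(max(0, d_j - PRE_POT), min(n_days, d_j + POST_POT + 1))
    search = range(max(0, d_j - PRE_POT - PAD),
                   min(n_days, d_j + POST_POT + PAD + 1))
    return search, valid


search, valid = envelope(100, 18628)
```

A peak found on a day in `search` but not in `valid` would be discarded, which corresponds to the last sentence of the paragraph above.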

Significance of peaks
The significance of each peak identified in W is evaluated by analysing whether it significantly deviates from the normal fluctuations of Q(t). Rather than applying a global threshold based on quantiles of the annual maximum series per site, we chose to evaluate each peak detected in W locally by comparing it to the general behaviour of the hydrograph.
For that purpose we calculate the moving average P(t) (kernel width of 13 days) for the entire discharge series. The residuals between the observed runoff Q(t) and P(t) are then calculated at each time step, producing a series of nearly normally distributed noise. We use the 90th percentile of this series as a threshold ν that, if exceeded, marks those periods in Q(t) during which the hydrograph deviates significantly from the normal fluctuations. We interpret this as a reaction to a distinct surplus of water in the river network. It also allows identifying the flood peak rather than any other minor peaks in the hydrograph. If more than one significant peak is detected in the interval W, the one with the highest discharge is used for further analysis. The procedure is expressed in Eq. (6): a peak is retained only if it exceeds P(t) + ν and lies within W.

Fig. 2. Procedure of identifying significant peak discharges. Q(t) depicts the hydrograph of an arbitrary gauge i in the interval W (grey shaded area) around a day D_j. The low-pass function P(t) is calculated using a moving average of 13 days. The significance of each peak is evaluated by calculating the 90th percentile ν of the residuals between Q(t) and P(t). From the identified peaks Q^P (arrows) only those are considered that exceed P(t) + ν and that are located within W. In the example this holds true for the second peak (black arrow).

If no significant peak is detected for site i, then Q^P_i(t^P) is treated as a missing value. The set then comprises only those sites i for which a significant peak discharge has been identified. Figure 2 illustrates the procedure of peak identification for a typical flood hydrograph. Applying the interval W (grey shade) around an arbitrary day D_j, two peaks are identified. Clearly, the first peak is located on the rising limb of the hydrograph and should not be considered. The conditional of Eq. (6) allows this distinction: only the second peak is identified as significant and is chosen for the further analysis.
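The significance test described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the original implementation: the moving average is centred, the series is padded with its edge values so that P(t) is defined on every day, and the names `significant_peak`, `peak_idx` and `window` are hypothetical.

```python
import numpy as np


def significant_peak(q, peak_idx, window, kernel=13, pct=90):
    """Return the index of the highest significant peak within `window`, or None.

    q        : mean daily discharge series Q(t) of one gauge
    peak_idx : indices of candidate peaks (from the peak detection step)
    window   : (lo, hi) inclusive index bounds of the envelope W
    kernel   : width of the moving average in days (13 in the text)
    pct      : percentile of the residuals used as threshold nu (90 in the text)
    """
    q = np.asarray(q, dtype=float)
    half = kernel // 2
    # Centred moving average P(t); edge padding is an assumption of this sketch.
    padded = np.pad(q, half, mode="edge")
    p = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    # Threshold nu: percentile of the residuals Q(t) - P(t) over the whole series.
    nu = np.percentile(q - p, pct)
    lo, hi = window
    significant = [t for t in peak_idx
                   if lo <= t <= hi and q[t] > p[t] + nu]
    if not significant:
        return None  # treated as a missing value for this site
    # If several peaks are significant, keep the one with the highest discharge.
    return max(significant, key=lambda t: q[t])
```

A minor bump on the rising limb is rejected because the moving average, pulled up by the main flood wave, already lies above it.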

Independence of events
The last step in the identification of trans-basin floods is the definition of an inter-event time criterion δ that allows defining the independence of events. Often, when a flood is evolving, a number of gauges will exceed the threshold u in short succession. The time lag between consecutive POT dates can then be only a few days, and peaks detected for day D_j may overlap with those identified for day D_{j+1}. These peaks are mutually dependent and need to be pooled into the same event group. The inter-event time criterion δ has to be defined such that no peaks are pooled which belong to separate events. The independence of significant peaks identified for each consecutive entry in D is determined by evaluating the time lag δ_j between the last peak identified for day D_j and the first peak identified for day D_{j+1}. If δ_j does not exceed one day, all entries of D_j and D_{j+1} are pooled into the same event and the duration of the event grows with the additional dates. This step basically reflects the topological behaviour of the river network once a flood is progressing downstream. In this way it is ensured that peaks at downstream locations are also picked up, even if they do not exceed the threshold of the 10-year flood and are farthest from the place of onset of an event. For example, for a typical summer Vb-type event, it is possible to capture flood peaks at the basin outlet of the Elbe, since they can be temporally linked to the last occurring POT in the study area, which would likely be reported in the mountainous headwater catchments of the Ore Mountains. If δ_j > 1 day, the events are treated as independent.
Using the series of δ_j, an index set K can be created that contains all j with δ_j > 1 day. K thus defines the points in time at which peaks identified for any date D_j with j ∈ K and those identified for the consecutive date D_{j+1} are treated as independent. e denotes the subscript to K, with e = 1, ..., E, and E gives the total number of flood events detected in the discharge series of N gauges within the period 1952-2002. If more than one significant peak is present at a particular site i within one event, the larger peak is used for further analysis. Each event is now fully described by the timing and the magnitude of the discharge peak at each site, as given by Eq. (1). The overall duration of the pooled event is defined to last from the day of the first pooled peak to the day of the last pooled peak.
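The pooling rule can be sketched as follows. The sketch assumes that, for every POT date D_j, the days of the first and last significant peak are already known and given as integer day numbers; the function name `pool_events` and the tuple layout are hypothetical.

```python
def pool_events(peak_spans, delta_max=1):
    """Pool consecutive POT dates into independent events.

    peak_spans : list of (first_peak_day, last_peak_day) per POT date D_j,
                 ordered by j, with days given as integer day numbers
    delta_max  : inter-event time criterion in days (1 in the text)

    Returns a list of (start_day, end_day) tuples, one per pooled event.
    """
    events = []
    prev_last = None
    for first, last in peak_spans:
        # delta_j: lag between the last peak of D_j and the first peak of D_{j+1}
        delta = None if prev_last is None else first - prev_last
        if delta is not None and delta <= delta_max:
            # Dependent peaks: extend the current event.
            start, end = events[-1]
            events[-1] = (min(start, first), max(end, last))
        else:
            # delta_j > delta_max (or first entry): start a new event.
            events.append((first, last))
        prev_last = last
    return events


if __name__ == "__main__":
    spans = [(10, 18), (12, 20), (21, 25), (40, 45)]
    print(pool_events(spans))
```

In the example, the first three POT dates merge into one event lasting from day 10 to day 25, while the fourth forms a separate event.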

Exclusion of spatially small events
Since trans-basin floods are the subject of this work, all events E identified in the procedure described above need to be checked for their actual impact, i.e. their spatial extent. We regionalise the point observations of peak discharges per event to the entire river network and choose a truncation level of 10% potentially inundated rivers to limit the event set to reasonably large events. This threshold translates into at least 1200 river kilometres affected during the event. In most cases this results in flooding in more than one basin. For comparison, the overall length of the river network for each basin is given in Table 1.
Therefore, a discharge threshold κ has to be identified which reflects whether the discharge peak has led to inundation, i.e. has exceeded the bankfull discharge. Since no data on river morphology or water levels at bankfull conditions were available, the threshold is estimated using a quantile approach.
Several ranges of recurrence intervals for bankfull discharge have been proposed in the literature for natural rivers. Using an annual series approach, Petit and Pauquet (1997) estimate the recurrence interval for the bankfull discharge of 30 gravel bed rivers in the north east of France. A linear relationship between catchment size and recurrence interval leads to an estimated range from about 1.8 to 2.5 years for catchments of 500 km² and larger. Using partial duration series, the results even indicate recurrence intervals in the range of 0.8 to 1.5 years. Although recent studies (Navratil et al., 2006; Wilkerson, 2008) show a strong dependence on the type of river bed and on the methodology used to estimate bankfull discharge, an average recurrence interval of 2 years (Q2) seems to be a reasonable threshold.
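As an illustration of the quantile approach, the T-year discharge can be read from the annual maximum series as the quantile with non-exceedance probability p = 1 − 1/T, so that Q2 is the median annual flood. The sketch below assumes a purely empirical quantile; whether the original study used empirical quantiles or a fitted distribution is not stated in this section, and the function name `return_level` is hypothetical.

```python
import numpy as np


def return_level(annual_maxima, T):
    """Empirical T-year discharge from an annual maximum series.

    The non-exceedance probability is p = 1 - 1/T, e.g. p = 0.5 for Q2
    and p = 0.9 for Q10. Assumption: empirical quantile, no fitted
    extreme value distribution.
    """
    p = 1.0 - 1.0 / T
    return float(np.quantile(np.asarray(annual_maxima, dtype=float), p))


if __name__ == "__main__":
    ams = list(range(1, 52))  # 51 synthetic annual maxima
    print(return_level(ams, 2), return_level(ams, 10))
```

With only 51 annual maxima, quantiles for long return periods are poorly constrained, which is one reason to prefer moderate thresholds.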
A simple regionalisation scheme is applied that accounts for both the stream length and the stream complexity. The hierarchical ordering of river networks as developed by Strahler (1964) provides a good measure to describe these features. Depending on the Strahler order ζ at gauge i, we regionalise the discharges to the river stretches upstream of the gauge. In the case of nested catchments, the length of the river network of the intermediate catchment is considered. For gauges at lower parts of streams, e.g. the Rhine (ζ ≥ 6), only those parts of the network are considered that [...]

Table 2. List of trans-basin flood events in the period 1952-2002. Classes of spatial extent are highlighted in orange (class 1, L ≥ 50%), yellow (class 2, 50% > L ≥ 33%), green (class 3, 33% > L ≥ 20%) and blue (class 4, 20% > L ≥ 10%). Winter events are displayed in black fonts, summer events in red fonts.

A few exceptions had to be considered. For the eastern tributary of the Elbe, the river Havel, time series were only available at downstream locations. The regionalisation of discharges at locations where large parts of the basin are located upstream with no further gauges is highly uncertain. Therefore, the most upstream gauge of the Havel (Ketzin) was assigned only to the river network of the same order (ζ = 6). The basin of the Odra is almost completely located in Poland. Only the confluences of the two major tributaries are located on German territory, and two gauges are situated there. Here, the discharge peaks of Hohensaaten and Eisenhüttenstadt were regionalised to the river stretches of the same ζ (8 and 6, respectively), both upstream and downstream, to allow an adequate consideration of the river length.

This kind of regionalisation can only be a very rough estimation of the true effect of each flood. Nonetheless, for the scale considered in this study and the dense network of gauges this is deemed reasonable.
Using simple GIS queries, the cumulative length l_i of the river network can be calculated for each gauge i according to the above-mentioned procedure. The ratio of this length to the total length of the entire river network then provides the weights λ_i = l_i / Σ_i l_i. The overall affected length of the river network, L, is conditional on the exceedance of the threshold level κ and is given in percent, i.e. as the sum of the weights λ_i of those gauges at which κ was exceeded, multiplied by 100. We choose to truncate the event set at L = 10%, i.e. events with L < 10% are excluded. This level keeps only those events in the set that qualify for a trans-basin analysis, i.e. in most cases more than one basin is affected. For an extreme value analysis the set can be truncated at any higher level of L, which we leave to the practitioner to decide. We present the results of this study differentiated by several extent classes to analyse possible differences in the processes that lead to trans-basin floods.

Fig. 4. Characteristics of the identified trans-basin events, ranked according to the weighted cumulative discharge index S (solid red line). The share each major basin (Odra, Ems, Weser, Danube, Rhine, Elbe) takes in the formation of the index S is indicated by the green to yellow shading. The percentage of the river network affected is given by L (solid black line). On the secondary x- and y-axes, the number of gauges at which the 10-year flood was exceeded (dashed red stems) and the number of gauges that did not show any significant reaction during the event (dashed black stems) are given. The grey dashed vertical lines indicate the approximate division of the event set into the 4 classes of spatial extent as given in Table 2.
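The weights λ_i and the affected length L described above can be sketched as follows. The sketch assumes that a missing significant peak is represented by `None` and that exceedance is tested with a strict inequality; both choices, as well as the function name `affected_length_percent`, are assumptions made for illustration.

```python
def affected_length_percent(lengths, peaks, kappa):
    """Percentage L of the river network at which kappa was exceeded.

    lengths : river length l_i assigned to each gauge i
    peaks   : peak discharge per gauge, or None if no significant peak
    kappa   : bankfull threshold per gauge (Q2 in the text)
    """
    total = sum(lengths)
    weights = [l / total for l in lengths]  # lambda_i = l_i / sum_i l_i
    affected = sum(w for w, q, k in zip(weights, peaks, kappa)
                   if q is not None and q > k)
    return 100.0 * affected


if __name__ == "__main__":
    # Three gauges; the second shows no significant peak.
    print(affected_length_percent([100, 300, 600], [50, None, 120], [40, 30, 100]))
```

An event would then be kept in the trans-basin set only if the returned value is at least 10.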

Event severity
In this study, the overall event severity is defined as a function of the spatial pattern of maximum observed discharges in the river network. Making the same impact-based assumptions as in the previous section, we normalise the peak discharges by the median annual flood (Q2), weight them by the respective river length, and derive the weighted cumulative discharge indicator S as the sum of these weighted, normalised peaks. The sum is formed only over those sites, and their respective river length, where the threshold for bankfull discharge κ was exceeded. The normalisation to the inundation threshold or median flood κ = Q2 allows comparing the magnitude of a flood at each gauge, and the sum then serves as an indicator for both event magnitude and spatial extent.
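A minimal sketch of such a severity indicator is given below. It assumes that S is the sum of the Q2-normalised peak discharges, weighted by the river-length fractions λ_i, over the sites where Q2 was exceeded. The factor `scale=100.0`, which puts S on the same percent scale as L, is an assumption of this sketch, as the exact scaling cannot be recovered from the text; the function name `severity_index` is hypothetical.

```python
def severity_index(lengths, peaks, q2, scale=100.0):
    """Weighted cumulative discharge indicator S (sketch).

    lengths : river length l_i assigned to each gauge i
    peaks   : peak discharge per gauge, or None if no significant peak
    q2      : median annual flood per gauge, used as kappa
    scale   : hypothetical scaling factor (assumption, see lead-in)
    """
    total = sum(lengths)
    s = 0.0
    for l, q, k in zip(lengths, peaks, q2):
        if q is not None and q > k:      # only sites exceeding kappa = Q2
            s += (l / total) * (q / k)   # lambda_i times normalised peak
    return scale * s


if __name__ == "__main__":
    print(severity_index([100, 300, 600], [50, None, 120], [40, 30, 100]))
```

Because every contributing ratio exceeds 1, S is always at least as large as the affected length computed from the same inputs, and the gap between the two grows with the local flood magnitudes.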

Results
Applying the methodology with the chosen set of parameters, a total of 80 trans-basin events are detected within the years 1952 to 2002. Table 2 gives an overview of the events, with ranks assigned in the order of event severity according to the indicator S. For each event the first and the last day with a significant peak discharge are given. Classifying the event set by spatial extent, four extent classes are further distinguished. Class 1 contains extreme events that affected more than 50% of the entire stream network. A total of 14 events belong to this class (highlighted in orange in Table 2). Class 2 (yellow) contains all events which affected between one third and 50% of the network, 18 in total, and another 21 events affected between one fifth and one third of the network (class 3, green). The largest group of events (27) exhibited a spatial extent just above the threshold level of 10% and up to a maximum of 20% (class 4, blue). The set is dominated by events (64%) that were recorded in the hydrological winter (1 November to 30 April); 36% occurred during the summer months (1 May to 31 October), which are marked by red fonts in Table 2. In the following, the main characteristics of the events are further analysed. Figure 4 gives an overview of the characteristic features of each event. The events in Fig. 4 are sorted in descending order according to the index S, and the corresponding rank number is given on the x-axis. Using this rank number, the event date can be obtained by cross-checking with Table 2. For an easier overview, summer events are marked by red fonts and winter events by black fonts.
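The classification by spatial extent and season can be written down compactly. The sketch below uses the class limits from the caption of Table 2 (50%, 33%, 20% and 10%) and the definition of the hydrological winter given above; the function names are hypothetical.

```python
def extent_class(L):
    """Return the extent class (1 to 4) for an affected length L in percent,
    or None if the event falls below the 10% truncation level."""
    if L >= 50:
        return 1
    if L >= 33:
        return 2
    if L >= 20:
        return 3
    if L >= 10:
        return 4
    return None


def is_winter(month):
    """True for the hydrological winter (1 November to 30 April)."""
    return month >= 11 or month <= 4


if __name__ == "__main__":
    print(extent_class(83), is_winter(3))   # top-ranking event, March 1988
    print(extent_class(15), is_winter(7))
```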
Focussing on the event severity first, it can be noted that S declines nearly exponentially, with an initial sharp decline of event severity within the first two classes, that is, those events which affected one third or more of the entire river network. The spatial extent of each event L (in %) is also displayed, highlighting the relative contribution of spatial extent and event magnitude to the indicator S. The farther apart the two lines are, the higher is the share of river stretches that have been affected by severe flooding. For better orientation, the stems at the top of Fig. 4 indicate the number of gauges at which discharges exceeded Q10 (red stems) or at which no significant peak discharges were recorded during the event (black stems). Figure 4 further gives an indication of the location of the floods, showing the relative contribution of each basin to the indicator S (colour shaded bars). During the most severe floods all major basins react (Rank 1: L = 83%). Consequently, the number of gauges which did not exhibit any reaction during the event is relatively small and is often exceeded by the number of gauges which recorded a 10-year flood or higher. This ratio changes for events of the extent classes 3 and 4. The less severe the events, the more often only one or two basins dominate the event.

Figure caption (beginning lost): The river network is coloured according to the regionalisation of the normalised peak discharges, with colours from yellow to red indicating significant peaks above the threshold κ = Q2. Grey shaded river stretches still exhibited significant peaks but did not exceed κ = Q2. At river stretches coloured white no flow reaction could be observed during the event.

Seasonality
Marked differences can be observed between winter and summer floods, both with respect to their region of occurrence and their magnitudes. In addition to the information given in Fig. 4, Fig. 5 highlights which basins were actually affected by what level of flooding. Here the shares are differentiated into those parts where no significant peak discharges could be detected (N-bar), those parts where a significant peak was observed but did not exceed the threshold of Q2, and those parts that contributed to the event severity by exceeding Q2. Two typical examples are given in Fig. 5. The results show that winter floods are characterised by moderate magnitudes that, even for events of class 1, hardly reach those of summer floods. It can be generalised that these events are characterised by their wide spatial extent (for the example of Fig. 5a: 73% of the river network exceeded the threshold Q2) rather than by flooding of high magnitude. During winter, most of the affected river segments are usually located in the Rhine basin, often in combination with the Weser, covering most of western to central Germany (on a north-south extension). In the example of Fig. 5a, only 5% of the entire river network showed no reaction to the event at all, and 22% reported significant peaks, though below the threshold Q2. Consequently, even though winter events are most common in the centre to west of Germany, the hydrometeorological origins of the floods are also present in the east and south, leading to reactions in at least parts of the Elbe and Danube basins (mostly in the western and mountainous catchments). Clearly, in Fig. 5a both basins make up nearly all of the significant peaks below Q2. Returning to the event overview given in Fig. 4, we can generalise that no winter event is solely located in the south-eastern part of the study area. The figure also highlights the difference between the most severe events of class 1 and events of class 2. Events are only listed in the top ranks if, in addition to the Rhine and Weser, catchments in the Elbe and Danube also react. Events of class 2 are mostly confined to the first two basins.
In contrast, during summer the north and west of Germany (Rhine, Weser) are hardly affected. Figure 6b illustrates for the July 1954 flood that the most severe flood peaks were exclusively observed in the Danube and Elbe. The remaining basins often do not react at all (in this case 40% without any reaction). The ranking of the event is dominated by the few extreme discharges. From the colour shading in Fig. 4 this can be generalised: most summer floods almost exclusively affect the basins of the Danube and Elbe, and during nearly all of these floods one third or even more of the river network does not respond.
A closer look at the monthly variability in the occurrence of trans-basin floods is taken in Fig. 7. It is readily apparent that trans-basin floods occur predominantly during winter, in the period between December and March. Only a few events were detected in the transition months of spring and nearly none in autumn. Summer events occur predominantly between June and August. The differentiation into the extent classes clearly shows that spatially large extents are almost exclusive to the winter months, with only 5 events in summer belonging to class 2.

Event duration
The average event duration lies in the range of 10 to 15 days, with maximum durations of up to one month and shortest durations of 5 days. Figure 8 shows the results differentiated by the spatial extent classes and by season. Extreme events (class 1, and therefore winter events) mostly take a longer course in their development, with a median of 13 days. The outlying event of 28 days is the top-ranking event of the set, March 1988. During this event a succession of snowfall, snowmelt and rainfall led to a continuous increase in the water stages and the formation of several flood waves throughout the entire country. Due to the widespread nature of these floods, longer event durations can be expected, since the flood waves propagate through all basins with varying onsets of flood initiation. Many winter floods can be expected to be partially caused by snowmelt, which often delays the concentration times due to initial storage of rainfall in the existing snow cover. Summer floods in turn show a faster reaction with an immediate rainfall-runoff transformation. Further, due to the limited area affected, the outlet of the basin and, hence, the last detectable flood peak are reached faster.

Figure caption (beginning lost): [...] (1952-1977; 1978-2002); the legends give the respective number of events per period for either the extent classes (a) or the seasons (b).

A note on stationarity
[...] by the respective thresholds of spatial extent. It can be noted that the events tend to cluster in time, with periods of frequent, often even multiple floods per year and periods with few occurrences, if any (e.g. late 1950s to mid 1960s, early to mid 1970s, end of the 1980s to beginning of the 1990s). This phenomenon has already been described in a number of studies (e.g. Shorthouse and Arnell, 1997; Mudelsee et al., 2004; Llasat et al., 2005; Sturm et al., 2001) and may be explained by distinct modes of inter-annual and inter-decadal oscillations in the climate. Aside from the clustering in flood occurrences, it is further of interest to analyse whether an actual change in the frequency of the flood events can be observed and whether there are differences with respect to event severity or season. A simple approach is adopted for this purpose (see Milly et al., 2002). The 51-year observation period is divided into two sub-periods, the first ranging from 1952 to 1977 (26 years) and the second from 1978 to 2002 (25 years). In Fig. 9 the frequencies per extent class and season, respectively, are given for each sub-period.
In total, 44 out of the 80 events occurred in the second half of the 51-year period. Differentiating the event frequencies by the classes of spatial extent reveals some interesting details. In the first half of the record, only 30.0% of all events belong to classes 1 and 2; in the second half, 47.7% belong to those two groups. In absolute numbers, 11 of these extreme events were recorded during the first half and 21 during the second half. Assuming that flood events are independent outcomes of a stationary process, these results can be compared to a binomial process. We determined a probability of 5.5% of having 21 or more extreme events (classes 1 and 2 together) out of 32 in the second half of the record, which can be described as a significant deviation from a stationary process. For the overall event frequencies, considering all classes together, and for the frequency of events of classes 3 and 4, no significant changes can be observed. Figure 9b further distinguishes the occurrence of flood events with respect to the season in which they occurred. As stated earlier (see Sect. 4.1), the most severe events are predominantly winter events. An increase in the percentage of winter events from 58.0% in the first to 70.5% in the second observation period can be noted. Moreover, all 21 extreme events (classes 1 and 2 together) in the second half were recorded during winter, compared with 9 out of 11 in the first half (30 extreme winter floods in total). Using the binomial distribution again, the probability of having 21 out of 30 extreme winter events in the second half of the record is 2.1% and, hence, deviates significantly from the assumption of a stationary process.
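The binomial comparison can be retraced with a short calculation. The sketch assumes that, under stationarity, each event falls into the second sub-period with probability p = 0.5 (the sub-periods of 26 and 25 years are nearly equal in length) and that the reported probabilities are upper-tail probabilities, i.e. "k or more" events; both are assumptions of this sketch, and the function name `binom_tail` is hypothetical.

```python
from math import comb


def binom_tail(k, n, p=0.5):
    """Probability of observing k or more successes in n independent trials
    with success probability p (upper tail of the binomial distribution)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # 21 or more of the 32 extreme events in the second sub-period
    print(round(100 * binom_tail(21, 32), 1))
    # 21 or more of the 30 extreme winter events in the second sub-period
    print(round(100 * binom_tail(21, 30), 1))
```

Under these assumptions the two tail probabilities come out at roughly 5.5% and 2.1%, in line with the values quoted above.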

Sensitivity analysis
To verify the robustness of the resulting event set, it is useful to revisit the assumptions made in the parameter settings of the methodology. Also, for different objectives in spatial risk assessment, different choices in the parameterisation of the method may be of interest. We test the sensitivity of the methodology for plausible thresholds u and κ, as well as for the time lags τ in the temporal envelope W, and analyse the effects on the resulting event sets as compared to the results obtained in Sect. 4. Another important issue to test is the sensitivity of the resulting set of trans-basin floods to the data available, i.e. the number of gauges and therefore of time series of daily discharge.

Thresholds u, κ, τ
The choice of the threshold u of the minimum desired flood magnitude, here Q10, is a key factor in the identification of flood events. An increase in the threshold to less frequent events will lead to a reduction in the number of trans-basin floods. Therefore, several discharge thresholds u were tested using quantiles of p = 0.8, 0.95 and 0.98 of the annual maximum series, corresponding to return periods of T = 5, 20 and 50 years (Q5, Q20 and Q50). Figure 10 shows the results for the indicators S and L for the original set u = Q10 and for u = Q20 and Q50. Generally, all resulting event sets are similar in the upper ranks. That is, the most severe events are detected irrespective of the choice of threshold. The exception is u = Q50, which severely reduces the number of identified events, including those of classes 1 and 2, compared to the default settings. Only a limited number of the events, even in classes 1 and 2, exhibit local magnitudes above the 50-year flood and, moreover, only very few of the winter events generally do, as has already been analysed in Sect. 4. For example, the February flood in 1970 (see Sect. 4.1) is no longer detected. Caution is also required, since the overall length of the time series is just 51 years and, therefore, the thresholds derived for the 50-year floods carry a much higher uncertainty than those of e.g. Q10. Decreasing u to Q5 in turn nearly doubles the total number of days M during which a POT at any gauge was observed (M(Q10) = 381 days; M(Q5) = 707 days). This poses problems in the separation of events even when decreasing τ in the temporal envelope W. Due to elevated flow conditions, e.g. in winter, a number of events become inseparable, extending over 1 or 2 months.
To emphasise the magnitude of an event, the truncation level κ for defining bankfull discharge can be increased. As outlined in Sect. 3.1.4, Q2 is a rough approximation of bankfull discharge for natural rivers. Areas of high vulnerability are often embanked and bankfull discharge is increased to the level of dyke construction. Since no detailed information was available for the whole of Germany, we tested κ by increasing it to Q5 (keeping u and W of the original set). As can be seen from Fig. 10, this change mostly influences the values of the indicator S, reducing it to almost half of the original value (from 129.3 for κ = Q2 to 68.7 for κ = Q5, for the same event of March 1988). The increase promotes events with a generally high magnitude in discharge. Therefore, a number of winter events are largely reduced in their spatial extent, since rivers often did not exceed the level of Q5 (L_max = 50% as compared to 82.9% for κ = Q2). In turn, the reduction in L is less pronounced for the severe summer events, ranging between 4% and 20% (median 9%), as opposed to 10-47.7% (median 20%) for winter events. Most of the originally detected severe events remain in the set, but the order of the events can change considerably, with a number of events previously ranking in class 2 now ranging below the extent threshold of 10%. The total number of events reduces from 80 to 33, with 9 out of 33 events recorded in summer. So, even though summer events tend to exhibit stronger flooding in the affected rivers, their overall spatial extent is also reduced to less than 10%. When adapting the extent threshold, considering e.g. L = 5% as the minimum constraint, the number of events identified jumps to 63, bringing back most of the events of classes 1 to 3 of the original set.

The temporal envelope W for flood peak detection was chosen using process-based assumptions on the expected time for flood evolution. The original parameter setting represents maximum values in the context of the Germany-wide assessment. Tests using shorter intervals of W = [−1, ..., +5] and W = [−2, ..., +7] days show no major changes in the event identification and ranking. Changes can be observed in the detection of peaks at the most downstream locations, which results in a tendency to omit the last flood peak and, hence, leads to a slight reduction of the event duration.

Figure caption (beginning lost): [...], the redundancy-free number of peaks over threshold (M), the total number of identified flood events E, and the number of trans-basin floods F that exceeded a particular spatial extent L, respectively, against the number of removed gauges. The notation of the box plots is the same as in Fig. 8.

Number of gauges
Finally, the influence of the number of data points on the resulting event set is analysed. For that purpose, a subset of gauges is randomly drawn from the original 162 gauges, removing 10, 30, 50, 70, 90 and 110 gauges, respectively (resulting in sample sizes of N = 152, 132, 112, 92, 72 and 52). The procedure for event identification was run 100 times for each subset, leaving the other parameters unchanged.
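The resampling experiment can be outlined as below. The sketch draws gauges without replacement, as implied by "removing" gauges, and treats the complete event identification as a user-supplied callable `identify_events`; this callable, the function name `subsample_runs` and the fixed seed are hypothetical and stand in for the methodology described in the preceding sections.

```python
import random


def subsample_runs(gauges, n_remove, identify_events, runs=100, seed=0):
    """Repeat the event identification on random gauge subsets.

    gauges          : list of gauge identifiers (162 in the study)
    n_remove        : number of gauges removed per run
    identify_events : callable mapping a list of gauges to a list of events
                      (hypothetical placeholder for the full procedure)
    runs            : number of repetitions (100 in the text)
    seed            : seed for reproducibility (assumption of this sketch)

    Returns the number of identified events for each run.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(runs):
        kept = rng.sample(gauges, len(gauges) - n_remove)  # without replacement
        counts.append(len(identify_events(kept)))
    return counts


if __name__ == "__main__":
    gauges = list(range(162))
    # Toy stand-in: every tenth gauge "triggers" one event.
    toy = lambda kept: [g for g in kept if g % 10 == 0]
    for n_remove in (10, 30, 50, 70, 90, 110):
        counts = subsample_runs(gauges, n_remove, toy)
        print(n_remove, min(counts), max(counts))
```

The spread of the returned counts over the runs is what a box plot per number of removed gauges would summarise.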
The results of the bootstrapping are summarised in Fig. 11, showing the changes at the various steps in the methodology: from the overall number of POTs, through the redundancy-free set D of dates with at least one POT observed at any of the N sites, to the resulting number of events E identified. E is then reduced to those events which are of a trans-basin character (F_{L≥10%}) and to those that exceeded certain thresholds of spatial extent (L ≥ 20%, L ≥ 33% and L ≥ 50%). The results for the original sample size of N = 162 are indicated on the solid vertical lines (zero gauges removed).
As outlined in Sect. 3.1, the POTs of the spatial series tend to cluster in time, with often many POTs occurring on the same day. Using all available gauges (N = 162) during the 18 621 days of the 51-year observation period, a total of 950 POTs are identified. These are distributed over only M = 381 days, and finally a total of 130 events (E) are identified. Of these, only 80 events are considered trans-basin floods (F_{L≥10%}). Applying the various levels of spatial extent, 53 events belong to F_{L≥20%}, 32 to F_{L≥33%} and 14 to F_{L≥50%}. Reducing N largely reduces the number of POTs that can be used in the identification procedure. This reduction, though, becomes increasingly unimportant the more the peaks are aggregated due to their mutual dependence, highlighting the strong clustering of POTs in time. The most important result is that the most severe events are almost always detected, even if the number of gauges used in the analysis is reduced by more than half. For F_{L≥50%} the median of the 100 runs performed is nearly the same for all N tested, between 12 and 13, as compared to 14 for the original set. In general, reducing the sample size by up to 50 gauges gives similar results for all levels of L. Reducing the set by more sites strongly increases the random component, and those events that in the original set are detected due to only one gauge reporting a POT are often no longer detected.
On the other hand, the extreme events remain in the set, since during severe flooding a large number of sites usually report POT discharges (compare the red stems in Fig. 4). The fewer sites are used, the more summer events are promoted in the set. As described, summer events are characterised by high magnitudes in most of the affected areas, even though the overall spatial extent is limited. Winter events in turn often report lower magnitudes in the affected areas, with only a few sites exceeding u. Hence the likelihood of having a site with a POT recording in the randomly chosen sample of sites is higher for summer events than for winter events of medium severity.
Conversely, it can be expected that an increase in the number of gauges would not result in major changes to the event set, due to the large redundancy in POTs.

Discussion and conclusions
This study, for the first time, presents a complete and consistent set of trans-basin floods for Germany in the period between 1952 and 2002. We derive a methodology that is capable of capturing the simultaneous occurrence of flooding using multiple series of mean daily discharge. Based on physical reasoning, we assume thresholds for identifying the spatial and temporal dependencies amongst peak discharges, aiming at capturing the system response rather than using a strict quantile approach. Each flood is characterised by a specific value for the timing, the location and the magnitude of discharges within the entire river network.
The consistent and data-based approach allows formulating a cumulative indicator that considers both the heterogeneous spatial extent as well as the locally varying magnitudes of a flood and, hence, allows ranking the events with respect to their overall severity.
The results indicate that in Germany trans-basin floods are a frequent phenomenon, with 80 events detected in the entire 51-year period. Thereby, the western and central parts of the country are most frequently affected. During the most severe floods all major basins react and the number of gauges that do not exhibit any reaction is relatively small. The less severe the events, the more often only one or two basins dominate the event.
We find a distinct seasonal variation of the trans-basin event characteristics. Summer floods often exhibit very strong local magnitudes that are mostly confined to the basins of the Elbe and Danube and one third or even more of the river network does not respond. In turn, winter floods often can be detected in most basins of the entire study area, but the local magnitudes are less strong than during summer floods. The most severe and in this sense also the spatially largest events are predominantly winter events.
We analysed the frequencies per extent class and season, respectively. It can be noted that the events tend to cluster in time, with periods of frequent, often even multiple floods per year, and periods with few occurrences. By dividing the time period into two subsets we detected changes in the frequency. An increase in the percentage of winter events from 58% in the first to 70.5% in the second observation period can be noted. Concurrently, we find a significant increase in the number of extreme trans-basin floods in the second period. This finding is in line with other studies that have detected a shift towards increased winter precipitation and the responsible circulation patterns in Central Europe (e.g. Caspary, 1995, 2000; Jacobeit et al., 2003, 2006; Belz et al., 2007; Pauling and Paeth, 2007; Petrow et al., 2007).
An intrinsic parameter of the methodology is the spatial domain of the study area, here the national borders of Germany. As outlined earlier, summer events tend to be spatially rather limited. Nonetheless, all of the extreme summer floods in the event set were very prominent events that caused tremendous damage (i.e. July 1954, June 1965). These events are therefore well documented and their hydro-meteorological origins have been analysed (Glaser, 2001; Christensen and Christensen, 2002; Jacobeit et al., 2003; Ulbrich et al., 2003a, b; Philipp and Jacobeit, 2003; Mudelsee et al., 2004; Pohl, 2004; Grünewald, 2006). These floods affected large parts of the Danube and Elbe basins that are located in Austria and the Czech Republic, and the spatial extent of the entire event by far exceeds that within the national borders of Germany. In contrast, the extreme winter floods in the set can be expected to have been captured more completely in their spatial extent. The Rhine and Weser basins lie for the most part or even entirely within German territory and comprise over 50% of the entire river network used in this study. Both rivers can be categorised as belonging to a winter flood regime (Disse and Engel, 2001; Mudelsee et al., 2006; Belz et al., 2007; Beurton and Thieken, 2009). The dominance of winter events in the set is therefore not surprising, and it would be interesting to analyse whether extending the study area to the complete catchments of each basin would considerably change this ratio. On the other hand, it is not unlikely that some winter floods also affected the upper Elbe basin. It has to be emphasised that the results presented here have to be interpreted solely within the national borders. For the purpose of national flood management and insurance issues this is certainly advantageous, but for an analysis of the physics behind these events the event characteristics will have to be analysed in the entire basins.
When the study area is extended, the conclusions drawn for the changes in flood frequency will also have to be revisited. The method developed in this study has been parameterised based on the available data and in the context of the spatial domain, from which thresholds based on a physical understanding of flood genesis and on standard risk assessment techniques have been derived. When adapting the method to other regions and, even more so, when extending the event set to the entire basins under study here (i.e. of the Elbe, Danube and Odra), the choices for the time lag τ in the temporal envelope W, which define the spatial dependence amongst flood peaks, have to be adapted as well. As in this study, this choice has to be made on the basis of physical reasoning (expected times of concentration and travel times in the channel network).
Depending on the aim of the analysis, i.e. the preference towards spatial extent and/or magnitude, the method can easily be adapted by choosing different percentiles for the thresholds u and κ. The choice of the POT threshold u influences the number of events that can be identified. Changing the threshold κ (bankfull discharge) alters the indicators L and S, since it raises the threshold for spatial extent. Any event set of trans-basin floods should contain events that are clearly connected with inundation and that are likely to have caused damage of considerable magnitude. Q2 is a rough approximation of bankfull discharge for natural rivers. Areas of high vulnerability are often embanked, so that bankfull discharge is increased to the design level of the dykes. For a good approximation of the inundation caused by a particular flood, the only solution is to define specific bankfull discharge thresholds for each river reach and to route the flood wave through the network. Nonetheless, when κ is changed the range of S changes but the intra-event comparison is still consistent. In this way, S can easily be adapted for applications in which more emphasis needs to be given to the event magnitude than to the spatial extent of each event.
From the sensitivity analysis it can be concluded that the most sensitive parameter for the event identification is the number of days M in the time period during which at least one gauge recorded a discharge above the threshold u. M depends on the number of sites available but, above all, on the choice of u. This threshold largely determines whether a flood event can be detected in the first place. When the POT threshold u is increased, the total number of events decreases. This is largely due to a decrease in the number of winter events, since the maximum recorded discharge during many winter events does not exceed high thresholds u. In turn, this increases the relative share of summer events in the set. From the sensitivity analysis we conclude that both Q5 and Q50 are inadequate thresholds u for the purpose of this study, because Q5 fails to distinguish between damaging flood events and periods of merely elevated discharge, and Q50 fails to detect some major events. Both Q10 and Q20 (or any threshold in between) are recommended for an analysis of trans-basin flood events.
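The dependence of M on the threshold u can be illustrated with a minimal sketch. It rests on loud assumptions: the discharge series are synthetic, the gauge names are hypothetical, and the thresholds are crude empirical quantiles of daily discharge used as stand-ins for the return-period discharges (Q5 to Q50) of this study. The sketch only counts exceedance days and does not reproduce the event identification itself.

```python
import random


def empirical_quantile(values, q):
    """Crude nearest-rank empirical quantile, used as a stand-in threshold."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def count_exceedance_days(series_by_gauge, thresholds):
    """M: number of days on which at least one gauge exceeds its threshold u."""
    n_days = len(next(iter(series_by_gauge.values())))
    return sum(
        1
        for day in range(n_days)
        if any(series[day] > thresholds[g] for g, series in series_by_gauge.items())
    )


# Synthetic daily discharges for three hypothetical gauges (illustration only).
rng = random.Random(42)
series_by_gauge = {
    g: [rng.lognormvariate(3.0, 0.8) for _ in range(3650)] for g in ("A", "B", "C")
}

# Raise the per-gauge threshold step by step and recount the exceedance days.
m_by_level = {}
for q in (0.99, 0.995, 0.999):
    u = {g: empirical_quantile(s, q) for g, s in series_by_gauge.items()}
    m_by_level[q] = count_exceedance_days(series_by_gauge, u)
    print(f"quantile {q}: M = {m_by_level[q]} days")
```

Because every per-gauge threshold rises with the quantile level, M can only stay equal or shrink, which mirrors the decrease in detectable events at high thresholds described above.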
With respect to the most severe events, the method proved very robust to changes in the number of sites N. We further conclude that down to a critical value of N=110 the overall effects on the resulting event set, in terms of the number of events detected, are insignificant. Nonetheless, when reducing the station network, care has to be taken in the regionalisation of point discharge values to the entire river network, which in turn determines the quality with which the pattern of spatially heterogeneous flood magnitude can be captured and therefore the reliability of the indicator S. For the regionalisation, as many sites as available should certainly be used to reduce the uncertainties. Beyond the consistent approach of this study, time series that only partially cover the study period could, for example, be included and compared with the respective events presented here.
The robustness of the method to the number of sites also offers the possibility to extend the analysis further back in time. Of the daily time series used in this study (n = 162), about 41 stations date back to 1922 or earlier and about 97 stations date back to 1932 or earlier. The series are more or less continuous, with a major data gap for many stations during the Second World War. The spatial spread is not very even in the early 20th century: many stations along the major rivers have long been established, but many (also large) tributaries only started to be gauged in the 1930s to 1950s. In addition, until 1930 a strong regional bias can be observed, with a dense network in the Danube basin but poor coverage in the Rhine and Weser basins. Since the sensitivity analysis of the resulting event set towards the number of available sites is performed by randomly removing stations from the set of time series, the spatial spread is more or less preserved. In the real-world situation, however, a bias can be expected due to the location of the stations. If the event set is to be extended back in time, this must be carefully taken into account. Using the series at hand, we would be confident to extend the set by roughly 10 years back in time (with caution regarding data gaps in 1945). If the regional bias of the gauging station network and the aforementioned uncertainties in the regionalisation of point discharges to the river network can be taken into account, the set may even be extended to the mid to late 1920s.
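The idea of testing robustness by randomly removing stations can be sketched as follows. This is an illustration under stated assumptions and not the procedure used in this study: the series are synthetic with a shared regional signal, the gauge names and the common threshold are hypothetical, and the comparison is made on exceedance days rather than on identified events. Uniform random sampling preserves the spatial spread only on average, which is exactly why a historical network with a regional bias behaves differently.

```python
import random


def exceedance_days(series_by_gauge, thresholds, gauges):
    """Days on which at least one of the selected gauges exceeds its threshold."""
    n_days = len(series_by_gauge[gauges[0]])
    return {
        day
        for day in range(n_days)
        if any(series_by_gauge[g][day] > thresholds[g] for g in gauges)
    }


def subsample_sensitivity(series_by_gauge, thresholds, n_keep, n_trials, seed=0):
    """Mean fraction of full-network exceedance days that is retained when
    only n_keep randomly chosen gauges are used (averaged over n_trials)."""
    rng = random.Random(seed)
    all_gauges = sorted(series_by_gauge)
    full = exceedance_days(series_by_gauge, thresholds, all_gauges)
    if not full:
        return float("nan")
    fractions = []
    for _ in range(n_trials):
        kept = rng.sample(all_gauges, n_keep)
        reduced = exceedance_days(series_by_gauge, thresholds, kept)
        fractions.append(len(reduced) / len(full))
    return sum(fractions) / n_trials


# Synthetic, spatially correlated series for 20 hypothetical gauges.
data_rng = random.Random(1)
regional = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]
series_by_gauge = {
    f"g{i:02d}": [r + 0.5 * data_rng.gauss(0.0, 1.0) for r in regional]
    for i in range(20)
}
thresholds = {g: 2.0 for g in series_by_gauge}  # hypothetical common threshold

retained_all = subsample_sensitivity(series_by_gauge, thresholds, 20, 5)
retained_few = subsample_sensitivity(series_by_gauge, thresholds, 5, 50)
print(f"retained with 20 gauges: {retained_all:.2f}")
print(f"retained with 5 gauges: {retained_few:.2f}")
```

To mimic the historical network instead, the random draw could be replaced by the actual list of stations available in a given year, which would expose the regional bias discussed above.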
Besides the coherent occurrence of damage during trans-basin floods, a concise analysis of accumulated risk should also consider the contribution of local floods to mean expected damage. These floods, even though restricted in their spatial extent and probably uncorrelated in their occurrence over space and time, may still lead to an accumulation of damage. To assess to which degree trans-basin floods and to which degree floods of smaller spatial extent contribute to mean expected damage, e.g. on the national scale, an approach as consistent as the one presented here for trans-basin floods would need to be developed to identify all relevant flood events of small spatial extent. To derive such a set, the pool of gauging stations used in this study would need to be extended by adding stations with smaller catchment sizes, resulting in a denser network of stations. So far we have used only stations with catchments of at least 500 km². In this way, we are able to reliably detect large-scale flooding. For small floods, e.g. events resulting from convective storms, the uncertainty about the completeness of the event set increases, since a number of local flood events that occurred in ungauged basins will be missed. This issue needs to be carefully addressed before conclusions on accumulated risk are drawn.
A natural extension of this study is the quantification of the spatial and temporal dependencies between the peak discharges during the trans-basin floods in a multivariate framework, as proposed, e.g., by Keef et al. (2009). This framework needs to be supported by a thorough analysis and quantification of the responsible hydro-meteorological processes (atmospheric conditions, runoff generation in the catchment, and routing) that allows a flood typology to be developed. In this way, more understanding can also be gained of the mechanisms responsible for flood genesis at the trans-basin scale. For a frequency analysis, the conditions of stationarity and homogeneity in the time series of trans-basin floods have to be evaluated carefully, since this study already found changes in the occurrence rates of winter floods and therefore of the (spatially) most extreme floods.