Estimating extreme river discharges in Europe through a Bayesian network

. Large-scale hydrological modelling of ﬂood hazards requires adequate extreme discharge data. In practise, models based on physics are applied alongside those utilizing only statistical analysis. The former require enormous computational power, while the latter are mostly limited in accuracy and spatial coverage. In this paper we introduce an alternate, statistical approach based on Bayesian networks (BNs), a graphical model for dependent random variables. We use a non-parametric BN to describe the joint distribution of extreme discharges in European rivers and variables representing the geographical characteristics of their catchments. Annual maxima of daily discharges from more than 1800 river gauges (stations with catchment areas ranging from 1.4 to 807 000 km 2 ) were collected, together with information on terrain, land use and local climate. The (conditional) correlations between the variables are modelled through copulas, with the dependency structure deﬁned in the network. The results show that using this method, mean annual maxima and return periods of discharges could be estimated with an accuracy similar to existing studies using physical models for Europe and better than a comparable global statistical model.


Introduction
There is currently substantial concern in Europe about increasing flood risk linked mainly to climate change. Available studies (Whitfield, 2012;Feyen et al., 2012; predict that the severity of floods will increase, due to changes in extreme precipitation and socio-economic development. Abundant availability of continental and global climate, land-use, and elevation data result in many studies analysing floods in a similarly large domain. However, the amount of hydrological observations at our disposal is far from sufficient for comprehensive assessments of flood hazard. This is not only the result of the uneven distribution of measurement stations, but also of the limited dissemination of data by national or local bodies responsible for their collection. High-resolution historical measurements are critical for calculating hydrological event scenarios for the purpose of delineating flood zones. Those scenarios are typically values of extreme river discharge or water level with a certain return period, i.e. the average interval of time between the occurrences of an event with the same magnitude. Such calculation additionally requires long data series, further narrowing the number of locations were such analysis can be performed. In effect, to conduct large-scale flood-hazard studies, it is necessary to fill the gaps in measurements with modelled river flows. There are two primary approaches used to obtain discharge values in ungauged catchments, i.e. catchments for which no discharge measurements are available. The first approach is to use rainfall-runoff models. They utilize physical equations describing processes such as infiltration, runoff and retention in order to transform rainfall into river discharges. These models are typically used to model river flows on the catchment scale, although in recent years D. Paprotny and O. Morales-Nápoles: Estimating extreme river discharges in Europe a few studies applied them on a continental or global scale. One series of publications (Dankers and Feyen, 2008;Rojas et al., 2012;Alfieri et al., 2014 and others) presented calculations using the LISFLOOD model. The simulation was set up for Europe with a 5 km resolution. Many different datasets of rainfall amount were analysed, including historical observations and future climate simulations, deriving daily discharge data for most of the continent. Another group of studies Winsemius et al., 2013) have introduced a global hydrological model GLOFRIS. This model has a much coarser resolution than LISFLOOD, as its rainfall-runoff module uses a 0.5 • grid (ca. 50-80 km resolution over Europe). The aforementioned studies used the modelling results to perform an extreme value analysis of river discharges. Some also continued the research with flood-hazard estimation. The main drawback of this approach is the computational expense, which necessitates a reduction in resolution. Additionally, only a limited number of rivers are included in the models. For example, LISFLOOD-based studies used a threshold of 1000 km 2 catchment size, later reduced to 500 km 2 , while GLOFRIS was prepared only for rivers with Strahler order 6 or above, which only accounts for about a third of the river length included in the aforementioned European model.
The second approach is to use statistical methods, of which a large variety exists. Several statistical models rely on the fact that catchments close to each other share many characteristics. River basins are therefore pooled into groups based on geographical proximity alone or also based on catchment size, climate data, terrain or soil type. However, the studies employing such techniques mostly covered a limited domain, typically single countries (Meigh et al., 1997;Salinas et al., 2013). The first global analysis was recently presented by . The study applied regional frequency analysis (RFA) for all continents for the first time. Here, after clustering catchments based on size, climate type and average rainfall, a probability distribution of discharges is calculated for each region. Estimates of extreme discharges for a given ungauged catchment were derived by first assigning them to a proper region and then using data on catchment size and rainfall together with region-specific coefficients to solve a simple regression equation, in order to obtain an estimate of the mean of annual maxima of discharges in the catchment. Finally, a generalized extreme-value (GEV) probability distribution with region-specific parameters is used to calculate return periods of discharges. Flood scenarios (peak discharges) obtained through this method were then used in a global flood-hazard analysis by Sampson et al. (2015).
There are also several statistical methods that rely solely on the geographical characteristics of catchments to estimate discharges. Many of them are simple equations that can be easily applied to quickly solve practical problems in engineering, such as estimating dike heights or calculating necessary channel or culvert capacity. Moreover, they are typically only applicable in small areas for which they were prepared. Usually, they are a variation of the "rational equation", which states that river discharges can be calculated by multiplying the catchment area by the rainfall intensity and runoff coefficient (Chow, 1988;Sando, 1998). The first two elements are used in virtually all methods, but the remaining element is either left out due to the difficulty of estimating it, or is derived from a model table of coefficients, or additional factors are added as proxies. For instance, Stachý and Fal (1986) developed an equation to calculate 100-year discharge in catchments above 50 km 2 in Poland which incorporates seven factors: catchment area, extreme rainfall (100-year return period), soil type, catchment slope, river slope, lake area and marsh area. However, it also requires incorporating an additional empirical coefficient for each physio-geographic region of the country, while different return periods than the default 100 years are obtained by multiplying discharge by a region-specific factor, similar to the RFA method. Another example is the preliminary flood risk assessment in Norway (Peereboom et al., 2011), which utilized a simple regression between catchment area and 500-year water level. An "envelope curve" approach was then applied, in which a curve is constructed in such a manner that it contains all (or almost all) observations. This concept was long used to make crude estimations of maximum possible floods, also on a continental scale (e.g. Padi et al., 2011, applied it to Africa). Some attempts have also been made to apply multiple linear regressions on global scale (Herold and Mouton, 2011). This paper presents a new statistical method to calculate extreme river discharges under present and future climate in Europe. It was devised as an alternative to existing physical and statistical models; its purpose was to provide boundary conditions for hydraulic modelling that could be used in a pan-European flood-hazard analysis. The method is based on Bayesian networks (BNs) that combine probability theory and graph theory in order to build and operate a joint distribution. A BN is used to analyse and represent the dependencies between different environmental variables, including river discharges. In this paper, we present the quantification of the model based on a large dataset of river-gauge observations and pan-European spatial datasets. The model shows good performance across regions of Europe at different time periods. We also present a comparison of this new approach with other methods, both physical and statistical. Lastly, we apply it over the entire domain to obtain a large database of extreme discharges, and analyse the influence of climate change on their return periods.
An early and preliminary variant of the method was originally reported in Paprotny and Morales Nápoles (2015). The BN presented there is superseded by an improved version described herein. Also, the work is part of a bigger effort to create pan-European meteorological and hydrological hazard maps under the "Risk analysis of infrastructure networks in response to extreme weather" (RAIN) project. This influenced the choice of the domain and input data, which is ex-plained in Sect. 2, although this does not limit the applicability of the method outside of the European domain.

Materials and methods
In this section, an overview of the model's framework and elements is given, followed by a description of how the model was prepared, what datasets were used to build it, what the underlying mathematical methods are, and how the model's accuracy and utility were assessed.

Workflow and outline of the method
The basic elements of the procedure to derive extreme discharge estimates through a BN are presented in Fig. 1. The first step was to identify available data on annual maxima (Q AMAX ) of daily river discharge (I), and also the catchments which contribute to locations where the measurements were made (Sect. 2.2), i.e. gauged catchments (II). Then, several large-scale (pan-European or global) spatial datasets were compiled (III), providing information on the most important variables influencing extreme river flow behaviour (Sect. 2.3) both for gauged and ungauged catchments (IV). The dependence between those variables and river discharges were analysed through copulas and BNs (Sect. 2.4) (V). After extensive testing of different configurations, an optimal model was constructed (Sect. 2.5) that had the highest performance in validation in terms of the underlying statistical model and prediction capability (VI; Sect. 2.7 and 3.1). The output of the model is annual maxima of daily river discharges (VII), which were then fitted to a probability distribution in order to obtain return periods (Sect. 2.6). After the method was ready, it was applied for all catchments (IV) in the domain to create a database of discharges (VIII). Using frequency analysis, return periods of discharges under present and future climate in Europe (Sect. 3.2) were obtained (IX). The accuracy of the BN model was also contrasted with alternate methods (X; Sect. 3.1 and 4.1).

River discharge data
Discharge data from measurement stations were collected over a domain covering most of Europe (Fig. 2). The study area includes the entire continent, plus Cyprus as a European Union (EU) member, with two exceptions. Out of the territory of the former Soviet Union, only river basins that are at least partially located within the EU were included. Also, the outlying regions of Madeira, the Azores and the Canary Islands were omitted because they are outside the EURO-CORDEX climate model's domain.
In total, data series for 1841 stations were compiled, not including a few dozen available stations whose tributaries could not be unequivocally identified and were therefore excluded from the analysis. The data were collected from five sources, as follows:  (2000) The data collected were daily discharges observed between 1950 and 2013, though of primary interest were data up to 2005, since it was the maximum range of EURO-CORDEX climate models' historical scenario runs. All datasets were quality-checked by the providers; only a few cases of misplaced decimals in daily series were identified in the data after inspection. Daily discharges were transformed into annual maxima (Q AMAX ) for each calendar year, except for the last group of 50 stations, as Fal (2000) only reported the extreme and mean values. The total number of Q AMAX values for the years 1950-2005 in the database was 74 757. The stations represent 37 countries and 439 different river basins (78 % of the domain's area of 5.67 million km 2 ). However, the south-eastern part of Europe is substantially under-represented, with most stations concentrated in Scandinavia and western Europe. France has the highest number of Q AMAX values in the database (14 %), followed closely by Spain, Sweden and the United Kingdom (UK), as can be seen in Table 1. However, the largest density of stations is in Switzerland, Austria and the UK. The catchments' sizes D. Paprotny and O. Morales-Nápoles: Estimating extreme river discharges in Europe Figure 2. Measurement stations used in the work ("long data series" indicates stations with sufficient data for calculating return periods) and river basins included in the domain. span from 1.4 to 807 000 km 2 , with 43 % of them being in the 100-1000 km 2 range.
Long data series, i.e. at least three full decades of uninterrupted data (1951-1980, 1961-1990 or 1971-2000) were available for 1125 stations. These observations were used to validate the accuracy of the model in estimating mean Q AMAX and return periods, while the complete database was used to quantify the BN model.

Spatial datasets
Several large-scale spatial datasets were collected for this study, even though not all of them were used in the final setup of the model. Nevertheless, all were useful for testing different configurations of the BN. The most important dataset was a map of the river network and catchments, which was derived from the pan-European CCM River and Catchment Database v2.1, or CCM2 de Jager and Vogt, 2010). It was created by calculating flow direction and accumulation on a 100 m resolution digital elevation model (DEM), combined with land-cover information, satellite imagery and national GIS databases. CCM2 was utilized to delimit the domain used in this paper. In total that area covers 831 125 river sections (almost 2 million km in length) in 70 638 river basins. Each river-gauge station was connected with a corresponding river section in CCM2. Each river section belongs to one primary catchment, whose attributes include the identifier of the next downstream catchment. Using this information, the whole tributary of a gauge station, or A few indicators could be derived from this dataset alone: catchment area, river network density (total river length divided by catchment area) and catchment circularity (catchment area divided by the area of a circle that has the same perimeter as the catchment), whereas others were derived using the datasets described below. The next most relevant source of information is climate data, both historical and future projections. Two datasets for the former were analysed. E-OBS is a spatial interpolation of observations made by weather stations covering the years 1950-2015 (Haylock et al., 2008), while ERA-Interim is a complete climate reanalysis for 1979-2015 (Dee et al., 2011). However, E-OBS has gaps in spatial coverage and includes few variables, whereas ERA-Interim has a relatively coarse resolution (0.75 • ). In effect, slightly better performance of the model was recorded using high-resolution control runs of a climate model under the EURO-CORDEX framework ; the results of this analysis can be found in Supplement 2. EURO-CORDEX uses regional climate models (RCMs) for Europe, where boundary conditions are obtained from global-scale general circulation models (GCMs). In this work, we utilize simulations for the historical run  and two climate-change scenarios (RCP 4.5 and RCP 8.5 for 2006-2100). The necessary variables (precipitation, snowmelt and runoff) and resolution (0.11 • ) were included in a total of 14 model runs; of these, 8 model runs start in 1950. Of the model runs, one was made using GCM boundary conditions which came from a 12-member ensemble.
This model run, which was selected to carry out this study, was made by the Climate Limited-area Modelling Community utilizing the EC-Earth general circulation model (run by ICHEC) with the COSMO_4.8_clm17 regional climate model (Rockel et al., 2008), realization r12i1p1. This RCM also has relatively good model performance when estimating extreme precipitation in comparison with others . No bias correction was performed, even though it is often considerable for extreme precipitation (Rojas et al., 2011). For the sake of simplicity and universality of the method, we opted to use all input data unaltered. However, as an additional check on the method's performance, a different GCM-RCM combination was analysed, and the results have been added to Supplement 2. From this dataset four variables were derived: total precipitation, snowmelt, near-surface temperature and total runoff. All data were daily values on a 0.11 • rotated grid (spatial resolution of about 12 km).
Meteorological factors are the driving force behind floods, but more factors influence the runoff -terrain, land use and soils. Information on terrain was obtained from two DEMs. Most of the domain is available from EU-DEM, a dataset produced for the European Environment Agency. It was created by merging two sources of satellite altimetry data -Shuttle Radar Topography Mission (SRTM) and ASTER GDEM. It has a 25 m resolution and covers 39 countries (DHI GRAS, 2014), including areas north of 60 • N, which are missing from SRTM-only datasets. For eastern Europe and some Atlantic islands which are not covered by EU-DEM, SRTM data were used instead (Farr et al., 2007). SRTM has a 3 arcsec resolution (∼ 100 m over Europe) and there are several versions available. The one used here is a void-filled derivate obtained from Viewfinder Panoramas (2014). Both datasets were resampled to a common 100 m grid matching the CCM2 dataset. The variables calculated from the DEMs included average elevation, average river slope and average catchment slope. The latter was de-D. Paprotny and O. Morales-Nápoles: Estimating extreme river discharges in Europe rived either by averaging all slopes in the DEM or by calculating the slope S with the following equation: where H max is the maximum, and H min the minimum, elevation in the catchment and A is the catchment area. Another variable, the time of concentration, which is a measure of water circulation speed in the catchment, was calculated based on Gericke and Smithers (2014). Finally, we tested a terrain classification similar to one used in FLEX-Topo hydrological model (Savenije, 2010). In this approach, all grid cells in the DEM are classified based on height above nearest drainage, slope inclination and absolute elevation (Gharari et al., 2011;Gao et al., 2014). Three classes -wetlands, hillslopes and mountains -were calculated as a percentage of total catchment area. Land-use statistics for catchments were mainly based on CORINE Land Cover (CLC), another dataset produced by the European Environment Agency (2014a). In this study, CLC 2000 edition, version 17 (12/2013), in raster format (100 m resolution) was used. It includes 44 land-cover classes with a minimum mapping unit of 25 ha and covers 39 countries. The main source material were Landsat 7 satellite images from the years 1999-2001 (European Environment Agency 2007). Similar to EU-DEM, the dataset does not cover some catchments in eastern Europe and in a few other areas. Missing information was supplemented using the Global Land Cover 2000 dataset, produced by the Joint Research Centre using algorithmic processing of SPOT 4 satellite images (Bartalev et al., 2003). This product has a 30 arcsec resolution and includes 22 land-cover classes. The different classifications were synchronized to derive the area covered by forests, croplands (total and irrigated), marshes, lakes, glaciers, bare land and artificial surfaces. However, the data were only available for a single year for the whole domain, even though CLC was also produced for 2006, 2012 and, in some countries, for 1990. In contrast to terrain or soils, land use is dynamic and could influence the analysis for early time periods. Yet, some historical land-use reconstructions and projections (e.g. Klein Goldewijk et al., 2011) do not have the necessary resolution or thematic coverage for use in this analysis. Therefore, fixed values of land-use percentages were used for all years, including the future climatechange scenarios.
Last but not least, soil property data were analysed. Occurrence of peat, unconsolidated and aeolian deposits, average water content, and soil texture were derived from the European Soil Database v2.0 (Panagos et al., 2012), developed on a 1 : 1 000 000 scale, and Harmonized World Soil Database v1.2 (FAO/IIASA/ISRIC/ISS-CAS/JRC, 2012), available at 30 arcsec resolution. Soil sealing (i.e. area covered by artificial impervious surfaces) was obtained from Revised Soil Sealing 2006, a dataset based on satellite imagery with a 100 m resolution (European Environment Agency, 2014b). Grain-size structure of the soil (gravel, sand, silt or clay) was calculated from SoilGrids1km database (Hengl et al., 2014).

Bayesian networks
As noted in the introduction, BNs are graphical, probabilistic models (Pearl, 1988;Kurowicka and Cooke, 2006). They have several advantages when compared against other methods, for the application described in this paper. For one, their graphical nature makes the dependence configuration explicit, as evidenced in Fig. 3 in the next section. A BN takes into account, for example, dependencies between different environmental variables, which are not easily modelled with regression methods. Also, they can capture the often nonlinear nature of those dependencies. The class of BNs used in this research includes several elements, whose specifics need to be explained before the actual hydrological model is presented.
First of all, consider a set of random variables (X 1 , X 2 , . . ., X n ), which could be discrete, continuous or both. This distinction defines the different types of BNs. In this paper, we build a continuous BN, since our environmental data are continuous. Furthermore, discrete BNs are only efficient for small models, whose variables have a limited number of states because of the way the (conditional) probabilities are calculated, as we explain later on. The random variables are represented as "nodes" of the BN, while the dependencies between them are represented as "arcs" joining different nodes. An arc represents the (conditional) correlation between two variables, and has a defined direction. The node whose arc points into the direction of another node is known as the "parent", while the node on the "receiving" end of the arc is its "child". A set of nodes and arcs forms the eponymous "network" of the BN. The arcs have to connect the nodes in such a manner that the graph is acyclic, i.e. if we chose any node and follow strictly the direction of all arcs in a path, we will not end up at the same node. Each variable is conditionally independent of all its predecessors given its parents. Therefore, each variable has a conditional probability function given its parents, and the joint probability can be expressed as follows: We already see that one of the purposes of BNs, perhaps the main one, is updating the probability distributions of subsets of nodes, when evidence (observations) of a different subset becomes available. Hence, it is important not only to properly set up the network with nodes and arcs, but also to choose a good method to describe the dependencies. In case of a discrete network, this is done using conditional probability tables. In our model, node "Max discharge" has 7 parents. In this case, if each continuous node was to be discretized into 5 states, a probability table with 5 8 = 390 625 conditional probabilities would be required. Of these, only 5 7 = 78 125 may be estimated by difference, as probabilities must add to 1. Thus, 312 500 probabilities would need to be specified. Similarly, if we were to discretize each node into 10 states, 90 000 000 probabilities would need to be specified. Even a discretization into 5 states for each node in our model would make the quantification prohibitive given the data available. Considering other nodes (node "Buildup" has 4 continuous parents) would make it even more restrictive for the use of discrete BNs. Thus, in this paper we apply a continuous non-parametric BN to avoid the use of probability tables.
By using a non-parametric continuous BN, we only need to specify an empirical marginal distribution for each variable and a rank correlation for each arc (Hanea et al., 2015). We use the usual estimator of the cumulative probability distribution: where (x i , . . ., x n ) are the samples of a random variable, while 1 {x i ≤x} = 1 over the set {x i ≤ x} and is zero elsewhere. Spearman's rank correlations are used to parameterize oneparameter (conditional) copulas. A copula is, loosely, a joint distribution on the unit hypercube with uniform [0,1] margins. There are many types of copulas, described in detail by Joe (2014). Here, we use bivariate Gaussian copulas, an assumption that was tested against alternate distributions (Clayton and Gumbel copulas). Details of this calculation and the validation of the whole BN can be found in Supplement 1. The bivariate Gaussian copula C has the following cumulative distribution function: where is the standard normal distribution, −1 is its inverse and ρ is the bivariate Gaussian cumulative distribution with (conditional) product moment correlation ρ between the two marginal uniform variates u and v in the interval [0,1]. In contrast to the copula specification, the nonparametric BN we apply in this study is parameterized by (conditional) rank correlations. This is because they are algebraically independent; hence, any number in the interval [−1,1] assigned to the arcs of the BN will warranty a positive definite correlation matrix. The rank correlation (denoted by r) of two random variables X i and X j with cumulative distribution functions F X i and F X j is the usual Pearson's product moment correlation ρ computed with the ranks of X i and X j : Conditional rank correlations are calculated as shown in Eq. (5), except that the conditional distributions are used inside the arguments to the right of the equal sign. For the Gaussian copula, conditional correlations are equal to partial correlations and these are constant. For one-parameter bivariate copulas, Eq. (5) becomes the following: D. Paprotny and O. Morales-Nápoles: Estimating extreme river discharges in Europe The conditional rank correlation of X i and X j given the random vector Z = z is the rank correlation calculated in the conditional distribution of (X i X j |Z = z). For each variable X i with m parents Pa 1 (X i ), . . ., Pa m (X i ), the arc Pa j (X i ) → X i is associated with the rank correlation: where the index j is in the non-unique sampling order. For more details on the non-parametric BNs we refer the reader to Hanea et al. (2015). After all the variables and parameters of the BN are in place, the joint distribution is uniquely determined. Under the Gaussian copula assumption, exact inference is available as well as efficient sampling procedures (for details, see Hanea et al., 2006). Here, 1000 samples were used each time we wanted to conditionalize the BN in order to derive an estimate of river discharges for a given location in our dataset. This number of samples is adequate to approximate the conditional distributions of interest while keeping the procedure computationally feasible. The BN for river discharges presented here was implemented in MATLAB; however, the Uninet software for non-parametric BNs was also used to visualize and analyse the model during the study (for details, see Morales Nápoles et al., 2013).

Extreme discharge model
The final BN for extreme river discharges was derived by testing many configurations involving around 30 variables. It is important to note that a BN can neither be created uniquely in an automated manner nor is it desirable to do so. Therefore, the BN in this study was built stepwise and assessed using a set of statistical measures presented in Sect. 2.7. The final model is based on eight variables and is presented in Fig. 3, with a histogram representing each variable's distributions. The position of the nodes shows their hierarchy relative to the annual maximum of daily river discharge (MaxDischarge); the order in which different variables conditionalize on the river discharge distribution (using Eq. 7) is clockwise. The (conditional) rank correlation coefficients are indicated on the arcs. The variables and BN structure are described in more detail below. Annual maximum of daily river discharge (MaxDischarge) in square metres per second (m 2 s −1 ). The parents of this variable are all the remaining variables in the BN. By far the most important is the catchment area (Area) in square kilometres (km 2 ), which determines the scale of the processes in a river basin and is largely dependent on catchment steepness (Steepness) in metres per kilometre (m km −1 ). This is because mountainous catchments are very small, divided by ranges, and only grow in size when many rivers join along the way to its drainage basin, crossing more planar regions. Steepness was calculated here using Eq. (1); it is a proxy for terrain characteristics that influence the speed with which the water from rainfall moves down the slopes (Savenije, 2010).
The climate model from EURO-CORDEX framework delivered two variables to the BN. First is the annual maximum of daily precipitation and snowmelt (MaxEvent) in millimetres (mm). Both factors are relevant, though melting of snow cover is important only in some regions. Both events often occur concurrently (as evidenced in a list of large European floods by Barredo, 2007), hence using a summation of the two improved the performance of the BN. The variable has one parent, catchment steepness, as hilly and mountainous areas receive more precipitation, also in the form of snow. The second variable is the extreme runoff coefficient (RunoffCoef), a dimensionless indicator. It was constructed to include meteorological factors influencing the circulation of water in a catchment. Every climate model needs to represent this variable to take into account factors such as soil moisture, evaporation and retention. The annual maximum of the climate model variable "total runoff" was obtained for each sample, and then divided by MaxEvent. This variable is dependent on catchment steepness, since in hilly or mountainous terrain, conditions limit evaporation or retention. It should be noted that the values of these climate variables were calculated as an average of annual maxima derived for each grid cell separately, and not by identifying the largest single event that occurred in a given catchment.
The BN is completed by three land-cover types, all expressed as a percentage of total catchment area. The statistics were obtained by choosing relevant classes from land-cover datasets. The first variable represents lakes, and was obtained using the "water bodies" class in CLC, with missing coverage supplemented by the water body layer in the CCM2 database. Lakes retain water from rainfall or snowmelt, thus reducing river discharge. This node has two parents: catchment steepness and extreme runoff coefficient. Lakes, especially large ones, are more prevalent in post-glacial plains of northern Europe, though increased lake cover is also observed in the mountains. In both of those areas, the runoff coefficients are higher, due to lower temperatures and more prevalence of impermeable soils. The second variable represents marshes, which are defined by CLC as three classes, "inland marshes", "peat bogs" and "salt marshes", while from Global Land Cover 2000 (GLC) the "regularly flooded shrub and/or herbaceous cover" class was used here. Similar to lakes, marshes increase retention in a catchment. They often occur in the same areas as lakes, with soils and climatology also having influence (as estimated by the runoff coefficient). Lastly, the built-up areas (Buildup) variable contains the "artificial surfaces" class from CLC or GLC. Construction increases the amount of impervious cover in a catchment, reducing infiltration, while water management systems collect the rainfall and route it directly to rivers. This variable is influenced, in order, by catchment steepness (flat areas are preferred for construction), runoff coefficient (which  is higher is colder areas), lakes and marshes (less space available for construction).
In order to estimate river discharge in an ungauged catchment, the BN is updated, that is, the value of the node or set of nodes (other than discharge) is defined based on the observations corresponding to that particular catchment, i.e. new evidence. Fig. 4 shows the effects of updating on the example of Basel station in Switzerland (meteorological data pertain to the year 2005). Conditionalizing on only two variables (catchment area and steepness) changed the mean of the distribution from 341 to 1740 m 2 s −1 . Knowing all seven variables that are parents of the river discharge node, we obtain an estimate of river discharge of 2819 m 2 s −1 . In this case, the estimate is fairly accurate, as discharge of 3212 m 2 s −1 was actually measured. The same procedure was applied to all rivers in the domain. Additional examples of conditionalization of the BN can be found in Supplement 1. It should be noted that the discharge in each river section was estimated independently from another section in the same river using data for the entire upstream area.
In addition to validating the method, we apply it to model the influence of future climate predictions from EC-EARTH-COSMO_4.8_clm17 (Fig. 8) and EC-HadGEM2-ES-RACMO22E (Fig. S9 in the Supplement) models. As noted before, land-cover statistics are fixed in time, and therefore only the climate variables change over time in the prediction. Future changes were calculated for two climate scenarios: RCP 4.5 and RCP 8.5. Those "representative concentration pathways" indicate changes in future physical and socio-economic environments that would cause, by 2100, an increase in radiative forcing of 4.5 or 8.5 W m −2 (Moss et al., 2010).

Return periods of discharges
Annual maxima of daily river discharges calculated by the BN were used to perform a frequency analysis. Only stations with long data series were used, i.e. those with at least 30 years of discharge observations. To find an optimal model for estimating the marginal probability distribution of annual maxima of discharges, we used the Akaike information criterion (AIC) measure (Mutua, 1994). AIC values varied significantly between stations. On average, the AIC value was the lowest for the GEV distribution, indicating that it was the best fit over 15 other tested distributions, such as generalized Pareto, gamma, lognormal or Weibull distributions. This three-parameter distribution, however, gave very large errors for some stations. Therefore, to avoid completely unrealistic estimates in the database, we decided to use the twoparameter Gumbel distribution, which is essentially the GEV distribution with the shape parameter equal to zero. This distribution was previously used in several large-scale floodhazard studies (Dankers and Feyen, 2008;Hirabayashi et al., 2013;Winsemius et al., 2013;Alfieri et al., 2014). In order to calculate discharge Q with probability of occurrence p, the following equation is used: where µ is the location parameter and σ is the scale parameter. Parameters were fitted using maximum likelihood estimation (Katz et al., 2002;Gelman et al., 2013). The extreme value analysis assumes the stationarity of the river discharge series. Using Spearman's rank correlation, we found that in 918 of 1125 gauges used to obtain return periods, the trend was not significant at level of significance of 0.05. In order to maximize the number of stations available for validation, 30-year time periods were used in the calculation. 30 years were used because such a time period maximizes the number of stations available for validation. Also, this time span is commonly used in climate research. The main validation set consists of 958 stations with 1971-2000 data, 129 with 1961-90 data and 38 with 1951-80 data. That is 1125 in total out of 1841 used to quantify the BN. For further analysis, we made the calculation for all stations with data for a given time period; the 1981-2010 period was added as well, utilizing modelled discharge estimates based on the RCP 4.5 climate scenario for the years 2006-2010. Additionally, subsets comprising different regions of Europe and catchment size were also analysed.

Measures for validation of the model's results
Accurate estimation of return periods of extreme discharges, as well as mean annual maxima, are the desired outcomes of the BN model. Quality of return periods and average maxima simulations were evaluated using a set of three measures: coefficient of determination, Nash-Sutcliffe efficiency and RMSE-observation standard deviation ratio. Those methods were selected because they have also been used in other studies (e.g. Rojas et al., 2011) and were included in an overview of most important measures by Moriasi et al. (2007). First, the Pearson's coefficient of determination (R 2 ) was used to measure the correlation between observed and simulated values. In Kurowicka and Cooke (2006) it is noted that R 2 actually factorizes into a function of the conditional rank correlations attached to the BN. Second, Nash-Sutcliffe efficiency (I NSE ) was applied to measure bias of the model. Its maximum value is 1, which means a plot of observed vs. simulated data fits the 1 : 1 line (no bias), while a value below 0 (down to −∞) indicates that the mean of the observations is a better predictor than the simulated value. The relevant equation is as follows: where x obs i is the ith observation of a variable, x sim i is the ith simulated value of that variable and x mean is the mean of observations. The final measure is root mean square error (I RMSE )-observation standard deviation ratio (I RSR ). It standardizes the RMSE based on the standard deviation of observations (I SDobs ): To further investigate the relative accuracy of the method in light of alternate models, we performed a RFA analysis, as presented by . This required us to obtain supplementary data. Each river-gauge station had to be assigned to one of five climate zones according to the Köppen-Geiger classification; a world map by Kottek et al. (2006) was used for that purpose. Overall, 65 % of stations with long records in our sample are located in the temperate climate zone, with 30 % in continental, 4 % in polar and 1 % in arid zones. Additionally, mean annual rainfall was derived from CORDEX climate data. The final input information was catchment area, readily available from our datasets. In order to estimate discharge in the RFA, a given station had to be assigned to 1 of the 82 clusters included in the RFA. The first criterion is the climate zone, which allocated a station to a group of clusters. Then, the Euclidean distance to each cluster centroid (defined through a logarithm of area and rainfall) was calculated. Afterwards, a "mean annual flood" equation (see  was solved using the coefficients from the nearest cluster as well as catchment area and annual rainfall, providing us with Q MAMX . Finally, cluster-specific GEV distribution parameters were then applied to obtain return periods of extreme discharges.

Results
In this section, extreme river discharges calculated using the BN are compared with observed river discharges. Additionally, we present the results of applying the method to estimate the influence of climate change on discharges in Europe using EC-EARTH-COSMO_4.8_clm17 climate models. Results obtained with alternate climate models can be found in Supplement 2.

Validation of the model's results
Extreme river discharge estimates obtained from the BN are presented and compared with observed discharges in Figs. 5 and 6. The graphs include the mean annual maximum of daily discharge (Q MAMX ) and three return periods of discharges. In Fig. 6 we show a comparison of specific river discharges, i.e. runoff divided by the respective catchment areas (Wrede et al., 2013). The former shows the highest performance, with both R 2 and I NSE at 0.92, while accuracy of simulated discharge fitted to Gumbel distribution decreases with the probability of occurrence. The 10-year discharge (Q 10 ) has almost the same performance as Q MAMX ,  1971-2000, 1961-1990 or 1951-1980). while the 1000-year (Q 1000 ) discharge noticeably deviates from the 1 : 1 line, mainly for very large rivers. It should be also remembered that the return periods were based only on 30-year series, and therefore a 100-or 1000-year discharge includes the uncertainty of extrapolation of the return periods. However, the I NSE value is still good, and R 2 changes moderately. The R 2 drops to 0.52 for Q MAMX when considering specific river discharge and 0.44 for 100-year discharge, with I NSE at 0.43 in both cases. Again, performance is slightly higher for 10-year discharge and drops approaching 1000-year discharge. It is also interesting to notice that the rank correlations for all four cases discussed previously (Q MAMX , Q 1000 , Q 100 and Q 10 ) are in the order of 0.8 and their bivariate distribution does not present large asymmetries ( Fig. S5 in Supplement 2). This could be an indication that a method based on copulas could also be used for bias correction; however, further investigation of this observation is outside of the scope of this paper. Performance of the model by time period, region or catchment area was also analysed in more detail (Table 2). For four different time periods, where availability of stations varies, the results of the validation are almost identical. Only for 1981-2010 is it slightly lower because it is partially outside the timespan of the historical scenario of EURO-CORDEX; for 2006-2010, data from RCP 4.5 climate-change scenario run had to be used to fill the missing information. Much more variation in the quality of the simulations is observed when dividing the results by geographical regions (their definitions correspond to the regionalization of the CCM2 catchment database). Western Europe (comprised mainly of France, Belgium, the Netherlands and the Rhine river basin) had particularly good results for Q MAMX , followed by the Danube river basin and Scandinavia (roughly defined as Sweden and Norway). The lowest correlation for Q MAMX was observed in the Iberian Peninsula (Spain and Portugal), while Central Europe (mainly Poland, Lithuania, Denmark and north-east Germany) had the highest I NSE values. Iberia had the lowest performance for Q 100 , while western Europe recorded the highest correlation, and Scandinavia had the best score in I NSE and I RSR . Central European and Scandinavian stations' error was lower and I NSE values higher for the 100-year return period compared to Q MAMX . No region dropped below acceptable levels (i.e. R 2 or I NSE value of 0.5, according to Moriasi et al., 2007), albeit stations in the Iberia and "other regions" have noticeably lower performance. In the case of Spain, to which almost all stations collected for the Iberian Peninsula belong, discharges tend to be overestimated, which may point to the influence of reservoirs on river flow. In-deed, many Spanish stations with large errors were found to be just downstream of large dams. Finally, "other regions" is a grouping of a small number of stations scattered around Europe, mainly from Finland, Italy and Iceland. Those areas, containing many rivers in both arid and polar climates, are under-represented in the quantification of the BN, which may provide a potential justification for their lower performance. In Fig. 5 it can be seen that the amount of scatter in the plot increases for rivers with smaller discharges. Detailed results in Table 2 show that the performance of the model drops for the smallest catchments, especially for those below 100 km 2 (177 catchments). For others, above 500 km 2 , the R 2 and I NSE values are mostly in the range of 0.5-0.6 for specific discharges, as when considering all stations. Additionally, to validate the robustness of the method, we did a split-sample test. Stations were randomly divided into two sets. Data from 917 stations were used to quantify the BN in order to simulate discharges in the remaining 924 stations. Of the latter, 586 stations had at least three full decades of discharge observations, which allowed us to make a comparison with simulated discharge. The validation result was almost identical with those reported for the full quantification, and even better results (R 2 = 0.94 and I NSE = 0.93) were observed for Q MAMX , while for Q 100 the same value of I NSE was calculated and R 2 equalled 0.90. Still, performance at individual stations varies. A selection of observed and simulated discharges, both annual maxima and those fitted to Gumbel distribution, is presented in Fig. 7. At some stations, there is a very close fit, while at others, either the discharge is overestimated or the distributions have different shapes. This is, however, not atypical even for more local studies.
The final analysis in this section is the comparison of the BN model and RFA. Using RFA, estimates of extreme discharge were obtained for all 1125 stations with long records and compared to discharges in Fig. 9. In the case of Q 100 , Gumbel-distributed discharges were used, as the performance with GEV distribution was slightly lower. The performance of both BN and RFA models is visually similar, though the BN recorded higher correlation and less bias then the RFA. Less scatter can be observed in upper and lower ranges of discharges, with similar performance in the middle. Using specific river discharges (Fig. 10), the performance of both methods was lower, but still much better for the BN: I NSE , for example, was negative for both Q MAMX and Q 100 when using RFA, in contrast to a value of 0.43 for the BN. RFA was devised as a global method instead of a regional one, but at the same time it is in fact a set of 82 regional approximations of hydrological processes. Here, we analyse contributing factors of extreme discharges all together, achieving comparable or even better results.

River discharges in Europe
Calculation of river discharges utilizing data from EURO-CORDEX climate simulations was done for the years 1950-2100, and are presented here in three time slices: 1971-2000, 2021-2050 and 2071-2100. The first period is from the historical "control" run, while the other two were analysed for two emission scenarios: RCP 4.5 and RCP 8.5. Projected trends calculated from the data are presented in Fig. 8. For the sake of clarity, only rivers with catchment area above 500 km 2 are presented in the picture; full-scale maps of discharges have been included in the Supplement. Aggregate statistics by region and catchment size were included in Table 3. In the description we focus on 100-year discharge, but the trends are also representative of other return periods.
The projected trends in Europe are very diversified. For Europe as a whole, there is a slight 4-7 % increase in discharges with a 100-year return period (Q 100 ), with the biggest change observed in the 2021-2050 RCP 8.5 scenario. Along 34-44 % of river length in Europe, Q 100 is projected to increase at least by 10 %, depending on scenario. Yet, along 16-21 % of river length a decrease by more than 10 % is expected, with only small changes (±10 %) for the remaining 35-49 %. In RCP 8.5 both increases and decreases of Q 100 are more prominent than in RCP 4.5. In effect, Q 100 in the 2071-2100 RCP 8.5 scenario is projected to correspond to 176-year discharge under present climate  if we take the median value. This value is slightly lower in mid-century and in end-century for RCP 4.5, with the smallest change compared to present climate in the 2021-2050 RCP 4.5 scenario.
Between regions, by mid-century, the largest average increases in extreme discharges are expected in the Iberian Peninsula and Danube basin (RCP 4.5), while Q 100 in Central Europe (i.e. mainly the Elbe, Oder and Vistula river basins) is projected to surge even more in RCP 8.5. By the end of the century, however, southern Europe (comprised mostly of Italy) will experience the biggest average increase. Conversely, Q 100 is projected to decrease on average in the British Isles in all four scenarios, in north-east Europe (Finland, north-west Russia and the Baltics) in three scenarios, in Scandinavia in two and in south-east Europe (mainly Greece) in one. Those discrepancies are the result of several trends, namely changes in extreme precipitation, snowmelt and runoff coefficient. The first is projected to increase across the continent, while the other two decrease at the same time with some exceptions. Decline in snowmelt, a consequence of thinner snow cover, will contribute to lower extreme discharges in parts of Scandinavia and Scotland. However, in most of Sweden, Finland and other areas, less snowmelt will be offset by more rainfall. Lower precipitation is expected only in small, scattered patches of Europe, most noticeably in southern Spain. At the same time, an increase of the runoff coefficient could be observed in predictions for the Iberian Peninsula and western Europe, with decreases in the remainder of the continent. Higher temperatures and less soil moisture are contributing factors to those trends.
In Table 3 projected trends in Q 100 were also provided per catchment size. The differences in average increase of discharges are very small and partially caused by their uneven distribution in Europe. Median return periods show more diversity, since the relative increase in discharge by certain increment of return period typically gets smaller as the river Figure 8. Predicted trends in daily river discharge with a 100-year return period (Gumbel distribution) under climate-change scenarios RCP 4.5 and RCP 8.5 (rivers with catchment area above 500 km 2 only). Projections based on EC-EARTH-COSMO_4.8_clm17 climate model run. grows in size. Most importantly, this breakdown shows that the method is able to detect trends in discharge in both large and small rivers.

Discussion
The results presented in the previous section, however encouraging on their own, have to be compared to other existing studies. Such analysis is presented in Sect. 4.1. Section 4.2 presents a discussion of the limitations of the method and the uncertainties in the model's setup and results. Finally, in Sect. 4.3, ongoing and planned developments of the BN are presented.

Comparison with other models
The accuracy of the BN model of extreme river discharges can be compared, directly or indirectly, with results of other statistical and physical models. In case of the former, a comparison with the RFA method was shown in Sect. 3.1. For the latter, reported values of R 2 and I NSE from several studies were obtained.
Studies with measures of model performance comparable with this analysis were summarized in Table 4. All of the publications were based on the LISFLOOD model forced by a large variety of climate models. The validation of those hydrological models was mainly based on Global Runoff Data Centre discharge data, similarly to this study. The correlation between observed and simulated mean annual maxima of daily discharges (Q MAMX ), measured by R 2 , was between . Simulated and observed average annual maxima of daily river discharges and 100-year discharge for 476 stations; Bayesian network model in red, regional frequency analysis in green. 30-year periods of annual maxima were used (the most recent available out of 1971-2000, 1961-1990 or 1951-1980). (a) (b) Figure 10. As Fig. 9, but for specific discharge, i.e. divided by catchment area.
0.86 and 0.94. The corresponding value in this study is within this range. Only one other study (Dankers and Feyen, 2008) reported R 2 for discharge with different return periods (Q 20 , Q 50 and Q 100 ). When compared with the results using the BN model, our results are slightly higher. It should be noted that in the aforementioned analysis, using Gumbel distribution (like in this study) yielded better correlation than GEV distribution. Only two studies reported I NSE values. Most interestingly, Rojas et al. (2011) show that the performance of the hydrological model changed significantly depending on how climate data were treated. The authors noted large biases in modelled precipitation data, and made a correction based on observational datasets. This modification of cli-mate data output slightly improved the correlation, but most importantly the I NSE went from a negative value, indicating poor performance, to a value close to perfect fit with a 1 : 1 line. In this study, no modifications to climate data were made and yet I NSE values for our statistical model are in the range of a physical model forced by bias-corrected climate data. Of course, the reported validation results are not perfectly comparable with this analysis, since the described studies focussed on relatively large rivers (those more than ca. 1000 km 2 catchment area) and used ENSEMBLES regional climate simulations, which are several years older than the CORDEX simulations employed herein. Additionally, R 2 and I NSE are not the only measures available. Dankers and  (2009) 1961-1990, 209 stations Rojas et al. (2011) LISFLOOD model, 1961-1990 , 1961, -1990, , 554 stations Feyen (2008 report that the error in simulating Q MAMX was bigger than 50 % in 24-25 % of stations and more than 100 % in 6-8 %. In this study, for comparable river size, i.e. with extreme discharge of ca. 100 m 3 s −1 and more, those values are 34 and 11 %. Still, overall the performance of the BN can be described as similar to the LISFLOOD model in estimating annual extremes.

Limitations and uncertainties
The BN model, despite its overall high performance, has lower accuracy over certain regions. Some of the uncertain-ties and limitations of the model are immanent properties of large-scale hydrological simulations, while others are specific to how the method was conceived and what assumptions and data were included. One of the foremost aspects belonging to the first category is that the method assumes natural flow in the catchment. Hydraulic structures, such as large dams, can have profound influence on extreme discharges, as many were developed as a flood-reducing measure. As mentioned in the results section, flows in Spanish rivers were generally overestimated and reservoirs may provide a likely explanation. Continental-or global-scale models routinely omit this aspect, as there is not enough information available to incorporate the existence of reservoirs or their operation. The BN model includes reservoirs only indirectly; they count as lakes and contribute to the percentage of the catchment covered by water bodies and have a negative influence of extreme discharge. In total, 326 large dams are within the catchments of the stations used in this study, according to the GRanD database (Lehner et al., 2011). Additionally, the conditions in the catchment may change over the timespan of the analysis of discharge data , due to reservoir construction or river regulation, or simply because of landuse developments. Currently a single snapshot of European land cover is used (from around the year 2000), but the area covered by lakes, marshes and particularly artificial surfaces is dynamic. In our analysis there was very little difference in performance between different time periods, but this aspect could be relevant locally.
The configuration of the BN presented here was the best one we found, but may not be the only solution possible, or the best one there could be. In Paprotny and Morales Nápoles (2015), the setup of the model was slightly different, with unconsolidated deposits (calculated as a fraction of all soil types in a catchment) used instead of the runoff coefficient. It can be noticed that despite several soil datasets being mentioned in the methodology (Sect. 2.3), none made the final configuration of the model. Low resolution and limited thematic accuracy of global soil data are likely the cause. Several other variables describing terrain, climate or land cover mentioned in Sect. 2.3 were not included, as adding them did not improve the model. However, one alternative configuration worth mentioning is a BN incorporating terrain classification based on height above nearest drainage. Replacing lake and marsh cover with "wetlands" and "hillslopes" identified in the DEM (see Gao et al., 2014) caused only a fractional drop in performance. Given that land-cover data for Europe have very high resolution and good accuracy, this approach may give better results in areas with less satisfactory data such as the developing countries.
Some issues are related to the datasets used. Discharge data are daily values, rather than absolute peak flows, as that variable was the only one available from the main source of information, i.e. the Global Runoff Data Centre. Yet, Polish data were only available as sub-daily maxima, which did not affect the accuracy for Poland or Europe much, but is nonetheless a slight inconsistency. More crucially, daily discharge is not adequate to model flash floods, floods of short duration or floods in small catchments. Flash floods can occur in matter of minutes and outside of river beds. Also, the model utilizes daily precipitation and snowmelt, which also may not be accurate for large catchments, where the biggest floods are caused by rainfalls lasting many days. Potential incorporation of different timespans of flood-inducing meteorological events is yet to be analysed. In some regions the amount of river-gauge station data was very limited, mainly in south-eastern Europe, while in others (northern and west-ern Europe) it was abundant, making the sample less representative. The river-gauge observations might still contain errors, even though they were quality-checked by the providers; they could also be systematically inaccurate due to, for example, outdated rating curves.
Further concerns are related to the river and catchment dataset CCM2. It has lower accuracy in areas with low relief energy, otherwise known as plains. Slight inaccuracies in the DEM result in improper delimitation of catchments in such regions. Large numbers of lakes in post-glacial parts of Europe can also result in sometimes substantial errors. For instance, the I NSE value for Q MAMX for mountainous Norway is 0.90, while for Sweden, with its lake-filled landscape, it drops to 0.71. River-gauge stations, for which there was a significant difference between catchment area in CCM2 and the corresponding value in the stations' metadata, were removed from the sample. The improperly divided basins still exist in our final database of simulated extreme discharges, though. This also involves omission of most artificial channels and all cases of bifurcation, river deltas included.
Climate data from CORDEX have the highest resolution available, yet biases in representing rainfall, snowmelt and runoff could influence the results. As addressed in Sect. 4.1, bias-correction of precipitation significantly improved performance of the LISFLOOD hydrological model, leaving room for further enhancements of the method. Another issue is related to climate-change scenarios used to construct the database of discharges. The difference between RCP 4.5 and RCP 8.5 scenarios is sometimes very large, as witnessed in Fig. 7. This alone illustrates major uncertainty related to future projections of climate. For the historical period, the use of an alternative CORDEX model and a climate reanalysis has shown (Supplement 2) that the BN model's performance depends on the climate model used, yet it is still considerably better than the RFA.
Finally, the underlying dependence structure requires further investigation since some of the bivariate distributions of variables indicate that a non-Gaussian copula could be a better model (see Supplement 1 for details). Other copulas could potentially be used since, for some distributions, tail dependence and other asymmetries may be present, even though the normal copula works well most of the time. Skewness, for example, may be modelled by copulas based on mixture distributions. This would correspond to copulas with more than two parameters (Joe, 2014).

Applications and further developments
The method was originally conceived to provide extreme discharge estimates that could be used for pan-European hazard mapping. As shown in the previous sections, the BN provides similar results when compared to existing hydrological models, yet it is much faster. For hydrodynamic modelling of water levels (Paprotny et al., 2017), catchments with area greater than 100 km 2 were selected, both to fur-ther reduce calculation time and due to limited applicability of the BN model to very small catchments. The calculation of annual maximum discharge for 151 years, including 95 years in two climate-change scenarios, in a domain of almost 156 000 river sections above the threshold and obtaining return periods of flood events, takes less than a day on a desktop PC. The exact value depends on the number of samples used when conditionalizing the BN and the number of samples used to quantify the BN. Nevertheless, the method can reduce time needed to perform a flood-hazard analysis, both continental-scale and local, as long as annual extremes are relevant for a particular study.
The results of this study -extreme discharges with certain return periods under present and future climate for all river sections in the domain -are publicly available online . The dataset was formatted in GIS in such a manner that it can be easily combined with the CCM2 river and catchment database. The files include a total of 10 different return periods of discharges (2-1000 years) and 5 scenarios, the same as described in Sect. 3.2. Additionally, for each future scenario, change in the return period of discharge compared to 1971-2000 was calculated and included in dataset. Flood-hazard maps that utilized those results are also accessible, but further discussion about them is outside the scope of this paper. This is definitely a line for future research recommended by the authors, with the first application presented in Paprotny et al. (2017). We should note, however, that all the databases were published with the intention of analysing them on a European scale, and users should be careful applying them on a local scale, especially for small and medium catchments (with an area of less than 500 km 2 ).
Thus far, the model's domain has been limited to Europe, but investigation is also ongoing into applying the method to other regions, globally. Currently, data from the United States and Mexico are being analysed. There is a very large number of river-gauge observations available for the contiguous US, while for its southern neighbour the number and quality of historical records is limited. These case studies provide interesting challenges when compared to Europe. Mexico lays mostly within tropical and arid climate zones, which is in stark contrast to Europe. The United States is geographically diversified and its biggest river system -the Mississippi-Missouri basin -is almost four times larger than the Danube basin. For these countries, global spatial datasets will be used which have a lower resolution than those applied in this study. It is possible, for instance, to quantify the BN model with those datasets and analyse its performance relative to the European quantification presented in this paper, as well as to combine those data. In this way, the model's configuration with seven variables can be challenged, as the risk is that the method is overfitting the data from Europe. But again, this could only be definitely resolved by testing the model in other geographical areas of the world. As a first check, Couasnon (2017) applied the model for the contigu-ous United States, indicating that the European quantification performed generally well, though much less accuracy was observed for arid and hurricane-influenced parts of the country than in those with temperate climate. Quantification based on US or combined (US-Europe) data performed less well, though for any variant the results were better than when using RFA, which was originally validated for that area by . Finally, the model could be potentially evaluated not only using all variables, but conditionalized only on some of them, as observations for all variables might not be available in a given location.

Conclusions
In this paper we presented a first attempt to model extreme river discharges in Europe using BNs. The method revisits the old concept of estimating discharges using only geographical properties of catchments, but employing an entirely new approach. Instead of a usual regression analysis, we determine the (conditional) correlations between different variables describing the catchments with copulas and a nonparametric BN. We show that the model has comparable accuracy to other large-scale hydrological models in simulating mean annual maxima and return periods of daily discharges and better performance than a RFA. The data necessary to apply the method can be obtained from pan-European (or global) databases for any location in the continent (or other locations where global data are available). In this sense, the method can be used to create basic flood scenarios at any ungauged location where data for these variables are available. For that reason it was used to provide estimates of extreme river discharges for both present and future climate in all rivers in a domain covering most of the continent. However, the accuracy at different ungauged locations varies to some degree. The best performance was found in Scandinavia, western Europe and the Danube basin, while the lowest was observed in southern Europe, especially in the Iberian Peninsula. Trends in discharges were found to be very diversified, while the database itself will be applied to delimiting flood-hazard zones in a separate study. Further research regarding discharge estimates with our model is recommended, especially for future climate scenarios.
There are several advantages of our approach. It has low computational expense, the method is flexible as its configuration could be easily modified, and the model can be used even if not all variables for a given location are available. At the same time it allows for sensitivity analysis of different variables on extreme discharges, as well as easy incorporation of changes in climate or land use over time. It relies purely on the statistical distributions and statistical dependence of catchment descriptors, without any empirical modifiers or clustering typical for other statistical methods. The model also has a graphical nature, which makes its formulation explicit. The aim was to make the method universal and, even though so far it was only comprehensively tested for Europe, its overall performance is encouraging. The accuracy of the model changes relatively little between regions and time periods, as well as when a split-sample test is applied. The disadvantages are mostly typical for other large-scale models, such as assumption of natural flow conditions in the rivers and lower performance in smaller catchments. Validation has shown that for catchments smaller than 500 km 2 , and especially than 100 km 2 , performance is significantly lower than for larger ones due to increasing influence of local factors. The method was also crafted only for annual maxima of discharges, with the purpose of accurately estimating return periods rather than discharges in a particular year. But again, this is the most relevant parameter in flood-hazard analysis. The method will be further developed and tested in other parts of the world.
Data availability. This work relied entirely on public data as inputs, which are available from the providers cited in Sect. 2.2 and 2.3. Results of the work can be downloaded from an online repository .