Identification of spatial and temporal contributions of rainfalls to flash floods using neural network modelling: case study on the Lez basin (southern France)

Flash floods pose significant hazards in urbanised zones and have important financial and human implications, both now and in the future, because global climate change is likely to exacerbate their consequences. It is thus of crucial importance to improve the models of these phenomena, especially when they occur in heterogeneous and karst basins where they are difficult to describe physically. Toward this goal, this paper applies a recent methodology (the Knowledge eXtraction (KnoX) methodology) dedicated to extracting knowledge from a neural network model to better determine the contributions and time responses of several well-identified geographic zones of an aquifer. To assess the interest of this methodology, a case study was conducted in southern France: the Lez hydrosystem, whose river crosses the conurbation of Montpellier (400 000 inhabitants). Rainfall contributions and time transfers were estimated and analysed in four geologically delimited zones to estimate the sensitivity of flash floods to water coming from the surface or karst. The Causse de Viols-le-Fort is shown to be the main contributor to flash floods and the delay between surface and underground flooding is estimated to be 3 h. This study will thus help operational flood warning services to better characterise critical rainfall and develop measurements to design efficient flood forecasting models. This generic method can be applied to any basin with sufficient rainfall–run-off measurements.


1 Introduction
Flash floods are rapid (they rise in a few hours) and intense floods that occur within small basins. Our current lack of understanding of these floods constitutes a great societal challenge because of their socioeconomic and environmental impacts (Gaume and Bouvier, 2004; Llasat et al., 2010). Over the past 20 years, flash flooding in south-eastern France has caused more than 100 fatalities and several billion euros in property damage. In karst basins, the event of June 2010, in the river Var (southern France) caused 27 casualties and more than one billion euros of damages. Early warning is also a priority (Borga et al., 2011; Price et al., 2011) that could be improved by using forecast models. In recent decades, considerable efforts have been devoted to improving our understanding and forecasting of flash flooding (Gaume et al., 2009; Marchi et al., 2010). In the literature three aspects were investigated: (i) the rain event (or other cause of rising water), (ii) run-off genesis, and (iii) surface and underground geomorphologic and geologic settings that channel the water transfer toward the outlet.
Mediterranean rain events often occur at the meso-scale (Rivrain, 1997) and generate intense localised rainfall. For this reason, Le Lay and Saulnier (2007), Cosandey and Robinson (2000), and Tramblay et al. (2010) show that flash-flood generation is controlled by spatial and temporal variability of rainfall and initial soil-moisture conditions. Moreover, sensitivity to rainfall heterogeneity is elevated in small watersheds, which are locations of flash flooding (Krajewski et al., 1991; Corradini and Singh, 1985 al., 2015). The hydrodynamic behaviour of hydrosystems subject to intense rain events depends on soil moisture as well as geology, tectonics, and land use (Anctil et al., 2008; Nikolopoulos et al., 2011). Moisture content estimation at the watershed scale has proven beneficial for discharge prediction (Kitanidis and Bras, 1980; Parajka et al., 2006; Wooldridge et al., 2003). Nevertheless, soil-moisture measurements are highly dependent on field measurement techniques; they provide relative spatial and temporal distributions (Katul et al., 2007; Lauzon et al., 2004) rather than absolute values.
In karst systems, underground water plays a significant role in flooding (Bailly-Comte et al., 2012; Fleury et al., 2013). Nevertheless, karst systems are intrinsically heterogeneous and their hydrodynamic behaviour generally differs from one system to another (Bakalowicz, 2005). Moreover, although the contribution of karst groundwater to flash flooding is sometimes assumed to be negligible because of its longer response time (Borga et al., 2007; Norbiato et al., 2008), other studies emphasise the considerable contribution of groundwater to flash flooding (Bailly-Comte et al., 2012). Faced with the question of the role of karst groundwater in flash flooding, this study investigates a method for estimating spatialised contributions from different parts of a heterogeneous aquifer.
Because of the lack of knowledge regarding the various hydrodynamic behaviours involved in karst systems, a generic black-box method seems to be adequate. For this reason, neural network modelling seems to be a relevant method (Kong-A-Siou et al., 2011, 2014; Kurtulus and Razack, 2007). In recent decades, the multilayer perceptron has been increasingly used in the field of hydrology (Maier and Dandy, 2000; Toth, 2009). These models have been effective in identifying the rainfall-run-off relationship (Hsu et al., 1995). Their ability to forecast flash floods (Toukourou et al., 2011; Artigue et al., 2012) and to model karst system behaviour has also been demonstrated (Kong-A-Siou et al., 2011). To model hydrosystem behaviour efficiently, neural networks need relevant data sets as input and output variables, and rigorous application of regularisation methods (Abrahart and See, 2007; Bowden et al., 2005; Fernando et al., 2009). Rainfall data are obvious inputs; in addition, Anctil et al. (2008) demonstrated that soil-moisture content observations improve prediction performance. Even so, selection of relevant variables to represent moisture content is a difficult task (Darras et al., 2014a). Data quantity and quality are the major limiting factors in the application of neural networks to hydrological modelling (Pereira Filho and Santos, 2006). Because of noisy data, neural networks used to model natural phenomena are sensitive to overfitting; the use of regularisation methods to deal with the bias-variance trade-off is thus mandatory (cf. Sect. 3.1.2). Kong-A-Siou et al. (2014) compared neural network models and VENSIM software to simulate flooding or drought; they concluded that neural modelling performed better for extreme events, whereas VENSIM worked better for intermediate, more complex events. This statistical approach has been used to propose some interesting hydrological models. Artigue et al. (2012) have proposed a combination of linear and nonlinear modelling in the same model. Corzo and Solomatine (2007) have proposed a combination of specialised neural networks to represent isolated processes involved in flood genesis. These methods provided efficient forecasts on rapid hydrodynamic watersheds. Moreover, recent advances have proven that the use of these statistical tools can improve the currently available knowledge of a system. Based on these recent scientific findings, the Knowledge eXtraction (KnoX) methodology was developed to describe contributions and time transfers of spatialised rainfall in any basin. This paper thus proposes to apply this methodology to better understand both surface and groundwater processes at the origin of flash flooding in a karst basin. To this end, we focus on the Lez karst hydrosystem, which feeds the Lez river that flows through the conurbation of Montpellier (southern France) with a population of 400 000. Because of its meteorological and geomorphological setting, the Lez river at the Lavalette station, located at the entrance to the city of Montpellier, is the site of flash flooding. In addition, as a karst system, the geomorphological structure of the Lez aquifer is strongly heterogeneous, leading to anisotropic water circulation and highly non-linear hydrodynamic behaviour. Flow rate at Lavalette station includes contributions from perennial karst springs (the most important is Lez spring), temporary karst springs (Lirou spring can be stronger than Lez spring), diffuse karst arrivals, and also run-off.
The scientific challenge of this study is thus to apply neural networks to better quantify the processes operating in flash flooding. For this purpose, after the Introduction, Sect. 2 presents a discussion of neural network modelling and the KnoX method. Section 3 is a description of the study area. Section 4 presents the application of the KnoX method to the study area and the estimates of contributions and time transfers of spatialised rainfalls to discharge at Lavalette. Section 5 discusses the results and sets out operational and scientific implications. In the conclusion section we discuss innovative perspectives of this generic methodology.

... provided hereafter. The chosen model is the multilayer perceptron because of its properties of universal approximation and parsimony (Barron, 1993). The universal approximation is the capability to approximate any differentiable and continuous function with an arbitrary degree of accuracy (Hornik et al., 1989). In our study, the multilayer perceptron is a feed-forward model, a finite impulse response model based on Nerrand et al. (1993). Designing a multilayer perceptron consists mainly of selecting input variables and the number of hidden neurons. This mechanically determines the number of parameters; model complexity increases with the number of parameters. The general equation of the function calculated by the feed-forward multilayer perceptron is the following:

$$y^{k} = g_{NN}\left(y_p^{k-1}, \ldots, y_p^{k-w_y}, \mathbf{u}^{k}, \ldots, \mathbf{u}^{k-w_u+1}, \mathbf{C}\right) \qquad (1)$$

where $y^{k}$ is the estimated value of the output at the discrete time $k$; $y_p^{k}$ is the observed value of this variable; $\mathbf{u}^{k}$ is the input vector; $g_{NN}$ is the non-linear function implemented by the neural network; $w_u$ and $w_y$ are the widths of the windows used to apply the input time series, which are linked to the lengths of the vectors $\mathbf{u}$ and $y_p$; and $\mathbf{C}$ is the matrix of parameters of the model, also called "weights".
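As an illustration of the windowed feed-forward computation described above, the following minimal Python sketch builds the input vector at time step k from a rainfall window and a window of previously observed discharges, then evaluates a one-hidden-layer perceptron. The function names, the single rainfall series, the tanh activation and the bias handling are assumptions made for the example; they are not taken from the study.

```python
import math
from typing import List, Sequence


def build_input(rain: Sequence[float], q_obs: Sequence[float],
                k: int, w_u: int, w_y: int) -> List[float]:
    """Input vector at time k (hypothetical layout).

    Rainfall window: rain[k], ..., rain[k - w_u + 1].
    Observed discharge window: q_obs[k - 1], ..., q_obs[k - w_y].
    A constant 1.0 is appended to act as the bias input.
    Requires k >= max(w_u - 1, w_y).
    """
    rain_window = [rain[k - i] for i in range(w_u)]
    q_window = [q_obs[k - j] for j in range(1, w_y + 1)]
    return rain_window + q_window + [1.0]


def mlp_output(x: Sequence[float],
               hidden_weights: Sequence[Sequence[float]],
               output_weights: Sequence[float]) -> float:
    """One hidden layer of tanh neurons followed by a linear output neuron.

    hidden_weights[h] holds the weights of hidden neuron h (same length as x).
    output_weights has one weight per hidden neuron plus a final bias weight.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in hidden_weights]
    hidden.append(1.0)  # bias unit feeding the output neuron
    return sum(w * h for w, h in zip(output_weights, hidden))
```

In this sketch the number of parameters grows with both window widths and with the number of hidden neurons, which is the complexity that model selection has to control.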
As statistical models, neural networks are designed in relation to a database. This database is usually divided into three sets: a training set, a stop set, and a test set. The training set is used to calculate parameters through a training procedure that minimises the mean quadratic error calculated on output neurons. The training is stopped by the stop set (cf. Sect. 2.1.2), and model quality is estimated by the third part of the database: the test set, which is separate from the training and stop sets. The model's ability to be efficient on the test set is called generalisation. However, the training error is not an efficient estimator of the generalisation error: the efficiency of the training algorithm makes the model specific to the training set. This specialisation of the neural network on the training set is called overfitting. Overfitting is exacerbated by large errors and uncertainties in field measurements; the model learns the specific realisation of noise in the training set. This major issue of neural network modelling is known as the bias-variance trade-off (Geman et al., 1992). Regularisation methods are usually used to avoid overfitting; to this end, two regularisation methods were used in this study.

Regularisation methods
In the context of this study, the goal of regularisation methods is to minimise output variance. To this end, cross-validation (Stone, 1974) was used as explained in Kong-A-Siou et al. (2012) to empirically select input variables and the number of hidden neurons. Cross-validation thus minimises model complexity and therefore output variance (Schoups et al., 2008).
Another regularisation method is commonly employed: early stopping (Sjöberg et al., 1995). This method stops training before overtraining occurs. A dedicated set, called a stop set, is considered separately from the database.
Working also on the Lez aquifer but considering only underground water at the Lez spring, Kong-A-Siou et al. (2011) applied a multilayer perceptron to forecast discharge at the Lez spring and validated cross-validation as a useful method to select the complexity of the model. Moreover, Kong-A-Siou et al. (2012), for the same basin, focussed on regularisation methods (early stopping and weight decay). They concluded that early stopping used in conjunction with cross-validation was efficient.
Nevertheless, these results, obtained with a 16-year daily database, cannot be applied directly in the present study because the flash-flood database is too limited to permanently set aside another set (the stop set). Thus, to apply early stopping without a stop set, a pre-defined maximum number of training iterations was selected to stop training before complete convergence and, in this way, avoid overtraining. The optimal number of training iterations is nevertheless first selected using a stop set; afterwards, the model is run without the stop set using this pre-defined optimal number of training iterations. In the first stage, the database, not including the test set, was divided into S subsets corresponding to flash-flood events. Training was performed on S-1 subsets with 50 different parameter initialisations. The remaining subset was used as a stop set. Each subset was used in turn as a stop set. For each trial, the training iteration with the minimum mean quadratic error over the stop set was recorded. The median of these numbers of iterations was calculated over all stop sets and all initialisations and selected as the optimal number of training iterations. In a second stage, this optimal number of training iterations (12 iterations) is used throughout the rest of the study without any further use of a stop set.
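The two-stage selection of the number of training iterations can be sketched as follows. This is a minimal illustration under assumptions: the training itself is not reproduced, the stop-set error curves are taken as given inputs, and the function names and data layout are hypothetical.

```python
from statistics import median
from typing import Sequence


def best_iteration(stop_errors: Sequence[float]) -> int:
    """Return the 1-based training iteration with the lowest stop-set error."""
    return min(range(len(stop_errors)), key=lambda i: stop_errors[i]) + 1


def optimal_iterations(error_curves: Sequence[Sequence[Sequence[float]]]) -> float:
    """Median of the best iteration over all stop subsets and initialisations.

    error_curves[s][i] is the stop-set error curve (one value per training
    iteration) obtained when subset s is the stop set and the model is
    started from initialisation i (hypothetical layout).
    """
    best = [best_iteration(curve)
            for per_subset in error_curves
            for curve in per_subset]
    return median(best)
```

Once this number is known, every later training run is simply stopped after that many iterations, so no event has to be withheld from training as a stop set.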
In this study, parameters are iteratively calculated using the Levenberg-Marquardt algorithm (Hagan and Menhaj, 1994).
It is also well known that model performance depends strongly on parameter initialisation. To define a reliable simulation independent of the initialisation, Darras et al. (2014b) proposed to establish an ensemble of 50 models trained from different initialisations. The output is calculated at each time step as the median of the 50 outputs. This method is known to smooth the output of the model; nevertheless, this is not a drawback in this study because it improves the robustness of the model, which is very important for extracting information.
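The ensemble output described above amounts to a time-step-wise median across members. A minimal sketch, assuming each member's simulation is available as a list of discharges of equal length (the function name and layout are illustrative):

```python
from statistics import median
from typing import List, Sequence


def ensemble_median(simulations: Sequence[Sequence[float]]) -> List[float]:
    """Median across ensemble members at each time step.

    simulations[m][k] is the output of member m at time step k.
    """
    return [median(step) for step in zip(*simulations)]
```

Because the median ignores extreme members, a single badly initialised model has little effect on the ensemble hydrograph.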

Towards knowledge improvement about processes
Even if neural networks generally implement black-box models, several authors have tried to make the model more understandable. For example, Johannet et al. (2008) 2009) demonstrated the possibility of observing physically interpretable information at the output of hidden neurons. Another path would be to exploit parameter values. Several works have constrained the model using physical knowledge at the level of the parameters, for example to select the best input set or to select the most physically plausible model (for example, the parameters linked to the evapotranspiration input must be positive) (Olden and Jackson, 2002; Kingston et al., 2006). Considering individual parameter values, another goal would be to verify that the neural network model truly represents a physical relation (Mount et al., 2013).
Focusing on parameters, the principal difficulty is the sensitivity of their values to the initialisation before training. This dependence can be avoided using statistical treatments as proposed by Kingston et al. (2006). Kong-A-Siou et al. (2013) used a multistep procedure to extract knowledge: (i) propose a postulated model that describes the available high-level knowledge about the behaviour of the system to be modelled; (ii) implement a neural model architecture that follows this postulated model, where each box of the block diagram is implemented using a multilayer perceptron (or a unique linear neuron); (iii) train an ensemble of identical models that differ by their initialisation, and calculate the median of the absolute value of each parameter over the ensemble of models (noted as the median parameter); and (iv) combine median parameters in a chain of causality to quantify the role of each input variable. Compared to other works that perform a similar parameter chain-based calculation and apply constraints at the level of parameters or inputs (Kingston et al., 2006), this method is original because it applies constraints at the level of the processes identified in the block diagram (postulated model). The block diagram of the postulated model indicates that some processes are possible and others are not. It thus reduces the number of parameters and, in this way, the complexity of the model and the multi-finality of parameter values. The sign of a parameter is not important, as the product of two negative parameters is positive in the chain of parameter products; for this reason, and in order to take advantage of the "black-box" capabilities of ANN, we do not want to constrain individual parameters. Kong-A-Siou et al. (2013) applied this method to the Lez karst aquifer to evaluate the groundwater contributions from different geographic zones to the discharge at the outlet. This methodology is called KnoX. Its accuracy was assessed on a fictitious model, whose processes were perfectly known, before being applied to a real aquifer.
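Step (iii) of the procedure, the median of the absolute value of each parameter over the ensemble, can be sketched as below. The representation of a trained model as a dictionary mapping a parameter name to its value is an assumption made for the example, as are the function and parameter names.

```python
from statistics import median
from typing import Dict, Sequence


def median_parameters(ensemble: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Median of |parameter| over ensemble members, parameter by parameter.

    ensemble[m][name] is the trained value of parameter `name` in member m.
    All members are assumed to share the same parameter names.
    """
    names = ensemble[0].keys()
    return {name: median(abs(member[name]) for member in ensemble)
            for name in names}
```

Taking absolute values before the median reflects the remark that the sign of an individual parameter carries no meaning once parameters are multiplied along a chain.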
In this study we propose to apply the KnoX method to quantify, spatially and temporally, the effect on flash floods of the different processes at work in a heterogeneous aquifer. The gauging station considered is Lavalette, at the entrance of Montpellier, and the time step is one hour. This case study on the Lez basin is very different from the work of Kong-A-Siou et al. (2013): in the present study we consider flash flooding at Lavalette (maximum discharge equal to 480 m³ s⁻¹), which has an important surface water contribution, whereas the previous work investigated daily run-off of underground water at the Lez spring (maximum discharge below 20 m³ s⁻¹). In the present study we investigate the improvement of knowledge about karst and non-karst (surface) flooding processes.

Performance criteria
Several criteria were used for model selection and performance assessment. The first is the Nash-Sutcliffe efficiency, hereafter referred to as $R^2$ (Nash and Sutcliffe, 1970). $R^2$ is used to perform model selection using cross-validation. The second is specifically flood oriented: the synchronous percentage of peak discharge, or $S_{PPD}$. The last, a purely temporal criterion, is the delay between measured and simulated flood peaks, hereafter referred to as $P_d$ (peak delay).

Nash-Sutcliffe efficiency
The Nash-Sutcliffe efficiency is the most widely used criterion for evaluating hydrological models. It is equivalent to the $R^2$ determination coefficient:

$$R^2 = 1 - \frac{\sum_{k=1}^{n}\left(y_p^{k} - y^{k}\right)^2}{\sum_{k=1}^{n}\left(y_p^{k} - \overline{y_p}\right)^2} \qquad (2)$$

where $k$ is discrete time, $n$ the number of time steps used to calculate $R^2$, $y^{k}$ the simulated discharge, $y_p^{k}$ the measured discharge, and $\overline{y_p}$ the measured mean discharge. The Nash score is not really convenient for assessing flood simulations as it takes into account errors on the whole event and not specifically on the peak. For this reason, other criteria were proposed.
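The Nash-Sutcliffe criterion can be computed directly from two discharge series of equal length. The following is a minimal sketch; the function and argument names are illustrative.

```python
from typing import Sequence


def nash_sutcliffe(simulated: Sequence[float], measured: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - (squared error) / (variance of measurements)."""
    mean_measured = sum(measured) / len(measured)
    squared_error = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    spread = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - squared_error / spread
```

A perfect simulation scores 1, and a simulation that is no better than the measured mean scores 0, which is why a good score over a whole event can still hide a poorly reproduced peak.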

Synchronous percentage of peak discharge
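The synchronous percentage of peak discharge is especially designed for the evaluation of flash-flood modelling. It is the ratio, expressed as a percentage, of the simulated discharge to the measured discharge at the time of the observed peak discharge, so that a value above 100 % indicates an overestimated peak:

$$S_{PPD} = 100\,\frac{y^{k_p^{max}}}{y_p^{k_p^{max}}} \qquad (3)$$

where $k_p^{max}$ is the time of the measured peak discharge.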
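A minimal sketch of this criterion is given below. It assumes the ratio is taken as simulated over measured discharge, which is consistent with the value of 138 % reported later for a model that overestimates the flood peak; the function name is illustrative.

```python
from typing import Sequence


def synchronous_peak_percentage(simulated: Sequence[float],
                                measured: Sequence[float]) -> float:
    """Simulated discharge at the time of the measured peak, in % of that peak."""
    k_peak = max(range(len(measured)), key=lambda k: measured[k])
    return 100.0 * simulated[k_peak] / measured[k_peak]
```

Because both discharges are read at the same time step, a simulated peak of the right height that arrives late still yields a low score.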

Delay between measured and simulated flood peaks
The delay between simulated and measured peak discharge is calculated using Eq. (4). A positive delay means a retarded simulated peak discharge. Conversely, a negative delay means an advanced simulated peak discharge. The peak delay can be expressed as

$$P_d = k^{max} - k_p^{max} \qquad (4)$$

where $k^{max}$ is the time of the simulated peak discharge and $k_p^{max}$ is the time of the measured peak discharge.
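The peak delay can be sketched as the difference between the indices of the two maxima, counted in time steps (hours in this study). The function name is illustrative, and ties are resolved by taking the first maximum, which is an assumption.

```python
from typing import Sequence


def peak_delay(simulated: Sequence[float], measured: Sequence[float]) -> int:
    """Time step of the simulated peak minus time step of the measured peak."""
    k_sim = max(range(len(simulated)), key=lambda k: simulated[k])
    k_obs = max(range(len(measured)), key=lambda k: measured[k])
    return k_sim - k_obs
```

A positive result therefore corresponds to a simulated peak that arrives after the measured one.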
3 Case study: the Lez aquifer

Lez hydrosystem
The Lez aquifer is a Mediterranean karst system located in south-eastern France upstream of Montpellier (Fig. 1). Its extent is estimated at about 380 km² (Bérard, 1983). The Lez spring is the main outlet of this aquifer, hereafter referred to as the "basin". Another major spring is the Lirou spring, which flows only during rain events. Both springs feed the Lez river, which crosses Montpellier and its conurbation, an area with a population of about 400 000. The recharge area, composed of karst outcrops and swallow holes, is estimated at about 130 km² (Dörfliger et al., 2008). The surface catchment, an area of about 120 km², hereafter referred to as the "watershed", is defined by its topographic setting at the outlet of Lavalette gauging station. As often with karst systems, the geographical areas of the watershed and the underground basin do not coincide. Due to complex geology, the recharge area extends to only a part of the watershed and underground basin. For this reason, the Lez aquifer is considered to be a hydrosystem.

Geological and tectonic settings
Similar to many karst systems, the Lez hydrosystem is composed of karst and non-karst components. The karst component crops out in the upstream part of the system; it underlies impervious formations in the downstream part. The karst component consists of Cretaceous and Jurassic carbonate rocks. The karst in these formations developed under the current Mediterranean Sea level as a result of the Messinian crisis (Hsü et al., 1973). These formations also crop out widely and form the calcareous plateaus of both the Causse de l'Hortus and the Causse de Viols-le-Fort. The downstream part of the system is composed of Eocene carbonate and clay formations and Tertiary sandstone and conglomerate formations.
Two major tectonic events have affected the geomorphological structure of the Lez hydrosystem. The first was Pyrenean compression, which occurred during the Eocene. This south-north compression led to the formation of east-west trending faults. The second tectonic event was the opening of the Gulf of Lion during the Oligocene. This event led to the formation of north-east-south-west sinistral faults, including the Corconne fault that crosses the Lez basin.

Meteorological and hydrogeological setting
The study area is subject to a Mediterranean climate. Mediterranean events often occur at the meso-scale and promote intense and localised rainfall. Daily rainfalls can reach 650 mm, such as one event that occurred in September 2002 in south-eastern France. Such high-volume rainfall events are referred to as Mediterranean episodes.

Hydrodynamic circulation
Kong-A-Siou et al. (2013) divided the Lez basin into four parts (Fig. 2) to better analyse the rainfall-run-off relationship at the Lez spring at a daily time step. The east-west division is based on the Corconne fault pathway. On the western side of the basin, the south-north division is based on the Causse de Viols-le-Fort boundary, which is an outcropping part of the principal aquifer. On the eastern side of the basin, a south-north division has been drawn based on its geological setting (impervious or non-impervious soils). The Oligocene and Eocene formations define a well-delineated impervious zone in the south-eastern part of the basin. The geological composition of each zone is assumed to be "homogeneous", which means that the geology within a zone is quite similar and that it differs more from the geology of other zones. Using the KnoX method, Kong-A-Siou et al. (2013) were able to estimate both the water contribution from each "homogeneous" geological zone to the Lez spring discharge and the mean time response. That study, which was conducted at a daily time step, shows the important contribution, more than half, of the north-eastern zone to the discharge of the Lez spring. These contributions are presented in Table 5.

Flash flooding in the Lez basin
Fed by abundant rainfall on the basin (245 mm in a few days), the Lez receives contributions from the surface watershed and also from the underground (karst) basin, principally through its tributary, the Lirou river. The Lez discharge can exceed 500 m³ s⁻¹ at its entrance to Montpellier. This corresponds to a specific discharge greater than 4 m³ s⁻¹ km⁻², based on the size of the surface watershed (120 km², see Sect. 3.1), or 1.3 m³ s⁻¹ km⁻² considering the whole underground basin (380 km², see Sect. 3.1). These two simple numbers highlight the need to better understand the origin of the water and the water circulations during flash floods at the Lavalette station at the entrance to Montpellier.
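The two specific discharges quoted above follow from dividing the peak discharge by the drained area. A short sketch of the arithmetic, using the figures given in the text (the function name is illustrative):

```python
def specific_discharge(discharge_m3_s: float, area_km2: float) -> float:
    """Discharge per unit area, in cubic metres per second per square kilometre."""
    return discharge_m3_s / area_km2


# Figures taken from the text: 500 m3/s, 120 km2 watershed, 380 km2 basin.
surface_value = specific_discharge(500.0, 120.0)      # about 4.2
underground_value = specific_discharge(500.0, 380.0)  # about 1.3
```

The factor of roughly three between the two values is simply the ratio of the two areas, which is why the choice of contributing area matters when interpreting the flood.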
To this end, two different approaches have been proposed in the literature, using event-based modelling. The first uses data assimilation (Kalman filter) to (i) estimate karst filling at the beginning of the event, (ii) adapt transfer velocity at each time step, and (iii) correct the lack of accuracy of rainfall measurement. Based on these improvements, the $R^2$ of the simulation increased from 0.89 to 0.91 for an event in December 2003, and from 0.72 to 0.98 for an event in September 2005 (Table 1). The model is based on the Soil Conservation Service production function coupled with a lag and route transfer function (Coustau et al., 2012). The second approach has operational goals and proposes a graphical method (abacus) to estimate flood peaks from forecast rain features and karst filling (Fleury et al., 2013). Using the abacus, the authors revised the estimated peak of the September 2005 event down to 460 m³ s⁻¹ from 480 m³ s⁻¹.
Thus, it appears that improved knowledge of karst-river interactions is critical. For this purpose, in the next section we propose to use the KnoX method to estimate the contribution of each zone of the Lez basin to flash-flood events.

Monitoring network
Hourly rainfall data are available at five rain gauges: Saint-Martin-de-Londres, Prades-le-Lez, Sommières, Vic-le-Fesq and Saint-Hippolyte-du-Fort. The French Weather Forecasting Service (Météo France) manages the first two gauges, and the Flood Forecasting Service of the Grand Delta (SPCGD) manages the last three gauges. Only the Prades-le-Lez rain gauge is inside the Lez system, but as pointed out in the Introduction, it is essential to make use of spatialised rainfall information. In addition, no data at the considered time step are available further south than the Prades-le-Lez rain gauge. Spatial rainfall variability is thus not correctly described in the southern part of the basin. This will limit the reliability of this study regarding the southern zone of the basin. Unfortunately, it is not convenient to use weather radar information in this basin because, due to the distance of the Nîmes radar (50 km), this information is not robust from one event to another and generally underestimates the rainfall value compared to the rain gauge measurements (Marchandise, 2007; Visserot, 2012); radar information is also not available for all events in the database. Discharge data are provided by the Lavalette gauging station managed by an office of the French ministry of ecology and sustainable development (DIREN). Both rainfall and discharge data are available at an hourly time step, which is convenient for flash-flood modelling.
The data suffer from high noise and uncertainty. The uncertainties of discharge measurements have been estimated at around ±20 % for flash floods. The uncertainty of rainfall measurements can be as high as ±10 to 20 %. Fifteen flood events whose peak discharges exceed 80 m³ s⁻¹ were selected (Table 2). Events 7 and 8 were the most intense; contrary to other intense events, events 13 and 8 occurred on dry soils.
4 Application of the KnoX method to flash flooding at Lavalette

From postulated model to neural network model
As presented in Sect. 2.2, the postulated model represents the schematic high-level information one has about the basin of interest. This a priori knowledge must be expressed using a block diagram, and each box of this diagram is implemented using a multilayer perceptron (or a unique linear neuron).

Postulated model
The postulated model describing flash-flood genesis at Lavalette station is based on the work of Kong-A-Siou et al. (2013), as the considered basin is the same (surface and underground). Recall that the primary difference is that flash floods are considered at hourly time steps at the Lavalette station in this study. Using continuous data at daily time steps at the Lez spring, Kong-A-Siou et al. (2013) showed that the north-eastern and north-western zones are the principal contributors to Lez spring discharge. To estimate the contributions of each zone to flooding at Lavalette, we distinguished two behaviours: surface (rapid if inside the impervious watershed) and underground (slower if infiltrated into karst outcrops or in faults: faults play the role of a drain in impervious parts of the basin inside and outside of the Lavalette surface watershed). Schematically, by looking at the map presented in Fig. 2 and following the previous reasoning, one can propose that the north-western zone would make a minor contribution to flash flooding at Lavalette because it is outside the surface (topographic) basin and because its underground time response is long (Table 5). The south-eastern zone would also have a minor impact because its impervious area is mostly outside the Lavalette watershed.
Regarding the south-western and north-eastern zones, it is difficult to propose an a priori quantification. It is thus not easy to estimate the principal contributors to flash flooding. Application of the KnoX method would provide this quantification. The postulated model of the basin behaviour is thus composed of four branches, each corresponding to a zone of the basin, involving surface and groundwater, and feeding a complex mixing process. The postulated model is represented in Fig. 3 in a grey block diagram.
The model used to apply the KnoX method is based on the multilayer perceptron; it follows the postulated model represented in Fig. 3 with four zones contributing to discharge at Lavalette station. As suggested by the KnoX method, to be able to identify the contribution of each zone to the discharge, a linear hidden neuron is added between the inputs and the layer of sigmoid neurons. These neurons are intended to represent rain that falls on each zone; they facilitate the estimation of the time response of water falling in each zone.

Input data
Inputs are mean rainfalls for each zone. These rainfalls are calculated using the Thiessen polygon method. Table 2 shows the weight of each rain gauge for each zone. It highlights the sparse spatial distribution of rainfall information in the south of the basin. Nevertheless, taking into account the importance of the stakes in this zone, and as the goal of this study is to better understand the behaviour of the basin in order to develop a well-suited monitoring strategy, we consider the rainfall information sufficient to carry out this study.
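The zone-averaged rainfall used as input is a weighted sum of gauge measurements, with Thiessen weights that sum to one within each zone. A minimal sketch follows; the weights shown are purely hypothetical placeholders and are not the values of the study's table.

```python
from typing import Dict


def zone_mean_rainfall(gauge_rain_mm: Dict[str, float],
                       thiessen_weights: Dict[str, float]) -> float:
    """Weighted mean rainfall over a zone for one time step.

    thiessen_weights maps a gauge name to the fraction of the zone area
    lying inside that gauge's Thiessen polygon (fractions sum to 1).
    """
    return sum(weight * gauge_rain_mm[gauge]
               for gauge, weight in thiessen_weights.items())


# Hypothetical weights for one zone, for illustration only.
example_weights = {"Saint-Martin-de-Londres": 0.75, "Prades-le-Lez": 0.25}
example_rain = {"Saint-Martin-de-Londres": 8.0, "Prades-le-Lez": 4.0}
example_mean = zone_mean_rainfall(example_rain, example_weights)
```

When a zone is covered by a single gauge polygon, its mean rainfall reduces to that gauge's record, which is how sparse coverage in the south limits the spatial information available to the model.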

Model selection
As presented in Sect. 2.1.2, model selection is done using cross-validation and a pre-defined number of training iterations. The ranges investigated and the chosen values of the various window widths and numbers of hidden neurons are provided in Table 3. One can note that the complexity of the model is moderate (small number of hidden neurons). To make the model assessment more reliable on the most intense events, 7 and 8, model selection was done without these events (blind assessment).

Model validation
The database presented in Table 4 shows seven flash-flood events. Because of the small number of events and their heterogeneity, it seemed necessary to estimate modelling quality on all events. We thus decided to train seven models, testing each on one event (training performed on the six other events). The model tested on event n is noted as $T_n$. This is a cross-test operation. Table 4 shows the performance of the seven models in terms of $R^2$, synchronous percentage of peak discharge ($S_{PPD}$), and peak delay ($P_d$). After training, we compared the quality of the models: aside from model $T_2$, the $R^2$ and $S_{PPD}$ scores of model $T_{13}$ are the worst, respectively 0.71 and 138 %. The other models show satisfactory $R^2$ and $S_{PPD}$ scores: $R^2$ from 0.79 to 0.96 and $S_{PPD}$ from 87 to 99 %. Regarding $P_d$, only model $T_2$ performed badly. The models $T_4$, $T_7$, $T_8$, and $T_{14}$ are efficient regarding the three performance criteria. Model $T_{13}$ overestimates the flood peak; note that event 13 is the sole event that occurred on dry soils, apart from event 8, during which extremely intense rainfall was observed.

Table 4. Performances of models $T_2$, $T_4$, $T_6$, $T_7$, $T_8$, $T_{13}$ and $T_{14}$: Nash criterion ($R^2$), synchronous percentage of the peak discharge ($S_{PPD}$) and peak delay ($P_d$). $T_7$ and $T_8$ are the models tested on the two most intense events; they are highlighted in bold.
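The cross-test operation can be sketched as a leave-one-event-out split: each event is held out once for testing while the remaining events are used for training. The event numbers are those of the seven models named in the text; the function name and the tuple layout are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def cross_test_splits(events: Sequence[int]) -> List[Tuple[int, List[int]]]:
    """Return (tested event, training events) pairs, one pair per event."""
    return [(tested, [e for e in events if e != tested]) for tested in events]


# Events corresponding to models T2, T4, T6, T7, T8, T13 and T14.
splits = cross_test_splits([2, 4, 6, 7, 8, 13, 14])
```

Each of the seven models is therefore assessed on an event it has never seen, at the cost of training seven separate ensembles.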

Models
Looking at the hydrographs presented in Fig. 4 for the two most intense events and taking into account the scores presented in Table 4, one can suggest that the models are efficient enough to be used for knowledge extraction. In addition, as will be shown in Sect. 4.3.1, knowledge extraction is independent of outliers as it takes into account all events of the training database.

Contributions and time transfers of spatial rainfall to discharge at the Lavalette station
The KnoX method was used to estimate the contributions of the four previously defined zones to flash flooding at the Lavalette station.
where $H_z$ ($H_z = 1, \ldots, 4$) is the subscript of the first hidden layer of linear neurons, $H_N$ ($H_N = 1, \ldots, N_c$) is the subscript of the second hidden layer (of $N_c$ non-linear neurons); $q_d$ is the subscript of the previously measured discharge inputs $y_q$, and $o$ is the subscript of the output layer.
The contribution of an entire zone can be expressed as the sum of the contributions of the considered zone at the different time steps. This contribution is calculated for each exogenous input (rainfall or measured discharge) and for each designed model ($T_n$, $n = 1, \ldots, 7$). The contribution of the previously measured discharges used as input to the model ranges from 21 to 30 % (and thus from 79 to 70 % for total rainfall) depending on the considered model $T_n$. Nevertheless, only rainfall contribution values are considered (for a total of 100 %) because the measured discharge input plays the role of a state variable (Artigue et al., 2012). Rainfall contribution medians for the seven models are provided in Table 5. Values obtained by Kong-A-Siou et al. (2013) are also reported; they show the difference between contributions of the same zones to very different processes (flash flood at Lavalette station for this study, and daily aquifer discharge at the Lez spring in the 2013 study).
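The aggregation described above (sum over time steps within a zone, then rescaling so that rainfall contributions alone total 100 %) can be sketched in Python. The per-delay values below are hypothetical and chosen only so that rainfall accounts for 75 % of the output, with the remainder attributed to the measured discharge inputs; they are not results of this study, and the function name is an assumption.

```python
import numpy as np

def rainfall_only_percentages(rain_contrib):
    """Sum each zone's contributions over delays, then rescale so that
    the rainfall contributions alone add up to 100 %."""
    totals = {zone: float(np.sum(values)) for zone, values in rain_contrib.items()}
    rain_total = sum(totals.values())
    return {zone: 100.0 * total / rain_total for zone, total in totals.items()}

# Hypothetical contributions (fractions of the output) per delay k-1 ... k-5.
rain_contrib = {
    "south-west": [0.10, 0.05, 0.05, 0.10, 0.06],
    "north-east": [0.04, 0.06, 0.02, 0.02, 0.04],
    "north-west": [0.02, 0.02, 0.02, 0.02, 0.01],
    "south-east": [0.05, 0.03, 0.02, 0.01, 0.01],
}
percentages = rainfall_only_percentages(rain_contrib)
```

Repeating this for each of the seven models and taking the median per zone would give figures of the kind reported in Table 5.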

Time distribution of contributions
Figure 5 shows the time distributions of contributions by the north-western, north-eastern, south-western, and south-eastern rainfall inputs. The percentages expressed in this section are the contributions of the inputs to the output.
Figure 5 shows that the major contribution comes from the south-western zone, with two peaks at k-1 and k-4 to k-5. This means that, on average, for all events and all time steps, water comes principally from the south-western zone via two transfer functions: one associated with rapid surface response (k-1) and the other associated with slower karst response (k-4 to k-5) (Causse de Viols-le-Fort, cf. Figs. 1, 2). The same reasoning can be applied to the north-eastern zone: fast surface response at k-2 and slower karst water at k-5 (due to numerous faults in this zone, cf. Fig. 2); nevertheless, contributions from the north-eastern zone are less pronounced than the south-western ones.

Rainfalls contributions to discharge
The map shown in Fig. 2 and Table 5 can guide the discussion: Fig. 2 presents the transcription of geological properties into infiltration capabilities.
- Regarding the south-western zone (43-54 %), it appears that the large karst delayed contribution (24 % for k-4 to k-5) comes from the Causse de Viols-le-Fort. This property is not observed in daily continuous modelling (Table 5) because the Lirou spring (outlet of the Causse de Viols-le-Fort, cf. Fig. 1) is an intermittent spring that flows only in wet conditions; moreover, this part of the aquifer is pumped for drinking water during the dry season.
- Regarding the north-eastern zone, the second largest contributor to flash flooding at Lavalette (18-30 %), a careless analysis could lead to the conclusion that it may be the major contributor because it has a large impervious basin within the surface watershed of the Lez at Lavalette. However, significant losses occur through numerous faults in the southern part of this zone (cf. Fig. 2). As in the south-western zone, two contributions play a role: surface (rapid) and underground (slow) (recall that the contribution reflects the behaviour of the entire training database; thus, this schematic behaviour can be assumed  ground basin. Indeed, a dye tracing experiment demonstrated water circulation between sinkholes in river tributaries of the Vidourle (east of Lez basin, cf. Fig. 2) (Bérard, 1983).
- In the north-western zone, the two behaviours (flash flooding at Lavalette and daily run-off at the Lez spring) differ greatly. For flash floods at Lavalette, the north-western zone has a weak influence, which is consistent with the representation of the basin in Fig. 2 (perched aquifer delaying water transfers, and limited infiltration along the Corconne fault owing to its low infiltration capability); for daily run-off at the Lez spring, conversely, delayed transfer and permanent infiltration along faults increase the storage and thus contribute more to daily run-off (28-31 %).
- Lastly, the south-eastern zone has a lesser effect on flash flooding due to its small area in the watershed at Lavalette (12-24 %). One can observe a relatively large variability in Fig. 5. This may be a limit of the work due to (i) the high sensitivity of this small, fully impervious area to localised heavy rainfall, combined with the poor representation of the rainfall variability in this zone (Sect. 3.6.1), or (ii) the heterogeneity of events that influences the training. For daily run-off at the Lez spring, this zone can be excluded from the recharge basin (4-7 %); this is consistent with the information in Fig. 2, as the zone is composed of impervious formations downstream of the spring.

Time behaviour
Temporal contributions within each zone are shown in Fig. 5. As analysed previously, these contributions are consistent with dual behaviours: fast surface water and slower karst water. The sensitivity of these estimations with respect to the different models (seven models), shown by dotted points, does not contradict the proposed analysis.

Flash-flood simulations
Schematically, Fig. 5 shows that response times of 2 h (probably surface water) and 5-6 h (probably karst water) are not very different. Consequently, it is possible for karst water to add to surface flooding in the event of multi-peak rainfalls. This behaviour was underlined by Bailly-Comte et al. (2012), Coustau et al. (2012), and Fleury et al. (2013), who focussed on the importance of the initial water level inside the karst. Consequently, flash-flood simulations would require real-time piezometric information in both the north-eastern and south-western zones to estimate the influence of karst water in these two zones.
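The superposition of karst and surface water during multi-peak rainfall can be illustrated with a deliberately simplified Python sketch. It assumes a linear response with an hourly time step, a fast component at a lag of 2 h and a slow component at a lag of 5 h (taken from the response times quoted above); the amplitudes 0.6 and 0.4 and the unit rain pulses are hypothetical, and the neural network model itself is non-linear, so this is only a schematic picture.

```python
import numpy as np

# Assumed two-mode unit response (hourly steps): fast surface water at a
# lag of 2 h and slower karst water at a lag of 5 h. Amplitudes are hypothetical.
response = np.zeros(8)
response[2] = 0.6  # surface component
response[5] = 0.4  # karst component

def peak_for_separation(separation_h, response, n_steps=12):
    """Peak of the linear response to two unit rain pulses separated by
    `separation_h` hours (separation_h >= 1)."""
    rain = np.zeros(n_steps)
    rain[0] = 1.0
    rain[separation_h] += 1.0
    return float(np.convolve(rain, response).max())

peak_3h = peak_for_separation(3, response)  # karst water of pulse 1 meets surface water of pulse 2
peak_1h = peak_for_separation(1, response)  # the two components do not coincide
```

Under these assumptions, a separation equal to the lag between the two response times produces the highest peak, because the slow component of the first pulse arrives together with the fast component of the second.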
T. Darras et al.: Spatial and temporal contributions of rainfalls

Limits of the study
The KnoX method is a novel tool for investigating the behaviour of heterogeneous basins. Because this method is still under development, the sensitivity of the provided estimations to noise, uncertainty, and small database size has not yet been fully assessed. Nevertheless, the overview of the Lez aquifer that this method has provided appears to be quite consistent with current knowledge.
Based on the proposed behaviour of the Lez aquifer, several fieldwork projects are currently in progress to assess karst and non-karst contributions at the Lavalette station.

Conclusion
Mediterranean flash floods and mountain floods are responsible for numerous casualties and major property damage. These floods occur in heterogeneous basins, which are difficult to observe and thus to model. For this reason, this paper investigates the ability to obtain information on a complex aquifer through global systemic modelling using neural networks. For this purpose we chose as a case study flash flooding at the entrance to the large city of Montpellier (southern France), where large potential losses are at stake. After reviewing recent trends in flash flooding and karst modelling, this paper focuses on hydrological modelling with neural networks and presents the basics of neural network modelling. It was shown that these statistical models can efficiently model unknown relationships using only databases. Moreover, efficient new approaches were demonstrated to extract information from a set of parameters. Among these methods, the KnoX method can identify contributions from various geographic zones to discharge at the basin outlet; it also provides better characterisation of processes linked to karst water and surface water. To investigate this capability, a case study was conducted on a complex hydrosystem, the Lez hydrosystem. The application to this system shows that the KnoX method consistently estimated the water contributions from four "homogeneous" geological zones of the hydrosystem to the discharge at its outlet. The main contributor to flash flooding at Lavalette was identified as the Causse de Viols-le-Fort karst plateau. Piezometric information within this plateau would thus be of crucial importance to model flooding at the Lavalette station. On a more interesting note, several time responses were identified and associated with surface circulations or underground contributions. The lag between these two different response times, estimated at 3 h, may thus correspond to a synchronisation difference between surface and underground flooding. This information may help flood warning services anticipate the size of a flood in the case of a rain event composed of two rain peaks separated by 3 h. This is a generic method that can be applied to any heterogeneous basin as long as a sufficient database is available.

Figure 1. Map of the Lez hydrosystem with location of karst outcrops, rain gauges, gauging stations, springs, Causses de Viols-le-Fort and de l'Hortus and of Corconne fault. Boundaries of surface watershed, underground basin and urban zones are also shown.

Figure 2. Map of the Lez basin: zone boundaries and topographic watershed, impervious and non-impervious formations, faults intensifying infiltration.

Figure 3. Postulated model: grey block diagram. Three-layer multilayer perceptron with a linear hidden layer between the rainfall inputs and the non-linear layer. Parameters used in Eq. (4) are denoted in red.

Figure 4. Hydrographs of major events in the database: events 7 and 8. Simulated discharge is the median of the outputs of the 50 model runs (differing by their parameter initialisation). Uncertainty on the observed value is the measurement uncertainty (20 %). Uncertainty on the simulated value is represented by the simulations of the 50 model runs.
of information from parameters
After training, the median of the absolute values of the parameters over 50 different initialisations is calculated. It is noted as $M_{C_{ij}}$ for the parameter $C_{ij}$ linking the neuron (or input) $j$ to the neuron $i$. The rainfall contribution of zone $z$ to the output at time step $k - d$ ($k$ is the discrete time and $d$ a delay) is denoted as $r_z(k - d)$. It is calculated according to the chain of parameters linking one input, $r_z(k - d)$, to the output $y(k)$. As shown in Fig. 3, there are three layers of parameters between the input $r_z(k - d)$ and the output $y(k)$; therefore, there are three terms in the numerator. The denominator corresponds to normalisation terms used to estimate the specific contribution of the input $r_z(k - d)$ relative to the sum of all other parameters of the same layer. There are also three normalisation terms because there are three layers of parameters. The following notations are reported in red in Fig. 3. The contribution is calculated as
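A minimal Python sketch of a contribution calculation consistent with this verbal description is given below. It is not the authors' Eq. (4): the exact normalisation (each parameter divided by the sum of the parameters arriving at the same neuron), the omission of bias terms, the array layout, and the function names are all assumptions made for illustration, and the weight values are invented.

```python
import numpy as np

def median_abs(params_per_run):
    """Median over runs (axis 0) of the absolute parameter values."""
    return np.median(np.abs(np.asarray(params_per_run, dtype=float)), axis=0)

def knox_like_contributions(m_in, m_hid, m_out):
    """Normalised product of median absolute parameters along each path.

    m_in  : list of 1-D arrays; m_in[z][d] links rainfall input r_z(k-d)
            to the linear neuron of zone z
    m_hid : array (N_c, n_zones + n_q); columns are the linear neurons,
            then the previous-discharge inputs, feeding the non-linear neurons
    m_out : array (N_c,), links the non-linear neurons to the output
    Returns (rain, discharge); all returned contributions sum to 1.
    """
    m_hid = np.asarray(m_hid, dtype=float)
    m_out = np.asarray(m_out, dtype=float)
    hid_share = m_hid / m_hid.sum(axis=1, keepdims=True)  # share within each non-linear neuron
    out_share = m_out / m_out.sum()                       # share within the output neuron
    path = out_share @ hid_share                          # one value per linear neuron / discharge input
    n_zones = len(m_in)
    rain = [np.asarray(w, dtype=float) / np.sum(w) * path[z] for z, w in enumerate(m_in)]
    discharge = path[n_zones:]
    return rain, discharge

# Invented example: 2 zones with 2 delays each, 1 discharge input, N_c = 2.
m_in = [[1.0, 3.0], [2.0, 2.0]]
m_hid = [[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
m_out = [3.0, 1.0]
rain, discharge = knox_like_contributions(m_in, m_hid, m_out)
med = median_abs([[1.0, -2.0], [-3.0, 4.0], [5.0, -6.0]])
```

With this normalisation the contributions of all inputs (rainfall and measured discharge) add up to 1, which makes it straightforward to report them as percentages of the output.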

Figure 5. Median and total spread of the time distributions of the north-western, north-eastern, south-western and south-eastern rainfall input contributions, calculated from the parameters of the seven designed models.

Table 1. Dates, peak discharges, and mean cumulative rainfalls of flood events contained in the database. Intense events are highlighted by a star and in bold. Mean cumulative rainfall is calculated using a weighted average of the five rain gauges with the Thiessen polygon method.

Table 2. Percentage contribution of each rain gauge to the rainfall of each zone and of the whole Lez basin, from Thiessen polygons.

Table 3. Optimisation of the rainfall temporal window widths.

Table 5. Contributions (in bold) of different zones to discharge. Flash-flood contribution is the median of the contributions of rainfall inputs to the output of the seven models $T_2$, $T_4$, $T_6$, $T_7$, $T_8$, $T_{13}$ and $T_{14}$. Maximum and minimum values come from the set of 7 models in this study and from 10 experiments of 50 initialisations in Kong-A-Siou et al. (2013).