Can assimilation of crowdsourced data in hydrological modelling improve flood prediction?

Abstract

Monitoring stations have been used for decades to measure hydrological variables and improve flood prediction. To this end, methods to incorporate these observations into mathematical water models have also been developed. In addition, in recent years, continued technological advances, in combination with the growing inclusion of citizens in participatory processes related to water resources management, have encouraged the growth of citizen science projects around the globe. In turn, this has stimulated the spread of low-cost sensors that allow citizens to participate in the collection of hydrological data in a more distributed way than classic static physical sensors do. However, two main disadvantages of such crowdsourced data are their irregular availability and variable accuracy from sensor to sensor, which make them challenging to use in hydrological modelling. This study aims to demonstrate that streamflow data derived from crowdsourced water level observations can improve flood prediction if integrated into hydrological models. Two different hydrological models, applied to four case studies, are considered. Realistic (albeit synthetic) time series are used to represent crowdsourced data in all case studies. It is found that the accuracy of the data has much more influence on the model results than the irregular frequency at which the streamflow data are assimilated. This study demonstrates that data collected by citizens, characterized by being asynchronous and inaccurate, can still complement traditional networks formed by a few accurate, static sensors and improve the accuracy of flood forecasts.


Introduction
Observations of hydrological variables measured by physical sensors have been increasingly integrated into mathematical models by means of model updating methods. The use of these techniques allows for the reduction of intrinsic model uncertainty and improves flood forecasting accuracy (Todini et al., 2005). The main idea behind model updating techniques is to update model input, states, parameters, or outputs as new observations become available (Refsgaard, 1997; WMO, 1992). Input updating is the classical method used in operational forecasting, and uncertainties of the input data can be considered the main source of uncertainty of the model (Bergström, 1991; Canizares et al., 1998; Todini et al., 2005). Regarding state updating, filtering methods such as the Kalman filter (Kalman, 1960), the extended Kalman filter (Aubert et al., 2003; Madsen and Cañizares, 1999; Verlaan, 1998), the ensemble Kalman filter (Evensen, 2006), and the particle filter (Weerts and El Serafy, 2006) are the most used approaches to update a model when new observations are available.
Due to the complex nature of hydrological processes, spatially and temporally distributed measurements are needed in model updating procedures to ensure proper flood prediction (Clark et al., 2008; Mazzoleni et al., 2015; Rakovec et al., 2012). However, traditional physical sensors require proper maintenance and personnel, which can be cost prohibitive for a vast network. For this reason, improvements in monitoring technology have led to the spread
of low-cost sensors to measure hydrological variables, such as water level or precipitation, in a more distributed way. The main advantage of this type of sensor, defined in this paper as a "social sensor", is that it can be used not only by technicians but also by regular citizens, and that, owing to the reduced cost and the voluntary labour of citizens, such sensors result in more spatially distributed coverage. The idea of designing these alternative networks of low-cost social sensors and using the obtained crowdsourced observations is the basis of the European project WeSenseIt (2012-2016) and various other projects that proposed to assess the usefulness of crowdsourced observations obtained by low-cost sensors owned by citizens. For instance, in the CrowdHydrology project (Lowry and Fienen, 2013), a method was developed to monitor stream stage at designated gauging staffs using crowdsourced text messages of water levels sent by untrained observers. Cifelli et al. (2005) described a community-based network of volunteers (CoCoRaHS) engaged in collecting precipitation measurements of rain, hail, and snow. An example of citizen-based hydrological monitoring of rainfall and streamflow, established in 2009 within the Andean ecosystems of Piura, Peru, is reported in Célleri et al. (2009). Degrossi et al. (2013) used a network of wireless sensors to map the water level in two rivers passing through Sao Carlos, Brazil. Recently, the iSPUW project was initiated to integrate data from advanced weather radar systems, innovative wireless sensors, and crowdsourcing via mobile applications in order to better predict flood events for the Dallas-Fort Worth Metroplex urban water systems (ISPUW, 2015; Seo et al., 2014). Other examples of crowdsourced water-related information include the so-called Crowdmap platform for collecting and communicating information about the 2011 floods in Australia (ABC, 2011) and for informing citizens about the proper time for water supply in an intermittent water system (Alfonso, 2006; Au et al., 2000; Roy et al., 2012). Wehn et al. (2015) stressed the importance of public participation in water resources management to ensure citizens' involvement in the flood management cycle. Buytaert et al. (2014) provide a detailed review of citizen science applications in hydrology and water resources science, exploring the potential of citizen science, based on robust, cheap, and low-maintenance sensing equipment, to complement more traditional ways of scientific data collection for hydrological sciences and water resources management.
Traditional hydrological observations from physical sensors have a well-defined structure in terms of frequency and accuracy. Crowdsourced observations, on the other hand, are provided by citizens with varying experience of measuring environmental data and little connection with one another; as a consequence, low correlation between the measurements might be observed. So far, in operational hydrological practice, crowdsourced data are not integrated into forecasting models but only used to compare model results with observations in post-event analyses. This can be related to the intrinsically variable accuracy, due to the lack of confidence in the data quality from these heterogeneous sensors, and to the variable lifespan of the crowdsourced observations. Regarding data quality, Bordogna et al. (2014) and Tulloch and Szabo (2012) stated that quality control mechanisms should consider contextual conditions to deduce indicators of reliability (the expertise level of the crowd), credibility (the volunteer group), and performance of volunteers as they relate to accuracy, completeness, and precision. Bird et al. (2014) addressed the issue of data quality in conservation ecology by means of new statistical tools to assess random error and bias. Cortes Arevalo et al. (2014) evaluated data quality by distinguishing in situ data collected by volunteers from those collected by technicians and comparing the most frequent value reported at a given location. With in situ exercises, it might be possible to obtain an indication of the reliability of the collected data. However, this approach is not enough at an operational level to define data accuracy. For this reason, to estimate observation accuracy in real time, one possible approach could be to filter the measurements following a geographic approach which defines semantic rules governing what can occur at a given location (e.g. Vandecasteele and Devillers, 2013). Another approach could be to compare measurements collected within a predefined time window in order to calculate the most frequent value, the mean, and the standard deviation.
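The latter, window-based approach can be sketched in a few lines. The snippet below is an illustrative implementation, not code from the paper: it aggregates citizen reports inside a time window and flags values far from the window mean (the 1.5-standard-deviation threshold is an assumption chosen for the example).

```python
import statistics

def window_stats(observations, t_start, t_end):
    """Aggregate crowdsourced (time, value) pairs falling in a time window.

    Returns the most frequent value, the mean, and the standard deviation,
    which can then be compared against each report to flag likely outliers.
    """
    values = [v for t, v in observations if t_start <= t < t_end]
    if len(values) < 2:
        return None  # too few reports to judge quality
    return (statistics.mode(values),
            statistics.mean(values),
            statistics.stdev(values))

# Five citizen water-level reports (time in hours, level in metres);
# the last one is a typo-like outlier.
obs = [(0.1, 1.2), (0.3, 1.2), (0.5, 1.3), (0.7, 1.2), (0.9, 2.8)]
mode, mean, std = window_stats(obs, 0.0, 1.0)
suspect = [v for _, v in obs if abs(v - mean) > 1.5 * std]
```

In an operational setting the flagged values could either be discarded or assimilated with a larger observation error.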
Crowdsourced observations can be defined as asynchronous because they follow no predefined rules about arrival frequency (an observation might be taken once, occasionally, or at irregular time steps, which can be smaller than the model time step) or measurement accuracy. In a recent paper, Mazzoleni et al. (2015) presented a study of the effects of distributed synthetic streamflow observations, having synchronous intermittent temporal behaviour and variable accuracies, in a semi-distributed hydrological model. It was shown that integrating distributed uncertain intermittent observations with single measurements coming from physical sensors allows for further improvements in model accuracy. However, that study did not consider the possibility that asynchronous observations might arrive at moments not coordinated with the model time steps. A possible solution for handling observations asynchronous in time with the ensemble Kalman filter (EnKF) is to assimilate them at the moments coinciding with the model time steps (Sakov et al., 2010). However, as these authors mention, this approach requires the disruption of the ensemble integration, an ensemble update, and a restart, which may not be feasible for large-scale forecasting applications. Continuous assimilation approaches, such as three-dimensional and four-dimensional variational methods (3D-Var and 4D-Var), are usually implemented in oceanographic modelling in order to integrate asynchronous observations at their corresponding arrival moments (Derber and Rosati, 1989; Huang et al., 2002; Macpherson, 1991; Ragnoli et al., 2012). In fact, oceanographic observations are commonly collected at asynchronous times. For this reason, in variational data assimilation, past asynchronous observations are simultaneously used to minimize a cost function that measures the weighted difference between background states and observations over the time interval, and so identify the best estimate of the initial state condition (Drecourt, 2004; Ide et al., 1997; Li and Navon, 2001). In addition to the 3D-Var and 4D-Var methods, Hunt et al. (2004) proposed a four-dimensional ensemble Kalman filter (4DEnKF) which adapts the EnKF to handle observations that occur at non-assimilation times. Furthermore, for linear dynamics, 4DEnKF is equivalent to the instantaneous assimilation of the measured data (Hunt et al., 2004). Similarly to 4DEnKF, Sakov et al. (2010) proposed a modification of the EnKF, the asynchronous ensemble Kalman filter (AEnKF), to assimilate asynchronous observations (Rakovec et al., 2015). Contrary to the EnKF, in the AEnKF current and past observations are simultaneously assimilated at a single analysis step without the use of an adjoint model. Yet another approach to assimilating asynchronous observations is the so-called first-guess at the appropriate time (FGAT) method. Like 4D-Var, FGAT compares the observations with the model at the observation time. However, in FGAT the innovations are assumed constant in time and remain the same within the assimilation window (Massart et al., 2010). In light of the reviewed approaches, and owing in part to the linearity of the hydrological models implemented here, this study uses a pragmatic method to assimilate the asynchronous crowdsourced observations.
The main objective of this study is to assess the potential use of crowdsourced data within hydrological modelling. In particular, the specific objectives are (a) to assess the influence of different arrival frequencies and accuracies of crowdsourced data from a single social sensor on the assimilation performance and (b) to integrate distributed low-cost social sensors with a single physical sensor to assess the improvement in streamflow prediction in an early warning system. The methodology is applied in the Brue (UK), Sieve (Italy), Alzette (Luxembourg), and Bacchiglione (Italy) catchments, using a lumped hydrological model for the first three and a semi-distributed model for the Bacchiglione. Synthetic time series, asynchronous in time and with random accuracies, are generated and used to imitate the crowdsourced data.
The study is organized as follows. Firstly, the case studies, the crowdsourced data, and the datasets used are presented. Secondly, the hydrological models, the procedure used to integrate the crowdsourced data, and the set of experiments are reported. Finally, the results, discussion, and conclusions are presented.
Site locations and data

Case studies
Four different case studies are used to validate the obtained results for areas having diverse topographical and hydro-meteorological features and represented by two different hydrological models. The Brue, Sieve, and Alzette catchments are considered because of the availability of precipitation and streamflow data, while the Bacchiglione catchment is one of the official case studies of the WeSenseIt Project (Huwald et al., 2013).

Brue catchment
The first case study is the Brue catchment (Fig. 1), in Somerset, with a drainage area of about 135 km² at the catchment outlet in Lovington. The Shuttle Radar Topography Mission digital elevation model (SRTM DEM) at 90 m resolution is used to derive the topographical characteristics, the streamflow network, and the resulting time of concentration, about 10 h, computed by means of the Giandotti equation (Giandotti, 1933). The hourly precipitation (49 rainfall stations) and streamflow data used in this study are supplied by the British Atmospheric Data Centre from the HYREX (Hydrological Radar Experiment) project (Moore et al., 2000; Wood et al., 2000). The average precipitation value in the catchment is estimated using ordinary kriging (Matheron, 1963).
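The Giandotti time of concentration can be computed directly from the catchment descriptors. Below is a minimal sketch of the classic formula, t_c = (4√A + 1.5L) / (0.8√ΔH); the channel length and mean relief used in the call are illustrative guesses, not values from the paper (only A = 135 km² is given above).

```python
import math

def giandotti_tc(area_km2, channel_km, mean_relief_m):
    """Giandotti (1933) time of concentration [hours].

    area_km2      -- drainage area A [km^2]
    channel_km    -- main channel length L [km]
    mean_relief_m -- mean catchment elevation above the outlet [m]
    """
    return (4.0 * math.sqrt(area_km2) + 1.5 * channel_km) / (
        0.8 * math.sqrt(mean_relief_m))

# Brue-like numbers: A = 135 km^2; L and relief are assumed for illustration.
tc = giandotti_tc(135.0, 20.0, 92.0)   # on the order of 10 h
```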

Sieve catchment
The second case study is the Sieve catchment (Fig. 1), a tributary of the Arno River, located in the central Apennines, Italy. The catchment has a drainage area of about 822 km² and a length of 56 km, and it covers mostly hilly and mountainous areas with an average elevation of 470 m above sea level. The time of concentration of the Sieve catchment is about 12 h. Hourly streamflow data are provided by the Centro Funzionale di Monitoraggio Meteo Idrologico-Idraulico of the Tuscany Region at the outlet section of the catchment at Fornacina. The mean areal precipitation is calculated by the Thiessen polygon method using 11 rainfall stations (Solomatine and Dulal, 2003).
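The Thiessen polygon method amounts to nearest-station weighting: each point of the catchment takes the rainfall of its closest gauge, and the station weights are the fractions of area assigned to each gauge. A toy sketch on a regular grid follows; the two-station layout and rainfall values are invented for illustration (the Sieve application uses 11 stations).

```python
import numpy as np

def thiessen_mean(stations_xy, rain_mm, grid_x, grid_y):
    """Mean areal precipitation by Thiessen (nearest-station) polygons.

    Each grid cell is assigned the rainfall of its nearest station; the
    station weights are the fractions of cells assigned to each station.
    """
    gx, gy = np.meshgrid(grid_x, grid_y)
    cells = np.column_stack([gx.ravel(), gy.ravel()])
    # squared distance from every cell to every station
    d2 = ((cells[:, None, :] - stations_xy[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)                       # Thiessen assignment
    weights = np.bincount(nearest, minlength=len(rain_mm)) / len(cells)
    return float(np.dot(weights, rain_mm))

stations = np.array([[0.0, 0.0], [10.0, 0.0]])        # two gauges (km)
rain = np.array([4.0, 8.0])                           # hourly totals (mm)
mean_p = thiessen_mean(stations, rain,
                       np.linspace(0, 10, 11), np.linspace(-5, 5, 11))
```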

Alzette catchment
The Alzette catchment is located largely in the Grand Duchy of Luxembourg. The drainage area of the catchment is about 288 km², and the river has a length of 73 km across France and Luxembourg. The catchment covers cultivated land, grassland, forest, and urbanized land (Fenicia et al., 2007). The Thiessen polygon method is used to average the series at the individual stations and calculate the hourly rainfall series (Fenicia et al., 2007), while streamflow data are measured at the Hesperange gauging station.

Bacchiglione catchment
The last case study is the upstream part of the Bacchiglione River basin, located in the north-east of Italy, a tributary of the Brenta River, which flows into the Adriatic Sea south of the Venetian Lagoon and north of the Po River delta. The study area has an overall extent of about 400 km² and a river length of about 50 km (Ferri et al., 2012). The main urban area in the downstream part of the study area is Vicenza. The analysed part of the Bacchiglione River has three main tributaries: on the western side are the confluences of the Bacchiglione with the Leogra and Orolo rivers, while on the eastern side is the Timonchio River (see Fig. 2). The Alto Adriatico Water Authority (AAWA) has implemented an early warning system to forecast possible future flood events.

Crowdsourced data
Social sensors can be used by citizens to provide crowdsourced, distributed hydrological observations such as precipitation and water level. An example of such a sensor is a staff gauge, linked to a quick response (QR) code, on which citizens can read the water level and send observations via a mobile phone application. Another example is the collection of rainfall data via lab-generated videos (Alfonso et al., 2015). Recently, within the activities of the WeSenseIt Project (Huwald et al., 2013), one physical sensor and three staff gauges complemented by QR codes were installed in the Bacchiglione River to measure the water level. In particular, the physical sensor is located at the outlet of the Leogra catchment, while the three social sensors are located at the outlets of the Timonchio, Leogra, and Orolo catchments, respectively (see Fig. 2).
It is worth noting that, in most cases, it is difficult to directly assimilate water level observations into hydrological models; at the same time, it is highly unrealistic to assume that citizens might observe streamflow directly. For this reason, crowdsourced observations of water level are used to calculate crowdsourced data (CSD) of streamflow by means of rating curves assessed for the specific river location, which can then be easily assimilated into hydrological models. It is because of both the uncertainty in rating curve estimation at the social sensor location and the error in the water level measurements that CSD have such low and variable accuracies when compared to streamflow data estimated from classic physical sensors. CSD are then assimilated into mathematical models as described in Fig. 3 ("overall information flow").
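The stage-to-discharge conversion is typically a fitted power law, Q = a(h − h0)^b. The sketch below illustrates this step; the coefficients a, h0, and b are hypothetical and would in practice be fitted for each staff-gauge location, which is precisely where part of the CSD uncertainty enters.

```python
def stage_to_discharge(h, a=15.0, h0=0.20, b=1.7):
    """Power-law rating curve Q = a * (h - h0)**b.

    Converts a citizen-reported water level h [m] into streamflow
    [m^3/s].  The coefficients a, h0, b are hypothetical placeholders;
    real values are fitted per river section.
    """
    if h <= h0:
        return 0.0          # below the datum of the rating curve
    return a * (h - h0) ** b

q = stage_to_discharge(1.20)   # a 1.20 m staff-gauge reading
```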
In most hydrological applications, streamflow data from physical sensors are derived (and integrated into hydrological models) at regular, synchronous time steps. In contrast, crowdsourced water level observations are obtained by diverse types of citizens at random moments (when a citizen decides to send data). Thus, from the modelling viewpoint, CSD have three main characteristics: (a) irregular arrival frequency (asynchronicity), (b) random accuracy, and (c) a random number of CSD received within two model time steps. Because streamflow CSD are not available in the case studies at the moment of this study, realistic synthetic CSD with these characteristics are generated ("considered information flow" in Fig. 3).
For the Brue, Sieve, and Alzette catchments, observed hourly streamflow data at the catchments' outlets are interpolated to represent CSD arriving at frequencies higher than hourly. For the Bacchiglione catchment, synthetic hourly CSD of streamflow are calculated using measured precipitation recorded during the considered flood events (post-event simulation) as input to the hydrological model of the Bacchiglione catchment. A similar approach, termed an "observing system simulation experiment" (OSSE), is commonly used in meteorology to estimate synthetic "true" states and measurements by introducing random errors in the state and measurement equations (Arnold and Dey, 1986; Errico et al., 2013; Errico and Privé, 2014). OSSEs have the advantage of making it possible to compare estimates to true states, and they are often used for validating data assimilation algorithms.
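A minimal generator of such synthetic CSD can be written by drawing random arrival times, interpolating the "true" hourly hydrograph at those moments, and perturbing the values. This is a sketch under stated assumptions: the 15 % relative error and the toy hydrograph are illustrative choices, not the error models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def synthetic_csd(hourly_q, n_obs, sigma_rel=0.15):
    """Generate asynchronous, noisy crowdsourced streamflow data.

    Draws n_obs random arrival times within the hourly record, linearly
    interpolates the 'true' flow at those moments, and perturbs it with
    multiplicative Gaussian noise (sigma_rel is an assumed relative
    error, chosen here only for illustration).
    """
    t_true = np.arange(len(hourly_q), dtype=float)       # hours
    t_obs = np.sort(rng.uniform(0, t_true[-1], n_obs))   # random arrivals
    q_true = np.interp(t_obs, t_true, hourly_q)
    q_obs = q_true * (1.0 + sigma_rel * rng.standard_normal(n_obs))
    return t_obs, q_obs

hydrograph = np.array([2.0, 5.0, 12.0, 9.0, 6.0, 4.0, 3.0])  # m^3/s
t_obs, q_obs = synthetic_csd(hydrograph, 10)
```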
Further details and assumptions regarding the characteristics of CSD and related uncertainty are provided in the next sections.

Datasets
Three flood events for each one of the four described catchments are considered to assess the assimilation of CSD in hydrological modelling.
For the Brue catchment, a 2-year time series (June 1994 to May 1996) of observed streamflow and precipitation data is available for model calibration and validation. For the Sieve catchment, on the other hand, only 3 months of hourly runoff, streamflow, and precipitation data (December 1959 to February 1960) are available (Solomatine and Shrestha, 2003). For the Alzette catchment, 2 years of hourly data (July 2000 to June 2002) are used for model calibration and validation (Fenicia et al., 2007). For these catchments, the observed precipitation values are treated as "perfect forecasts" and are fed into the hydrological model.
For the Bacchiglione catchment, three flood events that occurred in 2013, 2014, and 2016 are considered. In particular, the 2013 event had high intensity and resulted in several traffic disruptions at various locations upstream of Vicenza. The forecast time series of precipitation (3-day weather forecast) is used as input to the hydrological model. In all the case studies, the observed values of streamflow at the catchment outlet (Ponte degli Angeli for the Bacchiglione) are used to assess the performance of the hydrological model.

Lumped model
A lumped conceptual hydrological model is implemented to estimate the streamflow hydrograph at the outlet sections of the Brue, Sieve, and Alzette catchments. The choice of the model is based on previous studies performed in the Brue catchment (Mazzoleni et al., 2015). Direct runoff is the input to the conceptual model, and it is assessed by means of the Soil Conservation Service curve number method (Mazzoleni et al., 2015). The average curve number value within the catchment is calibrated by minimizing the difference between the simulated volume and the observed quick flow, using the method proposed by Eckhardt (2005), at the outlet section.
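For reference, the standard SCS curve number relation converts event rainfall into direct runoff depth. The following sketch implements the textbook form of the method (with the usual 0.2S initial abstraction); the rainfall depth and curve number in the example are arbitrary, not calibrated values from this study.

```python
def scs_runoff(p_mm, cn):
    """SCS curve number direct runoff (standard textbook form).

    p_mm -- event rainfall depth [mm]
    cn   -- curve number (0-100)
    Returns the direct runoff depth [mm].
    """
    s = 25400.0 / cn - 254.0          # potential maximum retention [mm]
    ia = 0.2 * s                      # initial abstraction
    if p_mm <= ia:
        return 0.0                    # all rainfall abstracted
    return (p_mm - ia) ** 2 / (p_mm + 0.8 * s)

runoff = scs_runoff(50.0, 75.0)       # illustrative storm and CN
```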
The main module of the hydrological model is based on the Kalinin-Milyukov-Nash (KMN; Szilagyi and Szollosi-Nagy, 2010) cascade,

Q(t) = ∫₀ᵗ I(τ) u(t − τ) dτ, with u(t) = 1/(k Γ(n)) (t/k)^(n−1) e^(−t/k),   (1)

where I is the model forcing (in this case direct runoff), n (the number of storage elements) and k (the storage coefficient, expressed in hours) are the two model parameters, and Q is the model output (streamflow in m³ s⁻¹). In this study, the parameter k is assumed to be a linear function of the time of concentration through a coefficient c_k. The discrete state-space form of Eq. (1) derived by Szilagyi and Szollosi-Nagy (2010) is used in this study to apply the data assimilation approach (Mazzoleni et al., 2015, 2016). The model calibration is performed by maximizing the Nash-Sutcliffe efficiency (NSE) and the correlation between the simulated and observed values of streamflow at the outlet points of the Brue, Sieve, and Alzette catchments, using historical time series. The calibration provided values of the parameters n and c_k equal to 4 and 0.026, 1 and 0.0055, and 1 and 0.00064 for the Brue, Sieve, and Alzette catchments, respectively.
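The cascade structure can be illustrated with a simple explicit-Euler time-stepping of n linear reservoirs, each obeying dx/dt = q_in − x/k with outflow x/k feeding the next reservoir. Szilagyi and Szollosi-Nagy derive an exact discrete state-space form; the Euler sketch below (with arbitrary n, k, and a unit-hour pulse) only illustrates the model structure, not the calibrated models of this study.

```python
import numpy as np

def nash_cascade(inflow, n=3, k=5.0, dt=1.0):
    """Cascade of n linear reservoirs with storage coefficient k [h].

    Explicit-Euler sketch of the KMN cascade: each reservoir obeys
    dx/dt = q_in - x/k, and its outflow x/k feeds the next reservoir.
    Returns the outlet hydrograph.
    """
    x = np.zeros(n)                    # stored volumes in each reservoir
    q_out = []
    for q_in in inflow:
        for i in range(n):
            out_i = x[i] / k           # outflow from reservoir i
            x[i] += dt * (q_in - out_i)
            q_in = out_i               # feeds the next reservoir
        q_out.append(q_in)             # outlet streamflow
    return np.array(q_out)

q = nash_cascade([10.0] + [0.0] * 49)  # response to a unit-hour pulse
peak_at = int(q.argmax())              # delayed, attenuated peak
```

The cascade delays and smooths the input pulse, and the total outflow volume converges to the input volume, which is the behaviour Eq. (1) encodes.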

Semi-distributed model
The hydrological and routing models used in this study are based on the early warning system implemented by the AAWA and described in Ferri et al. (2012). One of the goals of this study, in the framework of the WeSenseIt Project, is to test our methodology using synthetic CSD in the existing early warning system of the Bacchiglione catchment.
In the schematization of the Bacchiglione catchment, the locations of the physical and social sensors correspond to the outlet sections of the three main sub-catchments, Timonchio, Leogra, and Orolo, while the remaining sub-catchments are considered inter-catchments. For both sub-catchments and inter-catchments, a conceptual hydrological model, described below, is used to estimate the outflow (streamflow) hydrograph. The streamflow hydrographs of the three main sub-catchments are considered the upstream boundary conditions of a routing model used to propagate the flow up to the catchment outlet (see Fig. 2), while the outflows from the inter-catchments are considered internal boundary conditions to account for their corresponding drained areas. In the following, a brief description of the main components of the hydrological and routing models is provided.
The input to the hydrological model consists of precipitation only. The hydrological response of the catchment is estimated using a hydrological model that combines routines for runoff generation with a simple routing procedure. The processes related to runoff generation (surface, sub-surface, and deep flow) are modelled mathematically by applying the water balance to a control volume representative of the active soil at the sub-catchment scale. The water content S_w in the soil is updated at each calculation step dt using the balance equation

dS_w/dt = P − E_T − R_sur − R_sub − L,

where P and E_T are the precipitation and evapotranspiration components, while R_sur, R_sub, and L are the surface runoff, sub-surface runoff, and deep percolation model states, respectively (see Fig. 2). The surface runoff R_sur is expressed by a relation specifying the critical threshold beyond which the mechanism of Dunnian flow (the saturation-excess mechanism) prevails, in which C is a coefficient of soil saturation obtained by calibration and S_w,max is the water content at saturation, which depends on the nature of the soil and on its use. The sub-surface flow is considered proportional to the difference between the water content S_w,t at time t and that at field capacity, while the deep flow is evaluated according to the expression proposed by Laio et al. (2001), in which K_S is the hydraulic conductivity of the soil in saturated conditions and β is a dimensionless exponent characteristic of the size and distribution of pores in the soil. The actual evapotranspiration is evaluated as a function of the water content in the soil and the potential evapotranspiration, calculated using the formulation of Hargreaves and Samani (1982).
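One explicit step of such a water balance can be sketched as follows. This is not AAWA's code: the functional forms for R_sur (a simple saturation-excess switch) and for the Laio et al. (2001)-style percolation term, as well as every parameter value, are assumptions made only to show how the terms combine in dS_w/dt = P − E_T − R_sur − R_sub − L.

```python
import math

def soil_step(sw, p, et, dt=1.0, c=0.3, sw_max=150.0, sw_fc=90.0,
              k_sub=0.02, ks=2.0, beta=8.0):
    """One explicit step of the sub-catchment water balance (all in mm).

    r_sur  -- Dunnian (saturation-excess) runoff: assumed to activate
              only once the soil is saturated (illustrative form)
    r_sub  -- proportional to water content above field capacity
    leak   -- Laio et al. (2001)-style nonlinear deep percolation
    """
    r_sur = c * p if sw >= sw_max else 0.0
    r_sub = k_sub * max(sw - sw_fc, 0.0)
    leak = ks * math.expm1(beta * sw / sw_max) / math.expm1(beta)
    sw_new = sw + dt * (p - et - r_sur - r_sub - leak)
    return max(min(sw_new, sw_max), 0.0), r_sur, r_sub, leak

sw, r_sur, r_sub, leak = soil_step(sw=100.0, p=5.0, et=0.5)
```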
Knowing the values of R_sur, R_sub, and L, it is possible to model the routed surface Q_sur, sub-surface Q_sub, and deep flow Q_g contributions according to the conceptual framework of the linear reservoir at the closing section of each sub-catchment. In particular, in the case of Q_sur, the value of the parameter k, which is a function of the residence time on the catchment slopes, is estimated by relating the velocity to the average slope length. However, one of the challenges is to properly estimate this velocity, which should be calculated for each flood event (Rinaldo and Rodriguez-Iturbe, 1996). According to Rodríguez-Iturbe et al. (1982), this velocity is a function of the effective rainfall intensity and the event duration. In this study, the surface velocity is estimated using the relation between velocity and rainfall excess intensity proposed by Kumar et al. (2002) to obtain the average travel time and the consequent parameter k. However, this formulation is applied in a lumped way for a given sub-catchment. As reported in McDonnell and Beven (2014), more reliable and distributed models should be used to reproduce the spatial variability of residence times within the catchment. That is why, in the advanced version of the model implemented by AAWA, the runoff propagation in each sub-catchment is carried out according to the geomorphological theory of the hydrologic response: the overall catchment travel time distributions are considered nested convolutions of statistically independent travel time distributions along sequentially connected, and objectively identified, smaller sub-catchments. The correct estimation of the residence time should be derived considering the latest findings reported in McDonnell and Beven (2014). Regarding Q_sub and Q_g, the value of k is calibrated by comparing the observed and simulated streamflow at Vicenza.
In the early warning system implemented by AAWA in the Bacchiglione catchment, the flood propagation along the main river channel is represented by a one-dimensional hydrodynamic model, MIKE 11 (DHI, 2007). However, in order to reduce the computational time required by the analyses performed in this study, MIKE 11 is replaced by a Muskingum-Cunge model (see, e.g., Todini, 2007), considering rectangular river cross-sections for the estimation of hydraulic radii, wave celerities, and other hydraulic variables.
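The Muskingum recurrence underlying this routing scheme is short enough to sketch. In the Cunge variant, k and x would be recomputed at each step from channel geometry and wave celerity; the version below keeps them constant (with invented values) purely to illustrate the recurrence and its attenuating, delaying effect.

```python
import numpy as np

def muskingum_route(inflow, k=2.0, x=0.2, dt=1.0):
    """Muskingum routing of an inflow hydrograph (dt and k in hours).

    O_{t} = c0*I_{t} + c1*I_{t-1} + c2*O_{t-1}, with c0+c1+c2 = 1.
    Constant k and x here; Muskingum-Cunge would update them each step
    from hydraulic variables.
    """
    denom = k * (1.0 - x) + 0.5 * dt
    c0 = (0.5 * dt - k * x) / denom
    c1 = (0.5 * dt + k * x) / denom
    c2 = (k * (1.0 - x) - 0.5 * dt) / denom
    out = [inflow[0]]                      # assume initial steady state
    for i_prev, i_cur in zip(inflow[:-1], inflow[1:]):
        out.append(c0 * i_cur + c1 * i_prev + c2 * out[-1])
    return np.array(out)

inflow = np.array([5.0, 20.0, 50.0, 35.0, 20.0, 10.0, 5.0, 5.0])  # m^3/s
outflow = muskingum_route(inflow)
```

With these coefficients the peak is attenuated (from 50 to roughly 32 m³ s⁻¹) and delayed by about two steps, the qualitative behaviour expected of the routing reach.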
Calibration of the hydrological model parameters was performed by AAWA, as described in Ferri et al. (2012), considering the time series of precipitation from 2000 to 2010 in order to minimize the root mean square error between observed and simulated values of water level at the Ponte degli Angeli gauging station. In order to stay as close as possible to the early warning system implemented by AAWA, we used the same calibrated model parameters proposed by Ferri et al. (2012).

Kalman filter
In data assimilation, it is typically assumed that the dynamic system can be represented in state space as

x_t = M(x_{t-1}, I_t) + w_t,
z_t = H(x_t) + v_t,

where x_t and x_{t-1} are the state vectors at times t and t − 1, M is the model operator that propagates the state x from its previous condition to the new one in response to the inputs I_t, and H is the operator which maps the model states into the output z_t. The system and measurement errors w_t and v_t are assumed normally distributed with zero mean and covariances S and R. In a hydrological modelling system, these states can represent the water stored in the soil (soil moisture, groundwater) or on the earth's surface (snow pack). These states are among the governing factors that determine the hydrograph response to the inputs into the catchment.
For the linear systems used in this study, the discrete state-space form of Eq. (1) can be represented as follows (Szilagyi and Szollosi-Nagy, 2010):

x_t = Φ x_{t-1} + Ψ I_t,
z_t = H x_t,

where t is the time step, x is the vector of the model states (stored water volume in m³), Φ is the state-transition matrix (a function of the model parameters n and k), Ψ is the input-transition matrix, and H is the output matrix, which extracts the outflow of the last storage element; for example, for n = 3, H = [0 0 1/k]. Expressions for the matrices Φ and Ψ can be found in Szilagyi and Szollosi-Nagy (2010).
For the Bacchiglione model (semi-distributed model), a preliminary sensitivity analysis of the model states (soil content S_w and the storage waters x_sur, x_sub, and x_L related to Q_sur, Q_sub, and Q_g) is performed in order to decide which of the states to update. The results of this analysis (shown in the next section) pointed out that the stored water volume x_sur (estimated using Eq. 8 with n = 1, H = k, and I_t replaced by R_sur) is the most sensitive state, and for this reason we decided to update only this state.
The Kalman filter (KF; Kalman, 1960) is a mathematical tool which allows estimating, in a computationally efficient (recursive) way, the state of a process governed by a linear stochastic difference equation. The KF is optimal under the assumption that the error in the process is Gaussian; in this case, the KF is derived by minimizing the variance of the system error, assuming that the model state estimate is unbiased.
The Kalman filter procedure can be divided into two steps, namely the forecast equations (Eqs. 10 and 11) and the update (or analysis) equations (Eqs. 12, 13, and 14):

x_t^- = Φ x_{t-1}^+ + Ψ I_t,   (10)
P_t^- = Φ P_{t-1}^+ Φᵀ + S,   (11)
K_t = P_t^- Hᵀ (H P_t^- Hᵀ + R)⁻¹,   (12)
x_t^+ = x_t^- + K_t (Q_o − H x_t^-),   (13)
P_t^+ = (I − K_t H) P_t^-,   (14)

where K_t is the Kalman gain matrix, P is the error covariance matrix, I in Eq. (14) is the identity matrix, and Q_o is a new observation. In this study, the observed value of streamflow Q_o is equal to the synthetic CSD estimated as described above. The prior model states x at time t are updated, in response to the newly available observation, using the analysis equations, Eqs. (12) to (14). This allows estimating the values of the updated state (with superscript +) and then assessing the background estimates (with superscript −) for the next time step using the time update equations, Eqs. (10) and (11). The proper characterization of the model covariance matrix S is a fundamental issue in the Kalman filter. In this study, in order to evaluate the effect of assimilating CSD, small values of the model error S are considered for each case study: a covariance matrix S with diagonal values of 1, 25, and 1 m⁶ s⁻² is considered for the Brue, Sieve, and Alzette catchments, respectively. The bigger value of S in the Sieve catchment is due to the higher flow magnitude in this catchment compared to the other two. A sensitivity analysis of model performance depending on the value of S is reported in the Results section. For the Bacchiglione catchment, S is estimated, for each given flood event, as the variance between observed and simulated flow values.
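The forecast/analysis cycle above can be sketched compactly. The snippet below implements the standard KF equations for a toy one-state system; all matrices and numbers are illustrative, not the calibrated operators of the case-study models.

```python
import numpy as np

def kf_forecast(x, P, u, Phi, Psi, S):
    """Forecast step (Eqs. 10-11): propagate state and covariance."""
    x = Phi @ x + Psi @ u
    P = Phi @ P @ Phi.T + S
    return x, P

def kf_update(x, P, q_obs, H, R):
    """Analysis step (Eqs. 12-14): assimilate one streamflow value."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    x = x + K @ (q_obs - H @ x)                    # state update
    P = (np.eye(len(x)) - K @ H) @ P               # covariance update
    return x, P

# Toy single-reservoir example with invented numbers.
Phi = np.array([[0.8]]); Psi = np.array([[1.0]]); H = np.array([[0.2]])
S = np.array([[1.0]]); R = np.array([[4.0]])
x = np.array([10.0]); P = np.array([[5.0]])
x, P = kf_forecast(x, P, np.array([2.0]), Phi, Psi, S)
x, P = kf_update(x, P, np.array([3.0]), H, R)
```

After the analysis step the state moves toward the observation and the error covariance shrinks, which is the mechanism exploited when assimilating CSD.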

Assimilation of crowdsourced data
As described in the previous section, a main characteristic of CSD is that they are highly uncertain and asynchronous in time. Various methods have been proposed to include asynchronous observations in models. Having reviewed them, in this study we propose a somewhat simpler approach for the data assimilation of crowdsourced observations (DACO). This method is based on the assumption that the change in the model states and in the error covariance matrices within two consecutive model time steps t_0 and t (observation window) is linear, while the inputs are assumed constant. All CSD received during the observation window are individually assimilated in order to update the model states and output at time t. Therefore, assuming that one CSD is available at time t*_0, the first step of DACO (A in Fig. 4) is the definition of the model states and error covariance matrix at t*_0, obtained by linear interpolation between their values at t_0 and t. The second step (B in Fig. 4) is the estimation of the updated model states and error covariance matrix, in response to the streamflow CSD Q_o, using Eqs. (13) and (14), respectively. The Kalman gain is estimated by Eq. (12), where the prior values of the model states and error covariance matrix at t*_0 are used. Knowing the posterior values x+ and P+ at t*_0, it is possible to predict the value of the states and covariance matrix one model step ahead, at t* (C in Fig. 4), using the model forecast equations (Eqs. 10 and 11).
The last step (D in Fig. 4) is the estimation of the interpolated values of x and P at time step t. This is performed by means of a linear interpolation between the current values of x and P at t*_0 and t*. The symbol ∼ is added to the new matrices x and P in order to differentiate them from the original forecasted values at t.
Assuming that new streamflow CSD are available at an intermediate time t*_1 (between t*_0 and t), the procedure is repeated considering the values at t*_0 and t for the linear interpolation. Then, when no more CSD are available, the updated value of x−_t is used to predict the model states and output at t + 1 (Eqs. 10 and 11). Finally, in order to account for the intermittent behaviour of these CSD, the approach proposed by Mazzoleni et al. (2015) is applied. In this method, the model states matrix x is updated and forecasted when CSD are available, while without CSD the model is run using Eq. (10) and the covariance matrix P is propagated to the next time step using Eq. (11).
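Under the stated linearity assumption, one DACO cycle for a single CSD arriving at t*_0 can be sketched as below. The scalar state, the interpolation weights, and the function names are illustrative; `forecast` and `update` stand for the model forecast equations (Eqs. 10 and 11) and the Kalman analysis step (Eqs. 12 to 14).

```python
def daco_cycle(x_t0, P_t0, x_t, P_t, t0, t, t_star0, q_obs, forecast, update):
    """One DACO cycle (steps A-D in Fig. 4) for a scalar state.
    forecast(x, P) advances one model step; update(x, P, q) assimilates
    an observation and returns the posterior state and covariance."""
    w = (t_star0 - t0) / (t - t0)
    # A: interpolate states and covariance linearly to the CSD time t*_0
    x_s = x_t0 + w * (x_t - x_t0)
    P_s = P_t0 + w * (P_t - P_t0)
    # B: assimilate the crowdsourced streamflow at t*_0
    x_s, P_s = update(x_s, P_s, q_obs)
    # C: forecast one model step ahead, to t* = t*_0 + (t - t0)
    x_f, P_f = forecast(x_s, P_s)
    # D: interpolate between t*_0 and t* back to the model step t
    f = 1.0 - w  # fraction (t - t*_0) / (t* - t*_0)
    x_tilde = x_s + f * (x_f - x_s)
    P_tilde = P_s + f * (P_f - P_s)
    return x_tilde, P_tilde
```

When a further CSD arrives at t*_1, the same cycle would be repeated starting from the interpolated values, and without CSD only the forecast step is run.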

Crowdsourced data accuracy
In this section, the uncertainty related to CSD is characterized. The observational error is assumed to be normally distributed noise with zero mean and a given standard deviation, where the coefficient α is related to the degree of uncertainty of the measurement (Weerts and El Serafy, 2006). One of the main and obvious issues in citizen-based observations is maintaining the quality control of the water observations (Cortes Arevalo et al., 2014; Engel and Voshell Jr., 2002). In the Introduction, a number of methods to estimate the model of observational uncertainty have been referred to. In this study, the coefficient α is assumed to be a random variable uniformly distributed between 0.1 and 0.3, so we leave a more thorough investigation of the uncertainty level of CSD for future studies. We assumed that the maximum value of α is 3 times higher than that of the physical sensors, owing to the additional uncertainty of the rating curve estimation at the social sensor location.
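A minimal sketch of generating synthetic CSD from a "true" streamflow series under these assumptions; the multiplicative form σ = α·Q follows Weerts and El Serafy (2006), and drawing a fresh α per observation is our reading of the setup, not a detail confirmed by the text.

```python
import numpy as np

def synthetic_csd(q_true, rng=None):
    """Perturb a 'true' streamflow series into crowdsourced observations.
    Each observation gets Gaussian noise with zero mean and standard
    deviation alpha * q, with alpha drawn uniformly in [0.1, 0.3]
    (the proportional error model sigma = alpha * q is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    q_true = np.asarray(q_true, dtype=float)
    alpha = rng.uniform(0.1, 0.3, size=q_true.shape)
    noise = rng.normal(0.0, alpha * q_true)
    return np.clip(q_true + noise, 0.0, None)  # streamflow cannot be negative
```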

Experimental setup
In this section, two sets of experiments are performed in order to test the proposed method and assess the benefit of integrating CSD, asynchronous in time and with variable accuracies, in real-time flood forecasting.
In the first set of experiments, called "Experiment 1", assimilation of streamflow CSD at one social sensor location is carried out in the Brue, Alzette, and Sieve catchments to understand the sensitivity of the employed hydrological model (KMN) under various scenarios of these data.
In the second set of experiments, called "Experiment 2", the distributed CSD coming from social and physical sensors, at four locations within the Bacchiglione catchment, are considered, with the aim of assessing the improvement in the flood forecasting accuracy.

Experiment 1: assimilation of crowdsourced data from one social sensor
The focus of Experiment 1 is to study the performance of the hydrological model (KMN) when assimilating CSD with lower arrival frequencies than the model time step and random accuracies, coming from a social sensor located at the outlet of the Brue, Sieve, and Alzette catchments.
To analyse all possible combinations of arrival frequencies, number of CSD within the observation window (1 h), and accuracies, a set of scenarios is considered (Fig. 5), changing from regular arrival frequencies of CSD with high accuracies (scenario 1) to random and chaotic asynchronous CSD with variable accuracies (scenario 11). In each scenario, a varying number of CSD from 1 to 100 is considered. It is worth noting that for one CSD per hour and regular arrival time, scenario 1 corresponds to the case of physical sensors with observation arrival frequencies of 1 h.
Scenario 2 corresponds to the case of CSD having fixed accuracies (α equal to 0.1) and irregular arrival moments, but in which at least one CSD coincides with the model time step. In particular, scenarios 1 and 2 coincide for one CSD available within the observation window, since it is assumed that the arrival of that CSD has to coincide with the model time step. On the other hand, the arrival frequencies of CSD in scenario 3 are assumed random, and CSD might not arrive at the model time step.
Scenario 4 considers CSD with regular frequencies but random accuracies at different moments within the observation window, whereas in scenario 5 CSD have irregular arrival frequencies and random accuracies. In all the previous scenarios, the arrival frequencies, the number, and the accuracies of CSD are assumed periodic, i.e. repeated between consecutive observation windows along the whole time series. However, this periodic repetitiveness might not occur in real life, and for this reason a non-periodic behaviour is assumed in scenarios 6, 7, 8, and 9. The non-periodicity of the arrival frequencies and accuracies is the only factor that differentiates scenarios 6, 7, 8, and 9 from scenarios 2, 3, 4, and 5, respectively. In addition, the non-periodicity of the number of CSD within the observation window is introduced in scenario 10.
Finally, in scenario 11, CSD, in addition to all the previous characteristics, might have an intermittent behaviour, i.e. not being available for one or more observation windows.
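A sketch of how arrival moments within one observation window could be generated for a few of these scenario types; the function and its flags are illustrative assumptions, not the generator actually used in the study.

```python
import numpy as np

def arrival_times(n_csd, window=1.0, regular=True, on_model_step=False, rng=None):
    """Arrival times of n_csd crowdsourced data within one observation
    window (hours). 'regular' spaces them evenly (scenario-1-like);
    otherwise they are drawn at random, optionally forcing one arrival
    to coincide with the model time step (scenario-2-like)."""
    rng = np.random.default_rng() if rng is None else rng
    if regular:
        return np.linspace(window / n_csd, window, n_csd)
    t = np.sort(rng.uniform(0.0, window, size=n_csd))
    if on_model_step:
        t[-1] = window  # at least one CSD at the model time step
    return t
```

Non-periodic scenarios would redraw these times for every observation window, and intermittency (scenario 11) would skip entire windows.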

Experiment 2: spatially distributed physical and social sensors
Synthetic CSD with the characteristics reported in scenarios 10 and 11 of Experiment 1 are generated due to the unavailability of streamflow CSD during this study. In order to evaluate the model performance, observed and simulated streamflows are compared for different lead times. Streamflow data from physical sensors are assimilated in the hydrological model of the AMICO (Alto Adriatico Modello Idrologico e idrauliCO) system at an hourly frequency, while CSD from social sensors are assimilated using the DACO method previously described. The updated hydrograph estimated by the hydrological model is used as input to the Muskingum-Cunge model, which propagates the streamflow downstream to the gauged station at Ponte degli Angeli, Vicenza.
The main goal of Experiment 2 is to understand the contribution of distributed CSD to the improvement of the flood prediction at a specific point of the catchment, in this case at Ponte degli Angeli. For this reason, five different settings, corresponding to different types of employed sensors, are introduced and represented in Fig. 6.
Firstly, in setting A (Fig. 6), only streamflow data from the physical sensor in the Leogra sub-catchment are assimilated to update the hydrological model of sub-catchment B (Fig. 2). In setting B, CSD from the social sensor located in the Leogra sub-catchment are assimilated instead. In setting C, CSD from three distributed social sensors are integrated into the hydrological model. Setting D accounts for the integration of CSD from two social sensors and data from the physical sensor in the Leogra sub-catchment. Finally, setting E considers the complete integration of the physical and social sensors in Leogra and the two social sensors in the Timonchio and Orolo sub-catchments.

Experiment 1: influence of crowdsourced data on flood forecasting
The observed and simulated streamflow hydrographs at the outlet sections of the Brue, Sieve, and Alzette catchments, with and without the model update (considering hourly streamflow data), are reported in Fig. 7 for nine different flood events at 1 h lead time. As expected, it can be seen that the updated model tends to represent the flood events better than the model without updating in all the case studies. However, this improvement is closely related to the value of the matrix S: the higher the value of S (model uncertainty), the closer the model output gets to the observations. For this reason, a sensitivity analysis of the influence of the matrix S on the assimilation of CSD for scenario 1, i.e. CSD coming and being assimilated at regular time steps within the observation windows, is reported in Fig. 8. The results of Fig. 8 are related to the first flood events of the Brue, Sieve, and Alzette catchments. Increasing the number of CSD within the observation window results in an improvement of the NSE for different values of the model error. However, this improvement becomes negligible above a given threshold number of CSD, which is a function of the considered flood event. This means that the additional CSD do not add information useful for improving the model performance. Overall, increasing the value of the model error S tends to increase the NSE values, as mentioned before. For this reason, to better evaluate the effect of assimilating CSD, a small value of S, i.e. a model more accurate than the CSD, is assumed.
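The performance measure used throughout is the standard Nash-Sutcliffe efficiency, which can be stated compactly as:

```python
import numpy as np

def nse(q_obs, q_sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 means the model
    is no better than the mean of the observations."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)
```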
In scenario 1, the arrival frequencies are set as regular for the different model runs, so the moments and accuracies at which CSD become available are always the same for any model run. However, for the other scenarios, the irregular moments at which CSD become available within the observation window vary between model runs. For scenarios 2 and 3 (represented using warm red and orange colours in Figs. 9 and 10 for lead times equal to 24 h), the µ_NSE values are smaller than, but comparable to, the ones obtained for scenario 1 for all the considered flood events and case studies. In particular, scenario 3 has lower µ_NSE than scenario 2. This can be related to the fact that both scenarios have random arrival frequencies; however, in scenario 3, CSD are not provided at model time steps, as opposed to scenario 2. From Fig. 10, higher values of σ_NSE can be observed for scenario 3. Scenario 2 has the lowest standard deviation for low numbers of CSD because the arrival frequencies have to coincide with the model time step, and this stabilizes the NSE. In particular, for an increasing number of CSD, σ_NSE tends to decrease. However, a constant trend of σ_NSE can be observed, due to particular characteristics of the flood events, in the case of flood event 1 of the Sieve and flood events 2 and 3 of the Alzette. It is worth noting that scenario 1 has null standard deviation because CSD are assumed to come at the same moments with the same accuracies for all 100 model runs.
In scenario 4, represented using a blue colour, CSD are considered to come at regular time steps but have random accuracies. Figure 9 shows that µ_NSE values are lower for scenario 4 than for scenarios 2 and 3. This is related to the higher influence of CSD accuracies compared to arrival frequencies. High variability in the model performance, especially for low numbers of CSD, can be observed in scenario 4 (Fig. 10).
The combined effects of random arrival frequencies and CSD accuracies are represented in scenario 5 using a magenta colour (i.e. the combination of the warm and cold colours used for scenarios 2, 3, and 4) in Figs. 9 and 10. As expected, this scenario has the lowest µ_NSE and the highest σ_NSE values compared to those reported above.
The remaining scenarios (6 to 9) are equivalent to scenarios 2 to 5, with the only difference being that they are non-periodic in time. For this reason, in Figs. 9 and 10, scenarios 6 to 9 have the same colours as scenarios 2 to 5 but are indicated with a dashed line in order to underline their non-periodic behaviour. Overall, it can be observed that the non-periodic scenarios have similar µ_NSE values to their corresponding periodic scenarios. However, the smoother µ_NSE trends can be explained by the lower σ_NSE values, which means that the model performance is less dependent on the non-periodic nature of CSD than on their periodic behaviour. Table 1 shows the NSE values and model improvement obtained for the different experimental scenarios during the different flood events. Small improvements are obtained when the NSE is already high for one CSD, as for the Sieve catchment during flood event 2 or the Alzette catchment during flood event 2. Moreover, it can be seen that a lower improvement is achieved for the scenarios (2, 3, 6, and 7) where arrival frequencies are random and accuracies fixed compared to those scenarios (4, 5, 8, and 9) where accuracies are random.
In the previous analysis, model improvements are expressed only in terms of the NSE. However, statistics such as the NSE only describe the overall model accuracy and not the real increases or decreases in prediction error. Therefore, increases in model accuracy due to the assimilation of CSD also have to be presented in other ways, i.e. as increased accuracy of flood peak magnitudes and timing. For this reason, additional analyses are carried out to assess the change in flood peak prediction considering three peaks which occurred during flood event 2 in the Brue catchment (see Fig. 7). Errors in the flood peak timing, E_RRT, and intensity, E_RRI, are estimated from the differences between t_P^o and t_P^s, the observed and simulated peak times (h), and between Q_P^o and Q_P^s, the observed and simulated peak streamflows (m^3 s^-1). From the results reported in Fig. 11, considering a 12 h lead time, it can be observed that, overall, an error reduction in peak prediction is achieved for an increasing number of CSD. In particular, assimilation of CSD has more influence on the reduction of the peak intensity error than of the peak timing error. In fact, only a small reduction of E_RRT of about 1 h is obtained even when increasing the number of CSD. For both E_RRI and E_RRT, the higher error reduction is obtained considering fixed CSD accuracies and random arrival frequencies (e.g. scenarios 1, 2, 3, 6, and 7). In fact, the smallest E_RRI values are obtained for scenario 1, while scenarios 5 and 9 are the ones that show the lowest improvement in terms of peak prediction. These conclusions are very similar to the previous ones obtained analysing only the NSE as the model performance measure.

Hydrol. Earth Syst. Sci., 21, 839-861, 2017 www.hydrol-earth-syst-sci.net/21/839/2017/

Figure 9. Dependency of the mean of the Nash-Sutcliffe efficiency sample, µ_NSE, on the number of streamflow crowdsourced data in the experimental scenarios 1 to 9 for the considered flood events in the three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom row).
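Since Eqs. (20) and (21) are not reproduced here, the two peak-error measures can be sketched as absolute differences between the observed and simulated peak time and peak streamflow; this exact form is an assumption based on the definitions of the quantities involved.

```python
import numpy as np

def peak_errors(t, q_obs, q_sim):
    """Errors in flood peak timing (E_RRT, hours) and intensity
    (E_RRI, m^3 s^-1), taken here as absolute differences between
    observed and simulated peaks (an assumed form of Eqs. 20-21)."""
    t = np.asarray(t, dtype=float)
    i_o, i_s = np.argmax(q_obs), np.argmax(q_sim)
    err_t = abs(t[i_o] - t[i_s])
    err_i = abs(np.max(q_obs) - np.max(q_sim))
    return err_t, err_i
```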
The combination of all the previous scenarios is represented by scenario 10, where a changing number of CSD in each observation window is considered. In scenario 11, the intermittent nature of CSD is accounted for as well. The µ_NSE and σ_NSE values of these scenarios obtained for the considered flood events are shown in Fig. 12. It can be observed that scenario 10 tends to provide higher µ_NSE and lower σ_NSE values, for a given flood event, compared to scenario 11. In fact, intermittency in CSD tends to reduce the model performance and increase the variability of the NSE values for random configurations of arrival frequencies and CSD accuracies. In particular, σ_NSE tends to be constant for an increasing number of CSD.

Experiment 2: influence of distributed physical and social sensors
Three different flood events that occurred in the Bacchiglione catchment are used for Experiment 2. Figure 13 shows the observed and simulated streamflow values at the outlet section at Vicenza. In particular, two simulated time series of streamflow are calculated, using the measured and the forecasted time series of precipitation as input for the hydrological model. Overall, an underestimation of the observed streamflow can be observed using the forecasted input, while the results achieved using the measured precipitation tend to properly represent the observations. In order to find out which model states lead to the maximum increase in model performance, a preliminary sensitivity analysis is performed. The four model states, x_S, x_sur, x_sub, and x_L, related to S_w, Q_sur, Q_sub, and Q_g, are uniformly perturbed by ±20 % around the true state value for every time step up to the perturbation time (PT).
No correlation between time steps is considered. After PT, the model realizations are run without perturbation in order to assess the effect on the system memory. No assimilation and no state update are performed at this step. From the results reported in Fig. 14, related to flood event 1, it can be observed that the model state x_sur is the most sensitive state compared to the other ones. In addition, the perturbations of all the states seem to affect the model output even after the PT (high system memory). For this reason, in this experiment, only the model state x_sur is updated by means of the DACO method. Scenarios 10 and 11, described in the previous sections, are used to represent the irregular and random behaviour of the CSD assimilated in the Bacchiglione catchment.
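The ±20 % uncorrelated perturbation up to PT can be sketched as follows; in the actual analysis the perturbation is applied to the states of a running model rather than to a stored trajectory, so this standalone version is only illustrative of the sampling scheme.

```python
import numpy as np

def perturb_state(x_series, pt_index, rng=None):
    """Uniformly perturb a model-state trajectory by +/-20 % at every
    time step up to the perturbation time (PT), with no correlation
    between time steps; later steps are left untouched."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x_series, dtype=float).copy()
    factors = rng.uniform(0.8, 1.2, size=pt_index)  # independent per step
    x[:pt_index] *= factors
    return x
```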
Figures 15 and 16 show the results obtained for the experimental settings represented in Fig. 6 during three different flood events. Three different lead time values are considered. One hundred model runs are performed to account for the effects induced by the random arrival frequencies and accuracies of CSD within the observation window, as described above. Figure 15 shows that the assimilation of streamflow from the physical sensor in the Leogra sub-catchment (setting A) provides a better streamflow prediction at Ponte degli Angeli than the assimilation of a small number of CSD provided by a social sensor at the same location (setting B). In particular, Fig. 15 shows that, depending on the flood event, the same NSE values achieved with the assimilation of physical data (hourly frequency and high accuracy) can be obtained by assimilating between 10 and 20 CSD per hour for a 4 h lead time. This number of CSD tends to increase for increasing values of lead time. In the event of intermittent CSD (Fig. 16), the overall reduction of the NSE is such that even with a high number of CSD (even higher than 50 per hour) the NSE is always lower than the one obtained assimilating physical streamflow data, for any lead time.
For setting C, it can be observed for all three flood events that distributed social sensors in the Timonchio, Leogra, and Orolo sub-catchments allow for obtaining higher model performance than the one achieved with only one physical sensor (see Fig. 15). However, for flood event 3, this is valid only for small lead time values. In fact, for 8 and 12 h lead times, the contribution of CSD tends to decrease in favour of the physical data from the Leogra sub-catchment. This effect is predominant for intermittent CSD, scenario 11. In this case, setting C has higher µ_NSE values than setting A only during flood event 1 and for lead time values equal to 4 and 8 h (see Fig. 16).
It is interesting to note that for setting D, during flood event 1, the µ_NSE is higher than for setting C for a low number of CSD. However, with a higher number of CSD, setting C is the one providing the best model improvement for low lead time values. In the event of intermittent CSD, it can be noticed that setting D always provides higher improvement than setting C. For flood event 1, the best model improvement is achieved for setting E, i.e. fully integrating the physical sensor with distributed social sensors. On the other hand, during flood events 2 and 3, setting D shows higher improvements than setting E. For intermittent CSD, the difference between settings D and E tends to reduce for all the flood events. Overall, settings D and E are the ones providing the highest µ_NSE in both scenarios 10 and 11. This demonstrates the importance of integrating an existing network of physical sensors (setting A) with social sensors to improve flood predictions.
Figure 17 shows the standard deviation of the NSE, σ_NSE, obtained for the different settings at a 4 h lead time. Similar results are obtained for the three flood events. In the case of setting A, σ_NSE is equal to zero since the data are coming from the physical sensor at regular time steps. Higher σ_NSE values are obtained for setting B, while including distributed CSD (setting C) tends to decrease the value of σ_NSE. It can be observed that σ_NSE decreases for high numbers of CSD. As expected, the lowest values of σ_NSE are achieved when including the physical sensor in the data assimilation procedure (settings D and E). Similar considerations can be drawn for intermittent CSD, for which higher and more perturbed σ_NSE values are obtained.

Discussion
The assimilation of CSD is performed in four different case studies considering only one social sensor location in the Brue, Sieve, and Alzette catchments, and distributed social and physical sensors within the Bacchiglione catchment.
In the first three catchments, different characteristics of CSD are represented by means of 11 scenarios. Nine different flood events are used to assess the benefit of assimilating CSD in the hydrological model to improve flood forecasting.
Overall, the assimilation of CSD improves the model performance in all the considered case studies. In particular, there is a limit to the number of CSD for which satisfactory model improvements can be achieved, beyond which additional CSD become redundant. This asymptotic behaviour, when extra information is added, has also been observed using other metrics by Krstanovic and Singh (1992), Ridolfi et al. (2014), and Alfonso et al. (2013), among others. From Fig. 9 it can be seen that, in all the considered catchments, increasing the value of the model error induces an increase of this asymptotic value, with a consequent reduction of the number of CSD needed to improve the model performance. For this reason, a small value of the model error is assumed in this study. In addition, it is not possible to define a priori the number of CSD needed to improve a model, because of the model's different behaviour for a given flood event when no update is performed. In fact, as reported in Table 1 and Fig. 8, flood events with high NSE values even without updates tend to achieve the asymptotic values of the NSE for a small number of CSD (e.g. flood event 1 in the Brue and flood event 2 in the Sieve), while more CSD are needed for flood events having low NSE without updates. However, for these case studies and during these nine flood events, an indicative value of 10 CSD can be considered sufficient to achieve a good model improvement.
Figures 9 and 10 show the µ_NSE and σ_NSE values for scenarios 2 to 9. Figure 9 demonstrates that for irregular arrival frequencies and constant accuracies (e.g. scenarios 2, 3, 6, and 7) the NSE is higher than for scenarios in which the accuracies are variable and the arrival frequencies fixed (e.g. scenarios 4, 5, 8, and 9). These results point out that the model performance is more sensitive to the accuracies of CSD than to the moments in time at which the streamflow CSD become available. Overall, σ_NSE tends to decrease for a high number of CSD. The combined effects of irregular frequencies and uncertainties are reflected in scenario 5, which has a lower mean and higher standard deviation of the NSE compared to the first four scenarios.
An interesting fact is that, passing from periodic to non-periodic scenarios, the standard deviation σ_NSE is significantly reduced, while µ_NSE remains the same but with a smoother trend. A non-periodic behaviour of CSD, common in real life, helps to reduce the fluctuation of the NSE generated by the random behaviour of the streamflow CSD. Finally, the results obtained for scenarios 10 and 11 are shown in Fig. 12. The assimilation of an irregular number of CSD in each observation window in scenario 10 seems to provide similar µ_NSE to the values obtained with scenario 9. One of the main outcomes is that the intermittent nature of CSD (scenario 11) induces a drastic reduction of the NSE and an increase in its noise in both considered flood events. All these previous results are consistent across the considered catchments.
In the case of the Bacchiglione catchment, the data from physical and social sensors are assimilated within a hydrological model to improve the poor flow predictions at Vicenza for the three considered flood events. In fact, these predictions are affected by an underestimation of the 3-day rainfall forecast used as input in flood forecasting practice in this area.
One of the main outcomes of these analyses is that the replacement of a physical sensor (setting A) with a social sensor at only one location (setting B) does not improve the model performance in terms of the NSE for a small number of CSD. Figures 15 and 16 show that distributed locations of social sensors (setting C) can provide higher values of the NSE than a single physical sensor, even for a low number of CSD, in the event of CSD having the characteristics of scenario 10. For flood event 1, setting C provides better model improvement than setting D for low lead time values and a high number of CSD. This can be because the physical sensor at Leogra provides a constant improvement, for a given lead time, while the social sensors tend to achieve better results with a higher number of CSD. This dominant effect of the social sensors, for a high number of CSD, tends to increase for higher lead times. On the other hand, for intermittent CSD (scenario 11) this effect decreases, in particular for flood events 2 and 3.
Integrating physical and social sensors (settings D and E) induces the highest model improvements for all three flood events. For flood event 1, assimilation from setting E appears to provide better results than assimilation from setting D. Opposite results are obtained for flood events 2 and 3. In fact, the high µ_NSE values of setting D can be because flood events 2 and 3 are characterized by one main peak and a similar shape, while flood event 1 has two main peaks. Assimilation of CSD from distributed social sensors tends to reduce the variability of the NSE coefficient in both scenarios 10 and 11.

Conclusions
This study assesses the potential use of crowdsourced data, which are characterized by irregular availability and variable accuracy, in hydrological modelling. We demonstrate that even data with these characteristics can improve flood prediction if integrated into hydrological models. This opens new opportunities in terms of exploiting the data being collected in current citizen science projects for modelling purposes. Our results do not support the idea that social sensors should partially or totally replace the existing network of physical sensors; instead, these new data should be used to compensate for the lack of traditional observations. In fact, in the event of a dense network of physical sensors, the additional information from social sensors might not be necessary because of the high accuracy of the hydrological observations derived from physical sensors.
Four different case studies, the Brue (UK), Sieve (Italy), Alzette (Luxembourg), and Bacchiglione (Italy) catchments, are considered, and two types of hydrological models are used. In Experiment 1 (Brue, Sieve, and Alzette catchments), the sensitivity of the model results to the assimilation of crowdsourced data with different frequencies and accuracies, derived from a hypothetical social sensor at the catchment outlet, is assessed. In Experiment 2 (Bacchiglione catchment), the influence of the combined assimilation of crowdsourced data from a distributed network of social sensors and of existing streamflow data from physical sensors is evaluated. Because crowdsourced streamflow data are not yet available in all case studies, realistic synthetic data with various characteristics of arrival frequencies and accuracies are introduced.
In Experiment 1, it is found that increasing the number of crowdsourced data within the observation window increases the model performance even if these data have irregular arrival frequencies and accuracies. Moreover, the data accuracy affects the average value of the NSE more than the moment at which these data are assimilated. The noise in the NSE is reduced when the assimilated data are considered to have a non-periodic behaviour. In addition, the intermittent nature of the data tends to drastically reduce the NSE of the model for different values of lead time. In fact, if the intervals between the data are too large, then the abundance of crowdsourced data at other times and places is no longer able to compensate for their intermittency.
Experiment 2 showed that, in the Bacchiglione catchment, the integration of data from social sensors and a single physical sensor could improve the flood prediction even for a small number of intermittent crowdsourced data. In the event that both physical and social sensors are located at the same place, the assimilation of physical data gives the same model improvement as the assimilation of a high number of non-intermittent crowdsourced data. Overall, the integration of existing physical sensors with a new network of social sensors can improve the model predictions.
Although the catchments and models are different, the presented study demonstrated that the model behaviour when assimilating asynchronous data is very similar across the case studies.
Although we have obtained interesting results, this work has some limitations. Firstly, the proposed method used to assimilate crowdsourced data is applied to the linear parts of hydrological models. This means that the proposed methodology has to be tested on models with non-linear dynamics. Secondly, while realistic synthetic streamflow data are used in this study, the developed methodology is not tested with data coming from actual social sensors. Therefore, the conclusions need to be confirmed using real crowdsourced observations of water level. Finally, methods for a more accurate assessment of the quality and accuracy of data derived from social sensors need to be considered (e.g. developing a pre-filtering module aimed at selecting only data that have good accuracy while discarding those with low accuracy). Future work will be aimed at addressing the limitations formulated above, which will allow for a better characterization of the crowdsourced data, making them a reliable data source for model-based forecasting.

Data availability
The DEM data were downloaded from the SRTM database (http://srtm.csi.cgiar.org). The rainfall and river discharge data were provided by the British Atmospheric Data Centre from the NERC Hydrological Radar Experiment Dataset (Brue catchment, http://www.badc.rl.ac.uk/data/hyrex/) and by the Alto Adriatico Water Authority (Bacchiglione catchment). The authors are grateful to Marco Franchini for providing the data on the Sieve catchment.
Competing interests. The authors declare that they have no conflict of interest.

Figure 1. Representation of the four case studies considered in this study; clockwise: Brue catchment; Sieve catchment; Alzette catchment; Bacchiglione catchment.

Figure 2. Structure of the hydrological model and location of the physical (green dots), social (red dots), and Ponte degli Angeli (PA, blue dots) sensors implemented in the Bacchiglione catchment by the Alto Adriatico Water Authority.

Figure 3. Graphical representation of the methodology proposed to estimate streamflow from crowdsourced observations of water level: (a) crowdsourced observations of water level are converted into streamflow crowdsourced data (CSD) by means of rating curves assessed for the specific river location; (b) the streamflow CSD are assimilated within the hydrological model.

Figure 4. Graphical representation of the data assimilation of the crowdsourced observations (DACO) method used in this study to assimilate asynchronous streamflow crowdsourced data.

Figure 5. Experimental scenarios representing different configurations of arrival frequencies, number, and accuracies of streamflow crowdsourced data.

Figure 6. Experiment 2: characteristics of the five experimental settings (A to E) implemented within the Bacchiglione catchment: location of the social and physical sensors (dots) and hydrological model update based on different sensors (coloured areas).

Figure 7. Observed (black line) and simulated hydrographs, with (red line) and without (blue line) assimilation, for the flood events which occurred in the three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom row).

Figure 11. Representation of the errors in flood peak timing, ERR_T, and intensity, ERR_I (as described in Eqs. 20 and 21), as a function of the number of streamflow crowdsourced data and of the experimental scenarios (1 to 9), for three different flood peaks that occurred during flood event 2 in the Brue catchment.

Figure 12. Dependency of the mean µ_NSE and standard deviation σ_NSE of the Nash-Sutcliffe efficiency sample (first and second row, respectively) on the number of streamflow crowdsourced data in scenarios 10 (solid lines) and 11 (dashed lines), for the considered flood events (black, blue, and red lines) in the three catchments: Brue (left panels), Sieve (central panels), and Alzette (right panels).

Figure 14. Effect of model state perturbation on the model output for the Bacchiglione catchment: PT indicates the perturbation time; x_s indicates the model state related to S_w; x_sur the state related to Q_sur; x_sub the state related to Q_sub; x_L the state related to Q_g.

Figure 15. Model performance expressed as the mean Nash-Sutcliffe efficiency µ_NSE, assimilating a different number of streamflow crowdsourced data during the three considered flood events, for the three lead time values (left panels: 4 h; central panels: 8 h; right panels: 12 h) of scenario 10 and the five experimental settings (A to E) in the Bacchiglione catchment.

Figure 16. Model performance expressed as the mean Nash-Sutcliffe efficiency µ_NSE, assimilating a different number of streamflow crowdsourced data during the three considered flood events, for the three lead time values (left panels: 4 h; central panels: 8 h; right panels: 12 h) of scenario 11 and the five experimental settings (A to E) in the Bacchiglione catchment.

Figure 17. Variability of model performance expressed as σ_NSE, assimilating streamflow crowdsourced data within settings A, B, C, and D, assuming a lead time of 4 h, for experimental scenarios 10 (upper row) and 11 (bottom row), during the three considered flood events in the Bacchiglione catchment.