Large watershed flood forecasting with high resolution distributed hydrological model

Flooding is one of the most devastating natural disasters in the world, causing huge damage, and flood forecasting is one of the main flood mitigation measures. The watershed hydrological model is the major tool for flood forecasting. Although the lumped watershed hydrological model is still the most widely used model, the distributed hydrological model has the potential to improve watershed flood forecasting capability. Distributed hydrological models have been successfully used in small watershed flood forecasting, but challenges remain for their application in large watersheds; one of them is the model's spatial resolution effect. To cope with this challenge, two efforts could be made: one is to improve the model's computation efficiency in large watersheds, and the other is to implement the model on a high-performance supercomputer. By employing the Liuxihe Model, a physically based distributed hydrological model, this study sets up a distributed hydrological model for the flood forecasting of the Liujiang River Basin in southern China, which is a large watershed. Terrain data including DEM, soil type and land use type are freely downloaded from the web, and the model structure with a high resolution of 200 m × 200 m grid cells is set up. The initial model parameters are derived from the terrain property data and then optimized by using the PSO algorithm, and the model is used to simulate 29 observed flood events. It has been found that by dividing the river channels into virtual channel sections and assuming the cross section shapes to be trapezoidal, the Liuxihe Model largely increases computation efficiency while keeping good model performance, thus making it applicable in larger watersheds. This study also finds that parameter
Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-489, 2016. Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 26 September 2016. © Author(s) 2016. CC-BY 3.0 License.


Introduction
Flooding is one of the most devastating natural disasters in the world and has caused huge damage (Krzmm, 1992; Kuniyoshi, 1992; Chen, 1995; EEA, 2010). Flood forecasting is one of the most widely used flood mitigation measures, and the watershed hydrological model is the major tool for flood forecasting. Currently the most popular hydrological model for watershed flood forecasting is still the lumped model (Refsgaard et al., 1997), which averages the terrain properties and precipitation, as well as the model parameters, over the watershed. Hundreds of lumped models have been proposed and widely used, such as the Sacramento model proposed by Burnash et al. (1995), the Tank model proposed by Sugawara et al. (1995), the Xinanjiang model proposed by Zhao (1977), and the ARNO model proposed by Todini (1996), to name a few. It is widely accepted that the precipitation driving the watershed hydrological processes is usually unevenly distributed over the watershed, particularly for large watersheds; therefore, the lumped model cannot easily forecast flooding in large watersheds. Furthermore, because terrain properties are inhomogeneous over the watershed, which is true even in very small watersheds, watershed floods cannot be forecasted accurately if the model parameters are averaged over the watershed. For these reasons, new models are needed to improve watershed flood forecasting capability, particularly for large watersheds.
Published by Copernicus Publications on behalf of the European Geosciences Union.
The development of distributed hydrological models in the past decades has provided the potential to improve watershed flood forecasting capability. One of the most important features of the distributed hydrological model is that it divides the watershed terrain into grid cells, each of which is treated like a real watershed; i.e., the grid cells have their own terrain properties and precipitation. Hydrological processes are calculated at both the grid cell scale and the watershed scale, and the parameters used to calculate hydrological processes also differ between grid cells. This distributed feature, which the lumped model lacks, enables the model to describe the inhomogeneity of both the terrain properties and the precipitation over the watershed, and could help it better simulate the watershed hydrological processes at all scales, small or large. Describing the inhomogeneity of precipitation is very helpful in modeling large-watershed hydrological processes, particularly in tropical and sub-tropical regions where flooding is driven by heavy storms. For this reason, the distributed hydrological model is usually regarded as having the potential to better simulate or forecast watershed floods (Ambroise et al., 1996; Chen et al., 2016). Employing a distributed hydrological model for watershed flood forecasting has become the new trend (Vieux et al., 2004; Chen et al., 2016; Cattoën et al., 2016; Witold et al., 2016; Kauffeldt et al., 2016).
The blueprint of the distributed hydrological model is regarded as having been proposed by Freeze and Harlan (1969); the first distributed hydrological model was the SHE model proposed by Abbott et al. (1986a, b). The distributed hydrological model requires terrain property data for every grid cell to set up the model structure; therefore, it is a data-driven model. In the early stage of distributed hydrological modeling, this posed a great challenge for the application of distributed hydrological models, as the data were neither widely available nor inexpensively accessible. With the development of remote sensing sensors and techniques, high-resolution terrain data with global coverage have become readily available and can be acquired inexpensively. For example, the digital elevation model (DEM) at 30 m grid cell resolution with global coverage can be freely downloaded (Falorni et al., 2005; Sharma and Tiwari, 2014), which has largely promoted the development and application of distributed hydrological models. Since then, many distributed hydrological models have been proposed, such as the WATERFLOOD model (Kouwen, 1988), THALES model (Grayson et al., 1992), VIC model (Liang et al., 1994), DHSVM model (Wigmosta et al., 1994), CASC2D model (Julien et al., 1995), WetSpa model (Wang et al., 1997), GBHM model (Yang et al., 1997), WEP-L model (Jia et al., 2001), Vflo model (Vieux et al., 2002), tRIBS model (Vivoni et al., 2004), WEHY model (Kavvas et al., 2004), Liuxihe model (Chen et al., 2011, 2016), etc.
The distributed hydrological model derives model parameters physically from the terrain property data and is regarded as not needing parameter calibration; therefore, it could be used in data-poor or ungauged basins. This feature has allowed the distributed hydrological model to be applied widely in evaluating the impacts of climate change and urbanization on hydrology (Li et al., 2009; Ott and Uhlenbrook, 2004; VanRheenen et al., 2004; Olivera and DeFee, 2007). But it was also found that this feature causes parameter uncertainty due to the lack of experience and references in physically deriving model parameters from the terrain properties; therefore, the model could not be used in fields that require high flood simulation accuracy, including watershed flood forecasting. It was realized that parameter optimization is also needed for a distributed hydrological model to improve its performance, and a few optimization methods have been proposed. For example, Vieux and Vieux (2003) tried a scalar method to adjust the model parameters, and the model performance was found to be largely improved. Madsen et al. (2003) proposed an automatic multi-objective parameter optimization method with the Shuffled Complex Evolution (SCE) algorithm for the SHE model, which also improved the model performance. Shaffi and De Smedt (2009) proposed a multi-objective genetic algorithm for optimizing parameters of the WetSpa model; the improved model result is regarded as reasonable. Xu et al. (2012a) proposed an automated parameter optimization method for the Liuxihe model based on the Shuffled Complex Evolution algorithm developed at The University of Arizona (SCE-UA), which improved the model performance in small-watershed flood forecasting. Chen et al. (2016) proposed an automated parameter optimization method based on the Particle Swarm Optimization (PSO) algorithm for Liuxihe model watershed flood forecasting, and tested it in two watersheds: one small and one large. The results suggested that the parameters of a distributed hydrological model should be optimized even if only little hydrological data are available, while the parameters derived physically from the terrain properties could serve as initial values. The above progress in parameter optimization for distributed hydrological models has matured and will largely improve model performance, thus advancing the application of the distributed hydrological model in real-time watershed flood forecasting.
Spatial resolution is a key factor in distributed hydrological modeling. Theoretically, if the spatial resolution of a distributed hydrological model is higher, i.e., the grid cell size is smaller, the terrain properties can be described in finer detail and the hydrological processes can be better simulated or forecasted; therefore, the model spatial resolution should be as high as possible. On the other hand, higher model spatial resolution requires higher-resolution terrain property data for the model setup, which may not be available in some watersheds. Most importantly, the distributed hydrological model uses complex, physically meaningful equations to calculate the hydrological processes; therefore, it needs far more computation resources than the lumped model, and the required resources grow much faster than linearly as the grid cell size decreases. Therefore, in modeling flood processes of a large watershed, the computation time needed for running the distributed hydrological model would be huge if the model spatial resolution is kept high, which may make the model application impractical due to the high running cost. So if a distributed hydrological model needs to be applied in a large watershed, a coarser resolution is the only practical choice, and the model's capability is reduced, with less satisfactory results. This is also called the scaling effect of distributed hydrological modeling. For this reason, current application for watershed flood forecasting is limited either to a small watershed at higher resolution or to a large watershed at coarser resolution, i.e., a trade-off between model performance and running cost.
Presently, forecasting large-watershed flooding is in great demand as such flooding affects people and property over a large area, but, due to the scaling effect, current distributed hydrological models employed for large watersheds are at coarser resolutions, which lowers their capability for flood forecasting and warning. For example, past applications of distributed hydrological models for large-watershed flood forecasting are at resolutions coarser than 1 km grid cells (Lohmann et al., 1998; Vieux et al., 2004; Stisen et al., 2008; Rwetabula et al., 2007); the models employed in the pan-European Flood Awareness System (EFAS; Bartholmes et al., 2009; Thielen et al., 2009, 2010; Sood and Smakhtin, 2015; Kauffeldt et al., 2016) are at 1-10 km grid cells, which makes the results applicable only for flood warning.
The challenge for distributed hydrological model application in large-watershed flood forecasting is its need for huge computational resources; to cope with this challenge, two efforts could be made. One is to improve the computation efficiency of distributed hydrological modeling in a large watershed, and the other is to implement the model on a high-performance supercomputer; if users are willing to pay a high computation cost, flood forecasting of a large watershed at high resolution can then be done. In this study, the Liuxihe model (Chen et al., 2011, 2016), a physically based distributed hydrological model proposed for watershed flood forecasting, is used for flood forecasting of a large watershed in southern China to validate the feasibility of applying a distributed hydrological model to large-watershed flood forecasting.
Method and data

Liujiang River basin
The river basin studied in this paper is the Liujiang River basin (hereinafter referred to as LRB) in south China; the Liujiang River is a first-order tributary of the Pearl River. It originates from the village Lang in Guizhou Province and drains through Guizhou Province, the Guangxi Zhuang Autonomous Region and Hunan Province, with 72 % of its drainage area in the Guangxi Zhuang Autonomous Region. The length of its main channel is 1121 km, and the total drainage area is 58 270 km², which makes it a large river basin in China.
LRB is a mountainous watershed. There are high mountains in the north and northwest of the watershed, whereas elevations in the south and southeast are relatively low. This topography favors severe flooding in the middle and downstream reaches. The basin is in the sub-tropical monsoon climate zone with an average annual precipitation of 1800 mm, and the precipitation is highly uneven in both space and time, with 80 % of the annual precipitation occurring in the summer. LRB is in the center of the storm zone of the Guangxi Zhuang Autonomous Region, and heavy storms have been very frequent in the past. There have been 59 disastrous flooding events in the past 400 years, with records dating back to 1488, which makes the LRB the tributary with the most disastrous flooding among all the first-order tributaries of the Pearl River. There are no significant reservoirs in the watershed to store flood runoff; therefore, flood forecasting is one of the most effective ways of flood management.

Liuxihe model
The Liuxihe model is a physically based distributed hydrological model proposed mainly for watershed flood forecasting (Chen, 2009; Chen et al., 2011, 2016). Like other distributed hydrological models, the Liuxihe model divides the watershed into grid cells based on the DEM of the studied watershed. To keep a reasonable model performance, in past Liuxihe model research and applications the model resolution has been limited to 90 m × 90 m or 100 m × 100 m, and the model has only been used in small watersheds (Chen, 2009; Chen et al., 2011, 2013, 2016; Liao et al., 2012a, b; Xu et al., 2012a, b). Precipitation, evaporation and runoff production are calculated at the cell scale; runoff is routed first on the cell, then from cell to cell to the river channel, and finally to the watershed outlet. As the Liuxihe model is mainly used in sub-tropical regions, the runoff production is calculated based on the saturation-excess mechanism (Zhao, 1977). The runoff routing is classified as hillslope routing, river channel routing, subsurface routing and underground routing. The hillslope routing is regarded as a one-dimensional unsteady flow, and the kinematic wave approximation is employed for the routing. The river channel routing is also regarded as a one-dimensional unsteady flow, but the diffusive wave approximation is employed for the routing. These methods are widely used in the dominant distributed hydrological models.
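The saturation-excess mechanism named above can be illustrated for a single grid cell with a minimal sketch. It assumes a simple one-bucket cell whose capacity, storage and units (mm per time step) are hypothetical; it is not the Liuxihe model's actual runoff production formulation.

```python
def saturation_excess_step(storage_mm, capacity_mm, rain_mm, evap_mm=0.0):
    """Advance one cell by one time step; return (new_storage_mm, runoff_mm).

    Assumption: the cell is a single bucket, and runoff is produced only
    once the bucket is full (saturation excess).
    """
    # Water held after evaporation is removed (cannot go negative).
    available = max(storage_mm + rain_mm - evap_mm, 0.0)
    # Anything above the storage capacity leaves the cell as runoff.
    runoff = max(available - capacity_mm, 0.0)
    return available - runoff, runoff


storage = 80.0
for rain in (10.0, 30.0, 5.0):
    storage, runoff = saturation_excess_step(storage, capacity_mm=100.0, rain_mm=rain)
    print(f"rain={rain:5.1f} mm  storage={storage:6.1f} mm  runoff={runoff:5.1f} mm")
```

In this toy run the first storm only fills the bucket; runoff appears from the second time step on, once the cell is saturated.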
What makes the Liuxihe model unique is that the river channel cross section shape is assumed to be trapezoidal. With this assumption, the river channel size can be represented with three dimensions: the bottom width, side slope and bottom slope. One advantage of this assumption is that the river channel cross section size can be estimated from remotely sensed data (Chen et al., 2011); therefore, the Liuxihe model can route river channel runoff physically, which makes it a fully distributed hydrological model. In real hydrological modeling, directly measuring the river channel cross section sizes is impractical because of the high cost: there are too many cross sections, and many of them are in the upstream part of the watershed where access is difficult. For this reason, most distributed hydrological models either cannot be applied in real applications or simply route the runoff with lumped methods, which makes them not fully distributed and lowers their capability in simulating or forecasting watershed flood processes. Another advantage of this assumption is that it simplifies the runoff routing, thus improving the model's computation efficiency. For this reason, even though the Liuxihe model has a very high resolution, it can still be used in real-time flood forecasting. This feature of the Liuxihe model in estimating river channel cross section sizes gives it the potential to be used in large-watershed flood forecasting.
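To show why three numbers are enough to describe a trapezoidal channel hydraulically, the sketch below computes the flow area and wetted perimeter from the bottom width and side slope, and then a discharge from Manning's equation using the bottom slope. This is an illustration only: the paper routes channel flow with a diffusive wave approximation, the side slope is assumed here to mean horizontal run per unit rise, and all numerical values are hypothetical.

```python
import math


def trapezoid_geometry(bottom_width, side_slope, depth):
    """Return (area, wetted_perimeter) of a trapezoidal section in SI units.

    Assumption: side_slope is the horizontal run per unit rise (z in z:1).
    """
    area = (bottom_width + side_slope * depth) * depth
    wetted_perimeter = bottom_width + 2.0 * depth * math.sqrt(1.0 + side_slope**2)
    return area, wetted_perimeter


def manning_discharge(bottom_width, side_slope, bottom_slope, depth, roughness):
    """Discharge (m^3/s) from Manning's equation for uniform flow."""
    area, perimeter = trapezoid_geometry(bottom_width, side_slope, depth)
    hydraulic_radius = area / perimeter
    return area * hydraulic_radius ** (2.0 / 3.0) * math.sqrt(bottom_slope) / roughness


# Hypothetical channel: 50 m wide bed, 2:1 banks, slope 0.001, n = 0.035.
q = manning_discharge(50.0, 2.0, 0.001, depth=3.0, roughness=0.035)
print(f"Q = {q:.1f} m^3/s")
```

Because the geometry is closed-form, no surveyed cross-section table has to be interpolated at each time step, which is consistent with the efficiency argument made in the text.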
Like other distributed hydrological models, when used for flood forecasting in ungauged or data-poor watersheds, the Liuxihe model derives model parameters physically from the terrain property data. If observed hydrological data are available, automatic parameter optimization methods can be tried. However, automatic parameter optimization needs thousands of model runs, which makes it difficult to use widely because of the huge computing resource requirement and also means that setting up the model takes a long time. For this reason, a public computer cloud was set up for optimizing the parameters of the Liuxihe model, which employs parallel computation techniques and was implemented on a supercomputer system (Chen et al., 2013). With this development, the Liuxihe model can optimize its parameters easily.
The above advancements of the Liuxihe model in estimating river channel cross section sizes with remotely sensed data, automatic parameter optimization and supercomputing give it the potential to be used in large-watershed flood forecasting; therefore, in this study the Liuxihe model is employed to study flood forecasting in the LRB.

Hydrological data
There are 66 rain gauges installed in the watershed. In this study, hydrological data of 30 flood events have been collected at an hourly step, including the precipitation at the rain gauges and the river discharge at the Liuzhou river gauge, which is located in the downstream part of the watershed close to the outlet, as shown in Fig. 1; brief information on these flood events is listed in Table 1.

Terrain property data
Terrain property data include a DEM, land use/cover map and soil map, which are used for setting up the distributed hydrological model for flood forecasting. In this study, the DEM was downloaded from the SRTM database (Falorni et al., 2005; Sharma and Tiwari, 2014), the land use type was downloaded from the USGS land use type database (Loveland et al., 1991, 2000), and the soil type was downloaded from the FAO soil type database (http://www.isric.org). The downloaded DEM has a spatial resolution of 90 m × 90 m. Considering that the LRB is large, the computational load of a model at 90 m × 90 m resolution may be too heavy for this study; therefore, the DEM is rescaled to a resolution of 200 m × 200 m, as shown in Fig. 2a. The downloaded land use and soil type data were at a resolution of 1000 m × 1000 m and are therefore rescaled to the same resolution as the DEM, as shown in Fig. 2b and c, respectively.

Liuxihe model setup
Considering that the LRB is large, the DEM with a 200 m × 200 m resolution is adopted to set up the model structure, not the original 90 m × 90 m resolution. The whole watershed is first divided horizontally into 1 469 900 cells by the DEM, which are further categorized into hillslope cells and river cells. By using the Strahler method (Strahler, 1957), the river channel is divided into a three-order system as shown in Fig. 3, which divides all of the cells into 1 463 204 hillslope cells and 6696 river cells.
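The Strahler ordering rule used above can be sketched in a few lines. The network representation (a dictionary mapping each node to the nodes draining directly into it) and the example network are hypothetical and only illustrate the rule; they are not the data structures of the Liuxihe model.

```python
def strahler_order(upstream, node):
    """Strahler order of `node` in a tree-shaped river network.

    `upstream` maps each node to the list of nodes draining directly into it.
    """
    children = upstream.get(node, [])
    if not children:
        return 1  # headwater segment
    orders = sorted((strahler_order(upstream, c) for c in children), reverse=True)
    # The order rises only when two or more tributaries share the highest order.
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Hypothetical network: a and b join at c; d and e join at f; c and f join at the outlet.
network = {"outlet": ["c", "f"], "c": ["a", "b"], "f": ["d", "e"]}
print(strahler_order(network, "outlet"))  # prints 3
```

The recursion is adequate for a small illustration; a network with thousands of river cells would more likely be processed iteratively from the headwaters downstream.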
To estimate the river channel sizes, 178 virtual nodes were set on the river channel system, and 225 virtual channel sections were formed, as shown in Fig. 3. As in the Liuxihe model, the shape of the virtual channel sections is assumed to be trapezoidal, and therefore the cross section size is represented by three dimensions: bottom width, side slope and bottom slope. As proposed in the Liuxihe model, the bottom width is estimated from satellite remote sensing imagery. The side slope has low sensitivity and can be estimated based on local experience. The bottom slope is calculated from the DEM along the virtual channel section.
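As an illustration of the last step, the bottom slope of one virtual channel section could be computed from the DEM elevations of its river cells roughly as follows. The function name, the assumption that consecutive cells are one cell size apart, and the small positive lower bound are all choices made for this sketch, not details given in the paper.

```python
def bottom_slope(elevations_m, cell_size_m=200.0):
    """Mean bed slope of a channel section from DEM elevations.

    `elevations_m` lists the DEM elevation of each river cell from the
    upstream end to the downstream end of one virtual channel section.
    Assumption: consecutive cells are one cell size apart (no diagonal steps).
    """
    if len(elevations_m) < 2:
        raise ValueError("need at least two cells to compute a slope")
    length_m = cell_size_m * (len(elevations_m) - 1)
    drop_m = elevations_m[0] - elevations_m[-1]
    # Clamp to a small positive value so routing never sees a flat or adverse bed.
    return max(drop_m / length_m, 1e-5)


# Hypothetical section: five river cells dropping 2 m over 800 m.
print(bottom_slope([152.0, 151.5, 151.2, 150.8, 150.0]))
```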

Parameter optimization
In the Liuxihe model, an initial parameter set is first derived from the terrain properties, including the DEM, soil type and land use/cover type, and the parameters are then optimized. In this study, the evaporation coefficient, which is the insensitive land use/cover-related parameter, is initially set to 0.7 for all cells based on experience. The initial value of roughness, i.e., the Manning coefficient, which is the sensitive land use/cover-related parameter, is derived from the land use/cover type based on references (Chen et al., 1995; Zhang et al., 2006, 2007; Shen and Shuanghe, 2007; Guo et al., 2010; Li et al., 2013; Zhang et al., 2015) and listed in Table 2. The soil-related parameters include the water content at saturation condition, the water content at field condition, the water content at wilting condition, the hydraulic conductivity at saturation condition, the soil thickness and the coefficient b. Based on references (Zeidler, 1993; Anderson et al., 1996), a value of 2.5 is set to b for all soil types, and the water content at wilting condition is set to be 30 % of the water content at the saturation condition. The soil thickness is estimated based on local experience. The other soil-related parameters, including the hydraulic conductivity at saturation condition, are estimated by using the Soil Water Characteristics Hydraulic Properties Calculator (Arya et al., 1981) based on soil texture, organic matter, gravel content, salinity and compaction. The estimated initial values of soil-related parameters are listed in Table 3.
In this study, the PSO algorithm is employed to optimize the initial model parameters, as the PSO algorithm has been integrated into the Liuxihe model cloud (Chen et al., 2013, 2016). The number of particles of the PSO algorithm is set to 20, the value range of the inertia weight ω is set to 0.1 to 0.9, the value ranges of the acceleration coefficients C1 and C2 are set to 1.25 to 2.75 and 0.5 to 2.5, respectively, and the maximum number of iterations is set to 50. The flood event of 20080609 is selected to optimize the parameters of the Liuxihe model, and Fig. 4 shows the result of the parameter optimization: Fig. 4a is the parameter evolution process, Fig. 4b is the changing curve of the objective function, which is set to minimize the peak flow error, and Fig. 4c is the simulated hydrograph of flood event 20080609 with the optimized parameters.
From the results in Fig. 4, it can be seen that the parameter optimization process converges after 12 evolutions and the optimal parameter values are reached. The hydrograph simulated for the flood event used for parameter optimization fits the observed hydrograph quite well, which indicates that the parameter optimization is effective.
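A minimal PSO sketch with the settings quoted above (20 particles, at most 50 iterations, ω between 0.1 and 0.9, C1 between 1.25 and 2.75, C2 between 0.5 and 2.5) is given below. The text gives only the ranges, so the linear schedules are an assumption, and the toy objective merely stands in for the peak flow error, which in the study requires a full model run per evaluation. Under this scheme, 20 particles and 50 iterations mean up to about 1000 objective evaluations per optimization.

```python
import random


def pso(objective, bounds, n_particles=20, max_iter=50, seed=0):
    """Minimise `objective` over box `bounds`; return (best_x, best_f, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]

    for it in range(max_iter):
        frac = it / (max_iter - 1) if max_iter > 1 else 1.0
        # Assumed schedules: the text gives only the ranges, not how they vary.
        w = 0.9 - 0.8 * frac    # inertia weight 0.9 -> 0.1
        c1 = 2.75 - 1.5 * frac  # cognitive coefficient 2.75 -> 1.25
        c2 = 0.5 + 2.0 * frac   # social coefficient 0.5 -> 2.5
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Keep each particle inside the feasible parameter range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
    return gbest, gbest_f, history


# Toy stand-in for the peak flow error: distance from a hypothetical parameter set.
target = [0.7, 2.5]
best_x, best_f, history = pso(
    lambda x: sum((a - b) ** 2 for a, b in zip(x, target)),
    bounds=[(0.0, 1.0), (0.0, 5.0)],
)
print(best_x, best_f)
```

The `history` list plays the role of the objective function curve in Fig. 4b: it can only decrease or stay flat from one iteration to the next.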
As mentioned above, the automatic parameter optimization of a distributed hydrological model is very time consuming. In this study, even though a supercomputer is employed with parallel computation techniques, the time used for parameter optimization is still very long; the total time used for achieving the above optimal parameters of the Liuxihe model for LRB flood forecasting is 220 h, more than 9 days. Considering that several runs are usually needed before achieving the final results, the parameter optimization procedure may take a few months, but this run time is a good investment, and the validation results prove that it is worth doing.

Model validation
The other 29 flood events were simulated by using the Liuxihe model with the above optimized parameters. The simulated hydrographs of eight flood events are shown in Fig. 5, together with the hydrographs of eight flood events simulated with the initial parameters.
From the results in Fig. 5, it can be seen that the simulated flood processes fit the observations reasonably well, the simulated peak flows in particular, and that the hydrographs simulated with the optimized model parameters are a large improvement over those simulated with the initial parameters. To further analyze the effect of parameter optimization on model performance, five evaluation indices of the simulated flood events, namely the Nash-Sutcliffe coefficient, the correlation coefficient, the process relative error, the peak flow error and the water balance coefficient, are calculated from the simulated results. Table 4 lists the five indices for the results simulated with both the initial and the optimized parameters.
From Table 4, it can be seen that the five evaluation indices are quite good for the hydrological processes simulated with the optimized model parameters. The average peak flow error is 5 %, with a maximum of 14 %. The average Nash-Sutcliffe coefficient, correlation coefficient, process relative error and water balance coefficient are 0.82, 0.83, 0.22 and 0.87, respectively, which are also quite good for large river basin flood simulation. The hydrological processes simulated with the optimized model parameters are also good improvements over those simulated with the initial parameters, for which the average Nash-Sutcliffe coefficient, correlation coefficient, process relative error, peak flow error and water balance coefficient are 0.64, 0.62, 0.37, 0.29 and 0.78, respectively. All five indices improve: the average Nash-Sutcliffe coefficient, correlation coefficient and water balance coefficient increase by 0.18, 0.21 and 0.09, respectively, and the average peak flow error and process relative error decrease by 24 % and 15 %, respectively. Therefore, it can be concluded that the Liuxihe model set up in LRB with optimized parameters is reasonable and can be used for flood forecasting of LRB. This also implies that parameter optimization of the distributed hydrological model can improve model performance and should be done whenever possible.
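For readers who want to reproduce such indices on their own hydrographs, a minimal sketch follows. The Nash-Sutcliffe coefficient and the Pearson correlation coefficient use their standard definitions; the exact formulas the paper uses for the process relative error and the water balance coefficient are not given in the text, so the definitions below are assumptions, and the discharge values are hypothetical.

```python
import math


def evaluation_indices(observed, simulated):
    """Return a dict of five goodness-of-fit indices for one flood event."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    return {
        # Nash-Sutcliffe coefficient: 1 is a perfect fit.
        "nash_sutcliffe": 1.0 - sq_err / var_obs,
        # Pearson correlation coefficient.
        "correlation": cov / math.sqrt(var_obs * var_sim),
        # Assumed definition: mean absolute relative error over the hydrograph.
        "process_relative_error": sum(abs(s - o) / o for o, s in zip(observed, simulated)) / n,
        # Relative error of the simulated peak discharge.
        "peak_flow_error": abs(max(simulated) - max(observed)) / max(observed),
        # Assumed definition: ratio of simulated to observed volume.
        "water_balance": sum(simulated) / sum(observed),
    }


obs = [100.0, 400.0, 900.0, 600.0, 300.0]  # hypothetical discharges, m^3/s
sim = [120.0, 380.0, 850.0, 640.0, 290.0]
for name, value in evaluation_indices(obs, sim).items():
    print(f"{name}: {value:.3f}")
```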

Computation time vs. model resolution
To evaluate the spatial resolution scaling effect of distributed hydrological modeling in LRB, the DEM with a 90 m × 90 m resolution is rescaled to resolutions of 400 m × 400 m, 500 m × 500 m, 600 m × 600 m and 1000 m × 1000 m; the land use and soil type at a 1000 m × 1000 m resolution are also rescaled to the same resolution as the DEM used. Liuxihe models for LRB flood forecasting at the above resolutions are then set up with the above methods, and the model structures are shown in Fig. 6. With different spatial resolutions, the numbers of grid cells, hillslope cells and river cells are different, but the river channel orders are all set to 3. The numbers of virtual channel nodes for the 400 m × 400 m, 500 m × 500 m, 600 m × 600 m and 1000 m × 1000 m resolution models are 100, 68, 46 and 33, respectively, and the numbers of grid cells, hillslope cells and river cells at each model resolution are listed in Table 5. The sizes of the virtual cross sections in Fig. 6 are estimated with the same method as described above.
From Table 5, it can be seen that the number of grid cells of the model with a 200 m × 200 m resolution is 4 times that of the 400 m × 400 m resolution, 6.25 times that of the 500 m × 500 m resolution, 9 times that of the 600 m × 600 m resolution, and 25 times that of the 1000 m × 1000 m resolution; i.e., the number of grid cells grows with the square of the refinement in cell size (a power of 2), not linearly with the model resolution.
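These ratios follow directly from the geometry of square cells covering a fixed area, as the short check below illustrates (the function name is introduced here for illustration only):

```python
def cell_count_ratio(fine_m, coarse_m):
    """How many times more square cells a finer grid needs over the same area."""
    return (coarse_m / fine_m) ** 2


for coarse in (400, 500, 600, 1000):
    print(f"200 m vs {coarse} m: {cell_count_ratio(200, coarse):g}x")
```

The loop prints 4x, 6.25x, 9x and 25x, matching the ratios quoted from Table 5.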
Parameters of the models with 400 m × 400 m, 500 m × 500 m, 600 m × 600 m and 1000 m × 1000 m resolutions are optimized with the PSO algorithm by using the same flood event data, and are listed in Table 6. The results show that some parameters differ significantly between resolutions while others change little, which implies that the model parameters are resolution dependent.
Computation times required for parameter optimization are quite different.

Model performance vs. model resolution
The other 29 flood events are also simulated with the models at 400 m × 400 m, 500 m × 500 m, 600 m × 600 m and 1000 m × 1000 m resolutions. Simulated hydrographs of five flood events, including two big, two medium and one small, are shown in Fig. 7.
From the results it can be seen that the hydrological processes simulated at the five spatial resolutions are quite different. The result simulated at 1000 m × 1000 m resolution is poor: although the flood shapes are simulated well, the peak flows are much lower than the observations; therefore, the result is not acceptable and cannot be recommended. The result at 600 m × 600 m resolution is better than that at 1000 m × 1000 m, but there is still a large peak flow error, and therefore this resolution is also not recommended. The result at 500 m × 500 m resolution is a big improvement over those at 600 m × 600 m and 1000 m × 1000 m: the flood shapes are more similar to the observations and the peak flows are closer to the observations; therefore, it could be recommended for flood forecasting if a much finer spatial resolution is not feasible. The result at 400 m × 400 m resolution shows some improvement over that at 500 m × 500 m, but it is not significant, so the 400 m × 400 m model is not recommended as a replacement for the 500 m × 500 m model. The result at 200 m × 200 m resolution is a big improvement over those at 400 m × 400 m and 500 m × 500 m: the flood shapes fit the observations much better and the peak flows are also much closer to the observations; it is a good simulation result and could be recommended for flood forecasting of the LRB. As the results are good enough, there is no need to explore finer model resolutions further.

Conclusions
By employing the Liuxihe model, a physically based distributed hydrological model, this study sets up a distributed hydrological model for the flood forecasting of the Liujiang River basin in southern China, which can be regarded as a large watershed. Terrain data including DEM, soil type and land use type are freely downloaded from the web, and the model structure with a high resolution of 200 m × 200 m grid cells is set up, which divides the whole watershed into 1 469 900 grid cells that are further divided into 1 463 204 hillslope cells and 6696 river cells. The initial model parameters are derived from the terrain property data and then optimized by using the PSO algorithm with one observed flood event, which largely improves the model performance. A further 29 observed flood events are simulated by using the model with optimized parameters, the results are analyzed, and the model scaling effects are studied. Based on these studies, the following conclusions are suggested.
1. In the Liuxihe model, the river channels are divided into virtual channel sections; the cross section shape is assumed to be trapezoidal, and the size is assumed to be the same within a virtual channel section. The size of the virtual channel section is simplified to three indices, bottom width, side slope and bottom slope, which are estimated by using remote sensing imagery. This method not only makes the distributed model application practical but also simplifies the river channel routing method, which significantly increases the model computation efficiency and allows the model to be used in larger watersheds. Results in this study show that the model set up with this method has a reasonable performance; i.e., this simplification has not significantly sacrificed the model's flood simulation accuracy, and therefore it could be used in distributed hydrological modeling of large watersheds, with the Liuxihe model or with other models.
2. Uncertainty exists in physically derived model parameters. Parameter optimization could reduce this uncertainty and is highly recommended when some observed hydrological data are available. In this study, the hydrograph simulated with optimized model parameters fits the shape of the observed hydrograph better than that simulated with the initial model parameters, and the five evaluation indices are also improved. The average increases in the Nash-Sutcliffe coefficient, correlation coefficient and water balance coefficient are 0.18, 0.21 and 0.09, respectively, and the average decreases in the peak flow error and process relative error are 24 % and 15 %, respectively. This implies that the model performance is improved significantly by parameter optimization.
3. The computation time needed to run a distributed hydrological model does not increase linearly as the model spatial resolution becomes finer; it increases approximately as a power of 2. In this study, the computation time required for parameter optimization of the model at a 200 m × 200 m resolution is 220 h, which is 4 times that of the model at 500 m × 500 m and 18.3 times that of the model at 1000 m × 1000 m resolution. With the Liuxihe model cloud system implemented on the high-performance supercomputer, 200 m × 200 m is the highest resolution that could be achieved in modeling Liujiang River basin flooding with the Liuxihe model, considering the computation cost. This also means that, if the user can afford the high computation cost, a larger watershed could also be modeled with the Liuxihe model by implementing the Liuxihe model cloud system on a much more advanced high-performance supercomputer; this could easily be done at present if the user considers the investment worthwhile.
4. In forecasting watershed floods with a distributed hydrological model, a minimum model spatial resolution must be maintained to keep the model performance acceptable. Usually, if the model's spatial resolution increases, i.e., the grid cell gets smaller, the model performs better, but the run time increases significantly. There is therefore a threshold model spatial resolution that keeps the model performance reasonable while keeping the run time as short as possible. In this study, the threshold model spatial resolution is the 500 m × 500 m grid cell, but the 200 m × 200 m grid cell is recommended as a trade-off between computation cost and model performance. This conclusion may differ among watersheds for the Liuxihe model, or even within the same watershed for different models.
5. The river channel system derived from the freely downloaded terrain data is very similar to the natural river channel system after the DEM is rescaled from its original spatial resolution of 90 m × 90 m to 200 m × 200 m, 500 m × 500 m and 1000 m × 1000 m, although the higher-resolution DEM describes the river channels in more detail. This means that the freely downloaded DEM could be used to set up the Liuxihe model for Liujiang River basin flood forecasting.
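The five evaluation indices referred to in conclusion 2 can be illustrated with a minimal sketch. The Nash-Sutcliffe coefficient and the correlation coefficient follow their standard definitions. The formulas used here for the water balance coefficient (simulated volume divided by observed volume), the peak flow error (relative error of the largest discharge) and the process relative error (mean absolute relative deviation) are assumptions made for illustration, since this section does not state them. The function name `evaluation_indices` and the sample discharges are likewise hypothetical.

```python
import math


def evaluation_indices(obs, sim):
    """Return five goodness-of-fit indices for one simulated flood event.

    obs, sim: equally long sequences of observed and simulated discharge.
    The last three definitions are assumptions, not taken from the paper.
    """
    if len(obs) != len(sim) or len(obs) == 0:
        raise ValueError("obs and sim must be non-empty and of equal length")
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n

    # Nash-Sutcliffe coefficient: 1 - (squared error / variance of observations)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    nash = 1.0 - sse / sst

    # Pearson correlation coefficient between observed and simulated series
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    corr = cov / math.sqrt(sst * var_sim)

    # Assumed: water balance coefficient = simulated volume / observed volume
    water_balance = sum(sim) / sum(obs)

    # Assumed: peak flow error = relative error of the maximum discharge
    peak_error = abs(max(sim) - max(obs)) / max(obs)

    # Assumed: process relative error = mean absolute relative deviation
    process_error = sum(abs(s - o) / o for o, s in zip(obs, sim)) / n

    return {
        "nash_sutcliffe": nash,
        "correlation": corr,
        "water_balance": water_balance,
        "peak_flow_error": peak_error,
        "process_relative_error": process_error,
    }


if __name__ == "__main__":
    observed = [100.0, 300.0, 800.0, 500.0, 200.0]   # hypothetical discharges
    simulated = [0.9 * q for q in observed]          # uniform 10 % underestimate
    for name, value in evaluation_indices(observed, simulated).items():
        print(f"{name}: {value:.3f}")
```

In this toy case the simulation underestimates every ordinate by 10 %, so the correlation stays at 1 while the water balance coefficient, peak flow error and process relative error all register the bias. This shows why several indices are reported together instead of a single score.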

Data availability
The DEM data were downloaded from the SRTM database (http://srtm.csi.cgiar.org), the land use type data were downloaded from the USGS Global Land Cover Characterization (GLCC) database (https://lta.cr.usgs.gov/GLCC), and the soil type data were downloaded from the FAO soil type database (http://www.isric.org). The flood event data, including the rainfall and river discharge data, were provided by the Bureau of Hydrology, Pearl River Water Resources Commission, China; these data can only be used for this study and cannot be provided to others by the authors.
Competing interests. The authors declare that they have no conflict of interest.

Figure 4. Parameter optimization results of Liuxihe model for LRB with PSO algorithm.

Figure 5. Simulated flood events by Liuxihe model with optimized parameters.
For the model with a 200 m × 200 m resolution, the time for parameter optimization is 220 h, whereas for the models with 400 m × 400 m, 500 m × 500 m, 600 m × 600 m and 1000 m × 1000 m resolutions it is 80, 55, 35 and 12 h, respectively. The time needed for parameter optimization of the model at 200 m × 200 m resolution is 2.75 times that for the 400 m × 400 m resolution model, 4 times that for the 500 m × 500 m resolution model, 6.3 times that for the 600 m × 600 m resolution model, and 18.3 times that for the 1000 m × 1000 m resolution model. Considering the time needed for a model run, the 200 m × 200 m model resolution is regarded as appropriate for the LRB.
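The run times reported above can be checked with a short calculation. The sketch below recomputes the time ratios and fits a power law, time ∝ (1 / grid size)^p, by ordinary least squares in log-log space. The power-law form and the fitting approach are assumptions made for illustration, not a procedure described in the paper, and the names `hours_by_grid_size` and `fit_power_law_exponent` are hypothetical.

```python
import math

# Parameter optimization times reported in the text: grid size (m) -> hours.
hours_by_grid_size = {200: 220.0, 400: 80.0, 500: 55.0, 600: 35.0, 1000: 12.0}


def fit_power_law_exponent(hours):
    """Least-squares slope of log(time) against log(1 / grid size)."""
    xs = [math.log(1.0 / size) for size in hours]
    ys = [math.log(t) for t in hours.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Ratio of the 200 m run time to the run time at each coarser resolution.
ratios_to_200m = {
    size: hours_by_grid_size[200] / t for size, t in hours_by_grid_size.items()
}
exponent = fit_power_law_exponent(hours_by_grid_size)

if __name__ == "__main__":
    for size, ratio in ratios_to_200m.items():
        print(f"200 m vs {size} m: {ratio:.2f} times the run time")
    print(f"fitted exponent p: {exponent:.2f}")
```

With these five data points the fitted exponent comes out close to 1.8, slightly below the 2 that would be expected if run time scaled exactly with the number of grid cells. With only five points this is indicative at best.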

Figure 7. Simulated results with different model resolutions.

Table 1. Brief information of flood events in LRB.

Table 2. The initial values of land use/cover-related parameters.
at saturation condition, soil thickness and soil porosity characteristics coefficient b. Based on past modeling experiences and references

Table 3. The initial values of soil-related parameters.

Table 4. Evaluation indices of the simulated flood events.

Table 5. Grid cell numbers with different model spatial resolution.

Table 6. Optimized parameters with different model spatial resolution*.