Robust extraction of thalwegs network from DTM : application on badlands

Robust extraction of thalwegs network from DTM: application on badlands N. Thommeret, J. S. Bailly, and C. Puech CNRS, UMR 8591, Laboratoire de Géographie Physique, 92190, Meudon, France Cemagref, UMR TETIS, 34093, Montpellier, France AgroParisTech, UMR TETIS, 34093, Montpellier, France AgroParisTech, UMR LISAH, 34060, Montpellier, France Received: 23 December 2009 – Accepted: 31 December 2009 – Published: 1 February 2010 Correspondence to: N. Thommeret (nathaliethommeret@gmail.com) Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
Badlands are usually defined as intensely dissected landscape where vegetation is sparse or absent and which are useless for agriculture (Bryan and Yair, 1982).A combination of various weathering and erosion processes affects the side slopes and channels at different spatial and time scales.This results in the modelling of a particular landscape.To understand the geomorphological and hydrological functioning over highly eroded areas, different scale cartographies are needed.The badlands morphology consists in numerous gullies of variable size (from 10 cm up to 100 m wide) organised in hierarchical networks.The high drainage density and steep slope make the field work and detailed cartography difficult.Consequently, remote sensing seems Introduction

Conclusions References
Tables Figures

Back Close
Full to be the most appropriate source of data for badlands mapping.Badlands mapping provides essential information for the study of their spatial pattern.The quantitative description participates to a better knowledge of the functioning and scaling relationship.
In addition, it allows accurate comparisons between different eroded areas.
The gullies network representation and characterisation are not obvious.One possibility is to represent the gullies by one of their constitutive elements: thalwegs networks.
Thalweg is defined as the line joining the lowest points of the gully.In badlands topography, these networks are characterised by tree-structured topologies.Then, to study the spatial pattern, digital indices describing the 3-D network geometry and topology must be computed for different extents and resolutions.The indices are very sensitive to artefacts so the network representations must fit the data information to provide stable results.Airborne or high resolution satellite sensors supply source data for thalwegs representation.We need an automatic and easily repeatable method with less subjective parameter as possible to extract thalwegs network.If image interpretation is possible and provides consistent networks, the result depends on the operator and so is not reproducible (Molloy and Stepinski, 2007).Moreover, this operational but fastidious technique is hard to apply on large extent areas.On the other side, the landform can be extracted directly from digital terrain models (DTM).Indeed, DTM are spatial digital representations of the relief and so allow staking out morphological features.Nevertheless, the features extraction as thalwegs from DTM asks some fundamental issues like which methods permit to extract complete and stable networks from spatial data.
The most used methods are based on regular grid DTM.Various drainage algorithms (O' Callaghan and Mark, 1984;Fairfield and Leymarie, 1991;Lea, 1992;Tarboton, 1997) offer possibilities of computing drainage networks all over the grid surface.However, the transition from one drainage flow path to a vector thalweg network is not self-evident (Martz and Garbrecht, 1995;Tarboton and Ames, 2001;Turcotte et al., 2001).Most of the time, the process uses a unique contributing area threshold beyond which thalwegs, valleys or hydrographical networks are depicted.Since this criterion Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc

Printer-friendly Version
Interactive Discussion has no direct physical significance, where to start thalwegs upstream?Some authors proposed topological or physical reasoning to establish the threshold (Tarboton et al., 1991;Montgomery andDietrich, 1988, 1994).Dietrich and Montgomery (1994) seek to distinguish dissected areas where fluvial transport influence is major from undissected areas where diffusionnal hillslope transport processes are dominant to locate channel heads.Whereas, the Tarboton et al. (1991) reasoning rely on searching "the smallest scale, measured in term on support area, where the key elevation scaling laws break".They argue that this point represents the physical transition from channel erosion to hillslope erosion.But as we plan to analyse the network topology and relate it to the morphology, we cannot use these reasoning.The question is then: how to get thalweg networks really fitting to the amount of information about landforms contained in a DTM?In other words, how to compute robust thalwegs network from a grid DTM? Morphological index computed from the DTM permits to locate concave areas represented in the data.A large number of parameters identifying the terrain convergences exist (Wilson and Gallant, 2000) and some have already been used for thalwegs extraction (Tarboton and Ames, 2001;Molloy and Stepinski, 2007).
In this paper, we present a method that combines existing drainage algorithm and a morphological index to delineate thalwegs network.In comparison with the plan curvature, we promote the convergence index that seems to be a good choice thanks to its ability of underscoring the morphological features like valleys.The method aims at extracting thalwegs objectively considering the significant landforms included in a DTM.The paper is divided in four main sections: Sect. 2 presents the study areas and the data; Sect. 3 describes the method proposed to extract and assess thalwegs network from DTM; then, Sect. 4 shows the results obtained with both virtual and actual DTM; and finally Sect. 5 discusses the method and the result mainly by comparing them to another extraction method.

Draix badlands catchments and data
The study area is located in the southern Alps in France, around 6 • 20 E and 44 • 08 N.
The test site represents one of the upper Durance catchment's badlands areas.These badlands are formed in the highly erodible black marls terrain of the Jurassic period.Moreover, the Mediterranean climate and precipitations regimes participate to the active weathering and erosion processes.This particular test site known as Draix experimental catchments constitutes a real field laboratory for the study of mountain erosion processes (Richard and Mathys, 1999).Five small catchments, from 1000 m 2 up to 1 km 2 , have been monitored since 1984 and are now labelled "Observatory for Research on the Environment" (ORE).For this work, we focused on a particular catchment called the Moulin.The Moulin catchment spreads over 0.08 km 2 with a low vegetation cover rate (40%) and a very high drainage density.The very steep slope, with an average slope of 33 the Moulin catchment produced by an airborne LiDAR (Bretar et al., 2009).Moreover, xyz points acquired with a DGPS (Jacome et al., 2008) permits to evaluate the altimetrical error spatial distribution of the two DTM.

Method
The presented method points toward the extraction of morphological feature -thalwegs network -reflecting the information contained in the DTM.More specifically, the objective is to delineate thalwegs where the DTM denotes a terrain concave curvature where flow converges.
We propose a three-step method based on the combination of a drainage algorithm and a morphological index (Fig. 1).The morphological index permits to locate convergent features.Two indices are used: firstly the plan curvature and then the convergence index (K öthe and Lehmeier, 1994;Kiss, 2004).In a first step, areas with significant convergence are derived from the DTM.We assume that these areas correspond to gully floor elements.But as these areas can be discontinuous, a second step of connection is needed.So, the discontinuous convergent areas are connected using a drainage algorithm to obtain a directed tree structure considered as the whole gully floor features.However, the result is a thick network.In the third step, the thick scheme is reduced and filtered to finally get a refined vector thalwegs network which allows some digital indices calculation.

Significant convergence areas detection
Several morphological indices are available to distinguish the flow convergence and divergence on DTM (Wilson and Gallant, 2000).Two main recent studies propose to integrate curvature parameters derived from DTM to extract thalwegs network.For both studies, the core idea is to flag convergence cells but with different convergence computing: one is the plan curvature (Tarboton and Ames, 2001) and the other is the Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc

Printer-friendly Version
Interactive Discussion tangential curvature (Molly and Stepinski, 2007).However, in this work, we compare the propensity of the plan curvature and the convergence index (CI) to extract convergence cells.Indeed, the CI that shows great potential to reveal the gully floors (Bretar et al., 2009).

Using the plan curvature
The plan curvature (or horizontal convexity; Wood, 1996) is defined as the second derivative of the surface elevation (Evans, 1972;Rana, 2006).The plan curvature (PC) measures the contour curvature and permits to distinguish the flow convergent and divergent cells of the DTM (Wilson and Gallant, 2000).The local surface is approximate by a quadratic function (Eq.1), and PC is found by differentiating the surface equation in direction transverse to the slope (Eq.2) (Zevenbergen and Thorne, 1987).

Using the convergence index
The convergence index (CI) calculation principle differs from the PC.The CI is based on the aspect which is defined as the slope direction (Zevenbergen and Thorne, 1987) and computed from the local surface function (Eqs. 1 and 3).
Then, considering a 3×3 moving window (Fig. 2), for each external cell i , θ i is the angle, in degrees, between the aspect of cell i and the direction to the centre i.e. the Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc Printer-friendly Version Interactive Discussion direction of the vector joining the centre of cell i and the centre of the window.The convergence index is the average of the θ i minus 90 CI ranges from −90 • to 90 • (K öthe and Lehmeier, 1994; Kiss, 2004).In the resulting grid, positive values represent divergent areas and negative values represent convergence areas.Null values relate to areas without curvature as planar slopes.

Significant values
However, some CI or PC values close to zero may not be significant: the DTM noise impacts on the results.So to highlight the significant cells which should effectively correspond to gully floors from proposed morphological criteria, a threshold is applied on each grid (CI T for the CI grid and PC T for the PC grid).To objectively determine the thresholds, the main parameter we use is the DTM altimetrical error spatial distribution.
To determine CI T , a 50×50 cells tilted plane DTM (with a slope of 33 • , which is the Draix mean slope) is simulated with a noise respecting the altimetrical error spatial structure of the actual DTM considered (Fig. 3).As spatially correlated errors were found and modelled on the DTM, the noise simulation process is based on geostatistical Gaussian simulation, in the same way and with same spatial covariance function and parameters as the one used in Bretar et al. (2009).The computation of CI from the simulated DTM provides threshold CI T beyond which CI values can be considered as significantly different from a plane landform.Indeed, the Gaussian distribution of the CI values allows setting CI T as mean value minus twice the standard deviation (σ).We accept that 2.5 percents of CI values due to hazard on noise can be classified in significant gully floors (Bretar et al, 2009).Then, a binary image attributes the value 1 to significant convergence cells, i.e. for CI>CI T and 0 to the non-convergence cells.The same process is applied to determine the threshold PC T .Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc Printer-friendly Version

Interactive Discussion
At the end of this step, disconnected convergence areas are identified and considered as gully floor features.Note that this identification is objective and repeatable.

Tree-structured gully floors connection
As the objective is to compute continuous and tree-structured thalwegs network, the disconnected convergence areas need to be connected to create a continuous area.
The use of drainage algorithms permits to achieve this goal of upstream-downstream reconnection.Most of the drainage algorithms implemented are based on local analyses and follow the same steps: i) determination of flow direction; ii) computation of flow accumulation; iii) and delineation of the network using a unique contributing area threshold (CA T ).The idea of flow accumulation permits to generate continuous flow path.Besides, the choice of a unidirectional flow (each cell has a unique output) ensures the tree structure.The main idea proposed here is firstly, to initiate the drainage area upstream only where a convergence is denoted and then, to force the flow through the convergence areas detected.The weight of the flow accumulation computation by the thresholded convergence grid makes the idea operational.Then we keep only the accumulation values different from zero instead of applying any contributing area threshold.
We compared the popular D8 algorithm (O' Callaghan and Mark, 1984) and the kinematic routing algorithm (KRA) (Lea, 1992).The second one presents a more flexible direction determination.Indeed, Lea (1992) assumes that flow moves across a planar surface in the direction of the steepest slope, or aspect angle, similar to a rolling ball such that, if a ball was released from the centre of a grid cell, it would travel down the steepest grade (Wilson, 2007).Moreover, whereas the D8 directions are multiples of 45 • , the KRA has a 1 • accuracy (Wilson, 2007).
Besides, whichever algorithm is chosen, the directions often need to be calculated on a filled DTM.Most of the time, the DTM contain sinks (cells without output because of the higher elevation of the neighbours) that need to be filled in order to obtain a continuous network (Martz and Garbrecht, 1999;Planchon and Darboux, 2001).Although, Introduction

Conclusions References
Tables Figures

Back Close
Full As a result, we obtain a binary raster map of gully floors represented as a continuous and tree-structure.

Vector thalweg network extraction
The last step aims at extracting a vector thalwegs network in order to perform a topological analysis of the network extracted.The network is settled from raster gully floors area.We successively use skeletonisation (Molloy and Stepinski, 2007) and vectorisation operations.The principle of skeletonisation is the reduction of a shape to its main curves.Then, we assume that the thalweg is the main curve of the gully floor area.For example, sinuosity is represented only if the gully floor describes a sinuosity.In Draix badlands landscape, this assumption seems reasonable.Moreover, the topological analysis in perspective needs to be performed on a tree vector network as refined as possible (with less artefacts as possible) because of their high sensitivity to artefacts.
The whole passage from a thick path to vector network is detailed here.Firstly, a raster network is defined by thinning the raster gully floors previously obtained.The raster network -the skeleton of the gully floors area -is one-pixel wide.Finally, it is vectorised and filtered.According to the tree-network representation hypothesis, the vectorisation has to avoid loops development.Then, the filtering process mostly consists in removing the shortest thalwegs.In this work, we assume that removing the segments shorter than two times the resolution was reasonable.According the DTM accuracy and the badlands morphology, these segments seem non significant and unstable.Then, stable topological indices can be computed from the resulting directed tree thalweg network.Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc Printer-friendly Version Interactive Discussion

Network delineation and topology errors quantification
We propose to quantitatively assess the network extracted in order to validate the extraction method.Yet, there is no standard method to assess the quality of extracted network (Molloy and Stepinski, 2007).Although, we propose a two-level error assessment based on the comparison with a reference network: the first level is a global assessment of the network focusing on the lengths values; the second level results from stream-by-stream observations.The reference network is a field mapped network.The questions that can be asked are: how much of the extracted network is over-detected (extracted segments that cannot be matched with the reference) and at this opposite how much is under-detected (reference segments missing on the extracted network)?
To answer these questions, a global comparison of the total network length gives first information on the correspondence between the extracted and reference networks.A more completed assessment is based on the overlap of the networks (Wiedermann et al., 1998;Molloy and Stepinski, 2007).The overlap permits then to quantify the match between the two networks.More specifically, this method consists in measuring the length of the extracted network included in the reference domain and inversely.The domain is defined as a buffer around the network.The width of the buffer is determined by an operator and the choice depends on the accuracy required considering the DTM resolution and the type of terrain.Two simple indicators are computed from the matching: the false negative (FN), which is the length of the reference not included in the extracted network domain, and the false positive (FP), which is the length of the extracted network not included in the reference domain.In other words, FN and FP should respectively correspond to the length of under-detection and the length of overdetection.FN and FP are normalised for more meaningful results; they range between 0, when no length is included, and 1, when all the length is included.Then an error indicator summering the information is computed (Eq.5); it ranges between 0 and 1.The closest to zero the error indicator is, the more compliant the network is.This indicator Introduction

Conclusions References
Tables Figures

Back Close
Full However, even if this overlap method provides interesting information on the network completeness and geometric accuracy (Wiedermann et al., 1998), some anomalies can disrupt the meaning of the length calculated (Fig. 4).For example, the length measure cannot be justified if the streams have not the same orientation.In this case, the measured thalweg does not necessarily correspond to the buffered thalweg.That is why we consider stream-by-stream assessment to provide more significant results, without ambiguity.This method answers directly to the question: which thalweg segment matches with the field reality and which does not?We make a one-way matching, from the extracted network to the field network: the extracted thalweg segments are associated to field thalweg segments.Then matched and non-matched segments are counted and normalised by the total number of extracted segments.Still, the procedure is manual and results should slightly depend on the operator.

Results
The proposed method was applied to both virtual and actual terrain cases.

Virtual valleys DTM
Figure 5 compares the results obtained using different extraction methods for two virtual DTM.The classical method is the kinematic routing algorithm (detailed in Sect.3.2).
For the titled plane DTM (Fig. 5a), we observe that with the classical method, the main stream is divided in two branches whereas the algorithms using a morphological index (PC or CI) return a single thalweg segment.This unique downstream line shows that the morphological index based methods provide more significant results Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc Printer-friendly Version Interactive Discussion in wide valleys.Moreover, it can be add that with the classical method, artefact lines are created on the sides of the DTM although it is a perfect plane; these lines do not appear when a curvature index is taken into account.The sides artefacts are the direct consequences of the unique contributing area threshold: a thalweg is delineated once the threshold is reached.Differences are also observed between the PC and CI based methods.The CI based network effectively fit the virtual valley: all the thalweg segments of the valley are represented and no over-detected segments are observed.Whereas, the PC based network presents under-detected segments.
The corrugated surface (Fig. 5b) shows that the flow is only passing in the concave part of the wave when a morphological index is used whereas with the classical method the flow goes up on the side where no curvature is represented.In this case, no distinctions are observed between the use of PC and CI.
In summary, it seems essential to integrate a morphological index for the thalwegs delineation.Furthermore, the CI based method provides a unique solution fitting the virtual valley.

The Moulin catchment
As in case of virtual terrain, we compare the networks obtained applying the CI based method, the PC based method and the one extracted with unique contributing area criterion methods.The comparison is mainly geometric.Finally, we present the numerical results of the network assessment.
Concerning the CI based method, the threshold CI T resulting of the geostatistical calculation considering the error spatial distribution of the LiDAR DTM is −9.54 • .For the plan curvature, PC T is −0.66.For the classical method extraction, the kinematic routing algorithm is used with a contributing area threshold CA T of 50 m 2 .Figure 6 shows the networks resulting from the three different methods applied on the onemeter resolution LiDAR DTM.
Regarding the geometric comparison, the use of a unique threshold method has resulted in both the presence of numerous drains disrupting the network topology 891 Introduction

Conclusions References
Tables Figures

Back Close
Full (especially at confluences) and the absence of drains that should be represented (especially upstream).This observation stays true for any threshold value.This points out the fact that not all the information of the DTM is used unlike with the CI based method.
The PC based network representation seems to be less informative than the CI based network: we observe less thalweg segments.
But the only way to prove which network is more compliant with the field the reality is to compare the extracted networks to a field reference (Fig. 7).The quantitative assessment of the three networks extracted is shown in Tables 1 and 2. According to the global assessment (Table 1), the more compliant network is the one extracted using the CI based method.Indeed, among all the networks, this one has the higher accuracy statistic (the closest to zero EI).Morphological index based method both supply more correct networks than the classical algorithm.The KRA and PC based network's FPs are low but their FNs are high so the difference between the two is important.In these two cases, most of the extracted network corresponds to the field reality so the geometry is correct but the network admits a lot of lack.CI based FP normalised is slightly higher than for the other method; it could be weighted by the fact that the probability for FP of being low is higher when the total length extracted is short.
Notice that the each network domain is defined as two-meter wide buffer.The more the buffer is reduced, the more FN and FP are high.So the CI based network is generally correct but in detail we can observe some imperfections.This quality measure helps to assess the geometric accuracy of the extraction.Whereas the stream-bystream assessment provides information about the topological accuracy.
The stream-by-stream assessment supply clear and accurate information about the geometry and, more interesting for our purpose, about the completeness of the extraction.As it shows better results in the global assessment, we present only the stream-by stream result of the CI based network (Table 2).81% of the extracted streams corresponds to a reality on the field.And the 19% of over-detected streams represents less than 10% of the total extracted network length.In other words, the total length of streams that have no reality is very low compared to the total length of the network.In Introduction

Conclusions References
Tables Figures

Back Close
Full Screen / Esc Printer-friendly Version Interactive Discussion addition, this assessment permits to directly locate the computation inconstancies.In this case, we observe that most of the over-detected thalwegs are located in forested areas.
Besides, we have to take into account the possible difference of scale representation between the network assessed and the reference.Define a scale for a field work is hard and it could be even heterogeneous according to the terrain accessibility.Indeed, some areas are more detailed in the field network.So the result interpretation must be weighted by this scale consideration.

Discussion and conclusion
In this paper, we propose a method aiming at extracting tree-structured gully floors and corresponding continuous thalweg networks.Whereas classical methods of extraction are based on a unique threshold, the presented method extracts thalwegs only where the DTM expresses a landform.Thus, this method allows a more robust computation of topological indices of thalweg networks that can otherwise be highly disrupted in geometry and topology.The positive points are both the objectivity and the reality of the thalwegs extracted.
To build a robust method, we use terrain parameters reflecting the convergence.Two main parameters are compared: the plan curvature that has already been used for thalwegs network extraction (Tarboton and Ames, 2001;Molly and Stepinski, 2007) and the convergence index.The results show the great ability of the convergence index to reveal the landform.Indeed, it provides a more informative and complete network especially for bare soils.In the few forested areas, the interpretation is fuzzier.The method is suitable for badlands areas or any other with bare soils.In other areas, like forested ones, results are less effective.The tendency is the computation of network between the sets of trees.Though, this is probably due to the DTM quality rather than any method deficiency.We observe that when the quality DTM is good under vegetation cover, the network is well detected.This method is data dependent and data-driven.Results are available for a given spatial resolution of DTM.Moreover, the method relies on few parameters which resume the spatial distribution of the altimetrical error of DTM data (parameters of spatial covariance).These parameters give an objective threshold for the delineation of significant flow convergent areas.This means that some initial work on DTM quality evaluation has to be handled first.As a limit, we only used one noisy tilted plan corresponding to the Draix average slope.The threshold would be more correct if it was determined by considering several planes with different slopes, in accordance with the badlands side slope distribution or by differencing error spatial structure in regards with terrain cover (e.g.vegetated/non vegetated areas).
In other contexts and for other resolutions, convergence indices performed on local neighbourhood can limit and be unsuitable to show wide and rough valleys.In that case, a multi-resolution approach based on the same principles would be interesting to develop.At a given resolution, the use of pyramidal moving windows for CI computation should smooth the resulting network and provide a more accurate result.
Furthermore, as we intend achieve a scaling analysis, we have to consider the effect of the resolution DTM.Can we apply this method on very high resolution DTM?
The answer has to be investigated in the future.Indeed, we assume that a resolution threshold exists beyond which the convergence index will not be the better parameter to reveal thalwegs network.From this value, the convergence may express the terrain roughness.More specifically, in badlands, the convergence may report the micromorphology of the terrain rather than the continuous thalwegs network.Indeed, the link between the studied object size and the resolution of base documents has to be considered for a physically significant extraction.Introduction

Conclusions References
Tables Figures

Back Close
Full have to be computed on the unmodified DTM in order to not disrupt the indices values.
balance of the extraction between the over and under-detection.Error Indicator (EI) = |FN normalised − FP normalised |(5)

2 Material 2.1 Virtual catchment and simulated data
• , make the field work quite hard and the use of remote sensing data like digital terrain model (DTM) becomes essential.Thanks to the ORE structure and the several research teams involved, lots of data are available on the study area.The actual DTM used is a one-meter resolution DTM of Introduction

Table 2 .
Stream-by-stream assessment of the CI based network on the test area of the Moulin catchment.