Terrain surfaces and 3-D landcover classification from small footprint full-waveform lidar data: application to badlands

This article presents the use of new remote sensing data acquired from airborne fullwaveform lidar systems. They are active sensors which record altimeter profiles. This paper introduces a set of methodologies for processing these data. These techniques 5 are then applied to a particular landscape, the badlands, but the methodologies are designed to be applied to any other landscape. Indeed, the knowledge of an accurate topography and a landcover classification is a prior knowledge for any hydrological and erosion model. Badlands tend to be the most significant areas of erosion in the world with the highest erosion rate values. Monitoring and predicting erosion within 10 badland mountainous catchments is highly strategic due to the arising downstream consequences and the need for natural hazard mitigation engineering. Additionaly, beyond the altimeter information, full-waveform lidar data are processed to extract intensity and width of echoes. They are related to the target reflectance and geometry. Wa will investigate the relevancy of using lidar-derived Digital Terrain Models (DTMs) and 15 to investigate the potentiality of the intensity and width information for 3-D landcover classification. Considering the novelty and the complexity of such data, they are presented in details as well as guidelines to process them. DTMs are then validated with field measurements. The morphological validation of DTMs is then performed via the computation of hydrological indexes and photo-interpretation. Finally, a 3-D landcover classification is performed using a Support Vector Machine classifier. The introduction of an ortho-rectified optical image in the classification process as well as full-waveform lidar data for hydrological purposes is then discussed.


Introduction
Remote sensing aims at collecting physical data from the Earth surface. In this special context, these data are used as inputs in erosion/hydrological models or to monitor hydrological fields over large areas (Schultz and Engman, 2000;King et al., 2005).
Images obtained in the visible domain with passive optical sensors can be analysed for generating 2-D landcover and landform maps either automatically by image processing methods (Chowdhury et al., 2007) or by photo-interpretation. The use of the infrared channel helps to detect the vegetation (Lillesand and Kiefer, 1994). In a stereoscopic configuration, images are processed to generate Digital Surface Models (DSMs) (Kasser and Egels, 2002).
More recently, airborne lidar (LIght Detection And Ranging) systems (ALS) provide the topography as 3-D point clouds by the measurement of the time-of-flight of a short laser pulse once reflected on the Earth surface. Moreover, such active systems, called multiple echo lidar, allow to detect several return signals for a single laser shot. It is particularly relevant in case of vegetation areas since a single lidar survey allows to acquire not only the canopy (the only visible layer from passive sensors in case of dense vegetation), but also points inside the vegetation layer and on the ground underneath. Depending on the vegetation density, some of them are likely to belong to the terrain. After a classification step in ground/off-ground points (also called filtering process), Digital Terrain Models (DTMs) can be generated. Such DTMs are of high interest for geomorphologists to study erosion processes (McKean and Roering, 2004) or to map particular landforms (James et al., 2007). Moreover, hydrologic models such as TOPOG (O'Loughlin, 1986) or Published by Copernicus Publications on behalf of the European Geosciences Union.
TOPMODEL (Quinn et al., 1991) handle topographic data either as Digital Surface/Terrain Models or as meshes. By classifying terrain under different levels and types of vegetation cover, lidar data could provide new land classification, i.e., terrain, cover maps. This new 3-D landcover classification can be related to the hydrological processes that are usually modelled in hydrological production indices as the SCS runoff curve number (USD, 1986), the runoff coefficient in the rational method (Pilgrim, 1987) or the plant cover factor in Wischmeier and Smith's Empirical Soil Loss Model (USLE) (Wischmeier et al., 1978). Lidar data have also been investigated by Bailly et al. (2008) and Murphy et al. (2008) for drainage network characterisation (Cobby et al., 2001;Antonarakis et al., 2008) and by Mason et al. (2003) as input data for flood prediction problems. For the latter, the authors use lidar data as resampled elevation grids and detect high and low vegetation areas. Vegetation heights are then converted into friction coefficients (Manning-Strickler coefficients) for hydraulic modeling.
Finally, multiple echo lidar data are typically used for the unique possibility of extracting terrain points as well as vegetation heights with high accuracy. Hollaus et al. (2005) state the possibility to derive the roughness of the ground from lidar point clouds. However, the filtering algorithm used to process lidar data is landscape dependent and the classification result may be altered (Sithole and Vosselman, 2004).
Based on the same technology as multiple echo lidar systems, full-waveform systems record energy profiles as a function of time. These profiles are processed to extract 3-D points (echoes) as well as other interesting features that could be related to landscape characteristics. Depending on the landscape properties (geometry, reflectance) and on the laser beam divergence angle (entailing small or large footprints), the recorded waveform becomes of complex shape. An analytical modelling of the profiles provides the 3-D position of significant targets as well as the amplitude and the width of lidar echoes (Section 3.1). A detailed state-of-the-art of such systems can be found in Mallet and Bretar (2009). This paper introduces a set of methodologies for processing full-waveform lidar data. These techniques are then applied to a particular landscape, the badlands, but the methodologies are designed to be applied to any other landscape.
Indeed, badlands tend to be among the most significant areas of erosion in the world, mainly in semi-arid areas and in sub-humid Mediterranean mountainous areas (Torri and Rodolfi, 2000). For the latter case, more active dynamics of erosion are observed (Regues and Gallart, 2004) with the highest erosion rate values in the world (Walling, 1988). Very high concentrations of sediment during floods, up to 1000 g l −1 , were registered (Descroix and Mathys, 2003).
Badlands are actually defined as intensely dissected natural and steeply landscapes where vegetation is sparse (Bryan and Yair, 1982). They are characterised by V-shape gullies that are highly susceptible to weathering and erosion (Antoine et al., 1995). These landscapes result from unconsol-idated sediments or poorly consolidated bedrock, as marls, under various climatic conditions governing bedrock disintegration through chemical, thermal or rainfall effects (Nadal-Romero et al., 2007).
The hydrological consequences of erosion processes on this type of landscapes are a major issue for economics, industry and environment: high solid transport, bringing heavily loaded downstream flood, are silting up reservoirs (Cravero and Guichon, 1989) and downstream river aquatic habitats (Edwards, 1969). Therefore, monitoring and predicting erosion within badland mountainous catchments is highly strategic due to the arising downstream consequences and the need for natural hazard mitigation engineering . Traditionally, the monitoring activities in catchments are derived from heavy in situ equipments on outlets or from isolated and punctual observations within catchments. In complement to these traditional observations, hydrologists are expecting remote sensing to help them to upscale and/or downscale erosion processes and measurements in other catchments, by providing precise and continuous spatial observations of erosion features or erosion driven factors (Puech, 2000). Among other inputs, erosion monitoring and modelling approaches on badlands  need maps of landform features, mainly gullies (James et al., 2007) that are driving the way flows and maps of important driven factors of erosion in mountainous badland catchments. These factors are soil and rocks characteristics (Malet et al., 2003), vegetation strata used to derive 3-D landcover classes controlling rainfall erosivity, and the terrain topography (Zhang et al., 1996), which allow to derive slope and aspect of marly hillslopes .
This paper aims to investigate the potential of using fullwaveform lidar data as relevant elevation data, but also as a possible data source for 3-D landcover classification focusing on the characterisation of badland erosion features and terrain classification. If some papers have been published regarding the interpretation of full-waveform lidar data, most of them are based on large footprint lidar data acquired from satellite platforms (Zwally et al., 2002). Very few researches have been carried out on the analysis of small footprint fullwaveform airborne lidar data (Wagner et al., 2006;Jutzi and Stilla, 2006;Reitberger et al., 2008;Mallet and Bretar, 2009). Considering their novelty and their complexity, we propose to develop some new and specific guidelines related to their processing (including some physical corrections) and their management. Additionally, the extraction of the amplitude and the width of each echo is investigated as potential information for landcover classification. Amplitude and width are related to the target reflectance as well as to the local geometry (slope, 3-D distribution of the target). Finally, since colour images taken with an embedded digital camera are almost always available at the same time as the lidar survey occurs, we investigated the use of additional colour information on the landcover classification results.
Hydrol. Earth Syst. Sci., 13,[1531][1532][1533][1534][1535][1536][1537][1538][1539][1540][1541][1542][1543][1544][1545]2009 www.hydrol-earth-syst-sci.net/13/1531/2009/ This paper begins with a background on full-waveform lidar systems (Sect. 2.1) as well as a brief presentation of a management system to handle the data (Sect. 2.2). We then present the processes to convert raw data into 3-D point clouds (Section 3.1). Sect. 3.2 is dedicated to the development of a filtering algorithm to classify the lidar point cloud into ground/off-ground points as well as on the generation of DTMs. The echo amplitude and width extracted from full-waveform lidar data are described in Sect. 3.3. We focus this section on theoretical developments leading to a correction of amplitude values. Section 4 presents the badland area whereon investigations have been performed as well as the data: lidar data, orthoimages. DTMs produced by our algorithm are then validated by the computation of an hydrological index (Sect. 5) compared with manually (photo-interpretation) extracted ridges and valleys. We finally present in Sect. 6 the results of a 3-D landcover classification using a vegetation class and different classes of terrain. This classification is based on a supervised classifier: the Support Vector Machines (SVMs). Different features have been tested, including the three visible channels of two orthoimages: the first one has been acquired with an embedded digital camera during the lidar survey, the second one is an extracted part of French national orthoimage database. The opportunity of using full-waveform lidar data for hydrological purposes is then discussed.

Background on full-waveform lidar systems
The physical principle of ALS consists in the emission of short laser pulses, with a width of 5-10 ns at Full-Width-at-Half-Maximum (FWHM), from an airborne platform with a high temporal repetition rate of up to 200 kHz. They provide a high point density and an accurate elevation description within each laser diffraction beam. The two way runtime to the Earth surface and back to the sensor is measured. Then, the range from the lidar system to the illuminated surface is recorded (Baltsavias, 1999) as well as the trajectory of the plane. A lidar survey is composed of several parallel and overlapping strips (100 m to 1000 m width).
The emitted electromagnetic wave interacts with artificial or natural objects depending on its wavelength and is modified accordingly. For ALS systems, near infra-red sensors are used (typical wavelengths from 0.8 to 1.55 µm). The selected Pulse Repetition Frequency (PRF) depends on the acquisition mode and on the flying altitude. Contrary to multiple echo systems which record only some high energy peaks in real time, full-waveform lidar systems record the entire signal of the backscattered laser pulse. Figure 1 shows raw full-waveform data.
Full-waveform systems sample the received waveform of the backscattered pulse at a frequency of 1 GHz. The foot- print size depends on the beam divergence and on the flying altitude. Most commercial airborne systems have a small footprint (typically 0.3 to 1 m diameter at 1000 m altitude).

Handling full-waveform lidar data
Initially, raw full-waveform lidar data are sets of range profiles of various lengths. Raw profiles are acquired and stored following both the scan angle of the lidar system and a chronological order along the flight track. After the georeferencing process and the pre-processing step (Sect. 3.1), raw profiles become vectors of attributes containing, for each 3-D point, the x,y,z-coordinates, additional parameters (amplitude and FWHM) and a time-stamp that links it to the sensor geometry. Managing these data is much more complex than images: the topology (neighborhood system, topological queries) is designed to be as efficient as possible when accessing and storing the data. Indeed, the data volume is drastically larger than traditional laser scanning techniques: it takes 140 GB for an acquisition time of 1.6 h with a PRF of 50 kHz. Moreover, a 3-D/2-D visualisation tool is also necessary to handle the attributes, both in the sensor and in the ortho-rectified geometries. A specific software has therefore been developed for these purposes (Chauve et al., 2009).

From 1-D signals to 3-D point clouds
Contrary to multiple echo lidar sensors which provide directly 3-D point clouds, full-waveform sensors acquire 1D depth profiles along the line of sight for each laser shot. The derivation of 3-D points from these signals is composed of two steps: -The waveform processing step provides the signal maxima location, i.e., the range values, as well as additional parameters describing the echo shape.
-The georeferencing process turns the range value to a {x, y, z} triplet within a given geographic datum.

WAVEFORM PROCESSING:
It aims at maximising the detection rate of relevant peaks within the signal in order to foster information extraction. In the literature, a parametric approach is generally chosen to fit the waveform (Mallet and ). Parameters of a mathematical model are estimated. The objective is twofold. A parametric decomposition gives the signal maxima, i.e., the range values of the different targets hit by the laser beam. Then, the best fit to the waveform is chosen among a class of functions. This allows to introduce new parameters for each echo and to extract additional information about the target shape and its reflectance. Our methodology is based on Chauve et al. (2007). The authors describe an iterative waveform processing using a Non-Linear Least Squares fitting algorithm. After an initial coarse peak detection, missing peaks are found in the residuals of the difference between the modelled and initial signals. If new peaks are detected, the fit is performed again. This process is repeated until no further improvement is possible. This enhanced peak detection method is useful to model complex waveforms with overlapping echoes and also to extract weak echoes.
The Gaussian function has been shown to be suitable to model echoes within the waveforms (Wagner et al., 2006). Its analytical expression is: where µ is the maximum location, A the peak amplitude, and σ the peak width. For each recorded waveform, the transmitted pulse is also digitised. By retrieving its maximum location, the time interval between the pulse emission and its impact on a target is known. The range value of the target ensues from the timeof-flight calculation.
The standard deviation σ corresponds to the half width of the peak at about 60% of the full height. In some applications, however, the Full-Width-at-Half-Maximum (FWHM) is often used instead. We have FWHM = 2σ √ 2 ln 2.

GEOREFERENCING:
Similarly to multiple echo lidar sensors, computing the {x, y, z} coordinates of each echo in a geodetic reference frame from the range value requires additional data. The scan angle is used jointly to the range to calculate the {x, y, z} position for each point in the scanner coordinate frame. Then, the use of the GPS position of the aircraft, and the sensor attitude values (roll, pitch, heading) generate for each laser shot the {x, y, z} in a given geodetic datum. Finally, the positions can be transformed in some cartographic projection (French NTF Lambert IIÉtendu in this paper, see Section 4 for more details).
The transformation formulas cannot be expressed since they differ from one sensor to another. Offset values are different, depending on the configuration of the laser system, GPS and Inertial Measurement Unit (IMU) devices.
After applying the advanced step of waveform modelling, full-waveform lidar data generate denser point clouds than multiple echo data. It is particularly relevant when studying the vegetation structure (Mallet and ).

From point clouds to DTM
Basic processing of a lidar point cloud is the classification as ground/off-ground points, which is generally associated to the resampling of the data on a regular grid. Due to the accurate geometry of a lidar point cloud, many algorithms have been developed to automatically separate ground points from off-ground points (Sithole and Vosselman, 2004). Most of these approaches have good results when the topography is regular, but remain unperfect in case of mixed landscapes and steep slope conditions: parameters of the algorithms are often difficult to tune and do not fit over a large area. When ground points are mis-classified as off-ground points, the accuracy of the DTM may decrease (it depends on the spatial resolution and on the interpolation method). Inversely, when off-ground points (vegetation or man-made objects) are considered as ground points, the DTM becomes erroneous which can bias hydrological models. Vegetated landscapes with sparse vegetation in a mountainous area (alpine landscape) are particularly interesting for the study of natural hydrology and the phenomenons of erosion (cf. Section 4). Nevertheless, the processing of such landscapes need strong human interactions to correct the classification: since the detection of off-ground points of most algorithms are based on the detection of local slope changes, it may occur that the terrain (e.g., mountain ridges) shares the same properties. The DTM can therefore be over or under-estimated on certain areas depending on the algorithm constaints.
A methodology that handles these problems has been recently developed (Bretar and Chehata, 2008) and is used in this study to compute the DTMs. It is based on a two step process: i. The computation of an initial surface using a predictive Kalman filter: it aims at providing a robust surface containing low spatial frequencies of the terrain (main slopes). The algorithm consists in combining a measurement of the terrain by analysing the elevation distribution of the point cloud of a local area in the local slope frame (points of the first elevation mode -lowest points -belong to the terrain) and an estimation of the terrain height calculated from the neighboring pixels. The predictive Kalman framework provides not only a robust terrain surface (the slopes are also integrated in the predictive filter), but also an uncertainty σ DTM for each DTM pixel as well as a map of normal vectors − → n .  ii. The refinement of this surface using a Markovian regularisation: it aims at integrating micro reliefs (lidar points within the uncertainty σ DTM ) in a minimisation process to refine the terrain description. Formulated in a Bayesian framework, additional prior information (ridge, valley etc.) can also be integrated in the refinement process.
The lidar point cloud is then classified based on geometric criteria. A lidar point is labelled as GROUND if it is located within a buffer zone defined as the corresponding DTM uncertainty σ DTM . Otherwise, it is considered as OFF-GROUND. In natural landscapes, off-ground points belong mainly to vegetation, and sometimes to human-made features (e.g., electric power lines, shelters). Vegetation areas are described as non-ordered point cloud (high variance) compared to human-made structures. Vegetation points are therefore extracted by fitting a least-square plane within a circular neighborhood centered sequentially on every off-ground points. If the residuals (average of orthogonal distances) are higher than a defined threshold (0.3 m), points are labeled as VEGETATION. Figure 2 summarises the entire algorithm to calculate a DTM from a lidar point cloud. This classification is not explicitly used in the following, but for generating the validation set related to the supervised classification (Sect. 6).

Processing the amplitude and the width of lidar echoes
Beyond the 3-D point cloud, full-waveform lidar data provide the amplitude and the width of each echo (Sect. 3.1) that are potential relevant features for landcover classification. The backscattered amplitude (or received power) is a function of the laser power, the distance from the source to the target, the incidence angle, the target reflectivity, the absorption by the atmosphere. The use of such features in a landscape classification framework necessitates a global homogeneity between all strips (Wagner et al., 2008b;Kaasalainen et al., 2009). These features therefore need to be corrected from the above-cited contributions so as to be homogeneous and thus, used in a landscape classification framework. We propose also to analyse the effect of the incidence angle on the Full-Width-at-Half-Maximum.

The amplitude
Since the recorded amplitude is proportional to the backscatterred flux and assuming the surfaces to be Lambertian, a feature proportional to the target reflectance is derived from the recorded amplitude by applying the following corrections: 1. Incidence angle: Since the apparent reflecting surface is smaller in case of non-zero incidence angle than in case of zenithal measurements (a cosinus dependency), recorded amplitude values of bare ground points are corrected from the scalar product of the emitted laser direction by the corresponding terrain local slope extracted from the DTM, which is cos θ incidence , 2. Range correction: The recorded amplitudes are corrected by the ratio R 2 R 2 s where R is the currrent range and R s a standard range (Höfle and Pfeifer, 2007),

Emitted power:
We have also remarked that the amplitude of emitted pulses has high temporal variations which may be visible in the amplitude image and therefore alter the spatial analysis of the data. Figure 3 represents a small region of a laser band in the sensor geometry, i.e. in which one pixel is linked toss one emitted pulse and one recorded backscattered signal. Figure 3(a) represents the ratio between the amplitude of each emitted pulse and the average amplitude of all emitted pulses over the whole strip. The x-axis is along the flight track and the y-axis is along the scan direction. Figure 3(a) shows high variations of the emitted power along the flight track, and similar variations (vertical lines) are visible on the image of first echo amplitude ( Fig. 3(b)). We therefore normalized the amplitude of each measured peak (A measured ) by the ratio of the average amplitude value of all emitted pulses (A mean whole strip ) and the amplitude of the emitted pulse of the current peak A current . The effects of the correction are presented in Fig. 3(c) where vertical lines have disappeared.
The whole correction procedure is described in Eq.

The Full-Width-at-Half-Maximum
The FWHM has shown some spatial variability in our data set. Considering the badland and alpine landscape, we investigated the influence of the incidence angle on the FWHM  only in case of bare soil areas. Indeed, the FWHM of undervegetation ground points may have been modified by the complex optical medium. These investigations have been performed on simulated waveforms reflected by a tilted planar surface (Kirchhof et al., 2008). The simulation consists in a temporal convolution product between the emitted laser pulse chosen of Gaussian shape, the impulse response of the receiver, the spatial beam profile and the illuminated area. We show that, for a divergence angle of 0.4 mrd, a flying altitude of 600 m, and an emitted Gaussian pulse of FWHM= 5 ns, the variations of the received pulses with regard to the emitted ones are respectively of 0.03 % (0.001 ns), 0.57 % (0.03 ns) and 5.3 % (0.26 ns) for an incidence angle of 10°, 30°and 60°. The effect of the incidence angle is therefore negligible on the pulse stretching with respect to the temporal sampling interval (1 ns).
We cannot extend this conclusion for ground points below the vegetation since the waveform has been modified through the canopy cover. The spatial variability is therefore attributed to a more complex spatial beam response of the surface due to structures and/or reflectance properties.

Materials
Lidar data have been acquired over the Draix area, France. It is an experimental area on erosion processes in badlands located in the South of the French Alps. It belongs to the Euromediterranean Network of Experimental and Representative Basins (ERB). The Draix area consists in five research experimental catchments, highly equipped and monitored for more than thirty years. Thirteen research units working on erosion and hydrology processes are grouped within the GIS Draix organisation . Results for the most two eroded catchments are presented here: they concern the Laval and the Moulin catchments.

Lidar data
The data acquisition was performed in April 2007 by Sintégra (Meylan, France) using a RIEGL© LMS-Q560 system. This sensor is a small footprint airborne laser scanner and its main technical characteristics are presented in Wagner et al. (2006). The lidar system operated at a PRF of 111 kHz. The flying altitude was approximatively 600 m leading to a footprint size of about 0.25 m. The point density was about 5 pts/m 2 .
The temporal sampling of the system is 1 ns. Each return waveform is made of one or two sequences of 80 samples. For each profile, a record of the emitted laser pulse is also provided.

Orthoimages
Two orthoimages were available for the study. The first one ( Fig. 4(a)) is extracted from the French IGN data basis BDOrtho©. Acquired in fairly good conditions (almost no shadowed zones) by the IGN digital camera, a physical-based radiometric equalisation process has been applied (Paparoditis et al., 2006). The ground resolution is 0.5 m. The triplet of {red, green, blue} channels of the IGN image will be referred in this article to as RGB IGN . The second orthoimage (Figure 4(b)) has been calculated from aerial images acquired during the lidar survey by an embedded digital camera (Applanix DSS Model 322). Since the survey has been performed early in the morning, numerous shadowed areas appear. Moreover, no radiometric equalisation has been performed entailing a rather poor radiometric quality (see Figure 9). The ground resolution is 0.2 m. The triplet of {red, green, blue} channels of this image will be referred in this article to as RGB RAW .

DTM analysis with hydrological indices and photointerpretation
The quality assessment of a DTM for hydrological purposes is not completely satisfying when considering only the elevation error distribution. Other DTM quality criteria directly connected to the usual hydrological information extracted from DTM may be used: drainage networks, drainage areas, slopes like presented in (Charleux-Demargne, 2001). These criteria are mainly based on the basic landform information related to the first and the second derivative of a DTM. However, these criteria are not easy to use in a qualification process since (1) they are conditioned by both the algorithms and the parameters used to produce the information (e.g., a drainage area threshold in the D8 flow accumulation algorithm (O Callaghan and Mark, 1984)), (2) reference data are not easily available (how to survey drainage networks?) and finally (3) the quantification of quality is often not properly defined (how to compare dissimilarities of drainage networks?). Moreover, criteria are usually not generic: it is related to a specific hydrological index. In order to overcome these problems, a single criteria is proposed for a quantified auto-evaluation of DTMs at a given resolution in erosion areas with an hydrological and morphological point of view.
This criteria is simply the linear part (in percentage) of ridges and valleys observed from an orthoimage that fall within areas having significantly non null convergence index (CI) values computed on the DTM (Köthe and Lehmeier, 1994). The convergence index corresponds, for each DTM cell, to the mean difference between angle deviations. These angle deviations are calculated in each of the eight adjacent pixels. For an adjacent pixel, the angle deviation is the absolute difference, in degrees, modulo 180, between its aspect and the azimuth to the central pixel (Zevenbergen and Thorne, 1987). The convergence index is a symmetric and continuous index ranging from −90°up to 90°. This index highlights ridges when highly positive and valleys when highly negative. Figure 5 shows the convergence indexes computed on the lidar strip. Main valleys and ridges appear with respectively highly negative (blue) and positive (red) values.
At a given location, a valley (resp. a ridge) is considered to be detected in the DTM if CI values belong to [−90°, −η] (resp. to [η, 90°], η ∈ R). On a DTM representing inclined planes without noise, only CI=0 (i.e., η=0) indicates the absence of ridges and valleys, whatever the slope is. When dealing with noisy DTM, thresholding the CI with η to retrieve significant ridges and valleys becomes a challenging task. We therefore simulated a distribution of CI from a set of 1000 virtual noisy DTMs. They were generated with a trend corresponding to a plane of constant slope (e.g., 33°is the mean slope of Draix area). The simulation consists in generating Gaussian random fields (Lantuejoul, 2002) using the LU method (Journel and Huijbregts, 1978) following noise spatial distribution models with parameters: range, nugget and sill (variance) for spatial covariance.
Since the simulated CI distribution is of Gaussian shape, we set η to two times the standard deviation. We accept that five percents of CI values due to hazard on noise can be classified in significant ridge and valley.
We show some results on a sub-area of Draix. The simulated CI distribution (performed on 33°slope, Gaussian noise of zero mean and 2.66 standard deviation) provides a threshold value η = 8.46. We show the results of the valley and ridge detection on Fig. 6. Figure 6(a) is a manual delineation of apparent ridges and valleys. The photo-interpretation process is applied on main structures, but very close linear elements as well as the elements near sporadic vegetated elements are not considered. Table 1 presents the quality criteria obtained with the automatic approach (Figure 6(b)). The overall accuracy is 62.8%. This relatively low value can be explained by the following grounds. Firstly, the threshold η has been automatically calculated: the parameters of the simulation may be refined to reach better results. Secondly, the photo-interpreted valleys and ridges have been extracted from a 0.2 m-resolution image and then compared to a 1 m-resolution DTM: reliefs smaller than the resolution cell of the DTM are smoothed. Thirdly, if large ridges and valleys are well defined in the DTM, they may not appear in the photo-interpreted features since they may either be located in shadowed areas (valleys) or in saturated bright areas (ridges). The comparison is therefore biased.
The observation of local erosion processes requires a more detailed relief restitution. Other techniques like terrestrial LiDAR or photogrammetry by unmanned aerial vehicles (Jacome et al., 2008) are more accurate and precise, but, are not well adapted to survey large areas. However, considering the elevation accuracy of DTMs (approximately 0.9 m for 2 standard deviation on the elevation random error), and that the local ablation speed over Draix area is of 1.5 cm per year (Oostwoud and Ergenzinger, 1998), change detection and monitoring of erosion effects would require a delay between surveys of several decades. Nevertheless, the loss of sediment volume within catchments are not homogeneous and are temporary stored on hill-slope gully networks: 200tons/km 2 are trapped in the gully network, which corresponds to an approximate of 150 m 3 (Mathys et al., 1996). These volumes are significant enough to shorter time lag for a multidate analysis of DTM derived by full-waveform LiDAR (lower than a decade), even with an accuracy of some decimeters. Only full-waveform LiDAR survey which gives an adequate compromise between precision, accuracy and extent makes possible the monitoring of sediment volume displacement in the gully network at a catchments scale.

Methodology
Lidar data have been used so far as accurate elevation data to extract ground points and generate DTMs. The challenges were to automatically process the data in a mountainous landscape with steep slopes and vegetation, the whole with the highest accuracy. We mentioned in the introduction that a landcover map is an important input of hydrological models, especially for the parameterisation of the hydrological production function. We therefore propose in this section to describe the inputs and outputs of a classification framework wherein lidar width and amplitude values can be integrated and their benefit evaluated. Indeed, the interpretation of additional lidar parameters has been barely studied and reveals to be of interest for landcover classification. Wagner et al. (2008a) proposed classification rules based on a decision tree for vegetation/non-vegetation areas in a urban landscape using solely the width and the amplitude: a point is considered as VEGETATION if (1) it is not the last pulse of a profile containing multiple returns (2) it is a single return with low amplitude (≤75) and large width (≥1.9 ns). Focusing on the study of the vegetation, Reitberger et al. (2008) have integrated different features to segment individual trees in a normalized graph-cut framework. Among them, the authors show that the feature corresponding to the average intensity Hydrol. Earth Syst. Sci., 13, 1531-1545, 2009 www.hydrol-earth-syst-sci.net/13/1531/2009/ on the entire tree plays the most important role in leaf-on conditions, while the ratio between the number of single reflections and the number of multiple reflections is the most important in leaf-off conditions. Here, we would like to answer the question: do lidar width and intensity values improve a classification pattern in badlands? An efficient supervised classification algorithm called Support Vector Machines (SVM) has been used (Chang and Lin, 2001). In recent years, SVM was shown to be a relevant technique for remote sensing data analysis (Huang et al., 2002): ability to mix data from different sources, robustness to dimensionality, good generalisation ability and a nonlinear decision function (contrary to decision trees for instance). In this paper, the 3-D lidar point cloud is labelled, thus providing a 3-D landcover classification. Mallet et al. (2008) applied this technique with success for classifying urban areas from full-waveform lidar data.
Four classes have been identified focusing on a first and simple hierarchical level of 3-D land cover classification, relevant for badlands landscapes with anthropogenic elements: 1-LAND, 2-ROAD, 3-ROCK and 4-VEGETATION. The three first classes are increasingly sensitive to erosion processes. The first class LAND is taking into account terrain under natural vegetation cover and cultivated areas in grassland. The second one, ROADs, are linear elements with natural (marls), bared but compacted material. These elements are known to impact runoff production within catchments. The third one contains areas with bared black marls in gullies, the main source of sediment production. The latter, vegetation, is a very general land cover class. This class is aggregating more detailed vegetation charcateritics, such as the 3-D vegetation structure, wich can be very usefull to dicriminate further.
The SVM algorithm requires a feature vector for each 3-D lidar point to be classified. Only three lidar features have been retained. Indeed, it appears that the larger the number of features, the more difficult to make an interpretation of the results. They are: -d DTM , the distance between the 3-D point and the DTM, -FWHM, the echo width (see Section 3.1).
-Amp, the echo amplitude, Additionally, RGB IGN or RGB RAW features have been added in the classifier, providing three radiometric attributes (Fig. 8). Their introduction allows a discrimination between road and land impossible with the lidar features and improve the classification results. The training set over each of the four classes has been defined as follow: We have implemented the SVM algorithm with the LIB-SVM software (Hsu and Lin, 2001), selecting the generic Gaussian kernel. For more theoretical explanations, please see (Pontil and Verri, 1997). Figure 7 shows the histograms of lidar derived features corresponding to the four selected classes. d DTM and Amp have bounded values which describe the vegetation (resp. > 1 m and between 0 and 20), whereas the width values tend to be uniform between 3 ns and 4.5 ns. ROAD and LAND have similar distributions for lidar derived features, which explains the high confusion values in Table 2. The distributions of ROCK is flattened for d DTM since many points are chosen in very steep slopes, and are therefore more sensitive to the DTM quality. The amplitude of ROCK is slightly different from the other classes. Figures 8 and 9 show the histograms of RGB IGN and RGB RAW .

Results
The classification is validated with lidar points belonging to the masks defined in the training step, but the training points. 40% of the total number of points have been validated. Figure 10 shows the four validation sets for each class.
A confusion matrix is then calculated for each configuration. True positive values correspond to the diagonal values     Table 2 indicates that the confusion between classes is not negligible particularly for some of them: ROCK with ROAD reaches 22.4%, while LAND with ROAD reaches 19.4%, what was predictable looking through the statistics of the training set (Fig. 7). The vegetation has a high percentage of true positive (94.2%) and is well detected. With an average accuracy of 79.1%, it appears that a classification based only on lidar derived features is consistent.
Before testing the effects of lidar amplitude and combined width with RGB features, we investigated the impact of the radiometric quality of the orthoimages on the classification results. Tables 3 and 4   True positive values are higher when using image-based features {d DTM , RGB IGN } than {d DTM , Amp, FWHM} and the confusion between classes most of the time decreases: LAND with ROAD decreases from 19.4% to 1.5%, ROAD with ROAD decreases from 22.4% to 8%. Nevertheless, the comparison is more mitigated with {d DTM , RGB RAW }. Indeed, true positive values of ROAD decrease from 84.2% to 74% and the confusion between the other classes increases significantly. However, LAND is better classified with less confusion with ROAD (19.4% to 4.2%). As a result, it appears that even if the average accuracy of a classification using imagebased features is better, amplitude and width of lidar echoes have interesting discriminative properties.
The results of the use of lidar amplitude and width in the classification process are shown in Tables 5 and 6 (Table 2), the improvement is particularly consistent for ROCK, LAND and VEGETATION, but true positive values of ROAD decrease from 84.2% to 77.1% and the confusion with ROCK increases from 7.5% to 13.4%. In fact, the radiometry of roads are sensitive to tree shadows. The combination of the very high resolution of RGB RAW and the time of the survey (early in the morning) feeds the training set with bright and dark (shadow) radiometric values. On the contrary, lidar amplitude and width do not depend on the sun position. Superimposed on the orthoimage of Fig. 11, a 3-D landcover classification obtained with {RGB IGN , Amp, FWHM, d DTM } is presented in Fig. 12 and 13.

Discussion
Finally, the quality of the classification depends mainly on the DTM accuracy (represented here as d DTM ). Moreover, within the framework of the methodology, it appears that a classification based on {Amp, FWHM, d DTM } is suitable, but gives a worse accuracy than a classification based on {d DTM , RGB RAW } or {d DTM , RGB IGN }. Used on their own, full-waveform lidar data are relevant to discriminate vegetation from non vegetation points, but the confusion between other classes remains not negligible. The amplitude and the width do not improve the classification accuracy if the radiometric features have a good separation between classes. Otherwise, the benefit is rather small, but in case of artefacts in a class (like shadow) for which lidar measurements are not sensitive. Inversely, the use of poor radiometric features may alter the classification result of specific landscapes (here ROAD) where amplitude and width are well bounded. Even if amplitudes and widths appear poorly discriminant for the first level of 3-D landcover classification we used in addition to usual RGB images, we are quite convinced that it could be more useful for lower hierarchical levels of 3-D landcover classification. For instance, these waveform parameters would probably give information on vegetation density, 3-D structure and type as well as local bared soil structure related to erodibility. These investigations will be the next steps of our research.

Conclusions
Firstly, the accuracy of the full-waveform lidar data on badlands is decimetric. Even if erosion dynamics on these landscapes would require a centimetric accuracy to be studied yearly, DTMs generated from lidar survey are consistent for hydrological sciences at the catchment level. Moreover, we showed that these data permit to identify most of gullies and ridges of badland landscapes through geomorphological indices.
We focused this paper on generating and qualifying DTMs, but also on the automatic computation of a 3-D landcover classification. We showed that lidar amplitude and width contain enough discriminative information on badlands to be classified in LAND, ROAD, ROCK and VEGETA-TION with about 80% accuracy. Compared to usual landcover classification from aerial or satellite images, 3-D landcover classification is a new and interesting approach for hydrologists since it allows to parametrise in a much direct way hydrological or erosion production parameters as, for instance, the plant cover C factor (Wischmeier et al., 1978). However, the introduction of image-based radiometric features combined to lidar ones in the classifier improved the accuracy of the classification (about 92%). They bring relevant discrimination between classes but cancelled most part of the value added from full-waveform data. This is mainly Hydrol. Earth Syst. Sci., 13,[1531][1532][1533][1534][1535][1536][1537][1538][1539][1540][1541][1542][1543][1544][1545]2009 www.hydrol-earth-syst-sci.net/13/1531/2009/ due to the generality of the landcover classes we chose, but it would probably be more discriminant for more detailed landcover classes.