Benchmark dataset for airborne lidar scanning data filtering in forested environments

Data description:

Airborne laser scanning (ALS) data are among the most commonly used sources for generating terrain products. Filtering ground points is a prerequisite step in ALS data processing. Because canopy cover and terrain slope vary widely in forested environments, filtering there is more challenging than in other environments, such as urban areas. A key challenge is the lack of a benchmark dataset, which makes the performance of existing geometric methods hard to compare and deep learning networks hard to train.

In our study (Jin et al., 2020), we proposed a point-based fully convolutional neural network (PFCN) for filtering in forested environments. The network was trained with 37,449,157 points from 14 sites and evaluated on 6 sites in various forested environments. The method was compared with five widely used filtering methods and one of the best point-based deep learning methods (PointNet++). Results showed that the PFCN achieved the best results in terms of mean omission error (T1 = 1.10%), total error (Te = 1.73%), and kappa coefficient (93.88%), but ranked second in the root mean square error of the digital terrain model, which was caused by the worst commission error. Our method was on par with or even better than PointNet++ in accuracy, while consuming one-third of the computational resources and one-seventh of the training time. We believe that PFCN is a simple and flexible method that can be widely applied for ground point filtering.
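The accuracy measures quoted above can be derived from a 2x2 confusion matrix of reference versus predicted labels. The following is a minimal sketch that assumes the conventional ground-filtering definitions (Type I or omission error: ground points wrongly rejected; Type II or commission error: non-ground points wrongly accepted; total error; Cohen's kappa). The function name and its inputs are hypothetical, and the exact formulas used in Jin et al. (2020) should be checked against the paper.

```python
def filtering_errors(reference, predicted, ground_class=2):
    """Return Type I, Type II, total error, and kappa for a ground filter.

    `reference` and `predicted` are equal-length sequences of class codes.
    Assumed definitions (not taken from the dataset description):
      a = ground kept, b = ground rejected,
      c = non-ground accepted, d = non-ground rejected.
    """
    a = b = c = d = 0
    for ref, pred in zip(reference, predicted):
        ref_ground = ref == ground_class
        pred_ground = pred == ground_class
        if ref_ground and pred_ground:
            a += 1
        elif ref_ground:
            b += 1
        elif pred_ground:
            c += 1
        else:
            d += 1

    n = a + b + c + d
    if n == 0:
        raise ValueError("no points to evaluate")

    type1 = b / (a + b) if (a + b) else 0.0   # omission error
    type2 = c / (c + d) if (c + d) else 0.0   # commission error
    total = (b + c) / n                       # total error
    observed = (a + d) / n                    # observed agreement
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 1.0
    return {"type1": type1, "type2": type2, "total": total, "kappa": kappa}
```

Passing the “Classification” values of a site as `reference` and a filter's output labels as `predicted` yields the four scores as fractions; multiply by 100 to express them as percentages.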

To contribute to the study of filtering in forested environments, we release the abovementioned ground truth dataset, including 20 study sites, for public use. The study area (Southern Sierra Nevada, California, USA) and the raw data collection method for the 20 sites are described in Section III.A of Jin et al. (2020) (Fig. 1), and the methods for generating the ground truth are described in Section III.B. Specifically, the unclassified points were filtered to obtain preliminary results using the automatic filtering method in the TerraScan software. The preliminary results of each site were then checked manually to eliminate incorrectly filtered points and to increase the number of ground points as much as possible. Manual checking was performed by visualizing cross sections of the point cloud in the LiDAR360 software, which makes it easy to identify mistakes and reclassify incorrect points. The detailed manual process is as follows: (i) DTMs were generated using the ground points from the automatic filtering results; (ii) the DTMs were manually interpreted to find abnormally raised surfaces and low spots caused by misclassification; (iii) the LiDAR points in transverse sections at these places were checked, and mislabeled points were rectified to make the DTMs smooth; and (iv) this process was repeated until the entire DTM was smooth.
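Steps (i) and (ii) of the manual process can be illustrated with a small automated stand-in. The sketch below is not the procedure used to build the dataset (that relied on TerraScan and visual inspection in LiDAR360); it only shows the idea of gridding ground points into a coarse DTM and flagging cells that stand out from their neighbours. The function names, the cell size, and the `max_jump` threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def grid_dtm(ground_points, cell_size=1.0):
    """Average the elevation of ground points per grid cell.

    `ground_points` is an iterable of (x, y, z) tuples.
    Returns a dict mapping (col, row) to the mean z of that cell.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, z in ground_points:
        key = (int(x // cell_size), int(y // cell_size))
        sums[key][0] += z
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def flag_suspect_cells(dtm, max_jump=1.0):
    """List cells whose elevation differs from the median of their
    (up to 8) occupied neighbours by more than `max_jump`."""
    suspects = []
    for (col, row), z in dtm.items():
        neighbours = [
            dtm[(col + dc, row + dr)]
            for dc in (-1, 0, 1)
            for dr in (-1, 0, 1)
            if (dc, dr) != (0, 0) and (col + dc, row + dr) in dtm
        ]
        if neighbours and abs(z - median(neighbours)) > max_jump:
            suspects.append((col, row))
    return sorted(suspects)
```

Cells returned by `flag_suspect_cells` correspond to candidate raised surfaces or low spots; in the workflow described above, the points in such places were then inspected in cross section and relabeled by hand.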

The released benchmark dataset consists of 20 LAS files (Site1.las, Site2.las, …, Site20.las). File names are the same as the site names used in Table 1 of Jin et al. (2020). Each point has a “Classification” attribute in addition to its basic “XYZ” coordinates. A point whose “Classification” value is 2 is a ground point; a point with any other value is a non-ground point (Fig. 2).
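As an illustration of how the files might be used, the sketch below reads one site and separates ground from non-ground points by the “Classification” value. It assumes the third-party `laspy` (version 2.x) and `numpy` packages are installed; the helper names `load_site` and `split_ground` are hypothetical and not part of the dataset.

```python
def load_site(path):
    """Read a LAS file and return (xyz, classification).

    Assumes laspy 2.x and numpy are available; they are imported here
    so that the rest of the sketch works without them.
    """
    import laspy
    import numpy as np

    las = laspy.read(path)
    xyz = np.vstack((las.x, las.y, las.z)).transpose()  # shape (N, 3)
    classification = np.asarray(las.classification)
    return xyz, classification


def split_ground(points, classification, ground_class=2):
    """Split points into (ground, non_ground) lists.

    A point is ground when its class code equals `ground_class` (2 in
    this dataset); every other code is treated as non-ground.
    """
    ground, non_ground = [], []
    for point, code in zip(points, classification):
        (ground if int(code) == ground_class else non_ground).append(point)
    return ground, non_ground


# Example usage (requires a downloaded file such as Site1.las):
# xyz, classification = load_site("Site1.las")
# ground, non_ground = split_ground(xyz, classification)
```

For files with many millions of points, indexing the NumPy array with the boolean mask `classification == 2` gives the same split far more efficiently than the explicit loop.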

The dataset may be used for scientific studies; commercial use is not permitted. If you use the dataset, please cite the following reference.

Fig. 1 Study areas, indicated by red stars, in the Southern Sierra Nevada, California, United States. Twenty sites were chosen across these areas; they are shown, colored by elevation, in the black box in the upper right corner.
Fig. 2 A bird's-eye view of points colored by the “Classification” attribute in Site1. “Classification” = 2 indicates that a point, such as the selected green point, is a ground point. “Classification” ≠ 2 indicates that a point is a non-ground point.

Reference

Jin S, Su Y, Zhao X, Hu T, Guo Q*. 2020. A Point-Based Fully Convolutional Neural Network for Airborne LiDAR Ground Point Filtering in Forested Environments. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 13: 3958-3974.