Journal of Information Science and Engineering, Vol. 40 No. 3, pp. 615-629


Improving LiDAR Semantic Segmentation on Minority Classes and Generalization Capability Using U-Net++ for Self-Driving Scenes


CHIAO-HUA TSENG1, YU-TING LIN1, WEN-CHIEH LIN1,+
AND CHIEH-CHIH WANG2
1College of Computer Science
2Department of Electrical and Computer Engineering
National Yang Ming Chiao Tung University
Hsinchu, 300 Taiwan
E-mail: imogen236@gmail.com; liamlin0411.cs10@nycu.edu.tw;
wclin@cs.nctu.edu.tw+; bobwang@ieee.org


LiDAR has become an important sensor in autonomous driving systems. Compared to radar or camera measurements, LiDAR provides more precise geometric information and can be fused with other types of sensors to tackle various perception tasks in autonomous driving. Among these tasks, semantic segmentation of LiDAR point clouds has received growing research interest and achieved compelling results. However, two issues remain unsolved. The first concerns minority classes caused by data imbalance, an inevitable problem in large-scale outdoor scenes. The minority classes, which occupy little space in a scene and thus yield very few LiDAR points, can be important objects for self-driving cars to recognize, e.g., pedestrians, motorcycles, and traffic signs. To address this class imbalance, we use the U-Net++ architecture and dice loss to improve the IoU scores of the minority classes. The second issue is generalization across different LiDAR resolutions: existing methods mostly need to be retrained to handle data collected by LiDARs of different resolutions. We adopt KPConv as the convolution operator to tackle this issue. With U-Net++ and dice loss, we obtain a 5.1% mIoU improvement on SemanticKITTI over the baseline, including a 9.5% mIoU improvement on the minority classes. Moreover, we demonstrate the generalization capability of our model with KPConv by training on a 64-beam dataset and testing on 32-beam and 128-beam datasets, obtaining a 3.3% mIoU improvement on the 128-beam dataset and a 1.9% mIoU improvement on the 32-beam dataset.
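The abstract pairs U-Net++ with a dice loss to counter class imbalance: because dice measures per-class overlap and then averages over classes, rare classes contribute as much to the loss as dominant ones. A minimal NumPy sketch of a soft multi-class dice loss is shown below; the function name, the per-class averaging, and the smoothing term `eps` are illustrative choices here, not necessarily the authors' exact formulation:

```python
import numpy as np

def dice_loss(probs, labels, num_classes, eps=1e-6):
    """Soft multi-class dice loss, averaged over classes.

    probs:  (N, C) predicted per-point class probabilities
    labels: (N,)   integer ground-truth labels
    """
    one_hot = np.eye(num_classes)[labels]           # (N, C) ground truth
    inter = (probs * one_hot).sum(axis=0)           # per-class soft intersection
    union = probs.sum(axis=0) + one_hot.sum(axis=0) # per-class soft "size" sum
    dice = (2.0 * inter + eps) / (union + eps)      # per-class dice score in [0, 1]
    return 1.0 - dice.mean()                        # each class weighted equally

# Perfect one-hot predictions drive the loss to zero, regardless of
# how many points each class has.
labels = np.array([0, 1, 1, 2])
print(dice_loss(np.eye(3)[labels], labels, num_classes=3))
```

Unlike a plain point-wise cross-entropy, a class with only a handful of LiDAR points still accounts for a full 1/C share of this loss, which is why dice-style losses are a common choice for imbalanced segmentation.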


Keywords: LiDAR semantic segmentation, autonomous driving, deep learning, minority class, generalization capability
