3D semantic scene completion with LiDAR point clouds

Cite as:
hdl:2117/423092
Document type: Master thesis
Date: 2024-07-10
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
In recent years, the development of autonomous vehicles has shown significant potential in improving road safety by reducing traffic accidents and fatalities. One of the critical technologies enabling this advancement is LiDAR (Light Detection and Ranging), which provides precise geometric information about the environment. This master's thesis focuses on 3D Semantic Scene Completion using LiDAR point clouds, a technique that aims to predict complete 3D voxel representations of scenes from incomplete LiDAR data. This task involves determining whether each voxel is occupied and assigning it a semantic label. The study reviews state-of-the-art methods for semantic scene completion, including SSA-SC, JS3C-Net, and SCPNet, which have demonstrated high performance in benchmarks like SemanticKITTI. The chosen method, SCPNet, utilizes a teacher-student framework to distill dense semantic knowledge from multi-frame point clouds (teacher) to single-frame point clouds (student). The implementation involves significant memory management and architectural optimizations to handle large datasets and computational limitations effectively. Experiments were conducted using the SemanticKITTI dataset, and the results were evaluated using mean Intersection over Union (mIoU) metrics. The thesis also explores the fusion of semantic scene completion with object detection tasks, using the nuScenes dataset to assess generalization. The findings indicate that while SCPNet shows superior performance in certain dynamic object classes, challenges remain in accurately detecting and representing moving objects like pedestrians and cyclists. Future research directions include further optimizing memory usage and improving the integration of semantic scene completion with other perception tasks.
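As a rough illustration of the mIoU evaluation mentioned in the abstract, the sketch below computes per-class IoU over a flattened voxel grid in plain Python. It is not the thesis code: the function name `mean_iou`, the use of label 0 for empty voxels and 255 for ignored voxels, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for the example.

```python
def mean_iou(pred, target, num_classes, empty_label=0, ignore_label=255):
    """Return (mIoU, per-class IoU) for two flattened voxel label sequences.

    Assumptions (illustrative, not taken from the thesis):
      - labels are integers in [0, num_classes)
      - `empty_label` marks unoccupied voxels and is excluded from the mean
      - voxels whose ground truth equals `ignore_label` are not scored
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    inter = [0] * num_classes  # voxels where prediction and truth agree on class c
    union = [0] * num_classes  # voxels where prediction or truth is class c

    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1

    per_class = {}
    for c in range(num_classes):
        # Skip the empty class and classes that never appear (assumed convention).
        if c == empty_label or union[c] == 0:
            continue
        per_class[c] = inter[c] / union[c]

    miou = sum(per_class.values()) / len(per_class) if per_class else 0.0
    return miou, per_class


# Toy example: 8 voxels, classes 0 (empty), 1 and 2; one voxel is ignored.
target = [0, 1, 1, 2, 2, 255, 0, 1]
pred = [0, 1, 2, 2, 2, 1, 1, 1]
miou, per_class = mean_iou(pred, target, num_classes=3)
```

In this toy case class 1 has IoU 2/4 and class 2 has IoU 2/3, so the mean is 7/12. A real evaluation would typically run over array-based voxel grids and the benchmark's own class list.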
Subjects: Machine learning, Computer vision, Pattern recognition systems, Remote sensing
Degree: Master's Degree in Advanced Telecommunication Technologies (2019 curriculum)