RGBD Segmentation using Capsule Networks
Abstract
This paper explores the application and effectiveness of different network architectures for the task of RGB-D image segmentation, where the goal is to classify each pixel of an image into an object class using both color and depth information. The study focuses on comparing traditional convolutional neural networks, particularly the U-Net model, with capsule networks. To evaluate the performance of these models, experiments were conducted using three datasets: RGB-D Object Dataset, Cityscapes, and NYUDv2. The models were assessed based on pixel accuracy, mean accuracy, and mean Intersection over Union. The results indicate that the addition of depth information marginally improves performance across all models. However, capsule networks, despite their theoretical advantages, did not outperform the U-Net architecture, especially on the more complex NYUDv2 dataset. This study concludes that while capsule networks offer a promising direction for image segmentation tasks, they currently fall short of the established performance benchmarks set by traditional CNNs like U-Net. The findings suggest that capsule networks are still in their infancy and require further refinement before they can be considered a viable alternative for real-world segmentation tasks.
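The evaluation metrics named above (pixel accuracy and mean Intersection over Union) can be computed directly from the predicted and ground-truth label maps. A minimal sketch follows; the function names and the convention of skipping classes absent from both maps are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float((pred == target).mean())

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, averaged over classes.

    Classes absent from both prediction and ground truth are skipped
    (an illustrative convention; evaluation protocols vary).
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c appears nowhere; exclude from the mean
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

Mean accuracy is computed analogously, averaging per-class recall instead of per-class IoU.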