Semantic Segmentation with Convolutional Neural Networks
(May – Aug 2020)
Project Overview
Designed and implemented a comparative study of four deep learning architectures—U-Net, SegNet, ICNet, and ResNet34—for pixel-wise semantic segmentation. The project focused on evaluating trade-offs between inference speed and segmentation accuracy for autonomous driving (Cityscapes) and biomedical applications.
Key Technical Achievements:
Cloud-Native Training Pipeline: Trained models on large datasets using AWS SageMaker with data stored in S3 buckets, running on p2.xlarge instances with NVIDIA Tesla K80 GPUs.
Performance Benchmarking: Ran a speed-versus-accuracy analysis across all four models. U-Net was the most accurate (53.29% mIoU), while ResNet34 had the lowest inference latency (45.42 ms/sample), making it the best fit for real-time applications.
Architecture Implementation: Built the models from scratch in TensorFlow 2.0, implementing transposed convolutions (deconvolution), dilated convolutions, and residual blocks for multi-scale feature extraction.
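For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how mIoU and per-sample latency are commonly computed. It is an illustration, not the project's actual evaluation code: the function names `mean_iou` and `ms_per_sample` are hypothetical, labels are assumed to be integers in the range 0 to num_classes - 1 (no ignore label), and `predict_fn` stands in for any model's forward pass.

```python
import time

import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over classes present in either map."""
    # Assumption: labels are integers in [0, num_classes); no ignore label.
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps
    return float((intersection[present] / union[present]).mean())


def ms_per_sample(predict_fn, batch, repeats=10):
    """Average wall-clock latency per sample, in milliseconds."""
    predict_fn(batch)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        predict_fn(batch)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / (repeats * len(batch))


if __name__ == "__main__":
    truth = [[0, 0], [1, 1]]
    pred = [[0, 1], [1, 1]]
    print(mean_iou(truth, pred, num_classes=2))
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Real benchmarks on a dataset such as Cityscapes typically also mask out an ignore label before building the confusion matrix, which this sketch omits.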
More details can be found in the report here.
