Semantic Segmentation with Convolutional Neural Networks

(May – Aug 2020)

Project Overview

Designed and implemented a comparative study of four deep learning architectures—U-Net, SegNet, ICNet, and ResNet34—for pixel-wise semantic segmentation. The project focused on evaluating trade-offs between inference speed and segmentation accuracy for autonomous driving (Cityscapes) and biomedical applications.

Key Technical Achievements:

Cloud-Native Training Pipeline: Used Amazon SageMaker with S3 storage to train models on large datasets, running on NVIDIA Tesla K80 GPUs (p2.xlarge instances).
Performance Benchmarking: Ran a speed-versus-accuracy comparison across all four models. U-Net was the most accurate (53.29% mIoU), while ResNet34 had the lowest inference time (45.42 ms/sample), making it the best fit for real-time applications.
Architecture Implementation: Built models from scratch in TensorFlow 2.0, implementing layers such as transposed convolutions (deconvolution), dilated convolutions, and residual blocks for multi-scale feature extraction.
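
For readers unfamiliar with the two benchmark metrics above, here is a minimal sketch of how mIoU and ms/sample can be computed. It is not the project's evaluation code: the function names mean_iou and ms_per_sample are hypothetical, labels are assumed to be flattened integer class IDs, and classes absent from both labels and predictions are assumed to be skipped. Details such as ignored "void" labels in Cityscapes or dataset-level accumulation may differ in the actual report.

```python
import time
from typing import Callable, Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes (illustrative helper).

    Assumes flattened per-pixel class IDs in the range [0, num_classes).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both arrays
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


def ms_per_sample(predict: Callable[[object], object],
                  samples: Sequence[object],
                  warmup: int = 1) -> float:
    """Average wall-clock inference time in milliseconds per sample.

    `predict` stands in for any single-sample inference call.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    for s in samples[:warmup]:  # untimed warm-up runs
        predict(s)
    start = time.perf_counter()
    for s in samples:
        predict(s)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(samples)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    print(mean_iou(labels, preds, num_classes=3))  # per-class IoUs: 1/3, 2/3, 1/2
    print(ms_per_sample(lambda x: x, labels))
```

In the toy example, the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Timings from a sketch like this depend heavily on hardware, batch size, and warm-up, so they are only comparable when measured under the same conditions.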

More details can be found in the report here.

Code Link

https://github.com/Eashwar-S/Semantic_Segmentation