Akanksha Baranwal

An ambitious introvert in pursuit of a way to leave a permanent mark in this fleeting life.

About Me

I recently graduated from ETH Zurich with a Master's in Computer Science. My thesis, Optimizing GPU Convnets, was done at the Scalable Parallel Computing Lab (Prof. Torsten Hoefler) and co-supervised by Dr. Nikoli Dryden and Dr. Tal Ben-Nun. I have been fortunate to work with both industry and academic experts on optimizing the performance of machine learning and computer vision algorithms across multiple platforms: GPUs, FPGAs, CPUs and a bit of in-memory computing. I read in my free time, mainly fiction, biographies, autobiographies and methodical self-help books. I also like playing badminton, hiking and learning new sports whenever I get a chance. I believe in being brave enough to suck at something new, so I often find myself exploring uncharted territories. I prefer working in fast-paced, cooperative setups.

Experience

GPU Performance Architect - NVIDIA
July 2018 - Aug 2020
  • Responsible for directed performance bringup of GA10X class chips.
  • Analyzing performance of next generation chips from the compute pipeline perspective.
  • Developed a full-stack framework for running CUDA apps directly on GPU architecture software models to help analyze application performance.
Student Researcher - IBM and Safari Research Group
Feb 2021 - Sep 2021
  • Joint project between the In-Memory Computing Group at IBM and the Safari Research Group at ETH.
  • Assessed the feasibility of neural-network-based genome basecalling on analog PCM devices.
  • Built Excel-based models to project estimated speedups from the network architecture.
Research Engineer (Deep Learning) - FLOATING ROBOTICS
Mar 2022 - Jul 2022
  • Floating Robotics is a spin-off of the Robotic Systems Lab at ETH.
  • Explored different solutions for optimizing fruit and leaf detection for deployment on NVIDIA Jetson GPUs.
  • Added object tracking and explored using GANs to improve the dataset.

Education

2020 - 2023
Master of Science in Computer Science
Eidgenössische Technische Hochschule (ETH) Zürich
Featured Courses

  • Advanced Machine Learning
  • Machine Perception
  • Computational Intelligence Lab
  • Design for Parallel and High Performance Computing
  • Advanced Systems Lab
  • Computer Architecture
  • Principles of Distributed Computing
  • Advanced Algorithms
2014 - 2018
Bachelor of Technology (Honors) in Electronics and Communication Engineering
International Institute of Information Technology, Hyderabad
Featured Courses

  • Statistical Methods in AI
  • Digital Image Processing
  • Computer Vision
  • Introduction to parallel scientific computing
  • Complex digital system design
  • Computer System Organization
  • Embedded Hardware Design

Projects

Master Thesis - Optimizing GPU Convnets
CUDA, Python, MIOpen, Data-centric programming
We use the DaCe framework to develop portable optimizations for 3D convolutions for NVIDIA and AMD GPUs. We benchmark the optimized code against the available manually tuned library implementations.
Fast Graph Based Image Segmentation on GPGPUs
CUDA, C++, Course group project
We use CUDA data-parallel primitives to accelerate MST-based image segmentation methods. In addition, we propose a highly efficient solution based on atomic operations.
Arbitrary Precision Ball Arithmetic Library
Intel intrinsics, C, Course group project
An arbitrary-precision ball arithmetic library that performs addition, subtraction, and multiplication. We focus on an optimized implementation for x86-64 systems using midpoint-radius intervals. We show that it outperforms existing libraries in some cases.
DeCoILFNet - Depth Concatenation and Inter-Layer Fusion based ConvNet Accelerator
Verilog, Caffe, C++, Undergrad research project
A high-performance FPGA-based architecture that exploits the intra-layer parallelism of CNNs by flattening across depth and combines it with a highly pipelined data flow across layers, enabling inter-layer fusion. Compared to a Caffe implementation on a 3.5GHz hexa-core Intel Xeon E7, our projections for a 120MHz FPGA accelerator are 30X faster. In addition, our design reduces external memory access by 11.5X and achieves a speedup of more than 2X in the number of clock cycles compared to state-of-the-art FPGA accelerators.
FPGA based Parallelized Architecture of Efficient Graph Based Image Segmentation Algorithm
MATLAB, IEEE ROBIO 2017, Second author, Presenter
In this paper, we propose three novel architectures for the well-known Efficient Graph-Based Image Segmentation algorithm. The proposed implementations reduce time and power consumption compared to software implementations. The proposed hybrid design delivers at least a 2X speed gain over the other implementations, which allows real-time image segmentation that can be deployed on mobile robotic systems.
Road Segmentation in Satellite Images
PyTorch, OpenCV, Python, Course group project
We analyze the impact of architecture modifications to a U-Net, namely the GC-DCNN and other self-developed variations. Our experiments with fine-tuning and model architecture alterations lead us to a novel, improved variant of GC-DCNN. We also propose two novel post-processing techniques to remove artefacts in predictions.
Jigsaw Puzzle Solver Using Digital Image Processing
MATLAB, Course group project
The aim is to reconstruct the original image from a set of non-overlapping, unordered, square puzzle parts. Multiple puzzles mixed into one and puzzles with up to 30% missing pieces can be handled.

Achievements

Institute Dean's Merit List for all semesters 2014-2018
Awarded to the top 5% of students with a high SGPA
Institute Research Award 2017
Awarded for excellence in undergraduate research
Mathematics Wizard of the Year 2011-2013
Awarded to one exceptional student across batches for excelling in mathematics olympiads and exhibitions
Qualified for Indian National Mathematics Olympiad 2012-2013
CBSE National Science Exhibition Best Project 2013
Best project in Mathematics Category at National Level
National Talent Search Examination Scholar 2010
Ranked first in Gujarat state in the state round

Get in Touch

I am open to interesting opportunities and discussions in the computer systems space. Feel free to reach out to me.