About Me

I am an MS CS student at the Georgia Institute of Technology (Atlanta campus), specializing in Machine Learning. I have taken the following courses as part of my specialization:

  • Fall 2022: CS 7643 Deep Learning, CS 8803 Machine Learning with Limited Supervision
  • Spring 2022: CS 6476 Computer Vision, CS 7650 Natural Language Processing
  • Fall 2021: CS 7641 Machine Learning, CS 6220 Big Data Systems and Analytics, CS 8803 Data Science for Epidemiology

I completed my B.Tech in Information Technology at the University of Delhi, where I came across the marvels of Machine Learning. I explored its application in a range of tasks, from shallow learning tasks such as Jamming Attack Detection in Wireless Networks to complex deep learning tasks like Visual Question Answering. I then worked as a Software Engineer at Walmart Labs for 2 years, where I had the opportunity to explore AI in e-commerce.

My research interests are in Machine Learning, Computer Vision and Language. I have actively investigated the tasks of Visual Question Answering and Visual Dialog. I have also explored multimodal hate speech and worked on identifying hateful memes on social media platforms. I am currently pursuing research in vision and language, focused on Guided Diffusion Models for text-to-image generation, under Prof. James Hays.

I am a tech enthusiast and an avid learner, eager to solve real-world problems using my programming skills. I believe there is an impending revolution in the computer science field and my ardent desire is to contribute towards it and create an impact.

Experience

Computer Vision Engineering Intern, B Garage

Jan 2023 - Present

Working with the perception team to build a curated dataset for scene text detection and recognition in a warehouse setting, and implementing an efficient state-of-the-art algorithm to fine-tune on it.

Software Engineering Intern, Google

June 2022 - August 2022

Worked with the Google Assistant team to build an efficient debugging tool in C++ for the query understanding pipeline, which determines machine-understandable intents for a given user speech query.

Teaching Assistant, Georgia Institute of Technology

Jan 2022 - Dec 2022

I worked as a teaching assistant for CS 6515, Advanced Algorithms, in Spring 2023, and for the Machine Learning course (CS 4641/7641) in Fall 2022. My responsibilities included holding office hours, answering questions on student forums and in class, and grading assignments, exams, and projects.

Software Engineer 2, Walmart Labs

July 2019 - July 2021

I worked as a backend developer for the e-commerce channels of Walmart South Africa and Walmart India. I contributed to several crucial website features, such as stock management, payments and quotes, using Spring MVC (Hybris), HTML, CSS and JSP. As part of the Walmart India team, I worked on building a recommender system that suggested a variant product to customers according to their location. I used PyTorch to build the machine-learning model, React Native to create the user interface, and "Element.ai", an end-to-end platform by Walmart Labs, for model training and deployment. I was also part of a crowdsourcing initiative to find cost-efficient solutions for tracking inventory at Walmart stores using object detection.

Researcher, Indian Institute of Technology Kanpur

Dec 2018 - March 2019

I worked with the vision and language lab on the Visual Question Answering and Visual Dialog tasks. I used PyTorch to implement various VQA frameworks, such as Stacked Attention Network (SAN) and VQA Counting, and studied their attention modules. Ultimately, my team and I devised a generalisable algorithm to supervise and improve the attention module in any deep learning task. Our findings were published at IEEE WACV’21.

Software Engineering Intern, Walmart Labs

May 2018 - July 2018

I worked with 3 other summer interns to integrate the Walmart Canada website with the Google Home device. I used Google Dialogflow to process user speech and recognize the user’s intent, and employed a Node.js webhook to connect the Dialogflow app with the Walmart Canada APIs.

Projects

Hateful Meme Detection in Social Media Platforms

My team members and I compared the performance of deep learning architectures such as Multimodal Bitransformers and Concat-BERT for hateful meme detection. We qualitatively studied the shortcomings of these architectures by analyzing Grad-CAM heatmaps of false-negative and false-positive samples. We used PCA to obtain the most salient image features and employed ensembling techniques to combine the advantages of the various models, improving on the accuracy of the baseline architectures.
[Code] [Website]

Hindi Text Summarization

My team members and I compared the performance of variations of Seq2Seq models on Hindi text summarization. We further examined the performance gain from beam search with controlled patience, as well as the improvements from fastText embeddings. We qualitatively studied the architectures through attention visualization. We also fine-tuned Multilingual T5 on a Hindi text summarization dataset.
[Code] [Report]

Reconstructing the COVID-19 Infection Trace for South Korea

COVID-19 infected more than 10,000 people in South Korea. In this project, we used the DS4C South Korea patient dataset (announced by the Korea Centers for Disease Control & Prevention) to build the contact trace using state-of-the-art algorithms and to identify people who were likely infected but not reported.
[Code] [Report]

Paraphrase Question Generation Using Graph Convolutional Networks

In this project, we proposed an encoder-decoder model to create a better sentence-level embedding and evaluated the model on the task of paraphrase question generation. Semantics were captured by a pairwise decoder that forces the encodings of similar sentences to be close to each other. Syntax was captured by stacking a GCN over the LSTM states.
[Code]

Research Publications

Self-Supervision of Attention Networks

Kasturi GS, Ansh Jain, Badri N Patro, Vinay P Namboodiri, Winter Conference on Applications of Computer Vision (WACV) 2021

[Paper] [Code] [Initial Research]

Detection and Classification of Radio Frequency Jamming Attacks using Machine learning

Kasturi GS, Ansh Jain, Jagdeep Singh, Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA), Vol.11 No.4

[Paper] [Code]