I am a research assistant at Nanyang Technological University, Singapore, working with Dr. Albert Li on vision-language models. I hold a bachelor's degree in Electronics Engineering from BITS Pilani, India.
Previously, I was advised by Dr. Donglai Wei at Boston College, where I worked on multimodal AI, and by Dr. Rohit Babbar at Aalto University, where I worked on extreme classification for search and recommendation.
I have also worked at the Harvard Visual Computing Group on neuron instance segmentation, advised by Dr. Hanspeter Pfister.
Email / CV / GitHub / LinkedIn
Presently, I am exploring large language models with the aim of building an intuition for their behaviour. My long-term goal is to develop multimodal conversational search and recommendation systems. My research focuses on improving quantitative performance while keeping computational costs low, making models more accessible to the research community.
We hypothesise that machine translation can be improved by introducing a visual component. To this end, we design CLIPTrans, a new architecture that combines the multimodal CLIP with the multilingual mBART. We demonstrate significant improvements over the previous multimodal machine translation (MMT) state of the art (SOTA), especially on low-resource languages.
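For a flavour of the idea, here is a minimal sketch (not the paper's code) of one common way to bridge two pretrained models: a small mapping network turns a CLIP embedding into a short sequence of mBART-sized vectors that can be prepended to the translation model's inputs. All dimensions, module names, and the prefix length below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """Maps one CLIP embedding to a short sequence of mBART-sized vectors."""
    def __init__(self, clip_dim=512, mbart_dim=1024, prefix_len=10):
        super().__init__()
        self.prefix_len = prefix_len
        self.mbart_dim = mbart_dim
        self.proj = nn.Sequential(
            nn.Linear(clip_dim, mbart_dim * prefix_len),
            nn.GELU(),
        )

    def forward(self, clip_emb):            # (B, clip_dim)
        prefix = self.proj(clip_emb)        # (B, mbart_dim * prefix_len)
        return prefix.view(-1, self.prefix_len, self.mbart_dim)

mapper = MappingNetwork()
image_emb = torch.randn(4, 512)             # stand-in for a CLIP image embedding
prefix = mapper(image_emb)                  # (4, 10, 1024): prepend to mBART inputs
```

Keeping the bridge this small means both pretrained backbones can stay frozen while only the mapping network is trained.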
We develop InceptionXML, a lightweight convolutional encoder, within SyncXML, a dynamic negative-sampling framework for short-text extreme classification. Together they surpass the previous SOTA on a range of accuracy and parameter-efficiency metrics.
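To illustrate why negative sampling matters at this scale, here is a generic sketch of hard-negative shortlisting (a common technique in extreme classification, not necessarily SyncXML's exact scheme): rather than computing a loss over the full label set, each step trains only on the ground-truth labels plus the highest-scoring negatives. The function name and shortlist size are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def shortlisted_ova_loss(logits, positives, num_negatives=50):
    """logits: (B, L) scores over the full label set;
    positives: (B, L) binary ground-truth label matrix."""
    pos_mask = positives.bool()
    # Hardest negatives = highest-scoring labels that are not ground truth.
    neg_scores = logits.masked_fill(pos_mask, float("-inf"))
    hard_negs = neg_scores.topk(num_negatives, dim=1).values   # (B, k)
    # One-vs-all BCE restricted to positives + shortlisted negatives.
    pos_term = F.softplus(-logits[pos_mask]).sum()   # -log sigmoid(s+)
    neg_term = F.softplus(hard_negs).sum()           # -log(1 - sigmoid(s-))
    return (pos_term + neg_term) / logits.size(0)

scores = torch.randn(8, 10_000)                      # toy label space of 10k
labels = (torch.rand(8, 10_000) < 0.001).float()     # sparse ground truth
loss = shortlisted_ova_loss(scores, labels)
```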
We take a data-centric approach to short-text extreme classification, proposing two data augmentation methods, LabelMix and Gandalf, derived from label-to-label correlations in the training set. We demonstrate their effect on earlier architectures and advance the SOTA by instilling effective inductive biases that previous models lacked.
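As a rough illustration of mining label-to-label correlations (a hedged sketch of the general idea, not the actual LabelMix/Gandalf code), one can normalise the label co-occurrence matrix of the training set and keep each label's strongest correlates as a soft target vector; the function name and top-k cutoff below are assumptions.

```python
import numpy as np

def label_correlation_targets(Y, topk=5):
    """Y: (N, L) binary instance-label matrix from the training set.
    Returns (L, L): row l holds soft targets over labels correlated with l."""
    co = Y.T @ Y                              # (L, L) label co-occurrence counts
    freq = np.maximum(np.diag(co), 1)
    corr = co / freq[:, None]                 # roughly P(label j | label l)
    np.fill_diagonal(corr, 1.0)               # a label always implies itself
    soft = np.zeros_like(corr, dtype=float)
    idx = np.argsort(-corr, axis=1)[:, :topk] # keep top-k correlates per label
    rows = np.arange(corr.shape[0])[:, None]
    soft[rows, idx] = corr[rows, idx]
    return soft

Y = (np.random.rand(1000, 200) < 0.02).astype(float)
targets = label_correlation_targets(Y)        # each row can serve as a soft label
```

Each row of such a matrix can then back an augmented training instance, injecting correlation structure the encoder would not otherwise see from short texts alone.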