Devaansh Gupta

I am a Research Assistant at Nanyang Technological University, Singapore, with Prof. Albert Li, working on AI Planning with (Large) Language Models. I completed my Bachelor's in Electronics Engineering at BITS Pilani, India, and, starting Fall 2024, I will be pursuing my Master's in Computer Science at UCLA.

I did my undergraduate thesis with Prof. Donglai Wei at Boston College, in collaboration with researchers at Harvard VCG, where I worked on multimodal learning. I have also been advised by Prof. Rohit Babbar at Aalto University, working on Extreme Multi-label Classification for retrieval problems.

Email  /  CV  /  GitHub  /  Twitter  /  Google Scholar  /  LinkedIn

Research

I am primarily interested in two streams of research: (i) AI planning and sequential decision making, and (ii) retrieval methods for search and recommendation. I find many synergies between these fields, especially with the emergent agentic behaviour of large language models. My eventual goal is to develop foundation models for planning, towards which I am working on augmenting conventional planning algorithms with language models.


Publications


UniDEC: Unified Dual Encoder and Classifier Training for Extreme Multi-label Classification


Siddhant Kharbanda*, Devaansh Gupta*, Gururaj K, Pankaj Malhotra, Cho-Jui Hsieh, Rohit Babbar
Preprint, 2024

We propose a “Pick-Some-Labels” reduction for multi-label classification, a relaxation of the conventional “Pick-All-Labels” reduction, and couple it with supervised contrastive learning in a framework, UniDEC, that concurrently trains a dual encoder and a classifier. UniDEC achieves state-of-the-art performance on a single GPU, rivalling baselines that require 8-16 GPUs.
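
A minimal sketch of the idea (not the actual UniDEC implementation; the function names, shapes, and sampling scheme below are illustrative assumptions):

    import torch
    import torch.nn.functional as F

    def pick_some_labels(labels: torch.Tensor, k: int) -> torch.Tensor:
        # "Pick-Some-Labels": for each instance, keep at most k of its positive
        # labels instead of all of them ("Pick-All-Labels").
        kept = torch.zeros_like(labels)
        for i, row in enumerate(labels):
            pos = row.nonzero(as_tuple=True)[0]
            if len(pos) > k:
                pos = pos[torch.randperm(len(pos))[:k]]
            kept[i, pos] = 1.0
        return kept

    def supcon_loss(query_emb, label_emb, kept_labels, temperature=0.1):
        # Supervised contrastive loss between instance embeddings from the dual
        # encoder and the embeddings of the sampled positive labels; the same
        # sampled targets can also supervise the classifier head.
        sims = query_emb @ label_emb.t() / temperature          # (batch, num_labels)
        log_probs = F.log_softmax(sims, dim=1)
        pos_per_row = kept_labels.sum(dim=1).clamp(min=1)
        return -(log_probs * kept_labels).sum(dim=1).div(pos_per_row).mean()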


CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation


Devaansh Gupta, Siddhant Kharbanda, Jiawei Zhou, Wanhua Li, Hanspeter Pfister, Donglai Wei
ICCV, 2023
project / arxiv / code

We hypothesise that machine translation can be improved by introducing a visual component. To this end, we design a new architecture, CLIPTrans, which combines the multimodal CLIP with the multilingual mBART. We demonstrate significant improvements over the previous multimodal machine translation (MMT) SOTA, especially on low-resource languages.
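
A schematic sketch of how such a combination can be wired up (toy dimensions and module names are illustrative, not the paper's; the actual model builds on pretrained CLIP and mBART checkpoints):

    import torch
    import torch.nn as nn

    class ToyCLIPTrans(nn.Module):
        # Schematic only: project CLIP image features into the translation
        # model's embedding space and prepend them as a "visual prefix" to
        # the embedded source tokens.
        def __init__(self, clip_dim=512, dec_dim=1024, prefix_len=10, vocab_size=250027):
            super().__init__()
            self.prefix_len, self.dec_dim = prefix_len, dec_dim
            self.mapper = nn.Linear(clip_dim, dec_dim * prefix_len)   # mapping network
            self.token_emb = nn.Embedding(vocab_size, dec_dim)        # stands in for mBART embeddings

        def forward(self, clip_feats, src_tokens):
            prefix = self.mapper(clip_feats).view(-1, self.prefix_len, self.dec_dim)
            src = self.token_emb(src_tokens)
            # the concatenated sequence would be fed to the mBART encoder as inputs_embeds
            return torch.cat([prefix, src], dim=1)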


InceptionXML: A Lightweight Framework with Synchronized Negative Sampling for Short Text Extreme Classification


Siddhant Kharbanda, Atmadeep Banerjee, Devaansh Gupta, Akash Palrecha, Rohit Babbar
SIGIR, 2023
arxiv / code

We develop a lightweight convolutional encoder, InceptionXML, within a dynamic negative-sampling framework, SyncXML, for short-text extreme classification. Together they surpass the previous SOTA on both predictive performance and parameter efficiency.


Learning label-label correlations in Extreme Multi-label Classification via Label Features


Siddhant Kharbanda, Devaansh Gupta, Erik Schultheis, Atmadeep Banerjee, Cho-Jui Hsieh, Rohit Babbar
Preprint, 2021

We take a data-centric approach to short-text extreme classification and propose data augmentation methods, LabelMix and Gandalf, derived from label-to-label correlations in the training set. We demonstrate their effect on prior architectures and advance the SOTA by introducing effective inductive biases that were missing in previous models.
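
A generic sketch of label-correlation-based augmentation of this flavour (the exact LabelMix and Gandalf procedures differ; the names and threshold below are illustrative assumptions):

    import numpy as np
    from scipy.sparse import csr_matrix

    def label_cooccurrence(Y: csr_matrix) -> csr_matrix:
        # Row-normalised label-label co-occurrence matrix from a binary
        # (num_instances x num_labels) training label matrix.
        C = (Y.T @ Y).tocsr().astype(np.float32)
        C.setdiag(0)
        row_sums = np.asarray(C.sum(axis=1)).ravel()
        row_sums[row_sums == 0] = 1.0
        return csr_matrix(C.multiply(1.0 / row_sums[:, None]))

    def augment_from_label_features(label_texts, C, thresh=0.1):
        # Turn each label's own text (its "label feature") into an extra
        # training instance whose targets are the labels it strongly
        # co-occurs with in the training set.
        pairs = []
        for l, text in enumerate(label_texts):
            row = C.getrow(l)
            correlated = row.indices[row.data >= thresh]
            pairs.append((text, set(correlated.tolist()) | {l}))
        return pairs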