Devaansh Gupta

I am a master's student at the University of California, Los Angeles, working with Prof. Aditya Grover on diffusion-based language models. Previously, I worked as a Research Assistant at Nanyang Technological University, Singapore, with Prof. Boyang Li on AI planning with (large) language models. I completed my bachelor's in Electronics Engineering at BITS Pilani, India.

I did my undergraduate thesis with Prof. Donglai Wei at Boston College, in collaboration with Harvard VCG, where I worked on multimodal learning. I have also been advised by Prof. Rohit Babbar at Aalto University, working on extreme multi-label classification for retrieval problems.

Email  /  CV  /  GitHub  /  Twitter  /  Google Scholar  /  LinkedIn

Research

I am primarily interested in two streams of research: (i) AI planning and sequential decision making, and (ii) retrieval methods for search and recommendation. I find many synergies between these fields, especially given the emergent agentic behaviour of large language models. My eventual goal is to develop foundation models for planning, which is also the focus of my thesis.

Publications

A Training Data Recipe to Accelerate A* Search with Language Models


Devaansh Gupta, Boyang Li
Findings of EMNLP, 2024

We study a key component of LM-based tree-search algorithms, the heuristic, by disentangling the search process from heuristic learning. We then develop a mathematical model to select training data suited to both the search algorithm and the learned heuristic, achieving significant speed-ups in finding solutions to classical planning problems.

UniDEC: Unified Dual Encoder and Classifier Training for Extreme Multi-label Classification


Devaansh Gupta*, Siddhant Kharbanda*, Gururaj K, Pankaj Malhotra, Amit Singh, Cho-Jui Hsieh, Rohit Babbar
Preprint, 2024

We propose a “Pick-Some-Labels” reduction for multi-label classification, a relaxation of the conventional “Pick-All-Labels” reduction. We couple it with supervised contrastive learning to develop UniDEC, a framework that concurrently trains a dual encoder and a classifier. UniDEC achieves state-of-the-art performance on a single GPU, rivalling baselines that require 8-16 GPUs.

Learning label-label correlations in Extreme Multi-label Classification via Label Features


Siddhant Kharbanda, Devaansh Gupta, Erik Schultheis, Atmadeep Banerjee, Cho-Jui Hsieh, Rohit Babbar
KDD, 2024

We take a data-centric approach to short-text extreme classification and propose two data augmentation methods, LabelMix and Gandalf, derived from label-to-label correlations in the training set. We demonstrate their effect on earlier architectures and advance the state of the art by imbuing models with effective inductive biases that previous approaches lacked.

CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation


Devaansh Gupta, Siddhant Kharbanda, Jiawei Zhou, Wanhua Li, Hanspeter Pfister, Donglai Wei
ICCV, 2023
project / arXiv / code

We hypothesise that machine translation can be improved by introducing a visual component. To this end, we design a new architecture, CLIPTrans, which combines the multimodal CLIP with the multilingual mBART. We demonstrate significant improvements over the previous multimodal machine translation (MMT) state of the art, especially on low-resource languages.

InceptionXML: A Lightweight Framework with Synchronized Negative Sampling for Short Text Extreme Classification


Siddhant Kharbanda, Atmadeep Banerjee, Devaansh Gupta, Akash Palrecha, Rohit Babbar
SIGIR, 2023
arXiv / code

We develop a lightweight convolutional encoder, InceptionXML, within SyncXML, a dynamic negative-sampling framework for short-text extreme classification. Together, they surpass the previous state of the art on a wide range of performance and parameter-efficiency metrics.