I am a master's student at the University of California, Los Angeles, working on my master's thesis on reasoning with diffusion language models, advised by Prof. Aditya Grover. Previously, I worked as a Research Assistant at Nanyang Technological University, Singapore, with Prof. Albert Li on AI planning with LLMs. I completed my bachelor's degree at BITS Pilani, India.
I did my undergraduate thesis with Prof. Donglai Wei at Boston College, in collaboration with Harvard VCG, working on multimodal learning. I have also been advised by Prof. Rohit Babbar at Aalto University, working on extreme multilabel classification for retrieval problems.
Email / CV / GitHub / Twitter / Google Scholar / LinkedIn
I am primarily interested in reasoning and in how thinking models (like o1) can be trained. Presently, I am exploring diffusion LLMs and inference-time scaling for this task. My eventual goal is to develop foundation models and to understand how pretraining data shapes these capabilities.
We propose a two-stage framework, d1, that employs masked SFT on distilled reasoning traces, followed by a variant of GRPO for dLLMs, called diffu-GRPO, to convert pretrained dLLMs into strong reasoning models. With this, we demonstrate reasoning performance competitive with AR models, and a faster convergence rate than conventional GRPO!
We study an important aspect of LM-based tree-search algorithms, the heuristic, by disentangling the search process from heuristic learning. Subsequently, we develop a mathematical model to select training data in accordance with both algorithms, achieving significant speed-ups in finding solutions to classical planning problems.
We propose a “Pick-Some-Labels” reduction for multilabel classification, a relaxation of the conventional “Pick-All-Labels” reduction. Coupling this with supervised contrastive learning, we develop UniDEC, a framework that concurrently trains a dual encoder and a classifier. UniDEC achieves state-of-the-art performance on a single GPU, rivalling baselines which require 8-16 GPUs.
We take a data-centric approach to short-text extreme classification and propose data augmentation methods, LabelMix and Gandalf, which are derived from label-to-label correlations in the training set. We demonstrate their effects on previous architectures and advance the SOTA by imbuing effective inductive biases that were missing in previous models.
We hypothesise that machine translation can be improved by introducing a visual component. To this end, we design a new architecture, CLIPTrans, which combines the multimodal CLIP with the multilingual mBART. We demonstrate significant improvements over the previous MMT SOTA, especially on low-resource languages.
We developed a lightweight convolutional encoder, InceptionXML, within a dynamic negative-sampling framework, SyncXML, for short-text extreme classification. Together, they surpass the previous SOTA across a multitude of performance and parametric metrics.