Welcome to my homepage!


Note: This webpage was last updated on 02/27/2026.

About me

Hi folks, welcome to my personal homepage! I’m a second-year M.S. (thesis) student in Computer Science at Virginia Tech, fully funded with a $62,705/year scholarship and maintaining a 4.0/4.0 GPA. My research focuses on adversarial attacks in multimodal large language models, computer vision, and natural language processing. I have 175+ citations (h-index: 4), with papers published at ICLR and CHI and work under review at ACL and ICML. I also built effGen (100+ stars), an open-source framework for building agentic pipelines with small language models, and LLMThinkBench (4.92k+ PyPI downloads).

Most recently, I spent Summer 2025 at Nokia Bell Labs as a Computer Vision Intern in the Industrial Metaverse group, where I built an end-to-end multimodal LLM and multi-view stereo reconstruction pipeline that reduced manual annotation effort by ~95% (patent submitted). Prior to Virginia Tech, I completed my B.Tech in Computer Engineering at K.J. Somaiya College of Engineering, Mumbai (GPA: 9.44/10) in 2022. I then worked at Deloitte as a Software Development Engineer and served as a Research Scholar at Madan Bhandari University of Science and Technology in Nepal.

Research Interests

My research spans multimodal machine learning, adversarial robustness, and NLP. Current focus areas:

  1. Adversarial Attacks on Multimodal Models: I study how audio-only adversarial perturbations can break trimodal (audio-video-language) models. My work on SoundBreak demonstrates that audio-only attacks can induce severe multimodal failures, with attack success rates of up to 96% across state-of-the-art models.

  2. LLM Evaluation and Reasoning: I work on benchmark-free evaluation of reasoning in language models. I contributed to BeyondBench (ICLR 2026), which generates fresh math problems algorithmically to avoid benchmark contamination. I also work on LLMThinkBench, which evaluates math reasoning and overthinking in LLMs and reports results from 55+ models.

  3. Efficient Agentic AI with Small Language Models: I built effGen (GitHub), a framework for agentic pipelines with small language models. effGen enables SLMs to act as capable autonomous agents with optimized tool-calling, intelligent task decomposition, complexity-based routing, and a unified memory system. The goal is practical: real tasks, low cost, no massive GPU clusters.

News

  • [Jan. 26, 2026] BeyondBench was accepted to ICLR 2026!
  • [Jan. 20, 2026] New preprint on SoundBreak, a systematic study of audio-only adversarial attacks on trimodal models! (Under review at ACL 2026)
  • [2026] Designing Multi-Robot Ground Video Sensemaking was accepted to CHI 2026!
  • [Jan. 31, 2026] New preprint on effGen, which enables small language models to act as capable autonomous agents! (Under review at ICML 2026)
  • [Jun. 2025] Started summer internship at Nokia Bell Labs as Computer Vision Intern (Industrial Metaverse)!
  • [Aug. 2024] Began my M.S. in Computer Science at Virginia Tech!

Selected Projects

  • effGen  |  GitHub  ·  Docs
    An open-source agentic framework optimized for small language models (SLMs) with enhanced tool-calling, intelligent task decomposition, complexity-based routing, and a unified memory system.

  • LLMThinkBench  |  PyPI  ·  Leaderboard
    A dynamic math reasoning evaluation framework for LLMs across 14 arithmetic tasks, with 4.92k+ downloads, results from 55+ LLMs, and a novel Overthinking Score metric.

  • DataSense
    A multi-agent data visualization system that uses LLM ensembles to analyze datasets and automatically create the top 3 charts from 9+ options, reducing manual effort by ~75%.

Technical Skills

  • Programming Languages: Python, Java, JavaScript
  • Libraries & Frameworks: PyTorch, TensorFlow, Transformers, vLLM, scikit-learn, OpenCV, LangChain
  • Technologies: Flask, MySQL
  • Tools: AWS, FastAPI, Docker