Satvik Dixit
I am a master's student at Carnegie Mellon University. I work with Professor Bhiksha Raj on Audio Language Models and Professor Chris Donahue on Generative Audio. I am interested in audio understanding and generation. Previously, I interned with Dr. Satrajit Ghosh at MIT and Dr. Martin Vetterli at EPFL. I completed my undergraduate degree in Electrical Engineering at IIT Delhi, where I concentrated on signal processing and machine learning.

Selected Publications and Preprints

Mellow: a small audio language model for reasoning

Soham Deshmukh, Satvik Dixit, Rita Singh, Bhiksha Raj

MACE: Leveraging Audio for Evaluating Audio Captioning Systems

Satvik Dixit, Soham Deshmukh, Bhiksha Raj

  • ICASSP 2025 Speech and Audio Language Models (SALMA) Workshop
  • Paper
  • Code

Vision Language Models Are Few-Shot Audio Spectrogram Classifiers

Satvik Dixit, Laurie Heller, Chris Donahue

  • NeurIPS 2024 Audio Imagination Workshop
  • Paper