Shivam Singh

I'm a Computer Science PhD student at Arizona State University specializing in computer vision and generative models, with a focus on controllable image and video generation. I am broadly interested in developing theoretical foundations and practical tools for generative models, for use cases such as image editing and personalization. I’m also interested in using reinforcement learning (RL) to steer generative models toward outputs closely aligned with human preferences. I have over two years of research experience in model fine-tuning and large-scale image generation.

Previously, I completed my Bachelor’s degree in Computer Science at Jadavpur University, India, where I worked as an undergraduate research assistant on AI in medical imaging.

If you’d like to discuss research opportunities, please feel free to reach out via email.

Email  /  CV  /  Scholar  /  GitHub

News

  • October 2025: I will be presenting our work on RefEdit at ICCV 2025 in Hawaii!
  • September 2025: Our paper Chimera is under review at ICLR.
  • May 2025: Our paper RefEdit has been accepted to ICCV 2025!
  • August 2024: I have joined Arizona State University as a PhD student!

Publications

I'm interested in computer vision, deep learning, generative AI, and image processing.

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring Expression
Bimsara Pathiraja, Maitreya Patel, Shivam Singh, Yezhou Yang, Chitta Baral
ICCV, 2025
project page / arXiv

We develop RefEdit, an instruction-based editing model trained on 20,000 synthetic triplets. It outperforms baselines trained on millions of samples in complex scene editing and achieves state-of-the-art results on referring expression and traditional benchmarks.

Chimera: Compositional Image Generation using Part-Based Concepting
Shivam Singh, Yiming Chen, Agneet Chatterjee, Amit Raj, James Hays, Yezhou Yang, Chitta Baral
Under Review, 2025  
project page / arXiv

Chimera is a personalized image generation model trained on a semantic part-based dataset, enabling novel object synthesis by combining parts from multiple images via textual instructions, outperforming baselines by 14% in compositional accuracy and 21% in visual quality.

Teaching

  • Operating Systems (Upper-level Undergraduate, Fall 2024) – Supported a class of 150+ students by holding exam review sessions and guiding students on lab projects involving process management, synchronization, and memory allocation.
  • Object-Oriented Programming and Data Structures (First-year Undergraduate, Spring 2025) – Assisted a class of 100+ students with coursework covering object-oriented design principles, data structures (lists, trees, graphs), and algorithms. Held office hours to clarify concepts and provide feedback on lab implementations.

This website is inspired by Jon Barron and modified by Shivam Singh.