Shivam Singh
I'm a Computer Science PhD student at Arizona State University specializing in computer vision and generative models, with a focus on controllable image and video generation. I am broadly interested in developing theoretical foundations and practical tools for generative models, for use cases like image editing and personalization. I’m also interested in using reinforcement learning (RL) to steer generative models toward outputs closely aligned with human preferences. I have over 2 years of research experience in model fine-tuning and large-scale image generation.
Previously, I completed my Bachelor’s degree in Computer Science at Jadavpur University, India, where I worked as an undergraduate research assistant on AI in medical imaging.
If you’d like to discuss research opportunities, please feel free to reach out via email.
Email / CV / Scholar / GitHub
- October 2025: I will be presenting our work on RefEdit at ICCV 2025 in Hawaii!
- September 2025: Our paper Chimera is under review at ICLR.
- May 2025: Our paper RefEdit has been accepted to ICCV 2025!
- August 2024: I have joined Arizona State University as a PhD student!
Publications
I'm interested in computer vision, deep learning, generative AI, and image processing.
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring Expression
Bimsara Pathiraja, Maitreya Patel, Shivam Singh, Yezhou Yang, Chitta Baral
ICCV, 2025
project page / arXiv
We develop RefEdit, an instruction-based editing model trained on 20,000 synthetic triplets. It outperforms baselines trained on millions of samples in complex scene editing and achieves state-of-the-art results on referring-expression and traditional benchmarks.
Chimera: Compositional Image Generation using Part-Based Concepting
Shivam Singh, Yiming Chen, Agneet Chatterjee, Amit Raj, James Hays, Yezhou Yang, Chitta Baral
Under Review, 2025
project page / arXiv
Chimera is a personalized image generation model trained on a semantic part-based dataset. It enables novel object synthesis by combining parts from multiple images via textual instructions, and outperforms baselines by 14% in compositional accuracy and 21% in visual quality.
- Operating Systems (Upper-level Undergraduate, Fall 2024) – Supported a class of 150+ students by holding exam review sessions and guiding students on lab projects involving process management, synchronization, and memory allocation.
- Object-Oriented Programming and Data Structures (First-year Undergraduate, Spring 2025) – Assisted a class of 100+ students with coursework covering object-oriented design principles, data structures (lists, trees, graphs), and algorithms. Held office hours to clarify concepts and provide feedback on lab implementations.