SC 395 • Winter 2025

Image Generative Models in Computer Vision

By Viraj Shah (Research Scientist, Google) • IIT Gandhinagar

Overview

Course Description

This short course gives a rigorous overview of the state of the art in generative modeling, moving from foundational adversarial techniques to modern diffusion and flow-based paradigms.

Designed for senior undergraduate and graduate students, the curriculum balances theoretical derivations (SDEs, ODEs, Flow Matching) with practical architectures and adaptation methods (Diffusion Transformers, LoRA, ControlNet). The course concludes with an exploration of frontier applications in the engineering sciences and the ethical implications of synthetic media.

Prerequisites
  • Probability Theory (Expectation, Variance, Gaussian Distributions)
  • Linear Algebra (Matrix decompositions, Vector spaces)
  • Deep Learning Fundamentals (CNNs, Transformers, Backpropagation)

Curriculum

Syllabus & Materials

Hands-On

Laboratory Sessions

Lab A: Foundations of Diffusion & Flow

  • Noise Schedules (Linear/Cosine)
  • Forward/Reverse Pass (Toy U-Net)
  • Inference (DDIM Sampling loop)
  • Intro to Hugging Face Diffusers
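
As a preview of the Lab A topics, the sketch below builds a linear and a cosine noise schedule, applies the closed-form forward noising step, and performs one deterministic DDIM update (eta = 0). It is a minimal illustration in plain Python, not the lab's actual starter code. The function names, the defaults `beta_start=1e-4`, `beta_end=0.02`, and `s=0.008`, and the use of flat lists in place of image tensors are all assumptions made for this example.

```python
import math


def linear_betas(T, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (default values are assumed)."""
    step = (beta_end - beta_start) / max(T - 1, 1)
    return [beta_start + i * step for i in range(T)]


def alpha_bars_from_betas(betas):
    """Cumulative signal level: alpha_bar_t = prod_{i<=t} (1 - beta_i)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def cosine_alpha_bars(T, s=0.008):
    """Cosine schedule, defined directly on alpha_bar_t for t = 1..T."""
    def f(t):
        return math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    return [f(t) / f(0) for t in range(1, T + 1)]


def q_sample(x0, eps, abar_t):
    """Forward pass: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    a, b = math.sqrt(abar_t), math.sqrt(1.0 - abar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0) from level abar_t to abar_prev."""
    a_t, b_t = math.sqrt(abar_t), math.sqrt(1.0 - abar_t)
    a_p, b_p = math.sqrt(abar_prev), math.sqrt(1.0 - abar_prev)
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [(x - b_t * e) / a_t for x, e in zip(x_t, eps_pred)]
    # Re-noise that estimate to the earlier (less noisy) level.
    return [a_p * x0 + b_p * e for x0, e in zip(x0_pred, eps_pred)]


# Toy usage: noise a 2-D "image" at step 500, then take one DDIM step
# using the true noise in place of a trained network's prediction.
abars = alpha_bars_from_betas(linear_betas(1000))
x0, eps = [1.0, -2.0], [0.5, 0.3]
x_t = q_sample(x0, eps, abars[500])
x_prev = ddim_step(x_t, eps, abars[500], abars[400])
```

In a real sampling loop, `eps_pred` would come from the denoising network (the toy U-Net in this lab), and `ddim_step` would be applied repeatedly over a decreasing subsequence of timesteps.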

Lab B: Advanced Adaptation & Merging

  • Personalization (DreamBooth/LoRA)
  • ControlNet Inference
  • Model Merging (Diffusion Soup/SLERP)
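
For the model merging topic in Lab B, the following sketch shows spherical linear interpolation (SLERP) between two flattened weight vectors, alongside a plain uniform weight average, which is the basic idea behind soup-style merging. It is an illustrative assumption about how merging could be set up, not the lab's reference implementation. The names `slerp` and `average_weights`, the tolerance `eps`, and the choice to fall back to linear interpolation for degenerate inputs are all hypothetical.

```python
import math


def lerp(w0, w1, t):
    """Plain linear interpolation between two weight vectors."""
    return [(1.0 - t) * a + t * b for a, b in zip(w0, w1)]


def slerp(w0, w1, t, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors."""
    n0 = math.sqrt(sum(a * a for a in w0))
    n1 = math.sqrt(sum(b * b for b in w1))
    if n0 < eps or n1 < eps:
        return lerp(w0, w1, t)  # angle is undefined for a zero vector
    cos_theta = sum(a * b for a, b in zip(w0, w1)) / (n0 * n1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        return lerp(w0, w1, t)  # (anti)parallel vectors: fall back to lerp
    c0 = math.sin((1.0 - t) * theta) / sin_theta
    c1 = math.sin(t * theta) / sin_theta
    return [c0 * a + c1 * b for a, b in zip(w0, w1)]


def average_weights(models):
    """Uniform average of several weight vectors of equal length."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


# Toy usage with 2-D "weights"; real checkpoints would be merged
# parameter tensor by parameter tensor.
merged = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
soup = average_weights([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Unlike the uniform average, SLERP follows the arc between the two vectors, so when both inputs have the same norm the interpolated weights keep that norm.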