Generative AI Course

Week 2: Multimodal Generation

Build image and vision-powered workflows with diffusion models, adapters, and multimodal prompting.

Duration: 5 Sessions
Labs: 4
Project: Brand Asset Generator

Week Plan

DAY 1

Diffusion Fundamentals

90 mins lecture · 30 mins concept map

Topics

Diffusion Pipeline

Text Prompt → Tokenizer → UNet Steps → Image

DAY 2

Prompting for Images

70 mins lecture · 50 mins guided practice

Prompt Design

prompt
portrait of a startup founder, cinematic light, 85mm lens,
soft shadows, neutral background, photorealistic, ultra detailed
negative: blurry, watermark, extra fingers, low quality

DAY 3

ControlNet and LoRA Adapters

80 mins lecture · 40 mins coding

Adapter Techniques

DAY 4

Vision-Language Workflows

70 mins lecture · 50 mins lab

Practical Patterns

DAY 5

Studio Build Day

150 mins implementation

Hands-On Labs

Week 2 Outcomes

  • Generate consistent visual outputs with reliable prompting
  • Apply adapter-based customization for brand-safe assets
  • Create a multimodal mini-product ready for demos

GUIDED PATH

Beginner Walkthrough: From Prompt to Visual Product

Step-by-step · Portfolio focused

Simple explanation of this week

This week teaches you how to generate and control images using text instructions. Instead of just typing random prompts, you will learn a repeatable method: write a clear scene prompt, add quality/style details, include negative prompts, then tune settings until outputs are usable.
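As a rough illustration of that method, the sketch below assembles a prompt from named sections instead of one free-form string. The ImagePrompt class, its field names, and its two methods are hypothetical and exist only for this example; the section names follow the Day 2 template, and the sample values reuse the Day 2 founder prompt. It uses only the Python standard library and does not call any image model.

```python
from dataclasses import dataclass, field


@dataclass
class ImagePrompt:
    """Hypothetical container for the sections of an image prompt."""
    subject: str
    setting: list[str] = field(default_factory=list)
    style: list[str] = field(default_factory=list)
    camera: list[str] = field(default_factory=list)
    quality: list[str] = field(default_factory=list)
    negative: list[str] = field(default_factory=list)

    def positive_text(self) -> str:
        # Join subject, setting, style, camera, and quality fragments
        # in a fixed order, skipping any empty fragments.
        parts = [self.subject, *self.setting, *self.style,
                 *self.camera, *self.quality]
        return ", ".join(p.strip() for p in parts if p.strip())

    def negative_text(self) -> str:
        # Things the image should NOT contain, as one comma-separated string.
        return ", ".join(n.strip() for n in self.negative if n.strip())


founder = ImagePrompt(
    subject="portrait of a startup founder",
    setting=["neutral background"],
    style=["cinematic light", "soft shadows"],
    camera=["85mm lens"],
    quality=["photorealistic", "ultra detailed"],
    negative=["blurry", "watermark", "extra fingers", "low quality"],
)

print(founder.positive_text())
print("negative:", founder.negative_text())
```

Keeping each section in its own field makes it easier to change one thing at a time (for example, only the camera terms) and see what that change does to the output.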

Daily workflow (2 to 3 hours/day)

  1. Day 1: Generate 20 images from the same prompt, varying the step count and the sampler. Record the differences.
  2. Day 2: Build a reusable prompt template with sections: subject, setting, style, camera, quality, negative.
  3. Day 3: Apply one control method (pose, edge, or depth) and compare controlled versus uncontrolled output.
  4. Day 4: Build a basic vision-language flow: image input, caption extraction, then rewrite caption into marketing copy.
  5. Day 5: Combine everything into one mini studio where the user enters a brand style and gets 3 campaign images.
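One possible way to organize the Day 1 experiment is to plan all 20 runs up front and keep the results in a small log. The sketch below is only a planning aid under stated assumptions: the step counts, the sampler labels, the seed, and the file naming are placeholders chosen for this example, and the code does not generate images. Replace the sampler labels with whatever names your image tool actually exposes.

```python
import csv
import io
from itertools import product

PROMPT = "portrait of a startup founder, cinematic light, 85mm lens"
STEP_COUNTS = [10, 20, 30, 40, 50]                    # assumed values
SAMPLERS = ["euler", "euler_a", "ddim", "dpmpp_2m"]   # placeholder labels
SEED = 1234  # fixed (assumed) seed so only steps and sampler vary


def build_runs():
    """Return one record per (sampler, steps) combination: 4 x 5 = 20 runs."""
    runs = []
    combos = product(SAMPLERS, STEP_COUNTS)
    for run_id, (sampler, steps) in enumerate(combos, start=1):
        runs.append({
            "run_id": run_id,
            "prompt": PROMPT,
            "sampler": sampler,
            "steps": steps,
            "seed": SEED,
            "output_file": f"run_{run_id:02d}_{sampler}_{steps}.png",
            "notes": "",  # fill in by hand after viewing each image
        })
    return runs


def runs_to_csv(runs):
    """Serialize the run records to CSV text (header plus one row per run)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(runs[0].keys()))
    writer.writeheader()
    writer.writerows(runs)
    return buffer.getvalue()


runs = build_runs()
print(len(runs), "runs planned")
print(runs_to_csv(runs).splitlines()[1])  # first data row
```

Holding the prompt and seed constant means any visible difference between two images can be attributed to the step count or the sampler, which is what the "Record the differences" step relies on.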

Common mistakes and fixes

Assignment to complete Week 2

Create a Brand Asset Generator that takes three inputs: brand personality, product type, and campaign tone. Output at least 6 final images in two style groups. Include your prompt templates and explain how you improved output quality.
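To make the assignment's shape concrete, here is a minimal sketch of the planning step only: it turns the three inputs into 6 prompts split across two style groups. Everything named here is an assumption for illustration: the BrandBrief class, the two style groups, the three scenes, the negative prompt, and the sample brief are hypothetical, and the actual image generation call is left out because it depends on the tool you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandBrief:
    """The three inputs named in the assignment."""
    personality: str
    product_type: str
    campaign_tone: str


# Hypothetical style groups and scenes; swap in your own.
STYLE_GROUPS = {
    "studio_photo": "clean studio product photography, soft lighting",
    "flat_illustration": "flat vector illustration, bold shapes",
}
SCENES = [
    "hero shot on plain background",
    "product in everyday use",
    "close-up detail",
]
NEGATIVE = "blurry, watermark, low quality"


def plan_assets(brief):
    """Return one job per (style group, scene) pair: 2 x 3 = 6 jobs."""
    jobs = []
    for group, style in STYLE_GROUPS.items():
        for scene in SCENES:
            prompt = (
                f"{brief.product_type}, {scene}, {style}, "
                f"{brief.personality} brand personality, "
                f"{brief.campaign_tone} tone"
            )
            jobs.append({
                "style_group": group,
                "prompt": prompt,
                "negative": NEGATIVE,
            })
    return jobs


brief = BrandBrief("playful", "reusable water bottle", "energetic")
jobs = plan_assets(brief)
for job in jobs:
    print(job["style_group"], "|", job["prompt"])
```

Each job dictionary could then be passed to whichever image generator you built during the week, and the list of prompts doubles as the prompt-template evidence the assignment asks you to include.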

Submission checklist

Pass rubric