Generative AI Course

Week 3: Fine-Tuning and RAG

Learn how to adapt models to domain data and ground outputs with retrieval for factual, high-precision generation.

Duration: 5 Sessions

Labs: 5

Project: Knowledge Copilot

DAY 1

Data Curation for Fine-Tuning

Designing instruction-response datasets
Cleaning noisy examples and removing leakage
Balancing diversity and style consistency
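
The cleaning and leakage-removal steps above can be sketched with the standard library alone. This is a minimal illustration, not the course's reference pipeline: the field names `instruction` and `response`, the `min_response_chars` threshold, and the toy rows are all assumptions made for the example.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())


def fingerprint(text: str) -> str:
    """Stable hash of the normalized text, used as a duplicate/leakage key."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def clean_dataset(examples, eval_prompts, min_response_chars=20):
    """Drop empty or too-short rows, duplicates, and rows that leak eval prompts."""
    eval_keys = {fingerprint(p) for p in eval_prompts}
    seen = set()
    kept = []
    for ex in examples:
        instruction = ex.get("instruction", "")
        response = ex.get("response", "")
        if not instruction.strip() or len(response.strip()) < min_response_chars:
            continue  # noisy: missing instruction or too-short response
        key = fingerprint(instruction)
        if key in eval_keys:
            continue  # leakage: instruction also appears in the eval set
        if key in seen:
            continue  # duplicate instruction
        seen.add(key)
        kept.append({"instruction": instruction.strip(), "response": response.strip()})
    return kept


# Toy rows, invented for illustration only.
examples = [
    {"instruction": "Summarize the leave policy.",
     "response": "Employees accrue leave monthly and may carry over unused days."},
    {"instruction": "summarize  the leave policy.",
     "response": "Duplicate of the first row with different spacing and case."},
    {"instruction": "What is the refund window?", "response": "ok"},
    {"instruction": "List the onboarding steps.",
     "response": "Sign the contract, collect equipment, and complete training."},
]
eval_prompts = ["List the onboarding steps."]

cleaned = clean_dataset(examples, eval_prompts)
print(len(cleaned))  # 1
```

Only the first row survives: the second is a duplicate after normalization, the third has a too-short response, and the fourth overlaps with the evaluation prompts. Exact-match hashing misses paraphrased leakage, so a real pipeline would likely add fuzzy or embedding-based matching.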

DAY 2

LoRA and QLoRA Workflows

Parameter-efficient fine-tuning at low GPU memory and compute cost
Choosing rank, alpha, and target modules
Checkpoint management and rollback strategy

python

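# Illustrative LoRA hyperparameters (starting points, not tuned values)
lora_r = 16          # rank of the low-rank update matrices
lora_alpha = 32      # scaling factor; update is scaled by lora_alpha / lora_r = 2.0
lora_dropout = 0.05  # dropout applied to the LoRA layers
batch_size = 4       # per-device micro-batch size
grad_accum = 8       # gradient accumulation steps
effective_batch_size = batch_size * grad_accum  # 32 examples per optimizer step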
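
To see why a rank like 16 keeps training cheap, it can help to count parameters. For a frozen weight matrix of shape `d_out x d_in`, LoRA trains two low-rank factors with `r * (d_in + d_out)` parameters in total. The sketch below does that arithmetic; the 4096-wide layer is a hypothetical size chosen for illustration, not a value from the course.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the original dense weight matrix."""
    return d_in * d_out


d_in = d_out = 4096  # hypothetical projection size
r = 16               # same rank as the config above

lora = lora_trainable_params(d_in, d_out, r)
full = full_params(d_in, d_out)
print(lora, full, f"{lora / full:.2%}")  # 131072 16777216 0.78%
```

Under these assumptions each adapted matrix trains under 1% of the parameters a full fine-tune would update. Doubling the rank doubles the adapter size, which is one way to reason about the rank choice.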

DAY 3

RAG Architecture and Retrieval Quality

Chunking, embeddings, and vector search strategies
Hybrid retrieval: keyword + semantic search, followed by reranking
Citation-first generation and confidence scoring
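
One common way to merge keyword and semantic result lists is reciprocal rank fusion (RRF). The outline does not name a fusion method, so RRF is swapped in here purely as an illustration; the document IDs are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs; each doc scores sum(1 / (k + rank)) over lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), then by doc ID for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical keyword (e.g. BM25) results
semantic_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical embedding-search results

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, and the fused list can then be passed to a reranker for the final ordering.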

Component	Decision	Impact
Chunk Size	500-900 tokens	Precision versus context coverage
Retriever Top-K	4-8 docs	Cost and hallucination balance
Reranker	Cross-encoder	Higher relevance, extra latency
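
The chunk-size row can be made concrete with a small sliding-window chunker. This is a sketch under stated assumptions: whitespace splitting stands in for a real tokenizer, and the 50-token overlap is an illustrative choice that the table does not specify.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into fixed-size windows that overlap by `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


# Whitespace splitting is an assumption; a real pipeline would use the model's tokenizer.
tokens = ("word " * 1200).split()
chunks = chunk_tokens(tokens, chunk_size=500, overlap=50)
print([len(c) for c in chunks])  # [500, 500, 300]
```

Smaller chunks tend to improve retrieval precision, while larger ones carry more surrounding context. The overlap reduces the chance that a relevant passage is split across a chunk boundary.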

DAY 4

Evaluation and Error Analysis

Task-specific eval sets and golden responses
Faithfulness, groundedness, and answer completeness
Regression testing before deployment
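
A regression check before deployment can be as simple as scoring the baseline and the candidate on the same golden set and blocking the release if the score drops. The sketch below uses exact match as a deliberately simple stand-in metric; faithfulness or groundedness scoring would need a more capable judge. The `tolerance` value and the sample answers are assumptions made for illustration.

```python
def exact_match(prediction: str, golden: str) -> float:
    """1.0 if the normalized strings match, else 0.0 (a deliberately strict metric)."""
    return float(prediction.strip().lower() == golden.strip().lower())


def mean_score(predictions, goldens, metric=exact_match):
    """Average a per-example metric over the whole eval set."""
    if len(predictions) != len(goldens):
        raise ValueError("predictions and goldens must have the same length")
    if not goldens:
        raise ValueError("eval set is empty")
    return sum(metric(p, g) for p, g in zip(predictions, goldens)) / len(goldens)


def passes_regression(baseline_score, candidate_score, tolerance=0.02):
    """Allow the candidate to ship only if it does not drop more than `tolerance`."""
    return candidate_score >= baseline_score - tolerance


# Toy golden answers and model outputs, invented for illustration.
goldens = ["30 days", "two approvals", "annual"]
baseline_preds = ["30 days", "one approval", "annual"]
candidate_preds = ["30 days", "Two approvals", "annual"]

baseline = mean_score(baseline_preds, goldens)
candidate = mean_score(candidate_preds, goldens)
print(round(baseline, 3), round(candidate, 3), passes_regression(baseline, candidate))
```

Here the candidate fixes the one answer the baseline got wrong, so the check passes. Keeping the golden set fixed between runs is what makes the two scores comparable.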

DAY 5

Build: Domain Knowledge Copilot

Lab 10: Prepare and validate domain dataset
Lab 11: Fine-tune with LoRA
Lab 12: Create embedding pipeline and retriever
Lab 13: Build RAG QA API with citations
Lab 14: Evaluate and iterate with error-driven fixes

Week 3 Outcomes

Run fine-tuning experiments with PEFT methods
Build grounded RAG systems with source citations
Use evaluation loops to improve response quality

GUIDED PATH

Beginner Walkthrough: Make Answers More Accurate

When to use Fine-Tuning vs RAG (simple rule)

Use Fine-Tuning when you want a model to follow your style, tone, or task format consistently.
Use RAG when answers must draw on the latest documents, policies, manuals, or private company knowledge.
Use both when you need domain-specific style plus factual grounding from fresh documents.

Daily action plan (2 to 3 hours/day)

Day 1: Build a small, clean dataset of 100 to 300 high-quality instruction-answer pairs.
Day 2: Run a basic LoRA fine-tune. Compare base model and tuned model on the same 20 test prompts.
Day 3: Build a retrieval pipeline: chunk docs, create embeddings, and retrieve the top-k most relevant chunks for each query.
Day 4: Add citation output and confidence checks. Reject answers when evidence is weak.
Day 5: Integrate tuned style + RAG grounding into one knowledge copilot.
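
The Day 4 step (cite sources, reject answers when evidence is weak) can be sketched as a small gate placed after retrieval. Everything named here is hypothetical: the `RetrievedChunk` type, the `min_score` threshold of 0.6, and the sample chunks are illustrative assumptions, and a real system would calibrate the threshold on its evaluation set.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    source: str   # e.g. a document name or URL
    text: str
    score: float  # retriever similarity in [0, 1]; higher means more relevant


def answer_with_citations(draft_answer, chunks, min_score=0.6, min_chunks=1):
    """Return the answer with citations, or decline when the evidence is weak."""
    strong = [c for c in chunks if c.score >= min_score]
    if len(strong) < min_chunks:
        return {"answer": None, "citations": [], "declined": True}
    citations = sorted({c.source for c in strong})
    return {"answer": draft_answer, "citations": citations, "declined": False}


# Hypothetical retrieval results for illustration.
chunks = [
    RetrievedChunk("hr_policy.pdf", "Employees accrue 1.5 days of leave per month.", 0.82),
    RetrievedChunk("faq.md", "Office hours are 9 to 5.", 0.31),
]
good = answer_with_citations("Leave accrues at 1.5 days per month.", chunks)
weak = answer_with_citations("Unsupported claim.", chunks[1:])
print(good["citations"], weak["declined"])  # ['hr_policy.pdf'] True
```

Retriever scores are only a proxy for whether the evidence supports the answer, so this gate is a first filter and not a full groundedness check.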

Quality checklist before moving to Week 4

Fine-tuned model response style is visibly more aligned with your target format
RAG answers include a source citation for each key claim
An evaluation set of at least 20 questions with measured scores
Failure cases are documented with planned fixes

Assignment to complete Week 3

Build a Knowledge Copilot for one domain (for example: HR policy, product docs, SOP manuals, or legal templates). The assistant must answer with citations and decline when confidence is low.

Scoring rubric

Category	Expectation	Weight
Groundedness	Answers map to source content without made-up facts	35%
Style consistency	Response format stays stable across prompts	20%
Evaluation rigor	You track metrics and failure patterns clearly	25%
Product readiness	Usable UI/API with basic error handling	20%
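
For self-assessment, the rubric weights can be combined into a single score. This sketch assumes each category is graded on a 0 to 100 scale, which the rubric does not state; the sample scores are made up.

```python
RUBRIC_WEIGHTS = {
    "groundedness": 0.35,
    "style_consistency": 0.20,
    "evaluation_rigor": 0.25,
    "product_readiness": 0.20,
}


def weighted_score(category_scores, weights=RUBRIC_WEIGHTS):
    """Combine per-category scores (each assumed 0-100) into one weighted total."""
    if set(category_scores) != set(weights):
        raise ValueError("scores must cover exactly the rubric categories")
    return sum(weights[name] * category_scores[name] for name in weights)


# Made-up self-assessment scores for illustration.
scores = {
    "groundedness": 80,
    "style_consistency": 90,
    "evaluation_rigor": 70,
    "product_readiness": 60,
}
total = weighted_score(scores)
print(round(total, 1))  # 75.5
```

Because groundedness carries the largest weight, improving citation accuracy moves the total more than the same gain in any other category.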
