What Agents Are, How They Think, and What They Can Do
Most developers confuse chatbots with agents. They are fundamentally different:
# Chatbot: text in → text out
user: "What's my order status?"
bot: "I'd be happy to help! Could you
provide your order number?"
# ⚠️ Cannot actually look up the order# Agent: think → act → observe → respond
user: "What's my order status?"
agent: [THINK] Need order ID → check DB
[ACT] query_orders(user_id=42)
[OBS] Order #7845: Shipped, ETA Apr 29
[REPLY] "Your order #7845 shipped and
arrives by April 29th!"Every agent — whether it's a customer support bot, a coding assistant, or an autonomous researcher — follows the same fundamental loop:
| Step | What Happens | Customer Support Example |
|---|---|---|
| 1. Perceive | Agent receives input (user message, system event) | Customer says: "Where is my order #7845?" |
| 2. Reason | LLM thinks: what do I need to do? Which tool? | "I need to look up order #7845 in the database" |
| 3. Act | Agent calls a tool or takes an action | Calls lookup_order(order_id="7845") |
| 4. Observe | Agent reads the result; decides: done or continue? | Result: {status: "shipped", eta: "Apr 29"} → Done! |
| 5. Respond | Agent formulates final answer for user | "Your order #7845 has shipped and arrives April 29th!" |
Not all agents are fully autonomous. There's a spectrum:
| Level | Name | Description | Example |
|---|---|---|---|
| L0 | No AI | Pure rule-based, hardcoded | IVR phone menus |
| L1 | Chatbot | LLM generates text, no tools | FAQ bot with fixed answers |
| L2 | Copilot | LLM suggests, human decides & acts | Copilot suggests code, you accept |
| L3 | Agent | LLM reasons, acts, uses tools — human approves | Agent drafts reply, human sends |
| L4 | Autonomous Agent | Full autonomy, acts without human in loop | Agent auto-resolves & closes ticket |
| L5 | Multi-Agent System | Multiple agents collaborating | Triage agent → specialist agent → QA agent |
lookup_order("7845") → Shipped Apr 22, ETA Apr 28get_refund_policy("shipping_delay") → ₹200 credit if >2 days lateThe LLM is the central reasoning engine. It receives all inputs (user message, tool results, memory) and decides what to do next. Think of it as the brain — everything else (tools, memory) are its arms and notebooks.
| Responsibility | Description |
|---|---|
| Understanding | Parse user intent from natural language |
| Reasoning | Decide which tool to use and what arguments to pass |
| Generation | Create the final human-readable response |
| Orchestration | Manage multi-step workflows (call tool A → use result in tool B) |
from langchain_ollama import ChatOllama
# The "brain" of our agent
llm = ChatOllama(
model="qwen2.5:3b",
temperature=0.2, # Low for consistent support replies
base_url="http://localhost:11434"
)
# The LLM decides what to do
response = llm.invoke("Customer says: 'My order is late.' What tool should I use?")
# Output: "I should use the lookup_order tool to check the order status."
Tools are functions the agent can call to interact with the outside world. Without tools, the LLM is just a text generator. With tools, it becomes an agent that can take action.
from langchain.tools import tool
@tool
def lookup_order(order_id: str) -> dict:
"""Look up an order by its ID and return status, items, and ETA."""
# In production: query SQL Server database
# For demo: hardcoded data
orders = {
"7845": {"status": "shipped", "eta": "2026-04-29", "items": ["MacBook Air M3"], "total": 89999},
"7846": {"status": "delivered", "eta": "2026-04-25", "items": ["iPhone 16"], "total": 79999},
}
return orders.get(order_id, {"error": f"Order {order_id} not found"})
@tool
def get_refund_policy(issue_type: str) -> str:
"""Retrieve the refund policy for a given issue type."""
policies = {
"shipping_delay": "Shipping delay >2 days: ₹200 credit. >7 days: full refund.",
"wrong_item": "Wrong item: free return pickup + full refund within 48 hours.",
"damaged": "Damaged product: photo required. Full refund + ₹500 inconvenience credit.",
}
return policies.get(issue_type, "Policy not found. Please escalate to supervisor.")
@tool
def send_email(to: str, subject: str, body: str) -> str:
"""Send an email to the customer."""
# In production: use SMTP/SendGrid
print(f"📧 Sending to {to}: {subject}")
return f"Email sent successfully to {to}"
# The agent has access to these tools:
tools = [lookup_order, get_refund_policy, send_email]
Memory lets the agent remember information across conversation turns and even across sessions.
| Memory Type | Scope | Use Case |
|---|---|---|
| Short-term (Buffer) | Current conversation | Remember customer said their name is Rajesh 3 messages ago |
| Long-term (Persistent) | Across sessions | Remember this customer had a bad experience last month |
| Episodic | Past interactions | Recall how a similar ticket was resolved before |
| Semantic (Vector) | Knowledge base | Search FAQs and policies by meaning, not keywords |
Planning is how the agent breaks a complex task into steps. Without planning, the agent might try to do everything at once and fail.
The system prompt defines the agent's persona, rules, and boundaries. This is what you built in Day 5.
SYSTEM_PROMPT = """ You are a customer support agent for ShopEasy, an Indian e-commerce platform. You have access to these tools: - lookup_order: Check order status by order ID - get_refund_policy: Get refund/return policy for issue types - send_email: Send email to customer Rules: 1. Always look up the order before making any claims about it 2. Always check the refund policy before offering refunds 3. Be empathetic, professional, and concise 4. If you cannot resolve, say: "Let me connect you to a specialist" 5. Never make up information — only use tool results 6. Reference specific order numbers and amounts in your reply """
You don't build agents from scratch. Frameworks wire the 5 components together:
| Framework | Language | Best For | We'll Use |
|---|---|---|---|
| LangChain | Python | General-purpose agents, RAG, chains | ✅ Week 2-3 |
| LangGraph | Python | Complex multi-step workflows as graphs | ✅ Week 3 |
| Semantic Kernel | C# / Python | Enterprise .NET integration | ✅ Week 4 |
| CrewAI | Python | Multi-agent collaboration | ✅ Week 4 |
| AutoGen | Python | Microsoft's multi-agent conversations | 📖 Overview |
| Component | Our Implementation |
|---|---|
| 🧠 LLM | Qwen 2.5:3B via Ollama (local, free) |
| 🔧 Tools | lookup_order, get_refund_policy, search_faq, send_email, escalate_ticket |
| 💾 Memory | ConversationBufferMemory (short-term) + ChromaDB vector store (long-term) |
| 📋 Planning | ReAct pattern (Reason + Act) — decides step-by-step |
| 📝 Prompt | System prompt with persona, rules, guardrails, output format |
When an LLM has tools available, it doesn't just generate text — it can output a structured function call instead. The framework then executes the function and feeds the result back to the LLM.
{"tool": "lookup_order", "args": {"order_id": "7845"}}lookup_order(order_id="7845") → Returns resultfrom langchain.tools import tool
@tool
def lookup_order(order_id: str) -> str:
"""Look up an order by its ID. Returns order status, items, total, and ETA.
Use this when a customer asks about their order status or delivery."""
# Simulate database lookup (production: query SQL Server)
import json
orders_db = {
"7845": {"order_id": "7845", "status": "shipped", "eta": "2026-04-29",
"items": ["MacBook Air M3"], "total": 89999, "customer": "Rajesh Kumar"},
"7846": {"order_id": "7846", "status": "delivered", "eta": "2026-04-25",
"items": ["iPhone 16", "AirPods Pro"], "total": 99998, "customer": "Priya Singh"},
}
order = orders_db.get(order_id)
if order:
return json.dumps(order, indent=2)
return f"Error: Order #{order_id} not found in the system."
# Check the auto-generated schema
print(lookup_order.name) # "lookup_order"
print(lookup_order.description) # "Look up an order by its ID..."
print(lookup_order.args_schema.model_json_schema())
# {'properties': {'order_id': {'type': 'string'}}, 'required': ['order_id']}
from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
class RefundPolicyInput(BaseModel):
"""Input for refund policy lookup."""
issue_type: str = Field(
description="Type of issue: 'shipping_delay', 'wrong_item', 'damaged', 'defective', 'changed_mind'"
)
days_since_delivery: int = Field(
default=0,
description="Number of days since the item was delivered (0 if not delivered)"
)
def get_refund_policy(issue_type: str, days_since_delivery: int = 0) -> str:
"""Get the applicable refund/return policy based on issue type and delivery date."""
policies = {
"shipping_delay": {
"policy": "If delayed >2 days: ₹200 store credit. If >7 days: full refund.",
"auto_approve": True
},
"wrong_item": {
"policy": "Free return pickup scheduled within 24 hours. Full refund processed in 3-5 days.",
"auto_approve": True
},
"damaged": {
"policy": "Customer must upload photo of damage. Full refund + ₹500 inconvenience credit.",
"auto_approve": False,
"requires": "photo_upload"
},
"defective": {
"policy": "Eligible for replacement or refund within 15 days of delivery.",
"auto_approve": days_since_delivery <= 15
},
"changed_mind": {
"policy": "Return within 7 days if unused and in original packaging. Customer pays return shipping.",
"auto_approve": days_since_delivery <= 7
}
}
import json
result = policies.get(issue_type, {"policy": "Unknown issue. Escalate to supervisor.", "auto_approve": False})
return json.dumps(result, indent=2)
refund_policy_tool = StructuredTool.from_function(
func=get_refund_policy,
name="get_refund_policy",
description="Get refund/return policy for a specific issue type. Use when customer requests refund or return.",
args_schema=RefundPolicyInput
)
import requests
from langchain.tools import tool
@tool
def check_shipping_status(tracking_id: str) -> str:
"""Check real-time shipping status using tracking ID.
Use when customer asks about delivery status or tracking."""
try:
# Example: call shipping partner API
# In production: use actual shipping API (Delhivery, BlueDart, etc.)
response = requests.get(
f"http://localhost:8000/api/tracking/{tracking_id}",
timeout=10
)
response.raise_for_status()
return response.json()
except requests.exceptions.Timeout:
return "Error: Shipping service is temporarily slow. Please try again."
except requests.exceptions.RequestException as e:
return f"Error checking shipping status: Unable to reach shipping service."
@tool
def search_knowledge_base(query: str) -> str:
"""Search the FAQ and policy knowledge base for relevant information.
Use when you need to find answers about company policies, product info, or procedures."""
# In production: vector search with ChromaDB (Week 3)
# For now: simple keyword matching
kb = {
"return window": "Standard return window is 7 days. Electronics: 15 days.",
"payment methods": "We accept UPI, credit cards, debit cards, and net banking.",
"warranty": "All electronics come with 1-year manufacturer warranty.",
"cancel order": "Orders can be cancelled before shipping. Go to My Orders → Cancel."
}
results = []
for key, value in kb.items():
if any(word in query.lower() for word in key.split()):
results.append(f"- {value}")
return "\n".join(results) if results else "No relevant information found."
from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# 1. LLM
llm = ChatOllama(model="qwen2.5:3b", temperature=0.2)
# 2. Tools
tools = [lookup_order, refund_policy_tool, check_shipping_status, search_knowledge_base]
# 3. Prompt
prompt = ChatPromptTemplate.from_messages([
("system", """You are a customer support agent for ShopEasy.
Available tools: lookup_order, get_refund_policy, check_shipping_status, search_knowledge_base
Rules:
- Always verify order details before making claims
- Always check policy before offering refunds
- Be empathetic and professional
- If stuck, say: "Let me connect you to a specialist"
"""),
MessagesPlaceholder("chat_history", optional=True),
("human", "{input}"),
MessagesPlaceholder("agent_scratchpad"),
])
# 4. Create Agent
agent = create_tool_calling_agent(llm, tools, prompt)
# 5. Agent Executor (runs the loop)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True, # See the reasoning!
max_iterations=5, # Safety: max 5 tool calls per turn
handle_parsing_errors=True
)
# 6. Run it!
result = agent_executor.invoke({
"input": "Hi, my order #7845 hasn't arrived yet. Can you check?"
})
print(result["output"])
verbose=True Shows You> Entering new AgentExecutor chain...
Thought: The customer wants to know about order #7845. I should look it up.
Action: lookup_order
Action Input: {"order_id": "7845"}
Observation: {"order_id": "7845", "status": "shipped", "eta": "2026-04-29",
"items": ["MacBook Air M3"], "total": 89999, "customer": "Rajesh Kumar"}
Thought: The order is shipped with ETA April 29. Let me give the customer this info.
Final Answer: Hi Rajesh! I checked your order #7845 — your MacBook Air M3 has been shipped
and is expected to arrive by April 29th. You can track it in the "My Orders" section.
Let me know if you need anything else! 😊
> Finished chain.
@tool
def lookup_order(order_id: str) -> str:
"""Look up an order by ID. Returns order details or error message."""
# Validate input
if not order_id or not order_id.strip():
return "Error: No order ID provided. Ask the customer for their order number."
# Sanitize input (prevent injection)
order_id = order_id.strip().replace("'", "").replace(";", "")[:20]
try:
# Database lookup (simulated)
order = db_lookup(order_id) # Your actual DB call
if order:
return json.dumps(order)
return f"Order #{order_id} not found. Please verify the order number with the customer."
except Exception:
return "Error: Unable to access order system. Please try again or escalate to support team."
| Tool | When Agent Uses It | Returns |
|---|---|---|
lookup_order | Customer asks about order status | Order details JSON |
get_refund_policy | Customer wants refund/return | Policy + auto-approve flag |
check_shipping_status | Customer asks "where is my package?" | Live tracking info |
search_knowledge_base | General questions about policies | Relevant FAQ entries |
send_email | After resolving — confirmation email | Success/failure status |
escalate_ticket | Cannot resolve — hand off to human | Ticket ID + assigned agent |
cancel_order tool that checks if order is cancellable (only before shipping)calculate_refund tool that computes refund amount based on policyverbose=True and trace the agent's reasoning for each scenarioWithout memory, every LLM call is stateless — the model doesn't remember anything from previous messages. This creates terrible user experiences:
User: My name is Rajesh, order #7845
Bot: Let me check order #7845 for you.
User: What's the status?
Bot: Could you provide your order number?
← FORGOT everything!User: My name is Rajesh, order #7845
Bot: Let me check order #7845 for you.
User: What's the status?
Bot: Hi Rajesh! Your order #7845 is
shipped and arrives April 29th.
← REMEMBERS context!Stores the entire conversation as-is. Simple but grows fast.
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(
memory_key="chat_history",
return_messages=True
)
# Stores every message
memory.save_context(
{"input": "My name is Rajesh, order #7845"},
{"output": "Hi Rajesh! Let me check order #7845 for you."}
)
memory.save_context(
{"input": "What's the status?"},
{"output": "Your order #7845 is shipped, arriving April 29th."}
)
# Load everything
print(memory.load_memory_variables({}))
# Returns ALL messages — both user and agent
Keeps only the last K messages. Older messages are discarded.
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(
memory_key="chat_history",
return_messages=True,
k=5 # Keep last 5 exchange pairs (10 messages)
)
# After 20 messages, only the last 10 are kept
# Good for: real-time support where recent context matters most
Uses the LLM to summarize older messages. Keeps a running summary instead of raw messages.
from langchain.memory import ConversationSummaryMemory
from langchain_ollama import ChatOllama
llm = ChatOllama(model="qwen2.5:3b")
memory = ConversationSummaryMemory(
llm=llm,
memory_key="chat_history",
return_messages=True
)
# After many messages, the summary might be:
# "Customer Rajesh Kumar contacted about order #7845 (MacBook Air).
# Order is shipped, ETA April 29. Customer expressed frustration about
# the delay. Agent offered ₹200 credit for the delay. Customer accepted."
# ✅ Compact — captures key facts
# ✅ Fits in small context windows
# ⚠️ Uses extra LLM calls to summarize
Stores conversation chunks as embeddings and retrieves the most relevant ones based on the current query. Perfect for long conversations or knowledge bases.
# pip install chromadb langchain-chroma
from langchain.memory import VectorStoreRetrieverMemory
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings
# Setup vector store
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma(
collection_name="support_memory",
embedding_function=embeddings,
persist_directory="./memory_db"
)
# Create memory with retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
memory = VectorStoreRetrieverMemory(retriever=retriever)
# Save conversations as embeddings
memory.save_context(
{"input": "I need to return my laptop, order #7845"},
{"output": "I'll help with the return. Your MacBook Air is within the 15-day window."}
)
# Later, when customer asks about returns:
# The memory searches by MEANING and retrieves the relevant conversation
# Even if the customer says "send it back" instead of "return"
| Type | How it Works | Context Usage | Best For |
|---|---|---|---|
| Buffer | Stores everything | 🔴 High — grows linearly | Short conversations (<10 turns) |
| Window | Last K messages only | 🟢 Fixed — always same size | Real-time support chats |
| Summary | LLM summarizes older messages | 🟡 Medium — compact summaries | Long conversations, complex issues |
| Vector Store | Embed & search by meaning | 🟢 Fixed — retrieves top-K | Knowledge base, cross-session memory |
Production agents often use multiple memory types together:
from langchain.memory import CombinedMemory
# Short-term: last 3 exchanges
window_memory = ConversationBufferWindowMemory(
memory_key="recent_chat",
return_messages=True,
k=3
)
# Long-term: vector search
vector_memory = VectorStoreRetrieverMemory(
retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
memory_key="relevant_history"
)
# Combine them
memory = CombinedMemory(memories=[window_memory, vector_memory])
# The agent now gets:
# 1. Last 3 messages (immediate context)
# 2. Most relevant past conversations (semantic search)
# → Best of both worlds!
This is the difference between a forgettable chatbot and a personalized agent.
The reasoning pattern determines how the agent decides what to do. Different patterns have different strengths:
ReAct interleaves reasoning (thinking out loud) with acting (calling tools). The agent thinks about what to do, does it, observes the result, then thinks again.
lookup_order(order_id="7845")
get_refund_policy(issue_type="wrong_item")
from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
llm = ChatOllama(model="qwen2.5:3b", temperature=0.2)
# ReAct-specific prompt template
react_prompt = PromptTemplate.from_template("""
You are a customer support agent. Answer the customer's question using the available tools.
You have access to these tools:
{tools}
Tool names: {tool_names}
Use this EXACT format:
Question: the customer's message
Thought: think about what to do
Action: the tool to use (must be one of [{tool_names}])
Action Input: the input to the tool
Observation: the result from the tool
... (repeat Thought/Action/Observation as needed)
Thought: I now have enough information to answer
Final Answer: your response to the customer
Begin!
Question: {input}
{agent_scratchpad}
""")
# Create ReAct agent
agent = create_react_agent(llm, tools, react_prompt)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
max_iterations=5,
handle_parsing_errors=True
)
CoT forces the agent to reason through the entire problem before taking any action. Best for complex, multi-step reasoning.
Think → Act → Observe → Think → Act → Observe → Think → Final Answer
Good for: dynamic situations where each step depends on the previous result.
Think → Think → Think → (Full plan ready) → Act → Act → Act → Final Answer
Good for: problems where you can plan upfront and execute sequentially.
cot_planning_prompt = """
You are a customer support agent. Before taking any action, think through
your complete plan step by step.
Customer message: "{message}"
Think step by step:
1. What is the customer's intent?
2. What information do I need?
3. Which tools should I call, in what order?
4. What are the possible outcomes?
5. What's my response plan for each outcome?
Plan:
"""
# After planning, execute the plan
execution_prompt = """
Based on this plan:
{plan}
And these tool results:
{tool_results}
Generate the final response to the customer.
"""
ReWOO creates the entire plan upfront, executes all tool calls in parallel, then reasons about the combined results. Much faster than ReAct for independent tool calls.
# Step 1: Planner (1 LLM call)
Plan:
#E1 = lookup_order("7845") ← get order details
#E2 = get_refund_policy("wrong_item") ← get policy
#E3 = search_knowledge_base("return process") ← get FAQ
# Step 2: Worker (all 3 tool calls run IN PARALLEL)
#E1 result = {status: "delivered", items: ["AirPods Pro"]}
#E2 result = {policy: "Free return pickup..."}
#E3 result = "Return process: schedule pickup → refund in 3-5 days"
# Step 3: Solver (1 LLM call with ALL results)
Given: #E1, #E2, #E3 → Generate final response
# Total LLM calls: 2 (plan + solve) instead of ReAct's 4-6
# Total time: much faster because tools run in parallel!
| Pattern | LLM Calls | Tool Execution | Speed | Best For |
|---|---|---|---|---|
| ReAct | Many (per step) | Sequential | 🐢 Slowest | Dynamic, dependent steps |
| CoT + Act | 2-3 (plan + act) | Sequential | 🐇 Medium | Complex reasoning needed |
| ReWOO | 2 (plan + solve) | Parallel | 🚀 Fastest | Independent tool calls |
Agents don't always reason correctly. Common failures:
| Failure | Description | Mitigation |
|---|---|---|
| Infinite Loop | Agent keeps calling the same tool | Set max_iterations=5 |
| Wrong Tool | Agent picks the wrong tool | Better tool descriptions / fewer tools |
| Hallucinated Tool | Agent tries to call a tool that doesn't exist | handle_parsing_errors=True |
| Missing Info | Agent responds without checking data | Prompt: "ALWAYS use tools before answering" |
| Over-Planning | Agent plans 10 steps for a 2-step problem | Prompt: "Use minimum necessary steps" |
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
max_iterations=5, # Stop after 5 tool calls
max_execution_time=30, # Timeout after 30 seconds
handle_parsing_errors=True, # Recover from malformed output
early_stopping_method="generate", # Generate answer if stuck
)
| Scenario | Best Pattern | Why |
|---|---|---|
| "Where is my order?" | ReAct | Simple: 1 tool call, 1 answer |
| "Wrong item, want refund + replacement" | ReAct | Each step depends on previous (verify order → check policy → decide) |
| "Compare return vs refund options" | ReWOO | Fetch both policies in parallel, then compare |
| "Complex complaint — multiple issues" | CoT + Act | Need full plan upfront to handle all issues systematically |
Build a complete, working Customer Support Agent using everything from Week 2. This will be the foundation we enhance in Weeks 3-4.
| Component | Implementation |
|---|---|
| 🧠 LLM | Qwen 2.5:3B via Ollama |
| 🔧 Tools (min 4) | lookup_order, get_refund_policy, search_knowledge_base, send_email |
| 💾 Memory | ConversationBufferWindowMemory (k=5) |
| 📋 Planning | ReAct pattern |
| 📝 Prompt | System prompt with persona + rules + guardrails |
| 🖥️ Interface | CLI (input loop) or Streamlit chat |
# Scenario 1: Order Status
User: "Where is my order #7845?"
→ Agent should call lookup_order and give status
# Scenario 2: Refund Request
User: "My laptop arrived damaged. Order #7845. I want a refund."
→ Agent should: lookup_order → get_refund_policy("damaged") → offer refund
# Scenario 3: Multi-turn with Memory
User: "My name is Rajesh"
User: "Check order #7845"
User: "What's the refund policy for damaged items?"
User: "Ok process the refund"
→ Agent should remember Rajesh's name throughout
# Scenario 4: Unknown Issue (Escalation)
User: "I want to sue your company for negligence"
→ Agent should NOT attempt to handle legal issues; should escalate
# Scenario 5: General Knowledge
User: "What payment methods do you accept?"
→ Agent should search knowledge base
cancel_order tool that checks if the order is cancellableescalate_ticket tool that creates a ticket for human reviewverbose=True logging to a file for debuggingNext week: Week 3 — Building Real Agents → RAG, LangGraph Workflows, Semantic Kernel, Multi-Agent Systems