March 26, 2026 · 10 min read

Build AI Agents with LangChain and Claude — Step by Step

A practical tutorial on building AI agents that can use tools, make decisions, and complete multi-step tasks. Uses LangChain with Claude, covers tool creation, agent loops, memory, and real-world patterns.

ai agents langchain claude llm tutorial

An AI agent is not a chatbot with extra steps. A chatbot takes a message and returns a response. An agent takes a goal and figures out the steps to achieve it -- choosing which tools to use, interpreting results, handling errors, and deciding when it's done.

The difference matters. A chatbot answers "What's the weather in Tokyo?" An agent can "Book me a flight to Tokyo next week, pick a hotel near Shibuya under $200/night, and add both to my calendar." That requires calling multiple APIs, making decisions based on results, and maintaining context across steps.

Building agents used to require writing complex state machines. With LangChain and a capable model like Claude, you can build functional agents in under 100 lines of code.

How Agents Work

The core loop is simple:

1. Receive a goal/task
2. Think about what to do next
3. Choose a tool and execute it
4. Observe the result
5. Repeat from step 2 until the task is complete
6. Return the final answer

This is called the ReAct (Reasoning + Acting) pattern. The model alternates between reasoning ("I need to look up the user's order history") and acting (calling the order history API).
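This loop can be sketched framework-free in a few lines. Everything below is a toy illustration: run_agent, decide, and the stub tool are invented for the sketch, and the scripted decide function stands in for a real LLM call.

```python
# A toy ReAct loop: reason (decide), act (call a tool), observe, repeat.
def run_agent(goal, tools, decide, max_iterations=10):
    observations = []
    for _ in range(max_iterations):
        action = decide(goal, observations)              # reasoning step
        if action["type"] == "final":
            return action["answer"]                      # task complete
        result = tools[action["tool"]](action["input"])  # acting step
        observations.append((action["tool"], result))    # observe the result
    return "Stopped: iteration limit reached"

# Stub tool and a scripted "model" that decides based on what it has seen
toy_tools = {"get_weather": lambda city: f"72°F in {city}"}

def decide(goal, observations):
    if not observations:
        return {"type": "tool", "tool": "get_weather", "input": "Tokyo"}
    return {"type": "final", "answer": f"Weather: {observations[-1][1]}"}

print(run_agent("What's the weather in Tokyo?", toy_tools, decide))
# → Weather: 72°F in Tokyo
```

The frameworks below implement exactly this shape, with the real model choosing the actions.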

Setup

pip install langchain langchain-anthropic langchain-community

Set your API key:

export ANTHROPIC_API_KEY="sk-ant-..."

Your First Agent

from langchain_anthropic import ChatAnthropic
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

# Define tools
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # In reality, this would call a weather API
    weather_data = {
        "Tokyo": "72°F, partly cloudy",
        "London": "58°F, rainy",
        "New York": "65°F, sunny",
    }
    return weather_data.get(city, f"Weather data not available for {city}")

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression. Use Python syntax."""
    try:
        result = eval(expression)  # In production, use a safe math parser
        return str(result)
    except Exception as e:
        return f"Error: {e}"

# Create the model
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0)

# Create the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed to answer questions accurately."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create the agent
agent = create_tool_calling_agent(llm, [get_weather, calculate], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather, calculate], verbose=True)

# Run it
result = executor.invoke({"input": "What's the weather in Tokyo? Also, what's 15% tip on a $84.50 dinner?"})
print(result["output"])

With verbose=True, you'll see the agent's reasoning:

> Entering new AgentExecutor chain...
I'll help with both questions. Let me check the weather and calculate the tip.

Tool: get_weather
Input: Tokyo
Output: 72°F, partly cloudy

Tool: calculate
Input: 84.50 * 0.15
Output: 12.675

The weather in Tokyo is 72°F and partly cloudy. A 15% tip on $84.50 comes to $12.68.
> Finished chain.

Building Useful Tools

Tools are where agents become powerful. Each tool is a function the agent can call:

import os

import httpx
from datetime import datetime

@tool
def search_web(query: str) -> str:
    """Search the web for current information. Use for recent events, prices, or facts you're unsure about."""
    response = httpx.get(
        "https://api.search.example.com/search",
        params={"q": query, "count": 5},
        headers={"Authorization": f"Bearer {os.environ['SEARCH_API_KEY']}"},
    )
    results = response.json()["results"]
    return "\n".join(
        f"- {r['title']}: {r['snippet']}" for r in results
    )

@tool
def read_url(url: str) -> str:
    """Fetch and read the content of a web page. Returns the text content."""
    response = httpx.get(url, follow_redirects=True, timeout=10)
    # Extract visible text with an HTML parser
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")
    text = soup.get_text(separator="\n", strip=True)
    return text[:4000]  # Truncate to avoid token limits

@tool
def run_sql_query(query: str) -> str:
    """Run a read-only SQL query against the application database.
    Only SELECT queries are allowed. Use this to look up user data, orders, or statistics."""
    if not query.strip().upper().startswith("SELECT"):
        return "Error: Only SELECT queries are allowed"

    import sqlite3
    conn = sqlite3.connect("app.db")
    cursor = conn.execute(query)
    columns = [desc[0] for desc in cursor.description]
    rows = cursor.fetchall()
    conn.close()

    if not rows:
        return "No results found"

    # Format as a readable table
    result = " | ".join(columns) + "\n"
    result += "-" * len(result) + "\n"
    for row in rows:
        result += " | ".join(str(v) for v in row) + "\n"
    return result

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email. Use this when the user explicitly asks to send an email."""
    # In production: actual email API call
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "agent@yourapp.com"
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)

    with smtplib.SMTP("smtp.yourapp.com", 587) as server:
        server.starttls()
        server.login("agent@yourapp.com", os.environ["SMTP_PASSWORD"])
        server.send_message(msg)

    return f"Email sent to {to}"

@tool
def get_current_time() -> str:
    """Get the current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S %Z")

Good tool design matters. The docstring is what the model reads to decide when to use the tool. Be specific about what the tool does and when it should be used.
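As an illustration, compare a vague docstring with a specific one. Both functions are hypothetical and shown without the @tool wrapper:

```python
def search(q: str) -> str:
    """Search."""  # Too vague: search what? when? what comes back?
    ...

def search_knowledge_base(query: str) -> str:
    """Search the internal product knowledge base for documentation.

    Use for questions about product features, pricing, or setup steps.
    Do NOT use for general web facts. Input should be a short keyword
    query, not a full sentence. Returns up to 5 matching article titles
    with one-line summaries.
    """
    ...
```

The second version tells the model what the tool covers, when not to use it, what shape the input should take, and what the output looks like.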

Conversation Memory

Without memory, each invocation is independent. The agent forgets everything from the previous turn:

from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Store for conversation histories
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Modify prompt to include history
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools. Use conversation history for context."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

# Wrap with message history
agent_with_history = RunnableWithMessageHistory(
    executor,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

# Use it -- subsequent calls remember previous context
config = {"configurable": {"session_id": "user_123"}}
agent_with_history.invoke({"input": "My name is Alice"}, config=config)
agent_with_history.invoke({"input": "What's my name?"}, config=config)
# Agent remembers: "Your name is Alice"

For production, swap InMemoryChatMessageHistory with Redis or a database-backed store.
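As a sketch of what a database-backed store could look like, here is a minimal SQLite version of the same session-history idea. The class name and schema are invented, and it deliberately skips LangChain's full BaseChatMessageHistory interface:

```python
import sqlite3

class SQLiteChatHistory:
    """Minimal per-session message store backed by SQLite."""

    def __init__(self, session_id: str, path: str = "chat_history.db"):
        self.session_id = session_id
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(session_id TEXT, role TEXT, content TEXT)"
        )

    def add_message(self, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?)",
            (self.session_id, role, content),
        )
        self.conn.commit()

    @property
    def messages(self) -> list:
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ?",
            (self.session_id,),
        ).fetchall()
        return [{"role": r, "content": c} for r, c in rows]

history = SQLiteChatHistory("user_123", path=":memory:")
history.add_message("human", "My name is Alice")
print(history.messages)
# → [{'role': 'human', 'content': 'My name is Alice'}]
```

The key property is that history is keyed by session_id and survives process restarts, which the in-memory dict above does not.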

Structured Output from Agents

Sometimes you need the agent to return structured data, not free text:

from pydantic import BaseModel, Field

class TripPlan(BaseModel):
    destination: str = Field(description="The travel destination")
    dates: str = Field(description="Travel dates")
    flights: list[dict] = Field(description="Recommended flights")
    hotels: list[dict] = Field(description="Recommended hotels")
    estimated_budget: float = Field(description="Estimated total cost in USD")
    notes: str = Field(description="Additional recommendations")

# Use with_structured_output for the final response
structured_llm = llm.with_structured_output(TripPlan)

Error Handling and Retries

Agents can fail at any step -- API timeouts, malformed tool inputs, rate limits. Handle it:

from langchain_core.tools import ToolException

@tool
def fragile_api_call(query: str) -> str:
    """Call an external API that might fail."""
    try:
        response = httpx.get(f"https://api.example.com/data?q={query}", timeout=5)
        response.raise_for_status()
        return response.text
    except httpx.TimeoutException:
        raise ToolException("API timed out. Try again with a simpler query.")
    except httpx.HTTPStatusError as e:
        raise ToolException(f"API returned error {e.response.status_code}. Try a different approach.")

# Configure the executor to handle tool errors
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=10,            # Don't loop forever
    max_execution_time=60,        # Timeout after 60 seconds
    handle_tool_errors=True,      # Pass errors back to the agent instead of crashing
    return_intermediate_steps=True,  # Include the reasoning chain in output
)

With handle_tool_errors=True, when a tool raises ToolException, the error message goes back to the agent as an observation. The agent can then decide to retry, try a different tool, or tell the user what went wrong.
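Transient failures like timeouts and rate limits can also be retried outside the agent loop. Here is a generic sketch -- with_retries and the flaky stub are illustrative helpers, not LangChain APIs:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Wrap a tool function with exponential-backoff retries."""
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if attempt == attempts - 1:
                    return f"Error after {attempts} attempts: {e}"
                time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return wrapper

# Stub that fails twice, then succeeds -- simulates a flaky API
calls = {"n": 0}
def flaky_search(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return f"results for {query}"

safe_search = with_retries(flaky_search, base_delay=0.01)
print(safe_search("langchain agents"))
# → results for langchain agents
```

Returning the error string instead of raising keeps the failure inside the agent's observation loop, so the model can react to it.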

Multi-Agent Patterns

For complex tasks, single agents hit limits. Split the work:

from langchain_core.prompts import ChatPromptTemplate

# Research agent: finds information
research_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a research assistant. Your job is to find accurate information
    using the search and URL reading tools. Return comprehensive, factual findings."""),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
research_agent = create_tool_calling_agent(llm, [search_web, read_url], research_prompt)
research_executor = AgentExecutor(agent=research_agent, tools=[search_web, read_url])

# Writer agent: creates content from research
writer_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a technical writer. Given research findings, write clear,
    well-structured content. Use code examples where appropriate."""),
    ("human", "Based on this research:\n{research}\n\nWrite: {task}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Orchestrator
async def research_and_write(topic: str):
    # Step 1: Research
    research = await research_executor.ainvoke({
        "input": f"Research the following topic thoroughly: {topic}"
    })

    # Step 2: Write
    writer = ChatAnthropic(model="claude-sonnet-4-20250514")
    article = await writer.ainvoke(
        writer_prompt.format_messages(
            research=research["output"],
            task=f"Write a comprehensive guide about {topic}"
        )
    )

    return article.content

Safety and Guardrails

Agents that can take actions (send emails, run queries, modify data) need guardrails:

@tool
def delete_user(user_id: str) -> str:
    """Delete a user account. REQUIRES CONFIRMATION."""
    # Never auto-execute destructive actions
    return f"⚠️ CONFIRMATION REQUIRED: Delete user {user_id}? This action is irreversible. The user must confirm this action through the UI."

# Input validation
@tool
def run_sql_query(query: str) -> str:
    """Run a read-only SQL query."""
    forbidden = ["DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE"]
    query_upper = query.upper()
    for word in forbidden:
        if word in query_upper:
            return f"Error: {word} operations are not allowed. Read-only queries only."

    # Parameterize to prevent injection
    # Never interpolate user input directly into SQL
    ...

Key safety principles:
  • Read-only tools by default. Write tools need explicit confirmation.
  • Rate-limit tool calls. An agent in a loop can burn through your API quota fast.
  • Log every tool invocation for audit trails.
  • Set max_iterations to prevent infinite loops.
  • Validate inputs rigorously -- the model can produce unexpected inputs.
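The rate-limiting principle can be sketched with a sliding window. ToolRateLimiter is invented for illustration; a production system would usually back this with Redis so limits survive restarts and apply across workers:

```python
import time
from collections import deque

class ToolRateLimiter:
    """Allow at most max_calls tool invocations per rolling window."""

    def __init__(self, max_calls: int = 20, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window
        while self.calls and now - self.calls[0] > self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False  # over budget: reject the tool call
        self.calls.append(now)
        return True

limiter = ToolRateLimiter(max_calls=3, window_seconds=60)
print([limiter.allow() for _ in range(4)])
# → [True, True, True, False]
```

Check allow() at the top of each tool and return an error string when it fails, so the agent learns it is being throttled.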

Cost Management

Agent loops are expensive. Each iteration is an API call with the full conversation context:

Scenario | Approximate Cost
Simple task (2-3 tool calls) | $0.01-0.03
Complex task (8-10 tool calls) | $0.05-0.15
Research task with long context | $0.10-0.50

Strategies to reduce cost:
  • Use smaller models (Haiku) for simple tool-selection tasks
  • Cache tool results when possible
  • Set tight max_iterations limits
  • Use structured output to get the answer in fewer rounds

# Use Haiku for the cheap routing, Sonnet for the final answer
router_llm = ChatAnthropic(model="claude-haiku-4-20250514", temperature=0)
answer_llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0)
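Caching is straightforward when tool inputs are plain strings -- functools.lru_cache memoizes repeated identical calls. The function and the call counter here are illustrative:

```python
from functools import lru_cache

api_calls = {"n": 0}  # stands in for billable API usage

@lru_cache(maxsize=256)
def get_weather_cached(city: str) -> str:
    api_calls["n"] += 1  # only incremented on a cache miss
    return f"72°F in {city}"

get_weather_cached("Tokyo")
get_weather_cached("Tokyo")  # served from cache, no second "API call"
print(api_calls["n"])
# → 1
```

For data that goes stale, swap lru_cache for a TTL cache so entries expire after a set lifetime.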

A Real Example: Customer Support Agent

@tool
def lookup_order(order_id: str) -> str:
    """Look up an order by its ID. Returns order details."""
    order = db.orders.find_one({"id": order_id})
    if not order:
        return f"No order found with ID {order_id}"
    return json.dumps(order, default=str)

@tool
def lookup_customer(email: str) -> str:
    """Look up a customer by email. Returns customer profile and recent orders."""
    customer = db.customers.find_one({"email": email})
    if not customer:
        return f"No customer found with email {email}"
    orders = list(db.orders.find({"customer_id": customer["id"]}).sort("date", -1).limit(5))
    return json.dumps({"customer": customer, "recent_orders": orders}, default=str)

@tool
def initiate_refund(order_id: str, reason: str) -> str:
    """Initiate a refund for an order. Returns the refund status."""
    order = db.orders.find_one({"id": order_id})
    if not order:
        return "Order not found"
    if order["status"] == "refunded":
        return "This order has already been refunded"
    # Create refund record, don't actually process yet
    refund_id = db.refunds.insert_one({
        "order_id": order_id,
        "reason": reason,
        "status": "pending_approval",
        "amount": order["total"],
    })
    return f"Refund initiated (ID: {refund_id}). Amount: ${order['total']}. Status: pending approval."

support_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a customer support agent for an e-commerce company.
- Always verify the customer's identity before sharing order details
- Be empathetic but efficient
- You can look up orders and customer profiles
- You can initiate refunds, but they require approval
- If you can't resolve the issue, escalate to a human agent"""),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

This agent can handle "I want a refund for order #12345" end-to-end: look up the order, verify the customer, check the order status, and initiate the refund process. That's a real workflow that previously required a human for every step.

Agents are the most practical application of LLMs beyond chat. If you're building tools on CodeUp or any platform where users have multi-step workflows, adding an agent layer can automate significant chunks of what currently requires manual intervention. Start small -- a single tool, a simple loop -- and expand as you see what the model handles well.
