Agent Architectures & Patterns

TL;DR

Agent architectures determine how an agent reasons and acts. The simplest is ReAct (reason + act in a loop). More advanced patterns include Chain-of-Thought (step-by-step reasoning), Tree-of-Thought (explore multiple reasoning paths), Reflexion (self-critique and retry), and Plan-and-Execute (create a plan, then follow it). Choose based on task complexity and reliability requirements.

Explain Like I'm 12

Think of these architectures like different ways to solve a maze. ReAct is like walking through the maze one step at a time — look around, pick a direction, walk, repeat. Chain-of-Thought is like talking through your logic out loud: "If I go left, I hit a wall, so I'll go right." Tree-of-Thought is like sending clones of yourself down every path and picking the clone that gets furthest. Reflexion is like hitting a dead end, backing up, and saying "okay, what did I do wrong?" And Plan-and-Execute is like studying the maze from above first, drawing a route on paper, then following your map.

Architecture Comparison

[Diagram: comparison of agent architectures. ReAct loop, Chain-of-Thought reasoning, Tree-of-Thought branching, Reflexion self-critique cycle, and Plan-and-Execute two-phase approach]

ReAct: Reasoning + Acting

ReAct (Reasoning and Acting) is the most common agent architecture. The agent alternates between thinking (generating reasoning traces) and acting (calling tools). It's simple, effective, and the foundation for most production agents.

Info: ReAct was introduced in the 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" by Yao et al. It showed that interleaving reasoning with actions significantly outperforms either alone.
# ReAct pattern: Thought → Action → Observation loop
system_prompt = """You are a helpful agent. For each step:
1. Thought: Reason about what to do next
2. Action: Call a tool if needed
3. Observation: Review the result
Repeat until the task is complete."""

# The LLM naturally generates this pattern:
# Thought: I need to find the bug. Let me read the auth module.
# Action: read_file(path="src/auth.py")
# Observation: [file contents...]
# Thought: I see the issue — the token expiry check is using < instead of <=
# Action: edit_file(path="src/auth.py", ...)
| Pros | Cons |
| --- | --- |
| Simple to implement | Can get stuck in loops |
| Works with any LLM | No backtracking — can't undo bad decisions |
| Easy to debug (visible reasoning) | Greedy — takes the first "good enough" action |
| Low overhead per step | No global planning — purely reactive |
Tip: ReAct is the right starting point for 80% of agent tasks. Only move to more complex architectures when you hit specific limitations like lack of planning or need for self-correction.
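The Thought → Action → Observation loop can be sketched as a small driver function. This is a minimal illustration, not a production implementation: `call_llm` and the `TOOLS` registry are toy stand-ins for a real model call and real tool implementations.

```python
import re

# Hypothetical stand-ins: call_llm would hit a real model; TOOLS maps tool
# names to callables.
def call_llm(transcript):
    # Toy model: reads the file once, then declares the task finished.
    if "Observation:" in transcript:
        return "Thought: I found the bug.\nFinal Answer: fixed"
    return 'Thought: I need the file.\nAction: read_file(path="src/auth.py")'

TOOLS = {"read_file": lambda path: f"[contents of {path}]"}

def react_loop(task, max_steps=10):
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        # Parse the Action line and dispatch to the matching tool
        match = re.search(r'Action: (\w+)\(path="([^"]+)"\)', step)
        if match:
            tool, arg = match.groups()
            transcript += f"Observation: {TOOLS[tool](arg)}\n"
    return "stopped: step limit reached"
```

Note the `max_steps` cap: even this toy loop needs a guard against the "stuck in loops" failure mode listed above.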

Chain-of-Thought (CoT)

Chain-of-Thought prompting makes the LLM show its reasoning step by step before reaching a conclusion. It's not an agent architecture per se, but it's a critical technique within agent architectures to improve reasoning quality.

# Chain-of-Thought: Force step-by-step reasoning
prompt = """Analyze this error and determine the root cause.

Think step by step:
1. What does the error message say?
2. What code path triggered it?
3. What are the possible causes?
4. Which cause is most likely given the context?
5. What's the fix?

Error: TypeError: Cannot read property 'id' of undefined at line 42 of users.js"""

# The LLM reasons through each step explicitly,
# leading to better answers than "just fix it"
Info: CoT improves accuracy on complex reasoning tasks by 10-30%. It works because LLMs "think" through token generation — forcing them to generate intermediate steps gives them more "compute time" to reason.

Tree-of-Thought (ToT)

Tree-of-Thought extends CoT by exploring multiple reasoning paths in parallel and selecting the best one. Instead of committing to the first chain of reasoning, the agent generates several candidates, evaluates them, and prunes bad branches.

# Tree-of-Thought: Explore multiple approaches
def tree_of_thought(problem, num_branches=3):
    # Generate multiple candidate approaches
    candidates = []
    for i in range(num_branches):
        approach = llm.generate(
            f"Propose approach #{i+1} to solve: {problem}"
        )
        candidates.append(approach)

    # Evaluate each approach (parse the numeric rating from the reply)
    scores = []
    for approach in candidates:
        reply = llm.generate(
            f"Rate this approach 1-10 for correctness and efficiency. "
            f"Reply with just the number: {approach}"
        )
        scores.append(int(reply.strip()))

    # Pick the best and execute
    best = candidates[scores.index(max(scores))]
    return agent_execute(best)
Warning: ToT multiplies your LLM calls by the branching factor. 3 branches with 5 evaluation steps = 15+ LLM calls per decision point. Use it only for high-stakes decisions where accuracy justifies the cost.
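The snippet above explores a single level of branches. A fuller Tree-of-Thought expands the best candidates over several levels, pruning the frontier at each depth (beam search). Here is a sketch under stated assumptions: `propose` and `score` are hypothetical stand-ins for the LLM calls that would generate and rate continuations.

```python
# Hypothetical stand-in: would ask the LLM for k continuations of a path.
def propose(partial_plan, k):
    return [partial_plan + [f"step-{len(partial_plan)}-{i}"] for i in range(k)]

# Hypothetical stand-in: would ask the LLM to rate the path; here, lower
# branch indices score higher so the example is deterministic.
def score(plan):
    return -sum(int(s.rsplit("-", 1)[1]) for s in plan)

def tree_of_thought_beam(depth=3, branch=3, beam=2):
    frontier = [[]]  # start from an empty reasoning path
    for _ in range(depth):
        # Expand every surviving path, then prune back to the beam width
        candidates = [p for path in frontier for p in propose(path, branch)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]  # best complete reasoning path
```

The beam width is the cost knob: LLM calls grow as roughly depth × beam × branch, which is why the warning above matters.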

Reflexion: Self-Critique and Retry

Reflexion adds a self-evaluation step after each attempt. If the agent fails or produces poor results, it reflects on what went wrong, generates a critique, and retries with the insight. This creates a learning loop within a single task.

# Reflexion: Try → Evaluate → Reflect → Retry
def reflexion_loop(task, max_attempts=3):
    reflections = []

    for attempt in range(max_attempts):
        # Attempt the task (with past reflections as context)
        result = agent_execute(task, past_reflections=reflections)

        # Evaluate the result
        # Evaluate the result (ask for a PASS/FAIL verdict up front)
        evaluation = llm.generate(
            f"Did this succeed? Start your reply with PASS or FAIL, "
            f"then explain what went wrong.\n"
            f"Task: {task}\nResult: {result}"
        )

        if evaluation.startswith("PASS"):
            return result

        # Reflect and store the lesson
        reflection = llm.generate(
            f"What should I do differently next time?\n"
            f"Failed attempt: {result}\nEvaluation: {evaluation}"
        )
        reflections.append(reflection)

    return "Failed after max attempts"
Tip: Reflexion is especially powerful for coding agents. The agent writes code, runs tests, sees failures, reflects on why, and writes better code on the next attempt. This mirrors how human developers work.
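For coding tasks, the evaluation step doesn't have to be another LLM call: running the candidate code against real tests gives a ground-truth signal, and the failure message itself can serve as the reflection. A minimal sketch, with `generate_code` as a toy stand-in for the model:

```python
# Hypothetical stand-in: gets the operator wrong until a reflection points
# it out.
def generate_code(task, reflections):
    if any("subtract" in r for r in reflections):
        return "def diff(a, b):\n    return a - b"
    return "def diff(a, b):\n    return a + b"

def run_tests(code):
    namespace = {}
    exec(code, namespace)  # in production, sandbox untrusted code!
    try:
        assert namespace["diff"](5, 3) == 2
        return None  # success
    except AssertionError:
        return "diff(5, 3) returned the wrong value; expected 5 - 3 = 2, so subtract"

def reflexion_coding(task, max_attempts=3):
    reflections = []
    for _ in range(max_attempts):
        code = generate_code(task, reflections)
        failure = run_tests(code)
        if failure is None:
            return code
        reflections.append(failure)  # the test failure doubles as the reflection
    return None
```

Grounding the evaluation in test results avoids the failure mode where the LLM grades its own work too generously.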

Plan-and-Execute

Plan-and-Execute separates planning from execution into two distinct phases. A planner agent creates a high-level plan, then an executor agent follows the plan step by step. The planner can be re-invoked if the plan needs adjustment.

# Plan-and-Execute: Two-phase architecture
def plan_and_execute(goal):
    # Phase 1: Create the plan
    plan = planner_llm.generate(
        f"Create a step-by-step plan to accomplish: {goal}\n"
        f"Output as a numbered list. Each step should be actionable."
    )

    results = []
    steps = list(plan.steps)
    while steps:
        # Phase 2: Execute the next step
        step = steps.pop(0)
        result = executor_agent.run(step)
        results.append(result)

        # Re-plan if needed: replace the remaining steps with the revision
        # (a while loop lets us swap the step list safely mid-execution)
        if result.needs_replanning:
            revised = planner_llm.generate(
                f"Original plan: {plan}\n"
                f"Completed: {results}\n"
                f"Problem: {result.error}\n"
                f"Revise remaining steps: {steps}"
            )
            steps = list(revised.steps)

    return results
Info: Plan-and-Execute is ideal for complex, multi-step tasks where you want visibility into the plan before execution starts. It also enables human-in-the-loop review — show the plan to a human for approval before executing.
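The human-in-the-loop gate mentioned above can be a thin wrapper around the two phases: show the plan, wait for approval, only then execute. A sketch, where `ask_user` is a stand-in for whatever interface the agent has (CLI prompt, chat message, web form):

```python
def plan_and_execute_with_approval(goal, make_plan, execute_step, ask_user):
    plan = make_plan(goal)
    summary = "\n".join(f"{i+1}. {s}" for i, s in enumerate(plan))
    # Gate: nothing runs until a human signs off on the full plan
    if ask_user(f"Proposed plan for {goal!r}:\n{summary}\nApprove? (y/n)") != "y":
        return None  # plan rejected; no steps executed
    return [execute_step(step) for step in plan]
```

Because the plan exists as data before execution starts, the same hook also supports logging, cost estimation, or diffing a revised plan against the original.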

Choosing the Right Architecture

| Architecture | Best For | Complexity | Cost |
| --- | --- | --- | --- |
| ReAct | Most tasks, simple tool use | Low | Low |
| CoT + ReAct | Tasks requiring careful reasoning | Low | Low-Medium |
| Reflexion | Coding, writing, tasks with clear success criteria | Medium | Medium |
| Plan-and-Execute | Complex multi-step projects | Medium | Medium |
| Tree-of-Thought | High-stakes decisions, research | High | High |
Tip: Start with ReAct. Add CoT if reasoning quality is poor. Add Reflexion if the agent makes mistakes but could self-correct. Use Plan-and-Execute for multi-step workflows. Reserve ToT for critical decision points.
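The escalation path in the tip can be written down as a simple dispatcher. The task attributes and their priority order here are illustrative assumptions, not a standard API:

```python
def choose_architecture(high_stakes, multi_step, self_correctable, needs_reasoning):
    # Check the most expensive requirement first, fall through to the cheapest
    if high_stakes:
        return "tree-of-thought"
    if multi_step:
        return "plan-and-execute"
    if self_correctable:
        return "reflexion"
    if needs_reasoning:
        return "cot+react"
    return "react"
```

Encoding the choice this way makes the default explicit: unless the task demands more, plain ReAct wins.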

Test Yourself

What does ReAct stand for and how does it work?

ReAct = Reasoning + Acting. The agent alternates between generating reasoning traces (Thought) and calling tools (Action), then observing the result (Observation). This loop continues until the task is complete. It's the most common agent architecture.

When would you choose Tree-of-Thought over ReAct?

Use Tree-of-Thought when the task has high stakes and multiple valid approaches where picking the wrong one is costly. ToT explores several reasoning paths in parallel and selects the best. The tradeoff is significantly higher cost (multiplied LLM calls). For most tasks, ReAct's simpler greedy approach is sufficient.

How does Reflexion improve agent performance?

Reflexion adds a self-evaluation loop: the agent attempts the task, evaluates its result, reflects on what went wrong, and retries with that insight. It turns failures into learning signals within a single task execution. Especially effective for coding agents where tests provide clear success/failure signals.

What's the main advantage of Plan-and-Execute over ReAct?

Plan-and-Execute separates planning from execution, giving visibility into the full plan before any action is taken. This enables human-in-the-loop review, better resource estimation, and systematic execution. ReAct is purely reactive — it decides one step at a time without a global plan.

Why does Chain-of-Thought prompting improve LLM reasoning?

LLMs "think" through token generation — each generated token is a unit of computation. Forcing the model to generate intermediate reasoning steps gives it more "compute time" before reaching a conclusion. Without CoT, the model must jump directly from question to answer, compressing all reasoning into the first few tokens.

Interview Questions

You're building an agent that writes and debugs code. Which architecture would you choose and why?

A ReAct + Reflexion combination is ideal for coding agents. ReAct handles the basic loop of reading code, making edits, and running commands. Reflexion adds self-correction — when tests fail, the agent reflects on the error, understands what went wrong, and generates a better fix on the next attempt. This mirrors the human debugging workflow of "try → fail → analyze → fix."

How would you prevent a ReAct agent from getting stuck in an infinite loop?

Multiple safeguards: (1) Step limit — cap the maximum number of reasoning/action cycles (e.g., 25 steps). (2) Token budget — set a max token spend per task. (3) Loop detection — track recent actions and detect repetition. (4) Escalation — after N failed attempts, escalate to a human or a different strategy. (5) Timeout — wall-clock time limit for the entire task.
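Safeguards (1) through (5) can be bundled into a guard object that the agent loop consults each cycle. A sketch with illustrative thresholds:

```python
import time
from collections import deque

class LoopGuard:
    def __init__(self, max_steps=25, max_tokens=100_000, max_seconds=300,
                 repeat_window=4):
        self.max_steps, self.max_tokens = max_steps, max_tokens
        self.deadline = time.monotonic() + max_seconds  # (5) wall-clock limit
        self.steps = self.tokens = 0
        self.recent = deque(maxlen=repeat_window)  # (3) recent-action window

    def check(self, action, tokens_used):
        """Return None if the loop may continue, else a stop reason."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            return "step limit exceeded"        # (1)
        if self.tokens > self.max_tokens:
            return "token budget exceeded"      # (2)
        if time.monotonic() > self.deadline:
            return "timeout"                    # (5)
        if self.recent.count(action) >= 2:
            return "loop detected"              # (3) same action repeating
        self.recent.append(action)
        return None
```

A non-None return is where safeguard (4) kicks in: the caller escalates to a human or switches strategy instead of blindly continuing.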

Compare the cost tradeoffs between ReAct and Tree-of-Thought for a production system.

ReAct uses 1 LLM call per step — linear cost growth. Tree-of-Thought uses B calls per decision point (where B = branching factor), plus evaluation calls — multiplicative cost growth. For a 10-step task with 3 branches: ReAct ~10 calls, ToT ~30-50 calls. In production, this means 3-5x higher API costs and latency. Use ToT selectively at critical decision points, not for every step.