Agent Frameworks Compared

LangGraph, CrewAI, AutoGen, and the OpenAI Agents SDK solve different problems; the harder question is whether you need a framework at all.

intermediate · 9 min read

The framework question is almost always asked too early. Anthropic's review of production agent deployments reached a deflating conclusion: the most successful implementations were not built on complex frameworks at all, but on simple, composable patterns, with teams reaching for abstraction only when a simpler version demonstrably failed. So before comparing frameworks, answer the prior question they all paper over: do you need an agent, or a workflow?

Workflow or agent?

The distinction is sharp and it decides everything downstream. A workflow orchestrates LLM calls and tools through predefined code paths; you wrote the control flow, the model fills in the steps. An agent lets the model dynamically direct its own process, choosing tools and deciding when it is done. Workflows win when the task is well-defined and you want predictability; agents win when the path genuinely cannot be known in advance. Most "agent" projects are workflows wearing a costume, and they would be more reliable if built as one. Frameworks matter only once you have decided you are in genuine agent territory or a workflow complex enough that hand-rolled glue has become the bottleneck.
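The difference is easiest to see side by side. The following is a minimal sketch, not a real integration: `fake_model` is a hypothetical stand-in for an LLM call, and the `lookup` tool and the `DECIDE` prompt convention are invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call; a real one would hit an API.
def fake_model(prompt: str) -> str:
    if prompt.startswith("DECIDE"):
        # Pretend the model asks for one lookup, then decides it is done.
        return "finish" if "lookup ->" in prompt else "lookup"
    return f"<answer to: {prompt}>"

def run_workflow(text: str, model: Callable[[str], str] = fake_model) -> str:
    """Workflow: the code fixes the path; the model only fills in each step."""
    summary = model(f"Summarise: {text}")
    return model(f"Translate to French: {summary}")

# Illustrative tool registry (an assumption, not from any framework).
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: f"facts about {query}",
}

def run_agent(task: str, model: Callable[[str], str] = fake_model,
              max_steps: int = 5) -> List[str]:
    """Agent: the model picks the next action and decides when to stop."""
    history: List[str] = []
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        prompt = f"DECIDE next action for: {task}\n" + "\n".join(history)
        action = model(prompt)
        if action == "finish":
            break
        if action not in TOOLS:
            history.append(f"error -> unknown tool {action!r}")
            continue
        history.append(f"{action} -> {TOOLS[action](task)}")
    return history
```

In `run_workflow` the sequence of calls is readable straight from the source; in `run_agent` it depends on what the model returns at run time, which is where both the flexibility and the unpredictability come from.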

What a framework actually buys you

Strip away the marketing and a framework provides some subset of: a way to define control flow (a graph, a chain, a conversation), state that survives across steps, persistence so a long run can resume after a crash, human-in-the-loop checkpoints, and observability into what the model did and why. The cost is an abstraction layer between you and the prompt, and the prompt is exactly where agents fail. That trade is the whole decision.
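To make "persistence so a long run can resume after a crash" concrete, here is a rough hand-rolled sketch. It assumes the state is a JSON-serialisable dict and that each step is a plain function from state to state; the function name and the checkpoint file layout are illustrative, not taken from any framework.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

State = Dict[str, object]

def run_with_checkpoints(steps: List[Callable[[State], State]],
                         state: State, path: str) -> State:
    """Run steps in order, saving state after each so a crashed run can resume."""
    checkpoint = Path(path)
    start = 0
    if checkpoint.exists():
        # A previous run got part-way: pick up from the last completed step.
        saved = json.loads(checkpoint.read_text())
        state, start = saved["state"], saved["next_step"]
    for index in range(start, len(steps)):
        state = steps[index](state)
        # Persist only after the step succeeds, so a failure re-runs that step.
        checkpoint.write_text(json.dumps({"state": state, "next_step": index + 1}))
    return state
```

This is roughly twenty lines, which is the point: if durable state is the only thing you need, it may be cheaper to own it than to adopt a framework for it. Once you also need branching, concurrent steps, or pausing for human approval, the hand-rolled version grows quickly.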

The landscape, opinionated

Framework         | Mental model                                    | Best when
LangGraph         | low-level graph: nodes, edges, shared state     | you want explicit control and durable, stateful long-running agents
CrewAI            | high-level roles: agents with goals collaborate | you want a multi-role team running quickly with little wiring
AutoGen           | conversation between agents                     | research and multi-agent dialogue patterns
OpenAI Agents SDK | lightweight loop with handoffs and guardrails   | a thin, mostly-OpenAI stack without heavy machinery

LangGraph is the most control-oriented: it models the application as a graph of nodes that read and write a shared state object, and prioritises orchestration over abstraction, with persistence, checkpointing, and human-in-the-loop as first-class features. You pay in verbosity and get explicit control, which is why it is a common choice for production deployments that need durable state. CrewAI sits at the opposite end: define agents by role and goal, and the framework handles collaboration; it is fast to start and harder to steer when a task does not fit the role metaphor. AutoGen frames everything as a conversation between agents, which is elegant for multi-agent research and awkward when you want deterministic control. The OpenAI Agents SDK is deliberately minimal: an agent loop with tools, handoffs between agents, and guardrails, attractive if your stack is already OpenAI-centric and you do not want a graph engine.
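The graph-of-nodes-over-shared-state model is simple enough to illustrate in plain Python. The sketch below is a toy and deliberately does not use LangGraph's API: `run_graph`, the node and router dictionaries, and the draft/review example are all assumptions made up for illustration.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]
Node = Callable[[State], State]            # returns a partial update merged into state
Router = Callable[[State], Optional[str]]  # returns the next node name, or None to stop

def run_graph(nodes: Dict[str, Node], routers: Dict[str, Router],
              entry: str, state: State, max_steps: int = 20) -> State:
    """Walk the graph from `entry`, merging each node's update into shared state."""
    current: Optional[str] = entry
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("graph did not terminate within max_steps")
        state = {**state, **nodes[current](state)}
        current = routers[current](state)  # edges are decided by code, not the model
        steps += 1
    return state

# Example graph: draft, review, and loop back to draft until approved.
nodes: Dict[str, Node] = {
    "draft": lambda s: {"attempts": int(s.get("attempts", 0)) + 1,
                        "draft": f"v{int(s.get('attempts', 0)) + 1}"},
    "review": lambda s: {"approved": int(s["attempts"]) >= 2},
}
routers: Dict[str, Router] = {
    "draft": lambda s: "review",
    "review": lambda s: None if s["approved"] else "draft",
}
```

Calling `run_graph(nodes, routers, "draft", {})` runs draft, review, draft, review and stops with `attempts` equal to 2 and `approved` set to `True`. The verbosity is visible even at this scale: every transition is spelled out, which is what makes the control flow inspectable.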

When it falls down

  • The abstraction hides the prompt. Most agent failures are prompt or tool-description failures (see tool-use-function-calling), and a framework that buries the assembled prompt makes them far harder to diagnose. If you cannot see the exact tokens the model received, you cannot debug it.
  • Framework lock-in is real. State models and control-flow primitives do not port between frameworks, so a migration is a rewrite. Choose for the next year, not the next demo.
  • Complexity you did not need. A graph engine wrapped around what is really a three-step prompt chain adds failure modes (state bugs, version churn) without adding capability. Start with plain code and a loop; adopt a framework when the absence of one is the thing hurting you.
  • Multi-agent overhead. Role-based and conversational frameworks make spawning many agents trivial, which tempts teams into multi-agent designs that cost more tokens and latency than a single well-prompted agent would (see agentic-react).
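As one way to keep the prompt visible when working in plain code, the sketch below assembles the prompt in a single function and records every exact prompt and response pair. The names `build_prompt` and `traced`, and the prompt layout itself, are hypothetical; the model is any callable from string to string.

```python
from typing import Callable, Dict, List

def build_prompt(system: str, tool_descriptions: Dict[str, str], user: str) -> str:
    """Assemble the full prompt in one place so it can be read and diffed."""
    tools = "\n".join(f"- {name}: {desc}" for name, desc in tool_descriptions.items())
    return f"{system}\n\nTools:\n{tools}\n\nUser: {user}"

def traced(model: Callable[[str], str],
           log: List[Dict[str, str]]) -> Callable[[str], str]:
    """Wrap a model call so the exact prompt and response are appended to `log`."""
    def wrapped(prompt: str) -> str:
        response = model(prompt)
        log.append({"prompt": prompt, "response": response})
        return response
    return wrapped
```

Whatever framework you end up with, it is worth checking that it can give you an equivalent of `log`: the literal text the model received, including the tool descriptions, for every call in a run.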

Further reading

  • Building Effective AI Agents - Anthropic; the workflow-vs-agent distinction and the composable-patterns argument, with the case for minimal frameworks.
  • LangGraph documentation overview - the graph-and-state model, persistence, and human-in-the-loop primitives in the most control-oriented framework.