Build AI Agents From Scratch With Python (2026)

The fastest way to truly understand AI agents is to build one yourself, and you don’t need a heavyweight framework to do it. In this tutorial you’ll build a working AI agent from scratch in Python, using only the standard library plus a model provider’s client library. By the end you’ll have a complete, runnable agent in under 80 lines and, more importantly, you’ll understand every line of it.

The core idea: an agent is a loop

Strip away the hype and an AI agent is a simple loop. A common beginner-friendly pattern is ReAct (short for “reason and act”), where the agent alternates between reasoning in natural language and acting through tools. Each turn: the model decides what to do, you run any tool it asks for, you feed the result back, and you repeat until the model produces a final answer or you hit a safety limit. That’s it. Everything else (frameworks, orchestration, memory stores) is convenience layered on top of this loop.

We’ll build an agent with one tool (a calculator), short-term memory, and a guardrail, then show you how to extend it.

Build it in 6 steps

Set up your environment

You need Python 3.10+ and an API key from a model provider. Install the client library and set your key as an environment variable so it never sits in your code:

Step 1 — install and set your key (terminal)

pip install openai          # or anthropic, etc.
export OPENAI_API_KEY="your-key-here"

Define a tool

A tool is just a plain Python function the agent can call. Here’s a simple, safe calculator. Note the input validation — never blindly eval() untrusted input in real projects; this version only allows math characters:

Step 2 — define a tool (a plain Python function)

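def calculator(expression: str) -> str:
    """Safely evaluate a basic math expression."""
    # Only allow digits and basic math operators
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "Error: invalid characters."
    # "**" passes the character check but can build astronomically large numbers
    if "**" in expression:
        return "Error: exponentiation is not supported."
    try:
        # Empty builtins: the expression cannot reach any Python functions
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"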

TOOLS = {"calculator": calculator}
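If you would rather avoid eval() entirely, one option is to parse the expression with the standard library’s ast module and evaluate only the node types you allow. The sketch below is an illustrative alternative, not the tutorial’s own method; safe_calculator, _eval_node, and _OPS are hypothetical names introduced here.

```python
import ast
import operator

# Hypothetical alternative to the eval()-based calculator.
# Only these operators are permitted; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}

def _eval_node(node):
    """Recursively evaluate numbers combined with + - * / and unary signs."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")

def safe_calculator(expression: str) -> str:
    """Evaluate basic arithmetic without calling eval()."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError,
            OverflowError, RecursionError) as e:
        return f"Error: {e}"
```

Because the function has the same signature as calculator, it could be registered as TOOLS = {"calculator": safe_calculator} without touching the rest of the agent.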

Write the reasoning loop

This is the heart of the agent: the ReAct loop. We ask the model what to do; if it asks for a tool, we run it and feed the result back; if it gives an answer, we stop. We tell the model to reply in JSON so we can parse its intent reliably:

Step 3–5 — the agent: reasoning loop + memory + guardrails

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

SYSTEM_PROMPT = """You are a helpful agent.
You can use one tool: calculator(expression).
To use it, reply with ONLY this JSON: {"tool": "calculator", "input": "2+2"}
When you have the final answer, reply with: {"answer": "..."}"""

def run_agent(question, max_steps=5):
    # memory: a running list of messages
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

    for step in range(max_steps):          # guardrail: never loop forever
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
        ).choices[0].message.content or ""  # content can be None

        messages.append({"role": "assistant", "content": reply})

        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply                    # model answered in plain text

        if not isinstance(action, dict):    # valid JSON but not an object, e.g. 42
            return reply

        if "answer" in action:              # stopping condition
            return action["answer"]

        name = action.get("tool")
        if isinstance(name, str) and name in TOOLS:   # the "act" step
            result = TOOLS[name](str(action.get("input", "")))
        else:                               # unknown tool: say so instead of stalling
            result = f"Error: unknown tool {name!r}."
        # feed the observation back into memory
        messages.append({"role": "user",
                         "content": f"Tool result: {result}"})

    return "Stopped: reached the step limit."
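Models sometimes wrap their JSON in a Markdown code fence even when told to reply with JSON only, and json.loads() rejects that. A minimal sketch of a more forgiving parser follows; parse_action and FENCE are hypothetical names, and the fenced reply format is an assumption about model behavior rather than something the API guarantees.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to keep the listing readable

# Matches an entire reply of the form: fence, optional "json" tag, body, fence.
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)

def parse_action(reply: str):
    """Return the action dict found in a model reply, or None if there is none."""
    text = reply.strip()
    match = _FENCED.fullmatch(text)
    if match:
        text = match.group(1)      # keep only what was inside the fence
    try:
        action = json.loads(text)
    except json.JSONDecodeError:
        return None                # plain-text reply
    # Valid JSON that is not an object (a bare number, a list) is not an action.
    return action if isinstance(action, dict) else None
```

In the loop, a None result would take the place of the JSONDecodeError branch: treat the reply as the model’s plain-text answer.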

Add memory

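Memory here is simply the running messages list: every reasoning step, tool call, and result gets appended, so the agent always has the full context of the task so far. The list grows with every step, so a long task can eventually exceed the model’s context window. For agents that must remember across sessions, you’d later persist this history, for example in a database or a vector store.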
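One simple way to keep the message list bounded is to keep the opening messages and only the most recent turns. This is a rough sketch under that assumption; trim_memory and max_messages are hypothetical names, and dropping middle turns does lose information the agent may have needed.

```python
def trim_memory(messages, max_messages=12):
    """Keep the system prompt, the original question, and the newest messages."""
    if len(messages) <= max_messages:
        return list(messages)
    head = messages[:2]                          # system prompt + user question
    tail_len = max(max_messages - len(head), 0)  # room left for recent turns
    tail = messages[len(messages) - tail_len:] if tail_len else []
    return head + tail
```

You could call messages = trim_memory(messages) at the top of each loop iteration, before sending the list to the model.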

Add guardrails

The single most important line is max_steps. Without a step limit, a confused agent can loop forever and run up a bill. Our loop also degrades gracefully: if the model replies in plain text instead of JSON, we just return it rather than crashing.
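A step limit protects the loop itself; the tool call is the other place worth guarding. The sketch below shows one possible wrapper that never raises and caps how much text a tool can push back into memory. call_tool_safely and max_chars are hypothetical names, not part of the tutorial’s code.

```python
def call_tool_safely(tools, name, tool_input, max_chars=2000):
    """Run a tool by name and always return a string the model can read."""
    tool = tools.get(name) if isinstance(name, str) else None
    if tool is None:
        return f"Error: unknown tool {name!r}."
    try:
        result = str(tool(str(tool_input)))
    except Exception as e:           # a bug in a tool should not crash the agent
        return f"Error: {type(e).__name__}: {e}"
    if len(result) > max_chars:      # cap output so one call cannot flood the context
        result = result[:max_chars] + "... [truncated]"
    return result
```

Returning the error as text, rather than raising, gives the model a chance to correct its next request.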

Run and test it

Call run_agent() with a question that needs the tool. Watch it reason, call the calculator, read the result, and return the final answer:

Step 6 — run it

print(run_agent("What is 1234 * 9 plus 17?"))
# The agent calls the calculator tool, reads the result,
# then returns the final answer.
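To test the loop without spending API calls, one approach is to pass the model call in as a function and swap in a scripted fake. The sketch below restates the loop in that shape; run_agent_with, scripted_model, and the stub tool are hypothetical names used only for this illustration.

```python
import json

def run_agent_with(ask_model, tools, question, max_steps=5):
    """Same loop shape as run_agent, but the model call is injected."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply
        if not isinstance(action, dict):
            return reply
        if "answer" in action:
            return action["answer"]
        tool = tools.get(action.get("tool"))
        result = tool(action.get("input", "")) if tool else "Error: unknown tool."
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped: reached the step limit."

def scripted_model(replies):
    """Build a fake ask_model that returns canned replies in order."""
    remaining = iter(replies)
    return lambda messages: next(remaining)

fake = scripted_model([
    '{"tool": "calculator", "input": "1234*9+17"}',  # turn 1: ask for the tool
    '{"answer": "11123"}',                           # turn 2: final answer
])
stub_tools = {"calculator": lambda expr: "11123"}    # stub tool, no eval involved
print(run_agent_with(fake, stub_tools, "What is 1234 * 9 plus 17?"))  # prints 11123
```

The same harness makes the step limit easy to check: a fake model that always requests a tool should end with the “Stopped” message.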

The loop, visualized

[Diagram: The ReAct agent loop in code. Ask model (what to do?) → Parse intent (tool or answer?) → Run tool (feed result back) → Answer / cap (stop when done)]

Figure 1: the exact loop our Python code implements — ask, parse, act, repeat, with a hard step cap.

The complete code

Here’s the whole agent in one file. Copy it into agent.py, set your API key, and run python agent.py:

agent.py — the complete, runnable file

import json
from openai import OpenAI

client = OpenAI()

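def calculator(expression: str) -> str:
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "Error: invalid characters."
    if "**" in expression:  # allowed characters, but can build huge numbers
        return "Error: exponentiation is not supported."
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"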

TOOLS = {"calculator": calculator}

SYSTEM_PROMPT = """You are a helpful agent.
You can use one tool: calculator(expression).
To use it, reply with ONLY this JSON: {"tool": "calculator", "input": "2+2"}
When you have the final answer, reply with: {"answer": "..."}"""

def run_agent(question, max_steps=5):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages
        ).choices[0].message.content or ""  # content can be None
        messages.append({"role": "assistant", "content": reply})
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply
        if not isinstance(action, dict):    # valid JSON but not an object
            return reply
        if "answer" in action:
            return action["answer"]
        name = action.get("tool")
        if isinstance(name, str) and name in TOOLS:
            result = TOOLS[name](str(action.get("input", "")))
        else:                               # unknown tool: report it, don't stall
            result = f"Error: unknown tool {name!r}."
        messages.append({"role": "user",
                         "content": f"Tool result: {result}"})
    return "Stopped: reached the step limit."

if __name__ == "__main__":
    print(run_agent("What is 1234 * 9 plus 17?"))

That’s a genuine, working AI agent — reasoning, tool use, memory, a stopping condition, and a guardrail — in well under 80 lines and with no framework.


How to extend it

Once the basics work, here’s how this same skeleton grows into something genuinely useful:

Add more tools. Drop new functions into the TOOLS dictionary — web search, a weather API, a database query — and describe them in the system prompt. (See how to write a system prompt for an agent.)
Upgrade memory. Swap the in-memory messages list for a vector store so the agent recalls past sessions.
Harden it for production. Add output validation, error handling around tool calls, and approval gates for risky actions. (See how to stop your agent from failing.)
Graduate to a framework. When projects get bigger, LangChain and similar frameworks handle orchestration, tracing, and tool management for you, but now you know what they’re doing under the hood.
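For the first extension, adding tools, one way to keep the TOOLS dictionary and the system prompt in sync is to register functions with a decorator and generate the tool descriptions from their signatures and docstrings. This is a sketch under that assumption; TOOL_REGISTRY, tool, get_weather, and build_system_prompt are hypothetical names, and both tools are stubs that call no real API.

```python
import inspect

TOOL_REGISTRY = {}

def tool(func):
    """Register a function as a tool under its own name."""
    TOOL_REGISTRY[func.__name__] = func
    return func

@tool
def calculator(expression: str) -> str:
    """Evaluate a basic math expression."""
    return "not implemented in this sketch"

@tool
def get_weather(city: str) -> str:
    """Return a weather summary for a city (stub, no real API call)."""
    return f"Weather for {city}: unknown (stub)"

def build_system_prompt(registry):
    """Describe every registered tool using its signature and docstring."""
    lines = ["You are a helpful agent. You can use these tools:"]
    for name, func in registry.items():
        doc = inspect.getdoc(func) or "No description."
        lines.append(f"- {name}{inspect.signature(func)}: {doc}")
    lines.append('To use a tool, reply with ONLY JSON like '
                 '{"tool": "<name>", "input": "<text>"}')
    lines.append('When you have the final answer, reply with: {"answer": "..."}')
    return "\n".join(lines)

print(build_system_prompt(TOOL_REGISTRY))
```

With this shape, adding a tool means writing one decorated function; the prompt text and the lookup table both update from the same source.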

The reason building from scratch is worth it: when you understand this loop, every agent product and framework suddenly makes sense. You can read our reviews of production agents and recognize exactly what’s happening inside them.

Frequently asked questions

Can I build an AI agent from scratch in Python without a framework?

Yes. An agent is a loop that calls a model, lets it choose a tool, runs the tool, feeds the result back, and repeats until done. Plain Python handles that; frameworks just add convenience.

What do I need to build a Python AI agent?

Python 3.10+, a model API key, the ReAct loop, one or two tools, simple memory (a message list), and a guardrail like a max-step limit.

How many lines of code is a basic AI agent?

A minimal but real agent is roughly 40–80 lines, including a tool, the loop, and a step limit — like the complete example in this tutorial.

Should beginners use LangChain or build from scratch?

Build a small one from scratch first to see the moving parts. Once the loop clicks, frameworks like LangChain save time on larger projects.

The OneAppleFall Team

We independently test every AI agent and tool we review — on our own dime, on real work. We never accept payment for a score, and we disclose affiliate links clearly. Read our review methodology →
