The fastest way to truly understand AI agents is to build one yourself — and you don’t need a heavyweight framework to do it. In this tutorial you’ll build a working AI agent from scratch in Python, using nothing but the standard library and a model API. By the end you’ll have a complete, runnable agent under 80 lines, and — more importantly — you’ll understand every line of it.
The core idea: an agent is a loop
Strip away the hype and an AI agent is a simple loop. The dominant beginner-friendly pattern in 2026 is ReAct, where the agent alternates between reasoning in natural language and acting through tools. Each turn: the model decides what to do, you run any tool it asks for, you feed the result back, and you repeat until the model produces a final answer or you hit a safety limit. That’s it. Everything else — frameworks, orchestration, memory stores — is convenience layered on top of this loop.
We’ll build an agent with one tool (a calculator), short-term memory, and a guardrail, then show you how to extend it.
Build it in 6 steps
Set up your environment
You need Python 3.10+ and an API key from a model provider. Install the client library and set your key as an environment variable so it never sits in your code:
Step 1 — install and set your key (terminal)
pip install openai  # or anthropic, etc.
export OPENAI_API_KEY="your-key-here"
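Before going further, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library. The helper name `require_api_key` is hypothetical and not part of any client library.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment variable `name`, or raise.

    Illustrative helper only; it is not part of any SDK.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it in your shell first.")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

Failing early with a readable message is usually easier to debug than an authentication error raised in the middle of the agent loop.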
Define a tool
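A tool is just a plain Python function the agent can call. Here’s a simple calculator with input validation. Never blindly eval() untrusted input in real projects; this version rejects anything other than digits, arithmetic operators, parentheses, and spaces before calling eval(). That blocks names and function calls, though costly expressions such as 9**9**9 still pass the filter: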
Step 2 — define a tool (a plain Python function)
def calculator(expression: str) -> str:
    """Safely evaluate a basic math expression."""
    try:
        # Only allow digits and basic math operators
        allowed = set("0123456789+-*/(). ")
        if not set(expression) <= allowed:
            return "Error: invalid characters."
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

TOOLS = {"calculator": calculator}
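If you would rather avoid eval() altogether, one alternative is to parse the expression with the standard-library ast module and evaluate only the node types you explicitly allow. This is a sketch of that swapped-in technique, not the calculator used in the rest of this tutorial. The names `ast_calculator` and `_evaluate` are hypothetical.

```python
import ast
import operator

# Map AST operator nodes to the arithmetic functions they stand for.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST) -> float:
    """Recursively evaluate a node, rejecting anything but plain arithmetic."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _evaluate(node.left), _evaluate(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def ast_calculator(expression: str) -> str:
    """Evaluate +, -, *, / and parentheses without calling eval()."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_evaluate(tree))
    except (SyntaxError, ValueError, ZeroDivisionError, RecursionError) as e:
        return f"Error: {e}"


print(ast_calculator("1234 * 9 + 17"))  # 11123
print(ast_calculator("2 ** 3"))         # Error: unsupported expression
```

Because exponentiation is not in the allowed operator table, an input like 9**9**9 is rejected instead of being computed. The function has the same signature as calculator, so it could be registered in TOOLS in its place.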
Write the reasoning loop
This is the heart of the agent: the ReAct loop. We ask the model what to do; if it asks for a tool, we run it and feed the result back; if it gives an answer, we stop. We tell the model to reply in JSON so we can parse its intent reliably:
Step 3–5 — the agent: reasoning loop + memory + guardrails
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

SYSTEM_PROMPT = """You are a helpful agent.
You can use one tool: calculator(expression).
To use it, reply with ONLY this JSON: {"tool": "calculator", "input": "2+2"}
When you have the final answer, reply with: {"answer": "..."}"""

def run_agent(question, max_steps=5):
    # memory: a running list of messages
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    for step in range(max_steps):  # guardrail: never loop forever
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
        ).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # model answered in plain text
        if not isinstance(action, dict):
            return reply  # valid JSON (e.g. a bare number) but not an action
        if "answer" in action:  # stopping condition
            return action["answer"]
        if action.get("tool") in TOOLS:  # the "act" step
            result = TOOLS[action["tool"]](str(action.get("input", "")))
        else:
            result = "Error: unknown tool."
        # feed the observation back into memory
        messages.append({"role": "user",
                         "content": f"Tool result: {result}"})
    return "Stopped: reached the step limit."
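One weakness of parsing with json.loads alone is that models sometimes wrap their JSON in Markdown code fences or add a sentence around it, in which case the loop treats the whole reply as a final answer. A slightly more forgiving parser might look like the sketch below. `parse_action` is a hypothetical helper, and the "first brace to last brace" heuristic is an assumption that suits the flat JSON objects used in this tutorial.

```python
import json


def parse_action(reply: str):
    """Return the action dict found in `reply`, or None if there is none.

    Tries the whole reply first, then the text between the first "{" and
    the last "}" (which covers fenced or chatty replies).
    """
    candidates = [reply]
    start, end = reply.find("{"), reply.rfind("}")
    if start != -1 and end > start:
        candidates.append(reply[start:end + 1])
    for text in candidates:
        try:
            action = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(action, dict):
            return action
    return None


print(parse_action('Sure! {"tool": "calculator", "input": "2+2"}'))
print(parse_action("The answer is 4."))  # None
```

Inside the loop you could call this instead of json.loads and treat a None result as a plain-text answer.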
Add memory
Memory here is simply the running messages list — every reasoning step, tool call, and result gets appended, so the agent always has the full context of the task so far. For agents that must remember across sessions, you’d later swap this for a vector store.
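As a stepping stone before a vector store, the messages list can simply be written to disk and reloaded on the next run. This is a minimal sketch of that simpler stand-in, not a vector store. The function names and the idea of one JSON file per conversation are assumptions made for illustration.

```python
import json
from pathlib import Path


def save_memory(messages: list, path: Path) -> None:
    """Write the message list to disk as JSON."""
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_memory(path: Path, system_prompt: str) -> list:
    """Load a saved message list, or start fresh with only the system prompt."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return [{"role": "system", "content": system_prompt}]
```

A run would call load_memory at the start, append the new user question, and call save_memory before returning. The whole history is still sent to the model on every call, so this only suits short conversations.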
Add guardrails
The single most important line is max_steps. Without a step limit, a confused agent can loop forever and run up a bill. Our loop also degrades gracefully: if the model replies in plain text instead of JSON, we just return it rather than crashing.
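A step limit caps the number of model calls, but not how long they take. If you also want a wall-clock budget, one option is to check a deadline at the top of each iteration, as in this sketch. `run_with_limits` and its `step` callback are hypothetical names, and the 30-second default is an arbitrary choice.

```python
import time


def run_with_limits(step, max_steps: int = 5, max_seconds: float = 30.0):
    """Call `step()` until it returns a non-None result or a limit is hit.

    `step` is any zero-argument callable that performs one loop iteration
    and returns the final answer, or None to keep going.
    """
    deadline = time.monotonic() + max_seconds
    for _ in range(max_steps):
        if time.monotonic() > deadline:
            return "Stopped: reached the time limit."
        result = step()
        if result is not None:
            return result
    return "Stopped: reached the step limit."
```

The deadline is only checked between steps, so it cannot interrupt a model call that is already in progress. For that you would also need a timeout on the request itself.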
Run and test it
Call run_agent() with a question that needs the tool. Watch it reason, call the calculator, read the result, and return the final answer:
Step 6 — run it
print(run_agent("What is 1234 * 9 plus 17?"))
# The agent calls the calculator tool, reads the result,
# then returns the final answer.
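You can also exercise the loop without spending API credits by replacing the model with a scripted stand-in. The sketch below uses a simplified copy of the loop that takes the model call as a parameter and omits the system prompt. `run_agent_with`, `make_scripted_model`, and `fake_calculator` are hypothetical names introduced for this example.

```python
import json


def run_agent_with(ask_model, tools, question, max_steps=5):
    """Simplified agent loop; `ask_model(messages)` returns the reply text."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply
        if not isinstance(action, dict):
            return reply
        if "answer" in action:
            return action["answer"]
        if action.get("tool") in tools:
            result = tools[action["tool"]](action["input"])
            messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped: reached the step limit."


def make_scripted_model(replies):
    """Return a fake model that yields the given replies in order."""
    remaining = iter(replies)
    return lambda messages: next(remaining)


def fake_calculator(expression):
    """Stand-in tool that always returns a canned result."""
    return "11123"


scripted = make_scripted_model([
    '{"tool": "calculator", "input": "1234 * 9 + 17"}',
    '{"answer": "11123"}',
])
final = run_agent_with(scripted, {"calculator": fake_calculator},
                       "What is 1234 * 9 plus 17?")
print(final)  # 11123
```

Because the replies are fixed, the same script can check the step limit and the plain-text fallback deterministically.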
The loop, visualized
[Figure: diagram of the agent loop.]
The complete code
Here’s the whole agent in one file. Copy it into agent.py, set your API key, and run python agent.py:
agent.py — the complete, runnable file
import json
from openai import OpenAI

client = OpenAI()

def calculator(expression: str) -> str:
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "Error: invalid characters."
    try:
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

TOOLS = {"calculator": calculator}

SYSTEM_PROMPT = """You are a helpful agent.
You can use one tool: calculator(expression).
To use it, reply with ONLY this JSON: {"tool": "calculator", "input": "2+2"}
When you have the final answer, reply with: {"answer": "..."}"""

def run_agent(question, max_steps=5):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages
        ).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            return reply
        if not isinstance(action, dict):
            return reply  # valid JSON (e.g. a bare number) but not an action
        if "answer" in action:
            return action["answer"]
        if action.get("tool") in TOOLS:
            result = TOOLS[action["tool"]](str(action.get("input", "")))
        else:
            result = "Error: unknown tool."
        messages.append({"role": "user",
                         "content": f"Tool result: {result}"})
    return "Stopped: reached the step limit."

if __name__ == "__main__":
    print(run_agent("What is 1234 * 9 plus 17?"))
That’s a genuine, working AI agent — reasoning, tool use, memory, a stopping condition, and a guardrail — in well under 80 lines and with no framework.
How to extend it
Once the basics work, here’s how this same skeleton grows into something genuinely useful:
- Add more tools. Drop new functions into the TOOLS dictionary (web search, a weather API, a database query) and describe them in the system prompt. (See how to write a system prompt for an agent.)
- Upgrade memory. Swap the in-memory messages list for a vector store so the agent recalls past sessions.
- Harden it for production. Add output validation, error handling around tool calls, and approval gates for risky actions. (See how to stop your agent from failing.)
- Graduate to a framework. When projects get bigger, LangChain and similar tools handle scheduling, tracing, and tool management for you — but now you know what they’re doing under the hood.
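To make the first of these extensions concrete: once there is more than one tool, hand-editing the system prompt gets error-prone, so one option is to generate it from the tool registry. The sketch below assumes every tool takes a single string and has a one-line docstring. The tools `word_count` and `shout`, the `EXTRA_TOOLS` dictionary, and `build_system_prompt` are all hypothetical examples.

```python
import inspect


def word_count(text: str) -> str:
    """Count the words in a piece of text."""
    return str(len(text.split()))


def shout(text: str) -> str:
    """Convert text to upper case."""
    return text.upper()


# Hypothetical extra tools, registered the same way as the calculator.
EXTRA_TOOLS = {"word_count": word_count, "shout": shout}


def build_system_prompt(tools: dict) -> str:
    """Describe each registered tool using the first line of its docstring."""
    lines = ["You are a helpful agent. You can use these tools:"]
    for name, func in tools.items():
        doc = inspect.getdoc(func) or "No description."
        lines.append(f"- {name}(input): {doc.splitlines()[0]}")
    lines.append('To use a tool, reply with ONLY this JSON: '
                 '{"tool": "<name>", "input": "<text>"}')
    lines.append('When you have the final answer, reply with: {"answer": "..."}')
    return "\n".join(lines)


print(build_system_prompt(EXTRA_TOOLS))
```

The agent loop itself does not need to change, because it already looks tools up by name in a dictionary.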
The reason building from scratch is worth it: when you understand this loop, every agent product and framework suddenly makes sense. You can read our reviews of production agents and recognize exactly what’s happening inside them.
