Projects with the OpenAI Agents SDK

Learn to build autonomous AI apps with OpenAI's Agents SDK. Create goal-oriented systems like multi-agent coordination for tutors, weather, and travel agents.

Building Intelligent Applications with OpenAI’s Agents SDK

One of the most exciting developments in AI recently has been OpenAI’s Agents SDK. This powerful framework allows developers to create intelligent, autonomous agents that can perform complex tasks with minimal human intervention. At a recent OpenAI Application Explorers meetup, Godfrey Nolan, President at RIIS, showed the group a wide array of tasks you can tackle with agents.

In this article, based on that presentation, we’ll explore what makes the Agents SDK special and how it differs from traditional LLMs, and we’ll walk through practical examples of sophisticated AI applications you can build with it: homework tutors, weather services, and a custom travel agent. As usual, you can follow along with the video or the written version below.

Understanding the Difference: LLMs vs Agents

What sets the Agents SDK apart is its ability to not just respond to user queries but to take action and achieve specific goals. This makes it particularly well-suited for building applications like virtual assistants, travel agents, customer service bots, and more.

To understand why agents represent such a significant step forward, it’s important to distinguish between traditional Large Language Models (LLMs) and agent-based systems:

LLMs:

Conversational only
Wait for user input
Have a simple memory model

Agents:

Goal-oriented
Feature looping capability to refine their approach
Remember context throughout the interaction
Are autonomous
Can take action

The fundamental difference is that while an LLM waits for a user to ask a question and then provides a response, an agent actively works to achieve a specific goal. An agent can leverage tools to search the web, access APIs, use computer functions, or even control physical devices to accomplish its tasks.

Here’s a simple breakdown of how agents operate:

The user asks a question
The LLM processes the question
If the LLM needs external information, it activates its tools and tasks
The agent uses tools (web search, API calls, file search, computer use)
The tools provide observations back to the LLM
The LLM formulates a response based on these observations
The agent delivers the response to the user

This loop represents the essence of what makes agents so powerful: they can continuously work toward a goal by leveraging external resources until the task is complete.
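The loop described above can be sketched in plain Python. This is only an illustration of the control flow, not how the SDK is implemented: `fake_model`, `run_agent_loop`, and the `TOOLS` registry are hypothetical names, and the "model" is a hard-coded stand-in so the sketch runs without an API key.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[[str], str]] = {
    "get_weather": lambda city: f"Sunny in {city}",
}


def fake_model(question: str, observations: list[str]) -> dict:
    """Stand-in for the LLM: request a tool first, then answer from what came back."""
    if not observations:
        return {"action": "tool", "tool": "get_weather", "argument": "Detroit"}
    return {"action": "final", "answer": f"Here is what I found: {observations[-1]}"}


def run_agent_loop(question: str, max_turns: int = 5) -> str:
    """Loop until the model produces a final answer or the turn budget runs out."""
    observations: list[str] = []
    for _ in range(max_turns):
        decision = fake_model(question, observations)
        if decision["action"] == "final":
            return decision["answer"]
        # The model asked for a tool: run it and feed the observation back in.
        tool = TOOLS[decision["tool"]]
        observations.append(tool(decision["argument"]))
    raise RuntimeError("No final answer within the turn budget")


answer = run_agent_loop("What is the weather in Detroit?")
print(answer)
```

The `max_turns` cap is a design choice worth keeping in any real loop, since it stops an agent that never reaches a final answer from calling tools indefinitely.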

Hello World with Agents SDK

Let’s start with a basic example to demonstrate how to create a simple agent using the OpenAI Agents SDK. This “Hello World” program will ask the agent to write a haiku about recursion in programming:

from agents import Agent, Runner
agent = Agent(name="Assistant", instructions="You are a helpful assistant")
result = Runner.run_sync(agent, "Write a haiku about recursion in programming.")
print(result.final_output)

This code demonstrates the minimal requirements for creating an agent. We import the necessary modules, define an agent with a name and instructions, run the agent with a specific input, and print the response.

To use this code, you’ll need to install the OpenAI Agents SDK and set up your environment:

Create a virtual environment
Install the SDK: pip install openai-agents
Set your OpenAI API key in your environment variables
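Before running any of the examples, it can help to confirm that the key is actually visible to Python. The following is a small sketch; `missing_settings` is a hypothetical helper name, and `OPENAI_API_KEY` is the environment variable the SDK reads by default.

```python
import os


def missing_settings(required: list[str], env: dict[str, str]) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Check the current process environment before creating any agents.
missing = missing_settings(["OPENAI_API_KEY"], dict(os.environ))
if missing:
    print("Set these environment variables before running:", ", ".join(missing))
```

The same helper can be reused later for the other keys this article relies on, such as WEATHER_API_KEY.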

When executed, this simple program will return a haiku about recursion, demonstrating the agent’s basic capabilities. Not a bad haiku for its first try. If you go to the OpenAI platform under Dashboard>Logs, you can find the call we just executed.

Additionally, OpenAI has now made it easier to ‘trace’ issues for our agent workflows under Dashboard>Traces.

Given the complexity of additional calls and failure points of an agent, this is an invaluable tool for debugging.

As you can see, the breakdown is very similar to the Network tab in most browser inspection tools, complete with call times. Let’s look at a failure example:

Here, you can easily see that it broke at get_flight, after being handed off to the ‘Airline agent’.

Exploring Agent Components

At its most basic level, an agent consists of several key components:

Instructions: Guidelines that define the agent’s purpose and behavior
Models: Core intelligence capable of reasoning, making decisions, and processing different modalities
Tools: External capabilities the agent can use to accomplish tasks
Handoffs: The ability to transfer tasks between specialized agents
Guardrails: Safety measures to prevent unwanted behaviors
Tracing/Logging: Monitoring capabilities to understand agent actions
Orchestration: Develop, deploy, monitor, and improve agents.
Voice Agents and Audio and Speech: Create agents that can understand audio and respond back in natural language
Knowledge and Memory: Augment agents with external persistent knowledge
Computer Use: Have the agent control your computer, like navigating directories and files.

Let’s take a closer look at guardrails, because they matter for more than blocking bad language or other low-risk problems. If you are using agents to purchase items or make important business decisions on your behalf, guardrails are what stop the agent workflow before it goes too far. If you’ve ever experienced unexpected charges from an AWS instance for going over capacity, you know the anxiety of an unexpected bill from an automated system. Guardrails help prevent scenarios like that with your agents.
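To make the purchasing scenario concrete, here is a minimal sketch of the kind of deterministic check a guardrail could wrap. It is plain Python rather than SDK code: `SpendCheck` and `check_purchase` are hypothetical names, and the `tripwire_triggered` field is named to echo the flag used in the guardrail example later in this article.

```python
from dataclasses import dataclass


@dataclass
class SpendCheck:
    """Result of a budget check: whether to halt the workflow, and why."""
    tripwire_triggered: bool
    reason: str


def check_purchase(amount: float, spent_so_far: float, limit: float) -> SpendCheck:
    """Trip when the amount is invalid or would push total spending over the limit."""
    if amount <= 0:
        return SpendCheck(True, "amount must be positive")
    if spent_so_far + amount > limit:
        return SpendCheck(True, f"would exceed the limit of {limit:.2f}")
    return SpendCheck(False, "within budget")


print(check_purchase(40.0, 50.0, 100.0))  # stays within the 100.00 limit
print(check_purchase(60.0, 50.0, 100.0))  # would bring the total to 110.00
```

A hard-coded rule like this complements an LLM-based guardrail: the arithmetic is never left to the model's interpretation.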

Building a Multi-Agent System with Guardrails

Let’s examine a more complex example than Hello World. We’ll create a homework tutoring system with specialized agents for different subjects. This example will demonstrate multiple agents, handoffs, guardrails, and structured output types.

We’re going to implement both a Math Tutor and a History Tutor agent. The guardrail in this case checks whether the user is asking an actual homework question. The Triage Agent runs that guardrail on the incoming question and then directs it to the appropriate tutor. Note that guardrails are still interpreted by the LLM, so be as explicit as possible when setting them up.

First, we’ll define our data model for the guardrail’s output:

from agents import Agent, InputGuardrail, GuardrailFunctionOutput, Runner
from pydantic import BaseModel
import asyncio

class HomeworkOutput(BaseModel):
    is_homework: bool    
    reasoning: str

Next, we’ll create our specialized agents - a guardrail agent to ensure queries are about homework, and two tutor agents for different subjects:

guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking about homework.",
    output_type=HomeworkOutput,
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
)

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
)

Then we’ll create our guardrail function and triage agent:

async def homework_guardrail(ctx, agent, input_data):
    result = await Runner.run(guardrail_agent, input_data, context=ctx.context)
    final_output = result.final_output_as(HomeworkOutput)
    return GuardrailFunctionOutput(
        output_info=final_output,
        tripwire_triggered=not final_output.is_homework,
    )
    
triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[history_tutor_agent, math_tutor_agent],
    input_guardrails=[
        InputGuardrail(guardrail_function=homework_guardrail),
    ],
)

Finally, we’ll create our main function to run the system:

async def main():
    # A homework question: passes the guardrail and is handed off to the History Tutor.
    result = await Runner.run(triage_agent, "who was the first president of the united states?")
    print(result.final_output)

    # A non-homework question: uncomment to see the guardrail trip.
    # result = await Runner.run(triage_agent, "what is life")
    # print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

In this system:

The user asks a question (e.g., “Who was the first president of the United States?”)
The triage agent receives the question and checks if it’s homework-related using the guardrail
If it passes the guardrail, the triage agent determines whether to send it to the math or history tutor
The appropriate tutor responds to the question

We’ve left a failure case in the commented-out lines. If the workflow is working as intended, the vague philosophical question “what is life” should trigger the guardrail. This guardrail prevents irrelevant questions from consuming resources and keeps the system focused on its intended purpose.
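When an input guardrail trips, the SDK raises an InputGuardrailTripwireTriggered exception from Runner.run rather than returning a result, so the failure case is easier to observe if you catch it. The sketch below shows that pattern with stand-ins so it runs offline: `answer_or_refuse`, `TripwireStandIn`, and the two fake run functions are hypothetical names, not part of the SDK.

```python
import asyncio
from types import SimpleNamespace


class TripwireStandIn(Exception):
    """Stand-in for the SDK's InputGuardrailTripwireTriggered, so this runs offline."""


async def answer_or_refuse(run, tripwire_exc, refusal: str) -> str:
    """Await run(); return its final_output, or the refusal text if the guardrail trips."""
    try:
        result = await run()
        return result.final_output
    except tripwire_exc:
        return refusal


async def fake_run_ok():
    # Mimics a run result object that exposes .final_output.
    return SimpleNamespace(final_output="George Washington")


async def fake_run_blocked():
    # Mimics a run that is stopped by the guardrail.
    raise TripwireStandIn()


REFUSAL = "Sorry, I can only help with homework questions."
ok = asyncio.run(answer_or_refuse(fake_run_ok, TripwireStandIn, REFUSAL))
blocked = asyncio.run(answer_or_refuse(fake_run_blocked, TripwireStandIn, REFUSAL))
print(ok)
print(blocked)
```

With the SDK installed, the same helper could be called with `lambda: Runner.run(triage_agent, "what is life")` as `run` and `InputGuardrailTripwireTriggered` (imported from `agents`) as `tripwire_exc`.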

Remember when we brought up traces earlier? As you add complexity to your agents, traces become invaluable. In the trace for this run, we can see that the Guardrail check runs before the Triage agent hands the task off to the History tutor.

Tools: Connecting Agents to the World

One of the most powerful features of the Agents SDK is the ability to give agents access to external tools. OpenAI supports several types of tools:

Function Calling: Allows agents to call functions that can interact with APIs
Web Search: Enables agents to search the Internet for information
File Search/Embeddings: Lets agents access and query documents and files
Computer Use: Allows agents to control a computer (mouse, keyboard, etc.)

Let’s look at a practical example of function calling by creating a weather agent that can access real-time weather data:

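import asyncio
import os

import requests
from agents import Agent, Runner, function_tool


@function_tool
def get_weather(city: str) -> dict:
    """Return the current weather for a city from the OpenWeatherMap API."""
    api_key = os.getenv("WEATHER_API_KEY")
    url = "https://api.openweathermap.org/data/2.5/weather"
    # units=imperial makes the API report temperature in degrees Fahrenheit.
    params = {"q": city, "appid": api_key, "units": "imperial"}
    resp = requests.get(url, params=params, timeout=10)
    resp.raise_for_status()  # Fail loudly on HTTP errors such as an unknown city.
    data = resp.json()
    return {
        "city": data["name"],
        "temp_fahrenheit": data["main"]["temp"],
        "conditions": data["weather"][0]["description"],
        "humidity": data["main"]["humidity"],
    }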

This function allows our agent to call the OpenWeatherMap API to get current weather information. The @function_tool decorator makes this function available to our agents.
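To build some intuition for what a decorator like @function_tool has to do, here is a simplified sketch that derives a tool description from a function's name, docstring, and type hints. It is an illustration under assumptions, not the SDK's implementation: `describe_tool` and `JSON_TYPES` are hypothetical names, and the output shape is only loosely modeled on a JSON-schema style tool definition.

```python
import inspect
from typing import Callable

# Hypothetical mapping from Python annotations to JSON-schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func: Callable) -> dict:
    """Build a tool description from a function's name, docstring, and parameters."""
    params = {
        name: {"type": JSON_TYPES.get(param.annotation, "string")}
        for name, param in inspect.signature(func).parameters.items()
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        # For simplicity, every parameter is treated as required.
        "parameters": {"type": "object", "properties": params, "required": list(params)},
    }


def get_weather(city: str) -> dict:
    """Return current weather for a city."""
    return {"city": city}


schema = describe_tool(get_weather)
print(schema)
```

This is why clear parameter names, type hints, and a docstring matter on tool functions: they are the raw material for what the model is told about the tool.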

Now we can create a multilingual weather system with specialized agents:

agent = Agent(
    name="Weather agent",
    instructions="You can only provide weather information.",
    tools=[get_weather],
)
spanish_agent = Agent(
    name="Spanish agent",
    instructions="You only speak Spanish.",
    tools=[get_weather]
)
english_agent = Agent(
    name="English agent",
    instructions="You only speak English",
    tools=[get_weather]
)
triage_agent = Agent(
    name="Triage agent",
    instructions="Handoff to the appropriate agent based on the language of the request.",
    handoffs=[spanish_agent, english_agent]
)

In this system:

The triage agent determines the language of the user’s query
It hands off to either the Spanish or English agent
The appropriate agent calls the OpenWeatherMap API to get current conditions
The OpenWeatherMap API response is put into a structured object
The agent responds in the user’s preferred language with up-to-date weather information using the structured data as its reference
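The "structured object" step can be tested without calling the API. The sketch below pulls the same four fields that get_weather reads out of an OpenWeatherMap-style payload and fails with a readable error when the payload has a different shape. `WeatherReport` and `parse_weather` are hypothetical names, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WeatherReport:
    city: str
    temperature: float
    conditions: str
    humidity: int


def parse_weather(payload: dict) -> WeatherReport:
    """Pick the fields the weather tool uses out of an OpenWeatherMap-style payload."""
    try:
        return WeatherReport(
            city=payload["name"],
            temperature=float(payload["main"]["temp"]),
            conditions=payload["weather"][0]["description"],
            humidity=int(payload["main"]["humidity"]),
        )
    except (KeyError, IndexError, TypeError) as exc:
        # Error responses lack these fields, so surface one clear error instead.
        raise ValueError(f"Unexpected weather payload: {exc!r}") from exc


# Made-up sample values, shaped like the fields used by get_weather.
sample = {
    "name": "Detroit",
    "main": {"temp": 71.6, "humidity": 40},
    "weather": [{"description": "clear sky"}],
}
report = parse_weather(sample)
print(report)
```

Keeping the parsing separate from the HTTP call makes it straightforward to unit test the tool's output shape.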

Now let’s call all of that from our main():

async def main():
    result = await Runner.run(triage_agent, input="Hola, ¿cómo estás? ¿Puedes darme el clima para Detroit?")
    # result = await Runner.run(triage_agent, input="What is the weather like in Detroit?")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

For testing purposes, we’ve included two prompt inputs, one in Spanish, and one in English. If everything is working correctly, the appropriate language agent should respond to you with the weather in Detroit. If you get a response in the wrong language, remember, you can always review the trace.

This example demonstrates how agents can use external APIs to access real-time data that isn’t included in their training data, making them much more useful for practical applications, but this is just the beginning of what’s possible.

Building a Travel Agent

Taking the concept further, we can build a more complex application like a virtual travel agent. It can be tedious sifting through the available flights and hotels when you know you have certain preferred destinations and airlines, and tolerances for accommodations and prices. This example combines multiple capabilities to create a system that can help users plan trips by searching for flights, finding accommodations, and creating itineraries, removing all that busy work.

First, we’ll set up our connections to travel APIs:

import os
import asyncio
from agents import Agent, Runner, function_tool, WebSearchTool
from amadeus import Client, ResponseError
from dotenv import load_dotenv
load_dotenv()
amadeus = Client(
    client_id=os.getenv("AMADEUS_CLIENT_ID"),
    client_secret=os.getenv("AMADEUS_CLIENT_SECRET")
)
CITY_TO_IATA = {
    "Houston": "IAH",
    "Detroit": "DTW",
    "New York": "JFK",
    "Los Angeles": "LAX",
    "Paris": "PAR",
    # …
    
}

We will be using Amadeus as our all-in-one travel API. There are other APIs, including ones specific to hotels, airlines, and other services, but for this example Amadeus covers all our bases. Make sure to get your own API key and secret to test this example.

Next, we’ll create functions to access flight and hotel information:

@function_tool
def get_flight(city: str) -> dict:
    """Return the first flight offer from Detroit (DTW) to the given city."""
    dest_code = CITY_TO_IATA.get(city)
    response = amadeus.shopping.flight_offers_search.get(
        originLocationCode='DTW',
        destinationLocationCode=dest_code,  # Pass the code itself, not a set containing it
        departureDate='2025-05-19',  # Replace with a future date when you run this
        adults=1)
    first_offer = response.data[0]  # Get the first flight result
    segments = first_offer['itineraries'][0]['segments']
    return {
        "carrier": first_offer['validatingAirlineCodes'][0],
        "departure": segments[0]['departure']['at'],
        "arrival": segments[-1]['arrival']['at'],  # Last segment, so nonstop flights work too
        "duration": first_offer['itineraries'][0]['duration'],
        "cost": first_offer['price']['grandTotal']
    }
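@function_tool
def get_hotel(city: str) -> dict:
    """Return the first hotel listed for the given city."""
    dest_code = CITY_TO_IATA.get(city)
    response = amadeus.reference_data.locations.hotels.by_city.get(cityCode=dest_code)
    # 'name' is the hotel's name; 'chainCode' is the code of the chain it belongs to.
    return {
        "hotelName": response.data[0]['name'],
        "chainCode": response.data[0]['chainCode']
    }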

These functions are based on the examples on the Amadeus GitHub, which has some fantastic examples to pull from.
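One detail worth handling in both tools is that CITY_TO_IATA.get(city) returns None for any city that is not in the table, and that None would then be passed on to the API. A hedged sketch of a stricter lookup follows; `lookup_iata` is a hypothetical helper, and the table contains only the entries shown earlier in this article.

```python
# Same entries as the CITY_TO_IATA table shown earlier.
CITY_TO_IATA = {
    "Houston": "IAH",
    "Detroit": "DTW",
    "New York": "JFK",
    "Los Angeles": "LAX",
    "Paris": "PAR",
}

# Lowercased index so "paris" and " Paris " both match.
_BY_LOWER = {name.lower(): code for name, code in CITY_TO_IATA.items()}


def lookup_iata(city: str) -> str:
    """Return the code for a city, or raise a readable error for unknown cities."""
    code = _BY_LOWER.get(city.strip().lower())
    if code is None:
        known = ", ".join(sorted(CITY_TO_IATA))
        raise ValueError(f"No IATA code for {city!r}. Known cities: {known}")
    return code


print(lookup_iata(" paris "))
```

Raising a descriptive error gives the agent something useful to relay to the user, instead of an opaque API failure further down the call chain.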

Then we’ll create specialized agents for different travel tasks:

agent = Agent(
    name="Travel agent",
    instructions="You can only provide flights and hotel information.",
    tools=[get_hotel, get_flight],
)
hotel_agent = Agent(
    name="Hotel agent",
    instructions="You are a helpful hotel booking assistant only. You know nothing about flight information.",
    tools=[get_hotel]
)
airline_agent = Agent(
    name="Airline agent",
    instructions="You are only a helpful flight assistant. You know nothing about booking hotels.",
    tools=[get_flight]
)
itinerary_agent = Agent(
    name="Itinerary agent",
    instructions="You are only a helpful assistant that creates itineraries for visits to cities. You know nothing about booking hotels or flights.",
    tools=[WebSearchTool()]
)
triage_agent = Agent(
    name="Triage agent",
    instructions="Hand off to the appropriate agent based on whether the user wants flight information, hotel booking information, or an itinerary.",
    handoffs=[airline_agent, hotel_agent, itinerary_agent]
)

Let’s go through all our agents. Like before, we have our triage agent, which acts as the switchboard router for the others. Our hotel and airline agents do what you’d expect, each focusing on its own domain and calling the get_hotel and get_flight functions from above. We also have a generalized travel agent that can only provide hotel and flight information but doesn’t handle booking; note that it is not in the triage agent’s handoffs list, so the triage agent never routes to it in this example. Then we have an itinerary agent that can plan a trip with all of the important details. It uses the OpenAI WebSearchTool(), which searches the web to bolster the information.

The system can respond to queries like “Please can you give me flight information from DTW to IAH?”, “Where can I stay in Paris?”, or “Give me an itinerary for a weekend trip in Dublin?” by activating the appropriate specialized agent and returning relevant information.

Let’s finish this off with our main() again:

async def main():
    result = await Runner.run(triage_agent, input="Please can you give me flight information from DTW to IAH?")
    # result = await Runner.run(triage_agent, input="Where can I stay in Paris?")
    # result = await Runner.run(triage_agent, input="Give me an itinerary for a weekend trip in Dublin?")

    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

Great, so we can see in the response that it is giving us flight information from DTW to IAH.

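If we switch to the next commented-out input, we should get a response more like this instead:

You can stay at the **RT Hotel** located at **Mercure Paris Lafayette**

Note that “RT” here is the value of the Amadeus chainCode field and “Mercure Paris Lafayette” is the value of its name field, so the hotel itself is the Mercure Paris Lafayette.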

This is really just the beginning. Currently, each tool only pulls back the first result, but we can set up our agent directives so that they pull multiple possible answers, and even spin off new agents to make selections based on criteria and guardrails.
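As a first step in that direction, a tool could return several ranked options instead of only the first one. The sketch below sorts offers by price and summarizes the cheapest few. `cheapest_offers` is a hypothetical helper, and the sample offers are made-up data that mimic only the fields get_flight already reads.

```python
def cheapest_offers(offers: list[dict], limit: int = 3) -> list[dict]:
    """Return up to `limit` offer summaries, cheapest first."""
    # grandTotal is handled as a string here, so convert it before comparing.
    ranked = sorted(offers, key=lambda offer: float(offer["price"]["grandTotal"]))
    return [
        {
            "carrier": offer["validatingAirlineCodes"][0],
            "duration": offer["itineraries"][0]["duration"],
            "cost": offer["price"]["grandTotal"],
        }
        for offer in ranked[:limit]
    ]


# Made-up offers for illustration only.
sample_offers = [
    {"validatingAirlineCodes": ["AA"], "itineraries": [{"duration": "PT5H10M"}], "price": {"grandTotal": "310.40"}},
    {"validatingAirlineCodes": ["DL"], "itineraries": [{"duration": "PT3H05M"}], "price": {"grandTotal": "289.99"}},
    {"validatingAirlineCodes": ["UA"], "itineraries": [{"duration": "PT3H20M"}], "price": {"grandTotal": "415.00"}},
]
top_two = cheapest_offers(sample_offers, limit=2)
print(top_two)
```

Returning a short ranked list gives the agent room to apply the user's stated preferences, such as a preferred airline or a price ceiling, when it writes the final answer.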

The Web Search Tool

One of the most powerful tools available in the Agents SDK is the WebSearchTool(). This tool allows agents to search the internet for information, greatly expanding their capabilities beyond their training data.

For example, our itinerary agent can use the web search tool to find the latest information about attractions in Dublin:

result = await Runner.run(triage_agent, input="Give me an itinerary for a weekend trip in Dublin?")

When processing this request, the itinerary agent can search for current information about popular activities, operating hours of museums, local events, and more. This real-time data access makes the agent much more useful for planning purposes.

That looks like an event-filled day.

So you can see how you can start building teams of agents that together act as a single assistant for you across a whole domain.

Conclusion

After completing this tutorial, you've gained a solid understanding of OpenAI's Agents SDK and how it turns traditional LLMs into goal-oriented autonomous systems. You've learned to distinguish between LLMs and agents, and you've built basic and complex multi-agent systems with specialized roles, guardrails, and API integrations. From homework tutors to multilingual weather services and travel planning systems, you now have the knowledge to leverage external tools, implement handoffs between agents, and use web search capabilities to create powerful AI applications. Take this foundation and build something amazing that solves real problems in your domain.