Running agents

Understand the agent loop, continuation strategies, and streaming.

Defining an agent is only the setup step. The runtime questions are what a single run does, how the next turn continues, and how the workflow behaves when it pauses for approvals or tool work.

The agent loop

One SDK run is one application-level turn. The runner keeps looping until it reaches a real stopping point:

  1. Call the current agent’s model with the prepared input.
  2. Inspect the model output.
  3. If the model produced tool calls, execute them and continue.
  4. If the model handed off to another specialist, switch agents and continue.
  5. If the model produced a final answer with no more tool work, return a result.

That loop is the core concept behind the SDK. Tools, handoffs, approvals, and streaming all build on top of it rather than replacing it.
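
The five steps above can be sketched as a plain loop. This is a toy model under stated assumptions, not the SDK's implementation: `ToyAgent`, `ModelOutput`, `runLoop`, and `lookup` are hypothetical names, the "model" is an ordinary synchronous function, and the `maxTurns` default of 10 is an arbitrary choice for the sketch.

```typescript
// Toy model of the agent loop. Every name here is hypothetical and
// none of it is part of the SDK.
type ModelOutput =
  | { kind: "tool_calls"; calls: { name: string; args: string }[] }
  | { kind: "handoff"; to: string }
  | { kind: "final"; text: string };

interface ToyAgent {
  name: string;
  // Stand-in for a model call: it sees the transcript so far.
  model: (transcript: string[]) => ModelOutput;
  tools: Record<string, (args: string) => string>;
}

// Fail loudly on an unknown agent or tool instead of calling undefined.
function lookup<T>(table: Record<string, T>, key: string, what: string): T {
  const value = table[key];
  if (value === undefined) throw new Error(`Unknown ${what}: ${key}`);
  return value;
}

function runLoop(
  agents: Record<string, ToyAgent>,
  start: string,
  input: string,
  maxTurns = 10,
): { finalOutput: string; lastAgent: string; transcript: string[] } {
  let current = lookup(agents, start, "agent");
  const transcript = [`user: ${input}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    // Steps 1 and 2: call the current agent's model and inspect the output.
    const output = current.model(transcript);
    if (output.kind === "tool_calls") {
      // Step 3: execute the tool calls, record the results, and continue.
      for (const call of output.calls) {
        const tool = lookup(current.tools, call.name, "tool");
        transcript.push(`tool ${call.name}: ${tool(call.args)}`);
      }
      continue;
    }
    if (output.kind === "handoff") {
      // Step 4: switch agents and continue.
      transcript.push(`handoff: ${current.name} -> ${output.to}`);
      current = lookup(agents, output.to, "agent");
      continue;
    }
    // Step 5: a final answer with no more tool work ends the run.
    transcript.push(`assistant: ${output.text}`);
    return { finalOutput: output.text, lastAgent: current.name, transcript };
  }
  throw new Error(`No final answer after ${maxTurns} model calls`);
}

const agents: Record<string, ToyAgent> = {
  triage: {
    name: "triage",
    model: () => ({ kind: "handoff", to: "weather" }),
    tools: {},
  },
  weather: {
    name: "weather",
    // Call the tool once, then answer from the tool result in the transcript.
    model: (transcript) => {
      const toolLine = transcript.find((line) => line.startsWith("tool "));
      return toolLine === undefined
        ? { kind: "tool_calls", calls: [{ name: "forecast", args: "Paris" }] }
        : { kind: "final", text: `Forecast: ${toolLine.split(": ")[1]}` };
    },
    tools: { forecast: (city) => `sunny in ${city}` },
  },
};

const loopResult = runLoop(agents, "triage", "Weather in Paris?");
console.log(loopResult.finalOutput); // Forecast: sunny in Paris
```

In this sketch, one call to `runLoop` covers a handoff, a tool call, and a final answer, which is why a single run still counts as one application-level turn.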

Choose one conversation strategy

There are four common ways to carry state into the next turn:

Strategy | Where state lives | Best for | What you pass on the next turn
-------- | ----------------- | -------- | ------------------------------
result.history | Your application | Small chat loops and maximum control | The replay-ready history
session | Your storage plus the SDK | Persistent chat state, resumable runs, and storage you control | The same session
conversationId | OpenAI Conversations API | Shared server-managed state across workers or services | The same conversation ID and only the new turn
previousResponseId | OpenAI Responses API | The lightest server-managed continuation from one response to the next | The last response ID and only the new turn

In most applications, pick one strategy per conversation. Mixing local replay with server-managed state can duplicate context unless you are deliberately reconciling both layers.
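
To make the "what you pass on the next turn" column concrete, here is a minimal sketch of how the next-turn payload differs per strategy. It does not call the SDK: `Continuation`, `NextTurn`, and `nextTurn` are hypothetical names, the session is reduced to a string ID for illustration, and `conv_example` is a made-up identifier.

```typescript
// Toy illustration only; these types are assumptions, not SDK types.
type Message = { role: "user" | "assistant"; content: string };

type Continuation =
  | { strategy: "history"; history: Message[] }
  | { strategy: "session"; sessionId: string }
  | { strategy: "conversationId"; conversationId: string }
  | { strategy: "previousResponseId"; previousResponseId: string };

interface NextTurn {
  input: Message[];
  options: Record<string, string>;
}

function nextTurn(state: Continuation, newText: string): NextTurn {
  const newMessage: Message = { role: "user", content: newText };
  switch (state.strategy) {
    case "history":
      // Local replay: the application resends everything plus the new turn.
      return { input: [...state.history, newMessage], options: {} };
    case "session":
      // The session supplies earlier items, so only the new turn is sent.
      return { input: [newMessage], options: { session: state.sessionId } };
    case "conversationId":
      // Server-managed state: send the ID and only the new turn.
      return {
        input: [newMessage],
        options: { conversationId: state.conversationId },
      };
    case "previousResponseId":
      // Chain from the last response: send its ID and only the new turn.
      return {
        input: [newMessage],
        options: { previousResponseId: state.previousResponseId },
      };
  }
}

const earlier: Message[] = [
  { role: "user", content: "What city is the Golden Gate Bridge in?" },
  { role: "assistant", content: "San Francisco." },
];

const replayTurn = nextTurn(
  { strategy: "history", history: earlier },
  "What state is it in?",
);
const serverTurn = nextTurn(
  { strategy: "conversationId", conversationId: "conv_example" },
  "What state is it in?",
);
console.log(replayTurn.input.length, serverTurn.input.length); // 3 1
```

Sending the three-item replay input together with a conversation ID would hand the earlier messages to the model twice, which is the duplication risk described above.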

Persist multi-turn state with sessions
import { Agent, MemorySession, run } from "@openai/agents";

const agent = new Agent({
  name: "Tour guide",
  instructions: "Answer with compact travel facts.",
});

const session = new MemorySession();

const firstTurn = await run(
  agent,
  "What city is the Golden Gate Bridge in?",
  { session },
);
console.log(firstTurn.finalOutput);

const secondTurn = await run(agent, "What state is it in?", { session });
console.log(secondTurn.finalOutput);

Sessions are the best default when you want durable memory, resumable approval flows, or storage that your application controls.

Continue with server-managed state
import { Agent, run } from "@openai/agents";
import OpenAI from "openai";

const agent = new Agent({
  name: "Assistant",
  instructions: "Reply very concisely.",
});

const client = new OpenAI();
const { id: conversationId } = await client.conversations.create({});

const first = await run(agent, "What city is the Golden Gate Bridge in?", {
  conversationId,
});
console.log(first.finalOutput);

const second = await run(agent, "What state is it in?", {
  conversationId,
});
console.log(second.finalOutput);

Use conversationId when multiple systems should share one named conversation. Use previousResponseId when you only need to chain each response to the one before it and want the lightest server-managed option.

Stream runs incrementally

Streaming uses the same agent loop and the same state strategies. The only difference is that you consume events while the run is still happening.

Stream a run as text arrives
import { Agent, run } from "@openai/agents";

const agent = new Agent({
  name: "Planet guide",
  instructions: "Answer with short facts.",
});

const stream = await run(agent, "Give me three short facts about Saturn.", {
  stream: true,
});

for await (const event of stream) {
  if (
    event.type === "raw_model_stream_event" &&
    event.data.type === "response.output_text.delta"
  ) {
    process.stdout.write(event.data.delta);
  }
}

await stream.completed;
console.log("\nFinal:", stream.finalOutput);

Three practical rules matter:

  • Wait for the stream to finish before treating the run as settled.
  • If the run pauses for approval, resolve interruptions and resume from state rather than starting a fresh user turn.
  • If you cancel a stream mid-turn, resume the unfinished turn from state if you want the same turn to continue later.

Handle pauses and failures deliberately

Two broad classes of non-happy-path outcomes matter:

  • Runtime or validation failures such as max-turn limits, guardrail exceptions, or tool errors.
  • Expected pauses such as human approval requests, where the run is intentionally interrupted and should later resume from the same state.

Treat approvals as paused runs, not as new turns. That distinction keeps turn counts, history, and server-managed continuation IDs consistent.
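
The difference between resuming a paused run and starting a new turn can be sketched with a toy state object. This is an illustration under stated assumptions rather than the SDK's run state: `ToyRunState`, `ToyOutcome`, `startTurn`, and `resumeTurn` are hypothetical, the tool name `issue_refund` is invented, and JSON serialization stands in for whatever storage holds the state while a human decides.

```typescript
// Toy model of a pause for approval. None of these names come from the SDK.
interface ToyRunState {
  turn: number; // application-level turn this state belongs to
  history: string[];
  pendingTool: string | null; // tool call waiting for a human decision
  decision: "approved" | "rejected" | null;
}

type ToyOutcome =
  | { status: "paused"; interruption: string; state: ToyRunState }
  | { status: "done"; finalOutput: string; state: ToyRunState };

function startTurn(turn: number, userText: string): ToyOutcome {
  // The toy "model" always requests a sensitive tool, so the run pauses.
  const state: ToyRunState = {
    turn,
    history: [`user: ${userText}`],
    pendingTool: "issue_refund",
    decision: null,
  };
  return { status: "paused", interruption: "issue_refund", state };
}

function resumeTurn(state: ToyRunState): ToyOutcome {
  if (state.pendingTool === null) {
    throw new Error("Nothing to resume");
  }
  if (state.decision === null) {
    // Still waiting for a decision: the run stays paused.
    return { status: "paused", interruption: state.pendingTool, state };
  }
  const finalOutput =
    state.decision === "approved"
      ? "Refund issued."
      : "Refund was not approved.";
  const settled: ToyRunState = {
    turn: state.turn, // same turn: no new user message is added
    history: [
      ...state.history,
      `tool ${state.pendingTool}: ${state.decision}`,
      `assistant: ${finalOutput}`,
    ],
    pendingTool: null,
    decision: null,
  };
  return { status: "done", finalOutput, state: settled };
}

const paused = startTurn(1, "Please refund order 42.");
// Store the state while a reviewer decides, then restore it with the decision.
const stored = JSON.stringify(paused.state);
const restored: ToyRunState = { ...JSON.parse(stored), decision: "approved" };
const resumed = resumeTurn(restored);
console.log(resumed.status, resumed.state.turn); // done 1
```

Because the resumed run reuses the stored state, the history still holds a single user message and the turn counter stays at 1. Starting a fresh turn instead would add a second user message and leave the original tool call unresolved.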

Next steps

Once the runtime loop is clear, move to the guide that matches the next workflow boundary you need to design.