GPT-5.4 pro Model | OpenAI API

GPT-5.4 pro

A version of GPT-5.4 that produces smarter and more precise responses.

Reasoning	Highest
Speed	Slowest
Price (input • output)	$30 • $180
Input	Text, image
Output	Text

GPT-5.4 pro uses more compute to think harder and provide consistently better answers.

GPT-5.4 pro is available only in the Responses API. This enables support for multi-turn model interactions before the model responds to an API request, as well as other advanced API features in the future. Because GPT-5.4 pro is designed to tackle tough problems, some requests may take several minutes to finish; to avoid timeouts, try using background mode. GPT-5.4 pro supports the following reasoning.effort values: medium, high, xhigh.
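As a rough illustration of the paragraph above, the sketch below builds a request body for the Responses endpoint with a validated reasoning effort and background mode turned on. The helper name `build_background_request` is hypothetical, and the exact field layout (`input`, `reasoning.effort`, `background`) is an assumption about the request shape rather than something this page spells out. Nothing is sent over the network.

```python
import json

# Effort levels listed for GPT-5.4 pro in the text above.
SUPPORTED_EFFORTS = ("medium", "high", "xhigh")


def build_background_request(prompt: str, effort: str = "high") -> dict:
    """Build a JSON body for POST v1/responses (hypothetical helper)."""
    if effort not in SUPPORTED_EFFORTS:
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return {
        "model": "gpt-5.4-pro",
        "input": prompt,
        "reasoning": {"effort": effort},
        # Assumption: background mode is requested with a boolean flag,
        # so a long-running request does not hold the connection open.
        "background": True,
    }


body = build_background_request("Prove that sqrt(2) is irrational.", effort="xhigh")
print(json.dumps(body, indent=2))
```

Validating the effort value on the client side avoids spending a round trip on a request the model would reject, which matters more when each request can run for minutes.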

1,050,000 context window

128,000 max output tokens

Aug 31, 2025 knowledge cutoff

Reasoning token support

Pricing

Pricing is based on the number of tokens used, or other metrics based on the model type. For tool-specific models, like search and computer use, there’s a fee per tool call. See details in the pricing page.

Text tokens (per 1M tokens)

Input	$30.00
Output	$180.00

Quick comparison (input price per 1M tokens)

GPT-5.4 pro	$30.00
o3-pro	$20.00
GPT-5.4	$2.50

For models with a 1.05M context window (GPT-5.4 and GPT-5.4 pro), prompts with more than 272K input tokens are priced at 2x the input rate and 1.5x the output rate for the full session. This applies to standard, batch, and flex processing.

Regional processing (data residency) endpoints are charged a 10% uplift for GPT-5.4 and GPT-5.4 pro.
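The pricing rules above can be turned into a small cost estimate. This is a sketch under stated assumptions, not an official billing formula: it uses the standard GPT-5.4 pro rates from this page, treats "272K" as 272,000 tokens, and assumes the 10% regional uplift is applied multiplicatively on top of any long-context multiplier. The function name `estimate_cost` is hypothetical.

```python
# Standard GPT-5.4 pro rates from this page, in USD per 1M tokens.
INPUT_PER_M = 30.00
OUTPUT_PER_M = 180.00

# Assumption: "272K" means 272,000 tokens.
LONG_CONTEXT_THRESHOLD = 272_000
REGIONAL_UPLIFT = 0.10


def estimate_cost(input_tokens: int, output_tokens: int, regional: bool = False) -> float:
    """Estimate the USD cost of one request (illustrative only)."""
    input_rate, output_rate = INPUT_PER_M, OUTPUT_PER_M
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        # Long prompts: 2x input and 1.5x output for the whole request.
        input_rate *= 2.0
        output_rate *= 1.5
    cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    if regional:
        # Assumption: the uplift stacks on top of the long-context rates.
        cost *= 1 + REGIONAL_UPLIFT
    return round(cost, 4)


print(estimate_cost(100_000, 10_000))                 # below the threshold
print(estimate_cost(300_000, 10_000))                 # above the threshold
print(estimate_cost(300_000, 10_000, regional=True))  # with regional uplift
```

Under these assumptions, a request with 300,000 input tokens costs more than four times a 100,000-token request with the same output, because the higher rates apply to every token and not only to those past the threshold.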

Modalities

Text	Input and output
Image	Input only
Audio	Not supported
Video	Not supported

Endpoints

Chat Completions	v1/chat/completions
Responses	v1/responses
Realtime	v1/realtime
Assistants	v1/assistants
Batch	v1/batch
Fine-tuning	v1/fine-tuning
Embeddings	v1/embeddings
Image generation	v1/images/generations
Videos	v1/videos
Image edit	v1/images/edits
Speech generation	v1/audio/speech
Transcription	v1/audio/transcriptions
Translation	v1/audio/translations
Moderation	v1/moderations
Completions (legacy)	v1/completions

Features

Streaming	Supported
Function calling	Supported
Structured outputs	Not supported
Fine-tuning	Not supported
Distillation	Not supported

Tools

Tools supported by this model when using the Responses API.

Web search	Supported
File search	Supported
Image generation	Supported
Code interpreter	Not supported
Hosted shell	Not supported
Apply patch	Supported
Skills	Not supported
Computer use	Supported
MCP	Supported
Tool search	Supported

Snapshots

Snapshots let you lock in a specific version of the model so that performance and behavior remain consistent. Below is a list of all available snapshots and aliases for GPT-5.4 pro.

gpt-5.4-pro

gpt-5.4-pro-2026-03-05
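To illustrate the difference between the two names above, the following sketch separates an alias from a dated snapshot. It assumes, based only on the two names listed here, that a snapshot is the alias followed by a `-YYYY-MM-DD` suffix. The helper `split_snapshot` is hypothetical.

```python
import re
from datetime import date

# Assumption: a snapshot name is "<alias>-YYYY-MM-DD".
SNAPSHOT_RE = re.compile(r"^(?P<alias>.+)-(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$")


def split_snapshot(name: str):
    """Return (alias, snapshot_date); the date is None for a bare alias."""
    match = SNAPSHOT_RE.match(name)
    if match is None:
        return name, None
    return match["alias"], date(int(match["y"]), int(match["m"]), int(match["d"]))


for model_name in ("gpt-5.4-pro", "gpt-5.4-pro-2026-03-05"):
    alias, pinned = split_snapshot(model_name)
    print(model_name, "->", alias, "pinned" if pinned else "floating alias")
```

A check like this can be useful in configuration validation, for example to warn when a production deployment uses a floating alias where a pinned snapshot was intended.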

Rate limits

Rate limits ensure fair and reliable access to the API by placing specific caps on requests or tokens used within a given time period. Your usage tier determines how high these limits are set and automatically increases as you send more requests and spend more on the API.

Tier	RPM	TPM	Batch queue limit
Free	Not supported
Tier 1	500	30,000	90,000
Tier 2	5,000	450,000	1,350,000
Tier 3	5,000	800,000	50,000,000
Tier 4	10,000	2,000,000	200,000,000
Tier 5	10,000	30,000,000	5,000,000,000
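The table above can be used for rough capacity planning. The sketch below copies the RPM and TPM values from the table and estimates sustainable throughput, assuming that RPM and TPM are the only constraints and that every request uses about the same number of tokens. The function names are hypothetical, and real limits may be enforced differently.

```python
import math

# (RPM, TPM) per usage tier, copied from the table above.
RATE_LIMITS = {
    "Tier 1": (500, 30_000),
    "Tier 2": (5_000, 450_000),
    "Tier 3": (5_000, 800_000),
    "Tier 4": (10_000, 2_000_000),
    "Tier 5": (10_000, 30_000_000),
}


def max_requests_per_minute(tier: str, tokens_per_request: int) -> int:
    """Requests per minute allowed by whichever limit binds first."""
    rpm, tpm = RATE_LIMITS[tier]
    return min(rpm, tpm // tokens_per_request)


def minutes_for_workload(tier: str, total_tokens: int) -> int:
    """Whole minutes of TPM budget needed to process total_tokens."""
    _, tpm = RATE_LIMITS[tier]
    return math.ceil(total_tokens / tpm)


print(max_requests_per_minute("Tier 2", 1_000))
print(minutes_for_workload("Tier 1", 1_050_000))
```

With these numbers, the token limit is usually the binding constraint for GPT-5.4 pro: at Tier 2, requests of about 1,000 tokens are capped by TPM well before the RPM limit is reached.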
