GPT-5.4
Default
Our most capable model for professional work
Reasoning: Highest
Speed: Medium
Price: $2.50 input • $15.00 output (per 1M tokens)
Input: Text, image
Output: Text
GPT-5.4 is our frontier model for complex professional work. Learn more in our latest model guide. The `reasoning.effort` parameter supports: none (default), low, medium, high, and xhigh.
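A minimal sketch of how the effort levels above might be validated before a request is sent. The helper name `build_request` is hypothetical, and the nested `reasoning: {"effort": ...}` layout is an assumption inferred from the `reasoning.effort` name; check it against the API reference before relying on it. The block only builds a request body and makes no network call.

```python
import json

# Effort levels listed on this page; "none" is the documented default.
SUPPORTED_EFFORTS = ("none", "low", "medium", "high", "xhigh")


def build_request(prompt: str, effort: str = "none") -> dict:
    """Build a request body for v1/responses (hypothetical helper).

    Assumes the effort level is nested under a "reasoning" object.
    """
    if effort not in SUPPORTED_EFFORTS:
        raise ValueError(
            f"unsupported reasoning effort {effort!r}; "
            f"expected one of {SUPPORTED_EFFORTS}"
        )
    return {
        "model": "gpt-5.4",
        "input": prompt,
        "reasoning": {"effort": effort},
    }


if __name__ == "__main__":
    body = build_request("Summarize this contract.", effort="high")
    print(json.dumps(body, indent=2))
```

Rejecting unknown values client-side keeps a typo such as `"max"` from reaching the API as a failed request.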
1,050,000 context window
128,000 max output tokens
Aug 31, 2025 knowledge cutoff
Reasoning token support
Pricing
Pricing is based on the number of tokens used, or other metrics based on the model type. For tool-specific models, like search and computer use, there’s a fee per tool call. See details in the pricing page.
| Text tokens (per 1M tokens) | Price |
|---|---|
| Input | $2.50 |
| Cached input | $0.25 |
| Output | $15.00 |
Quick comparison (input price per 1M tokens)
| Model | Input |
|---|---|
| GPT-5.4 | $2.50 |
| GPT-5.2 | $1.75 |
| GPT-5 mini | $0.25 |
For models with a 1.05M context window (GPT-5.4 and GPT-5.4 pro), prompts with >272K input tokens are priced at 2x input and 1.5x output for the full session for standard, batch, and flex.
Regional processing (data residency) endpoints are charged a 10% uplift for GPT-5.4 and GPT-5.4 pro.
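The pricing rules above can be sketched as a small estimator. The function name `estimate_cost_usd` is hypothetical, and three points are assumptions rather than statements from this page: that ">272K" means more than 272,000 tokens counted across cached and uncached input, that cached input receives the same 2x multiplier as uncached input, and that the 10% regional uplift is applied on top of the long-context multipliers.

```python
# Standard GPT-5.4 prices in USD per 1M text tokens, taken from this page.
PRICE_PER_MILLION = {"input": 2.50, "cached_input": 0.25, "output": 15.00}

LONG_CONTEXT_THRESHOLD = 272_000  # assumption: ">272K" read as > 272,000 tokens
LONG_INPUT_MULTIPLIER = 2.0       # 2x input above the threshold
LONG_OUTPUT_MULTIPLIER = 1.5      # 1.5x output above the threshold
REGIONAL_UPLIFT = 1.10            # 10% uplift for data residency endpoints


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    regional: bool = False,
) -> float:
    """Estimate the cost of one request (hypothetical helper).

    input_tokens counts uncached input only; cached tokens are passed
    separately and billed at the cached rate.
    """
    total_input = input_tokens + cached_input_tokens
    long_context = total_input > LONG_CONTEXT_THRESHOLD
    in_mult = LONG_INPUT_MULTIPLIER if long_context else 1.0
    out_mult = LONG_OUTPUT_MULTIPLIER if long_context else 1.0

    cost = (
        input_tokens * PRICE_PER_MILLION["input"] * in_mult
        # Assumption: cached input is scaled by the input multiplier too.
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"] * in_mult
        + output_tokens * PRICE_PER_MILLION["output"] * out_mult
    ) / 1_000_000

    if regional:
        cost *= REGIONAL_UPLIFT
    return cost


if __name__ == "__main__":
    print(f"{estimate_cost_usd(100_000, 10_000):.4f}")
    print(f"{estimate_cost_usd(300_000, 10_000):.4f}")
```

Under these assumptions, a 100,000-token prompt with 10,000 output tokens costs $0.40, while a 300,000-token prompt with the same output costs $1.725 because both multipliers apply to the whole request.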
Modalities
| Modality | Support |
|---|---|
| Text | Input and output |
| Image | Input only |
| Audio | Not supported |
| Video | Not supported |
Endpoints
Chat Completions
v1/chat/completions
Responses
v1/responses
Realtime
v1/realtime
Assistants
v1/assistants
Batch
v1/batch
Fine-tuning
v1/fine-tuning
Embeddings
v1/embeddings
Image generation
v1/images/generations
Videos
v1/videos
Image edit
v1/images/edits
Speech generation
v1/audio/speech
Transcription
v1/audio/transcriptions
Translation
v1/audio/translations
Moderation
v1/moderations
Completions (legacy)
v1/completions
Features
| Feature | Support |
|---|---|
| Streaming | Supported |
| Function calling | Supported |
| Structured outputs | Supported |
| Fine-tuning | Not supported |
| Distillation | Supported |
Tools
Tools supported by this model when using the Responses API.
| Tool | Support |
|---|---|
| Web search | Supported |
| File search | Supported |
| Image generation | Supported |
| Code interpreter | Supported |
| Hosted shell | Supported |
| Apply patch | Supported |
| Skills | Supported |
| Computer use | Supported |
| MCP | Supported |
| Tool search | Supported |
Snapshots
Snapshots let you lock in a specific version of the model so that performance and behavior remain consistent. Below is a list of all available snapshots and aliases for GPT-5.4.
gpt-5.4 (alias, currently points to gpt-5.4-2026-03-05)
gpt-5.4-2026-03-05 (snapshot)
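As a small illustration of pinning, the sketch below resolves an alias to a dated snapshot before the model name is used in a request. The `ALIASES` table and the `pinned_model` helper are hypothetical; the mapping assumes that `gpt-5.4` currently points to `gpt-5.4-2026-03-05`, as the listing above suggests, and it would need updating by hand if the alias moves.

```python
# Assumed alias-to-snapshot mapping, based on the listing on this page.
ALIASES = {"gpt-5.4": "gpt-5.4-2026-03-05"}


def pinned_model(name: str) -> str:
    """Return the dated snapshot for an alias, or the name unchanged."""
    return ALIASES.get(name, name)


if __name__ == "__main__":
    print(pinned_model("gpt-5.4"))
```

Sending the dated name keeps behavior stable across alias updates, at the cost of having to opt in to newer snapshots explicitly.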
Rate limits
Rate limits ensure fair and reliable access to the API by placing specific caps on requests or tokens used within a given time period. Your usage tier determines how high these limits are set and automatically increases as you send more requests and spend more on the API.
| Tier | RPM | TPM | Batch queue limit |
|---|---|---|---|
| Free | Not supported | | |
| Tier 1 | 500 | 500,000 | 1,500,000 |
| Tier 2 | 5,000 | 1,000,000 | 3,000,000 |
| Tier 3 | 5,000 | 2,000,000 | 100,000,000 |
| Tier 4 | 10,000 | 4,000,000 | 200,000,000 |
| Tier 5 | 15,000 | 40,000,000 | 15,000,000,000 |
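To make the interaction between RPM and TPM concrete, the sketch below copies the table into a lookup and computes how many requests per minute a tier can sustain for a given average request size. The names `RATE_LIMITS` and `max_requests_per_minute` are hypothetical, and the calculation assumes that both caps apply at once, so the tighter one wins.

```python
# Limits copied from the table above; the Free tier is omitted because
# the model is not supported there.
RATE_LIMITS = {
    "Tier 1": {"rpm": 500, "tpm": 500_000, "batch_queue": 1_500_000},
    "Tier 2": {"rpm": 5_000, "tpm": 1_000_000, "batch_queue": 3_000_000},
    "Tier 3": {"rpm": 5_000, "tpm": 2_000_000, "batch_queue": 100_000_000},
    "Tier 4": {"rpm": 10_000, "tpm": 4_000_000, "batch_queue": 200_000_000},
    "Tier 5": {"rpm": 15_000, "tpm": 40_000_000, "batch_queue": 15_000_000_000},
}


def max_requests_per_minute(tier: str, avg_tokens_per_request: int) -> int:
    """Sustainable requests per minute under both the RPM and TPM caps."""
    if tier not in RATE_LIMITS:
        raise ValueError(f"no rate limits listed for {tier!r}")
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    limits = RATE_LIMITS[tier]
    # Whichever cap is reached first bounds the request rate.
    return min(limits["rpm"], limits["tpm"] // avg_tokens_per_request)


if __name__ == "__main__":
    for tier in RATE_LIMITS:
        print(tier, max_requests_per_minute(tier, 2_000))
```

For example, at an average of 2,000 tokens per request, Tier 1 is bounded by TPM at 250 requests per minute rather than by its 500 RPM cap.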