GPT-5.4 pro uses more compute to think harder and provide consistently better answers.
GPT-5.4 pro is available only in the Responses API, which enables support for multi-turn model interactions before responding to API requests and for other advanced API features in the future. Because GPT-5.4 pro is designed to tackle tough problems, some requests may take several minutes to finish; to avoid timeouts, try using background mode. GPT-5.4 pro supports the following `reasoning.effort` values: `medium`, `high`, and `xhigh`.
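As a rough illustration of the background-mode advice above, the sketch below builds a request body and polls until the response leaves a pending state. It is a minimal sketch under stated assumptions, not a reference implementation: the model identifier `gpt-5.4-pro`, the pending status names, and the helper names `build_background_request` and `wait_for_response` are all hypothetical and not taken from this page. The `retrieve` argument is any callable that returns an object with a `status` attribute.

```python
import time
from typing import Any, Callable, Dict

# Effort levels this page lists for GPT-5.4 pro.
SUPPORTED_EFFORTS = ("medium", "high", "xhigh")

# Statuses treated as "still running". Assumed names; check the API reference.
PENDING_STATUSES = ("queued", "in_progress")


def build_background_request(prompt: str, effort: str = "high",
                             model: str = "gpt-5.4-pro") -> Dict[str, Any]:
    """Build keyword arguments for a background Responses API request.

    The default model string is an assumed identifier, not confirmed here.
    """
    if effort not in SUPPORTED_EFFORTS:
        raise ValueError(f"effort must be one of {SUPPORTED_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "input": prompt,
        "background": True,  # return immediately and finish asynchronously
        "reasoning": {"effort": effort},
    }


def wait_for_response(retrieve: Callable[[str], Any], response_id: str,
                      poll_seconds: float = 5.0,
                      timeout_seconds: float = 1800.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> Any:
    """Poll retrieve(response_id) until its status is no longer pending."""
    deadline = clock() + timeout_seconds
    while True:
        response = retrieve(response_id)
        if response.status not in PENDING_STATUSES:
            return response  # completed, failed, cancelled, and so on
        if clock() >= deadline:
            raise TimeoutError(f"{response_id} still pending after {timeout_seconds}s")
        sleep(poll_seconds)
```

With the official `openai` Python package, this would presumably be wired up as `resp = client.responses.create(**build_background_request("..."))` followed by `wait_for_response(client.responses.retrieve, resp.id)`. Passing `retrieve` in as a callable keeps the polling loop independent of any particular client.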
For models with a 1.05M context window (GPT-5.4 and GPT-5.4 pro), prompts with more than 272K input tokens are priced at 2x the input rate and 1.5x the output rate for the full session. This applies to standard, batch, and flex.
Regional processing (data residency) endpoints are charged a 10% uplift for GPT-5.4 and GPT-5.4 pro.
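The two pricing rules above can be sketched as a small cost estimator. This is only an illustration under assumptions that this page does not confirm: "272K" is read as 272,000 tokens, the long-context multipliers and the 10% regional uplift are assumed to compound multiplicatively, and the estimate covers a single request rather than a full session. Base prices are not stated here, so they are passed in as parameters; `estimate_cost` is a hypothetical helper name.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # assumes "272K" means 272,000 input tokens
LONG_CONTEXT_INPUT_MULTIPLIER = 2.0
LONG_CONTEXT_OUTPUT_MULTIPLIER = 1.5
REGIONAL_UPLIFT = 0.10  # 10% uplift for regional processing endpoints


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float,
                  regional: bool = False) -> float:
    """Estimate the cost of one request in the same currency as the prices."""
    # The long-context rate applies only when the prompt exceeds the threshold.
    long_context = input_tokens > LONG_CONTEXT_THRESHOLD
    input_multiplier = LONG_CONTEXT_INPUT_MULTIPLIER if long_context else 1.0
    output_multiplier = LONG_CONTEXT_OUTPUT_MULTIPLIER if long_context else 1.0

    cost = (
        input_tokens * input_price_per_million * input_multiplier
        + output_tokens * output_price_per_million * output_multiplier
    ) / 1_000_000

    # Assumption: the regional uplift is applied on top of the multipliers.
    if regional:
        cost *= 1.0 + REGIONAL_UPLIFT
    return cost
```

Under these assumptions, a prompt of exactly 272,000 input tokens is still billed at the base rate, because the rule applies only to prompts above that size.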
| Tier | RPM | TPM | Batch queue limit |
|---|---|---|---|
| Free | Not supported | | |
| Tier 1 | 500 | 30,000 | 90,000 |
| Tier 2 | 5,000 | 450,000 | 1,350,000 |
| Tier 3 | 5,000 | 800,000 | 50,000,000 |
| Tier 4 | 10,000 | 2,000,000 | 200,000,000 |
| Tier 5 | 10,000 | 30,000,000 | 5,000,000,000 |