Cyber Safety

GPT-5.3-Codex is the first model we are treating as High cybersecurity capability under our Preparedness Framework, which requires additional safeguards. These safeguards include training the model to refuse clearly malicious requests like stealing credentials.

In addition to safety training, automated classifier-based monitors detect signals of suspicious cyber activity and route high-risk traffic to a less cyber-capable model (GPT-5.2). We expect a very small portion of traffic to be affected by these mitigations, and are working to refine our policies, classifiers, and in-product notifications.

Why we’re doing this

Over recent months, we’ve seen meaningful gains in model performance on cybersecurity tasks, benefiting both developers and security professionals. As our models improve at cybersecurity-related tasks like vulnerability discovery, we’re taking a precautionary approach: expanding protections and enforcement to support legitimate research while slowing misuse.

Cyber capabilities are inherently dual-use. The same knowledge and techniques that underpin important defensive work — penetration testing, vulnerability research, high-scale scanning, malware analysis, and threat intelligence — can also enable real-world harm.

These capabilities and techniques need to be available and easier to use in contexts where they can be used to improve security. Our Trusted Access for Cyber pilot enables individuals and organizations to continue using models for potentially high-risk cybersecurity activity without disruption.

How it works

Developers and security professionals doing cybersecurity-related work or similar activity that could be mistaken by automated detection systems may have requests rerouted to GPT-5.2 as a fallback. We expect a very small portion of traffic to affected by mitigations, and are actively working to calibrate our policies and classifiers.

The latest alpha version of the Codex CLI includes in-product messaging for when requests are rerouted. This messaging will be supported in all clients in the next few days.

Accounts impacted by mitigations can regain access to GPT-5.3-Codex by joining the Trusted Access program below.

We recognize that joining Trusted Access may not be a good fit for everyone, so we plan to move from account-level safety checks to request-level checks in most cases as we scale these mitigations and strengthen cyber resilience.

Trusted Access for Cyber

We are piloting “trusted access” which allows developers to retain advanced capabilities while we continue to calibrate policies and classifiers for general availability. Our goal is for very few users to need to join Trusted Access for Cyber.

To use models for potentially high-risk cybersecurity work:

Users can verify their identity at chatgpt.com/cyber
Enterprises can request trusted access for their entire team by default through their OpenAI representative

Security researchers and teams who may need access to even more cyber-capable or permissive models to accelerate legitimate defensive work can express interest in our invite-only program⁠. Users with trusted access must still abide by our Usage Policies⁠ and Terms of Use⁠.

False positives

Legitimate or non-cybersecurity activity may occasionally be flagged. When rerouting occurs, the responding model will be visible in API request logs and in with an in-product notice in the CLI, soon all surfaces. If you’re experiencing rerouting that you believe is incorrect, please report via /feedback for false positives.

Get started

Core concepts

Agents

Tools

Run and scale

Evaluation

Realtime API

Model optimization

Specialized models

Coding agents

Going live

Legacy APIs

Resources

Getting Started

Using Codex

Configuration

Administration

Automation

Learn

Releases

Core Concepts

Plan

Build

Deploy

Guides

Resources

Guides

Commerce specs

Product feeds

Categories

Topics

Topics

Contribute

Recent

Topics

Why we’re doing this

How it works

Trusted Access for Cyber

False positives