Scaling

Shipping with Codex

DevDay talk on building, testing, and delivering products with Codex.

video
Rate limits guide

Guide to understanding and managing rate limits

guide
Balance accuracy, latency, and cost

Talk on optimizing AI systems for accuracy, speed, and cost.

video
DevDay — optimization breakout

DevDay session discussing optimization of models and prompts.

video
Evals Best Practices

Best practices for designing and running evals.

guide
Getting Started with Evals

Step-by-step guide to setting up your first eval.

guide
Graders

Guide to using graders for evaluations.

guide
Keep costs low & accuracy high

Guide on balancing cost efficiency with model accuracy.

guide
Latency optimization guide

Best practices for reducing model response latency.

guide
Launch apps with evaluations

Video on incorporating evals when deploying AI products.

video
LLM correctness and consistency

Best practices for achieving accurate and consistent model outputs.

guide
Model optimization guide

Guide on optimizing OpenAI models for performance and cost.

guide
Predicted outputs guide

Guide to understanding and using predicted outputs.

guide
Production best practices

Guide on best practices for running AI applications in production.

guide
Prompt Optimizer

Guide to refining prompts with the Prompt Optimizer.

guide
Reinforcement fine-tuning overview

Guide to fine-tuning techniques based on reinforcement learning.

guide
Working with the Evals API

Guide to building evaluations with the Evals API.

guide
Eval Driven System Design - From Prototype to Production

Cookbook for eval-driven design of a receipt parsing automation workflow.

cookbook
Reinforcement Fine-Tuning for Conversational Reasoning with the OpenAI API

Cookbook on reinforcement fine-tuning for conversational reasoning, using HealthBench evaluations.

cookbook
Evals API Use-case - Responses Evaluation

Cookbook on evaluating new models against stored Responses API logs.

cookbook