The image generation tool allows you to generate images from a text prompt and, optionally, image inputs. It leverages GPT Image models (gpt-image-1, gpt-image-1-mini, and gpt-image-1.5) and automatically optimizes text inputs for improved performance.
To learn more about image generation, refer to our dedicated image generation guide.
Usage
When you include the image_generation tool in your request, the model can decide when and how to generate images as part of the conversation, using your prompt and any provided image inputs.
The image_generation_call tool call result will include a base64-encoded image.
```python
from openai import OpenAI
import base64

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))
```

You can provide input images using file IDs or base64 data.
To force the image generation tool call, you can set the parameter tool_choice to {"type": "image_generation"}.
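As a sketch, a request that always triggers the tool could be assembled as follows. The prompt text is hypothetical, and the client call is commented out because it requires an API key:

```python
# Sketch: request arguments that force the image generation tool call.
request_kwargs = {
    "model": "gpt-5",
    "input": "Draw a lighthouse at dusk",  # hypothetical prompt
    "tools": [{"type": "image_generation"}],
    "tool_choice": {"type": "image_generation"},
}

# from openai import OpenAI
# response = OpenAI().responses.create(**request_kwargs)
```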
Tool options
You can configure the following output options as parameters for the image generation tool:
- Size: Image dimensions (e.g., 1024x1024, 1024x1536)
- Quality: Rendering quality (e.g., low, medium, high)
- Format: File output format
- Compression: Compression level (0-100%) for JPEG and WebP formats
- Background: Transparent or opaque
- Action: Whether the request should automatically choose, generate, or edit an image
size, quality, and background support the auto option, where the model will automatically select the best option based on the prompt.
For more details on available options, refer to the image generation guide.
For gpt-image-1.5 and chatgpt-image-latest used with the Responses API, you can optionally set the action parameter (auto, generate, or edit) to control whether the request generates a new image or edits one already in context. The default is auto, which lets the model decide; set action explicitly only if your use case requires always editing or always generating.
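Putting the options above together, a tool configuration might look like the following sketch. The specific values chosen here are illustrative; check the image generation guide for the full set of accepted values, and note that the commented-out call requires an API key:

```python
# Sketch: an image_generation tool entry using the options listed above.
image_tool = {
    "type": "image_generation",
    "size": "1024x1536",
    "quality": "high",
    "background": "auto",  # size, quality, and background also accept "auto"
    "action": "edit",      # force editing (gpt-image-1.5 / chatgpt-image-latest)
}

# from openai import OpenAI
# response = OpenAI().responses.create(
#     model="gpt-5",
#     input="Edit the previous image to add falling snow",
#     tools=[image_tool],
# )
```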
Revised prompt
When using the image generation tool, the mainline model (e.g. gpt-4.1) will automatically revise your prompt for improved performance.
You can access the revised prompt in the revised_prompt field of the image generation call:
```json
{
  "id": "ig_123",
  "type": "image_generation_call",
  "status": "completed",
  "revised_prompt": "A gray tabby cat hugging an otter. The otter is wearing an orange scarf. Both animals are cute and friendly, depicted in a warm, heartwarming style.",
  "result": "..."
}
```

Prompting tips
Image generation works best when you use terms like “draw” or “edit” in your prompt.
For example, if you want to combine images, instead of saying “combine” or “merge”, you can say something like “edit the first image by adding this element from the second image”.
Multi-turn editing
You can iteratively edit images by referencing previous response or image IDs. This allows you to refine images across multiple turns in a conversation.
```python
from openai import OpenAI
import base64

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    image_base64 = image_data[0]
    with open("cat_and_otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

# Follow up
response_fwup = client.responses.create(
    model="gpt-5",
    previous_response_id=response.id,
    input="Now make it look realistic",
    tools=[{"type": "image_generation"}],
)

image_data_fwup = [
    output.result
    for output in response_fwup.output
    if output.type == "image_generation_call"
]

if image_data_fwup:
    image_base64 = image_data_fwup[0]
    with open("cat_and_otter_realistic.png", "wb") as f:
        f.write(base64.b64decode(image_base64))
```
Alternatively, you can reference a previous image generation call by its ID instead of the response ID:

```python
from openai import OpenAI
import base64

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

image_generation_calls = [
    output
    for output in response.output
    if output.type == "image_generation_call"
]

image_data = [output.result for output in image_generation_calls]

if image_data:
    image_base64 = image_data[0]
    with open("cat_and_otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

# Follow up, referencing the previous image generation call by ID
response_fwup = client.responses.create(
    model="gpt-5",
    input=[
        {
            "role": "user",
            "content": [{"type": "input_text", "text": "Now make it look realistic"}],
        },
        {
            "type": "image_generation_call",
            "id": image_generation_calls[0].id,
        },
    ],
    tools=[{"type": "image_generation"}],
)

image_data_fwup = [
    output.result
    for output in response_fwup.output
    if output.type == "image_generation_call"
]

if image_data_fwup:
    image_base64 = image_data_fwup[0]
    with open("cat_and_otter_realistic.png", "wb") as f:
        f.write(base64.b64decode(image_base64))
```

Streaming
The image generation tool supports streaming partial images as the final result is being generated. This provides faster visual feedback for users and improves perceived latency.
You can set the number of partial images (1-3) with the partial_images parameter.
```python
from openai import OpenAI
import base64

client = OpenAI()

stream = client.images.generate(
    prompt="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",
    model="gpt-image-1",
    stream=True,
    partial_images=2,
)

for event in stream:
    if event.type == "image_generation.partial_image":
        idx = event.partial_image_index
        image_base64 = event.b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(f"river{idx}.png", "wb") as f:
            f.write(image_bytes)
```

Supported models
The image generation tool is supported for the following models:
gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, gpt-5, gpt-5-nano, gpt-5.2
The model used for the image generation process is always a GPT Image model (gpt-image-1.5, gpt-image-1, or gpt-image-1-mini), but these models are not valid values for the model field in the Responses API. Use a text-capable mainline model (for example, gpt-4.1 or gpt-5) with the hosted image_generation tool.
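Since every example above repeats the same extraction pattern, you may want to factor it into a small helper. The sketch below uses a stand-in dataclass in place of real SDK output objects, on the assumption that they expose the same type and result attributes as in the examples:

```python
import base64
from dataclasses import dataclass

# Stand-in for an SDK output item; real objects come from response.output.
@dataclass
class FakeOutput:
    type: str
    result: str = ""

def extract_images(outputs):
    """Collect base64 image payloads from image_generation_call outputs."""
    return [o.result for o in outputs if o.type == "image_generation_call"]

outputs = [
    FakeOutput(type="message"),
    FakeOutput(
        type="image_generation_call",
        result=base64.b64encode(b"png-bytes").decode(),
    ),
]
images = extract_images(outputs)
```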