Create image

images.generate(**kwargs) -> ImagesResponse { created, background, data, 4 more }

POST /images/generations

Creates an image given a prompt.

Parameters

prompt: String

A text description of the desired image(s). The maximum length is 32000 characters for the GPT image models, 1000 characters for dall-e-2 and 4000 characters for dall-e-3.
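As a rough illustration of the limits above, a client could check prompt length before sending a request. The helper names `max_prompt_length` and `prompt_fits?` are hypothetical and not part of the SDK, and the sketch assumes that any model other than dall-e-2 or dall-e-3 is a GPT image model.

```ruby
# Hypothetical helper: character limit for a model, per the prompt description.
# Assumption: anything that is not dall-e-2 or dall-e-3 is a GPT image model.
def max_prompt_length(model)
  case model.to_s
  when "dall-e-2" then 1000
  when "dall-e-3" then 4000
  else 32_000
  end
end

# Hypothetical helper: true when the prompt fits within the model's limit.
def prompt_fits?(prompt, model)
  prompt.length <= max_prompt_length(model)
end
```

For example, `prompt_fits?("A cute baby sea otter", "dall-e-2")` would be true, while a 1001-character prompt would not fit dall-e-2.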

background: :transparent | :opaque | :auto

Sets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of transparent, opaque, or auto (default value). When auto is used, the model automatically determines the best background for the image.

gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead.

If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.

One of the following:

:transparent

:opaque

:auto
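The constraints on background can be sketched as a pre-flight check. This is only an illustration: `background_problems` is a hypothetical helper, not an SDK method, and it encodes just the two rules stated above (the models that reject transparent, and the formats that can carry transparency).

```ruby
# Hypothetical helper: lists problems with a background/model/format combination.
# Returns an empty array when none of the rules stated above is violated.
def background_problems(model:, background: "auto", output_format: "png")
  return [] unless background.to_s == "transparent"

  problems = []
  # Models documented above as rejecting transparent backgrounds.
  if %w[gpt-image-2 gpt-image-2-2026-04-21].include?(model.to_s)
    problems << "#{model} does not support transparent backgrounds"
  end
  # Only png and webp are listed as able to carry transparency.
  unless %w[png webp].include?(output_format.to_s)
    problems << "output_format #{output_format} cannot store transparency"
  end
  problems
end
```

A caller might run this before building the request and fall back to opaque or auto when the list is not empty.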

model: String | ImageModel

The model to use for image generation. One of dall-e-2, dall-e-3, or a GPT image model (gpt-image-1, gpt-image-1-mini, gpt-image-1.5, gpt-image-2, or gpt-image-2-2026-04-21). Defaults to dall-e-2 unless a parameter specific to the GPT image models is used.

One of the following:

String = String

ImageModel = :"gpt-image-1" | :"gpt-image-1-mini" | :"gpt-image-2" | 5 more

One of the following:

:"gpt-image-1"

:"gpt-image-1-mini"

:"gpt-image-2"

:"gpt-image-2-2026-04-21"

:"gpt-image-1.5"

:"chatgpt-image-latest"

:"dall-e-2"

:"dall-e-3"

moderation: :low | :auto

Control the content-moderation level for images generated by the GPT image models. Must be either low for less restrictive filtering or auto (default value).

One of the following:

:low

:auto

n: Integer

The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported.

minimum: 1

maximum: 10

output_compression: Integer

The compression level (0-100%) for the generated images. This parameter is only supported for the GPT image models with the webp or jpeg output formats, and defaults to 100.

output_format: :png | :jpeg | :webp

The format in which the generated images are returned. This parameter is only supported for the GPT image models. Must be one of png, jpeg, or webp.

One of the following:

:png

:jpeg

:webp

partial_images: Integer

The number of partial images to generate. This parameter is used for streaming responses that return partial images. Value must be between 0 and 3. When set to 0, the response will be a single image sent in one streaming event.

Note that the final image may be sent before the full number of partial images are generated if the full image is generated more quickly.

maximum: 3

minimum: 0

quality: :standard | :hd | :low | 3 more

The quality of the image that will be generated.

auto (default value) will automatically select the best quality for the given model.
high, medium and low are supported for the GPT image models.
hd and standard are supported for dall-e-3.
standard is the only option for dall-e-2.

One of the following:

:standard

:hd

:low

:medium

:high

:auto
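The per-model quality rules above could be captured in a small lookup. The helpers `allowed_qualities` and `quality_allowed?` are hypothetical, and the sketch assumes that auto is accepted for every model because it is described as the default; that reading is not confirmed by the text for dall-e-2.

```ruby
# Hypothetical helper: quality values accepted for a model, per the list above.
# Assumption: auto is accepted everywhere because it is the documented default.
def allowed_qualities(model)
  case model.to_s
  when "dall-e-2" then %w[auto standard]
  when "dall-e-3" then %w[auto hd standard]
  else %w[auto high medium low] # treated as a GPT image model
  end
end

# Hypothetical helper: true when the quality value is listed for the model.
def quality_allowed?(quality, model)
  allowed_qualities(model).include?(quality.to_s)
end
```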

response_format: :url | :b64_json

The format in which generated images with dall-e-2 and dall-e-3 are returned. Must be one of url or b64_json. URLs are only valid for 60 minutes after the image has been generated. This parameter isn’t supported for the GPT image models, which always return base64-encoded images.

One of the following:

:url

:b64_json
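Because returned URLs expire, a client may want to know when a link stops working. The sketch below is an illustration only: `url_expires_at` and `url_expired?` are hypothetical helpers, and it assumes the 60-minute window starts at the response's created timestamp, which the text does not state exactly.

```ruby
# Hypothetical helper: time at which a returned URL stops being valid.
# Assumption: the 60-minute window starts at the response's `created` timestamp.
def url_expires_at(created, lifetime_seconds: 60 * 60)
  Time.at(created).utc + lifetime_seconds
end

# Hypothetical helper: true once the URL's validity window has passed.
def url_expired?(created, now: Time.now)
  now >= url_expires_at(created)
end
```

Downloading the image promptly, or requesting b64_json instead, avoids depending on the expiry window at all.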

size: String | :auto | :"1024x1024" | :"1536x1024" | 5 more

The size of the generated images. For gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16 and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model’s current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.

One of the following:

String = String

Size = :auto | :"1024x1024" | :"1536x1024" | 5 more

One of the following:

:auto

:"1024x1024"

:"1536x1024"

:"1024x1536"

:"256x256"

:"512x512"

:"1792x1024"

:"1024x1792"

stream: bool

Generate the image in streaming mode. Defaults to false. See the Image generation guide for more information. This parameter is only supported for the GPT image models.

style: :vivid | :natural

The style of the generated images. This parameter is only supported for dall-e-3. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images.

One of the following:

:vivid

:natural

user: String

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

Returns

class ImagesResponse { created, background, data, 4 more }

The response from the image generation endpoint.

created: Integer

The Unix timestamp (in seconds) of when the image was created.

format: unixtime

background: :transparent | :opaque

The background parameter used for the image generation. Either transparent or opaque.

One of the following:

:transparent

:opaque

data: Array[Image { b64_json, revised_prompt, url } ]

The list of generated images.

b64_json: String

The base64-encoded data of the generated image. Returned by default for the GPT image models, and only present if response_format is set to b64_json for dall-e-2 and dall-e-3.

revised_prompt: String

For dall-e-3 only, the revised prompt that was used to generate the image.

url: String

When using dall-e-2 or dall-e-3, the URL of the generated image if response_format is set to url (default value). Unsupported for the GPT image models.

format: uri
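Since GPT image models return base64 data rather than a URL, a common next step is decoding it to a file. The following is a minimal sketch: `save_image` is a hypothetical helper, and it assumes each element of data responds to `b64_json` as described above.

```ruby
require "base64"

# Hypothetical helper: writes one generated image to disk and returns the path.
# Assumption: `image` responds to `b64_json` as described for the data array.
def save_image(image, path)
  raise ArgumentError, "image has no b64_json payload" if image.b64_json.nil?

  # binwrite avoids any newline or encoding translation of the image bytes.
  File.binwrite(path, Base64.decode64(image.b64_json))
  path
end
```

Under these assumptions, something like `save_image(images_response.data.first, "otter.png")` would store the first generated image.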

output_format: :png | :webp | :jpeg

The output format of the image generation. Either png, webp, or jpeg.

One of the following:

:png

:webp

:jpeg

quality: :low | :medium | :high

The quality of the image generated. Either low, medium, or high.

One of the following:

:low

:medium

:high

size: :"1024x1024" | :"1024x1536" | :"1536x1024"

The size of the image generated. Either 1024x1024, 1024x1536, or 1536x1024.

One of the following:

:"1024x1024"

:"1024x1536"

:"1536x1024"

usage: Usage{ input_tokens, input_tokens_details, output_tokens, 2 more}

For gpt-image-1 only, the token usage information for the image generation.

input_tokens: Integer

The number of tokens (images and text) in the input prompt.

input_tokens_details: InputTokensDetails{ image_tokens, text_tokens}

The input tokens detailed information for the image generation.

image_tokens: Integer

The number of image tokens in the input prompt.

text_tokens: Integer

The number of text tokens in the input prompt.

output_tokens: Integer

The number of output tokens generated by the model.

total_tokens: Integer

The total number of tokens (images and text) used for the image generation.

output_tokens_details: OutputTokensDetails{ image_tokens, text_tokens}

The output token details for the image generation.

image_tokens: Integer

The number of image output tokens generated by the model.

text_tokens: Integer

The number of text output tokens generated by the model.

ImageGenStreamEvent = ImageGenPartialImageEvent { b64_json, background, created_at, 5 more } | ImageGenCompletedEvent { b64_json, background, created_at, 5 more }

An event emitted during image generation streaming: either a partial image or the completed final image.

One of the following:

class ImageGenPartialImageEvent { b64_json, background, created_at, 5 more }

Emitted when a partial image is available during image generation streaming.

b64_json: String

Base64-encoded partial image data, suitable for rendering as an image.

background: :transparent | :opaque | :auto

The background setting for the requested image.

One of the following:

:transparent

:opaque

:auto

created_at: Integer

The Unix timestamp when the event was created.

format: unixtime

output_format: :png | :webp | :jpeg

The output format for the requested image.

One of the following:

:png

:webp

:jpeg

partial_image_index: Integer

0-based index for the partial image (streaming).

quality: :low | :medium | :high | :auto

The quality setting for the requested image.

One of the following:

:low

:medium

:high

:auto

size: :"1024x1024" | :"1024x1536" | :"1536x1024" | :auto

The size of the requested image.

One of the following:

:"1024x1024"

:"1024x1536"

:"1536x1024"

:auto

type: :"image_generation.partial_image"

The type of the event. Always image_generation.partial_image.

class ImageGenCompletedEvent { b64_json, background, created_at, 5 more }

Emitted when image generation has completed and the final image is available.

b64_json: String

Base64-encoded image data, suitable for rendering as an image.

background: :transparent | :opaque | :auto

The background setting for the generated image.

One of the following:

:transparent

:opaque

:auto

created_at: Integer

The Unix timestamp when the event was created.

format: unixtime

output_format: :png | :webp | :jpeg

The output format for the generated image.

One of the following:

:png

:webp

:jpeg

quality: :low | :medium | :high | :auto

The quality setting for the generated image.

One of the following:

:low

:medium

:high

:auto

size: :"1024x1024" | :"1024x1536" | :"1536x1024" | :auto

The size of the generated image.

One of the following:

:"1024x1024"

:"1024x1536"

:"1536x1024"

:auto

type: :"image_generation.completed"

The type of the event. Always image_generation.completed.

usage: Usage{ input_tokens, input_tokens_details, output_tokens, total_tokens}

For the GPT image models only, the token usage information for the image generation.

input_tokens: Integer

The number of tokens (images and text) in the input prompt.

input_tokens_details: InputTokensDetails{ image_tokens, text_tokens}

The input tokens detailed information for the image generation.

image_tokens: Integer

The number of image tokens in the input prompt.

text_tokens: Integer

The number of text tokens in the input prompt.

output_tokens: Integer

The number of image tokens in the output image.

total_tokens: Integer

The total number of tokens (images and text) used for the image generation.
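The two event types can be handled with a simple dispatch on type. This sketch does not call the API: `collect_final_image` is a hypothetical helper, and it assumes each event responds to `type`, `b64_json`, and (for partial events) `partial_image_index`, matching the fields listed above.

```ruby
require "base64"

# Hypothetical helper: consumes stream events and returns the decoded bytes
# of the final image, or nil if no completed event was seen.
# Assumption: events expose the fields documented for ImageGenStreamEvent.
def collect_final_image(events)
  final_bytes = nil
  events.each do |event|
    case event.type.to_s
    when "image_generation.partial_image"
      # Partial frames are previews; hand them to the caller's block, if any.
      yield(event.partial_image_index, Base64.decode64(event.b64_json)) if block_given?
    when "image_generation.completed"
      final_bytes = Base64.decode64(event.b64_json)
    end
  end
  final_bytes
end
```

Because the final image may arrive before all requested partial images, the block may be called fewer times than the partial_images value.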

Create image

require "openai"

openai = OpenAI::Client.new(api_key: "My API Key")

images_response = openai.images.generate(prompt: "A cute baby sea otter")

puts(images_response)

{
  "created": 0,
  "background": "transparent",
  "data": [
    {
      "b64_json": "b64_json",
      "revised_prompt": "revised_prompt",
      "url": "https://example.com"
    }
  ],
  "output_format": "png",
  "quality": "low",
  "size": "1024x1024",
  "usage": {
    "input_tokens": 0,
    "input_tokens_details": {
      "image_tokens": 0,
      "text_tokens": 0
    },
    "output_tokens": 0,
    "total_tokens": 0,
    "output_tokens_details": {
      "image_tokens": 0,
      "text_tokens": 0
    }
  }
}

