Create speech

client.Audio.Speech.New(ctx, body) (*Response, error)
POST /audio/speech

Generates audio from the input text.

Returns the audio file content, or a stream of audio events.

Parameters
body AudioSpeechNewParams
Input param.Field[string]

The text to generate audio for. The maximum length is 4096 characters.

maxLength: 4096
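
Text longer than the 4096-character limit has to be split before it is sent. The following is a minimal sketch of one way to do that, not part of the SDK. It assumes "characters" means Unicode code points, and `splitInput` and `maxInputChars` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// maxInputChars mirrors the documented maxLength of 4096 for Input.
const maxInputChars = 4096

// splitInput breaks text into pieces of at most limit characters.
// Assumption: characters are counted as Unicode code points (runes).
// It prefers to cut right after the last space inside each window, and
// falls back to a hard cut when the window contains no space.
func splitInput(text string, limit int) []string {
	if limit <= 0 {
		return nil
	}
	runes := []rune(text)
	var chunks []string
	for len(runes) > limit {
		cut := limit
		// Scan backwards for a space so words are not split when avoidable.
		for i := limit; i > 0; i-- {
			if runes[i-1] == ' ' {
				cut = i
				break
			}
		}
		chunks = append(chunks, string(runes[:cut]))
		runes = runes[cut:]
	}
	if len(runes) > 0 {
		chunks = append(chunks, string(runes))
	}
	return chunks
}

func main() {
	text := strings.Repeat("hello world ", 1000)
	for i, c := range splitInput(text, maxInputChars) {
		fmt.Printf("chunk %d: %d characters\n", i, utf8.RuneCountInString(c))
	}
}
```

Each chunk can then be sent as the Input of a separate request. Concatenating the chunks reproduces the original text, because no characters are dropped at the cut points.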
Model param.Field[SpeechModel]

One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.

string
type SpeechModel string
One of the following:
const SpeechModelTTS1 SpeechModel = "tts-1"
const SpeechModelTTS1HD SpeechModel = "tts-1-hd"
const SpeechModelGPT4oMiniTTS SpeechModel = "gpt-4o-mini-tts"
const SpeechModelGPT4oMiniTTS2025_12_15 SpeechModel = "gpt-4o-mini-tts-2025-12-15"
Voice param.Field[AudioSpeechNewParamsVoiceUnion]

The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.

string
type AudioSpeechNewParamsVoiceString string
One of the following:
const AudioSpeechNewParamsVoiceStringAlloy AudioSpeechNewParamsVoiceString = "alloy"
const AudioSpeechNewParamsVoiceStringAsh AudioSpeechNewParamsVoiceString = "ash"
const AudioSpeechNewParamsVoiceStringBallad AudioSpeechNewParamsVoiceString = "ballad"
const AudioSpeechNewParamsVoiceStringCoral AudioSpeechNewParamsVoiceString = "coral"
const AudioSpeechNewParamsVoiceStringEcho AudioSpeechNewParamsVoiceString = "echo"
const AudioSpeechNewParamsVoiceStringSage AudioSpeechNewParamsVoiceString = "sage"
const AudioSpeechNewParamsVoiceStringShimmer AudioSpeechNewParamsVoiceString = "shimmer"
const AudioSpeechNewParamsVoiceStringVerse AudioSpeechNewParamsVoiceString = "verse"
const AudioSpeechNewParamsVoiceStringMarin AudioSpeechNewParamsVoiceString = "marin"
const AudioSpeechNewParamsVoiceStringCedar AudioSpeechNewParamsVoiceString = "cedar"
type AudioSpeechNewParamsVoiceID struct{…}

Custom voice reference.

ID string

The custom voice ID, e.g. voice_1234.
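
The voice parameter accepts either a built-in voice name or a custom voice object with an id. The sketch below illustrates those two JSON shapes using only the standard library. The `voice` type and `encodeVoice` helper are hypothetical stand-ins, not the SDK's union type, and the SDK's actual serialization is assumed to match the shapes described above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// voice is a hypothetical stand-in for the voice union; it is not the
// library type. Exactly one of Name or ID is expected to be set.
type voice struct {
	Name string // built-in voice, e.g. "alloy"
	ID   string // custom voice ID, e.g. "voice_1234"
}

// MarshalJSON emits a bare JSON string for a built-in voice and an
// object of the form {"id": "..."} for a custom voice.
func (v voice) MarshalJSON() ([]byte, error) {
	if v.ID != "" {
		return json.Marshal(map[string]string{"id": v.ID})
	}
	return json.Marshal(v.Name)
}

// encodeVoice returns the JSON encoding of v as a string.
func encodeVoice(v voice) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	for _, v := range []voice{{Name: "alloy"}, {ID: "voice_1234"}} {
		s, err := encodeVoice(v)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```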

Instructions param.Field[string]optional

Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd.

maxLength: 4096
ResponseFormat param.Field[AudioSpeechNewParamsResponseFormat]optional

The format to return the audio in. Supported formats are mp3, opus, aac, flac, wav, and pcm.

const AudioSpeechNewParamsResponseFormatMP3 AudioSpeechNewParamsResponseFormat = "mp3"
const AudioSpeechNewParamsResponseFormatOpus AudioSpeechNewParamsResponseFormat = "opus"
const AudioSpeechNewParamsResponseFormatAAC AudioSpeechNewParamsResponseFormat = "aac"
const AudioSpeechNewParamsResponseFormatFLAC AudioSpeechNewParamsResponseFormat = "flac"
const AudioSpeechNewParamsResponseFormatWAV AudioSpeechNewParamsResponseFormat = "wav"
const AudioSpeechNewParamsResponseFormatPCM AudioSpeechNewParamsResponseFormat = "pcm"
Speed param.Field[float64]optional

The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.

minimum: 0.25
maximum: 4
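
A caller that takes the speed from user input may want to keep it inside the documented range before building the request. A minimal sketch, assuming that clamping rather than rejecting is the desired behavior; `clampSpeed` and the constants are hypothetical names, not part of the SDK.

```go
package main

import "fmt"

// Bounds and default taken from the Speed parameter description.
const (
	minSpeed     = 0.25
	maxSpeed     = 4.0
	defaultSpeed = 1.0
)

// clampSpeed returns s limited to the range [minSpeed, maxSpeed].
func clampSpeed(s float64) float64 {
	if s < minSpeed {
		return minSpeed
	}
	if s > maxSpeed {
		return maxSpeed
	}
	return s
}

func main() {
	for _, s := range []float64{0.1, defaultSpeed, 1.5, 10} {
		fmt.Printf("requested %.2f, using %.2f\n", s, clampSpeed(s))
	}
}
```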
StreamFormat param.Field[AudioSpeechNewParamsStreamFormat]optional

The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.

const AudioSpeechNewParamsStreamFormatSSE AudioSpeechNewParamsStreamFormat = "sse"
const AudioSpeechNewParamsStreamFormatAudio AudioSpeechNewParamsStreamFormat = "audio"
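
Two of the parameters above depend on the model: Instructions and the sse stream format are both documented as unsupported for tts-1 and tts-1-hd. A client-side check along the following lines could catch such combinations before a request is sent. This is a sketch under stated assumptions: `request`, `isTTS1Family`, and `validate` are hypothetical names, and how the API itself responds to these combinations is not described here.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a hypothetical, simplified view of the parameters that
// interact with the model choice; it is not the SDK's params type.
type request struct {
	Model        string
	Instructions string
	StreamFormat string
}

// isTTS1Family reports whether model is tts-1 or tts-1-hd.
func isTTS1Family(model string) bool {
	return model == "tts-1" || model == "tts-1-hd"
}

// validate rejects the combinations documented as unsupported.
func validate(r request) error {
	if !isTTS1Family(r.Model) {
		return nil
	}
	if r.Instructions != "" {
		return errors.New("instructions are not supported for " + r.Model)
	}
	if r.StreamFormat == "sse" {
		return errors.New("stream format sse is not supported for " + r.Model)
	}
	return nil
}

func main() {
	reqs := []request{
		{Model: "tts-1", StreamFormat: "sse"},
		{Model: "gpt-4o-mini-tts", Instructions: "Speak calmly.", StreamFormat: "sse"},
	}
	for _, r := range reqs {
		if err := validate(r); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("ok:", r.Model)
	}
}
```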
Returns
type AudioSpeechNewResponse interface{…}

Create speech

package main

import (
  "context"
  "fmt"

  "github.com/openai/openai-go"
  "github.com/openai/openai-go/option"
)

func main() {
  // Replace the placeholder with a real API key.
  client := openai.NewClient(
    option.WithAPIKey("My API Key"),
  )
  // Request speech for a short text with the tts-1 model and a built-in voice.
  speech, err := client.Audio.Speech.New(context.TODO(), openai.AudioSpeechNewParams{
    Input: "Hello from the speech endpoint.",
    Model: openai.SpeechModelTTS1,
    Voice: openai.AudioSpeechNewParamsVoiceUnion{
      OfString: openai.String("alloy"),
    },
  })
  if err != nil {
    panic(err.Error())
  }
  // Print the returned response for inspection.
  fmt.Printf("%+v\n", speech)
}