Create translation

TranslationCreateResponse audio().translations().create(TranslationCreateParams params, RequestOptions requestOptions = RequestOptions.none())
POST /audio/translations

Translates audio into English.

Parameters
TranslationCreateParams params
InputStream file

The audio file object (not the file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

AudioModel model

ID of the model to use. Only whisper-1 (which is powered by our open source Whisper V2 model) is currently available.

WHISPER_1("whisper-1")
GPT_4O_TRANSCRIBE("gpt-4o-transcribe")
GPT_4O_MINI_TRANSCRIBE("gpt-4o-mini-transcribe")
GPT_4O_MINI_TRANSCRIBE_2025_12_15("gpt-4o-mini-transcribe-2025-12-15")
GPT_4O_TRANSCRIBE_DIARIZE("gpt-4o-transcribe-diarize")
Optional<String> prompt

An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.

Optional<ResponseFormat> responseFormat

The format of the output, in one of these options: json, text, srt, verbose_json, or vtt.

JSON("json")
TEXT("text")
SRT("srt")
VERBOSE_JSON("verbose_json")
VTT("vtt")
Optional<Double> temperature

The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
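The constraints listed for the parameters above can be checked locally before a request is built. The following is a minimal sketch using only the Java standard library. The class `TranslationParamChecks` and its methods are hypothetical helpers, not part of the SDK, and checking the file extension is only an assumption about how a caller might screen input (the API receives the file content, not its name).

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical pre-flight checks for translation parameters; not part of the SDK. */
final class TranslationParamChecks {
    // Audio formats listed in the `file` parameter description.
    private static final Set<String> AUDIO_EXTENSIONS =
        Set.of("flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm");

    // Values listed for the `responseFormat` parameter.
    private static final Set<String> RESPONSE_FORMATS =
        Set.of("json", "text", "srt", "verbose_json", "vtt");

    private TranslationParamChecks() {}

    /** Returns true if the file name ends in one of the listed audio extensions. */
    static boolean hasSupportedExtension(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return AUDIO_EXTENSIONS.contains(ext);
    }

    /** Returns true if the string is one of the listed response formats. */
    static boolean isValidResponseFormat(String format) {
        return format != null && RESPONSE_FORMATS.contains(format);
    }

    /** Returns true if the temperature lies in the documented range [0, 1]. */
    static boolean isValidTemperature(double temperature) {
        // NaN fails both comparisons, so it is rejected as well.
        return temperature >= 0.0 && temperature <= 1.0;
    }
}
```

A caller might run these checks first and only then pass the values to `TranslationCreateParams.builder()`, so that invalid input fails fast without a network round trip.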

Returns
class TranslationCreateResponse: A union class that can be one of several variants.
class Translation:
String text
class TranslationVerbose:
double duration

The duration of the input audio.

String language

The language of the output translation (always English).

String text

The translated text.

Optional<List<TranscriptionSegment>> segments

Segments of the translated text and their corresponding details.

long id

Unique identifier of the segment.

double avgLogprob

Average logprob of the segment. If the value is lower than -1, consider the logprobs failed.

format: float
double compressionRatio

Compression ratio of the segment. If the value is greater than 2.4, consider the compression failed.

format: float
double end

End time of the segment in seconds.

format: float
double noSpeechProb

Probability of no speech in the segment. If the value is higher than 1.0 and the avg_logprob is below -1, consider this segment silent.

format: float
long seek

Seek offset of the segment.

double start

Start time of the segment in seconds.

format: float
double temperature

Temperature parameter used for generating the segment.

format: float
String text

Text content of the segment.

List<Long> tokens

Array of token IDs for the text content.
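The per-segment thresholds described above (an average logprob below -1, a compression ratio above 2.4, and a no-speech probability combined with a low logprob) can be applied as a simple filter. The sketch below is an illustration under stated assumptions: `SegmentChecks` and its nested `Segment` class are hypothetical local stand-ins that copy only the relevant fields, not the SDK's `TranscriptionSegment`. The no-speech threshold is passed in by the caller rather than hard-coded.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical quality checks over segment fields; not part of the SDK. */
final class SegmentChecks {
    /** Minimal local copy of the segment fields the checks need. */
    static final class Segment {
        final double avgLogprob;
        final double compressionRatio;
        final double noSpeechProb;
        final String text;

        Segment(double avgLogprob, double compressionRatio, double noSpeechProb, String text) {
            this.avgLogprob = avgLogprob;
            this.compressionRatio = compressionRatio;
            this.noSpeechProb = noSpeechProb;
            this.text = text;
        }
    }

    private SegmentChecks() {}

    /** Average logprob below -1: treat the logprobs as failed. */
    static boolean logprobFailed(Segment s) {
        return s.avgLogprob < -1.0;
    }

    /** Compression ratio above 2.4: treat the compression as failed. */
    static boolean compressionFailed(Segment s) {
        return s.compressionRatio > 2.4;
    }

    /** No-speech probability above the caller's threshold and a low logprob: treat as silent. */
    static boolean isSilent(Segment s, double noSpeechThreshold) {
        return s.noSpeechProb > noSpeechThreshold && s.avgLogprob < -1.0;
    }

    /** Keeps only segments that pass both the logprob and compression checks. */
    static List<Segment> keepReliable(List<Segment> segments) {
        List<Segment> kept = new ArrayList<>();
        for (Segment s : segments) {
            if (!logprobFailed(s) && !compressionFailed(s)) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```

With real responses, the same checks could read the values from each returned segment. Whether to drop, flag, or re-request unreliable segments is a design choice left to the caller.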

Create translation

package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.audio.AudioModel;
import com.openai.models.audio.translations.TranslationCreateParams;
import com.openai.models.audio.translations.TranslationCreateResponse;
import java.io.ByteArrayInputStream;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        TranslationCreateParams params = TranslationCreateParams.builder()
            .file(new ByteArrayInputStream("some content".getBytes()))
            .model(AudioModel.WHISPER_1)
            .build();
        TranslationCreateResponse translation = client.audio().translations().create(params);
    }
}
Returns Examples
{
  "text": "text"
}