Create session
Create an ephemeral API token for use in client-side applications with the
Realtime API. The session can be configured with the same parameters as the
session.update client event.
The response is a session object plus a client_secret key, which contains an
ephemeral API token that can be used to authenticate browser clients for the
Realtime API.
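As a rough illustration of the flow described above, the sketch below builds the POST request a backend might send to create a session, then pulls the ephemeral token out of a response. It uses only the Python standard library and constructs the request without sending it. The helper names build_session_request and extract_ephemeral_token are hypothetical, the instructions body field is an assumed key name, and client_secret is assumed to carry the value and expires_at keys shown in the curl example on this page.

```python
import json
import urllib.request

SESSIONS_URL = "https://api.openai.com/v1/realtime/sessions"


def build_session_request(api_key: str, session_params: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a session."""
    return urllib.request.Request(
        SESSIONS_URL,
        data=json.dumps(session_params).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_ephemeral_token(session: dict) -> tuple:
    """Return (token, expires_at) from the client_secret key of a session response.

    Assumes client_secret has the {"value", "expires_at"} shape shown in the curl example.
    """
    secret = session["client_secret"]
    return secret["value"], secret["expires_at"]


# Hypothetical usage: the request is only constructed here, never sent.
request = build_session_request("sk-example", {"instructions": "be extremely succinct"})
print(request.get_method(), request.full_url)
```

On a server, passing the request to urllib.request.urlopen would send it. Only the extracted token, never the standard API key, would then be handed to the browser client.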
Body Parameters (JSON)
The format of input audio. Options are pcm16, g711_ulaw, or g711_alaw.
The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The model is not guaranteed to follow the instructions, but they provide guidance on the desired behavior.
Note that the server sets default instructions, which are used if this field is not set and are visible in the session.created event at the start of the session.
The format of output audio. Options are pcm16, g711_ulaw, or g711_alaw.
The speed of the model's spoken response. 1.0 is the default speed. 0.25 is the minimum speed. 1.5 is the maximum speed. This value can only be changed in between model turns, not while a response is in progress.
Sampling temperature for the model, limited to [0.6, 1.2]. Defaults to 0.8.
How the model chooses tools. Options are auto, none, or required;
alternatively, specify a function.
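The constraints listed above (allowed audio formats, the speed and temperature ranges, and the tool choice modes) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name validate_session_params is hypothetical, and the dictionary keys (input_audio_format, output_audio_format, speed, temperature, tool_choice) are assumed to match the request body field names.

```python
AUDIO_FORMATS = {"pcm16", "g711_ulaw", "g711_alaw"}
TOOL_CHOICE_MODES = {"auto", "none", "required"}


def validate_session_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # Key names are assumptions; the limits come from the parameter descriptions.
    for key in ("input_audio_format", "output_audio_format"):
        if key in params and params[key] not in AUDIO_FORMATS:
            problems.append(f"{key} must be one of {sorted(AUDIO_FORMATS)}")
    if "speed" in params and not 0.25 <= params["speed"] <= 1.5:
        problems.append("speed must be between 0.25 and 1.5")
    if "temperature" in params and not 0.6 <= params["temperature"] <= 1.2:
        problems.append("temperature must be between 0.6 and 1.2")
    choice = params.get("tool_choice")
    # A non-string value is treated as "specify a function" and passed through unchecked.
    if isinstance(choice, str) and choice not in TOOL_CHOICE_MODES:
        problems.append("tool_choice must be auto, none, required, or a function")
    return problems


print(validate_session_params({"speed": 2.0, "temperature": 0.8}))
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field at once.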
Returns
Unique identifier for the session that looks like sess_1234567890abcdef.
Expiration timestamp for the session, in seconds since epoch.
Additional fields to include in server outputs.
item.input_audio_transcription.logprobs: Include logprobs for input audio transcription.
The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The model is not guaranteed to follow the instructions, but they provide guidance on the desired behavior.
Note that the server sets default instructions, which are used if this
field is not set and are visible in the session.created event at the
start of the session.
The Realtime model used for this session.
The object type. Always realtime.session.
How the model chooses tools. Options are auto, none, or required;
alternatively, specify a function.
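A client that receives the session object may want to sanity-check a few of the fields described above before using it. The sketch below is one way to do that, assuming the response keys are named id, object, and expires_at as in the example response on this page; the function name summarize_session is hypothetical.

```python
import time


def summarize_session(session: dict, now: float = None) -> dict:
    """Check a few documented properties of a session object and report time left."""
    now = time.time() if now is None else now
    expires_at = session["expires_at"]  # seconds since epoch, per the field description
    return {
        "id_looks_valid": str(session.get("id", "")).startswith("sess_"),
        "is_realtime_session": session.get("object") == "realtime.session",
        "seconds_until_expiry": expires_at - now,
        "expired": expires_at <= now,
    }


# Hypothetical session values, with a fixed "now" so the result is reproducible.
example = {"id": "sess_1234567890abcdef", "object": "realtime.session", "expires_at": 1000}
print(summarize_session(example, now=400))
```

Passing now explicitly keeps the check deterministic in tests; in normal use it defaults to the current time.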
Create session
curl https://api.openai.com/v1/realtime/sessions \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "client_secret": {
    "expires_at": 0,
    "value": "value"
  }
}'
Returns Examples
{
  "id": "id",
  "audio": {
    "input": {
      "format": {
        "rate": 24000,
        "type": "audio/pcm"
      },
      "noise_reduction": {
        "type": "near_field"
      },
      "transcription": {
        "language": "language",
        "model": "string",
        "prompt": "prompt"
      },
      "turn_detection": {
        "prefix_padding_ms": 0,
        "silence_duration_ms": 0,
        "threshold": 0,
        "type": "type"
      }
    },
    "output": {
      "format": {
        "rate": 24000,
        "type": "audio/pcm"
      },
      "speed": 0,
      "voice": "ash"
    }
  },
  "expires_at": 0,
  "include": [
    "item.input_audio_transcription.logprobs"
  ],
  "instructions": "instructions",
  "max_output_tokens": 0,
  "model": "model",
  "object": "object",
  "output_modalities": [
    "text"
  ],
  "tool_choice": "tool_choice",
  "tools": [
    {
      "description": "description",
      "name": "name",
      "parameters": {},
      "type": "function"
    }
  ],
  "tracing": "auto",
  "turn_detection": {
    "prefix_padding_ms": 0,
    "silence_duration_ms": 0,
    "threshold": 0,
    "type": "type"
  }
}