Realtime Distiller API¶
Realtime Distiller extends AI Refinery's Distiller to support real-time streaming interactions with both text and voice input. It supports:
- Voice input: Real-time audio streaming from microphone
- Voice output: Speech synthesis responses
- Text input: Text queries with voice responses
Before you begin, you must create an authenticated AsyncAIRefinery client, as shown below. All Realtime Distiller APIs are accessed via client.realtime_distiller.
import os
from air import AsyncAIRefinery
from dotenv import load_dotenv
load_dotenv()  # loads your API_KEY from your local '.env' file
api_key = str(os.getenv("API_KEY"))
client = AsyncAIRefinery(api_key=api_key)
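One thing to watch in the setup above: `str(os.getenv("API_KEY"))` evaluates to the literal string "None" when the variable is missing, so a missing key may only surface later as an authentication failure. A minimal sketch of a guard that fails early instead; `require_api_key` is a hypothetical helper name, not part of the SDK.

```python
import os


def require_api_key(name: str = "API_KEY") -> str:
    """Return the named environment variable, or raise if it is unset or blank."""
    value = os.getenv(name)
    if value is None or not value.strip():
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment")
    return value.strip()
```

With this helper, you would call `api_key = require_api_key()` after `load_dotenv()` and pass the result to `AsyncAIRefinery` as before.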
Realtime Distiller Workflow¶
Preliminaries¶
Creating Your Project¶
client.realtime_distiller.create_project() (synchronous)¶
Creates a new project based on the specified YAML configuration file.
Parameters:
- config_path (str): The path to the YAML configuration file.
- project (str): A name for your project (letters, digits, hyphens, underscores only).
Returns:
bool: True if the project is successfully created.
Project Versioning:
- Realtime Distiller automatically handles project versioning, starting at version 0.
- The first time you create a project with a given name, it is assigned version 0. If you create another project with the same name, Distiller increments the version to 1, and so on.
- By default, connections are made to the latest project version unless a specific version is specified. For more details, refer to the Connecting to Realtime Distiller section below.
Example:
# This command registers the project "example" using the "example.yaml" configuration file.
client.realtime_distiller.create_project(config_path="example.yaml", project="example")
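Because project names are limited to letters, digits, hyphens, and underscores, it can help to check a name locally before calling create_project(). The sketch below assumes "letters" means ASCII letters and that empty names are rejected; `is_valid_name` is a hypothetical helper, not an SDK function.

```python
import re

# Letters, digits, hyphens, and underscores only, per the parameter description.
# Assumption: ASCII letters only, and at least one character.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` is non-empty and uses only the allowed characters."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For instance, `is_valid_name("example")` passes, while a name containing a space or a dot (such as "example.yaml") does not.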
Downloading Your Project Configuration¶
client.realtime_distiller.download_project() (synchronous)¶
Retrieves the configuration of a specified project from the server.
Parameters:
- project (str): The name of the project whose configuration you want to download.
- project_version (str, optional): The version of the project configuration to download. Defaults to the latest version if not provided.
Returns:
dict: A Python dictionary containing the downloaded configuration.
Example:
# This command downloads version "1" of the "example" project.
project_config = client.realtime_distiller.download_project(project="example", project_version="1")
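To make the versioning rules concrete (first creation gets version 0, each re-creation under the same name increments it, and the latest version is used when none is given), here is a toy in-memory model. It only illustrates the documented behavior; `ToyProjectRegistry` is a hypothetical class and says nothing about how the service stores projects.

```python
from typing import Dict, List, Optional


class ToyProjectRegistry:
    """In-memory model of the versioning rules; not the service's implementation."""

    def __init__(self) -> None:
        # Maps a project name to its configs, ordered by version number.
        self._configs: Dict[str, List[dict]] = {}

    def create(self, project: str, config: dict) -> str:
        """Store a config and return its version: "0" first, then "1", and so on."""
        versions = self._configs.setdefault(project, [])
        versions.append(config)
        return str(len(versions) - 1)

    def download(self, project: str, project_version: Optional[str] = None) -> dict:
        """Return the requested version, or the latest one when none is given."""
        versions = self._configs[project]
        index = len(versions) - 1 if project_version is None else int(project_version)
        return versions[index]
```

In this model, creating "example" twice yields versions "0" and "1", and `download("example")` without a version returns the second config.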
Connecting to Realtime Distiller¶
client.realtime_distiller.__call__() (asynchronous)¶
Establishes an asynchronous connection (via a WebSocket) to the RealtimeDistiller endpoint for a specific project. Use it as an async context manager to manage all Distiller-related operations for that connection.
Parameters:
- project (str): The project name (letters, digits, hyphens, underscores only).
- uuid (str): A unique user identifier (letters, digits, hyphens, underscores only).
- executor_dict (dict[str, Callable], optional): A dictionary mapping custom agent names to callable functions. These callables are invoked when their corresponding agents are triggered by the super agent or orchestrator. Defaults to {}.
- project_version (str, optional): The project version to connect to. If not provided, Distiller uses the latest version.
Returns:
_VoiceDistillerContextManager: An asynchronous context manager that handles operations within the given project.
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    # Your asynchronous operations here
    pass
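The executor_dict parameter maps custom agent names to callables. A minimal sketch of what such a mapping could look like follows. The agent name "Echo Agent" and the function `echo_agent` are made up for illustration, and the callable's signature (an async function taking a single query string) is an assumption rather than something this page specifies.

```python
import asyncio


async def echo_agent(query: str) -> str:
    """Hypothetical custom agent: returns the query it was given."""
    return f"echo: {query}"


# Keys are custom agent names; values are the callables invoked for them.
# "Echo Agent" is a made-up name; it would need to match an agent declared
# in your project's YAML configuration.
executor_dict = {"Echo Agent": echo_agent}


async def demo() -> str:
    # Call the agent directly to confirm it behaves as expected before wiring it in.
    return await executor_dict["Echo Agent"]("hello")
```

You would then pass `executor_dict=executor_dict` alongside `project` and `uuid` when opening the connection.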
Audio Input¶
client.realtime_distiller.send_audio_chunk() (asynchronous)¶
Sends a chunk of audio bytes containing a voice query over the WebSocket. It is typically called in a loop to stream audio input.
Parameters:
- audio_bytes (bytes): Raw audio data to send to the server.
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    # `audio` is an async iterable that yields raw audio chunks (bytes)
    async for audio_chunk in audio:
        await vc.send_audio_chunk(audio_chunk)
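The streaming loop needs an async source of audio chunks. One way to produce it from an in-memory buffer is sketched below. It assumes 16-bit mono PCM and a 100 ms chunk duration, neither of which is specified on this page; `pcm_chunks` and `collect` are hypothetical helper names.

```python
import asyncio
from typing import AsyncIterator, List


async def pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16000,
    chunk_ms: int = 100,
    bytes_per_sample: int = 2,  # assumption: 16-bit mono PCM
) -> AsyncIterator[bytes]:
    """Yield successive fixed-duration chunks of a raw PCM buffer."""
    chunk_size = sample_rate * chunk_ms // 1000 * bytes_per_sample
    if chunk_size <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]
        # Yield control so other tasks (for example a response reader) can run.
        await asyncio.sleep(0)


async def collect(pcm: bytes) -> List[bytes]:
    """Gather all chunks into a list; handy for checking chunk sizes offline."""
    return [chunk async for chunk in pcm_chunks(pcm)]
```

Inside the connection block, `async for chunk in pcm_chunks(data): await vc.send_audio_chunk(chunk)` would stream the buffer in order; the final chunk may be shorter than the rest.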
Text Input¶
client.realtime_distiller.send_text_query() (asynchronous)¶
Sends a text query over the WebSocket.
Parameters:
- text (str): The text query to send.
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    text = "example query"
    await vc.send_text_query(text)
Response Stream¶
client.realtime_distiller.get_responses() (asynchronous)¶
Continuously retrieves responses (text or audio) from the WebSocket.
Yields:
Dict: A dictionary representing a Realtime Event, containing a response type and optional response content. A response can be a status event, a text response, or a speech response delivered as streamed audio chunks.
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    async for response in vc.get_responses():
        print(response)
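Since get_responses() keeps yielding events, a consumer usually needs a stopping condition. The sketch below collects events until a response.done event arrives. It assumes the event type is stored under a "type" key, which this page does not confirm, and it uses a stand-in generator (`fake_responses`) in place of a live connection; both helper names are hypothetical.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def collect_until_done(
    responses: AsyncIterator[Dict[str, Any]],
    type_key: str = "type",  # assumption: key that holds the event type
) -> List[Dict[str, Any]]:
    """Gather events from a response stream, stopping after response.done."""
    events = []
    async for event in responses:
        events.append(event)
        if event.get(type_key) == "response.done":
            break
    return events


async def fake_responses() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for vc.get_responses() so the sketch runs without a connection."""
    for event in (
        {"type": "response.created"},
        {"type": "response.text.delta", "content": "Hi"},
        {"type": "response.done"},
        {"type": "session.created"},  # never reached: the loop stops first
    ):
        yield event
```

Against a live session you would call `await collect_until_done(vc.get_responses())` inside the connection block.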
Realtime Wrapper Methods¶
High-level methods that handle the complete voice interaction loop. These wrap the base voice APIs (send_audio_chunk(), send_text_query(), get_responses()) to provide a ready-to-use, end-to-end realtime voice experience.
client.realtime_distiller.listen_and_respond() (asynchronous)¶
Captures audio from the microphone, streams it to the server, and plays back audio responses through the speaker.
Parameters:
- sample_rate (int, optional): Audio sample rate in Hz. Must match the sample_rate in your YAML speech_config. Defaults to 16000.
Behavior:
- Streams microphone audio to the server using send_audio_chunk()
- Stops microphone capture when the server begins responding
- Receives server responses via get_responses()
- Plays TTS audio responses through the speaker
- Prints text transcriptions
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    await vc.listen_and_respond(sample_rate=16000)
client.realtime_distiller.send_text_and_respond() (asynchronous)¶
Sends a text query to the server and plays back audio responses through the speaker.
Parameters:
- text (str): The text query to send.
- sample_rate (int, optional): Audio sample rate in Hz. Must match the sample_rate in your YAML speech_config. Defaults to 16000.
Raises:
ValueError: If text is empty.
Behavior:
- Sends the text query using send_text_query()
- Receives server responses via get_responses()
- Plays TTS audio responses through the speaker
- Prints text transcriptions
Example:
async with client.realtime_distiller(
    project="example",
    uuid="test"
) as vc:
    await vc.send_text_and_respond(
        text="example query",
        sample_rate=16000
    )
Realtime Events¶
Response events represent a status update, a text response, or a speech response.
| Type | Fields/Description |
|---|---|
| session.created | Status event indicating Realtime session creation. |
| response.audio_transcript.delta | delta (string): Partial transcription text. |
| response.audio_transcript.done | text (string): Final transcription text. |
| response.created | Status event indicating response has started. |
| response.audio.delta | audio (string): Base64-encoded audio chunk. |
| response.audio.done | Status event indicating current audio response is complete. |
| response.text.delta | content (string): Partial text output from Distiller. |
| response.text.done | Status event indicating Distiller text response is completed. |
| response.done | Status event indicating response has completed. |
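As an illustration of how these event fields might be consumed, the sketch below joins text deltas, keeps the final transcript, and decodes the Base64 audio chunks into raw bytes. The field names (delta, text, audio, content) come from the table; the "type" key used to read the event type is an assumption, and `summarize_events` is a hypothetical helper.

```python
import base64
from typing import Any, Dict, Iterable, Tuple


def summarize_events(
    events: Iterable[Dict[str, Any]],
    type_key: str = "type",  # assumption: key that holds the event type
) -> Tuple[str, str, bytes]:
    """Return (distiller_text, final_transcript, audio_bytes) for a run of events."""
    text_parts = []
    transcript = ""
    audio = bytearray()
    for event in events:
        kind = event.get(type_key)
        if kind == "response.text.delta":
            text_parts.append(event.get("content", ""))
        elif kind == "response.audio_transcript.done":
            transcript = event.get("text", "")
        elif kind == "response.audio.delta":
            # Audio chunks arrive Base64-encoded; decode before playback or saving.
            audio.extend(base64.b64decode(event.get("audio", "")))
    return "".join(text_parts), transcript, bytes(audio)
```

Status-only events such as session.created and response.done carry no payload and are skipped here; a real client would likely use them to drive UI state or to end the read loop.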
For examples of using Realtime Distiller, check out the tutorials: