The Audio Generation API converts text into natural-sounding speech using AI models from providers such as OpenAI and Google. It is suited to accessibility features, voice assistants, and content narration. You can customize the voice, output format, and speed to fit your needs.
Note: You need an API key to use this endpoint. See the Quickstart Guide to get started.
Tip: The available voice options depend on the selected model. Choose a voice that best fits your use case and language.
For OpenAI TTS models (e.g., `openai/tts`, `openai/tts-hd`), use one of the following voices:
Voice Name | Description / Style |
---|---|
alloy | Warm, balanced |
echo | Smooth, clear |
fable | Narrative, expressive |
onyx | Deep, serious |
nova | Bright, crisp |
shimmer | Cheerful, lively |
For Google TTS models (e.g., `google/gemini-2.5-flash-preview-tts`), use one of the following voices. Styles are shown in both English and Indonesian for clarity:
Voice Name | Style (EN) | Style (ID) | Description / Notes |
---|---|---|---|
Zephyr | Bright | Cerah | Clear and cheerful |
Puck | Upbeat | Upbeat | Energetic, lively |
Charon | Informative | Informatif | Neutral, factual |
Kore | Firm | Tegas | Strong, confident |
Fenrir | Happy | Senang | Joyful, positive |
Leda | Youthful | Muda | Young-sounding |
Orus | Firm | Tegas | Strong, confident |
Aoede | Breezy | Breezy | Light, easygoing |
Callirrhoe | Relaxed | Santai | Casual, laid-back |
Autonoe | Bright | Cerah | Clear and cheerful |
Enceladus | Breathy | Berbisik | Soft, airy |
Iapetus | Clear | Jelas | Distinct, easy to understand |
Umbriel | Relaxed | Santai | Casual, laid-back |
Algieba | Smooth | Lembut | Soft, flowing |
Despina | Smooth | Lembut | Soft, flowing |
Erinome | Clear | Jernih | Distinct, easy to understand |
Algenib | Gravelly | Berbatu | Rough, textured |
Rasalgethi | Informative | Informatif | Neutral, factual |
Laomedeia | Upbeat | Upbeat | Energetic, lively |
Achernar | Soft | Lembut | Gentle, mild |
Alnilam | Corporate | Perusahaan | Professional, formal |
Schedar | Even | Rata | Balanced, steady |
Gacrux | Mature | Dewasa | Adult, experienced |
Pulcherrima | Continue | Teruskan | Ongoing, persistent (ambiguous) |
Achird | Friendly | Ramah | Welcoming, kind |
Zubenelgenubi | Casual | Kasual | Informal, relaxed |
Vindemiatrix | Gentle | Lembut | Soft, tender |
Sadachbia | Lively | Ceria | Full of life, spirited |
Sadaltager | Knowledgeable | Berpengetahuan | Wise, informed |
Sulafat | Warm | Hangat | Friendly, inviting |
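
Because the valid voices differ per model, it can help to check a voice against the selected model before sending a request. The sketch below is only an illustration: `VOICES_BY_MODEL` and `isVoiceSupported` are hypothetical names, not part of any Lunos SDK, and the Google list is abbreviated to the first few entries of the table above.

```javascript
// Hypothetical lookup table built from the voice tables above.
// The Google entry is truncated for brevity; extend it as needed.
const OPENAI_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];
const VOICES_BY_MODEL = {
  'openai/tts': OPENAI_VOICES,
  'openai/tts-hd': OPENAI_VOICES,
  'google/gemini-2.5-flash-preview-tts': ['Zephyr', 'Puck', 'Charon', 'Kore', 'Fenrir'],
};

// Returns true when the voice is listed for the model, false otherwise
// (including when the model itself is not in this table).
function isVoiceSupported(model, voice) {
  const voices = VOICES_BY_MODEL[model];
  return Array.isArray(voices) && voices.includes(voice);
}

console.log(isVoiceSupported('openai/tts', 'alloy'));  // true
console.log(isVoiceSupported('openai/tts', 'Zephyr')); // false
```

The comparison is case-sensitive, which matches how the names are written in the tables; whether the API itself accepts other casings is not something this sketch assumes.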
`text`, `model`, `voice`, `format`, etc.).

Generate speech from text using the OpenAI Node.js library:
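```javascript
import fs from 'node:fs/promises';
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const response = await openai.audio.speech.create({
  model: 'tts-1',
  input: 'Hello, this is a test.',
  voice: 'alloy',
  response_format: 'mp3',
});
// The SDK returns a fetch-style Response; write its bytes to disk.
await fs.writeFile('speech.mp3', Buffer.from(await response.arrayBuffer()));
```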
- `model`: The speech model to use (e.g., `tts-1`).
- `input`: The text you want to convert to speech.
- `voice`: The voice style (e.g., `alloy`).
- `response_format`: Output format (`mp3`, `wav`, etc.).

Generate speech using the official Lunos client:
```javascript
const client = new LunosClient({ apiKey: 'YOUR_API_KEY' });
const response = await client.audio.textToSpeech({
  text: 'Hello, this is a test.',
  voice: 'alloy',
  model: 'openai/tts',
  response_format: 'mp3',
  speed: 1.0,
  appId: 'my-audio-app-v1.0',
});
// Save response.audioBuffer to file
```
- `text`: The text to synthesize.
- `voice`: Voice style (see Model Discovery).
- `model`: Model ID (e.g., `openai/tts`).
- `response_format`: Output format.
- `speed`: Adjusts the speech speed (1.0 = normal).

Direct API call for maximum control:
```shell
curl -X POST https://api.lunos.tech/v1/audio/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-App-ID: my-audio-app-v1.0" \
  -d '{
    "model": "openai/tts",
    "input": "Hello, this is a test.",
    "voice": "alloy",
    "response_format": "mp3"
  }' \
  --output output.mp3
```
App Tracking: The `X-App-ID` header allows you to track audio generation usage per application. This helps with analytics, billing, and performance monitoring across different apps.
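
For Node.js projects that prefer not to shell out to curl, the same request can be sketched with the built-in `fetch` (Node.js 18 or later). The endpoint, headers, and body fields mirror the curl example; the helper names `buildSpeechRequest` and `generateSpeech` are assumptions made for this illustration, and treating the app ID as optional is also an assumption rather than documented behavior.

```javascript
// Builds the URL and fetch options for the request shown in the curl example.
// Kept separate from the network call so it can be inspected or tested.
function buildSpeechRequest({ apiKey, appId, model, input, voice, responseFormat = 'mp3' }) {
  const headers = {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`,
  };
  // Assumption: the tracking header is only sent when an app ID is given.
  if (appId) headers['X-App-ID'] = appId;
  return {
    url: 'https://api.lunos.tech/v1/audio/generations',
    options: {
      method: 'POST',
      headers,
      body: JSON.stringify({ model, input, voice, response_format: responseFormat }),
    },
  };
}

// Sends the request and resolves to the audio bytes as a Buffer.
async function generateSpeech(params) {
  const { url, options } = buildSpeechRequest(params);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Audio generation failed with HTTP ${res.status}`);
  return Buffer.from(await res.arrayBuffer());
}
```

A call such as `await generateSpeech({ apiKey, appId: 'my-audio-app-v1.0', model: 'openai/tts', input: 'Hello, this is a test.', voice: 'alloy' })` would then yield a buffer that can be written to disk with `fs.promises.writeFile`.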
```text
// Binary audio data is returned. Example headers:
// Content-Type: audio/mpeg
// Content-Disposition: attachment; filename="audio-1623168000.mp3"
```
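
Because the body is raw audio, the response headers are where the format and a suggested file name can be found. The following sketch shows one way to derive a file name from them; `pickFileName` and the MIME-to-extension table are hypothetical helpers, and the table covers only a few common audio types that the API may or may not return.

```javascript
// Assumed mapping from common audio MIME types to file extensions.
const EXTENSION_BY_TYPE = new Map([
  ['audio/mpeg', 'mp3'],
  ['audio/wav', 'wav'],
  ['audio/ogg', 'ogg'],
]);

// Chooses a file name from response headers. Prefers the name given in
// Content-Disposition; otherwise falls back to "audio.<ext>" based on
// Content-Type, or "audio.bin" when the type is not recognized.
function pickFileName(contentType, contentDisposition) {
  const match = /filename="?([^";]+)"?/i.exec(contentDisposition || '');
  if (match) return match[1];
  const type = (contentType || '').split(';')[0].trim().toLowerCase();
  return `audio.${EXTENSION_BY_TYPE.get(type) ?? 'bin'}`;
}

console.log(pickFileName('audio/mpeg', 'attachment; filename="audio-1623168000.mp3"'));
// audio-1623168000.mp3
console.log(pickFileName('audio/mpeg', null));
// audio.mp3
```

With `fetch`, the two arguments would come from `res.headers.get('content-type')` and `res.headers.get('content-disposition')`. A server-supplied name should be sanitized before it is used as a path on disk.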
Important: Your API key is sensitive. Never share it publicly or commit it to version control.
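
One common way to keep the key out of source code is to read it from an environment variable at startup. The sketch below assumes a variable named `LUNOS_API_KEY`; that name and the `requireApiKey` helper are illustrative choices, not something the API mandates.

```javascript
// Reads the API key from an environment-like object and fails fast when
// it is missing, so the key never has to appear in source code.
function requireApiKey(env = process.env) {
  const key = env.LUNOS_API_KEY; // assumed variable name
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error('LUNOS_API_KEY is not set');
  }
  return key.trim();
}
```

Calling `requireApiKey()` once at startup and passing the result to the client keeps the key in the deployment environment, where it can be rotated without a code change.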
- Choose a suitable `voice` and `model` for your use case (see Model Discovery).
- Adjust `speed` to fine-tune the pace of speech for accessibility or user preference.
- Check the `Content-Type` header to determine the audio format in the response.
- `model` and `voice` parameters.