Lunos

Documentation

💬 Chat Generation

The Chat Generation API enables you to build conversational AI experiences, such as chatbots, virtual assistants, and customer support agents. It supports multi-turn conversations, streaming responses, and fallback models for robust, real-time interactions.

Note: You need an API key to use this endpoint. See the Quickstart Guide to get started.

How to Generate Chat Responses (Step-by-Step)

  1. Choose your preferred client library or use cURL for direct API calls.
  2. Set your API key and endpoint URL.
  3. Configure the request parameters (model, messages, max_tokens, etc.).
  4. Send the request and process the AI's response.
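The four steps above can be sketched as a small helper that assembles the request. This is a minimal sketch, not an official client: the endpoint URL and headers match the cURL example later on this page, and sending the request is left to `fetch()` (built into Node 18+) or any HTTP client.

```javascript
// Steps 2-4 in miniature: build the HTTP request for a chat completion.
// Endpoint and headers mirror the cURL example on this page.
function buildChatRequest(apiKey, { model, messages, maxTokens }) {
  return {
    url: 'https://api.lunos.tech/v1/chat/completions',
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, max_tokens: maxTokens }),
    },
  };
}

const req = buildChatRequest('YOUR_API_KEY', {
  model: 'openai/gpt-4.1-mini',
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
  maxTokens: 100,
});
// Send with: const res = await fetch(req.url, req.options);
```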

Example: Using OpenAI Library

Generate a chat completion using the OpenAI Node.js library:

import OpenAI from 'openai';

// Point the OpenAI SDK at the Lunos endpoint (same base URL as the
// cURL example below).
const openai = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.lunos.tech/v1',
});

const response = await openai.chat.completions.create({
  model: 'gpt-4.1-mini',
  messages: [
    { role: 'user', content: 'Tell me a joke.' }
  ],
  max_tokens: 100,
});
console.log(response.choices[0].message.content);

Parameters:
  • model: The chat model to use (e.g., gpt-4.1-mini).
  • messages: Array of message objects (role: user | assistant | system).
  • max_tokens: Maximum tokens in the response.
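Because the API is stateless, multi-turn conversations work by resending the full message history on every request. A hypothetical history, with a small helper for appending the next turn, might look like this (the roles shown are the three the `messages` parameter accepts):

```javascript
// A multi-turn history: the whole array is resent with each request so
// the model sees prior context. Roles: 'system', 'user', 'assistant'.
const history = [
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'Tell me a joke.' },
  { role: 'assistant', content: 'Why did the scarecrow win an award? Because he was outstanding in his field!' },
];

// Append the next turn without mutating the existing history.
function addTurn(history, role, content) {
  return [...history, { role, content }];
}

const next = addTurn(history, 'user', 'Another one, please.');
// Pass `next` as the messages parameter of the follow-up request.
```

Returning a new array (rather than pushing in place) keeps earlier snapshots of the conversation intact, which is handy if you let users edit or branch a chat.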

Example: Using Lunos Client

Generate a chat completion using the official Lunos client:

// Assumes LunosClient has already been imported from the official Lunos SDK.
const client = new LunosClient({ apiKey: 'YOUR_API_KEY' });
const response = await client.chat.createCompletion({
  model: 'openai/gpt-4.1-mini',
  messages: [
    { role: 'user', content: 'Tell me a joke.' }
  ],
  max_tokens: 100,
  appId: 'my-web-app-v1.0',
});
console.log(response.choices[0].message.content);

Parameters:
  • model: Model ID (e.g., openai/gpt-4.1-mini).
  • messages: Conversation history as an array of message objects.
  • max_tokens: Maximum tokens for the reply.
  • appId: Optional identifier used to track usage per application (see App Tracking below).

Example: Using cURL

Direct API call for chat completion:

curl -X POST https://api.lunos.tech/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-App-ID: my-mobile-app-v1.2" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "max_tokens": 100
  }'

App Tracking: The X-App-ID header allows you to track usage per application. This helps with analytics, billing, and performance monitoring across different apps.
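If you are building requests by hand rather than through the Lunos client's `appId` option, the header can be attached like any other. A minimal sketch (the helper name is hypothetical; the header name matches the cURL example above):

```javascript
// Hypothetical helper: attach an X-App-ID header so requests are
// attributed to a specific app. Spreading copies the original headers
// instead of mutating them.
function withAppId(headers, appId) {
  return { ...headers, 'X-App-ID': appId };
}

const headers = withAppId(
  {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY',
  },
  'my-mobile-app-v1.2',
);
```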

Example Response

{
  "model": "openai/gpt-4.1-mini",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Why did the scarecrow win an award? Because he was outstanding in his field!"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 16,
    "total_tokens": 28
  }
}
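In practice you usually want the reply text, the finish reason, and the token usage out of that structure. A small sketch, fed with the example response above (the `summarize` helper is illustrative, not part of any SDK):

```javascript
// Pull the reply text and token accounting out of a completion response.
function summarize(response) {
  const choice = response.choices[0];
  return {
    content: choice.message.content,
    finishReason: choice.finish_reason,
    totalTokens: response.usage.total_tokens,
  };
}

// The example response from this page, as a plain object.
const example = {
  model: 'openai/gpt-4.1-mini',
  choices: [
    {
      message: {
        role: 'assistant',
        content: 'Why did the scarecrow win an award? Because he was outstanding in his field!',
      },
      finish_reason: 'stop',
      index: 0,
    },
  ],
  usage: { prompt_tokens: 12, completion_tokens: 16, total_tokens: 28 },
};

const summary = summarize(example);
// summary.totalTokens === 28 (prompt_tokens + completion_tokens)
```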

Important: Your API key is sensitive. Never share it publicly or commit it to version control.

Best Practices & Tips

  • Use stream: true for real-time, token-by-token responses.
  • Set fallback_model to automatically switch to a backup model if the primary is unavailable.
  • Maintain conversation context by passing the full message history.
  • See Model Discovery for available models.
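With `stream: true`, the response arrives as a sequence of chunks rather than one object. The consumer below is a sketch that assumes OpenAI-style chunks, where each chunk carries a partial `delta`; the `fakeStream` generator stands in for a real API stream so the shape is clear.

```javascript
// Sketch: concatenate streamed chunks into the full reply. Assumes each
// chunk looks like { choices: [{ delta: { content: '...' } }] }.
async function collectStream(stream) {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}

// Simulated stream for illustration; a real one comes from the API
// call with stream: true.
async function* fakeStream() {
  for (const piece of ['Why did ', 'the scarecrow ', 'win?']) {
    yield { choices: [{ delta: { content: piece } }] };
  }
}

collectStream(fakeStream()).then(console.log); // logs the joined text
```

In a UI you would append each delta to the screen as it arrives instead of waiting for the joined result.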

Troubleshooting

  • Authentication error: Verify your API key and permissions.
  • Incomplete or unexpected responses: Check your max_tokens and messages structure.
  • Request issues: Ensure your request payload matches the API specification.
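The cases above can be routed on the HTTP status code. This mapping is a sketch using conventional status semantics (401/403 for auth, 400 for a malformed payload); the exact codes Lunos returns are not specified on this page.

```javascript
// Hypothetical helper mapping an HTTP status to the troubleshooting
// cases above. Status meanings follow HTTP convention.
function diagnose(status) {
  if (status === 401 || status === 403) {
    return 'authentication error: verify your API key and permissions';
  }
  if (status === 400) {
    return 'request issue: payload does not match the API specification';
  }
  return status >= 200 && status < 300 ? 'ok' : `unexpected status ${status}`;
}
```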