Lunos

Embeddings API

Use embeddings to turn text into vectors for semantic search, retrieval-augmented generation, recommendations, and clustering.

Endpoint

POST /v1/embeddings

Base URL:

https://api.lunos.tech/v1

Authentication

Send your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
X-App-ID: optional-app-name

Request body

Field Type Required Description
model string Yes Embedding-capable model ID from the Lunos catalog, for example openai/text-embedding-3-small.
input string | string[] Yes One text string or an array of text strings.
encoding_format "float" | "base64" No Output encoding. Default is float.
dimensions number No Output vector length when supported by the model.
user string No End-user identifier for tracking and analytics.

Example request

# Create an embedding via the Lunos API (POST is implied by --data).
curl https://api.lunos.tech/v1/embeddings \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $LUNOS_API_KEY" \
  --data '{
    "model": "openai/text-embedding-3-small",
    "input": "Your text string goes here",
    "encoding_format": "float"
  }'
import os

import requests

# Create an embedding via the Lunos API.
# Read the API key from the environment (the original referenced an
# undefined LUNOS_API_KEY name, which would raise NameError).
response = requests.post(
    "https://api.lunos.tech/v1/embeddings",
    headers={
        "Authorization": "Bearer " + os.environ["LUNOS_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/text-embedding-3-small",
        "input": "Your text string goes here",
        "encoding_format": "float",
    },
    timeout=60,
)
# Fail loudly on HTTP 4xx/5xx instead of trying to index into an error body.
response.raise_for_status()

result = response.json()
print(result["data"][0]["embedding"])
// Create an embedding via the Lunos API.
const response = await fetch("https://api.lunos.tech/v1/embeddings", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.LUNOS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "openai/text-embedding-3-small",
    input: "Your text string goes here",
    encoding_format: "float",
  }),
});

// fetch() only rejects on network failure; HTTP 4xx/5xx must be checked
// explicitly, otherwise an error body is parsed as if it were a success.
if (!response.ok) {
  throw new Error(`Embeddings request failed: ${response.status} ${response.statusText}`);
}

const result = await response.json();
console.log(result.data[0].embedding);
<?php

// Create an embedding via the Lunos API.
$payload = [
    "model" => "openai/text-embedding-3-small",
    "input" => "Your text string goes here",
    "encoding_format" => "float",
];

$ch = curl_init("https://api.lunos.tech/v1/embeddings");
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_HTTPHEADER => [
        "Authorization: Bearer " . getenv("LUNOS_API_KEY"),
        "Content-Type: application/json",
    ],
    CURLOPT_POSTFIELDS => json_encode($payload),
]);

$response = curl_exec($ch);

// curl_exec() returns false on transport failure; report the error instead
// of passing false to json_decode (which would yield null silently).
if ($response === false) {
    $error = curl_error($ch);
    curl_close($ch);
    throw new RuntimeException("Embeddings request failed: " . $error);
}
curl_close($ch);

$result = json_decode($response, true);
print_r($result["data"][0]["embedding"]);
// Example: create an embedding via the Lunos API.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Request payload; encoding_format "float" requests raw float arrays.
	payload := map[string]any{
		"model":           "openai/text-embedding-3-small",
		"input":           "Your text string goes here",
		"encoding_format": "float",
	}
	body, err := json.Marshal(payload)
	if err != nil {
		fmt.Fprintln(os.Stderr, "marshal payload:", err)
		os.Exit(1)
	}

	req, err := http.NewRequest("POST", "https://api.lunos.tech/v1/embeddings", bytes.NewReader(body))
	if err != nil {
		fmt.Fprintln(os.Stderr, "build request:", err)
		os.Exit(1)
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("LUNOS_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// Do returns a nil *Response on error, so deferring Close before
		// this check (as the original did) would panic.
		fmt.Fprintln(os.Stderr, "request failed:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		fmt.Fprintln(os.Stderr, "unexpected status:", resp.Status)
		os.Exit(1)
	}

	var result map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		fmt.Fprintln(os.Stderr, "decode response:", err)
		os.Exit(1)
	}
	fmt.Println(result["data"])
}

Example response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.0069, -0.0053, -0.00004, -0.0240]
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

Model guidance

  • Use openai/text-embedding-3-small as a default for balanced quality and cost.
  • Use openai/text-embedding-3-large when retrieval quality is more important than cost.
  • Keep embedding dimension consistent within the same vector index.

Dimensions

Use dimensions to reduce vector size and storage/query cost when supported by the model.

{
  "model": "openai/text-embedding-3-large",
  "input": "Testing 123",
  "dimensions": 1024
}

Typical retrieval flow

  1. Chunk source documents.
  2. Create embeddings for each chunk.
  3. Store vectors and metadata in a vector database.
  4. Embed user query text.
  5. Retrieve nearest chunks by cosine similarity.
  6. Send top chunks into /v1/chat/completions.

Error codes

  • 400 invalid request body, invalid model, or unsupported format
  • 402 insufficient balance
  • 500 upstream/provider error

See also