
Tool & Function Calling

Tool calls (also known as function calls) let an LLM interact with external tools through your application. The model does not run the tool by itself. Instead, it suggests which tool to call and provides the arguments. Your app executes the tool and then sends the tool results back to the model so it can produce the final response.

This pattern is useful for:

  • Searching or retrieving data from your own systems
  • Calculating results with deterministic code
  • Triggering actions in workflows (billing, notifications, queues, and more)

Lunos is OpenAI-compatible. If a model supports tool calling, you can use the standard tools + tool_calls flow on POST /v1/chat/completions.

Supported models

Not every model supports tool calling. Before you rely on this feature, check the model you plan to use (see GET /v1/models in the Models API).
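The availability check can be scripted against the models endpoint. The sketch below assumes an OpenAI-style list response shape (`{ data: [{ id: string }] }`); whether Lunos exposes per-model tool-support metadata is something to confirm in the Models API docs.

```typescript
// Sketch: confirm a model is listed before relying on tool calling.
// Assumes the OpenAI-style { data: [{ id }] } response shape.
type ModelList = { data?: { id: string }[] };

function listContainsModel(body: ModelList, modelId: string): boolean {
  return (body.data ?? []).some((m) => m.id === modelId);
}

async function modelIsAvailable(modelId: string): Promise<boolean> {
  const res = await fetch("https://api.lunos.tech/v1/models", {
    headers: { Authorization: `Bearer ${process.env.LUNOS_API_KEY}` },
  });
  if (!res.ok) throw new Error(`models request failed: ${res.status}`);
  return listContainsModel(await res.json(), modelId);
}
```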

The standard three-step flow

Tool calling usually follows three steps.

1. Send a request that includes tools

When you want the model to be allowed to call tools, include a tools array in your chat request.

Example (request):

{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What are the titles of some James Joyce books?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "searchBooks",
        "description": "Search for books based on provided keywords",
        "parameters": {
          "type": "object",
          "properties": {
            "search_terms": {
              "type": "array",
              "items": { "type": "string" },
              "description": "List of keywords to search for books"
            }
          },
          "required": ["search_terms"]
        }
      }
    }
  ]
}

2. Execute the requested tool in your app

The model responds with tool_calls when it decides a tool is needed. Your application should:

  1. Read the tool name from tool_calls
  2. Parse the tool arguments (typically JSON)
  3. Run the matching local function
  4. Convert the result to a string (usually JSON) for the next request

3. Send tool results back to the model

After you execute the tool(s), send a new chat request that includes the original context, the assistant’s tool call, and one tool message per tool call result.

Example (tool result step):

{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What are the titles of some James Joyce books?"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "searchBooks",
            "arguments": "{\"search_terms\":[\"James\",\"Joyce\"]}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "content": "[{\"id\":4300,\"title\":\"Ulysses\",\"authors\":[{\"name\":\"Joyce, James\"}]}]"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "searchBooks",
        "description": "Search for books based on provided keywords",
        "parameters": {
          "type": "object",
          "properties": {
            "search_terms": {
              "type": "array",
              "items": { "type": "string" },
              "description": "List of keywords to search for books"
            }
          },
          "required": ["search_terms"]
        }
      }
    }
  ]
}

The model can then use the tool output and generate the final answer.

Example: Tool calling loop (TypeScript)

Warning: @lunos/client may lag behind the API; if this sample fails, use the OpenAI JavaScript SDK with the Lunos baseURL instead.

Below is a minimal loop that:

  • Calls Lunos with tools
  • Executes each tool_call
  • Sends each tool result back
  • Stops when the model no longer requests tools

import { LunosClient } from "@lunos/client";

const client = new LunosClient({
  apiKey: process.env.LUNOS_API_KEY!,
  baseURL: "https://api.lunos.tech/v1",
  appId: "tool-function-calling-v1",
});

const MODEL = "openai/gpt-4o";
const MAX_TOOL_ITERATIONS = 5;

const tools = [
  {
    type: "function",
    function: {
      name: "searchBooks",
      description: "Search for books based on provided keywords",
      parameters: {
        type: "object",
        properties: {
          search_terms: {
            type: "array",
            items: { type: "string" },
            description: "List of keywords to search for books",
          },
        },
        required: ["search_terms"],
      },
    },
  },
];

async function searchBooks(search_terms: string[]) {
  const url = "https://gutendex.com/books";
  const searchQuery = encodeURIComponent(search_terms.join(" "));
  const response = await fetch(`${url}?search=${searchQuery}`);
  const data = await response.json();
  return (data.results ?? []).map((book: any) => ({
    id: book.id,
    title: book.title,
    authors: book.authors,
  }));
}

const TOOL_MAPPING: Record<
  string,
  (args: any) => Promise<unknown>
> = {
  searchBooks: ({ search_terms }) => searchBooks(search_terms),
};

async function run() {
  const baseMessages = [
    {
      role: "system",
      content: "You are a helpful assistant.",
    },
    {
      role: "user",
      content: "What are the titles of some James Joyce books?",
    },
  ];

  let messages = [...baseMessages];
  let iterations = 0;
  let finalContent = "";

  // Cap iterations so a misbehaving model cannot loop forever.
  while (iterations < MAX_TOOL_ITERATIONS) {
    iterations += 1;

    const result = await client.chat.completions.create({
      model: MODEL,
      tools,
      messages,
    });

    const assistantMessage = result.choices[0]?.message;
    if (!assistantMessage) break;

    const toolCalls = assistantMessage.tool_calls ?? [];
    if (toolCalls.length === 0) {
      // No tool calls requested: this is the model's final answer.
      finalContent = assistantMessage.content ?? "";
      break;
    }

    const nextMessages: any[] = [...messages, assistantMessage];

    for (const toolCall of toolCalls) {
      const toolName = toolCall.function.name;
      const rawArgs = toolCall.function.arguments ?? "{}";
      const args = JSON.parse(rawArgs); // may throw if the model emitted malformed JSON

      const toolFn = TOOL_MAPPING[toolName];
      if (!toolFn) {
        throw new Error(`No local tool handler for ${toolName}`);
      }

      const toolResult = await toolFn(args);

      nextMessages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: JSON.stringify(toolResult),
      });
    }

    messages = nextMessages;
  }

  console.log(finalContent);
}

void run();

Tool schema tips

When you define a tool, focus on clarity:

  • Use a specific, descriptive function name
  • Write a concrete description for when the tool should be used
  • Use a strict JSON Schema for parameters
  • Keep the parameter list small and well-scoped

Selecting models for tools

Before deploying, confirm:

  • The model supports tool calling
  • The model can produce consistent tool arguments for your schema
  • Your application validates tool inputs before calling external systems
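The validation point can be sketched for the searchBooks tool defined in the earlier examples. This is a minimal hand-rolled validator; a schema-validation library would work equally well. The error messages are illustrative.

```typescript
// Sketch: validate tool arguments before executing anything. The expected
// shape matches the searchBooks parameter schema from the examples above.
function parseSearchBooksArgs(raw: string): { search_terms: string[] } {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("tool arguments are not valid JSON");
  }
  const args = parsed as { search_terms?: unknown };
  if (
    !Array.isArray(args.search_terms) ||
    args.search_terms.length === 0 ||
    !args.search_terms.every((t) => typeof t === "string")
  ) {
    throw new Error("search_terms must be a non-empty array of strings");
  }
  return { search_terms: args.search_terms };
}
```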

Tool usage controls (optional)

Some OpenAI-compatible implementations expose parameters that control tool behavior. If your model/proxy supports them, you may see options such as:

  • tool_choice to request or disable tool usage
  • parallel_tool_calls to control whether multiple tools can run at once

Always verify the exact parameter behavior for the specific model you’re using.
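Where tool_choice is supported, it typically follows the OpenAI convention: "auto" (the default), "none", or an object naming a specific function. For example, to request a particular tool (a sketch, assuming OpenAI-compatible behavior):

```json
{
  "tool_choice": {
    "type": "function",
    "function": { "name": "searchBooks" }
  }
}
```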

Interleaved thinking (reasoning between tool calls)

Some models/proxies can reason with intermediate tool results before requesting the next tool call. This can improve decision-making in multi-step workflows.

Implementation details vary by model. Before relying on it, check whether your chosen model supports the relevant reasoning mode (for example, via model metadata in GET /v1/models), then confirm the parameter names in your model/proxy docs.

Streaming with tool calls

If you enable streaming (stream: true), the response may arrive as incremental chunks. Tool calls can be sent as deltas, and the final “finish reason” indicates whether the model is requesting tools or returning a completed answer.

In your client, collect the streamed tool_calls until the stream signals that tool execution should happen, run the tools locally, then send the tool results back in a non-streaming follow-up (or continue the loop with streaming if your implementation supports it).
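The accumulation step can be sketched as follows, assuming OpenAI-style stream chunks where each tool-call delta fragment carries an index plus partial id, name, and arguments fields. The exact delta shape should be confirmed against your model/proxy.

```typescript
// Sketch: merge streamed tool-call deltas into complete tool calls.
// Assumes OpenAI-style fragments: { index, id?, function?: { name?, arguments? } }.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};

type ToolCall = {
  id: string;
  function: { name: string; arguments: string };
};

function accumulateToolCalls(deltasPerChunk: ToolCallDelta[][]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const chunkDeltas of deltasPerChunk) {
    for (const d of chunkDeltas) {
      if (!calls[d.index]) {
        calls[d.index] = { id: "", function: { name: "", arguments: "" } };
      }
      const call = calls[d.index];
      if (d.id) call.id = d.id;
      if (d.function?.name) call.function.name += d.function.name;
      if (d.function?.arguments) call.function.arguments += d.function.arguments;
    }
  }
  return calls;
}
```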

Parallel tool calls

Some implementations allow the model to request multiple tools in the same turn.

If you want tool execution to be strictly sequential, disable parallel tool calls:

{
  "parallel_tool_calls": false
}
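When parallel calls are enabled, the tool calls from one assistant turn can be executed concurrently. This sketch reuses the TOOL_MAPPING shape from the loop example above; the returned messages match the tool-result format from step 3.

```typescript
// Sketch: execute all tool calls from a single assistant turn concurrently
// and build one tool-result message per call.
async function executeToolCallsInParallel(
  toolCalls: { id: string; function: { name: string; arguments: string } }[],
  toolMapping: Record<string, (args: any) => Promise<unknown>>
) {
  return Promise.all(
    toolCalls.map(async (call) => {
      const fn = toolMapping[call.function.name];
      if (!fn) throw new Error(`No local tool handler for ${call.function.name}`);
      const result = await fn(JSON.parse(call.function.arguments || "{}"));
      return {
        role: "tool" as const,
        tool_call_id: call.id,
        content: JSON.stringify(result),
      };
    })
  );
}
```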

Multi-tool workflows

You can define multiple tools and let the model chain them naturally. This usually works best when each tool has a narrowly scoped purpose.

Example:

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "searchProducts",
        "description": "Search for products in your catalog",
        "parameters": {
          "type": "object",
          "properties": {
            "query": { "type": "string" }
          },
          "required": ["query"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "getProductDetails",
        "description": "Fetch detailed information for a product",
        "parameters": {
          "type": "object",
          "properties": {
            "productId": { "type": "string" }
          },
          "required": ["productId"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "checkInventory",
        "description": "Check inventory levels for a product",
        "parameters": {
          "type": "object",
          "properties": {
            "productId": { "type": "string" }
          },
          "required": ["productId"]
        }
      }
    }
  ]
}

Best practices

  • Treat tool arguments as untrusted input. Validate and sanitize before executing anything dangerous.
  • Add timeouts and retries around external tool calls in your app.
  • Set a maximum number of tool iterations to avoid runaway loops.
  • Log tool execution metadata safely (avoid logging API keys or sensitive payloads).
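For the timeout point, a small generic wrapper around any tool's promise is one option. This is a sketch; the error message and budget are illustrative.

```typescript
// Sketch: bound a tool's execution time so a slow external dependency
// cannot stall the tool loop indefinitely.
async function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("tool call timed out")), timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer!); // always clear so the process can exit cleanly
  }
}
```

Usage: `await withTimeout(searchBooks(["joyce"]), 5000)` fails fast after five seconds instead of hanging the loop.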