Generate conversational responses using our AI models.

Endpoint Details

POST /api/chat/completions

Generate a chat completion from a conversation

Request Format

Headers

Content-Type
string
required

Must be set to application/json

x-api-key
string
required

Your API key

Body Parameters

model
string

The model to use for completion. Defaults to meta-llama/Meta-Llama-3-8B-Instruct-Turbo

messages
array
required

Array of message objects in the conversation, each with a role (e.g. user or assistant) and a content string

Example Request

{
  "model": "meta-llama/Meta-Llama-3-8B-Instruct-Turbo",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}
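The request above can be sent with a short Python sketch using only the standard library. The base URL is an assumption for illustration (substitute your deployment's host), and YOUR_API_KEY is a placeholder:

```python
import json
import urllib.request

# Assumed host for illustration only; replace with your deployment's base URL.
API_URL = "https://api.example.com/api/chat/completions"

def build_chat_request(api_key, messages, model=None):
    """Build a POST request for /api/chat/completions.

    model is optional; the server defaults to
    meta-llama/Meta-Llama-3-8B-Instruct-Turbo when it is omitted.
    """
    body = {"messages": messages}
    if model is not None:
        body["model"] = model
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # required
            "x-api-key": api_key,                # required
        },
        method="POST",
    )

req = build_chat_request("YOUR_API_KEY", [{"role": "user", "content": "Hello!"}])
# response = urllib.request.urlopen(req)  # send once a real key and host are set
```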

Response Format

{
  "id": "cmpl-123abc",
  "object": "chat.completion",
  "created": 1677649420,
  "model": "meta-llama/Meta-Llama-3-8B-Instruct-Turbo",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      }
    }
  ]
}
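The assistant's reply lives at choices[0].message.content. A minimal sketch of pulling it out of the response body shown above:

```python
# Response body as shown in the documentation above.
response_body = {
    "id": "cmpl-123abc",
    "object": "chat.completion",
    "created": 1677649420,
    "model": "meta-llama/Meta-Llama-3-8B-Instruct-Turbo",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I help you today?",
            },
        }
    ],
}

# The generated text is nested under the first choice's message.
reply = response_body["choices"][0]["message"]["content"]
print(reply)  # Hello! How can I help you today?
```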

Model Parameters

  • Maximum Output Tokens: 725
  • Temperature: 0.7
  • Top P: 0.9

Caching

  • Responses are cached for 1 hour
  • The cache key is derived from the model and the message contents
  • Cache hits are returned immediately, without invoking the model
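Because the cache key depends only on the model and the message content, identical requests hit the same cache entry. The exact derivation is not specified; the sketch below (hashing a canonical JSON form) is an assumption for illustration:

```python
import hashlib
import json

def cache_key(model, messages):
    # Hypothetical scheme: the docs say the key is based on the model and
    # message content, but the precise derivation is an assumption here.
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

k1 = cache_key("m", [{"role": "user", "content": "Hello!"}])
k2 = cache_key("m", [{"role": "user", "content": "Hello!"}])
assert k1 == k2  # identical model + messages -> same cache entry
```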

Error Scenarios

  • 400: Invalid request format or empty messages array
  • 401: Missing API key
  • 403: Invalid API key
  • 429: Rate limit exceeded
  • 500: Internal server error

Error Response Example

{
  "error": {
    "message": "Error description",
    "code": 400
  }
}
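Client-side handling can branch on the status code: 429 and 500 are worth retrying, while the 4xx codes indicate a request that must be fixed first. The retry policy below is illustrative, not part of the API contract:

```python
# Illustrative handling of the error codes listed above; the choice of which
# codes to retry is a client-side assumption, not an API guarantee.
RETRYABLE = {429, 500}

def describe_error(status, error_body):
    """Summarize an error response and a suggested next step."""
    message = error_body.get("error", {}).get("message", "Unknown error")
    action = "retry with backoff" if status in RETRYABLE else "fix the request"
    return f"{status}: {message} ({action})"

print(describe_error(401, {"error": {"message": "Missing API key", "code": 401}}))
```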