API Endpoints
Chat Completions
Learn how to use the chat completions endpoint
Generate conversational responses using our AI models.
Endpoint Details
/api/chat/completions
Generate a chat completion from a conversation
Request Format
Headers
- `Content-Type`: must be set to `application/json`
- `Authorization`: your API key
Body Parameters
- `model` (optional): the model to use for the completion. Defaults to `meta-llama/Meta-Llama-3-8B-Instruct-Turbo`.
- `messages` (required): the array of messages in the conversation.
Example Request
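A minimal sketch of building the request in Python with only the standard library. It assumes the endpoint accepts POST over HTTPS and a Bearer-token `Authorization` scheme (neither is stated explicitly above); `BASE_URL` and `API_KEY` are placeholders.

```python
import json
import urllib.request

BASE_URL = "https://example.com"  # placeholder host, not the real API base URL
API_KEY = "YOUR_API_KEY"          # placeholder key

def build_request(messages, model="meta-llama/Meta-Llama-3-8B-Instruct-Turbo"):
    """Assemble the headers and JSON body described above into a Request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/chat/completions",
        data=body,
        headers=headers,
        method="POST",  # assumed; the method is not stated in this section
    )

req = build_request([{"role": "user", "content": "Hello!"}])
# urllib.request.urlopen(req) would actually send it; omitted here.
```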
Response Format
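The response schema is not reproduced in this section. A typical OpenAI-compatible chat completion response has the shape below; the field names here are an assumption based on that convention, not confirmed by this document.

```json
{
  "id": "chatcmpl-...",
  "model": "meta-llama/Meta-Llama-3-8B-Instruct-Turbo",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help?" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 9, "completion_tokens": 8, "total_tokens": 17 }
}
```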
Model Parameters
- Maximum Output Tokens: 725
- Temperature: 0.7
- Top P: 0.9
Caching
- Responses are cached for 1 hour
- Cache key is based on model and messages content
- Cached responses are served instantly
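The exact cache-key scheme is not specified above. One way to derive a key from the model and messages content, as described, is to hash a canonical JSON encoding; this is a sketch of the idea, not the service's actual implementation.

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Derive a deterministic key from the model and messages content.

    sort_keys plus compact separators make the encoding canonical, so the
    same conversation always hashes to the same key regardless of dict
    key ordering in the input.
    """
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

k1 = cache_key("m", [{"role": "user", "content": "hi"}])
k2 = cache_key("m", [{"content": "hi", "role": "user"}])  # same content, reordered keys
```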
Error Scenarios
- 400: Invalid request format or empty messages array
- 401: Missing API key
- 403: Invalid API key
- 429: Rate limit exceeded
- 500: Internal server error
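The status codes above can be mapped to exceptions in client code; a small helper sketch follows, not part of any official client library for this API.

```python
class APIError(Exception):
    """Raised for any documented error status from the endpoint."""

# Status codes and messages taken from the Error Scenarios list above.
ERROR_MESSAGES = {
    400: "Invalid request format or empty messages array",
    401: "Missing API key",
    403: "Invalid API key",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

def raise_for_status(status: int) -> None:
    """Raise APIError for a documented error status; do nothing otherwise."""
    if status in ERROR_MESSAGES:
        raise APIError(f"{status}: {ERROR_MESSAGES[status]}")
```

A caller might retry on 429 and 500 but treat 400, 401, and 403 as permanent failures, since resending the same request cannot fix a malformed body or a bad key.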