Chat Completions

The Chat Completions endpoints let you generate responses from a replica while maintaining conversation context. Each request automatically stores the prompt and the response in the chat history unless you opt out, so you can build interactive, stateful experiences across platforms such as web, Discord, or Telegram.

Key features include:

  • Contextual responses – use prior messages for continuity, or skip history for single-turn replies.
  • Flexible output – receive responses either as a stream or as a single JSON object, compatible with tools like the Vercel AI SDK.
  • Platform-specific support – include metadata for Discord or Telegram to keep conversations tied to their source.
  • Experimental OpenAI-style endpoint – limited compatibility for systems already built around the OpenAI Chat Completions API, returning structured JSON without streaming.
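
To make the features above concrete, here is a minimal sketch of how a single-turn, non-streaming request might be assembled in Python. Everything specific in it is an assumption made for illustration and not taken from this page: the base URL, the endpoint path, the bearer-token header, and the body field names (`content`, `skip_chat_history`, `source`, `platform_data`) are hypothetical placeholders. Check the endpoint reference for the real names before using it.

```python
import json
from typing import Any, Optional
from urllib import request

# Hypothetical base URL; replace with the real API host.
BASE_URL = "https://api.example.com"


def build_payload(
    content: str,
    *,
    skip_history: bool = False,
    source: str = "web",
    platform_data: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a request body. All field names here are assumed, not documented."""
    payload: dict[str, Any] = {
        "content": content,
        # True asks for a single-turn reply that is not stored in the history.
        "skip_chat_history": skip_history,
        # Where the conversation originates, e.g. "web", "discord", "telegram".
        "source": source,
    }
    if platform_data is not None:
        # Platform-specific metadata (for example, a Discord channel ID).
        payload["platform_data"] = platform_data
    return payload


def build_request(
    replica_id: str, payload: dict[str, Any], api_key: str
) -> request.Request:
    """Wrap the payload in a POST request without sending it."""
    # Hypothetical path layout for a per-replica chat completion.
    url = f"{BASE_URL}/replicas/{replica_id}/chat/completions"
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the real API may use a different header.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The request is only constructed here, not sent; passing it to `urllib.request.urlopen` would perform the call. Keeping payload construction separate from transport makes it easy to swap in a streaming client later without touching the body logic.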