# REST API Contract

Base URL: `/api` behind web proxy, or server root directly in local/dev.

Authentication:

- If `ADMIN_TOKEN` is set on server, send `Authorization: Bearer <token>`.
- If `ADMIN_TOKEN` is unset, API is open for local/dev use.

Content type:

- Requests with bodies use `application/json`.
- Responses are JSON unless noted otherwise.

Chat upload limits:

- Chat completion and direct message payloads support inline attachments up to a 32 MB request body.
- Up to 8 attachments per message.
- Image attachments: PNG or JPEG only, max 6 MB each.
- Text attachments: up to 8 MB source size each; server accepts at most 200,000 characters of inlined text content per attachment.

## Health + Auth

### `GET /health`

- Response: `{ "ok": true }`

### `GET /v1/auth/session`

- Response: `{ "authenticated": true, "mode": "open" | "token" }`

## Models

### `GET /v1/models`

- Response:

```json
{
  "providers": {
    "openai": { "models": ["gpt-4.1-mini"], "loadedAt": "2026-02-14T00:00:00.000Z", "error": null },
    "anthropic": { "models": ["claude-3-5-sonnet-latest"], "loadedAt": null, "error": null },
    "xai": { "models": ["grok-3-mini"], "loadedAt": null, "error": null },
    "hermes-agent": { "models": ["hermes-agent"], "loadedAt": null, "error": null }
  }
}
```

- OpenAI model lists are filtered to models that are expected to work with the backend's Responses API implementation.
- `hermes-agent` is included only when `HERMES_AGENT_API_KEY` is configured. Set it to Hermes `API_SERVER_KEY`, or any non-empty value if that local server does not require auth. `HERMES_AGENT_API_BASE_URL` defaults to `http://127.0.0.1:8642/v1`; set `HERMES_AGENT_MODEL` only when you need an additional fallback/override model id.

## Active Runs

### `GET /v1/active-runs`

- Response:

```json
{
  "chats": ["chat-id-with-active-stream"],
  "searches": ["search-id-with-active-stream"]
}
```

Behavior notes:

- Lists in-memory chat/search streams that are still running on this server process.
- Clients should use this after app start or page refresh to restore per-row generating indicators.
- The lists are not durable across server restarts.

## Chats

### `GET /v1/chats`

- Response: `{ "chats": ChatSummary[] }`

### `POST /v1/chats`

- Body:

```json
{
  "title": "optional title",
  "provider": "optional openai|anthropic|xai|hermes-agent",
  "model": "optional model id",
  "messages": [
    { "role": "system|user|assistant|tool", "content": "string", "name": "optional", "attachments": [] }
  ]
}
```

- Response: `{ "chat": ChatSummary }`

Behavior notes:

- `provider` and `model` must be supplied together when present.
- When `provider`/`model` are supplied, the new chat initializes `initiatedProvider`/`initiatedModel` and `lastUsedProvider`/`lastUsedModel`.
- Optional `messages` are inserted as the initial transcript. Attachment metadata uses the same schema and limits as chat completion messages.

### `PATCH /v1/chats/:chatId`

- Body: `{ "title": string }`
- Response: `{ "chat": ChatSummary }`
- Not found: `404 { "message": "chat not found" }`

### `POST /v1/chats/title/suggest`

- Body:

```json
{ "chatId": "chat-id", "content": "user request text" }
```

- Response: `{ "chat": ChatSummary }`

Behavior notes:

- If the chat already has a non-empty title, server returns the existing chat unchanged.
- Server always uses OpenAI `gpt-4.1-mini` to generate a one-line title (up to ~4 words), updates the chat title, and returns the updated chat.

### `DELETE /v1/chats/:chatId`

- Response: `{ "deleted": true }`
- Not found: `404 { "message": "chat not found" }`

### `GET /v1/chats/:chatId`

- Response: `{ "chat": ChatDetail }`

### `POST /v1/chats/:chatId/messages`

- Body:

```json
{
  "role": "system|user|assistant|tool",
  "content": "string",
  "name": "optional",
  "metadata": {},
  "attachments": [
    {
      "kind": "image",
      "id": "attachment-id",
      "filename": "photo.jpg",
      "mimeType": "image/jpeg",
      "sizeBytes": 12345,
      "dataUrl": "data:image/jpeg;base64,..."
    },
    {
      "kind": "text",
      "id": "attachment-id",
      "filename": "notes.md",
      "mimeType": "text/markdown",
      "sizeBytes": 4567,
      "text": "# Notes\n...",
      "truncated": false
    }
  ]
}
```

- Response: `{ "message": Message }`

Notes:

- `attachments` is optional and is merged into stored `message.metadata.attachments`.
- Tool messages should not include attachments.

## Chat Completions (non-streaming)

### `POST /v1/chat-completions`

- Body:

```json
{
  "chatId": "optional-chat-id",
  "provider": "openai|anthropic|xai|hermes-agent",
  "model": "string",
  "messages": [
    {
      "role": "system|user|assistant|tool",
      "content": "string",
      "name": "optional",
      "attachments": [
        {
          "kind": "image",
          "id": "attachment-id",
          "filename": "photo.jpg",
          "mimeType": "image/jpeg",
          "sizeBytes": 12345,
          "dataUrl": "data:image/jpeg;base64,..."
        },
        {
          "kind": "text",
          "id": "attachment-id",
          "filename": "notes.md",
          "mimeType": "text/markdown",
          "sizeBytes": 4567,
          "text": "# Notes\n...",
          "truncated": false
        }
      ]
    }
  ],
  "temperature": 0.2,
  "maxTokens": 256
}
```

- Response:

```json
{
  "chatId": "chat-id-or-null",
  "provider": "openai",
  "model": "gpt-4.1-mini",
  "message": { "role": "assistant", "content": "..." },
  "usage": { "inputTokens": 10, "outputTokens": 20, "totalTokens": 30 },
  "raw": {}
}
```

Behavior notes:

- If `chatId` is present, server validates chat existence.
- For `chatId` calls, server stores only *new* non-assistant messages from provided history to avoid duplicates.
- Server persists final assistant output and call metadata (`LlmCall`) in DB.
- Server updates chat-level model metadata on each call: `lastUsedProvider`/`lastUsedModel`; first successful/failed call also initializes `initiatedProvider`/`initiatedModel` if unset.
- Attachments are optional and currently apply to `user` messages. Persisted chat history stores them under `message.metadata.attachments`.
- Images are forwarded inline to providers as multimodal image parts. Use PNG or JPEG for cross-provider compatibility.
- Text files are forwarded as explicit text blocks rather than provider-managed file references. Large text attachments should already be truncated client-side before submission.
- For `openai`, backend calls OpenAI's Responses API and enables internal tool use with an internal system instruction.
- For `xai`, backend calls xAI's OpenAI-compatible Chat Completions API and enables internal tool use with the same internal system instruction.
- For `hermes-agent`, backend calls the configured Hermes Agent OpenAI-compatible Chat Completions API without adding Sybil-managed tool definitions; Hermes Agent handles its own tools server-side.
- For `openai`, image attachments are sent as Responses `input_image` items and text attachments are sent as `input_text` items.
- For `xai` and `hermes-agent`, image attachments are sent as Chat Completions content parts alongside text.
- For `openai`, Responses calls that can enter the server-managed tool loop use `store: true` so reasoning and function-call items can be passed between tool rounds.
- For `anthropic`, image attachments are sent as Messages API `image` blocks using base64 source data; text attachments are added as `text` blocks.
- Available Sybil-managed tool calls for `openai` and `xai`: `web_search` and `fetch_url`. When `CHAT_CODEX_TOOL_ENABLED=true`, `codex_exec` is also available. When `CHAT_SHELL_TOOL_ENABLED=true`, `shell_exec` is also available.
- `web_search` returns ranked results with per-result summaries/snippets. Its backend engine is selected by `CHAT_WEB_SEARCH_ENGINE` (`exa` default, or `searxng` with `SEARXNG_BASE_URL` set). SearXNG mode requires the instance to allow `format=json`.
- `fetch_url` fetches a URL and returns plaintext page content (HTML converted to text server-side).
- `codex_exec` delegates coding, shell, repository inspection, and other complex software tasks to a persistent remote Codex CLI workspace over SSH.
  The server runs `codex exec --dangerously-bypass-approvals-and-sandbox --skip-git-repo-check <prompt>` on the configured devbox inside `CHAT_CODEX_REMOTE_WORKDIR`, with SSH stdin closed.
- `shell_exec` runs arbitrary non-interactive shell commands on the same configured devbox, starting in `CHAT_CODEX_REMOTE_WORKDIR`. It uses `bash -lc` when bash exists, otherwise `sh -lc`, closes SSH stdin, and does not run inside the Sybil server container.
- Devbox tool configuration:
  - `CHAT_MAX_TOOL_ROUNDS=100` (optional; maximum model/tool result cycles before the backend returns a limit message)
  - `CHAT_CODEX_TOOL_ENABLED=true`
  - `CHAT_SHELL_TOOL_ENABLED=true`
  - `CHAT_CODEX_REMOTE_HOST=<host>` (required when enabled)
  - `CHAT_CODEX_REMOTE_USER=<user>` (optional; omitted if `CHAT_CODEX_REMOTE_HOST` already contains `user@host`)
  - `CHAT_CODEX_REMOTE_PORT=22` (optional)
  - `CHAT_CODEX_REMOTE_WORKDIR=/workspace/sybil-codex` (optional; created on the remote host if missing)
  - `CHAT_CODEX_SSH_KEY_PATH=/run/secrets/codex_ssh_key` (recommended private-key delivery via read-only volume mount)
  - `CHAT_CODEX_SSH_PRIVATE_KEY_B64=<base64-encoded key>` (optional fallback when a volume mount is not practical)
  - `CHAT_CODEX_EXEC_TIMEOUT_MS=600000` (optional)
  - `CHAT_SHELL_EXEC_TIMEOUT_MS=120000` (optional)
- When a tool call is executed, backend stores a chat `Message` with `role: "tool"` and tool metadata (`metadata.kind = "tool_call"`). Streaming requests persist each completed tool call as its SSE `tool_call` event is emitted, then store the assistant output when the completion finishes.
- `anthropic` currently runs without server-managed tool calls.
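The upload limits above (at most 8 attachments per message, at most 200,000 characters of inlined text per attachment) can be enforced client-side before calling `POST /v1/chat-completions`. The sketch below uses the attachment shape from this contract; the helper names (`buildTextAttachment`, `assertAttachmentCount`) and the character-based `sizeBytes` shortcut are illustrative, not part of the API.

```typescript
// Limits taken from the "Chat upload limits" section of this contract.
const MAX_INLINE_TEXT_CHARS = 200_000;
const MAX_ATTACHMENTS_PER_MESSAGE = 8;

interface TextAttachment {
  kind: "text";
  id: string;
  filename: string;
  mimeType: string;
  sizeBytes: number;
  text: string;
  truncated: boolean;
}

// Build a text attachment, truncating to the documented inline cap and
// recording whether truncation happened in the "truncated" flag.
function buildTextAttachment(
  id: string,
  filename: string,
  mimeType: string,
  source: string
): TextAttachment {
  const truncated = source.length > MAX_INLINE_TEXT_CHARS;
  return {
    kind: "text",
    id,
    filename,
    mimeType,
    sizeBytes: source.length, // simplification: byte-accurate sizing would measure encoded bytes
    text: truncated ? source.slice(0, MAX_INLINE_TEXT_CHARS) : source,
    truncated,
  };
}

// Reject messages that exceed the per-message attachment count.
function assertAttachmentCount(attachments: unknown[]): void {
  if (attachments.length > MAX_ATTACHMENTS_PER_MESSAGE) {
    throw new Error(`at most ${MAX_ATTACHMENTS_PER_MESSAGE} attachments per message`);
  }
}
```

A message built this way carries the attachment under `messages[].attachments` in the request body; the server persists it under `message.metadata.attachments`.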
## Searches

### `GET /v1/searches`

- Response: `{ "searches": SearchSummary[] }`

### `POST /v1/searches`

- Body: `{ "title"?: string, "query"?: string }`
- Response: `{ "search": SearchSummary }`

### `DELETE /v1/searches/:searchId`

- Response: `{ "deleted": true }`
- Not found: `404 { "message": "search not found" }`

### `GET /v1/searches/:searchId`

- Response: `{ "search": SearchDetail }`

### `POST /v1/searches/:searchId/chat`

- Body: `{ "title"?: string }`
- Response: `{ "chat": ChatSummary }`
- Not found: `404 { "message": "search not found" }`

Behavior notes:

- Creates a new chat seeded with a hidden `system` message containing the search query, answer text, answer citations, and top search results.
- Clients should include existing `system` messages when sending the chat history to `/v1/chat-completions` or `/v1/chat-completions/stream`; they may hide those messages in the transcript UI.
- The default chat title is `Search: <query>`, unless `title` is supplied.

### `POST /v1/searches/:searchId/run`

- Body:

```json
{
  "query": "optional override",
  "title": "optional override",
  "type": "auto|fast|deep|instant",
  "numResults": 10,
  "includeDomains": ["example.com"],
  "excludeDomains": ["example.org"]
}
```

- Response: `{ "search": SearchDetail }`

Search run notes:

- Backend executes Exa search and Exa answer.
- Search mode is independent from chat `web_search` tool configuration and remains Exa-only.
- Persists answer text/citations + ranked results.
- If both search and answer fail, endpoint returns an error.
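Since every field in the run body is an optional override, a client typically sends only the fields it wants to change. A minimal sketch of building that body, assuming the field names and the `auto|fast|deep|instant` enum from this contract (the builder function itself is hypothetical):

```typescript
// Allowed values for "type", per the request body above.
type SearchRunType = "auto" | "fast" | "deep" | "instant";

interface SearchRunBody {
  query?: string;
  title?: string;
  type?: SearchRunType;
  numResults?: number;
  includeDomains?: string[];
  excludeDomains?: string[];
}

// Validate the type enum and drop undefined fields so the JSON body
// carries only the overrides actually supplied.
function buildSearchRunBody(opts: SearchRunBody): SearchRunBody {
  const allowed: SearchRunType[] = ["auto", "fast", "deep", "instant"];
  if (opts.type !== undefined && !allowed.includes(opts.type)) {
    throw new Error(`type must be one of ${allowed.join("|")}`);
  }
  return Object.fromEntries(
    Object.entries(opts).filter(([, value]) => value !== undefined)
  ) as SearchRunBody;
}
```

The resulting object is what gets serialized as the JSON request body for `POST /v1/searches/:searchId/run` (and its streaming variant, which accepts the same body).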
### `POST /v1/searches/:searchId/run/stream`

- Body: same as `POST /v1/searches/:searchId/run`
- Response: `text/event-stream`

Events:

- `search_results`: `{ "requestId": string|null, "results": SearchResultItem[] }`
- `search_error`: `{ "error": string }`
- `answer`: `{ "answerText": string|null, "answerRequestId": string|null, "answerCitations": SearchDetail["answerCitations"] }`
- `answer_error`: `{ "error": string }`
- terminal `done`: `{ "search": SearchDetail }`
- terminal `error`: `{ "message": string }`

Behavior notes:

- The stream is owned by the backend after it starts. If the original HTTP client disconnects, the backend keeps running and persists the final search state.
- While a search stream is active, `GET /v1/active-runs` includes the `searchId`.
- If a stream is already active for the same `searchId`, this endpoint attaches to the existing stream instead of starting a second run.

### `POST /v1/searches/:searchId/run/stream/attach`

- Body: none
- Response: `text/event-stream` with the same event names as `POST /v1/searches/:searchId/run/stream`
- Not found: `404 { "message": "active search stream not found" }`

Behavior notes:

- Replays buffered events for the active in-memory stream, then emits new events until `done` or `error`.
- Intended for clients that discovered a pending search via `GET /v1/active-runs`, such as after browser refresh.

## Type Shapes

`ChatSummary`

```json
{
  "id": "...",
  "title": null,
  "createdAt": "...",
  "updatedAt": "...",
  "initiatedProvider": "openai|anthropic|xai|hermes-agent|null",
  "initiatedModel": "string|null",
  "lastUsedProvider": "openai|anthropic|xai|hermes-agent|null",
  "lastUsedModel": "string|null"
}
```

`Message`

```json
{
  "id": "...",
  "createdAt": "...",
  "role": "system|user|assistant|tool",
  "content": "...",
  "name": null,
  "metadata": {
    "attachments": [
      {
        "kind": "image",
        "id": "attachment-id",
        "filename": "photo.jpg",
        "mimeType": "image/jpeg",
        "sizeBytes": 12345,
        "dataUrl": "data:image/jpeg;base64,..."
      },
      {
        "kind": "text",
        "id": "attachment-id",
        "filename": "notes.md",
        "mimeType": "text/markdown",
        "sizeBytes": 4567,
        "text": "# Notes\n...",
        "truncated": false
      }
    ]
  }
}
```

`metadata` remains nullable. Tool-call log messages still use `metadata.kind = "tool_call"`; regular user messages with attachments use `metadata.attachments`.

`ChatDetail`

```json
{
  "id": "...",
  "title": null,
  "createdAt": "...",
  "updatedAt": "...",
  "initiatedProvider": "openai|anthropic|xai|hermes-agent|null",
  "initiatedModel": "string|null",
  "lastUsedProvider": "openai|anthropic|xai|hermes-agent|null",
  "lastUsedModel": "string|null",
  "messages": [Message]
}
```

`SearchSummary`

```json
{ "id": "...", "title": null, "query": null, "createdAt": "...", "updatedAt": "..." }
```

`SearchDetail`

```json
{
  "id": "...",
  "title": "...",
  "query": "...",
  "createdAt": "...",
  "updatedAt": "...",
  "requestId": "...",
  "latencyMs": 123,
  "error": null,
  "answerText": "...",
  "answerRequestId": "...",
  "answerCitations": [],
  "answerError": null,
  "results": []
}
```

For streaming contracts, see `docs/api/streaming-chat.md`.
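Because `Message.metadata` can be null, a tool-call log (`metadata.kind = "tool_call"`), or an attachment carrier (`metadata.attachments`), transcript renderers need to branch on it defensively. A minimal sketch under those three documented shapes; the helper names are illustrative, and the `Attachment` type is trimmed to a few fields for brevity:

```typescript
// Trimmed attachment shape; the full shape appears in the Message example above.
type Attachment = { kind: "image" | "text"; id: string; filename: string };

// The three documented possibilities for Message.metadata.
type MessageMetadata =
  | null
  | { kind: "tool_call"; [extra: string]: unknown }
  | { attachments: Attachment[]; [extra: string]: unknown };

// True only for tool-call log messages.
function isToolCallLog(metadata: MessageMetadata): boolean {
  return metadata !== null && (metadata as { kind?: unknown }).kind === "tool_call";
}

// Attachments for regular messages; empty for null metadata and tool-call logs.
function attachmentsOf(metadata: MessageMetadata): Attachment[] {
  if (metadata === null || isToolCallLog(metadata)) return [];
  return (metadata as { attachments?: Attachment[] }).attachments ?? [];
}
```

Treating the tool-call check as the first branch matters: a transcript UI usually renders tool-call logs with distinct styling, and only falls back to attachment rendering for ordinary user messages.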