# Sybil Server

Backend API for:

- LLM multiplexer (OpenAI / Anthropic / xAI (Grok))
- Personal chat database (chats/messages + LLM call log)

## Stack

- Node.js + TypeScript
- Fastify (HTTP)
- Prisma + SQLite (dev)

## Quick start

```bash
cp .env.example .env
npm run dev
```

Migrations are applied automatically on server startup (`prisma migrate deploy`).

Open docs: `http://localhost:8787/docs`

API contract docs for clients:

- `../docs/api/rest.md`
- `../docs/api/streaming-chat.md`

## Run Modes

- `npm run dev`: runs `src/index.ts` with `tsx` in watch mode (auto-restart on file changes). Use for local development.
- `npm run start`: runs the compiled `dist/index.js` with Node.js (no watch mode). Use for production-like runs.

Both modes run startup checks (`predev` / `prestart`) and apply migrations at app boot.

## Auth

Set `ADMIN_TOKEN` and send:

`Authorization: Bearer <token>`

If `ADMIN_TOKEN` is not set, the server runs in open mode (dev).

## Env

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `XAI_API_KEY`
- `EXA_API_KEY`
- `CHAT_WEB_SEARCH_ENGINE` (`exa` by default, or `searxng`; applies to chat tool calls only)
- `SEARXNG_BASE_URL` (required when `CHAT_WEB_SEARCH_ENGINE=searxng`; the instance must allow `format=json`)
- `CHAT_CODEX_TOOL_ENABLED` (`false` by default; enables the `codex_exec` chat tool for OpenAI/xAI)
- `CHAT_CODEX_REMOTE_HOST` (required when the Codex tool is enabled; SSH host/IP or `user@host`)
- `CHAT_CODEX_REMOTE_USER` (optional SSH user when the host does not include one)
- `CHAT_CODEX_REMOTE_PORT` (`22` by default)
- `CHAT_CODEX_REMOTE_WORKDIR` (`/workspace/sybil-codex` by default; created and reused on the devbox)
- `CHAT_CODEX_SSH_KEY_PATH` (recommended: path to a read-only mounted private key)
- `CHAT_CODEX_SSH_PRIVATE_KEY_B64` (optional fallback private key delivery)
- `CHAT_CODEX_EXEC_TIMEOUT_MS` (`600000` by default)

## API

- `GET /health`
- `GET /v1/auth/session`
- `GET /v1/chats`
- `POST /v1/chats`
- `GET /v1/chats/:chatId`
- `POST /v1/chats/:chatId/messages`
- `POST /v1/chat-completions`
- `POST /v1/chat-completions/stream` (SSE)
- `GET /v1/searches`
- `POST /v1/searches`
- `GET /v1/searches/:searchId`
- `POST /v1/searches/:searchId/run`
- `POST /v1/searches/:searchId/run/stream` (SSE)

Search runs now execute both Exa `searchAndContents` and Exa `answer`, storing:

- ranked search results (for result cards), and
- a top-level answer block + citations.

When `chatId` is provided to the completion endpoints, you can send the full conversation context. The server now stores only new non-assistant messages, to avoid duplicate history rows.

`POST /v1/chat-completions` body example:

```json
{
  "chatId": "<chat-id>",
  "provider": "openai",
  "model": "gpt-4.1-mini",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Say hi"}
  ],
  "temperature": 0.2,
  "maxTokens": 256
}
```

## Next steps (planned)

- Better streaming protocol compatibility (OpenAI-style chunks + cancellation)
- Tool/function calling normalization
- User accounts + per-device API keys
- Postgres support + migrations for prod
- Attachments + embeddings + semantic search
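Since the Codex tool needs several of the env vars above set together, a possible `.env` fragment looks like this (all values are illustrative placeholders, not defaults):

```bash
# Enable the codex_exec chat tool (illustrative values only)
CHAT_CODEX_TOOL_ENABLED=true
CHAT_CODEX_REMOTE_HOST=dev@devbox.example.com
CHAT_CODEX_SSH_KEY_PATH=/secrets/codex_id_ed25519
# Optional overrides; defaults shown in the Env section apply otherwise
# CHAT_CODEX_REMOTE_PORT=22
# CHAT_CODEX_REMOTE_WORKDIR=/workspace/sybil-codex
```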
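For reference, here is a sketch of a client calling `POST /v1/chat-completions` with the Bearer auth described above. The base URL and token handling are assumptions for illustration (`BASE_URL`, `ADMIN_TOKEN`, and `buildCompletionCall` are not part of the server API); the request body mirrors the example in the API section.

```typescript
// Sketch of a client-side request builder for POST /v1/chat-completions.
// BASE_URL is an assumption for illustration; adjust to your deployment.
const BASE_URL = "http://localhost:8787";
// When the server runs with ADMIN_TOKEN unset (open mode), no header is needed.
const ADMIN_TOKEN = process.env.ADMIN_TOKEN ?? "";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface CompletionRequest {
  chatId?: string;
  provider: "openai" | "anthropic" | "xai";
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  maxTokens?: number;
}

// Build the URL, headers, and serialized body for a completion call.
function buildCompletionCall(body: CompletionRequest) {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (ADMIN_TOKEN) headers["Authorization"] = `Bearer ${ADMIN_TOKEN}`;
  return {
    url: `${BASE_URL}/v1/chat-completions`,
    method: "POST",
    headers,
    body: JSON.stringify(body),
  };
}

const call = buildCompletionCall({
  provider: "openai",
  model: "gpt-4.1-mini",
  messages: [
    { role: "system", content: "You are helpful." },
    { role: "user", content: "Say hi" },
  ],
  temperature: 0.2,
  maxTokens: 256,
});
console.log(call.url); // http://localhost:8787/v1/chat-completions
```

Pass the returned fields to `fetch(call.url, { method: call.method, headers: call.headers, body: call.body })` or any HTTP client.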
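The two `/stream` endpoints speak SSE, so a client needs to split the byte stream into `data:` events. A minimal framing parser is sketched below; the actual event payload shape is defined in `../docs/api/streaming-chat.md`, and the `{"delta": ...}` / `[DONE]` strings in the sample are placeholders, not the server's documented format.

```typescript
// Minimal SSE framing parser for the /stream endpoints.
// Handles only standard `data:` line framing: events are separated by a
// blank line, and an event may span multiple data: lines (joined with \n).
function parseSseEvents(raw: string): string[] {
  const events: string[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).trimStart())
      .join("\n");
    if (data.length > 0) events.push(data);
  }
  return events;
}

// Placeholder payloads for illustration only.
const sample = 'data: {"delta":"Hel"}\n\ndata: {"delta":"lo"}\n\ndata: [DONE]\n\n';
console.log(parseSseEvents(sample)); // → ['{"delta":"Hel"}', '{"delta":"lo"}', '[DONE]']
```

In a real client you would buffer chunks from the response body and only parse up to the last complete blank-line delimiter, keeping the remainder for the next chunk.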