server/README.md

# Sybil Server

Backend API for:
- LLM multiplexer (OpenAI / Anthropic / xAI (Grok))
- Personal chat database (chats/messages + LLM call log)

## Stack
- Node.js + TypeScript
- Fastify (HTTP)
- Prisma + SQLite (dev)

## Quick start

```bash
cp .env.example .env
npm run dev
```

Migrations are applied automatically on server startup (`prisma migrate deploy`).

Open docs: `http://localhost:8787/docs`

API contract docs for clients:
- `../docs/api/rest.md`
- `../docs/api/streaming-chat.md`

## Run Modes

- `npm run dev`: runs `src/index.ts` with `tsx` in watch mode (auto-restart on file changes). Use for local development.
- `npm run start`: runs compiled `dist/index.js` with Node.js (no watch mode). Use for production-like runs.

Both modes run startup checks (`predev` / `prestart`) and apply migrations at app boot.

## Auth

Set `ADMIN_TOKEN` and send:

`Authorization: Bearer <ADMIN_TOKEN>`

If `ADMIN_TOKEN` is not set, the server runs in open mode with no authentication. Use open mode for development only.
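
As a rough client-side sketch of the rule above: the helper below adds the bearer header only when a token is available, so the same code works against an open-mode dev server. `authHeaders` is a hypothetical name for illustration and is not part of this repo.

```typescript
// Hypothetical client helper (not part of the server codebase).
// Builds request headers, adding the bearer token only when one is set.
function authHeaders(adminToken?: string): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (adminToken) {
    // Matches the documented format: Authorization: Bearer <ADMIN_TOKEN>
    headers["Authorization"] = `Bearer ${adminToken}`;
  }
  return headers;
}
```

Pass the result as the `headers` option of `fetch` (or any HTTP client) when calling the endpoints listed under API.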

## Env
- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `XAI_API_KEY`
- `EXA_API_KEY`
- `CHAT_WEB_SEARCH_ENGINE` (`exa` by default; set to `searxng` to use SearXNG, which applies to chat tool calls only)
- `SEARXNG_BASE_URL` (required when `CHAT_WEB_SEARCH_ENGINE=searxng`; instance must allow `format=json`)
- `CHAT_MAX_TOOL_ROUNDS` (`8` by default; maximum model/tool result cycles per chat completion)
- `CHAT_CODEX_TOOL_ENABLED` (`false` by default; enables the `codex_exec` chat tool for OpenAI/xAI)
- `CHAT_CODEX_REMOTE_HOST` (required when Codex tool is enabled; SSH host/IP or `user@host`)
- `CHAT_CODEX_REMOTE_USER` (optional SSH user when host does not include one)
- `CHAT_CODEX_REMOTE_PORT` (`22` by default)
- `CHAT_CODEX_REMOTE_WORKDIR` (`/workspace/sybil-codex` by default; created and reused on the devbox)
- `CHAT_CODEX_SSH_KEY_PATH` (recommended: path to a read-only mounted private key)
- `CHAT_CODEX_SSH_PRIVATE_KEY_B64` (optional fallback private key delivery)
- `CHAT_CODEX_EXEC_TIMEOUT_MS` (`600000` by default)
- `CHAT_SHELL_TOOL_ENABLED` (`false` by default; enables the `shell_exec` chat tool for OpenAI/xAI on the same devbox)
- `CHAT_SHELL_EXEC_TIMEOUT_MS` (`120000` by default)
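
The defaults and "required when" rules in this list can be summarized in a short sketch. This is an illustration only, not the server's actual config loader: `readChatToolConfig`, the `ChatToolConfig` shape, and the assumption that booleans are written as the string `true` are all hypothetical.

```typescript
// Illustrative sketch of the documented defaults and dependencies.
// Names and parsing rules here are assumptions, not the server's real code.
type Env = Record<string, string | undefined>;

interface ChatToolConfig {
  webSearchEngine: "exa" | "searxng";
  searxngBaseUrl?: string;
  maxToolRounds: number;
  codexToolEnabled: boolean;
  codexRemoteHost?: string;
  codexRemotePort: number;
  codexRemoteWorkdir: string;
  codexExecTimeoutMs: number;
  shellToolEnabled: boolean;
  shellExecTimeoutMs: number;
}

// Parse an integer variable, falling back to the documented default.
function intOr(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function parseEngine(value: string | undefined): "exa" | "searxng" {
  if (value === undefined || value === "exa") return "exa";
  if (value === "searxng") return "searxng";
  throw new Error(`Unsupported CHAT_WEB_SEARCH_ENGINE: ${value}`);
}

function readChatToolConfig(env: Env): ChatToolConfig {
  const webSearchEngine = parseEngine(env["CHAT_WEB_SEARCH_ENGINE"]);
  if (webSearchEngine === "searxng" && !env["SEARXNG_BASE_URL"]) {
    throw new Error("SEARXNG_BASE_URL is required when CHAT_WEB_SEARCH_ENGINE=searxng");
  }

  // Assumption: flags are enabled with the literal string "true".
  const codexToolEnabled = env["CHAT_CODEX_TOOL_ENABLED"] === "true";
  if (codexToolEnabled && !env["CHAT_CODEX_REMOTE_HOST"]) {
    throw new Error("CHAT_CODEX_REMOTE_HOST is required when the Codex tool is enabled");
  }

  return {
    webSearchEngine,
    searxngBaseUrl: env["SEARXNG_BASE_URL"],
    maxToolRounds: intOr(env["CHAT_MAX_TOOL_ROUNDS"], 8),
    codexToolEnabled,
    codexRemoteHost: env["CHAT_CODEX_REMOTE_HOST"],
    codexRemotePort: intOr(env["CHAT_CODEX_REMOTE_PORT"], 22),
    codexRemoteWorkdir: env["CHAT_CODEX_REMOTE_WORKDIR"] ?? "/workspace/sybil-codex",
    codexExecTimeoutMs: intOr(env["CHAT_CODEX_EXEC_TIMEOUT_MS"], 600000),
    shellToolEnabled: env["CHAT_SHELL_TOOL_ENABLED"] === "true",
    shellExecTimeoutMs: intOr(env["CHAT_SHELL_EXEC_TIMEOUT_MS"], 120000),
  };
}
```

With an empty environment this yields the documented defaults (`exa`, 8 tool rounds, port 22, both tools disabled). Selecting `searxng` without a base URL fails fast instead of at the first tool call.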

## API
- `GET /health`
- `GET /v1/auth/session`
- `GET /v1/chats`
- `POST /v1/chats`
- `GET /v1/chats/:chatId`
- `POST /v1/chats/:chatId/messages`
- `POST /v1/chat-completions`
- `POST /v1/chat-completions/stream` (SSE)
- `GET /v1/searches`
- `POST /v1/searches`
- `GET /v1/searches/:searchId`
- `POST /v1/searches/:searchId/run`
- `POST /v1/searches/:searchId/run/stream` (SSE)
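
The two `(SSE)` endpoints respond with a server-sent event stream. Below is a minimal sketch of splitting such a stream into events on the client, following the generic SSE wire format (`event:` and `data:` fields, with a blank line between events). The event names and payload shapes this server actually emits are defined in `../docs/api/streaming-chat.md` and are not assumed here. `parseSseFrames` is a hypothetical helper.

```typescript
// Generic SSE frame parser (hypothetical client helper).
// It knows nothing about this server's specific event names or payloads.
interface SseEvent {
  event: string; // "message" when the frame has no explicit event field
  data: string;  // data lines joined with "\n"
}

// Parses all complete frames in `buffer` and returns the unparsed tail,
// which the caller should prepend to the next chunk read from the stream.
function parseSseFrames(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? ""; // last piece may be an incomplete frame
  const events: SseEvent[] = [];

  for (const frame of frames) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment / keep-alive line
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

A typical read loop decodes each chunk from the response body, appends it to `rest`, calls `parseSseFrames`, and then handles each event's `data` field.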

Each search run calls both Exa `searchAndContents` and Exa `answer`, and stores:
- ranked search results (for result cards), and
- a top-level answer block + citations.

When you pass `chatId` to a completion endpoint, you can send the full conversation context on every call. The server stores only the new non-assistant messages, so resending history does not create duplicate rows.

`POST /v1/chat-completions` body example:

```json
{
  "chatId": "<optional chat id>",
  "provider": "openai",
  "model": "gpt-4.1-mini",
  "messages": [
    {"role":"system","content":"You are helpful."},
    {"role":"user","content":"Say hi"}
  ],
  "temperature": 0.2,
  "maxTokens": 256
}
```
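
For TypeScript clients, the same request can be sketched as follows. The types mirror only the fields shown in the JSON example above. `buildChatCompletionRequest` is a hypothetical helper, the base URL is taken from the docs URL in Quick start, and the response shape is not assumed here (see `../docs/api/rest.md`).

```typescript
// Hypothetical client-side sketch; field names follow the JSON example above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionBody {
  chatId?: string;   // optional: attach the call to a stored chat
  provider: string;  // "openai" in the example above
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  maxTokens?: number;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch options without sending anything.
function buildChatCompletionRequest(
  baseUrl: string,
  body: ChatCompletionBody,
  adminToken?: string,
): PreparedRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (adminToken) {
    headers["Authorization"] = `Bearer ${adminToken}`;
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat-completions`,
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}

// Usage (requires a running server):
//   const { url, init } = buildChatCompletionRequest("http://localhost:8787", {
//     provider: "openai",
//     model: "gpt-4.1-mini",
//     messages: [{ role: "user", content: "Say hi" }],
//   }, process.env.ADMIN_TOKEN);
//   const res = await fetch(url, init);
```

Keeping request construction separate from the `fetch` call makes it easy to unit test and to reuse for the `/stream` variant.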

## Next steps (planned)
- Better streaming protocol compatibility (OpenAI-style chunks + cancellation)
- Tool/function calling normalization
- User accounts + per-device API keys
- Postgres support + migrations for prod
- Attachments + embeddings + semantic search