Sybil Server

Backend API for:

  • LLM multiplexer (OpenAI Responses / Anthropic / xAI Chat Completions-compatible Grok)
  • Personal chat database (chats/messages + LLM call log)

Stack

  • Node.js + TypeScript
  • Fastify (HTTP)
  • Prisma + SQLite (dev)

Quick start

cp .env.example .env
npm run dev

Migrations are applied automatically on server startup (prisma migrate deploy).

Open docs: http://localhost:8787/docs

API contract docs for clients:

  • ../docs/api/rest.md
  • ../docs/api/streaming-chat.md

Run Modes

  • npm run dev: runs src/index.ts with tsx in watch mode (auto-restart on file changes). Use for local development.
  • npm run start: runs compiled dist/index.js with Node.js (no watch mode). Use for production-like runs.

Both modes run startup checks (predev / prestart) and apply migrations at app boot.

Auth

Set ADMIN_TOKEN and send:

Authorization: Bearer <ADMIN_TOKEN>

If ADMIN_TOKEN is not set, the server runs in open mode with no auth check (intended for local dev only).
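As a sketch, a client might build its request headers like this (authHeaders is a hypothetical helper for illustration, not part of this repo):

```typescript
// Hypothetical client-side helper: builds request headers for the Sybil API.
// When no admin token is supplied (open/dev mode), no Authorization header is sent.
function authHeaders(adminToken?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (adminToken) {
    headers["Authorization"] = `Bearer ${adminToken}`;
  }
  return headers;
}
```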

Env

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • XAI_API_KEY
  • EXA_API_KEY
  • CHAT_WEB_SEARCH_ENGINE (web search engine used for chat tool calls only; exa by default, or searxng)
  • SEARXNG_BASE_URL (required when CHAT_WEB_SEARCH_ENGINE=searxng; instance must allow format=json)
  • CHAT_MAX_TOOL_ROUNDS (100 by default; maximum model/tool result cycles per chat completion)
  • CHAT_CODEX_TOOL_ENABLED (false by default; enables the codex_exec chat tool for OpenAI/xAI)
  • CHAT_CODEX_REMOTE_HOST (required when Codex tool is enabled; SSH host/IP or user@host)
  • CHAT_CODEX_REMOTE_USER (optional SSH user when host does not include one)
  • CHAT_CODEX_REMOTE_PORT (22 by default)
  • CHAT_CODEX_REMOTE_WORKDIR (/workspace/sybil-codex by default; created and reused on the devbox)
  • CHAT_CODEX_SSH_KEY_PATH (recommended: path to a read-only mounted private key)
  • CHAT_CODEX_SSH_PRIVATE_KEY_B64 (optional fallback private key delivery)
  • CHAT_CODEX_EXEC_TIMEOUT_MS (600000 by default)
  • CHAT_SHELL_TOOL_ENABLED (false by default; enables the shell_exec chat tool for OpenAI/xAI on the same devbox)
  • CHAT_SHELL_EXEC_TIMEOUT_MS (120000 by default)
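A minimal dev .env for the Exa-backed default setup might look like the sketch below (all values are placeholders; set only the variables you actually use):

```
# Sketch of a dev .env; placeholder values, not real keys
ADMIN_TOKEN=change-me            # omit to run in open mode (dev)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
XAI_API_KEY=xai-...
EXA_API_KEY=exa-...
CHAT_WEB_SEARCH_ENGINE=exa       # or searxng (then also set SEARXNG_BASE_URL)
```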

API

  • GET /health
  • GET /v1/auth/session
  • GET /v1/chats
  • POST /v1/chats
  • GET /v1/chats/:chatId
  • POST /v1/chats/:chatId/messages
  • POST /v1/chat-completions
  • POST /v1/chat-completions/stream (SSE)
  • GET /v1/searches
  • POST /v1/searches
  • GET /v1/searches/:searchId
  • POST /v1/searches/:searchId/run
  • POST /v1/searches/:searchId/run/stream (SSE)
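The two stream endpoints emit standard server-sent events. A minimal frame parser could be sketched as below; the server's actual event names and payload shapes are documented in ../docs/api/streaming-chat.md and are not assumed here:

```typescript
// Minimal SSE frame parser sketch: one frame = the lines before a blank line.
// Only the generic "event:" / "data:" fields are handled; the server's actual
// event names and JSON payloads are specified in docs/api/streaming-chat.md.
function parseSseFrame(frame: string): { event?: string; data: string } {
  let event: string | undefined;
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) {
      event = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      // Per the SSE spec, a single leading space after the colon is stripped.
      dataLines.push(line.slice("data:".length).replace(/^ /, ""));
    }
  }
  return { event, data: dataLines.join("\n") };
}
```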

Search runs now execute both Exa searchAndContents and Exa answer, storing:

  • ranked search results (for result cards), and
  • a top-level answer block + citations.

When chatId is provided to the completion endpoints, clients may send the full conversation context; the server stores only new non-assistant messages, so history rows are not duplicated.
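As an illustration only (this is not the server's implementation), the dedup rule above amounts to something like:

```typescript
// Illustration only; NOT the server's actual code. It sketches the dedup rule
// described above under two assumptions: assistant messages from the request
// are never persisted, and messages already in stored history are skipped.
interface Msg { role: string; content: string; }

function messagesToStore(stored: Msg[], incoming: Msg[]): Msg[] {
  const seen = new Set(stored.map((m) => `${m.role}\u0000${m.content}`));
  return incoming.filter(
    (m) => m.role !== "assistant" && !seen.has(`${m.role}\u0000${m.content}`)
  );
}
```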

POST /v1/chat-completions body example:

{
  "chatId": "<optional chat id>",
  "provider": "openai",
  "model": "gpt-4.1-mini",
  "messages": [
    {"role":"system","content":"You are helpful."},
    {"role":"user","content":"Say hi"}
  ],
  "temperature": 0.2,
  "maxTokens": 256
}
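A client call matching the body above could be sketched as follows (buildCompletionRequest and complete are illustrative helpers; the response shape is not assumed here, and ../docs/api/rest.md is the authoritative contract):

```typescript
// Sketch of a client for POST /v1/chat-completions. Field names mirror the
// README body example; response handling is left generic.
interface ChatMessage { role: "system" | "user" | "assistant"; content: string; }
interface CompletionRequest {
  chatId?: string;
  provider: "openai" | "anthropic" | "xai";
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  maxTokens?: number;
}

// Builds a request body, defaulting to the openai provider.
function buildCompletionRequest(
  model: string,
  messages: ChatMessage[],
  opts: Partial<CompletionRequest> = {}
): CompletionRequest {
  return { provider: "openai", model, messages, ...opts };
}

// POSTs the body to the completion endpoint (Node 18+ global fetch).
async function complete(baseUrl: string, req: CompletionRequest): Promise<unknown> {
  const res = await fetch(`${baseUrl}/v1/chat-completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```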

Next steps (planned)

  • Better streaming protocol compatibility (OpenAI-style chunks + cancellation)
  • Tool/function calling normalization
  • User accounts + per-device API keys
  • Postgres support + migrations for prod
  • Attachments + embeddings + semantic search