fix most web_fetches from getting blocked using a real user agent

This commit is contained in:
2026-06-11 23:36:19 -07:00
parent 22aa652257
commit d7214c88ad
8 changed files with 403 additions and 54 deletions

View File

@@ -299,7 +299,7 @@ Behavior notes:
- For `anthropic`, image attachments are sent as Messages API `image` blocks using base64 source data; text attachments are added as `text` blocks.
- Available Sybil-managed tool calls for `openai` and `xai`: `web_search` and `fetch_url`. When `CHAT_CODEX_TOOL_ENABLED=true`, `codex_exec` is also available. When `CHAT_SHELL_TOOL_ENABLED=true`, `shell_exec` is also available.
- `web_search` returns ranked results with per-result summaries/snippets. Its backend engine is selected by `CHAT_WEB_SEARCH_ENGINE` (`exa` default, or `searxng` with `SEARXNG_BASE_URL` set). SearXNG mode requires the instance to allow `format=json`.
- `fetch_url` fetches a URL and returns plaintext page content (HTML converted to text server-side).
- `fetch_url` fetches a URL with browser-like navigation headers and returns plaintext page content (HTML converted to text server-side).
- `codex_exec` delegates coding, shell, repository inspection, and other complex software tasks to a persistent remote Codex CLI workspace over SSH. The server runs `codex exec --dangerously-bypass-approvals-and-sandbox --skip-git-repo-check <non-interactive wrapped prompt>` on the configured devbox inside `CHAT_CODEX_REMOTE_WORKDIR`, with SSH stdin closed.
- `shell_exec` runs arbitrary non-interactive shell commands on the same configured devbox, starting in `CHAT_CODEX_REMOTE_WORKDIR`. It uses `bash -lc` when bash exists, otherwise `sh -lc`, closes SSH stdin, and does not run inside the Sybil server container.
- Devbox tool configuration:

View File

@@ -172,6 +172,7 @@ Terminal tool-call event:
- `openai`: backend uses OpenAI's Responses API and may execute internal function tool calls (`web_search`, `fetch_url`, optional `codex_exec`, and optional `shell_exec`) before producing final text.
- `xai`: backend uses xAI's OpenAI-compatible Chat Completions API and may execute the same internal tool calls before producing final text.
- `fetch_url` sends browser-like navigation headers for outbound URL requests to reduce false 403s from sites that reject generic server clients.
- `hermes-agent`: backend uses the configured Hermes Agent OpenAI-compatible Chat Completions API. Sybil does not add its own tool definitions for this provider; Hermes Agent handles its own tools server-side. Custom Hermes stream events are normalized away unless they produce text deltas in this SSE contract.
- `openai`: image attachments are sent as Responses `input_image` items; text attachments are sent as `input_text` items.
- `xai` and `hermes-agent`: image attachments are sent as Chat Completions content parts; text attachments are inlined as text parts.