quick question feature

2026-05-02 23:48:01 -07:00
parent 6fbcaecbf8
commit 29e340fd08
8 changed files with 748 additions and 106 deletions

@@ -19,6 +19,7 @@ Authentication:
```json
{
"chatId": "optional-chat-id",
+"persist": true,
"provider": "openai|anthropic|xai",
"model": "string",
"messages": [
@@ -53,10 +54,12 @@ Authentication:
```
Notes:
-- If `chatId` is omitted, backend creates a new chat.
+- `persist` defaults to `true`.
+- If `persist` is `true` and `chatId` is omitted, backend creates a new chat.
- If `chatId` is provided, backend validates it exists.
-- Backend stores only new non-assistant input history rows to avoid duplicates.
-- Attachments are optional and are persisted under `message.metadata.attachments` on stored user messages.
+- If `persist` is `false`, `chatId` must be omitted. Backend does not create a chat and does not persist input messages, tool-call messages, assistant output, or `LlmCall` metadata.
+- For persisted streams, backend stores only new non-assistant input history rows to avoid duplicates.
+- Attachments are optional and are persisted under `message.metadata.attachments` on stored user messages when `persist` is `true`.
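
The `persist` / `chatId` rules in the notes above can be sketched as a small decision function. This is an illustration only, not the backend's actual code: the `StreamRequest` type, the `PersistPlan` union, and the `planPersistence` name are hypothetical, and only the field names `chatId`, `persist`, `provider`, `model`, and `messages` come from the documented request body.

```typescript
// Hypothetical request type; field names mirror the documented JSON body.
type Provider = "openai" | "anthropic" | "xai";

interface StreamRequest {
  chatId?: string;
  persist?: boolean;
  provider: Provider;
  model: string;
  messages: unknown[];
}

// Hypothetical result type describing what the backend would do with storage.
type PersistPlan =
  | { kind: "ephemeral" } // persist: false, nothing is written
  | { kind: "create-chat" } // persist: true, no chatId supplied
  | { kind: "use-chat"; chatId: string }; // persist: true, chatId supplied

function planPersistence(req: StreamRequest): PersistPlan {
  // `persist` defaults to `true` when omitted.
  const persist = req.persist ?? true;

  if (!persist) {
    // Per the notes, `chatId` must be omitted for non-persisted streams.
    if (req.chatId !== undefined) {
      throw new Error("chatId must be omitted when persist is false");
    }
    return { kind: "ephemeral" };
  }

  if (req.chatId === undefined) {
    return { kind: "create-chat" };
  }

  // The caller would still need to validate that this chat exists.
  return { kind: "use-chat", chatId: req.chatId };
}
```

Modelling the outcome as a tagged union keeps the three documented cases explicit, so a handler cannot accidentally write rows for an ephemeral stream without first narrowing on `kind`.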
## Event Stream Contract
@@ -71,13 +74,15 @@ Event order:
```json
{
"type": "meta",
-"chatId": "chat-id",
-"callId": "llm-call-id",
+"chatId": "chat-id-or-null",
+"callId": "llm-call-id-or-null",
"provider": "openai",
"model": "gpt-4.1-mini"
}
```
+For `persist: false` streams, `chatId` and `callId` are `null`.
### `delta`
```json
@@ -148,17 +153,22 @@ Tool-enabled streaming notes (`openai`/`xai`):
Backend database remains source of truth.
-During stream:
+For persisted streams:
- Client may optimistically render accumulated `delta` text.
- Backend persists each completed tool call as a `tool` message before emitting its `tool_call` SSE event, so chat detail refreshes can show completed tool calls while the assistant response is still running.
-On successful completion:
+On successful persisted completion:
- Backend persists assistant `Message` and updates `LlmCall` usage/latency in a transaction.
- Backend then emits `done`.
-On failure:
+On persisted failure:
- Backend records call error and emits `error`.
+For `persist: false` streams:
+- Client may render the same `meta`, `tool_call`, `delta`, and terminal events.
+- Backend does not write any chat, message, tool-call log, assistant output, or call metadata rows.
+- `done.text` is the canonical assistant text if the client later imports the result into a saved chat.
Client recommendation (for iOS/web):
1. Render deltas in real time for UX.
2. On `done`, refresh chat detail from REST (`GET /v1/chats/:chatId`) and use DB-backed data as canonical.
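
A client-side sketch of the recommendation above, covering both persisted and `persist: false` streams. It is a minimal illustration under stated assumptions, not the project's client code: the `StreamEvent` union, `StreamState`, `applyEvent`, and `nextAction` are hypothetical names, and the `delta.text` and `error.message` field names are assumed because those payloads are not shown in this excerpt. Only the `meta` fields, `done.text`, and the `GET /v1/chats/:chatId` route come from the document.

```typescript
// Hypothetical event union; `delta.text` and `error.message` are assumed field names.
type StreamEvent =
  | { type: "meta"; chatId: string | null; callId: string | null; provider: string; model: string }
  | { type: "delta"; text: string }
  | { type: "tool_call" }
  | { type: "done"; text: string }
  | { type: "error"; message: string };

interface StreamState {
  chatId: string | null; // null for persist: false streams
  rendered: string; // text currently shown to the user
  finished: boolean;
  error: string | null;
}

const initialState: StreamState = { chatId: null, rendered: "", finished: false, error: null };

// Pure reducer: fold one SSE event into the UI state.
function applyEvent(state: StreamState, ev: StreamEvent): StreamState {
  if (ev.type === "meta") {
    return { ...state, chatId: ev.chatId };
  }
  if (ev.type === "delta") {
    // Optimistic rendering of accumulated delta text.
    return { ...state, rendered: state.rendered + ev.text };
  }
  if (ev.type === "done") {
    // `done.text` is treated as canonical over the accumulated deltas.
    return { ...state, rendered: ev.text, finished: true };
  }
  if (ev.type === "error") {
    return { ...state, finished: true, error: ev.message };
  }
  return state; // tool_call: nothing to accumulate in this sketch
}

type NextAction =
  | { kind: "wait" }
  | { kind: "refresh"; path: string } // persisted stream: reload DB-backed chat detail
  | { kind: "keep-local" }; // persist: false: no chat exists to reload

function nextAction(state: StreamState): NextAction {
  if (!state.finished || state.error !== null) {
    return { kind: "wait" };
  }
  if (state.chatId !== null) {
    return { kind: "refresh", path: `/v1/chats/${encodeURIComponent(state.chatId)}` };
  }
  return { kind: "keep-local" };
}
```

Keeping the reducer pure leaves the actual network call (the REST refresh) to the caller, so the same logic can sit behind an iOS bridge or a web client without change.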