# Local Page Archiver
This project saves web pages as self-contained HTML archives. It opens the input (a URL or a local HTML file) with Playwright, captures the rendered HTML, and inlines external resources as `data:` URLs.
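As a rough illustration of the inlining step, the sketch below encodes resource bytes as base64 `data:` URLs and substitutes them into `src`/`href` attributes. The helper names `toDataUrl` and `inlineResources`, the shape of the `resources` map, and the regex-based rewrite are assumptions made for this example, not the project's actual code.

```javascript
// Hypothetical helper: encode raw bytes as a base64 data: URL.
function toDataUrl(mimeType, bytes) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Hypothetical helper: rewrite src="..." and href="..." attributes whose URL
// appears in `resources` (a Map of url -> { mimeType, bytes }).
// URLs that are not in the map are left untouched.
function inlineResources(html, resources) {
  return html.replace(/\b(src|href)="([^"]+)"/g, (match, attr, url) => {
    const resource = resources.get(url);
    return resource
      ? `${attr}="${toDataUrl(resource.mimeType, resource.bytes)}"`
      : match;
  });
}
```

A real implementation would more likely walk the DOM than use a regular expression, and would also need to handle CSS `url(...)` references; the regex only keeps the sketch short.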
## CLI
```sh
npm install
npm run install-browsers
node src/cli.mjs archive "https://example.com/article"
```
For an existing HTML file:
```sh
node src/cli.mjs archive ./page.html
```
Archives are written to `ARCHIVE_PATH`, or to a development directory under the system temp directory when `ARCHIVE_PATH` is not set.
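The fallback can be pictured roughly as follows. This is a minimal sketch: the function name `resolveArchiveDir`, the `local-page-archiver-dev` directory name, and the choice to treat a blank `ARCHIVE_PATH` as unset are all assumptions for illustration.

```javascript
// Hypothetical sketch: choose where archives are written.
// `env` is an environment object such as process.env.
// `tmpDir` is the system temp directory, e.g. the value of os.tmpdir().
function resolveArchiveDir(env, tmpDir) {
  const configured = (env.ARCHIVE_PATH ?? '').trim();
  // Assumption: a blank ARCHIVE_PATH is treated the same as an unset one.
  if (configured !== '') return configured;
  // Placeholder directory name; the real development directory may differ.
  return `${tmpDir.replace(/\/+$/, '')}/local-page-archiver-dev`;
}
```

In a Node script this might be called as `resolveArchiveDir(process.env, os.tmpdir())` after importing `node:os`.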
## Ephemeral container worker
The host-facing container boundary is `src/container-runner.mjs`. It starts a short-lived Docker/Podman worker container, mounts the host archive directory at `/archives`, sends one archive request, reads a JSON result, and exits.
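To make that flow concrete, here is a hedged sketch of the two pure pieces of such a runner: assembling the `run` arguments and reading the JSON result. The names `buildWorkerArgs` and `parseWorkerResult` are hypothetical, and passing the request as container arguments and reading the result from the last stdout line are assumptions; `src/container-runner.mjs` may work differently.

```javascript
// Hypothetical sketch: argv for one short-lived worker run.
// `archivePath` is assumed to be an absolute host path.
function buildWorkerArgs({ image, archivePath, request }) {
  return [
    'run',
    '--rm',                            // remove the container once it exits
    '-v', `${archivePath}:/archives`,  // mount the host archive directory
    image,
    ...request,                        // assumption: request passed as args
  ];
}

// Hypothetical sketch: treat the last non-empty stdout line as the JSON result.
function parseWorkerResult(stdout) {
  const lines = stdout.split('\n').map((line) => line.trim()).filter(Boolean);
  if (lines.length === 0) throw new Error('worker produced no output');
  return JSON.parse(lines[lines.length - 1]);
}
```

The argument list would then be handed to the chosen runtime, for example with `spawn('podman', args)` from `node:child_process`, since `run`, `--rm`, and `-v` are accepted by both Docker and Podman.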
Build the worker image:
```sh
podman build -t local-page-archiver:latest .
```
Archive through the worker on macOS with Podman:
```sh
node src/container-runner.mjs archive "https://example.com/article" \
  --runtime podman \
  --image local-page-archiver:latest \
  --archive-path ./archives
```
The convenience wrapper does the same thing and builds the image if missing:
```sh
./podman-run.sh archive "https://example.com/article"
```
For visual debugging, expose VNC from the worker:
```sh
./podman-run.sh vnc-archive "https://example.com/article"
# Then open vnc://localhost:5901
```
The worker image starts Xvfb internally, so callers do not need to mount the host X11 socket or override the entrypoint.