Local Page Archiver

This project saves web pages as self-contained HTML archives. It opens the input, either a URL or a local HTML file, with Playwright, captures the rendered HTML, and inlines external resources as data: URLs.
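
To illustrate the inlining step, here is a minimal sketch of rewriting resource references as data: URLs. The helper names `toDataUrl` and `inlineResources`, the shape of the `resources` map, and the regex-based rewrite are assumptions made for illustration; the project's actual code may work quite differently (for example, by walking the DOM inside Playwright).

```javascript
// Hypothetical helpers; the names and approach are illustrative only.

// Encode raw bytes as a base64 data: URL with the given MIME type.
function toDataUrl(bytes, mimeType) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Rewrite src="..." and href="..." values for which a captured resource exists.
// `resources` maps an absolute URL to { bytes, mimeType } (assumed shape).
// URLs without a captured resource are left untouched.
function inlineResources(html, resources) {
  return html.replace(/\b(src|href)="([^"]+)"/g, (match, attr, url) => {
    const resource = resources.get(url);
    if (!resource) return match;
    return `${attr}="${toDataUrl(resource.bytes, resource.mimeType)}"`;
  });
}

const resources = new Map([
  ["https://example.com/dot.gif", { bytes: Buffer.from("GIF89a"), mimeType: "image/gif" }],
]);
const html =
  '<img src="https://example.com/dot.gif"><a href="https://example.com/next">next</a>';

console.log(inlineResources(html, resources));
```

A regular expression keeps the sketch short, but it only handles double-quoted attributes; it ignores srcset, CSS url(...) references, and single-quoted values, which a complete archiver would also need to cover.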

CLI

npm install
npm run install-browsers
node src/cli.mjs archive "https://example.com/article"

For an existing HTML file:

node src/cli.mjs archive ./page.html

Archives are written to ARCHIVE_PATH, or to a development directory under the system temp directory when ARCHIVE_PATH is not set.
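
The output directory rule above can be sketched as follows. The function name `resolveArchiveDir` and the fallback directory name `local-page-archiver-dev` are hypothetical; the source only says that a development directory under the system temp directory is used when ARCHIVE_PATH is not set.

```javascript
import os from "node:os";
import path from "node:path";

// Hypothetical fallback directory name; the project's actual name is not stated.
const DEV_DIR_NAME = "local-page-archiver-dev";

// Return the directory archives should be written to.
// Prefers ARCHIVE_PATH; otherwise falls back to a directory under os.tmpdir().
function resolveArchiveDir(env = process.env) {
  const configured = env.ARCHIVE_PATH?.trim();
  if (configured) return path.resolve(configured);
  return path.join(os.tmpdir(), DEV_DIR_NAME);
}

console.log(resolveArchiveDir({ ARCHIVE_PATH: "./archives" }));
console.log(resolveArchiveDir({}));
```

Passing the environment as a parameter keeps the function easy to test without mutating process.env.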

Ephemeral container worker

The host-facing container boundary is src/container-runner.mjs. It starts a short-lived Docker/Podman worker container, mounts the host archive directory at /archives, sends one archive request, reads a JSON result, and exits.
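
As a rough sketch of what such a runner might do, the code below builds the container command line and parses a JSON result. It is not the contents of src/container-runner.mjs. The function names, the assumption that the result is the last non-empty line of stdout, and the example result fields (`ok`, `file`) are all hypothetical; only the `run`, `--rm`, `-i`, and `-v` options are standard Docker/Podman flags.

```javascript
// Hypothetical sketch; names and the request/result transport are assumptions.

// Build the argv for a short-lived worker container.
// `archivePath` is assumed to be an absolute host path.
function buildWorkerArgs({ runtime = "podman", image, archivePath }) {
  return [
    runtime,
    "run",
    "--rm", // remove the container when it exits
    "-i", // keep stdin open so a request could be written to it
    "-v",
    `${archivePath}:/archives`, // mount the host archive directory
    image,
  ];
}

// Parse the worker result, assuming it is JSON on the last non-empty stdout line.
function parseWorkerResult(stdout) {
  const lines = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean);
  if (lines.length === 0) throw new Error("worker produced no output");
  return JSON.parse(lines[lines.length - 1]);
}

const args = buildWorkerArgs({
  image: "local-page-archiver:latest",
  archivePath: "/Users/me/archives",
});
console.log(args.join(" "));

// Example result shape is illustrative only.
console.log(parseWorkerResult('starting\n{"ok":true,"file":"/archives/article.html"}\n'));
```

The argv array could then be handed to `spawn` from `node:child_process`, with the first element as the command and the rest as arguments, which avoids shell quoting issues around URLs.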

Build the worker image:

podman build -t local-page-archiver:latest .

Archive through the worker on macOS with Podman:

node src/container-runner.mjs archive "https://example.com/article" \
  --runtime podman \
  --image local-page-archiver:latest \
  --archive-path ./archives

The convenience wrapper does the same thing and builds the image first if it is missing:

./podman-run.sh archive "https://example.com/article"

For visual debugging, expose VNC from the worker:

./podman-run.sh vnc-archive "https://example.com/article"
# Then open vnc://localhost:5901

The worker image starts Xvfb internally, so callers do not need to mount the host X11 socket or override the entrypoint.
