sekko
Capture browser and terminal sessions; extract structured artifacts AI agents can read.
Capture browser and terminal sessions, then extract structured artifacts an AI agent can read to understand how an app or CLI tool works. Record what a user does — clicks, commands, API calls — with optional voice-over narration explaining why.
DISCLAIMER / PSA: Shared as-is, I hope it helps. This space is evolving quickly. I am sharing to help others wade through the fog and swamp, as I have been doing. Look around; there are probably better tools than this one. 😉
Why
Working with Claude Code on existing projects, I wanted a quick way to
teach Claude the app — how to use it, the user flow, the important
screens. So Claude Code and I did a sidebar on the project we were
working on, built sekko, then used it. It lowered a lot of the friction
of working on brownfield code. One concrete win: sekko made it much
easier to add or patch acceptance-test suites, which are vital when
working in existing codebases.
Your mileage may vary — I share it in case it helps. There are other tools in this space; look around before settling.
How it works
Two steps: record a session, then extract structured artifacts from the recording. Narration is optional.
- sekko record-web <url>: open a Chromium browser and capture clicks, navigation, network requests, DOM snapshots, and screenshots. Saves a Playwright trace (.zip), a HAR file, and a user-events JSON file.
- sekko record-terminal: open an interactive shell and capture every command with its output and exit code. Saves an asciicast (.cast) with command-boundary markers.
- sekko extract <input>: turn a trace or terminal recording into agent-consumable markdown (actions, selectors, network, screenshots, terminal session, summary). Auto-detects the format by extension.
- --narrate (on either recording command): capture voice-over audio alongside the session. Transcribe inline or later via sekko transcribe; extract merges the transcript into the output.
Install
Platform: macOS Apple Silicon (darwin-arm64) only. The terminal-recording path uses node-pty with a vendored arm64 prebuild; on other platforms npm install will refuse with EBADPLATFORM. Cross-platform support hasn’t been validated; if you’d find it useful, open an issue.
npm install -g sekko
npx playwright install chromium
Or from source:
git clone https://github.com/csepulv/save-the-tokens
cd save-the-tokens/tools/sekko
npm install
npm link
npx playwright install chromium
For voice-over narration, see Narration below — needs SoX and either whisper-cpp (local) or Deepgram (cloud).
Quick Start
1. Record a browser trace
sekko record-web https://your-app.com --output ./my-trace
A Chromium browser opens. Use the app — navigate, click, fill forms. When you’re done, close the browser window. sekko saves:
- trace.zip: full Playwright trace (actions, DOM snapshots, screenshots, network)
- recording.har: HAR file (all HTTP requests/responses)
- user-events.json: captured user interactions (clicks, form fills, navigation)
2. Record a terminal session
sekko record-terminal --output ./my-session
An interactive shell opens (zsh preferred, bash fallback). Use it
normally — run commands, install packages, configure tools. Type
exit or Ctrl-D to stop. sekko saves:
- recording.cast: asciicast v2 recording with command-boundary markers
3. Extract artifacts
# From a browser trace
sekko extract ./my-trace/trace.zip --output ./context
# From a terminal recording
sekko extract ./my-session/recording.cast --output ./context
Browser trace produces:
- summary.md: start here. Lists all artifacts and how to use them.
- actions.md: what the user did, in order (clicks, navigation, form fills with selectors). Correlated with network request IDs.
- network.md: HTTP request summary table with IDs, correlated to triggering actions.
- network-detail.json: full request/response bodies, referenced by ID.
- selectors.md: unique selectors for interactive elements.
- screenshots/: visual state at key moments.
- narration.md: timestamped voice-over transcript (when narration.json is present).
Terminal recording produces:
- summary.md: artifact manifest.
- terminal-session.md: every command, its output, exit code, and duration.
- terminal-session.json: same data in structured JSON.
- narration.md: voice-over transcript (when narration.json is present).
Terminal extraction includes credential redaction (GitHub tokens, AWS keys, bearer tokens, database passwords), interactive-program detection (vim, less), and long-output truncation.
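To make the redaction step concrete, here is a minimal sketch of pattern-based credential masking. It is an illustration only: the regular expressions, the placeholder strings, and the redact function are assumptions made for this example and are not taken from sekko's source, so its real rules may differ.

```python
import re

# Illustrative patterns only (assumed for this sketch, not sekko's actual rules).
REDACTION_PATTERNS = [
    # GitHub tokens such as ghp_..., gho_..., ghs_...
    (re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"), "[REDACTED_GITHUB_TOKEN]"),
    # AWS access key IDs
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    # "Bearer <token>" in headers or curl commands
    (re.compile(r"\b(bearer)\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE), r"\1 [REDACTED]"),
    # Passwords embedded in connection URLs: scheme://user:password@host
    (re.compile(r"(\w+://[^\s:/@]+:)[^\s@]+(@)"), r"\1[REDACTED]\2"),
]


def redact(text: str) -> str:
    """Apply every pattern in order and return the masked text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Even with redaction in place, it is worth skimming terminal-session.md before sharing it, since no pattern list catches every secret format.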
4. Filter to your app’s API (browser traces)
Most traces include noise — CDN requests, auth providers, dev-server assets. Filter to just your app’s API:
sekko extract ./my-trace/trace.zip --include-hosts localhost:3456 --output ./context
Or exclude specific hosts:
sekko extract ./my-trace/trace.zip --exclude-hosts fonts.googleapis.com,clerk.accounts.dev --output ./context
Commands
| Command | Purpose |
|---|---|
| sekko record-web <url> | Record a browser session as a Playwright trace |
| sekko record-terminal | Record a terminal session as asciicast |
| sekko extract <input> | Extract agent-consumable artifacts (auto-detects .zip vs .cast) |
| sekko transcribe <audio> | Transcribe voice-over WAV to narration.json |
| sekko setup | Check and install narration dependencies (SoX, whisper-cpp, model) |
sekko record-web <url>
Record a browser session.
sekko record-web https://your-app.com
sekko record-web https://your-app.com --narrate
sekko record-web https://your-app.com --output ./traces/session-1
sekko record-web https://your-app.com --auth auth-state.json # reuse saved login
sekko record-web https://your-app.com --save-auth auth-state.json # save login for later
| Flag | Description | Default |
|---|---|---|
| -o, --output <dir> | Output directory | ./sekko-output |
| --auth <path> | Load browser storage state from JSON | — |
| --save-auth <path> | Save browser storage state to JSON on close | — |
| --narrate | Record voice-over audio (requires SoX) | off |
| --keyterm <terms> | Domain-specific terms for transcription accuracy (comma-separated) | — |
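If you want to peek at recording.har before running extract, the sketch below lists each request. It assumes the file follows the standard HAR 1.2 layout (log.entries, each with a request and a response); summarize_har is a hypothetical helper written for this example, not part of sekko.

```python
import json  # used in the usage note: json.load() on recording.har


def summarize_har(har: dict) -> list[tuple[str, int, str]]:
    """Return (method, status, url) for every entry in a HAR document."""
    rows = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        response = entry["response"]
        rows.append((request["method"], response["status"], request["url"]))
    return rows
```

To use it, load the file with json.load(open("./sekko-output/recording.har")) and pass the result to summarize_har. The path assumes the default output directory from the table above.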
sekko record-terminal
Record a terminal session. Shell hooks inject command-boundary markers
(preexec/precmd for zsh, PROMPT_COMMAND / DEBUG trap for bash) so
extraction can separate each command’s output cleanly.
sekko record-terminal
sekko record-terminal --output ./sessions/session-1
sekko record-terminal --shell bash
sekko record-terminal --narrate --keyterm "kubectl,terraform"
| Flag | Description | Default |
|---|---|---|
| -o, --output <dir> | Output directory | ./sekko-output |
| --shell <shell> | Shell to use (zsh or bash) | auto-detect (prefers zsh) |
| --narrate | Record voice-over audio (requires SoX) | off |
| --keyterm <terms> | Domain-specific terms for transcription accuracy (comma-separated) | — |
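For a sense of what recording.cast contains, here is a minimal reader for the asciicast v2 layout: a JSON header on the first line, then one JSON array per event in the form [time, type, data]. How sekko encodes its command-boundary markers is not documented here, so this sketch only separates events by type and makes no assumption about marker contents. The function names are hypothetical.

```python
import json


def read_cast(text: str) -> tuple[dict, list[tuple[float, str, str]]]:
    """Parse asciicast v2 text into (header, events)."""
    # Split on "\n" only: each event is one newline-terminated JSON array.
    lines = [line for line in text.split("\n") if line.strip()]
    header = json.loads(lines[0])
    events = []
    for line in lines[1:]:
        time_offset, event_type, data = json.loads(line)
        events.append((float(time_offset), event_type, data))
    return header, events


def output_text(events: list[tuple[float, str, str]]) -> str:
    """Concatenate only the terminal output ("o") events."""
    return "".join(data for _, event_type, data in events if event_type == "o")
```

For anything beyond a quick look, sekko extract is the better route, since it already splits the session into commands with exit codes and durations.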
sekko extract <input>
Extract artifacts from a browser trace (.zip) or terminal recording
(.cast). Auto-detects the format.
sekko extract ./my-trace/trace.zip --output ./context
sekko extract ./my-session/recording.cast --output ./context
sekko extract ./my-trace/trace.zip --include-hosts localhost:3456 --output ./context
If narration.json exists alongside the input, extraction includes
narration.md automatically. If voice-over.wav exists but
narration.json doesn’t, extract reminds you to run sekko transcribe
first.
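The narration check described above amounts to a three-way decision based on which files sit next to the input. A rough sketch of that logic follows; the function name and return labels are invented for this example and do not reflect sekko's internals.

```python
from typing import Iterable


def narration_status(sibling_files: Iterable[str]) -> str:
    """Decide what extract should do, given the file names next to the input."""
    names = set(sibling_files)
    if "narration.json" in names:
        return "merge"  # narration.md is produced alongside the other artifacts
    if "voice-over.wav" in names:
        return "needs-transcribe"  # audio exists, but no transcript yet
    return "none"  # session was recorded without --narrate
```

Passing os.listdir("./my-trace") as the argument would apply the check to a real recording directory.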
| Flag | Description | Default |
|---|---|---|
| -o, --output <dir> | Output directory | ./sekko-extract |
| --include-hosts <hosts> | Only include requests to these hosts (comma-separated) | all hosts |
| --exclude-hosts <hosts> | Exclude requests to these hosts (comma-separated) | none |
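The include/exclude behavior can be pictured as a simple predicate over request URLs. The sketch below assumes exact matching against the URL's host[:port] part, which fits the localhost:3456 example but is otherwise a guess; sekko may match hosts differently. keep_request is a hypothetical name.

```python
from urllib.parse import urlsplit


def keep_request(url: str, include_hosts=(), exclude_hosts=()) -> bool:
    """Return True if a request to `url` survives the host filters."""
    # netloc keeps the port, so "http://localhost:3456/x" yields "localhost:3456".
    host = urlsplit(url).netloc.lower()
    if include_hosts and host not in include_hosts:
        return False  # an include list is set and this host is not on it
    return host not in exclude_hosts
```

An empty include list means "all hosts", matching the defaults in the table above.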
sekko transcribe <audio-file>
Transcribe a voice-over recording separately (if you skipped transcription after recording, or want to re-transcribe with different settings).
sekko transcribe ./my-trace/voice-over.wav
sekko transcribe ./my-trace/voice-over.wav --keyterm "JunkDrawer,foobar"
Reads voice-over-meta.json from the same directory for timestamp
correlation. Outputs narration.json. If a .mp3 exists alongside the
.wav, uses that for Deepgram uploads.
sekko setup
Check and install narration dependencies.
sekko setup
Walks through SoX, whisper-cpp, and the whisper model. Idempotent — re-running skips already-installed items.
Narration
Optional voice-over recording that adds why context to the what of
actions and commands. During recording, sekko captures your microphone;
after the recording, the audio is transcribed into narration.json.
When sekko extract runs and finds narration.json next to the
recording, the transcript merges into the output as narration.md.
Install narration deps
sekko setup
Or manually:
brew install sox # audio recording
brew install ffmpeg # compresses WAV → MP3 for Deepgram uploads
brew install whisper-cpp # local transcription (skip if using Deepgram)
mkdir -p ~/.sekko/models
curl -L -o ~/.sekko/models/ggml-small.en.bin \
https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.en.bin
ffmpeg is recommended — it compresses recordings from ~30 MB WAV to
~700 KB MP3, which is required for Deepgram cloud uploads and saves
disk space.
Configure transcription backend
Copy .env.sample to .env:
cp .env.sample .env
Defaults to local whisper. For cloud transcription (Deepgram):
SEKKO_TRANSCRIPTION_MODE=deepgram
DEEPGRAM_API_KEY=your_api_key_here
See .env.sample for all options.
Record with narration
sekko record-web https://your-app.com --narrate --output ./my-trace
sekko record-terminal --narrate --output ./my-session
Speak naturally while you use the app — explain what you’re doing, why you’re clicking things, what the app is showing. When the session ends, sekko prompts to transcribe immediately or defer.
Use --keyterm to improve transcription of domain-specific terms:
sekko record-web https://your-app.com --narrate --keyterm "JunkDrawer,foobar"
Transcribe later
If you deferred during recording:
sekko transcribe ./my-trace/voice-over.wav --keyterm "JunkDrawer,foobar"
Then re-run sekko extract to pick up the new narration.json.
Config file
For projects where you extract repeatedly with the same settings,
create sekko.config.yaml in your project root:
includeHosts:
- localhost:3456
CLI flags override config file values. See sekko.config.yaml.example
for all options.
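The precedence rule can be sketched as a small merge in which any flag actually passed on the command line replaces the config value. Whether sekko replaces list values outright or merges them is not stated, so this example assumes replacement. The function and the output key are illustrative, not sekko's API.

```python
def resolve_settings(config: dict, cli_flags: dict) -> dict:
    """Merge settings so that CLI flags win over config file values."""
    settings = dict(config)  # start from the config file
    for key, value in cli_flags.items():
        if value is not None:  # None means the flag was not passed
            settings[key] = value
    return settings
```

For example, a config with includeHosts set to localhost:3456 combined with a CLI value of localhost:8080 resolves to localhost:8080 under this assumption.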
Auth state workflow
First session — log in once and save the cookies/local storage:
sekko record-web https://your-app.com --save-auth auth-state.json --output ./trace-1
Subsequent sessions — reuse the saved auth:
sekko record-web https://your-app.com --auth auth-state.json --output ./trace-2
Combine both on a later session (once auth-state.json exists) to reuse the saved login and refresh the file on close:
sekko record-web https://your-app.com \
--save-auth auth-state.json \
--auth auth-state.json \
--output ./trace-3
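To judge whether a saved auth file has gone stale, you can check its cookie expiry times. The sketch below assumes auth-state.json uses Playwright's storage-state layout (a cookies array whose expires field is a Unix timestamp in seconds, with -1 for session cookies). The README does not confirm the exact format, and expired_cookies is a hypothetical helper.

```python
import time
from typing import Optional


def expired_cookies(state: dict, now: Optional[float] = None) -> list[str]:
    """List cookies in a storage-state dict whose expiry time has passed."""
    now = time.time() if now is None else now
    return [
        f'{cookie["name"]} ({cookie["domain"]})'
        for cookie in state.get("cookies", [])
        # expires == -1 marks a session cookie, which has no fixed expiry
        if cookie.get("expires", -1) != -1 and cookie["expires"] < now
    ]
```

If the list is non-empty for your session cookie, log in again with --save-auth to regenerate the file.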
Viewing traces
Open a trace in Playwright’s trace viewer:
npx playwright show-trace ./my-trace/trace.zip
Or drag trace.zip onto trace.playwright.dev.
Troubleshooting
sekko record-* --narrate fails with “SoX not found”
SoX isn’t installed. Run sekko setup or brew install sox.
Playwright can’t find Chromium
Install browsers: npx playwright install chromium. This is a one-time
step; sekko doesn’t run it for you on install.
macOS microphone permission
First time you run --narrate, macOS asks your terminal (or IDE) for
microphone permission. If you denied it accidentally, grant it in
System Settings → Privacy & Security → Microphone.
Deepgram returns 401 or 413
- 401: DEEPGRAM_API_KEY missing or invalid in .env.
- 413: WAV file over Deepgram’s ~25 MB limit. sekko compresses WAV → MP3 via ffmpeg before upload; install ffmpeg (brew install ffmpeg) if missing. Without it, long recordings fail.
Extract finds no actions
Playwright’s context.tracing only records Playwright API calls.
sekko injects user-event listeners via addInitScript to capture
manual clicks, form fills, and navigation. If actions are missing,
the page probably closed before events flushed — the trace saves in
a page.on('close') handler, so close the page (not Ctrl-C the
process).
Whisper transcription is slow or inaccurate
The default model is ggml-small.en.bin (small/fast/English). For
higher accuracy, swap to a larger model and point SEKKO_WHISPER_MODEL
at it. Or switch to Deepgram by setting SEKKO_TRANSCRIPTION_MODE=deepgram.
Development
npm test # vitest
npm run test:watch