How it works

How Cadence turns local logs into an offline report.

Cadence turns local agent activity into a local SQLite cache, then renders an offline HTML report from that cache.

The normal path is:

Claude Code / Codex JSONL logs
  -> Cadence adapters parse supported events
  -> SQLite cache under ~/.cache/cadence
  -> static reports rendered from SQLite

The recommended daily-driver workflow is cadence serve, which starts a loopback HTTP server, runs an initial scan, renders the dashboard, and opens it in a browser. One-click rescan and optional auto-rescan update the dashboard without restarting the process. The server shuts down automatically after 30 minutes of idle time.

The one-shot cadence command (scan then render to a static HTML file) is still available when a persistent server is not wanted.

Both paths write to the same cache. cadence wrapped and other commands use that shared cache too.

The local SQLite cache preserves sessions after upstream log retention removes old JSONL files. Claude Code and Codex auto-delete logs after approximately 30 days, so Cadence retains the cached rows by default and historical data survives those deletions. cadence scan --prune opts back into deletion.

cadence publish is a separate opt-in that submits aggregate leaderboard metrics to cadence.report. It never sends raw prompt, assistant, or tool text. Participation requires explicit consent.

Local session scan

cadence scan and the implicit scan inside cadence are implemented by src/cadence/scan.py. They select one or both supported agents:

  • claude_code: Claude Code JSONL files under ~/.claude/projects by default.
  • codex: Codex rollout JSONL files under ~/.codex/sessions, ~/.codex/archived_sessions, ~/.codex/logs, ~/.cache/codex, or $XDG_DATA_HOME/codex when those roots exist.

--data-dir and [scan].projects_dirs override the default roots for the default profile. [[profiles]] entries are explicit opt-ins for additional Claude/Codex accounts. Cadence does not auto-discover ~/.claude-* directories.

For each source file, Cadence stores the file mtime, line count, parser version, profile, and device in processed_files. A later scan reparses only files whose state changed, files whose adapter parser version changed, or files that need a schema rebuild. --strict turns recoverable parser warnings into hard errors.
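
The reparse decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/cadence/scan.py: FileState, needs_reparse, and the schema_rebuild flag are hypothetical names, and only three of the stored fields are compared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileState:
    """Hypothetical subset of the bookkeeping kept in processed_files."""
    mtime: float
    line_count: int
    parser_version: int


def needs_reparse(cached: Optional[FileState], current: FileState,
                  schema_rebuild: bool = False) -> bool:
    """Return True when a source file should be parsed again."""
    # Never-seen files and schema rebuilds always trigger a parse.
    if cached is None or schema_rebuild:
        return True
    # A bumped adapter parser version invalidates the cached rows.
    if cached.parser_version != current.parser_version:
        return True
    # Otherwise reparse only when the file itself changed on disk.
    return (cached.mtime, cached.line_count) != (current.mtime, current.line_count)
```

Under these assumptions, an unchanged file with the same parser version is skipped, which is what keeps repeat scans cheap.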

Session-kind filters are derived at report time from cached sessions metadata, not stored as a schema column. [classify] config maps originators and cwd globs into interactive, delegated, or agentic; the default Time report shows interactive plus delegated sessions and hides agentic sessions until the toolbar toggle enables them. Existing caches reclassify without a session-log rescan because Codex originator is already stored as sessions.entrypoint and Codex source as sessions.user_type.
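
As an illustration of read-time classification, the sketch below maps an entrypoint and cwd to a session kind. The rule layout, the originator names, and the glob values are all hypothetical; the real [classify] keys are not shown in this document.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: each kind lists originators and cwd globs that select it.
RULES = {
    "agentic": {"originators": ["example_exec"], "cwd_globs": ["/tmp/agents/*"]},
    "delegated": {"originators": ["example_ide"], "cwd_globs": []},
}


def classify(entrypoint, cwd, rules=RULES):
    """Return 'agentic', 'delegated', or the fallback 'interactive'."""
    for kind, rule in rules.items():
        if entrypoint in rule["originators"]:
            return kind
        if cwd and any(fnmatchcase(cwd, glob) for glob in rule["cwd_globs"]):
            return kind
    return "interactive"


def visible(kind, show_agentic=False):
    """Default view: hide agentic sessions until the toggle enables them."""
    return kind != "agentic" or show_agentic
```

Because the inputs are already cached columns, changing the rules only changes what the report shows; nothing has to be rescanned.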

Parsed records are normalized into these main SQLite tables:

  • sessions: one session row per source JSONL session.
  • prompts: user prompt events.
  • assistant_turns: assistant responses, timestamps, model metadata, and token counters when the agent log exposes them.
  • tool_uses: tool calls attached to assistant turns.
  • processed_files: source-file bookkeeping and warning counts.

The schema uses integer row identities for hot relationships. sessions keeps session_key as a unique natural key for idempotent scans and source-file replacement, while prompts, assistant_turns, and tool_uses store integer foreign keys (session_id_fk and, for tools, assistant_turn_id_fk) instead of copying long text session or turn keys into every row. The natural event keys (prompt_key, turn_key, and tool_use_key) remain unique columns so rescans can still de-duplicate deterministically. Session-level session_id and source_path stay on sessions; child tables derive that metadata through joins.
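
A reduced sketch of this key layout, using Python's sqlite3 module, might look like the following. Only session_key, session_id, source_path, prompt_key, and session_id_fk come from the description above; the id and text columns and the helper functions are assumptions for illustration, and the real schema has more tables and columns.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sessions (
    id          INTEGER PRIMARY KEY,      -- integer row identity (assumed name)
    session_key TEXT NOT NULL UNIQUE,     -- natural key for idempotent scans
    session_id  TEXT,
    source_path TEXT
);
CREATE TABLE prompts (
    id            INTEGER PRIMARY KEY,
    prompt_key    TEXT NOT NULL UNIQUE,   -- natural key for de-duplication
    session_id_fk INTEGER NOT NULL REFERENCES sessions(id),
    text          TEXT                    -- assumed column name
);
"""


def upsert_session(conn, session_key, source_path):
    """Insert the session once and return its integer id."""
    conn.execute(
        "INSERT OR IGNORE INTO sessions (session_key, source_path) VALUES (?, ?)",
        (session_key, source_path),
    )
    row = conn.execute(
        "SELECT id FROM sessions WHERE session_key = ?", (session_key,)
    ).fetchone()
    return row[0]


def insert_prompt(conn, prompt_key, session_pk, text):
    """Rescans hit the UNIQUE prompt_key and are silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO prompts (prompt_key, session_id_fk, text) "
        "VALUES (?, ?, ?)",
        (prompt_key, session_pk, text),
    )
```

Child rows carry only the integer foreign key, so session metadata such as source_path is recovered with a join on sessions.id rather than being copied into every prompt row.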

The local cache currently stores normalized prompt text, assistant text, and tool input JSON in SQLite. Cadence does not send those fields during normal local scanning or report rendering. Use cadence export when you need a sanitized cache copy; it strips raw prompt, reply, and tool input text while preserving structured metrics.

When an in-scope source JSONL file disappears, Cadence retains the cached rows by default so the cache can preserve history after upstream log retention removes old files. cadence scan --prune or [scan].prune_deleted_logs = true opts back into deleting cached rows whose source files are gone.

scan-git: local git and authored data

cadence scan-git is optional. It is implemented by src/cadence/core/scan_git.py and reads local git history for projects that already appear in the session cache, or for absolute paths passed with --project.

For session-level output metrics, Cadence:

  1. Resolves each project to a git scan root.
  2. Applies [scan].git_exclude_repos to local repo labels, paths, and bare repo names.
  3. Runs git log --branches --numstat --author-date-order --no-renames.
  4. Filters commits to the local git user.name / user.email, plus [scan].git_author_names and [scan].git_author_emails.
  5. Excludes generated artifacts when [scan].exclude_generated_artifacts = true, which is the default.
  6. Builds padded session windows around cached sessions.
  7. Attributes a commit to the latest overlapping session window.
  8. Writes per-session LOC rows to session_loc.
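
Steps 6 and 7 can be sketched as below. The 30-minute padding is an assumed value, and "latest" is read here as the window with the latest start time; neither detail is confirmed by this document, and attribute_commit is a hypothetical helper name.

```python
from datetime import datetime, timedelta

PADDING = timedelta(minutes=30)  # assumed padding; the real value is not documented here


def attribute_commit(commit_time, sessions, padding=PADDING):
    """Pick the session whose padded window contains the commit time.

    sessions is a list of (session_id, start, end) tuples. When several
    padded windows overlap the commit, the one that started latest wins.
    Returns None when no window matches.
    """
    best = None
    for session_id, start, end in sessions:
        if start - padding <= commit_time <= end + padding:
            if best is None or start > best[1]:
                best = (session_id, start)
    return best[0] if best else None
```

Commits that fall outside every padded window are left unattributed, which is why session_loc covers less history than the Authored layer.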

When git_exclude_repos changes, the next non-dry scan-git also removes stale local session_loc and git-sourced project_prs rows for excluded repos.

scan-git also detects merged PRs from local git subjects:

  • Merge commits matching Merge pull request #<number>.
  • Squash commits ending with (#<number>).

Those rows are stored in project_prs with source = 'git' and state = 'merged'.
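
A minimal sketch of the two subject patterns, assuming plain regular expressions are enough; the function name and the exact expressions are illustrative rather than taken from scan_git.py.

```python
import re

# "Merge pull request #123 from owner/branch"
MERGE_RE = re.compile(r"^Merge pull request #(\d+)")
# "Some squash title (#123)" with the PR reference at the very end
SQUASH_RE = re.compile(r"\(#(\d+)\)\s*$")


def merged_pr_number(subject):
    """Return the PR number encoded in a commit subject, or None."""
    match = MERGE_RE.match(subject) or SQUASH_RE.search(subject)
    return int(match.group(1)) if match else None
```

A reference such as (#17) in the middle of a subject is ignored, because only a trailing reference is treated as a squash merge.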

On an unwindowed, non-dry run, scan-git also computes the Authored layer. This is independent of session logs: Cadence scans full local repo history with a non-merge git log, filters to the configured author identities, tracks language LOC, and replaces rows in:

  • authored_commits
  • authored_commit_langs

The dashboard's Authored view aggregates those tables by day, repo, and language. This is why Authored can span full repo history while session_loc only attributes commits that overlap known Cadence sessions.

cadence scan-git --github, or [github].scan_repos = true, extends only the Authored layer. It uses the authenticated gh CLI to enumerate owned, non-fork GitHub repositories, skips locally present or excluded repos, clones missing repos one at a time into a temporary directory, scans their git history, stores them under stable owner/repo identities, and deletes each clone before moving to the next. Cadence does not persist GitHub tokens.

serve: local dashboard server

cadence serve starts a stdlib ThreadingHTTPServer bound to 127.0.0.1 only. It renders the dashboard from the local cache, injects a per-session CSRF token into the served HTML, and accepts mutating requests only from the same Origin with that token.
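
The same-origin and token check could look roughly like this. The X-CSRF-Token header name and the is_allowed helper are assumptions for illustration; the document does not say how the token is transmitted.

```python
import hmac
import secrets

# One random token per server run, to be embedded in the served HTML.
CSRF_TOKEN = secrets.token_urlsafe(32)


def is_allowed(headers, expected_origin, expected_token):
    """Accept a mutating request only with a matching Origin and token."""
    origin = headers.get("Origin", "")
    supplied = headers.get("X-CSRF-Token", "")  # hypothetical header name
    # compare_digest avoids leaking the token through timing differences.
    token_ok = hmac.compare_digest(supplied.encode(), expected_token.encode())
    return origin == expected_origin and token_ok
```

Requiring both checks means a page on another origin cannot trigger a rescan or config write, even though the server is reachable on loopback.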

Serve mode exposes:

  • POST /api/rescan: runs the same local session-log scan used by the normal dashboard flow, then regenerates report.html.
  • POST /api/config: writes live Settings changes to ~/.config/cadence/config.toml, including [serve] auto-rescan settings and [scan].git_exclude_repos.
  • GET /api/heartbeat: keeps the idle watchdog alive and returns the current report file version so an open dashboard can reload after an auto-rescan.

Auto-rescan is serve-only. [serve].auto_rescan_enabled = true and [serve].auto_rescan_interval_minutes schedule rescans from the server's own serve loop while the process is running. Cadence does not install cron, launchd, or any other background scheduler, and nothing ticks after the server stops.

scan-github: PR metadata

cadence scan-github is a separate opt-in scanner implemented by src/cadence/core/scan_github.py. The top-level cadence command does not run it automatically.

The scanner is disabled by default. cadence scan-github --enable writes [github].enabled = true to the config and runs the first scan. Subsequent cadence scan-github runs require:

  • gh available on PATH.
  • gh auth status to succeed.

Cadence shells out to the already-authenticated gh CLI. It does not read or store GitHub tokens. The scan:

  1. Resolves configured accounts, including @me.
  2. Runs gh search prs --author=<login> across merged, open, closed-unmerged, and draft PRs.
  3. Applies [github].include_orgs and [github].exclude_repos.
  4. Enriches selected PRs with gh api repos/<owner>/<repo>/pulls/<number> for additions, deletions, merged time, merge commit SHA, and commit count.
  5. Writes rows into project_prs with source = 'github' and a synthetic cwd of github://owner/repo.
  6. Updates the single-row github_scan_meta table.

Conflict handling is conservative: local git rows are trusted and survive a GitHub scan unless --force is used. GitHub rows fill gaps for repos that were not cloned locally or PR states that local git cannot see. Interrupted scans use ~/.cache/cadence/github-scan.resume and remove it on clean exit.
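
The precedence rule can be pictured with an in-memory merge. This is a sketch only: keying rows by (repo, PR number) is an assumption, and the real scanner works against the project_prs table rather than dictionaries.

```python
def merge_pr_rows(existing, github_rows, force=False):
    """Combine cached PR rows with rows from a GitHub scan.

    Both arguments map a (repo, number) key to a row dict with a 'source'
    field. Rows sourced from local git win unless force is set.
    """
    merged = dict(existing)
    for key, row in github_rows.items():
        current = merged.get(key)
        if current is not None and current["source"] == "git" and not force:
            continue  # trust the local git row
        merged[key] = row  # fill a gap or refresh an older GitHub row
    return merged
```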

Cache, config, and stores

Cadence runtime state is derived from src/cadence/paths.py.

Default cache paths:

  • ~/.cache/cadence/usage.db: SQLite cache.
  • ~/.cache/cadence/report.html: Time view report.
  • ~/.cache/cadence/wrapped.html: Wrapped report.
  • ~/.cache/cadence/wrapped.png: default Wrapped image output.
  • ~/.cache/cadence/assets: generated static assets.
  • ~/.cache/cadence/parse-errors.log: rolling parse warning log.
  • ~/.cache/cadence/scan-timing.log: per-file scan timing log.
  • ~/.cache/cadence/github-scan.resume: interrupt-resume marker for scan-github.

Default config paths:

  • ~/.config/cadence/config.toml: user-editable config.
  • ~/.config/cadence/.onboarded: one-time first-run onboarding marker.
  • ~/.config/cadence/.github-onboarded: one-time local onboarding marker for whole-account Authored scans.

CADENCE_HOME changes the base home for both config and cache. When it is set, Cadence reads and writes:

  • $CADENCE_HOME/.config/cadence/config.toml
  • $CADENCE_HOME/.cache/cadence/usage.db

This makes multiple independent Cadence stores possible: choose a different CADENCE_HOME for each store. CADENCE_CONFIG can also point at a specific config file when no explicit home is passed.
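
A rough sketch of that resolution order follows. It is not the code in src/cadence/paths.py: resolve_paths is a hypothetical helper, and letting CADENCE_HOME take precedence over CADENCE_CONFIG is one reading of the sentence above.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Return the config and database paths for the given environment."""
    env = os.environ if env is None else env
    explicit_home = env.get("CADENCE_HOME")
    home = Path(explicit_home) if explicit_home else Path.home()

    config = home / ".config" / "cadence" / "config.toml"
    # Assumption: CADENCE_CONFIG only applies when no explicit home is set.
    if not explicit_home and env.get("CADENCE_CONFIG"):
        config = Path(env["CADENCE_CONFIG"])

    return {"config": config, "db": home / ".cache" / "cadence" / "usage.db"}
```

Pointing CADENCE_HOME at two different directories yields two stores that never share a config file or a usage.db.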

The cache records its schema version in meta.schema_version. Additive schema changes can migrate in place, but schema v2 is intentionally a destructive cache rebuild: on version mismatch Cadence recreates the SQLite schema and the normal scan repopulates it from local JSONL sources. The rebuild happens inside a transaction so a schema-creation failure rolls back instead of leaving a half-created store.
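
The transactional rebuild can be sketched with sqlite3 as below. The key/value layout of the meta table, the single sessions table, and the ensure_schema helper are assumptions made to keep the example small.

```python
import sqlite3

SCHEMA_VERSION = 2


def ensure_schema(conn, ddl_statements):
    """Rebuild the schema on version mismatch; return True if rebuilt.

    conn must be opened with isolation_level=None so that BEGIN, COMMIT,
    and ROLLBACK are issued explicitly and DDL stays inside the transaction.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'schema_version'"
    ).fetchone()
    if row is not None and int(row[0]) == SCHEMA_VERSION:
        return False

    conn.execute("BEGIN")
    try:
        conn.execute("DROP TABLE IF EXISTS sessions")  # destructive rebuild
        for statement in ddl_statements:
            conn.execute(statement)
        conn.execute(
            "INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', ?)",
            (str(SCHEMA_VERSION),),
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # leave the previous store untouched
        raise
    return True
```

Because SQLite DDL is transactional, a failure partway through restores the old tables instead of leaving a half-created store; the next scan can then retry the rebuild.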

The config model includes:

  • [scan]: source roots, generated-artifact filtering, pruning, extra git author identities, and local scan-git repo excludes.
  • [display]: default report range, weekend handling, and theme.
  • [plan]: plan pricing for ROI cards.
  • [performance]: report precompute options.
  • [classify]: read-time session-kind classification from session entrypoint and cwd.
  • [[profiles]]: explicit multi-account scan entries.
  • [github]: opt-in GitHub PR and whole-account Authored settings.
  • [leaderboard]: explicit opt-in aggregate publishing settings.
  • [serve]: loopback dashboard port and serve-only auto-rescan settings.

Network boundary

The local session scan, local scan-git, and offline report rendering do not need network access. Networked behavior is explicit:

  • cadence scan-github --enable and cadence scan-github use gh.
  • cadence scan-git --github uses gh and git clone for missing repos in the Authored layer.
  • cadence publish submits aggregate leaderboard metrics only after opt-in.

The generated Time and Wrapped HTML reports are designed to be static local artifacts with bundled assets.