Architecture

Memory system design, plugin structure, and changes from the upstream Compound Engineering plugin.

Memory System

Knowledge is stored in two formats:

  • SQLite FTS5 (knowledge.db) — Primary search backend with full-text search and BM25 ranking
  • JSONL (knowledge.jsonl) — Portable export format, grep-compatible fallback

Both are written to simultaneously. If sqlite3 is unavailable, only JSONL is written and grep-based search is used automatically.
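A minimal sketch of this dual-write-with-fallback scheme (the paths mirror the layout described here, but the schema and logic are illustrative, not the plugin's actual capture script):

```python
# Illustrative dual-write: JSONL always, SQLite FTS5 when available.
import json
import sqlite3
import time
from pathlib import Path

MEM = Path(".lavra/memory")
MEM.mkdir(parents=True, exist_ok=True)

entry = {
    "key": "learned-oauth-redirect-must-match-exactly",
    "type": "learned",
    "content": "OAuth redirect URI must match exactly",
    "tags": ["oauth", "auth"],
    "ts": int(time.time()),
}

# 1. Always append to the portable, grep-compatible JSONL log.
with open(MEM / "knowledge.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(entry) + "\n")

# 2. Best-effort mirror into the FTS5 index.
try:
    db = sqlite3.connect(MEM / "knowledge.db")
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS knowledge "
        "USING fts5(key, content, tags, tokenize='porter')"
    )
    db.execute(
        "INSERT INTO knowledge (key, content, tags) VALUES (?, ?, ?)",
        (entry["key"], entry["content"], " ".join(entry["tags"])),
    )
    db.commit()
    db.close()
except sqlite3.OperationalError:
    pass  # no FTS5: grep over knowledge.jsonl serves as the search backend
```

Because the JSONL append happens unconditionally, the fallback path needs no migration step: the grep backend and the FTS5 backend always see the same entries.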

Each knowledge entry is a single JSON object:

{
  "key": "learned-oauth-redirect-must-match-exactly",
  "type": "learned",
  "content": "OAuth redirect URI must match exactly",
  "source": "user",
  "tags": ["oauth", "auth", "security"],
  "ts": 1706918400,
  "bead": "BD-001"
}
  • FTS5 Search: Uses porter stemming and BM25 ranking — “webhook authentication” finds entries about HMAC signature verification even when those exact words don’t appear together
  • Auto-tagging: Keywords detected and added as tags
  • Git-tracked: Knowledge files can be committed to git for team sharing and portability
  • Conflict-free collaboration: Multiple users can capture knowledge simultaneously without merge conflicts
  • Auto-sync: First session after git pull automatically imports new knowledge into local search index
  • Rotation: After 5000 entries, oldest 2500 archived (JSONL only)
  • Search: .lavra/memory/recall.sh "keyword" or automatic at session start
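The stemming-plus-ranking behavior can be sketched with an in-memory FTS5 table (the table name and rows are invented for illustration):

```python
# Porter stemming + BM25: "authentication" matches "authenticated"
# because both stem to the same token.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE kb USING fts5(content, tokenize='porter')")
db.executemany("INSERT INTO kb (content) VALUES (?)", [
    ("Webhooks are authenticated by verifying the HMAC signature",),
    ("Use exponential backoff when retrying webhook deliveries",),
    ("OAuth redirect URI must match exactly",),
])

# bm25() returns lower scores for better matches, so ascending order
# puts the most relevant entry first.
rows = db.execute(
    "SELECT content FROM kb WHERE kb MATCH ? ORDER BY bm25(kb)",
    ("webhook AND authentication",),
).fetchall()
print(rows[0][0])  # the HMAC row, even though "authentication" never appears in it
```

A plain `grep -i authentication` over the same three entries would miss the HMAC row entirely, which is the gap FTS5 closes.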

Project Artifacts

Beyond knowledge.jsonl, lavra manages several project-scoped artifacts:

.lavra/config/lavra.json (committed)

Workflow configuration — toggle research, review, goal verification, parallelism:

{
  "workflow": {
    "research": true,
    "plan_review": true,
    "goal_verification": true
  },
  "execution": {
    "max_parallel_agents": 3,
    "commit_granularity": "task"
  },
  "model_profile": "balanced"
}

Created by provision-memory.sh on install. Existing projects receive it automatically on next session start via the version self-heal in auto-recall.sh. Read by /lavra-design (skip phases), /lavra-work (parallelism, commits), /lavra-review, /lavra-eng-review, and /lavra-ship.

model_profile: "quality" routes critical agents (security-sentinel, architecture-strategist, goal-verifier, performance-oracle) to opus. Default "balanced" keeps agents at their configured tier.
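A hedged sketch of what that routing rule could look like in code (the agent names come from the paragraph above; the function itself is illustrative, not the plugin's implementation):

```python
# Illustrative model_profile routing, using the lavra.json shown above.
import json

config = json.loads("""
{
  "workflow": {"research": true, "plan_review": true, "goal_verification": true},
  "execution": {"max_parallel_agents": 3, "commit_granularity": "task"},
  "model_profile": "balanced"
}
""")

CRITICAL_AGENTS = {
    "security-sentinel", "architecture-strategist",
    "goal-verifier", "performance-oracle",
}

def model_for(agent: str, configured_tier: str) -> str:
    # "quality" promotes critical agents to opus; "balanced" (the default)
    # keeps every agent at its configured tier.
    if config["model_profile"] == "quality" and agent in CRITICAL_AGENTS:
        return "opus"
    return configured_tier

print(model_for("security-sentinel", "sonnet"))  # balanced profile: "sonnet"
```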

.lavra/config/codebase-profile.md (committed)

Optional brownfield codebase analysis generated by Step 1.5 of /project-setup (opt-in; the user runs it manually). Three sections (Stack & Integrations, Architecture & Structure, Conventions & Testing), up to 200 lines. Read by /lavra-design and /lavra-work with injection safety (XML wrapping, sanitization, size cap). Not auto-generated; existing projects should run /project-setup after upgrading to get it.
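A hedged sketch of what that injection safety can look like (the tag name, escaping, and size cap here are assumptions, not the plugin's exact values):

```python
# Illustrative injection safety for a file injected into prompt context:
# escape markup, enforce a size cap, then wrap in a known tag.
MAX_CHARS = 20_000  # assumed cap, not the plugin's actual number

def wrap_profile(text: str) -> str:
    # Escaping < and > prevents the profile's content from closing the
    # wrapper tag early or smuggling in other directives.
    sanitized = text.replace("<", "&lt;").replace(">", "&gt;")[:MAX_CHARS]
    return f"<codebase-profile>\n{sanitized}\n</codebase-profile>"

wrapped = wrap_profile("## Stack & Integrations\n- TypeScript + Postgres")
```

The wrapper tag lets the consuming command treat everything inside as untrusted document content rather than instructions.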

.lavra/memory/session-state.md (gitignored, ephemeral)

Position awareness across context compaction. Written by /lavra-work, /lavra-design, and /lavra-checkpoint at milestones. Contains current bead, phase, task count, last completed task, and next steps. Recalled once by auto-recall.sh at session start, then deleted. Stale files (>24h) auto-cleaned.
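The 24-hour staleness rule can be sketched as follows (the path matches the layout above; the check itself is illustrative):

```python
# Illustrative staleness check for the ephemeral session-state file.
import time
from pathlib import Path

STATE = Path(".lavra/memory/session-state.md")
MAX_AGE = 24 * 60 * 60  # 24 hours, in seconds

def is_stale(path: Path, now: float) -> bool:
    # A file older than MAX_AGE is considered leftover from an
    # abandoned session and should be cleaned rather than recalled.
    return path.exists() and now - path.stat().st_mtime > MAX_AGE

STATE.parent.mkdir(parents=True, exist_ok=True)
STATE.write_text("bead: BD-001\nphase: execute\nnext: task 3 of 7\n")

print(is_stale(STATE, time.time()))                # just written: False
print(is_stale(STATE, time.time() + 2 * MAX_AGE))  # simulated 48h later: True
```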

Plugin Structure

lavra/                              # Marketplace root
├── .claude-plugin/
│   └── marketplace.json
├── plugins/
│   └── lavra/                      # Plugin root
│       ├── .claude-plugin/
│       │   └── plugin.json
│       ├── agents/
│       │   ├── review/             # 16 review agents
│       │   ├── research/           # 5 research agents
│       │   ├── design/             # 3 design agents
│       │   ├── workflow/           # 5 workflow agents
│       │   └── docs/               # 1 docs agent
│       ├── commands/               # 22 core commands + optional/
│       ├── skills/                 # 15 skills
│       ├── hooks/                  # 4 hooks + shared library + hooks.json
│       ├── scripts/
│       └── .mcp.json
├── install.sh
├── uninstall.sh
├── CLAUDE.md
└── README.md

Changes from Compound Engineering

This plugin is a fork of compound-engineering-plugin (MIT license) with the following changes:

Memory System

  • Replaced markdown-based knowledge storage with beads-based persistent memory (.lavra/memory/knowledge.jsonl)
  • SQLite FTS5 full-text search with BM25 ranking for knowledge recall, improving precision by 18%, recall by 17%, and MRR by 24% over grep-based search across 25 benchmark queries
  • Automatic knowledge capture from bd comments add with typed prefixes (LEARNED/DECISION/FACT/PATTERN/INVESTIGATION/DEVIATION), dual-writing to SQLite (for fast searching with fuzzy matching) and JSONL (for committing to git)
  • Automatic knowledge recall at session start based on open beads and git branch context
  • Subagent knowledge enforcement via SubagentStop hook
  • All workflows create and update beads instead of markdown files
  • Automatic one-time backfill from existing JSONL and beads.db comments on first FTS5 run
  • The first session in a fresh clone of a lavra-enabled repo rebuilds the FTS5 index from the JSONL in git; everything self-heals without manual steps
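A sketch of that rebuild, using inline JSONL lines in place of a real file and an in-memory database in place of knowledge.db (schema and keys are illustrative):

```python
# Illustrative self-heal: re-index committed JSONL into a fresh FTS5 table.
import json
import sqlite3

jsonl_lines = [
    '{"key": "learned-oauth", "content": "OAuth redirect URI must match exactly"}',
    '{"key": "pattern-retry", "content": "Retry webhook deliveries with backoff"}',
]

db = sqlite3.connect(":memory:")  # knowledge.db in the real layout
db.execute(
    "CREATE VIRTUAL TABLE knowledge USING fts5(key, content, tokenize='porter')"
)
for line in jsonl_lines:
    e = json.loads(line)
    db.execute(
        "INSERT INTO knowledge (key, content) VALUES (?, ?)",
        (e["key"], e["content"]),
    )
db.commit()

count = db.execute("SELECT count(*) FROM knowledge").fetchone()[0]
print(count)  # one indexed row per JSONL line
```

Because the JSONL in git is the source of truth, the local index is disposable: deleting knowledge.db costs nothing but one rebuild.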

Performance Optimizations

  • Context budget optimization (94% reduction): Plugin now uses only 8,227 chars of Claude Code’s 16,000 char description budget. This prevents components from being silently excluded from Claude’s context.
    • Trimmed all 28 agent descriptions to under 250 chars, moving verbose examples into agent bodies wrapped in <examples> tags
    • Added disable-model-invocation: true to 17 manual utility commands (they remain available when explicitly invoked via /command-name but don’t clutter Claude’s auto-suggestion context)
    • Added disable-model-invocation: true to 7 manual utility skills (lavra-knowledge, create-agent-skills, file-todos, git-worktree, rclone, gemini-imagegen)
    • Core beads workflow commands (/lavra-brainstorm, /lavra-plan, /lavra-work, /lavra-work-ralph, /lavra-work-teams, /lavra-review, /lavra-compound, /lavra-research, /lavra-eng-review) remain fully auto-discoverable
  • Model tier assignments: Each agent specifies a model: field (haiku/sonnet/opus) based on reasoning complexity, reducing costs 60-70% compared to running all agents on the default model. High-frequency agents like learnings-researcher run on Haiku; deep reasoning agents like architecture-strategist run on Opus.

Structural Changes

  • Rewrote learnings-researcher to search knowledge.jsonl instead of markdown docs
  • Adapted code-simplicity-reviewer to protect .lavra/memory/ files
  • Renamed compound-docs skill to lavra-knowledge
  • Added lavra- prefix to all commands to avoid conflicts