Architecture

Memory system design, plugin structure, and changes from the upstream Compound Engineering plugin.

Memory System

Knowledge is stored in two formats:

  • SQLite FTS5 (knowledge.db) — Primary search backend with full-text search and BM25 ranking
  • JSONL (knowledge.jsonl) — Portable export format, grep-compatible fallback

Both are written to simultaneously. If sqlite3 is unavailable, only JSONL is written and grep-based search is used automatically.
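A minimal sketch of this dual-write-with-fallback scheme (the paths mirror the layout described here, but the schema and logic are illustrative, not the plugin's actual capture script):

```python
# Illustrative dual-write: JSONL always, SQLite FTS5 when available.
import json
import sqlite3
import time
from pathlib import Path

MEM = Path(".lavra/memory")
MEM.mkdir(parents=True, exist_ok=True)

entry = {
    "key": "learned-oauth-redirect-must-match-exactly",
    "type": "learned",
    "content": "OAuth redirect URI must match exactly",
    "tags": ["oauth", "auth"],
    "ts": int(time.time()),
}

# 1. Always append to the portable, grep-compatible JSONL log.
with open(MEM / "knowledge.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(entry) + "\n")

# 2. Best-effort mirror into the FTS5 index.
try:
    db = sqlite3.connect(MEM / "knowledge.db")
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS knowledge "
        "USING fts5(key, content, tags, tokenize='porter')"
    )
    db.execute(
        "INSERT INTO knowledge (key, content, tags) VALUES (?, ?, ?)",
        (entry["key"], entry["content"], " ".join(entry["tags"])),
    )
    db.commit()
    db.close()
except sqlite3.OperationalError:
    pass  # no FTS5: grep over knowledge.jsonl serves as the search backend
```

Because the JSONL append happens unconditionally, the fallback path needs no migration step: the grep backend and the FTS5 backend always see the same entries.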

Each knowledge entry is a single JSON object:

{
  "key": "learned-oauth-redirect-must-match-exactly",
  "type": "learned",
  "content": "OAuth redirect URI must match exactly",
  "source": "user",
  "tags": ["oauth", "auth", "security"],
  "ts": 1706918400,
  "bead": "BD-001"
}
  • FTS5 Search: Uses porter stemming and BM25 ranking — “webhook authentication” finds entries about HMAC signature verification even when those exact words don’t appear together
  • Auto-tagging: Keywords detected and added as tags
  • Git-tracked: Knowledge files can be committed to git for team sharing and portability
  • Conflict-free collaboration: Multiple users can capture knowledge simultaneously without merge conflicts
  • Auto-sync: First session after git pull automatically imports new knowledge into local search index
  • Rotation: After 5000 entries, oldest 2500 archived (JSONL only)
  • Search: .lavra/memory/recall.sh "keyword" or automatic at session start
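The stemming-plus-ranking behavior can be sketched with an in-memory FTS5 table (the table name and rows are invented for illustration):

```python
# Porter stemming + BM25: "authentication" matches "authenticated"
# because both stem to the same token.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE kb USING fts5(content, tokenize='porter')")
db.executemany("INSERT INTO kb (content) VALUES (?)", [
    ("Webhooks are authenticated by verifying the HMAC signature",),
    ("Use exponential backoff when retrying webhook deliveries",),
    ("OAuth redirect URI must match exactly",),
])

# bm25() returns lower scores for better matches, so ascending order
# puts the most relevant entry first.
rows = db.execute(
    "SELECT content FROM kb WHERE kb MATCH ? ORDER BY bm25(kb)",
    ("webhook AND authentication",),
).fetchall()
print(rows[0][0])  # the HMAC row, even though "authentication" never appears in it
```

A plain `grep -i authentication` over the same three entries would miss the HMAC row entirely, which is the gap FTS5 closes.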

Project Artifacts

Beyond knowledge.jsonl, lavra manages several project-scoped artifacts:

.lavra/config/lavra.json (committed)

Workflow configuration — toggle research, review, goal verification, parallelism:

{
  "workflow": {
    "research": true,
    "plan_review": true,
    "goal_verification": true
  },
  "execution": {
    "max_parallel_agents": 3,
    "commit_granularity": "task"
  },
  "model_profile": "balanced"
}

Created by provision-memory.sh on install. Existing projects receive it automatically on next session start via the version self-heal in auto-recall.sh. Read by /lavra-design (skip phases), /lavra-work (parallelism, commits), /lavra-review, /lavra-eng-review, and /lavra-ship.

model_profile: "quality" routes critical agents (security-sentinel, architecture-strategist, goal-verifier, performance-oracle) to opus. Default "balanced" keeps agents at their configured tier.
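A hedged sketch of what that routing rule could look like in code (the agent names come from the paragraph above; the function itself is illustrative, not the plugin's implementation):

```python
# Illustrative model_profile routing, using the lavra.json shown above.
import json

config = json.loads("""
{
  "workflow": {"research": true, "plan_review": true, "goal_verification": true},
  "execution": {"max_parallel_agents": 3, "commit_granularity": "task"},
  "model_profile": "balanced"
}
""")

CRITICAL_AGENTS = {
    "security-sentinel", "architecture-strategist",
    "goal-verifier", "performance-oracle",
}

def model_for(agent: str, configured_tier: str) -> str:
    # "quality" promotes critical agents to opus; "balanced" (the default)
    # keeps every agent at its configured tier.
    if config["model_profile"] == "quality" and agent in CRITICAL_AGENTS:
        return "opus"
    return configured_tier

print(model_for("security-sentinel", "sonnet"))  # balanced profile: "sonnet"
```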

.lavra/config/codebase-profile.md (committed)

Optional brownfield codebase analysis generated by Step 1.5 of /project-setup (opt-in; the user runs it manually). Three sections (Stack & Integrations, Architecture & Structure, Conventions & Testing), up to 200 lines. Read by /lavra-design and /lavra-work with injection safety (XML wrapping, sanitization, size cap). Not auto-generated; existing projects should run /project-setup after upgrading to get it.
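A hedged sketch of what that injection safety can look like (the tag name, escaping, and size cap here are assumptions, not the plugin's exact values):

```python
# Illustrative injection safety for a file injected into prompt context:
# escape markup, enforce a size cap, then wrap in a known tag.
MAX_CHARS = 20_000  # assumed cap, not the plugin's actual number

def wrap_profile(text: str) -> str:
    # Escaping < and > prevents the profile's content from closing the
    # wrapper tag early or smuggling in other directives.
    sanitized = text.replace("<", "&lt;").replace(">", "&gt;")[:MAX_CHARS]
    return f"<codebase-profile>\n{sanitized}\n</codebase-profile>"

wrapped = wrap_profile("## Stack & Integrations\n- TypeScript + Postgres")
```

The wrapper tag lets the consuming command treat everything inside as untrusted document content rather than instructions.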

.lavra/memory/session-state.md (gitignored, ephemeral)

Position awareness across context compaction. Written by /lavra-work, /lavra-design, and /lavra-checkpoint at milestones. Contains current bead, phase, task count, last completed task, and next steps. Recalled once by auto-recall.sh at session start, then deleted. Stale files (>24h) auto-cleaned.
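The 24-hour staleness rule can be sketched as follows (the path matches the layout above; the check itself is illustrative):

```python
# Illustrative staleness check for the ephemeral session-state file.
import time
from pathlib import Path

STATE = Path(".lavra/memory/session-state.md")
MAX_AGE = 24 * 60 * 60  # 24 hours, in seconds

def is_stale(path: Path, now: float) -> bool:
    # A file older than MAX_AGE is considered leftover from an
    # abandoned session and should be cleaned rather than recalled.
    return path.exists() and now - path.stat().st_mtime > MAX_AGE

STATE.parent.mkdir(parents=True, exist_ok=True)
STATE.write_text("bead: BD-001\nphase: execute\nnext: task 3 of 7\n")

print(is_stale(STATE, time.time()))                # just written: False
print(is_stale(STATE, time.time() + 2 * MAX_AGE))  # simulated 48h later: True
```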

Plugin Structure

lavra/                              # Marketplace root
├── .claude-plugin/
│   └── marketplace.json
├── plugins/
│   └── lavra/                      # Plugin root
│       ├── .claude-plugin/
│       │   └── plugin.json
│       ├── agents/
│       │   ├── review/             # 16 review agents
│       │   ├── research/           # 5 research agents
│       │   ├── design/             # 3 design agents
│       │   ├── workflow/           # 5 workflow agents
│       │   └── docs/               # 1 docs agent
│       ├── commands/               # 22 core commands + optional/
│       ├── skills/                 # 15 skills
│       ├── hooks/                  # 4 hooks + shared library + hooks.json
│       ├── scripts/
│       └── .mcp.json
├── install.sh
├── uninstall.sh
├── CLAUDE.md
└── README.md

Changes from Compound Engineering

This plugin is a fork of compound-engineering-plugin (MIT license) with the following changes:

Memory System

  • Replaced markdown-based knowledge storage with beads-based persistent memory (.lavra/memory/knowledge.jsonl)
  • SQLite FTS5 full-text search with BM25 ranking for knowledge recall, improving precision by 18%, recall by 17%, and MRR by 24% over grep-based search across 25 benchmark queries
  • Automatic knowledge capture from bd comments add with typed prefixes (LEARNED/DECISION/FACT/PATTERN/INVESTIGATION/DEVIATION), dual-writing to SQLite (for fast searching with fuzzy matching) and JSONL (for committing to git)
  • Automatic knowledge recall at session start based on open beads and git branch context
  • Subagent knowledge enforcement via SubagentStop hook
  • All workflows create and update beads instead of markdown files
  • Automatic one-time backfill from existing JSONL and beads.db comments on first FTS5 run
  • The first session in a fresh clone of a lavra-enabled repo rebuilds the FTS5 index from the JSONL in git; everything self-heals without manual steps
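A sketch of that rebuild, using inline JSONL lines in place of a real file and an in-memory database in place of knowledge.db (schema and keys are illustrative):

```python
# Illustrative self-heal: re-index committed JSONL into a fresh FTS5 table.
import json
import sqlite3

jsonl_lines = [
    '{"key": "learned-oauth", "content": "OAuth redirect URI must match exactly"}',
    '{"key": "pattern-retry", "content": "Retry webhook deliveries with backoff"}',
]

db = sqlite3.connect(":memory:")  # knowledge.db in the real layout
db.execute(
    "CREATE VIRTUAL TABLE knowledge USING fts5(key, content, tokenize='porter')"
)
for line in jsonl_lines:
    e = json.loads(line)
    db.execute(
        "INSERT INTO knowledge (key, content) VALUES (?, ?)",
        (e["key"], e["content"]),
    )
db.commit()

count = db.execute("SELECT count(*) FROM knowledge").fetchone()[0]
print(count)  # one indexed row per JSONL line
```

Because the JSONL in git is the source of truth, the local index is disposable: deleting knowledge.db costs nothing but one rebuild.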

Performance Optimizations

  • Context budget optimization (94% reduction): Plugin now uses only 8,227 chars of Claude Code’s 16,000 char description budget. This prevents components from being silently excluded from Claude’s context.
    • Trimmed all 28 agent descriptions to under 250 chars, moving verbose examples into agent bodies wrapped in <examples> tags
    • Added disable-model-invocation: true to 17 manual utility commands (they remain available when explicitly invoked via /command-name but don’t clutter Claude’s auto-suggestion context)
    • Added disable-model-invocation: true to 7 manual utility skills (lavra-knowledge, create-agent-skills, file-todos, git-worktree, rclone, gemini-imagegen)
    • Core beads workflow commands (/lavra-brainstorm, /lavra-plan, /lavra-work, /lavra-work-ralph, /lavra-work-teams, /lavra-review, /lavra-compound, /lavra-research, /lavra-eng-review) remain fully auto-discoverable
  • Model tier assignments: Each agent specifies a model: field (haiku/sonnet/opus) based on reasoning complexity, reducing costs 60-70% compared to running all agents on the default model. High-frequency agents like learnings-researcher run on Haiku; deep reasoning agents like architecture-strategist run on Opus.

Structural Changes

  • Rewrote learnings-researcher to search knowledge.jsonl instead of markdown docs
  • Adapted code-simplicity-reviewer to protect .lavra/memory/ files
  • Renamed compound-docs skill to lavra-knowledge
  • Added lavra- prefix to all commands to avoid conflicts