maku85/vemora
GitHub: maku85/vemora
vemora 为大型代码库构建版本化的结构化索引和 RAG 检索层,让 LLM 辅助开发时精准获取所需代码上下文,降低 token 消耗并提升代码发现能力。
Stars: 0 | Forks: 0
# vemora
[](https://www.npmjs.com/package/vemora)
[](https://www.npmjs.com/package/vemora?activeTab=versions)
[](LICENSE)
Repository-local memory system for LLM-assisted development.
Builds a structured, versioned index of your codebase — code chunks, symbols, dependency graph, and LLM-generated summaries — and enables semantic or keyword search over it. The result is a **RAG (Retrieval-Augmented Generation) layer** that lets you give an LLM only the code it actually needs, instead of entire files.
## Why
When working on a large codebase with Claude Code or similar LLM tools, you face two problems:
1. **Context cost** — dropping 50 files into the context wastes tokens on irrelevant code
2. **Discovery** — you don't always know *which* files are relevant to a given task
`vemora` solves both by pre-indexing the repo and making it queryable. It also provides higher-level commands that go beyond retrieval:
- **`vemora plan`** — a pro LLM (planner) decomposes a complex task into concrete steps; a smaller/free LLM (executor) carries out each step against targeted code context. Cuts costs by using expensive models only where they matter.
- **`vemora audit`** — systematic, checklist-driven analysis of your codebase for security vulnerabilities, performance issues, and bugs. Covers every file, produces structured findings with severity levels.
- **`vemora triage`** — zero-LLM static heuristic scan for bugs, security issues, and performance problems. Instant results with no API calls, useful as a first pass before a deeper audit.
- **`vemora dead-code`** — zero-LLM static analysis that detects unused private symbols, exports nobody imports, and files that are never imported. Works entirely from the call graph and dependency graph already in the index.
- **`vemora focus`** — aggregates all structural context about a file or symbol in one shot: implementation, exports, dependency graph, callers, test files, and saved knowledge.
## Architecture in three layers
.vemora/ ← versioned in git, shared across the team
config.json
metadata.json
index/
files.json ← file hashes for incremental indexing
chunks.json ← code chunks (function/class/window slices)
symbols.json ← extracted symbol map
deps.json ← intra-project dependency graph
callgraph.json ← function-level call relationships
todos.json ← TODO/FIXME/HACK/XXX annotations extracted from source
summaries/
file-summaries.json ← LLM-generated 2-3 line description per file
project-summary.json ← LLM-generated ~500 word project overview
knowledge/
entries.json ← human/LLM-authored notes: decisions, gotchas, patterns
~/.vemora-cache// ← local to each developer, NOT in git
embeddings.json ← metadata (model, dimensions, chunk mapping)
embeddings.bin ← binary buffer of vectors (Float32Array)
embeddings.hnsw.json ← serialized HNSW index for ultra-fast search
The index, summaries, and knowledge entries are committed to git so teammates share them. Embeddings are generated locally by each developer from the shared index.
## Installation
# Inside the vemora/ directory
pnpm install
pnpm build
# Link globally (optional)
pnpm link
Or run directly with `node vemora/dist/cli.js` from the project root.
### Installing the alpha version from npm
pnpm install vemora@alpha # local
pnpm install -g vemora@alpha # global
# or with npm:
npm install -g vemora@alpha
## The Core Workflow
### 1. Setup (first time only)
vemora init # create .vemora/ and config.json
vemora index --no-embed # build index without embeddings (fast)
vemora index # or: build index + generate embeddings
vemora summarize # recommended: generate LLM descriptions per file
vemora init-agent # generate instruction files for AI agents
vemora init-agent --hooks # also write Claude Code auto-save hooks
### 1b. Start of each session (CLI mode)
vemora brief --root . # compact primer: project overview + critical knowledge
### 1c. MCP server mode (Claude Desktop / Claude Code)
Install the SDK once, then start the server:
npm install @modelcontextprotocol/sdk
vemora mcp --root .
Add to `.claude/settings.json` (Claude Code) or `claude_desktop_config.json` (Claude Desktop) — see [`vemora mcp`](#vemora-mcp) for full config. In MCP mode, the agent calls `search`, `focus`, `brief`, `remember`, and `deps` as native tools without manual `vemora context` invocations.
### 2. Query during development
# Search for relevant code
vemora query "how does IMAP reconnect work?"
# Full context block ready to paste into any LLM
vemora context --query "email retry logic" > context.md
# Context with a task-type skill preset (pre-configures retrieval + adds focus note)
vemora context --query "auth flow fails silently" --skill debug
vemora context --query "extract retry logic to utility" --skill refactor
vemora context --query "add webhook support" --skill add-feature
# One-shot answer from the configured LLM
vemora ask "why does the sync queue stall?"
# All context about a file or symbol in one call (no LLM needed)
vemora focus src/core/email/services/email.service.ts
vemora focus EmailService.send
# Static scan for bugs/perf/security (no API key required)
vemora triage --type bugs,performance
# Save a finding for future sessions
vemora remember "EmailService.send queues if SMTP is offline — see OutboxRepository"
### 3. Complex tasks with the planner-executor pattern
# Pro LLM plans, small/free LLM executes each step
vemora plan "add rate limiting to the API layer" --confirm --synthesize
# Audit the codebase for issues
vemora audit --type security --root .
vemora audit --since HEAD~1 # only changed files (great for CI)
### 4. Keep the index fresh
vemora index --watch # incremental re-index on file save
vemora index --no-embed # after code changes, update structure only
## Commands
### `vemora init`
Creates the `.vemora/` folder structure and adds `.vemora-cache/` to `.gitignore`.
Options:
--root project root (default: cwd)
### `vemora index`
Scans the repo, parses symbols, builds the dependency graph, extracts TODO/FIXME/HACK/XXX annotations, and generates embeddings. **Incremental** — only re-processes files whose SHA-256 hash has changed.
If LLM-generated summaries exist (from `vemora summarize`), each summary is also embedded as a synthetic file-overview chunk. This allows abstract queries ("how does authentication work?") to match against natural-language descriptions rather than raw code.
Options:
--root project root (default: cwd)
--force re-index all files, ignoring hashes
--no-embed skip embedding generation (index structure only)
-w, --watch watch for changes and re-index automatically
### `vemora query ""`
Searches the index using vector similarity (or keyword fallback). Results use a **three-tier display** that compresses output by relevance rank.
Options:
--root project root (default: cwd)
-k, --top-k number of results (default: 10)
-c, --show-code show full code for all results (overrides tier system)
--keyword force keyword/BM25 search (no API call needed)
--format output format: terminal (default) | json | markdown | terse
--rerank re-score results with a cross-encoder model
--hybrid use hybrid search (vector + BM25)
--alpha hybrid weight for vector search (0-1, default 0.7)
--budget max tokens to include across results
--mmr apply Maximal Marginal Relevance to diversify results
--lambda MMR relevance weight (0=diverse, 1=relevant, default 0.5)
--merge merge adjacent chunks from the same file
--merge-gap max line gap between chunks to still merge (default 3)
--centrality boost results from widely-imported files (dep-graph centrality)
--session enable session tracking: demote already-seen chunks (×0.3) and
compress seen-file chunks to signature only (auto-expires after 30 min idle)
--fresh reset session memory before this query (implies --session)
#### Output formats
| Format | Use case |
|---|---|
| `terminal` | Default coloured output for interactive use |
| `json` | Machine-readable — for piping to scripts |
| `markdown` | Paste-ready Markdown with code blocks |
| `terse` | One line per result — recommended for small/local models |
#### Output tiers (terminal/markdown)
| Rank | Tier | Content shown |
|------|------|--------------|
| 1–3 | high | Full code block (capped at 30 lines) |
| 4–7 | med | Declaration signature only |
| 8+ | low | File path + symbol + score + AI summary |
### `vemora context`
Generates an **optimized LLM context block** combining project overview, a specific file, and relevant code chunks. Designed to be piped to a file or clipboard.
Options:
--root project root (default: cwd)
-q, --query natural-language query to find relevant code
-f, --file include a specific file in full with its dependency graph
-k, --top-k number of search results to include (default: 5)
--keyword use keyword search instead of semantic search
--show-code show full code without line cap
--format output format: markdown (default) | plain | terse
--rerank re-score results with a cross-encoder model
--hybrid use hybrid search (vector + BM25)
--alpha hybrid weight for vector search (0-1, default 0.7)
--budget max tokens to include across retrieved chunks
--mmr apply Maximal Marginal Relevance to diversify results
--lambda MMR relevance weight (0=diverse, 1=relevant, default 0.5)
--merge merge adjacent or overlapping chunks from the same file
--merge-gap max line gap between chunks to still merge (default 3)
--structured emit a structured block (Entry Point / Dependencies / Types / Patterns)
--since restrict search to files changed since this git ref (e.g. HEAD~5, main)
--skill task-type preset — pre-configures retrieval and prepends a focused
instruction block (debug | refactor | add-feature | security | explain | test)
--centrality boost results from widely-imported files (dep-graph centrality)
--session enable session tracking: demote already-seen chunks and compress
seen-file chunks to signature only (auto-expires after 30 min idle)
--fresh reset session memory before this query (implies --session)
At least one of `--query` or `--file` is required.
When `--file` is used, the context block also includes:
- **Recent git commits** that touched the file (last 5, via `git log --follow`)
- **TODO/FIXME/HACK/XXX annotations** present in the file
- **Test files** linked to the file — convention-based and import-based discovery
- **Symbol callers** — for each symbol defined in the file, which other project symbols call it
#### Skills (`--skill`)
A skill is a **task-type preset** that does two things at once:
1. **Pre-configures the retrieval pipeline** — sets task-appropriate defaults for `topK`, `hybrid`, `mmr`, `budget`, and `structured` so you don't need to tune flags manually.
2. **Prepends a focused instruction block** to the output — tells the LLM exactly what kind of reasoning to apply (error paths, blast radius, code style, etc.).
Explicit flags always override skill defaults.
| Skill | Best for | Key retrieval changes |
|---|---|---|
| `debug` | Tracing errors, call paths, exception handling | topK 8, hybrid (BM25 for error strings), high MMR relevance |
| `refactor` | Safe minimal-diff changes | structured output (callers explicit), topK 6 |
| `add-feature` | Adding new functionality following existing patterns | structured + high MMR diversity to surface varied patterns |
| `security` | Security review, injection, auth, path traversal | BM25-heavy (α=0.55), boosts `gotcha` knowledge entries |
| `explain` | Understanding purpose and design decisions | Maximum MMR diversity (λ=0.4), lower budget |
| `test` | Writing or improving tests | hybrid search, surfaces test file patterns |
vemora context --root . --query "auth flow fails silently" --skill debug
vemora context --root . --query "extract retry logic to utility" --skill refactor
vemora context --root . --query "add webhook support" --skill add-feature
#### Session tracking (`--session`)
When `--session` is active, vemora maintains a per-project session file (`~/.vemora-cache//session.json`) that tracks which chunks and files have already been returned. On subsequent queries, rather than hard-filtering seen results:
- **Chunk decay** — already-seen chunk scores are multiplied by 0.3, so they remain candidates if genuinely re-relevant but don't dominate fresh results.
- **Signature compression** — chunks from files already seen in this session have their content replaced with just the declaration signature, saving tokens without losing location context.
Sessions auto-expire after 30 minutes of idle time. Use `--fresh` to reset explicitly.
vemora query "error handling" --session --root . # first query: full results
vemora query "logging setup" --session --root . # seen chunks demoted, seen files compressed
vemora query "error handling" --fresh --root . # reset session, start fresh
### `vemora ask ""`
One-shot Q&A: retrieves relevant context and calls the configured LLM to answer directly.
Options:
--root project root (default: cwd)
-k, --top-k chunks to retrieve (default: 5)
--keyword use keyword search (no embeddings needed)
--hybrid use hybrid vector+BM25 search
--budget max context tokens to send to LLM (default: 6000)
--show-context print the retrieved context before the answer
--terse inject brevity constraint into LLM system prompt (~50-70% fewer output tokens)
vemora ask "how does the IMAP reconnect logic work?" --root .
vemora ask "what does EmailService.send do?" --root . --keyword
### `vemora plan ""`
**Planner-executor pattern**: a capable LLM decomposes the task into a structured plan; a smaller/cheaper LLM executes each step against targeted code context.
The planner works from **file summaries and the symbol list** — not raw code — so its token cost stays low regardless of codebase size. The executor receives only the chunks relevant to its specific step (targeted by file/symbol, not just search).
Options:
--root project root (default: cwd)
-k, --top-k chunks to retrieve per step when falling back to search (default: 5)
--keyword use keyword search (no embeddings required)
--budget max context tokens per step (default: 4000)
--confirm show the plan and ask for confirmation before executing
--synthesize call the planner again after all steps to produce a single final answer
--show-context print retrieved context for each step
--verify after each executor step, have the planner review the output
--apply automatically apply unified diffs produced by write steps (via patch -p1)
--max-retries max re-runs of a step when verification fails (default: 2)
--resume resume a previous session by short ID (first 8 chars) or full UUID
--terse inject brevity constraint into executor analyze and synthesis prompts (~50-70% fewer output tokens)
#### Step action types
| Action | Behaviour |
|---|---|
| `read` | Pull code into context — no LLM call, zero executor tokens |
| `analyze` | Executor answers a question in prose |
| `write` | Executor produces a unified diff ready to apply |
| `test` | Run a shell command; capture stdout/stderr as step result |
#### Key features
- **Parallel execution** — steps without dependencies run concurrently; sequential steps stream tokens to stdout in real time
- **Step dependencies** (`dependsOn`) — later steps receive prior results as context
- **Context deduplication** — the same file/symbol combination is retrieved only once per session
- **Adaptive re-planning** — if an executor step reports insufficient context (`INSUFFICIENT:`), vemora first retries with a refined BM25 search derived from the missing-context description; only if that also fails does it invoke the planner to add remediation steps
- **Planner verification** (`--verify`) — after each executor step, the planner reviews the output and can request a retry with specific feedback
- **Diff application** (`--apply`) — diffs from `write` steps are applied to the filesystem via `patch -p1`; live file contents are read before write steps to avoid stale index data
- **Session persistence** — every session is saved to `~/.vemora-cache//sessions/` after each wave; interrupted runs can be resumed with `--resume `
- **Save synthesis** — after `--synthesize`, optionally save the result as a knowledge entry
# Plan, preview, and execute with final synthesis
vemora plan "add batch() method to OpenAIEmbeddingProvider" --confirm --synthesize
# Analysis only (no code changes)
vemora plan "explain how the hybrid search pipeline works" --keyword
# Executor writes diffs, planner verifies each step, apply to disk
vemora plan "fix the N+1 query in UserRepository.findAll" --verify --apply
# Resume an interrupted session (use vemora sessions to find the ID)
vemora plan "..." --resume a1b2c3d4
#### Configuration
{
"planner": { "provider": "anthropic", "model": "claude-opus-4-6" },
"executor": { "provider": "ollama", "model": "qwen2.5-coder:14b",
"baseUrl": "http://localhost:11434" }
}
`executor` is the model that carries out each step. If `executor` is omitted, `summarization` is used as the fallback. If `planner` is omitted, both roles use the same model.
##### Using Claude Code as the planner
Set `provider: "claude-code"` to use the local `claude` CLI subprocess as the planner. The subprocess can autonomously explore the codebase with `Read`, `Grep`, and `Glob` tools before generating the plan:
{
"planner": {
"provider": "claude-code",
"model": "claude-sonnet-4-6",
"baseUrl": "/path/to/claude",
"allowedTools": ["Read", "Grep", "Glob"],
"maxBudgetUsd": 0.50
}
}
`baseUrl` is the path to the `claude` binary (default: `"claude"`, assumed on `PATH`).
### `vemora sessions`
Lists recent plan sessions for the current project, showing their short ID, status, creation date, and task preview.
vemora sessions --root .
Use the short ID printed here with `vemora plan "" --resume ` to continue an interrupted run.
### `vemora audit`
Systematic, checklist-driven code audit for **security vulnerabilities**, **performance issues**, and **bugs**. Covers every file in the codebase (or only changed files with `--since`).
Options:
--root project root (default: cwd)
--type comma-separated: security, performance, bugs (default: all three)
--since only audit files changed since this git ref (e.g. HEAD~5, main)
--budget max context tokens per step (default: 5000)
--keyword use keyword search (no embeddings required)
--output terminal (default) | json | markdown
--save save critical/high findings as knowledge entries
#### Built-in checklists
| Type | Examples |
|---|---|
| `security` | SQL/command/path injection, hardcoded secrets, weak crypto, missing auth/authz, XSS, CSRF, prototype pollution |
| `performance` | N+1 queries, sync I/O in async context, unbounded data loading, memory accumulation, blocking event loop |
| `bugs` | Null dereference, unhandled promise rejections, race conditions, resource leaks, swallowed errors, off-by-one |
#### How it works
1. The **planner** receives the file list + summaries and generates a systematic audit plan, grouping 2-5 related files per step with specific checklist items.
2. Steps execute in **parallel waves of 3** — the executor returns structured JSON findings for each group.
3. Findings are **deduplicated, sorted by severity**, and displayed as a report.
4. `--save` persists critical/high findings to the knowledge store for future sessions.
# Full audit
vemora audit --root .
# Security only
vemora audit --type security --root .
# Audit only what changed in the last commit (ideal for CI/CD)
vemora audit --since HEAD~1 --root .
# Audit changes vs main branch, save findings
vemora audit --since main --type security,bugs --save --root .
# Export for a PR review
vemora audit --since main --output markdown --root . > audit-report.md
#### Example output
── Audit Report [security] ─────────────────────────────
12 file(s) analysed · 3 finding(s)
[CRITICAL] Injection src/api/users.ts:89
User input concatenated directly into SQL query without parameterization.
→ Use parameterized queries or a query builder.
[HIGH] Hardcoded Secret src/config.ts:12
API key hardcoded in source — will be exposed in version control.
→ Move to environment variables and rotate the key.
[MEDIUM] Missing Authorization src/api/admin.ts:34
Admin endpoint does not verify that the caller has the admin role.
→ Add role check before processing the request.
─────────────────────────────────────────────────────────
1 critical · 1 high · 1 medium
### `vemora remember ""`
Saves a persistent knowledge entry to `.vemora/knowledge/entries.json`. The entry is committed to git and included automatically in future `context` and `ask` results when relevant.
When `--category` is omitted, the configured LLM classifies the entry automatically into one of the four categories. Falls back to `pattern` if no LLM is available.
Options:
--root project root (default: cwd)
--category decision | pattern | gotcha | glossary (auto-classified if omitted)
--files comma-separated related file paths
--symbols comma-separated related symbol names
--confidence high | medium | low (default: medium)
--supersedes invalidate an existing entry and link this one as its replacement
# Category auto-classified by the LLM
vemora remember "EmailService.send queues if SMTP offline — see OutboxRepository"
# Or specify explicitly
vemora remember "EmailService.send queues if SMTP offline — see OutboxRepository" \
--category gotcha \
--files src/core/email/services/email.service.ts \
--symbols EmailService.send
# Replace an existing entry (invalidates abc12345, creates a linked replacement)
vemora remember "Updated behaviour: EmailService.send now retries 3×" --supersedes abc12345
### `vemora brief`
Prints a compact session primer — project overview and knowledge entries (medium + high confidence) — designed to be run at the start of each LLM session to re-establish context with minimal tokens.
Options:
--root project root (default: cwd)
--all include all knowledge entries, including low-confidence ones
--skill task-type preset: surfaces knowledge entries most relevant
to the skill's category boost list and prepends a focused
instruction block (debug | refactor | add-feature | security | explain | test)
vemora brief --root . # overview + medium & high-confidence entries
vemora brief --root . --skill debug # gotcha + pattern entries first, debug focus note
vemora brief --root . --skill security # security gotchas surfaced first
vemora brief --root . --all # all entries, no filter
### `vemora knowledge`
Manages saved knowledge entries.
vemora knowledge list --root . # list all entries grouped by category
vemora knowledge update "" --root . # edit an existing entry in-place
vemora knowledge forget --root . # remove an entry by ID (prefix match)
`knowledge update` replaces the body of an existing entry without creating a new one — use it to correct a typo or refine wording. For a change in meaning, prefer `remember … --supersedes ` to preserve history.
Options for knowledge update:
--root project root (default: cwd)
--title override title (auto-derived from text if omitted)
--category decision | pattern | gotcha | glossary
--confidence high | medium | low
--files comma-separated related file paths
--symbols comma-separated related symbol names
### `vemora init-agent`
Generates AI agent instruction files from the existing index. Supports Claude Code, Gemini, GitHub Copilot, Cursor, and Windsurf.
Options:
--root project root (default: cwd)
--agent target a single agent: claude, gemini, copilot, cursor, windsurf (default: all)
--force overwrite existing files that have no vemora markers
--hooks write Claude Code hooks to .claude/settings.json (claude target only)
--mcp write vemora MCP server config to .claude/settings.json (claude target only)
Use `--hooks` to register a `PreCompact` hook that reminds Claude Code to persist key decisions before context is compressed.
Use `--mcp` to add the vemora MCP server entry to `.claude/settings.json` so Claude Code discovers it automatically (requires `@modelcontextprotocol/sdk`):
# Full Claude Code setup in one command: instruction file + hooks + MCP server config
vemora init-agent --agent claude --hooks --mcp --root .
| Agent | Output file |
|---|---|
| `claude` | `CLAUDE.md` |
| `gemini` | `GEMINI.md` |
| `copilot` | `.github/copilot-instructions.md` |
| `cursor` | `.cursor/rules/vemora.mdc` (with `alwaysApply: true`) |
| `windsurf` | `.windsurfrules` |
Re-running `init-agent` only updates the auto-generated block between `` markers. Custom content outside the markers is preserved.
### `vemora summarize`
Generates LLM-powered summaries for every indexed file and a high-level project overview. **Incremental** — only re-generates summaries for files whose content has changed.
Summaries serve two purposes:
- **Retrieval**: after re-running `vemora index`, each summary is embedded as a synthetic file-overview chunk. Abstract queries ("how does auth work?") can match against natural-language descriptions rather than raw code.
- **Planner context**: used by `vemora plan` and `vemora audit` as cheap, dense context (instead of raw code chunks).
Options:
--root project root (default: cwd)
--force re-generate all summaries
--model override LLM model (default: from config)
--files-only only generate per-file summaries
--project-only (re)generate project overview from existing file summaries
--show print the existing project overview without regenerating
vemora summarize --show --root . # print overview without regenerating
### `vemora status`
Prints index stats, embedding cache info, knowledge store summary, and a count of TODO/FIXME/HACK/XXX annotations by type.
### `vemora deps `
Shows the full dependency context for a file: what it imports, what imports it.
Options:
--root project root (default: cwd)
-d, --depth transitive depth for outgoing imports (default: 1)
-r, --reverse-depth transitive depth for incoming importers (default: 1)
# All files that depend on SyncOrchestrator, up to 3 hops
vemora deps src/core/sync/SyncOrchestrator.ts --root . --reverse-depth 3
### `vemora usages `
Finds all files that use a named symbol, following re-export chains.
Options:
--root project root (default: cwd)
-d, --depth max re-export chain depth to follow (default: 10)
--callers-only show only files with call graph data
### `vemora chat`
Interactive chat session with the codebase. Supports OpenAI, Anthropic, Gemini, and Ollama.
vemora chat --provider anthropic --model claude-opus-4-6
vemora chat --provider ollama --model qwen2.5-coder:14b
### `vemora report`
Shows a usage statistics report: commands breakdown, token savings, and most frequent query terms.
Options:
--root project root (default: cwd)
--days limit report to events from the last N days
-v, --verbose show per-query breakdown (last 20 queries)
--clear clear all recorded usage data
### `vemora triage`
Zero-LLM static heuristic scan for bugs, security issues, and performance problems. Works entirely from the existing index — no API key or network access required.
Options:
--root project root (default: cwd)
--type comma-separated: bugs, security, performance (default: all)
-k, --top-k max findings to return, ranked by score (default: 30)
--min-score skip findings below this threshold (default: 1)
--file restrict scan to files matching this substring
--output terminal (default) | json | markdown
Each finding includes a severity (high/medium/low), a reason, and the exact code location.
# Full scan
vemora triage --root .
# Bugs only, top 10, export to Markdown
vemora triage --type bugs -k 10 --output markdown --root .
# Security scan limited to the API layer
vemora triage --type security --file src/api --root .
Heuristics cover: empty catch blocks, unguarded `JSON.parse`, sync I/O in loops, `any` casts, hardcoded secrets, dangerous `eval`/`exec`, prototype pollution, SQL/command injection patterns, and more.
### `vemora dead-code`
Zero-LLM static analysis for unused code. Works entirely from the existing index — no API key or network access required.
Options:
--root project root (default: cwd)
--type comma-separated: uncalled-private, unused-export, unreachable-file (default: all)
--output terminal (default) | json
Three detection categories:
| Type | What it finds |
|---|---|
| `uncalled-private` | Private functions and methods with an entry in the call graph but no recorded callers |
| `unused-export` | Exported symbols not imported by any file in the dep graph (namespace imports excluded) |
| `unreachable-file` | Files that export symbols but are never imported by any other file in the project |
# Full scan — all three categories
vemora dead-code --root .
# Only private methods with no callers
vemora dead-code --type uncalled-private --root .
# Machine-readable output
vemora dead-code --output json --root . | jq '.[] | select(.type == "unused-export")'
**Caveats:** call graph coverage is incomplete for dynamic dispatch, arrow functions assigned to variables, and `require()` calls. Namespace imports (`import * as X`) prevent false positives on `unused-export` by marking the whole file as used. Entry points (`cli.ts`, `index.ts`, `main.ts`, etc.) are excluded from `unreachable-file`.
### `vemora mcp`
Starts a **Model Context Protocol server** that exposes vemora's search capabilities as native MCP tools. Unlike the CLI (which spawns a new process per command and re-reads the index from disk each time), the MCP server loads the index **once at startup** and serves all tool calls from memory — significantly faster on large codebases.
Options:
--root project root directory (default: cwd)
Requires `@modelcontextprotocol/sdk`:
npm install @modelcontextprotocol/sdk
# or: pnpm add @modelcontextprotocol/sdk
#### MCP tools
| Tool | Description |
|---|---|
| `search` | Semantic / hybrid / keyword search; supports all `context` flags (`mmr`, `rerank`, `session`, `structured`, `skill`, `budget`) |
| `focus` | Implementation, deps, callers, tests, and knowledge for a file or symbol |
| `brief` | Session primer: project overview + knowledge entries |
| `remember` | Persist a knowledge entry |
| `deps` | Dependency graph for a file |
#### Configuration
**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
{
"mcpServers": {
"vemora": {
"command": "vemora",
"args": ["mcp", "--root", "/absolute/path/to/your/project"]
}
}
}
**Claude Code** (`.claude/settings.json` in the project root):
{
"mcpServers": {
"vemora": {
"command": "vemora",
"args": ["mcp", "--root", "."]
}
}
}
Once the server is running, the CLAUDE.md / GEMINI.md instruction files (generated by `vemora init-agent`) remain useful as fallback guidance for agents that call vemora via Bash, but MCP-compatible clients will call the tools natively without needing those instructions.
### `vemora focus `
Aggregates all structural context about a file or symbol in one call — replaces the need to run `context`, `deps`, `usages`, and `knowledge` separately.
Options:
--root project root (default: cwd)
--format markdown (default) | plain
--budget max tokens to include in output (truncates from the end)
--lines restrict implementation output to chunks overlapping a line range
--depth expand class members: method — shows implementation for each member
`` can be a file path (full or partial) or a symbol name:
# File focus — exports, chunks, imports, importers, call graph, tests, knowledge
vemora focus src/core/email/services/email.service.ts --root .
vemora focus email.service --root . # partial path match
# Symbol focus — implementation, callers, callees, members, tests
vemora focus EmailService --root . # shows Methods section with member list
vemora focus EmailService --root . --depth method # also expands each method's implementation
vemora focus EmailService.send --root . # focus on a single method directly
# Restrict to a specific line range (useful when --budget cuts off a method)
vemora focus src/core/email/services/email.service.ts --root . --lines 200-280
# Pipe into a context block for any LLM
vemora focus src/search/hybrid.ts --root . --format plain > context.md
## Configuration
Edit `.vemora/config.json` after `init`:
{
"projectId": "b88eb8199f78331e",
"projectName": "my-app",
"version": "1.0.0",
"include": ["**/*.ts", "**/*.tsx"],
"exclude": ["**/node_modules/**", "**/dist/**"],
"maxChunkLines": 80,
"maxChunkChars": 3000,
"embedding": {
"provider": "ollama",
"model": "nomic-embed-text",
"dimensions": 768
},
"summarization": {
"provider": "ollama",
"model": "gemma4:e2b",
"baseUrl": "http://localhost:11434"
},
"reranker": {
"provider": "ollama"
},
"display": {
"format": "terse"
}
}
### Planner-executor configuration
Add `planner` and `executor` blocks to use different models for planning and execution:
{
"planner": {
"provider": "anthropic",
"model": "claude-opus-4-6"
},
"executor": {
"provider": "gemini",
"model": "gemini-2.0-flash",
"apiKey": "your-google-ai-studio-key"
}
}
`planner` is used by `vemora plan` and `vemora audit`. `executor` handles step execution. Fallback chain: `executor` → `summarization` → same model for both roles.
The `planner` config also accepts two extra fields when using `claude-code`:
| Field | Type | Description |
|---|---|---|
| `allowedTools` | `string[]` | Tools the subprocess may call (default: `["Read","Grep","Glob"]`) |
| `maxBudgetUsd` | `number` | Spend cap per plan call in USD (default: `0.50`) |
### `display.format`
Sets the default output format for `query`, `context`, and `ask`. Set to `"terse"` for small/local models with limited context windows.
### Embedding providers
| Provider | Config | Notes |
|---|---|---|
| `openai` | `OPENAI_API_KEY` env or `apiKey` in config | Best quality. Requires `npm install openai`. |
| `ollama` | `baseUrl`, `maxChars` (see below) | Local, no cost. |
| `none` | — | Keyword search only, no embeddings. |
#### Ollama embedding options
| Field | Default | Description |
|---|---|---|
| `model` | `"nomic-embed-text"` | Embedding model to pull and use |
| `dimensions` | `768` | Must match the model output dimensions |
| `baseUrl` | `"http://localhost:11434"` | Ollama server URL |
| `maxChars` | `3800` | Max characters per chunk before truncation. Prevents exceeding the model's context window. Increase for models with larger context (e.g. `mxbai-embed-large`: ~8000). |
"embedding": {
"provider": "ollama",
"model": "nomic-embed-text",
"dimensions": 768,
"maxChars": 3800
}
### LLM providers
Used by `ask`, `chat`, `summarize`, `plan`, and `audit`.
| Provider | Config | Notes |
|---|---|---|
| `openai` | `OPENAI_API_KEY` env or `apiKey` in config | Also works with any OpenAI-compatible endpoint via `baseUrl`. |
| `anthropic` | `ANTHROPIC_API_KEY` env or `apiKey` in config | Requires `npm install @anthropic-ai/sdk`. |
| `gemini` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` env or `apiKey` in config | Uses Google's OpenAI-compatible endpoint. Free tier available via Google AI Studio. |
| `ollama` | `baseUrl` (default: `http://localhost:11434`) | Local, no cost. |
| `claude-code` | `baseUrl` = path to `claude` binary (default: `"claude"`) | Planner-only. Spawns the Claude Code CLI subprocess; the subprocess can explore the codebase with `Read`/`Grep`/`Glob` before answering. Requires Claude Code installed and authenticated. |
#### OpenAI-compatible endpoints
The `openai` provider accepts a `baseUrl` field, enabling any compatible API:
{ "provider": "openai", "model": "llama-3.3-70b-versatile", "baseUrl": "https://api.groq.com/openai/v1", "apiKey": "..." }
| Service | `baseUrl` | Free tier |
|---|---|---|
| Groq | `https://api.groq.com/openai/v1` | Yes (rate limited) |
| OpenRouter | `https://openrouter.ai/api/v1` | Some free models |
| Gemini (compat) | `https://generativelanguage.googleapis.com/v1beta/openai/` | Yes |
### Reranker
Controls how search results are re-scored when `--rerank` is passed to `query`, `context`, or `ask`, and always in `chat`.
| Provider | Config | Notes |
|---|---|---|
| `xenova` | _(no extra config)_ | Local cross-encoder (`ms-marco-MiniLM-L-6-v2`). Best quality. Requires `npm install @xenova/transformers`. |
| `ollama` | `model` (optional), `baseUrl` (optional) | Uses the configured LLM to rank results in a single call. No extra dependency. |
| `none` | — | Skip reranking entirely. |
{
"reranker": { "provider": "ollama" }
}
When `provider` is `ollama` and `model` is omitted, the model from `summarization` is used.
### Recommended configurations
**Maximum quality (cloud)**
{
"planner": { "provider": "anthropic", "model": "claude-opus-4-6" },
"executor": { "provider": "openai", "model": "gpt-4o-mini" }
}
**Claude Code as planner + free executor**
{
"planner": { "provider": "claude-code", "model": "claude-sonnet-4-6",
"allowedTools": ["Read","Grep","Glob"], "maxBudgetUsd": 0.50 },
"executor": { "provider": "ollama", "model": "qwen2.5-coder:14b",
"baseUrl": "http://localhost:11434" }
}
**Pro planner + free executor**
{
"planner": { "provider": "anthropic", "model": "claude-opus-4-6" },
"executor": { "provider": "gemini", "model": "gemini-2.0-flash", "apiKey": "..." }
}
**Fully local (no API keys)**
{
"embedding": { "provider": "ollama", "model": "nomic-embed-text", "dimensions": 768 },
"summarization": { "provider": "ollama", "model": "gemma4:e2b" },
"executor": { "provider": "ollama", "model": "qwen2.5-coder:7b" },
"reranker": { "provider": "ollama" },
"display": { "format": "terse" }
}
Other local executor models that work well: `qwen2.5-coder:14b`, `llama3.2`, `mistral`.
## What goes in git
✓ .vemora/config.json
✓ .vemora/metadata.json
✓ .vemora/index/files.json
✓ .vemora/index/chunks.json
✓ .vemora/index/symbols.json
✓ .vemora/index/deps.json
✓ .vemora/index/callgraph.json
✓ .vemora/summaries/file-summaries.json
✓ .vemora/summaries/project-summary.json
✓ .vemora/knowledge/entries.json ← shared knowledge store
✗ .vemora-cache/ ← local embedding vectors (gitignored)
## Incremental indexing
Chunk IDs are derived from `sha256(filePath + content)`. If a function's code doesn't change, its chunk ID is stable across branches — embeddings are reused without any API call.
## Tech stack
- **TypeScript + Node.js** (CommonJS, ES2022 target)
- **commander** — CLI framework
- **fast-glob** — repository scanning
- **tree-sitter** (optional) — AST-based symbol extraction for TS/JS
- **openai** SDK _(optional)_ — embedding generation, OpenAI and Gemini LLM provider; `npm install openai`
- **@anthropic-ai/sdk** _(optional)_ — Anthropic/Claude LLM provider; `npm install @anthropic-ai/sdk`
- **@modelcontextprotocol/sdk** _(optional)_ — MCP server transport for `vemora mcp`; `npm install @modelcontextprotocol/sdk`
- **@xenova/transformers** _(optional)_ — local cross-encoder model for `--rerank` with `reranker.provider = "xenova"`; `npm install @xenova/transformers`. Not needed if using `reranker.provider = "ollama"` or `"none"`.
- **hnsw** — HNSW index for sub-millisecond vector search
- **chokidar** — file watching for `--watch` mode
- **chalk + ora** — terminal output
标签:AI辅助开发, MITM代理, RAG, SQL查询, 云安全监控, 代码索引, 暗色界面, 本地知识库, 自动化攻击, 静态分析