maku85/vemora

GitHub: maku85/vemora

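vemora builds a versioned, structured index and a RAG retrieval layer for large codebases, so that LLM-assisted development can pull exactly the code context it needs, reducing token consumption and improving code discovery.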

# vemora

[![npm version](https://img.shields.io/npm/v/vemora?label=npm)](https://www.npmjs.com/package/vemora) [![npm alpha](https://img.shields.io/npm/v/vemora/alpha?label=alpha)](https://www.npmjs.com/package/vemora?activeTab=versions) [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

Repository-local memory system for LLM-assisted development. Builds a structured, versioned index of your codebase — code chunks, symbols, dependency graph, and LLM-generated summaries — and enables semantic or keyword search over it.

The result is a **RAG (Retrieval-Augmented Generation) layer** that lets you give an LLM only the code it actually needs, instead of entire files.

## Why

When working on a large codebase with Claude Code or similar LLM tools, you face two problems:

1. **Context cost** — dropping 50 files into the context wastes tokens on irrelevant code
2. **Discovery** — you don't always know *which* files are relevant to a given task

`vemora` solves both by pre-indexing the repo and making it queryable. It also provides higher-level commands that go beyond retrieval:

- **`vemora plan`** — a pro LLM (planner) decomposes a complex task into concrete steps; a smaller/free LLM (executor) carries out each step against targeted code context. Cuts costs by using expensive models only where they matter.
- **`vemora audit`** — systematic, checklist-driven analysis of your codebase for security vulnerabilities, performance issues, and bugs. Covers every file, produces structured findings with severity levels.
- **`vemora triage`** — zero-LLM static heuristic scan for bugs, security issues, and performance problems. Instant results with no API calls, useful as a first pass before a deeper audit.
- **`vemora dead-code`** — zero-LLM static analysis that detects unused private symbols, exports nobody imports, and files that are never imported. Works entirely from the call graph and dependency graph already in the index.
- **`vemora focus`** — aggregates all structural context about a file or symbol in one shot: implementation, exports, dependency graph, callers, test files, and saved knowledge.

## Architecture in three layers

    .vemora/                      ← versioned in git, shared across the team
      config.json
      metadata.json
      index/
        files.json                ← file hashes for incremental indexing
        chunks.json               ← code chunks (function/class/window slices)
        symbols.json              ← extracted symbol map
        deps.json                 ← intra-project dependency graph
        callgraph.json            ← function-level call relationships
        todos.json                ← TODO/FIXME/HACK/XXX annotations extracted from source
      summaries/
        file-summaries.json       ← LLM-generated 2-3 line description per file
        project-summary.json      ← LLM-generated ~500 word project overview
      knowledge/
        entries.json              ← human/LLM-authored notes: decisions, gotchas, patterns

    ~/.vemora-cache//             ← local to each developer, NOT in git
      embeddings.json             ← metadata (model, dimensions, chunk mapping)
      embeddings.bin              ← binary buffer of vectors (Float32Array)
      embeddings.hnsw.json        ← serialized HNSW index for ultra-fast search

The index, summaries, and knowledge entries are committed to git so teammates share them. Embeddings are generated locally by each developer from the shared index.

## Installation

    # Inside the vemora/ directory
    pnpm install
    pnpm build

    # Link globally (optional)
    pnpm link

Or run directly with `node vemora/dist/cli.js` from the project root.

### Installing the alpha version from npm

    pnpm install vemora@alpha       # local
    pnpm install -g vemora@alpha    # global
    # or with npm:
    npm install -g vemora@alpha

## The Core Workflow

### 1. Setup (first time only)

    vemora init                  # create .vemora/ and config.json
    vemora index --no-embed      # build index without embeddings (fast)
    vemora index                 # or: build index + generate embeddings
    vemora summarize             # recommended: generate LLM descriptions per file
    vemora init-agent            # generate instruction files for AI agents
    vemora init-agent --hooks    # also write Claude Code auto-save hooks

### 1b. Start of each session (CLI mode)

    vemora brief --root .        # compact primer: project overview + critical knowledge

### 1c. MCP server mode (Claude Desktop / Claude Code)

Install the SDK once, then start the server:

    npm install @modelcontextprotocol/sdk
    vemora mcp --root .

Add to `.claude/settings.json` (Claude Code) or `claude_desktop_config.json` (Claude Desktop) — see [`vemora mcp`](#vemora-mcp) for full config. In MCP mode, the agent calls `search`, `focus`, `brief`, `remember`, and `deps` as native tools without manual `vemora context` invocations.

### 2. Query during development

    # Search for relevant code
    vemora query "how does IMAP reconnect work?"

    # Full context block ready to paste into any LLM
    vemora context --query "email retry logic" > context.md

    # Context with a task-type skill preset (pre-configures retrieval + adds focus note)
    vemora context --query "auth flow fails silently" --skill debug
    vemora context --query "extract retry logic to utility" --skill refactor
    vemora context --query "add webhook support" --skill add-feature

    # One-shot answer from the configured LLM
    vemora ask "why does the sync queue stall?"

    # All context about a file or symbol in one call (no LLM needed)
    vemora focus src/core/email/services/email.service.ts
    vemora focus EmailService.send

    # Static scan for bugs/perf/security (no API key required)
    vemora triage --type bugs,performance

    # Save a finding for future sessions
    vemora remember "EmailService.send queues if SMTP is offline — see OutboxRepository"

### 3. Complex tasks with the planner-executor pattern

    # Pro LLM plans, small/free LLM executes each step
    vemora plan "add rate limiting to the API layer" --confirm --synthesize

    # Audit the codebase for issues
    vemora audit --type security --root .
    vemora audit --since HEAD~1    # only changed files (great for CI)

### 4. Keep the index fresh

    vemora index --watch       # incremental re-index on file save
    vemora index --no-embed    # after code changes, update structure only

## Commands

### `vemora init`

Creates the `.vemora/` folder structure and adds `.vemora-cache/` to `.gitignore`.

Options:

    --root    project root (default: cwd)

### `vemora index`

Scans the repo, parses symbols, builds the dependency graph, extracts TODO/FIXME/HACK/XXX annotations, and generates embeddings. **Incremental** — only re-processes files whose SHA-256 hash has changed.

If LLM-generated summaries exist (from `vemora summarize`), each summary is also embedded as a synthetic file-overview chunk. This allows abstract queries ("how does authentication work?") to match against natural-language descriptions rather than raw code.

Options:

    --root         project root (default: cwd)
    --force        re-index all files, ignoring hashes
    --no-embed     skip embedding generation (index structure only)
    -w, --watch    watch for changes and re-index automatically

### `vemora query ""`

Searches the index using vector similarity (or keyword fallback). Results use a **three-tier display** that compresses output by relevance rank.
Options:

    --root         project root (default: cwd)
    -k, --top-k    number of results (default: 10)
    -c, --show-code
                   show full code for all results (overrides tier system)
    --keyword      force keyword/BM25 search (no API call needed)
    --format       output format: terminal (default) | json | markdown | terse
    --rerank       re-score results with a cross-encoder model
    --hybrid       use hybrid search (vector + BM25)
    --alpha        hybrid weight for vector search (0-1, default 0.7)
    --budget       max tokens to include across results
    --mmr          apply Maximal Marginal Relevance to diversify results
    --lambda       MMR relevance weight (0=diverse, 1=relevant, default 0.5)
    --merge        merge adjacent chunks from the same file
    --merge-gap    max line gap between chunks to still merge (default 3)
    --centrality   boost results from widely-imported files (dep-graph centrality)
    --session      enable session tracking: demote already-seen chunks (×0.3) and
                   compress seen-file chunks to signature only
                   (auto-expires after 30 min idle)
    --fresh        reset session memory before this query (implies --session)

#### Output formats

| Format | Use case |
|---|---|
| `terminal` | Default coloured output for interactive use |
| `json` | Machine-readable — for piping to scripts |
| `markdown` | Paste-ready Markdown with code blocks |
| `terse` | One line per result — recommended for small/local models |

#### Output tiers (terminal/markdown)

| Rank | Tier | Content shown |
|------|------|--------------|
| 1–3 | high | Full code block (capped at 30 lines) |
| 4–7 | med | Declaration signature only |
| 8+ | low | File path + symbol + score + AI summary |

### `vemora context`

Generates an **optimized LLM context block** combining project overview, a specific file, and relevant code chunks. Designed to be piped to a file or clipboard.
Options:

    --root         project root (default: cwd)
    -q, --query    natural-language query to find relevant code
    -f, --file     include a specific file in full with its dependency graph
    -k, --top-k    number of search results to include (default: 5)
    --keyword      use keyword search instead of semantic search
    --show-code    show full code without line cap
    --format       output format: markdown (default) | plain | terse
    --rerank       re-score results with a cross-encoder model
    --hybrid       use hybrid search (vector + BM25)
    --alpha        hybrid weight for vector search (0-1, default 0.7)
    --budget       max tokens to include across retrieved chunks
    --mmr          apply Maximal Marginal Relevance to diversify results
    --lambda       MMR relevance weight (0=diverse, 1=relevant, default 0.5)
    --merge        merge adjacent or overlapping chunks from the same file
    --merge-gap    max line gap between chunks to still merge (default 3)
    --structured   emit a structured block (Entry Point / Dependencies / Types / Patterns)
    --since        restrict search to files changed since this git ref (e.g. HEAD~5, main)
    --skill        task-type preset — pre-configures retrieval and prepends a focused
                   instruction block
                   (debug | refactor | add-feature | security | explain | test)
    --centrality   boost results from widely-imported files (dep-graph centrality)
    --session      enable session tracking: demote already-seen chunks and compress
                   seen-file chunks to signature only (auto-expires after 30 min idle)
    --fresh        reset session memory before this query (implies --session)

At least one of `--query` or `--file` is required.

When `--file` is used, the context block also includes:

- **Recent git commits** that touched the file (last 5, via `git log --follow`)
- **TODO/FIXME/HACK/XXX annotations** present in the file
- **Test files** linked to the file — convention-based and import-based discovery
- **Symbol callers** — for each symbol defined in the file, which other project symbols call it

#### Skills (`--skill`)

A skill is a **task-type preset** that does two things at once:

1. **Pre-configures the retrieval pipeline** — sets task-appropriate defaults for `topK`, `hybrid`, `mmr`, `budget`, and `structured` so you don't need to tune flags manually.
2. **Prepends a focused instruction block** to the output — tells the LLM exactly what kind of reasoning to apply (error paths, blast radius, code style, etc.).

Explicit flags always override skill defaults.

| Skill | Best for | Key retrieval changes |
|---|---|---|
| `debug` | Tracing errors, call paths, exception handling | topK 8, hybrid (BM25 for error strings), high MMR relevance |
| `refactor` | Safe minimal-diff changes | structured output (callers explicit), topK 6 |
| `add-feature` | Adding new functionality following existing patterns | structured + high MMR diversity to surface varied patterns |
| `security` | Security review, injection, auth, path traversal | BM25-heavy (α=0.55), boosts `gotcha` knowledge entries |
| `explain` | Understanding purpose and design decisions | Maximum MMR diversity (λ=0.4), lower budget |
| `test` | Writing or improving tests | hybrid search, surfaces test file patterns |

    vemora context --root . --query "auth flow fails silently" --skill debug
    vemora context --root . --query "extract retry logic to utility" --skill refactor
    vemora context --root . --query "add webhook support" --skill add-feature

#### Session tracking (`--session`)

When `--session` is active, vemora maintains a per-project session file (`~/.vemora-cache//session.json`) that tracks which chunks and files have already been returned. On subsequent queries, rather than hard-filtering seen results:

- **Chunk decay** — already-seen chunk scores are multiplied by 0.3, so they remain candidates if genuinely re-relevant but don't dominate fresh results.
- **Signature compression** — chunks from files already seen in this session have their content replaced with just the declaration signature, saving tokens without losing location context.

Sessions auto-expire after 30 minutes of idle time.
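The two session rules can be sketched roughly as follows. The 0.3 decay factor and the 30-minute idle expiry are the values given in this README; the `ScoredChunk` and `SessionState` shapes and the `applySession` function are hypothetical names chosen for illustration, not vemora's internal API.

```typescript
// Hypothetical shapes; field names are assumptions for illustration.
interface ScoredChunk {
  id: string;
  file: string;
  score: number;
  signature: string;
  content: string;
}

interface SessionState {
  seenChunkIds: string[];
  seenFiles: string[];
  lastUsedMs: number;
}

const SEEN_CHUNK_DECAY = 0.3;          // multiplier for already-seen chunks
const IDLE_EXPIRY_MS = 30 * 60 * 1000; // 30 minutes of idle time

// Demotes seen chunks, compresses chunks from seen files, then re-sorts by score.
function applySession(
  chunks: ScoredChunk[],
  session: SessionState,
  nowMs: number
): ScoredChunk[] {
  // An expired session behaves like an empty one.
  const expired = nowMs - session.lastUsedMs > IDLE_EXPIRY_MS;
  const seenChunkIds: string[] = expired ? [] : session.seenChunkIds;
  const seenFiles: string[] = expired ? [] : session.seenFiles;

  return chunks
    .map((chunk) => {
      const seenChunk = seenChunkIds.indexOf(chunk.id) !== -1;
      const seenFile = seenFiles.indexOf(chunk.file) !== -1;
      return {
        id: chunk.id,
        file: chunk.file,
        signature: chunk.signature,
        // Chunk decay: keep the chunk as a candidate, but scaled down.
        score: seenChunk ? chunk.score * SEEN_CHUNK_DECAY : chunk.score,
        // Signature compression: keep the location, drop the body.
        content: seenFile ? chunk.signature : chunk.content,
      };
    })
    .sort((a, b) => b.score - a.score);
}
```

Because seen chunks are scaled rather than removed, a strongly matching chunk can still outrank weak fresh results.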
Use `--fresh` to reset explicitly.

    vemora query "error handling" --session --root .    # first query: full results
    vemora query "logging setup" --session --root .     # seen chunks demoted, seen files compressed
    vemora query "error handling" --fresh --root .      # reset session, start fresh

### `vemora ask ""`

One-shot Q&A: retrieves relevant context and calls the configured LLM to answer directly.

Options:

    --root           project root (default: cwd)
    -k, --top-k      chunks to retrieve (default: 5)
    --keyword        use keyword search (no embeddings needed)
    --hybrid         use hybrid vector+BM25 search
    --budget         max context tokens to send to LLM (default: 6000)
    --show-context   print the retrieved context before the answer
    --terse          inject brevity constraint into LLM system prompt
                     (~50-70% fewer output tokens)

    vemora ask "how does the IMAP reconnect logic work?" --root .
    vemora ask "what does EmailService.send do?" --root . --keyword

### `vemora plan ""`

**Planner-executor pattern**: a capable LLM decomposes the task into a structured plan; a smaller/cheaper LLM executes each step against targeted code context.

The planner works from **file summaries and the symbol list** — not raw code — so its token cost stays low regardless of codebase size. The executor receives only the chunks relevant to its specific step (targeted by file/symbol, not just search).
Options:

    --root           project root (default: cwd)
    -k, --top-k      chunks to retrieve per step when falling back to search (default: 5)
    --keyword        use keyword search (no embeddings required)
    --budget         max context tokens per step (default: 4000)
    --confirm        show the plan and ask for confirmation before executing
    --synthesize     call the planner again after all steps to produce a single final answer
    --show-context   print retrieved context for each step
    --verify         after each executor step, have the planner review the output
    --apply          automatically apply unified diffs produced by write steps (via patch -p1)
    --max-retries    max re-runs of a step when verification fails (default: 2)
    --resume         resume a previous session by short ID (first 8 chars) or full UUID
    --terse          inject brevity constraint into executor analyze and synthesis prompts
                     (~50-70% fewer output tokens)

#### Step action types

| Action | Behaviour |
|---|---|
| `read` | Pull code into context — no LLM call, zero executor tokens |
| `analyze` | Executor answers a question in prose |
| `write` | Executor produces a unified diff ready to apply |
| `test` | Run a shell command; capture stdout/stderr as step result |

#### Key features

- **Parallel execution** — steps without dependencies run concurrently; sequential steps stream tokens to stdout in real time
- **Step dependencies** (`dependsOn`) — later steps receive prior results as context
- **Context deduplication** — the same file/symbol combination is retrieved only once per session
- **Adaptive re-planning** — if an executor step reports insufficient context (`INSUFFICIENT:`), vemora first retries with a refined BM25 search derived from the missing-context description; only if that also fails does it invoke the planner to add remediation steps
- **Planner verification** (`--verify`) — after each executor step, the planner reviews the output and can request a retry with specific feedback
- **Diff application** (`--apply`) — diffs from `write` steps are applied to the filesystem via `patch -p1`; live file contents are read before write steps to avoid stale index data
- **Session persistence** — every session is saved to `~/.vemora-cache//sessions/` after each wave; interrupted runs can be resumed with `--resume `
- **Save synthesis** — after `--synthesize`, optionally save the result as a knowledge entry

Examples:

    # Plan, preview, and execute with final synthesis
    vemora plan "add batch() method to OpenAIEmbeddingProvider" --confirm --synthesize

    # Analysis only (no code changes)
    vemora plan "explain how the hybrid search pipeline works" --keyword

    # Executor writes diffs, planner verifies each step, apply to disk
    vemora plan "fix the N+1 query in UserRepository.findAll" --verify --apply

    # Resume an interrupted session (use vemora sessions to find the ID)
    vemora plan "..." --resume a1b2c3d4

#### Configuration

    {
      "planner": {
        "provider": "anthropic",
        "model": "claude-opus-4-6"
      },
      "executor": {
        "provider": "ollama",
        "model": "qwen2.5-coder:14b",
        "baseUrl": "http://localhost:11434"
      }
    }

`executor` is the model that carries out each step. If `executor` is omitted, `summarization` is used as the fallback. If `planner` is omitted, both roles use the same model.

##### Using Claude Code as the planner

Set `provider: "claude-code"` to use the local `claude` CLI subprocess as the planner. The subprocess can autonomously explore the codebase with `Read`, `Grep`, and `Glob` tools before generating the plan:

    {
      "planner": {
        "provider": "claude-code",
        "model": "claude-sonnet-4-6",
        "baseUrl": "/path/to/claude",
        "allowedTools": ["Read", "Grep", "Glob"],
        "maxBudgetUsd": 0.50
      }
    }

`baseUrl` is the path to the `claude` binary (default: `"claude"`, assumed on `PATH`).

### `vemora sessions`

Lists recent plan sessions for the current project, showing their short ID, status, creation date, and task preview.

    vemora sessions --root .

Use the short ID printed here with `vemora plan "" --resume ` to continue an interrupted run.
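For readers wondering how a short ID can stand in for a full UUID, here is a minimal sketch of prefix-based lookup. The README only states that `--resume` accepts either the first 8 characters or the full UUID; the `PlanSession` shape, the function names, and the handling of ambiguous prefixes are assumptions made for this example.

```typescript
// Hypothetical session record; field names are assumptions for illustration.
interface PlanSession {
  id: string;     // full UUID
  status: string;
  task: string;
}

// Short ID as described in the README: the first 8 characters of the UUID.
function shortId(id: string): string {
  return id.slice(0, 8);
}

// Resolves a short ID or full UUID to exactly one session.
// Rejecting ambiguous prefixes is an assumption, not documented behaviour.
function resolveSession(sessions: PlanSession[], idOrPrefix: string): PlanSession {
  const needle = idOrPrefix.trim().toLowerCase();
  if (needle.length === 0) {
    throw new Error("empty session ID");
  }
  // A full UUID is a prefix of itself, so one check covers both forms.
  const matches = sessions.filter((s) => s.id.toLowerCase().indexOf(needle) === 0);
  if (matches.length === 0) {
    throw new Error("no session matches '" + idOrPrefix + "'");
  }
  if (matches.length > 1) {
    throw new Error("'" + idOrPrefix + "' is ambiguous: " + matches.length + " sessions match");
  }
  return matches[0];
}
```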
### `vemora audit`

Systematic, checklist-driven code audit for **security vulnerabilities**, **performance issues**, and **bugs**. Covers every file in the codebase (or only changed files with `--since`).

Options:

    --root      project root (default: cwd)
    --type      comma-separated: security, performance, bugs (default: all three)
    --since     only audit files changed since this git ref (e.g. HEAD~5, main)
    --budget    max context tokens per step (default: 5000)
    --keyword   use keyword search (no embeddings required)
    --output    terminal (default) | json | markdown
    --save      save critical/high findings as knowledge entries

#### Built-in checklists

| Type | Examples |
|---|---|
| `security` | SQL/command/path injection, hardcoded secrets, weak crypto, missing auth/authz, XSS, CSRF, prototype pollution |
| `performance` | N+1 queries, sync I/O in async context, unbounded data loading, memory accumulation, blocking event loop |
| `bugs` | Null dereference, unhandled promise rejections, race conditions, resource leaks, swallowed errors, off-by-one |

#### How it works

1. The **planner** receives the file list + summaries and generates a systematic audit plan, grouping 2-5 related files per step with specific checklist items.
2. Steps execute in **parallel waves of 3** — the executor returns structured JSON findings for each group.
3. Findings are **deduplicated, sorted by severity**, and displayed as a report.
4. `--save` persists critical/high findings to the knowledge store for future sessions.

Examples:

    # Full audit
    vemora audit --root .

    # Security only
    vemora audit --type security --root .

    # Audit only what changed in the last commit (ideal for CI/CD)
    vemora audit --since HEAD~1 --root .

    # Audit changes vs main branch, save findings
    vemora audit --since main --type security,bugs --save --root .

    # Export for a PR review
    vemora audit --since main --output markdown --root . > audit-report.md

#### Example output

    ── Audit Report [security] ─────────────────────────────
    12 file(s) analysed · 3 finding(s)

    [CRITICAL] Injection
      src/api/users.ts:89
      User input concatenated directly into SQL query without parameterization.
      → Use parameterized queries or a query builder.

    [HIGH] Hardcoded Secret
      src/config.ts:12
      API key hardcoded in source — will be exposed in version control.
      → Move to environment variables and rotate the key.

    [MEDIUM] Missing Authorization
      src/api/admin.ts:34
      Admin endpoint does not verify that the caller has the admin role.
      → Add role check before processing the request.

    ─────────────────────────────────────────────────────────
    1 critical · 1 high · 1 medium

### `vemora remember ""`

Saves a persistent knowledge entry to `.vemora/knowledge/entries.json`. The entry is committed to git and included automatically in future `context` and `ask` results when relevant.

When `--category` is omitted, the configured LLM classifies the entry automatically into one of the four categories. Falls back to `pattern` if no LLM is available.
Options:

    --root         project root (default: cwd)
    --category     decision | pattern | gotcha | glossary (auto-classified if omitted)
    --files        comma-separated related file paths
    --symbols      comma-separated related symbol names
    --confidence   high | medium | low (default: medium)
    --supersedes   invalidate an existing entry and link this one as its replacement

    # Category auto-classified by the LLM
    vemora remember "EmailService.send queues if SMTP offline — see OutboxRepository"

    # Or specify explicitly
    vemora remember "EmailService.send queues if SMTP offline — see OutboxRepository" \
      --category gotcha \
      --files src/core/email/services/email.service.ts \
      --symbols EmailService.send

    # Replace an existing entry (invalidates abc12345, creates a linked replacement)
    vemora remember "Updated behaviour: EmailService.send now retries 3×" --supersedes abc12345

### `vemora brief`

Prints a compact session primer — project overview and knowledge entries (medium + high confidence) — designed to be run at the start of each LLM session to re-establish context with minimal tokens.

Options:

    --root    project root (default: cwd)
    --all     include all knowledge entries, including low-confidence ones
    --skill   task-type preset: surfaces knowledge entries most relevant to the skill's
              category boost list and prepends a focused instruction block
              (debug | refactor | add-feature | security | explain | test)

    vemora brief --root .                    # overview + medium & high-confidence entries
    vemora brief --root . --skill debug      # gotcha + pattern entries first, debug focus note
    vemora brief --root . --skill security   # security gotchas surfaced first
    vemora brief --root . --all              # all entries, no filter

### `vemora knowledge`

Manages saved knowledge entries.

    vemora knowledge list --root .         # list all entries grouped by category
    vemora knowledge update "" --root .    # edit an existing entry in-place
    vemora knowledge forget --root .       # remove an entry by ID (prefix match)

`knowledge update` replaces the body of an existing entry without creating a new one — use it to correct a typo or refine wording. For a change in meaning, prefer `remember … --supersedes ` to preserve history.

Options for knowledge update:

    --root                 project root (default: cwd)
    --title                override title (auto-derived from text if omitted)
    --category <cat>       decision | pattern | gotcha | glossary
    --confidence <level>   high | medium | low
    --files <paths>        comma-separated related file paths
    --symbols <names>      comma-separated related symbol names

### `vemora init-agent`

Generates AI agent instruction files from the existing index. Supports Claude Code, Gemini, GitHub Copilot, Cursor, and Windsurf.

Options:

    --root <dir>      project root (default: cwd)
    --agent <name>    target a single agent: claude, gemini, copilot, cursor, windsurf
                      (default: all)
    --force           overwrite existing files that have no vemora markers
    --hooks           write Claude Code hooks to .claude/settings.json (claude target only)
    --mcp             write vemora MCP server config to .claude/settings.json
                      (claude target only)

Use `--hooks` to register a `PreCompact` hook that reminds Claude Code to persist key decisions before context is compressed.

Use `--mcp` to add the vemora MCP server entry to `.claude/settings.json` so Claude Code discovers it automatically (requires `@modelcontextprotocol/sdk`):

    # Full Claude Code setup in one command: instruction file + hooks + MCP server config
    vemora init-agent --agent claude --hooks --mcp --root .

| Agent | Output file |
|---|---|
| `claude` | `CLAUDE.md` |
| `gemini` | `GEMINI.md` |
| `copilot` | `.github/copilot-instructions.md` |
| `cursor` | `.cursor/rules/vemora.mdc` (with `alwaysApply: true`) |
| `windsurf` | `.windsurfrules` |

Re-running `init-agent` only updates the auto-generated block between `` markers. Custom content outside the markers is preserved.
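The marker-based update described above can be sketched like this. The marker strings below are placeholders invented for the example (the real vemora markers are not reproduced in this README), and `updateGeneratedBlock` is a hypothetical helper, not part of vemora's API.

```typescript
// Placeholder markers for illustration only; these are NOT vemora's actual markers.
const BEGIN_MARKER = "<!-- generated:begin -->";
const END_MARKER = "<!-- generated:end -->";

// Replaces only the text between the markers and leaves everything else untouched.
// Returns null when the file has no marker pair, so the caller can decide
// whether to skip the file or overwrite it (compare the --force option).
function updateGeneratedBlock(existing: string, generated: string): string | null {
  const start = existing.indexOf(BEGIN_MARKER);
  if (start === -1) {
    return null;
  }
  const end = existing.indexOf(END_MARKER, start + BEGIN_MARKER.length);
  if (end === -1) {
    return null;
  }
  const before = existing.slice(0, start);
  const after = existing.slice(end + END_MARKER.length);
  return before + BEGIN_MARKER + "\n" + generated + "\n" + END_MARKER + after;
}
```

Returning `null` rather than appending a new block keeps hand-written instruction files safe by default.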
### `vemora summarize`

Generates LLM-powered summaries for every indexed file and a high-level project overview. **Incremental** — only re-generates summaries for files whose content has changed.

Summaries serve two purposes:

- **Retrieval**: after re-running `vemora index`, each summary is embedded as a synthetic file-overview chunk. Abstract queries ("how does auth work?") can match against natural-language descriptions rather than raw code.
- **Planner context**: used by `vemora plan` and `vemora audit` as cheap, dense context (instead of raw code chunks).

Options:

    --root <dir>      project root (default: cwd)
    --force           re-generate all summaries
    --model <name>    override LLM model (default: from config)
    --files-only      only generate per-file summaries
    --project-only    (re)generate project overview from existing file summaries
    --show            print the existing project overview without regenerating

    vemora summarize --show --root .    # print overview without regenerating

### `vemora status`

Prints index stats, embedding cache info, knowledge store summary, and a count of TODO/FIXME/HACK/XXX annotations by type.

### `vemora deps <file>`

Shows the full dependency context for a file: what it imports, what imports it.

Options:

    --root <dir>               project root (default: cwd)
    -d, --depth <n>            transitive depth for outgoing imports (default: 1)
    -r, --reverse-depth <n>    transitive depth for incoming importers (default: 1)

    # All files that depend on SyncOrchestrator, up to 3 hops
    vemora deps src/core/sync/SyncOrchestrator.ts --root . --reverse-depth 3

### `vemora usages <SymbolName>`

Finds all files that use a named symbol, following re-export chains.

Options:

    --root <dir>       project root (default: cwd)
    -d, --depth <n>    max re-export chain depth to follow (default: 10)
    --callers-only     show only files with call graph data

### `vemora chat`

Interactive chat session with the codebase. Supports OpenAI, Anthropic, Gemini, and Ollama.
    vemora chat --provider anthropic --model claude-opus-4-6
    vemora chat --provider ollama --model qwen2.5-coder:14b

### `vemora report`

Shows a usage statistics report: commands breakdown, token savings, and most frequent query terms.

Options:

    --root <dir>     project root (default: cwd)
    --days <n>       limit report to events from the last N days
    -v, --verbose    show per-query breakdown (last 20 queries)
    --clear          clear all recorded usage data

### `vemora triage`

Zero-LLM static heuristic scan for bugs, security issues, and performance problems. Works entirely from the existing index — no API key or network access required.

Options:

    --root <dir>       project root (default: cwd)
    --type <types>     comma-separated: bugs, security, performance (default: all)
    -k, --top-k <n>    max findings to return, ranked by score (default: 30)
    --min-score <n>    skip findings below this threshold (default: 1)
    --file <path>      restrict scan to files matching this substring
    --output <fmt>     terminal (default) | json | markdown

Each finding includes a severity (high/medium/low), a reason, and the exact code location.

    # Full scan
    vemora triage --root .

    # Bugs only, top 10, export to Markdown
    vemora triage --type bugs -k 10 --output markdown --root .

    # Security scan limited to the API layer
    vemora triage --type security --file src/api --root .

Heuristics cover: empty catch blocks, unguarded `JSON.parse`, sync I/O in loops, `any` casts, hardcoded secrets, dangerous `eval`/`exec`, prototype pollution, SQL/command injection patterns, and more.

### `vemora dead-code`

Zero-LLM static analysis for unused code. Works entirely from the existing index — no API key or network access required.
Options:

    --root <dir>      project root (default: cwd)
    --type <types>    comma-separated: uncalled-private, unused-export, unreachable-file
                      (default: all)
    --output <fmt>    terminal (default) | json

Three detection categories:

| Type | What it finds |
|---|---|
| `uncalled-private` | Private functions and methods with an entry in the call graph but no recorded callers |
| `unused-export` | Exported symbols not imported by any file in the dep graph (namespace imports excluded) |
| `unreachable-file` | Files that export symbols but are never imported by any other file in the project |

    # Full scan — all three categories
    vemora dead-code --root .

    # Only private methods with no callers
    vemora dead-code --type uncalled-private --root .

    # Machine-readable output
    vemora dead-code --output json --root . | jq '.[] | select(.type == "unused-export")'

**Caveats:** call graph coverage is incomplete for dynamic dispatch, arrow functions assigned to variables, and `require()` calls. Namespace imports (`import * as X`) prevent false positives on `unused-export` by marking the whole file as used. Entry points (`cli.ts`, `index.ts`, `main.ts`, etc.) are excluded from `unreachable-file`.

### `vemora mcp`

Starts a **Model Context Protocol server** that exposes vemora's search capabilities as native MCP tools. Unlike the CLI (which spawns a new process per command and re-reads the index from disk each time), the MCP server loads the index **once at startup** and serves all tool calls from memory — significantly faster on large codebases.
Options:

    --root <dir>    project root directory (default: cwd)

Requires `@modelcontextprotocol/sdk`:

    npm install @modelcontextprotocol/sdk
    # or: pnpm add @modelcontextprotocol/sdk

#### MCP tools

| Tool | Description |
|---|---|
| `search` | Semantic / hybrid / keyword search; supports all `context` flags (`mmr`, `rerank`, `session`, `structured`, `skill`, `budget`) |
| `focus` | Implementation, deps, callers, tests, and knowledge for a file or symbol |
| `brief` | Session primer: project overview + knowledge entries |
| `remember` | Persist a knowledge entry |
| `deps` | Dependency graph for a file |

#### Configuration

**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):

    {
      "mcpServers": {
        "vemora": {
          "command": "vemora",
          "args": ["mcp", "--root", "/absolute/path/to/your/project"]
        }
      }
    }

**Claude Code** (`.claude/settings.json` in the project root):

    {
      "mcpServers": {
        "vemora": {
          "command": "vemora",
          "args": ["mcp", "--root", "."]
        }
      }
    }

Once the server is running, the CLAUDE.md / GEMINI.md instruction files (generated by `vemora init-agent`) remain useful as fallback guidance for agents that call vemora via Bash, but MCP-compatible clients will call the tools natively without needing those instructions.

### `vemora focus <target>`

Aggregates all structural context about a file or symbol in one call — replaces the need to run `context`, `deps`, `usages`, and `knowledge` separately.

Options:

    --root <dir>           project root (default: cwd)
    --format <fmt>         markdown (default) | plain
    --budget <n>           max tokens to include in output (truncates from the end)
    --lines <start-end>    restrict implementation output to chunks overlapping a line range
    --depth <level>        expand class members: method — shows implementation for each member

`<target>` can be a file path (full or partial) or a symbol name:

    # File focus — exports, chunks, imports, importers, call graph, tests, knowledge
    vemora focus src/core/email/services/email.service.ts --root .
    vemora focus email.service --root .                  # partial path match

    # Symbol focus — implementation, callers, callees, members, tests
    vemora focus EmailService --root .                   # shows Methods section with member list
    vemora focus EmailService --root . --depth method    # also expands each method's implementation
    vemora focus EmailService.send --root .              # focus on a single method directly

    # Restrict to a specific line range (useful when --budget cuts off a method)
    vemora focus src/core/email/services/email.service.ts --root . --lines 200-280

    # Pipe into a context block for any LLM
    vemora focus src/search/hybrid.ts --root . --format plain > context.md

## Configuration

Edit `.vemora/config.json` after `init`:

    {
      "projectId": "b88eb8199f78331e",
      "projectName": "my-app",
      "version": "1.0.0",
      "include": ["**/*.ts", "**/*.tsx"],
      "exclude": ["**/node_modules/**", "**/dist/**"],
      "maxChunkLines": 80,
      "maxChunkChars": 3000,
      "embedding": {
        "provider": "ollama",
        "model": "nomic-embed-text",
        "dimensions": 768
      },
      "summarization": {
        "provider": "ollama",
        "model": "gemma4:e2b",
        "baseUrl": "http://localhost:11434"
      },
      "reranker": {
        "provider": "ollama"
      },
      "display": {
        "format": "terse"
      }
    }

### Planner-executor configuration

Add `planner` and `executor` blocks to use different models for planning and execution:

    {
      "planner": {
        "provider": "anthropic",
        "model": "claude-opus-4-6"
      },
      "executor": {
        "provider": "gemini",
        "model": "gemini-2.0-flash",
        "apiKey": "your-google-ai-studio-key"
      }
    }

`planner` is used by `vemora plan` and `vemora audit`. `executor` handles step execution. Fallback chain: `executor` → `summarization` → same model for both roles.
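One possible reading of that fallback chain is sketched below. The config keys (`planner`, `executor`, `summarization`) are the ones documented in this README; the `LlmRolesConfig` interface and the two resolver functions are hypothetical, and the exact order in which vemora falls back when several blocks are missing is an assumption.

```typescript
// Field names mirror the documented config blocks; the interfaces are illustrative.
interface ModelConfig {
  provider: string;
  model: string;
  baseUrl?: string;
  apiKey?: string;
}

interface LlmRolesConfig {
  planner?: ModelConfig;
  executor?: ModelConfig;
  summarization?: ModelConfig;
}

// Executor: explicit `executor`, else `summarization`, else reuse the planner
// (assumed meaning of "same model for both roles").
function resolveExecutor(config: LlmRolesConfig): ModelConfig {
  const resolved = config.executor || config.summarization || config.planner;
  if (!resolved) {
    throw new Error("no LLM configured: set executor, summarization, or planner");
  }
  return resolved;
}

// Planner: explicit `planner`, else whatever the executor resolves to.
function resolvePlanner(config: LlmRolesConfig): ModelConfig {
  return config.planner || resolveExecutor(config);
}
```

With only a `summarization` block configured, both functions return that block, which matches the single-model setup shown in the base config above.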
The `planner` config also accepts two extra fields when using `claude-code`:

| Field | Type | Description |
|---|---|---|
| `allowedTools` | `string[]` | Tools the subprocess may call (default: `["Read","Grep","Glob"]`) |
| `maxBudgetUsd` | `number` | Spend cap per plan call in USD (default: `0.50`) |

### `display.format`

Sets the default output format for `query`, `context`, and `ask`. Set to `"terse"` for small/local models with limited context windows.

### Embedding providers

| Provider | Config | Notes |
|---|---|---|
| `openai` | `OPENAI_API_KEY` env or `apiKey` in config | Best quality. Requires `npm install openai`. |
| `ollama` | `baseUrl`, `maxChars` (see below) | Local, no cost. |
| `none` | - | Keyword search only, no embeddings. |

#### Ollama embedding options

| Field | Default | Description |
|---|---|---|
| `model` | `"nomic-embed-text"` | Embedding model to pull and use |
| `dimensions` | `768` | Must match the model output dimensions |
| `baseUrl` | `"http://localhost:11434"` | Ollama server URL |
| `maxChars` | `3800` | Max characters per chunk before truncation. Prevents exceeding the model's context window. Increase for models with larger context (e.g. `mxbai-embed-large`: ~8000). |

    "embedding": {
      "provider": "ollama",
      "model": "nomic-embed-text",
      "dimensions": 768,
      "maxChars": 3800
    }

### LLM providers

Used by `ask`, `chat`, `summarize`, `plan`, and `audit`.

| Provider | Config | Notes |
|---|---|---|
| `openai` | `OPENAI_API_KEY` env or `apiKey` in config | Also works with any OpenAI-compatible endpoint via `baseUrl`. |
| `anthropic` | `ANTHROPIC_API_KEY` env or `apiKey` in config | Requires `npm install @anthropic-ai/sdk`. |
| `gemini` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` env or `apiKey` in config | Uses Google's OpenAI-compatible endpoint. Free tier available via Google AI Studio. |
| `ollama` | `baseUrl` (default: `http://localhost:11434`) | Local, no cost. |
| `claude-code` | `baseUrl` = path to `claude` binary (default: `"claude"`) | Planner-only. Spawns the Claude Code CLI subprocess; the subprocess can explore the codebase with `Read`/`Grep`/`Glob` before answering. Requires Claude Code installed and authenticated. |

#### OpenAI-compatible endpoints

The `openai` provider accepts a `baseUrl` field, enabling any compatible API:

    {
      "provider": "openai",
      "model": "llama-3.3-70b-versatile",
      "baseUrl": "https://api.groq.com/openai/v1",
      "apiKey": "..."
    }

| Service | `baseUrl` | Free tier |
|---|---|---|
| Groq | `https://api.groq.com/openai/v1` | Yes (rate limited) |
| OpenRouter | `https://openrouter.ai/api/v1` | Some free models |
| Gemini (compat) | `https://generativelanguage.googleapis.com/v1beta/openai/` | Yes |

### Reranker

Controls how search results are re-scored when `--rerank` is passed to `query`, `context`, or `ask`, and always in `chat`.

| Provider | Config | Notes |
|---|---|---|
| `xenova` | _(no extra config)_ | Local cross-encoder (`ms-marco-MiniLM-L-6-v2`). Best quality. Requires `npm install @xenova/transformers`. |
| `ollama` | `model` (optional), `baseUrl` (optional) | Uses the configured LLM to rank results in a single call. No extra dependency. |
| `none` | - | Skip reranking entirely. |

    {
      "reranker": { "provider": "ollama" }
    }

When `provider` is `ollama` and `model` is omitted, the model from `summarization` is used.
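The reranker model selection described above could look roughly like this. It is a minimal sketch under stated assumptions: the type names and `resolveRerankerModel` are hypothetical and not part of vemora's API, and treating a missing `reranker` block the same as `"none"` is a guess.

```typescript
type RerankerProvider = "xenova" | "ollama" | "none";

// Hypothetical shapes mirroring the `reranker` and `summarization` config blocks.
interface RerankerConfig {
  provider: RerankerProvider;
  model?: string;
  baseUrl?: string;
}

interface SummarizationConfig {
  provider: string;
  model: string;
  baseUrl?: string;
}

// Returns the model name used for reranking, or null when reranking is skipped.
function resolveRerankerModel(
  reranker: RerankerConfig | undefined,
  summarization: SummarizationConfig | undefined,
): string | null {
  // Assumption: no reranker block behaves like provider "none".
  if (!reranker || reranker.provider === "none") return null;
  // The xenova provider uses the local cross-encoder named in the table above.
  if (reranker.provider === "xenova") return "ms-marco-MiniLM-L-6-v2";
  // ollama: an explicit model wins, otherwise fall back to the summarization model.
  return reranker.model ?? summarization?.model ?? null;
}

console.log(
  resolveRerankerModel(
    { provider: "ollama" },
    { provider: "ollama", model: "gemma4:e2b" },
  ),
);
```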
### Recommended configurations

**Maximum quality (cloud)**

    {
      "planner":  { "provider": "anthropic", "model": "claude-opus-4-6" },
      "executor": { "provider": "openai",    "model": "gpt-4o-mini" }
    }

**Claude Code as planner + free executor**

    {
      "planner": {
        "provider": "claude-code",
        "model": "claude-sonnet-4-6",
        "allowedTools": ["Read","Grep","Glob"],
        "maxBudgetUsd": 0.50
      },
      "executor": {
        "provider": "ollama",
        "model": "qwen2.5-coder:14b",
        "baseUrl": "http://localhost:11434"
      }
    }

**Pro planner + free executor**

    {
      "planner":  { "provider": "anthropic", "model": "claude-opus-4-6" },
      "executor": { "provider": "gemini",    "model": "gemini-2.0-flash", "apiKey": "..." }
    }

**Fully local (no API keys)**

    {
      "embedding":     { "provider": "ollama", "model": "nomic-embed-text", "dimensions": 768 },
      "summarization": { "provider": "ollama", "model": "gemma4:e2b" },
      "executor":      { "provider": "ollama", "model": "qwen2.5-coder:7b" },
      "reranker":      { "provider": "ollama" },
      "display":       { "format": "terse" }
    }

Other local executor models that work well: `qwen2.5-coder:14b`, `llama3.2`, `mistral`.

## What goes in git

    ✓  .vemora/config.json
    ✓  .vemora/metadata.json
    ✓  .vemora/index/files.json
    ✓  .vemora/index/chunks.json
    ✓  .vemora/index/symbols.json
    ✓  .vemora/index/deps.json
    ✓  .vemora/index/callgraph.json
    ✓  .vemora/summaries/file-summaries.json
    ✓  .vemora/summaries/project-summary.json
    ✓  .vemora/knowledge/entries.json        ← shared knowledge store
    ✗  .vemora-cache/                        ← local embedding vectors (gitignored)

## Incremental indexing

Chunk IDs are derived from `sha256(filePath + content)`. If a function's code doesn't change, its chunk ID is stable across branches, so embeddings are reused without any API call.
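A content-addressed chunk ID of this kind can be sketched with Node's built-in `crypto` module. The `Chunk` type, the `chunkId` and `chunksNeedingEmbedding` helpers, and the in-memory `Map` cache are hypothetical; the hex encoding and plain string concatenation are also assumptions, since the exact encoding vemora uses is not stated here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical chunk shape for illustration only.
interface Chunk {
  filePath: string;
  content: string;
}

// sha256 over the file path concatenated with the chunk content.
// Hex output is an assumption; the real ID format may differ.
function chunkId(filePath: string, content: string): string {
  return createHash("sha256").update(filePath + content).digest("hex");
}

// Keep only the chunks whose ID has no cached embedding yet.
// `cache` stands in for whatever lives under .vemora-cache/.
function chunksNeedingEmbedding(
  chunks: Chunk[],
  cache: Map<string, number[]>,
): Chunk[] {
  return chunks.filter((c) => !cache.has(chunkId(c.filePath, c.content)));
}

const unchanged: Chunk = { filePath: "src/a.ts", content: "export const a = 1;" };
const edited: Chunk = { filePath: "src/b.ts", content: "export const b = 3;" };

// Simulate a cache built on another branch where only src/a.ts was identical.
const cache = new Map<string, number[]>([
  [chunkId(unchanged.filePath, unchanged.content), [0.1, 0.2, 0.3]],
]);

const todo = chunksNeedingEmbedding([unchanged, edited], cache);
console.log(todo.map((c) => c.filePath));
```

Because the ID depends only on path and content, switching branches or re-running the indexer does not invalidate cached vectors for untouched code; only chunks whose text (or path) changed are sent to the embedding provider.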
## Tech stack

- **TypeScript + Node.js** (CommonJS, ES2022 target)
- **commander**: CLI framework
- **fast-glob**: repository scanning
- **tree-sitter** _(optional)_: AST-based symbol extraction for TS/JS
- **openai** SDK _(optional)_: embedding generation, OpenAI and Gemini LLM provider; `npm install openai`
- **@anthropic-ai/sdk** _(optional)_: Anthropic/Claude LLM provider; `npm install @anthropic-ai/sdk`
- **@modelcontextprotocol/sdk** _(optional)_: MCP server transport for `vemora mcp`; `npm install @modelcontextprotocol/sdk`
- **@xenova/transformers** _(optional)_: local cross-encoder model for `--rerank` with `reranker.provider = "xenova"`; `npm install @xenova/transformers`. Not needed if using `reranker.provider = "ollama"` or `"none"`.
- **hnsw**: HNSW index for sub-millisecond vector search
- **chokidar**: file watching for `--watch` mode
- **chalk + ora**: terminal output

</div></article></div>
</body> </html>