Alex8791-cyber/cognithor

GitHub: Alex8791-cyber/cognithor

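A local-first autonomous agent operating system that supports 16 LLM backends, 17 communication channels, and a 5-tier cognitive memory, with full knowledge management and automation capabilities.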

Stars: 147 | Forks: 21

Cognithor · Agent OS

A local-first, autonomous agent operating system for AI experimentation and personal automation.

Cognition + Thor — Intelligence with Power

16 LLM Providers · 17 Channels · 5-Tier Memory · Knowledge Vault · Security · Apache 2.0

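## Why Cognithor?

Most AI assistants send your data to the cloud. Cognithor runs entirely on your machine: with Ollama or LM Studio, no API keys are required. Cloud providers are optional, not mandatory.

It replaces a patchwork of separate tools with one integrated system: 17 channels, 48 MCP tools, 5-tier memory, a knowledge vault, voice, browser automation, and more, all wired together from the start.

9,596 tests at 89% coverage back its reliability. See [Status & Maturity](#status--maturity) for what that does and does not guarantee.

## Status & Maturity

**Cognithor is Beta / experimental software.** It is under rapid, active development.

| Area | Status |
|--------|--------|
| **Core agent loop (PGE)** | Stable: well tested and functional |
| **Memory system** | Stable: the 5-tier architecture runs reliably |
| **CLI channel** | Stable: primary development interface |
| **Web UI / Control Center** | Beta: usable, but rough edges are possible |
| **Messaging channels** (Telegram, Discord, etc.) | Beta: basic flows work, edge cases may fail |
| **Voice mode / TTS** | Alpha: experimental, hardware dependent |
| **Browser automation** | Alpha: requires Playwright setup |
| **Deployment (Docker, bare-metal)** | Beta: tested on a limited set of configurations only |
| **Enterprise features** (GDPR, A2A, governance) | Alpha: implemented but not compliance-audited |

**What the test suite covers:** unit tests, integration tests, and simulated end-to-end tests for all modules. The 9,596 tests verify code correctness in a controlled environment.

**What the test suite does not cover:** real deployment scenarios, network edge cases, long-running stability, multi-user load, hardware-specific voice/GPU issues, or actual LLM response quality.

**Notes for users:**

- This project is developed by a single developer with AI assistance. Code is reviewed by a human, but the development pace is fast.
- Breaking changes may occur between minor versions. If stability matters to you, pin your version.
- The default language for system prompts, error messages, and UI strings is **German**. See [Language & Internationalization](#language--internationalization).
- For production use, thorough testing in your specific environment is strongly recommended.
- Bug reports and contributions are welcome. See [Issues](https://github.com/Alex8791-cyber/cognithor/issues).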
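The advice to pin your version can be checked at startup. The following is a minimal sketch, not part of Cognithor: the helper names `same_minor` and `check_pin` are hypothetical, and the pinned value `0.27.1` is only an assumption standing in for whichever release you validated. It relies solely on the standard-library `importlib.metadata` module and the `cognithor` package name used by `pip install cognithor`.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the release you tested against; replace with your own pin.
PINNED = "0.27.1"


def same_minor(installed: str, pinned: str) -> bool:
    """Return True when both versions share the same major.minor prefix."""
    return installed.split(".")[:2] == pinned.split(".")[:2]


def check_pin(package: str = "cognithor", pinned: str = PINNED) -> str:
    """Compare the installed package version with the pinned one."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if installed == pinned:
        return f"{package}=={installed} matches the pin"
    if same_minor(installed, pinned):
        return f"{package}=={installed}: patch-level drift from {pinned}"
    # Minor versions may contain breaking changes, so flag this loudly.
    return f"{package}=={installed}: minor/major drift from {pinned}, re-test"


if __name__ == "__main__":
    print(check_pin())
```

Running such a check before launching the agent makes an accidental upgrade across a minor version visible, which is the case where breaking changes are most likely.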

Cognithor Demo

## Table of Contents

- [Why Cognithor?](#why-cognithor)
- [Status & Maturity](#status--maturity)
- [What's New](#whats-new)
- [Highlights](#highlights)
- [Architecture](#architecture)
- [LLM Providers](#llm-providers)
- [Channels](#channels)
- [Quick Start](#quick-start) (under 5 minutes)
- [Configuration](#configuration)
- [Security](#security)
- [MCP Tools](#mcp-tools)
- [Tests](#tests)
- [Code Quality](#code-quality)
- [Project Structure](#project-structure)
- [Deployment](#deployment)
- [Language & Internationalization](#language--internationalization)
- [Roadmap](#roadmap)
- [Recording a Demo](#recording-a-demo)
- [License](#license)

## What's New

### v0.27.1: Community Skill Marketplace & Autonomy Hardening

A complete community skill marketplace with a chain of trust (publisher verification, 5-check validation, ToolEnforcer runtime sandbox), plus 13 autonomy fixes in the PGE loop.

- **Community skill marketplace**: install, search, rate, and report community skills from a GitHub-hosted registry, with publisher verification and trust levels
- **ToolEnforcer**: runtime tool allowlist for community skills; a skill can only call the tools it declares
- **5-check validation pipeline**: syntax, injection scan, tool allowlist, security audit, SHA-256 hash verification
- **Autonomy hardening**: multi-step plans no longer exit early, smart failure thresholds, re-planning retries with fallback, fixed presearch skip pattern
- **subprocess distinction**: `subprocess.run()`/`check_output()` are now allowed; `Popen`/`call` remain blocked
- **Config tuning**: larger context budgets (`memory_top_k: 8`, `max_context_chars: 8000`, `response_token_budget: 4000`)
- **Thread-safe community cache**: `asyncio.Lock` on all community module caches, aiohttp falls back to urllib

**Earlier versions**

- **v0.27.0**: Full audit and installer overhaul: 80 audit items, XSS fixes, CORS hardening, rate limiting, automatic Python/Ollama installation
- **v0.26.7**: Wiring: DAG-based parallel executor, http_request tool with SSRF protection, sub-agent depth guard, live config reload
- **v0.26.6**: Chat & Voice: integrated chat page, voice mode with wake word, Piper TTS, 15 agent infrastructure subsystems
- **v0.26.5**: Human Feel: personality engine, sentiment detection, user preferences, status callbacks, friendly error messages
- **v0.26.0–v0.26.4**: Security hardening, Docker production setup, LM Studio backend, scaling, coverage & skills

## Highlights

- **16 LLM providers**: Ollama (local), LM Studio (local), OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, Mistral, Together AI, OpenRouter, xAI (Grok), Cerebras, GitHub Models, AWS Bedrock, Hugging Face, Moonshot/Kimi
- **17 communication channels**: CLI, Web UI, REST API, Telegram, Discord, Slack, WhatsApp, Signal, iMessage, Microsoft Teams, Matrix, Google Chat, Mattermost, Feishu/Lark, IRC, Twitch, Voice (STT/TTS)
- **5-tier cognitive memory**: core identity, episodic logs, semantic knowledge graph, procedural skills, working memory
- **3-channel hybrid search**: BM25 full-text + vector embeddings + knowledge graph traversal, combined by score fusion
- **PGE architecture**: Planner (LLM) -> Gatekeeper (deterministic policy engine) -> Executor (sandboxed)
- **Security**: 4-level sandbox, SHA-256 audit chain, EU AI Act compliance module, credential vault, red-teaming, runtime token encryption (Fernet AES-256), TLS support, file-size limits (not independently audited; see [Status & Maturity](#status--maturity))
- **Knowledge Vault**: Obsidian-compatible Markdown vault with YAML frontmatter, tags, `[[backlinks]]`, and full-text search
- **Document analysis**: LLM-based structured analysis of PDF/DOCX/HTML (summary, risks, action items, decisions)
- **Model Context Protocol (MCP)**: 48 tools across 10 modules (filesystem, shell, memory, web, browser, media, vault, synthesis, code, skills)
- **Distributed locking**: Redis-backed locks (with a file-based fallback) for multi-instance deployments
- **Durable message queue**: SQLite-backed persistent queue with priorities, DLQ, and automatic retries
- **Prometheus metrics**: /metrics endpoint plus a Grafana dashboard for production observability
- **Skill marketplace**: SQLite-persisted skill marketplace with ratings, search, and a REST API
- **Community skill marketplace**: GitHub-hosted registry with publisher verification (4 trust levels), 5-check validation pipeline, ToolEnforcer runtime sandbox, async install/search/report
- **Telegram webhook**: polling + webhook modes, latency under 100 ms
- **Automatic dependency loading**: detects and installs missing optional packages at startup
- **Agent-to-Agent protocol (A2A)**: Linux Foundation RC v1.0, for inter-agent communication
- **Integrated chat**: full chat page inside the Control Center with WebSocket streaming, tool indicators, canvas panel, approval banners, and voice mode
- **React Control Center**: full web dashboard (React 19 + Vite 7) with integrated backend launcher, live config editing, agent management, prompt editing, cron jobs, MCP servers, and A2A settings
- **Human Feel**: personality engine (warmth, humor, greetings), sentiment detection (frustrated/urgent/confused/positive), user preference learning, real-time status callbacks, user-friendly German error messages
- **Auto-detected channels**: channels activate automatically when their tokens are present in `.env`; no manual config flags needed
- **Knowledge synthesis**: meta-analysis across Memory + Vault + Web with LLM fusion: `knowledge_synthesize` (full synthesis with confidence ratings), `knowledge_contradictions` (fact checking), `knowledge_timeline` (causal chains), `knowledge_gaps` (completeness score + research suggestions)
- **Adaptive context pipeline**: enriches context automatically before every Planner call: BM25 memory search + Vault full-text search + recent episodes, injected into WorkingMemory in <50 ms
- **Security hardening**: runtime token encryption (Fernet AES-256) on all channels, TLS support for webhook servers, file-size limits on all upload/processing paths, persistent session mapping in SQLite
- **One-click start**: double-click `start_cognithor.bat` -> browser opens -> click **Power On** -> done
- **Enhanced web research**: 4-provider search fallback (SearXNG -> Brave -> Google CSE -> DuckDuckGo), Jina AI Reader for JS-heavy sites, domain filtering, source cross-checking
- **Procedural learning**: the Reflector automatically synthesizes reusable skills from successful sessions
- **DAG workflow engine**: directed acyclic graph execution with parallel branches, conditional edges, cycle detection, and automatic retries. Now wired into the Executor for parallel tool execution
- **Distributed workers**: capability-based job routing, health monitoring, failover, dead-letter queue
- **Multi-agent collaboration**: debate, voting, and pipeline modes for agent teams
- **Tool sandbox hardening**: per-tool resource limits, network guards, escape detection (8 attack categories)
- **GDPR compliance toolkit**: data processing log (Art. 30), retention enforcement, right to erasure (Art. 17), audit export
- **Agent benchmark suite**: 14 standardized tasks, composite scoring, cross-version regression detection
- **Deterministic replay**: record and replay agent executions, with what-if analysis and diffing
- **Agent SDK**: decorator-based agent registration (`@agent`, `@tool`, `@hook`), project scaffolding
- **Plugin remote registry**: remote manifests with SHA-256 checksums, dependency resolution, install/update/rollback
- **uv installer support**: auto-detects uv for 10x faster installs, with a transparent fallback to pip
- **9,596 tests** · **89% coverage** · **0 lint errors**

## Architecture

```
┌───────────────────────────────────────────────────────────────────┐
│  Control Center UI (React 19 + Vite 7)                            │
│  Config · Agents · Chat · Voice · Prompts · Cron · MCP · A2A      │
├───────────────────────────────────────────────────────────────────┤
│  Prometheus /metrics · Grafana Dashboard                          │
├───────────────────────────────────────────────────────────────────┤
│  REST API (FastAPI, 20+ endpoints, port 8741)                     │
├───────────────────────────────────────────────────────────────────┤
│  Channels (17)                                                    │
│  CLI · Web · Telegram (poll+webhook) · Discord · Slack            │
│  WhatsApp · Signal · iMessage · Teams · Matrix · Voice · ...      │
├───────────────────────────────────────────────────────────────────┤
│  Gateway Layer                                                    │
│  Session · Agent Loop · Distributed Lock · Status Callbacks       │
│  Personality · Sentiment · User Preferences                       │
├───────────────────────────────────────────────────────────────────┤
│  Durable Message Queue (SQLite, priorities, DLQ)                  │
├───────────────────────────────────────────────────────────────────┤
│  Context Pipeline (Memory · Vault · Episodes)                     │
├─────────────┬──────────────┬──────────────────────────────────────┤
│   Planner   │  Gatekeeper  │               Executor               │
│    (LLM)    │   (Policy)   │              (Sandbox)               │
├─────────────┴──────────────┴──────────────────────────────────────┤
│  DAG Workflow Engine · Workflow Adapter · Benchmark Suite         │
├───────────────────────────────────────────────────────────────────┤
│  MCP Tool Layer (48 tools)                                        │
│  Filesystem · Shell · Memory · Web · Browser · Media · Vault      │
│  Synthesis · Skills Marketplace · Remote Registry                 │
├───────────────────────────────────────────────────────────────────┤
│  Multi-LLM Backend Layer (16)                                     │
│  Ollama · OpenAI · Anthropic · Gemini · Groq · DeepSeek           │
│  Mistral · Together · OpenRouter · xAI · Cerebras · ...           │
├───────────────────────────────────────────────────────────────────┤
│  5-Tier Cognitive Memory                                          │
│  Core · Episodic · Semantic · Procedural · Working                │
├───────────────────────────────────────────────────────────────────┤
│  Infrastructure: Redis/File Distributed Lock                      │
│  SQLite Durable Queue · Prometheus Telemetry                      │
│  Worker Pool · GDPR Compliance · Deterministic Replay             │
└───────────────────────────────────────────────────────────────────┘
```

### The PGE Trinity (Planner -> Gatekeeper -> Executor)

Every user request passes through three stages:

1. **Planner**: LLM-based understanding and planning. It analyzes the request, searches memory for relevant context, and creates a structured action plan with tool calls. It supports re-planning on failure.
2. **Gatekeeper**: a deterministic policy engine. It validates every planned tool call against security rules (risk levels GREEN/YELLOW/ORANGE/RED, sandbox policy, parameter validation). No LLM, no hallucinations, no exceptions.
3. **Executor**: runs approved actions through DAG-based parallel scheduling (independent actions run concurrently in waves). Shell commands run isolated (Process -> Namespace -> Container), and file access is restricted to allowed paths.

### 5-Tier Cognitive Memory

| Tier | Name | Persistence | Purpose |
|------|------|------------|---------|
| 1 | **Core** | `CORE.md` | Identity, rules, personality |
| 2 | **Episodic** | Daily log files | What happened today/yesterday |
| 3 | **Semantic** | Knowledge graph + SQLite | Customers, products, facts, relations |
| 4 | **Procedural** | Markdown + frontmatter | Learned skills and workflows |
| 5 | **Working** | RAM (volatile) | Active session context |

Memory search uses a 3-channel hybrid approach: **BM25** (FTS5 full-text search, optimized for German compound words) + **vector search** (Ollama embeddings, cosine similarity) + **graph traversal** (entity relations). Score fusion has configurable weights and recency decay.

### Knowledge Vault

In addition to the 5-tier memory, Cognithor includes an **Obsidian-compatible Knowledge Vault** (`~/.jarvis/vault/`) for persistent, human-readable notes:

- **Folder structure**: `recherchen/`, `meetings/`, `wissen/`, `projekte/`, `daily/`
- **Obsidian format**: YAML frontmatter (title, tags, sources, date), `[[backlinks]]`
- **6 tools**: `vault_save`, `vault_search`, `vault_list`, `vault_read`, `vault_update`, `vault_link`
- Open the vault folder directly in [Obsidian](https://obsidian.md) for graph visualization

### Reflection & Procedural Learning

After a session completes, the Reflector evaluates the result, extracts facts for semantic memory, and identifies repeatable patterns as procedure candidates. Learned procedures are suggested automatically for similar future requests.

## LLM Providers

Cognithor auto-detects your backend from the API key. Set one key and the models are configured automatically:

| Provider | Backend Type | Config Key | Models (Planner / Executor) |
|----------|-------------|------------|----------------------------|
| **Ollama** (local) | `ollama` | *(none needed)* | qwen3:32b / qwen3:8b |
| **LM Studio** (local) | `lmstudio` | *(none needed)* | *(the model you have loaded)* |
| **OpenAI** | `openai` | `openai_api_key` | gpt-5.2 / gpt-5-mini |
| **Anthropic** | `anthropic` | `anthropic_api_key` | claude-opus-4-6 / claude-haiku-4-5 |
| **Google Gemini** | `gemini` | `gemini_api_key` | gemini-2.5-pro / gemini-2.5-flash |
| **Groq** | `groq` | `groq_api_key` | llama-4-maverick / llama-3.1-8b-instant |
| **DeepSeek** | `deepseek` | `deepseek_api_key` | deepseek-chat (V3.2) |
| **Mistral** | `mistral` | `mistral_api_key` | mistral-large-latest / mistral-small-latest |
| **Together AI** | `together` | `together_api_key` | Llama-4-Maverick / Llama-4-Scout |
| **OpenRouter** | `openrouter` | `openrouter_api_key` | claude-opus-4.6 / gemini-2.5-flash |
| **xAI (Grok)** | `xai` | `xai_api_key` | grok-4-1-fast-reasoning / grok-4-1-fast |
| **Cerebras** | `cerebras` | `cerebras_api_key` | gpt-oss-120b / llama3.1-8b |
| **GitHub Models** | `github` | `github_api_key` | gpt-4.1 / gpt-4.1-mini |
| **AWS Bedrock** | `bedrock` | `bedrock_api_key` | claude-opus-4-6 / claude-haiku-4-5 |
| **Hugging Face** | `huggingface` | `huggingface_api_key` | Llama-3.3-70B / Llama-3.1-8B |
| **Moonshot/Kimi** | `moonshot` | `moonshot_api_key` | kimi-k2.5 / kimi-k2-turbo |

```
# ~/.cognithor/config.yaml: set just one key, everything else is configured automatically
gemini_api_key: "AIza..."
# That's it. Backend, models, and operation mode are all auto-detected.

# Or use LM Studio (local, no API key needed):
llm_backend_type: "lmstudio"
# lmstudio_base_url: "http://localhost:1234/v1"  # default
```

## Channels

| Channel | Protocol | Features |
|---------|----------|----------|
| **CLI** | Terminal REPL | Rich formatting, streaming, `/commands`, status feedback |
| **Web UI** | WebSocket | Real-time streaming, voice recording, file upload, dark theme, status events |
| **REST API** | FastAPI + SSE | Programmatic access, server-sent events |
| **Telegram** | Bot API (poll + webhook) | Text, voice messages, photos, documents, webhook mode (<100 ms), typing indicator |
| **Discord** | Gateway + REST | Embeds, reactions, thread support, typing indicator |
| **Slack** | Socket Mode | Block Kit, interactive buttons, thread support |
| **WhatsApp** | Meta Cloud API | Text, media, location, contacts |
| **Signal** | signal-cli bridge | Encrypted messages, attachments |
| **iMessage** | PyObjC (macOS) | Native macOS integration |
| **Microsoft Teams** | Bot Framework v4 | Adaptive Cards, approvals |
| **Matrix** | matrix-nio | Federated, encrypted rooms |
| **Google Chat** | Chat API | Workspace integration |
| **Mattermost** | REST API | Self-hosted team chat |
| **Feishu/Lark** | Bot API | ByteDance enterprise messaging |
| **IRC** | IRC protocol | Classic Internet Relay Chat |
| **Twitch** | TwitchIO | Live-stream chat integration |
| **Voice** | Whisper + Piper + ElevenLabs | STT, TTS, wake word, conversation mode, Piper TTS (Thorsten Emotional) |

## Demo

```
python demo.py          # Full experience (~3 minutes)
python demo.py --fast   # Speed run (~15 seconds)
```

## Quick Start

**Time: under 5 minutes from clone to a running agent.**

### Prerequisites

- Python >= 3.12
- **An LLM backend** (one of the following):
  - [Ollama](https://ollama.ai): local, free, GDPR-compliant (recommended)
  - [LM Studio](https://lmstudio.ai): local, OpenAI-compatible API on port 1234
  - Any of the 14 cloud providers listed above (API key only)
- Optional: `playwright` for browser automation, `faster-whisper` for voice

### Step 1: Clone and Install (about 2 minutes)

```
# Option A: Install from PyPI (easiest)
pip install cognithor            # Core features
pip install "cognithor[all]"     # All features (recommended)
pip install "cognithor[full]"    # Everything including voice + PostgreSQL

# Option B: Clone the repository (for development / latest changes)
git clone https://github.com/Alex8791-cyber/cognithor.git
cd cognithor

# Recommended: Interactive installation (venv, Ollama check, systemd, smoke test)
chmod +x install.sh
./install.sh

# Or: Manual installation (no C compiler needed)
pip install -e ".[all,dev]"

# Individual feature groups (install as needed)
pip install -e ".[telegram,voice,web,cron]"
```

The installer offers five modes: `--minimal` (core only), `--full` (all features), `--use-uv` (10x faster installs with [uv](https://docs.astral.sh/uv/)), `--systemd` (plus service installation), and `--uninstall` (removal). Without flags it starts in interactive mode. If `uv` is installed, it is detected automatically and preferred over pip.

### Step 2: Pull the LLMs (about 2 minutes)

```
ollama pull qwen3:32b          # Planner (20 GB VRAM)
ollama pull qwen3:8b           # Executor (6 GB VRAM)
ollama pull nomic-embed-text   # Embeddings (300 MB VRAM)

# Optional:
ollama pull qwen3-coder:32b    # Code tasks (20 GB VRAM)
```

No GPU? Use a smaller model (`qwen3:8b` for both roles) or a cloud provider; you only need to set one API key.

### Step 3: Start (about 10 seconds)

**Option A: One-click start.** Includes the pre-built Web UI; Node.js is not required.

```
Double-click start_cognithor.bat -> Browser opens -> Click "Power On" -> Done.
```

**Option B: CLI**

```
cognithor                               # Interactive CLI
python -m jarvis                        # Same thing (always works, no PATH needed)
python -m jarvis --lite                 # Lite mode: qwen3:8b only (6 GB VRAM)
python -m jarvis --no-cli               # Headless mode (API only)
JARVIS_HOME=~/my-cognithor cognithor    # Custom home directory
```

**Option C: Control Center UI (development)**

```
cd ui
npm install
npm run dev   # -> http://localhost:5173
```

Click **Power On** to start the backend directly from the UI. The Vite dev server spawns and manages the Python backend process on port 8741 automatically, including orphan detection, clean shutdown, and process lifecycle management.

The **chat page** opens as the default start page: start talking to Jarvis right away, or activate **voice mode** for hands-free conversation.

All configuration (agents, prompts, cron jobs, MCP servers, A2A settings) can be edited and saved through the dashboard. Changes are persisted to YAML files under `~/.jarvis/`.

### Channel Auto-Detection

Channels start automatically when their tokens are found in `~/.jarvis/.env`:

```
# ~/.jarvis/.env: just add your tokens, channels activate automatically
JARVIS_TELEGRAM_TOKEN=your-bot-token
JARVIS_TELEGRAM_ALLOWED_USERS=123456789
JARVIS_DISCORD_TOKEN=your-discord-token
JARVIS_SLACK_TOKEN=xoxb-your-slack-token
```

There is no need to set `telegram_enabled: true` in the config; the presence of the token is enough.

### Directory Structure (created automatically on first start)

```
~/.cognithor/
├── config.yaml          # User configuration
├── CORE.md              # Identity and rules
├── memory/
│   ├── episodes/        # Daily log files
│   ├── knowledge/       # Knowledge graph files
│   ├── procedures/      # Learned skills
│   └── sessions/        # Session snapshots
├── vault/               # Knowledge Vault (Obsidian-compatible)
│   ├── recherchen/      # Web research results
│   ├── meetings/        # Meeting notes
│   ├── wissen/          # Knowledge articles
│   ├── projekte/        # Project notes
│   ├── daily/           # Daily notes
│   └── _index.json      # Quick lookup index
├── index/
│   └── cognithor.db     # SQLite index (FTS5 + vectors + entities)
├── mcp/
│   └── config.yaml      # MCP server configuration
├── queue/
│   └── messages.db      # Durable message queue (SQLite)
└── logs/
    └── cognithor.log    # Structured logs (JSON)
```

## Configuration

Cognithor is configured through `~/.cognithor/config.yaml`. Every value can be overridden by an environment variable with the `JARVIS_` prefix (legacy) or the `COGNITHOR_` prefix.

```
# Example: ~/.cognithor/config.yaml
owner_name: "Alex"
language: "de"  # "de" (German, default) or "en" (English)

# LLM backend: set one key, the backend is auto-detected
# openai_api_key: "sk-..."
# anthropic_api_key: "sk-ant-..."
# gemini_api_key: "AIza..."
# groq_api_key: "gsk_..."
# xai_api_key: "xai-..."
# Or: llm_backend_type: "lmstudio"  # local, no key needed

ollama:
  base_url: "http://localhost:11434"
  timeout_seconds: 120

web:
  # Search providers (all optional, fallback chain: SearXNG -> Brave -> Google CSE -> DDG)
  # searxng_url: "http://localhost:8888"
  # brave_api_key: "BSA..."
  # google_cse_api_key: "AIza..."
  # google_cse_cx: "a1b2c3..."
  # jina_api_key: ""        # Optional, free tier works without key
  # domain_blocklist: []    # Blocked domains
  # domain_allowlist: []    # If set, ONLY these domains allowed

vault:
  enabled: true
  path: "~/.jarvis/vault"
  # auto_save_research: false  # Auto-save web research results

channels:
  cli_enabled: true
  # Channels auto-detect from tokens in ~/.jarvis/.env
  # Set to false only to explicitly disable a channel:
  # telegram_enabled: false

security:
  allowed_paths:
    - "~/.cognithor"
    - "~/Documents"

# Personality
personality:
  warmth: 0.7                 # 0.0 = matter-of-fact, 1.0 = very warm
  humor: 0.3                  # 0.0 = no humor, 1.0 = lots of humor
  greeting_enabled: true      # Time-of-day greetings
  follow_up_questions: true   # Offer follow-up questions
  success_celebration: true   # Celebrate successes

# Scaling
distributed_lock:
  backend: "file"  # "redis" or "file"
  # redis_url: "redis://localhost:6379/0"

message_queue:
  enabled: true
  # max_retries: 3
  # dlq_enabled: true

telemetry:
  prometheus_enabled: true
  # metrics_port: 9090
```

## Security

Cognithor implements several layers of security (not independently audited):

| Feature | Description |
|---------|-------------|
| **Gatekeeper** | Deterministic policy engine (no LLM). 4 risk levels: GREEN (auto) -> YELLOW (notify) -> ORANGE (approve) -> RED (block) |
| **Sandbox** | 4 isolation levels: Process -> Linux Namespaces (nsjail) -> Docker -> Windows Job Objects |
| **Audit trail** | Append-only JSONL with a SHA-256 chain. Tamper-evident. Credentials are masked before logging. |
| **Credential vault** | Fernet-encrypted, per-agent secret storage |
| **Runtime token encryption** | All channel tokens (Telegram, Discord, Slack, ...) are encrypted in memory with an ephemeral Fernet key (AES-256). They are never held in RAM as plaintext |
| **TLS support** | Optional SSL/TLS for all webhook servers. A minimum of TLS 1.2 is enforced. A warning is logged when a non-localhost server runs without TLS |
| **File-size limits** | Upload/processing limits on all paths: 50 MB documents, 100 MB audio, 1 MB code execution, 50 MB WebUI uploads |
| **Session persistence** | Channel-to-session mappings are stored in SQLite (WAL mode). They survive restarts, so Telegram/Discord sessions are not lost |
| **Input sanitization** | Injection protection for shell commands and file paths |
| **Sub-agent depth guard** | Configurable `max_sub_agent_depth` (default 3) prevents infinite handle_message() recursion |
| **SSRF protection** | Private-IP blocking for the http_request and web_fetch tools |
| **EU AI Act** | Compliance module, impact assessments, transparency reports |
| **Red-teaming** | Automated offensive security testing (1,425 LOC) |

## MCP Tools

| Tool Server | Tools | Description |
|-------------|-------|-------------|
| **Filesystem** | read, write, edit, list, delete | Path-sandboxed file operations |
| **Shell** | exec_command | Sandboxed command execution with timeout |
| **Memory** | search, save, get_entity, add_entity, ... | 10 memory tools across all 5 tiers |
| **Web** | web_search, web_fetch, search_and_read, web_news_search, http_request | 4-provider search (SearXNG -> Brave -> Google CSE -> DDG), Jina Reader fallback, domain filtering, cross-checking, full HTTP method support (POST/PUT/PATCH/DELETE) |
| **Browser** | navigate, screenshot, click, fill_form, execute_js, get_page_content | Playwright-based browser automation |
| **Media** | transcribe_audio, analyze_image, extract_text, analyze_document, convert_audio, resize_image, tts, document_export | Multimodal pipeline + LLM-driven document analysis (all local) |
| **Vault** | vault_save, vault_search, vault_list, vault_read, vault_update, vault_link | Obsidian-compatible knowledge vault with frontmatter, tags, backlinks |
| **Synthesis** | knowledge_synthesize, knowledge_contradictions, knowledge_timeline, knowledge_gaps | Meta-analysis across Memory + Vault + Web with LLM fusion, confidence scoring, fact checking |

## Tests

```
# All tests
make test

# With coverage report
make test-cov

# Specific areas
python -m pytest tests/test_core/ -v
python -m pytest tests/test_memory/ -v
python -m pytest tests/test_channels/ -v
```

Current status: **9,596 tests** · **100% pass rate** · **89% coverage** · **~109,000 LOC source** · **~92,000 LOC tests**

| Area | Tests | Description |
|------|-------|-------------|
| Core | 1,893 | Planner, Gatekeeper, Executor, Config, Models, Reflector, Distributed Lock, Model Router, DAG Engine, Delegation, Collaboration, Agent SDK, Workers, Personality, Sentiment |
| Integration | 1,314 | End-to-end tests, phase wiring, entry points, A2A protocol |
| Channels | 1,360 | CLI, Telegram (incl. Webhook), Discord, Slack, WhatsApp, API, WebUI, Voice, iMessage, Signal, Teams |
| MCP | 825 | Client, filesystem, shell, memory server, web, media, synthesis, vault, browser, bridge, resources |
| Memory | 658 | All 5 tiers, indexer, hybrid search, chunker, watcher, token estimation, integrity, cleanup |
| Skills | 534 | Skill registry, generator, marketplace, persistence, API, CLI tools, scaffolding, linter, BaseSkill, remote registry |
| Security | 469 | Audit, credentials, token store, TLS, policies, sandbox, sanitizer, agent vault, resource limits, GDPR |
| Gateway | 252 | Session management, agent loop, context pipeline, phase init, approval flow |
| A2A | 158 | Agent-to-Agent protocol, client, HTTP handlers, streaming |
| Telemetry | 175 | Cost tracking, metrics, tracing, Prometheus export, instrumentation, recorder, replay |
| Other | 247 | HITL, governance, learning, proactive, config manager |
| Tools | 103 | Refactoring agent, code analyzer, skill CLI dev tools |
| Utils | 126 | Logging, helpers, error messages, installer |
| Benchmark | 48 | Agent benchmark suite, scoring, regression detection |
| Cron | 63 | Engine, job store, scheduling |
| UI API | 55 | Control Center endpoints |

## Code Quality

```
make lint        # Ruff linting (0 errors)
make format      # Ruff formatting
make typecheck   # MyPy strict type checking
make check       # All combined (lint + typecheck + tests)
make smoke       # Installation validation (26 checks)
make health      # Runtime check (Ollama, disk, memory, audit)
```

## Project Structure

```
cognithor/
├── src/jarvis/  # Python backend
│   ├── config.py  # Configuration system (YAML + env vars)
│   ├── config_manager.py  # Runtime config management (read/update/save)
│   ├── models.py  # Pydantic data models (58+ classes)
│   ├── core/
│   │   ├── planner.py  # LLM planner with re-planning
│   │   ├── gatekeeper.py  # Deterministic policy engine (no LLM)
│   │   ├── executor.py  # DAG-based parallel tool executor with audit trail
│   │   ├── model_router.py  # Model selection by task type
│   │   ├── llm_backend.py  # Multi-provider LLM abstraction (16 backends)
│   │   ├── orchestrator.py  # High-level agent orchestration
│   │   ├── reflector.py  # Reflection, fact extraction, skill synthesis
│   │   ├── distributed_lock.py  # Redis/file-based distributed locking
│   │   ├── dag_engine.py  # DAG Workflow Engine (parallel execution)
│   │   ├── execution_graph.py  # Execution Graph UI (Mermaid export)
│   │   ├── delegation.py  # Agent Delegation Engine (typed contracts)
│   │   ├── collaboration.py  # Multi-Agent Collaboration (debate/voting/pipeline)
│   │   ├── agent_sdk.py  # Agent SDK (decorators, registry, scaffolding)
│   │   ├── worker.py  # Distributed Worker Runtime (job routing, failover)
│   │   ├── personality.py  # Personality Engine (warmth, humor, greetings)
│   │   ├── sentiment.py  # Keyword/regex sentiment detection (German)
│   │   └── user_preferences.py  # SQLite user preference store (auto-learn)
│   ├── memory/
│   │   ├── manager.py  # Central memory API (all 5 tiers)
│   │   ├── core_memory.py  # Tier 1: CORE.md management
│   │   ├── episodic.py  # Tier 2: Daily logs (Markdown)
│   │   ├── semantic.py  # Tier 3: Knowledge graph (entities + relations)
│   │   ├── procedural.py  # Tier 4: Skills (YAML frontmatter + Markdown)
│   │   ├── working.py  # Tier 5: Session context (RAM)
│   │   ├── indexer.py  # SQLite index (FTS5 + entities + vectors)
│   │   ├── search.py  # 3-channel hybrid search (BM25 + vector + graph)
│   │   ├── embeddings.py  # Embedding client with LRU cache
│   │   ├── chunker.py  # Markdown-aware sliding window chunker
│   │   └── watcher.py  # Auto-reindexing (watchdog/polling)
│   ├── mcp/
│   │   ├── client.py  # Multi-server MCP client (stdio + builtin)
│   │   ├── server.py  # Jarvis as MCP server
│   │   ├── filesystem.py  # File tools (path sandbox)
│   │   ├── shell.py  # Shell execution (timeout, sandbox)
│   │   ├── memory_server.py  # Memory as 10 MCP tools
│   │   ├── web.py  # Enhanced web search (4 providers), URL fetch (Jina fallback), http_request
│   │   ├── vault.py  # Knowledge Vault (Obsidian-compatible, 6 tools)
│   │   ├── synthesis.py  # Knowledge Synthesis (4 tools: synthesize, contradictions, timeline, gaps)
│   │   ├── browser.py  # Browser automation (Playwright, 6 tools)
│   │   └── media.py  # Media pipeline (STT, TTS, image, PDF, document analysis, 8 tools)
│   ├── gateway/
│   │   ├── gateway.py  # Agent loop, session management, subsystem init
│   │   └── message_queue.py  # Durable SQLite-backed message queue (priorities, DLQ)
│   ├── channels/  # 17 communication channels + Control Center API
│   │   ├── base.py  # Abstract channel interface
│   │   ├── config_routes.py  # REST API for Control Center (20+ endpoints)
│   │   ├── cli.py, api.py  # Core channels
│   │   ├── telegram.py  # Telegram (polling + webhook mode)
│   │   ├── discord.py  # Discord
│   │   ├── whatsapp.py, signal.py  # Encrypted messaging
│   │   ├── voice.py  # Voice I/O (STT/TTS)
│   │   └── ...  # Teams, Matrix, IRC, Twitch, Mattermost, etc.
│   ├── security/
│   │   ├── audit.py  # Append-only audit trail (SHA-256 chain)
│   │   ├── credentials.py  # Credential store (Fernet encrypted)
│   │   ├── token_store.py  # Runtime token encryption (ephemeral Fernet) + TLS helper
│   │   ├── sandbox.py  # Multi-level sandbox (L0-L2)
│   │   ├── policies.py  # Security policies (path, command, network)
│   │   ├── policy_store.py  # Versioned policy store (simulation, rollback)
│   │   ├── resource_limits.py  # Tool sandbox hardening (per-tool profiles, escape detection)
│   │   ├── gdpr.py  # GDPR Compliance Toolkit (Art. 15-17, 30)
│   │   └── sanitizer.py  # Input sanitization (injection protection)
│   ├── cron/  # Cron engine with APScheduler
│   ├── a2a/  # Agent-to-Agent protocol (Linux Foundation RC v1.0)
│   ├── skills/  # Skill registry, generator, marketplace (SQLite persistence)
│   ├── graph/  # Knowledge graph engine
│   ├── telemetry/  # Cost tracking, metrics, tracing, Prometheus export
│   │   ├── recorder.py  # Execution recorder (13 event types, JSONL export)
│   │   └── replay.py  # Deterministic replay engine (what-if analysis)
│   ├── benchmark/  # Agent Benchmark Suite
│   │   └── suite.py  # 14 tasks, scoring, runner, reports, regression detection
│   └── utils/
│       ├── logging.py  # Structured logging (structlog + Rich)
│       ├── installer.py  # uv/pip detection and command abstraction
│       └── error_messages.py  # User-friendly German error templates
├── ui/  # Control Center (React 19 + Vite 7)
│   ├── vite.config.js  # Dev server with backend launcher plugin
│   ├── package.json  # Dependencies (react, vite)
│   ├── index.html  # Entry point
│   └── src/
│       ├── CognithorControlCenter.jsx  # Main dashboard (1,700 LOC)
│       ├── pages/
│       │   └── ChatPage.jsx  # Integrated chat page (default start)
│       ├── components/chat/
│       │   ├── MessageList.jsx  # Message display with Markdown
│       │   ├── ChatInput.jsx  # Rich input bar
│       │   ├── ChatCanvas.jsx  # Canvas side panel
│       │   ├── ToolIndicator.jsx  # Tool execution indicators
│       │   ├── ApprovalBanner.jsx  # Inline approval/deny banner
│       │   └── VoiceIndicator.jsx  # Voice mode visual feedback
│       ├── hooks/
│       │   ├── useJarvisChat.js  # WebSocket chat hook
│       │   └── useVoiceMode.js  # Voice mode hook (wake word, STT, TTS)
│       ├── App.jsx  # App shell
│       └── main.jsx  # React entry
├── tests/  # 9,596 tests, ~92,000 LOC
│   ├── test_core/  # Planner, Gatekeeper, Executor, Distributed Lock
│   ├── test_memory/  # All 5 memory tiers, hybrid search
│   ├── test_mcp/  # MCP tools and client
│   ├── test_channels/  # All channel implementations (incl. Webhook)
│   ├── test_security/  # Audit, sandbox, policies
│   ├── test_integration/  # End-to-end tests
│   ├── test_skills/  # Skills, marketplace, persistence
│   ├── test_telemetry/  # Metrics, Prometheus export
│   ├── test_config_manager.py  # Config manager + API routes
│   └── test_ui_api_integration.py  # 55 Control Center API integration tests
├── skills/  # Built-in skill definitions
├── scripts/  # Backup, deployment, utilities
├── deploy/  # Docker, systemd, nginx, Caddy, bare-metal installer
├── apps/  # PWA app (legacy)
├── start_cognithor.bat  # One-click launcher (Windows)
├── config.yaml.example  # Example configuration
├── pyproject.toml  # Python project metadata
├── Makefile  # Build, test, lint commands
├── Dockerfile  # Container image
├── docker-compose.yml  # Development compose
├── docker-compose.prod.yml  # Production compose (5 services + profiles)
└── install.sh  # Interactive installer
```

## Deployment

### One-Click Start

Double-click `start_cognithor.bat` -> browser opens -> click **Power On** -> done.

### Docker (Development)

```
docker compose up -d                   # Core backend
docker compose --profile webui up -d   # + Web UI
```

### Docker (Production)

```
cp .env.example .env   # Edit: set JARVIS_API_TOKEN, etc.
docker compose -f docker-compose.prod.yml up -d

# With optional services
docker compose -f docker-compose.prod.yml --profile postgres up -d     # + PostgreSQL
docker compose -f docker-compose.prod.yml --profile nginx up -d        # + Nginx TLS
docker compose -f docker-compose.prod.yml --profile monitoring up -d   # + Prometheus + Grafana
```

Services: Jarvis (headless) + WebUI + Ollama + optional PostgreSQL (pgvector) + optional Nginx reverse proxy + optional Prometheus/Grafana monitoring. GPU support is available through nvidia-container-toolkit (uncomment it in the compose file).

### Bare-Metal Server

```
sudo bash deploy/install-server.sh --domain jarvis.example.com --email admin@example.com

# Or with a self-signed cert:
sudo bash deploy/install-server.sh --domain test.local --self-signed
```

Installs to `/opt/cognithor/`, with data in `/var/lib/cognithor/` and the systemd services `cognithor` + `cognithor-webui`.

### Systemd (User Level)

```
./install.sh --systemd
systemctl --user enable --now cognithor
journalctl --user -u cognithor -f   # Logs
```

### Health Checks

```
curl http://localhost:8741/api/v1/health   # Control Center API
curl http://localhost:8080/api/v1/health   # WebUI (standalone)
curl http://localhost:9090/metrics         # Prometheus metrics
```

### Backup

```
./scripts/backup.sh                    # Create backup
./scripts/backup.sh --list             # List backups
./scripts/backup.sh --restore latest   # Restore
```

See [`deploy/README.md`](deploy/README.md) for the full deployment documentation (Docker profiles, TLS, Nginx/Caddy, bare-metal installation, monitoring, troubleshooting).

## Language & Internationalization

Cognithor was originally developed in German. The following areas contain German strings:

| Area | Language | Notes |
|------|----------|-------|
| **README & docs** | English | Fully translated |
| **Code & comments** | Mixed (EN/DE) | Variable names are English, some comments are German |
| **System prompts** (Planner) | German | By default the LLM is instructed in German |
| **Error messages** | German | User-facing error templates live in `utils/error_messages.py` |
| **Vault folders** | German | `recherchen/`, `meetings/`, `wissen/`, `projekte/`, `daily/` |
| **Personality / greetings** | German | "Guten Morgen!", "Guten Abend!", etc. |
| **Gatekeeper reasons** | English | Policy decisions are in English |
| **Log messages** | English | structlog keys and messages |

**Customizing the language:** override the system prompt in `~/.jarvis/CORE.md` or modify `core/planner.py:SYSTEM_PROMPT`. Full i18n support is planned but not yet implemented.

**Contributing translations:** if you would like to help translate Cognithor into other languages, please open an issue or a PR. The main areas that need translation are the system prompts, the error messages, and the Vault folder names.

## Roadmap

| Phase | Description | Status |
|-------|-------------|--------|
| **Phase 1** | Foundation (PGE Trinity, MCP, CLI) | Done |
| **Phase 2** | Memory (5 tiers, hybrid search, MCP tools) | Done |
| **Phase 3** | Reflection & procedural learning | Done |
| **Phase 4** | Channels, cron, web tools, model router | Done |
| **Phase 5** | Multi-agent & security hardening | Done |
| **Phase 6** | Web UI & voice | Done |
| **Phase 7** | Control Center UI, API integration, channel auto-detection | Done |
| **Phase 8** | UI integrated into the repo, backend launcher, orphan management | Done |
| **Phase 9** | Security hardening: token encryption, TLS, file-size limits, session persistence | Done |
| **Phase 10** | Server deployment: Docker production, bare-metal installer, Nginx/Caddy, health endpoints | Done |
| **Phase 11** | Scaling: distributed lock, durable message queue, Prometheus metrics, Telegram webhook, skill marketplace persistence, automatic dependency loading | Done |
| **Deployment** | Installer, systemd, Docker, backup, smoke tests, one-click launcher | Done |
| **Phase 12** | Human Feel: personality engine, sentiment detection, user preferences, status callbacks, friendly error messages | Done |
| **Phase 13** | Voice & chat integration: integrated chat page, voice conversation mode, Piper TTS (Thorsten Emotional), natural-language responses | Done |
| **Phase 14** | Agent infrastructure: DAG workflows, execution graph, delegation, policy-as-code, knowledge graph, memory consolidation | Done |
| **Phase 15** | Multi-agent & SDK: collaboration (debate/voting/pipeline), Agent SDK, plugin remote registry | Done |
| **Phase 16** | Security & ops: tool sandbox hardening, distributed workers, deterministic replay, benchmarks, uv installer, GDPR toolkit | Done |

### What's Next

- **Phase 17**: Mobile: native Android/iOS apps via Capacitor, push notifications, offline mode with a local LLM
- **Phase 18**: Horizontal scaling: multi-node Gateway with Redis Streams, automatic sharding of the memory tiers
- **Phase 19**: Advanced governance: federated policy management, cross-organization compliance

## Recording a Demo

To create a terminal recording for your README or docs:

```
# Install asciinema
pip install asciinema

# Record a session
asciinema rec demo.cast

# Convert to GIF (requires agg)
# https://github.com/asciinema/agg
agg demo.cast demo.gif
```

Alternatively, use [terminalizer](https://github.com/faressoft/terminalizer) for customizable terminal GIFs, or [VHS](https://github.com/charmbracelet/vhs) for scripted recordings.

**Metrics:** ~109,000 LOC source · ~92,000 LOC tests · 9,596 tests · 89% coverage · 0 lint errors · **Status: Beta**

## Contributors

| Contributor | Role | Focus |
|-------------|------|-------|
| [@Alex8791-cyber](https://github.com/Alex8791-cyber) | Creator & maintainer | Architecture, core development |
| [@TomiWebPro](https://github.com/TomiWebPro) | Core contributor & QA lead | Ubuntu deployment & real-world testing |

### Special Thanks

[@TomiWebPro](https://github.com/TomiWebPro): the first community QA partner and the reason Cognithor's Ubuntu deployment actually works. His meticulous testing on a real Ubuntu system uncovered 9 critical installation bugs, all of which are now fixed and fully covered by tests.

## License

Apache 2.0. See [LICENSE](LICENSE).

Copyright 2026 Alexander Soellner
