openclaw/Peekaboo

GitHub: openclaw/Peekaboo

Peekaboo is a macOS CLI and MCP server that lets AI agents capture screenshots, analyze interfaces, and automate GUI actions.

Stars: 4643 | Forks: 349

# Peekaboo 🫣 - Mac automation that sees the screen and does the clicks

![Peekaboo Banner](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/efa8abf76b172708.png)

[![npm package](https://img.shields.io/badge/npm_package-3.3.0-brightgreen?logo=npm&logoColor=white&style=flat-square)](https://www.npmjs.com/package/@steipete/peekaboo)
[![License: MIT](https://img.shields.io/badge/License-MIT-ffd60a?style=flat-square)](https://opensource.org/licenses/MIT)
[![macOS 15.0+ (Sequoia)](https://img.shields.io/badge/macOS-15.0%2B_(Sequoia)-0078d7?logo=apple&logoColor=white&style=flat-square)](https://www.apple.com/macos/)
[![Swift 6.2](https://img.shields.io/badge/Swift-6.2-F05138?logo=swift&logoColor=white&style=flat-square)](https://swift.org/)
[![node >=22](https://img.shields.io/badge/node-%3E%3D22.0.0-2ea44f?logo=node.js&logoColor=white&style=flat-square)](https://nodejs.org/)
[![Download macOS](https://img.shields.io/badge/Download-macOS-000000?logo=apple&logoColor=white&style=flat-square)](https://github.com/steipete/peekaboo/releases/latest)
[![Homebrew](https://img.shields.io/badge/Homebrew-steipete%2Ftap-b28f62?logo=homebrew&logoColor=white&style=flat-square)](https://github.com/steipete/homebrew-tap)
[![Ask DeepWiki](https://img.shields.io/badge/Ask-DeepWiki-0088cc?style=flat-square)](https://deepwiki.com/steipete/peekaboo)

Peekaboo brings high-fidelity screen capture, AI analysis, and complete GUI automation to macOS. Version 3 adds native agent flows and multi-screen automation across both the CLI and the MCP server.

## What you get

- Pixel-accurate captures (windows, screens, menu bar) with optional Retina 2x scaling.
- Natural-language agent that chains Peekaboo tools (see, click, type, scroll, hotkey, menu, window, app, dock, space).
- Action-first UI automation for routine clicks and scrolls, with background process-targeted input by default when the target is known.
- Direct accessibility tools for settable values and named actions (`set-value`, `perform-action`).
- Menu and menu bar discovery with structured JSON output; no clicking required.
- Multi-provider AI through Tachikoma, including hosted, local, and OpenAI/Anthropic-compatible providers.
- MCP server for Codex, Claude Code, and Cursor plus a native CLI; both expose the same tools.
- Configurable, testable workflows with reproducible sessions and strict typing.
- Requires the macOS Screen Recording and Accessibility permissions (see [docs/permissions.md](docs/permissions.md)).

## Install

- macOS app + CLI (Homebrew): `brew install steipete/tap/peekaboo`
- MCP server (Node 22+, no global install): `npx -y @steipete/peekaboo`

## Quick start

```
# Capture the full screen at Retina scale and save it to the Desktop
peekaboo image --mode screen --retina --path ~/Desktop/screen.png

# Click a button by label (captures, resolves, and clicks in one go)
SNAPSHOT="$(peekaboo see --app Safari --json | jq -r '.data.snapshot_id')"
peekaboo click --on "Reload this page" --snapshot "$SNAPSHOT"

# Directly set a text field value when the accessibility value is settable
peekaboo set-value --on T1 --value "hello" --snapshot "$SNAPSHOT"

# Invoke a named accessibility action on an element
peekaboo perform-action --on B1 --action AXPress --snapshot "$SNAPSHOT"

# Run a natural-language automation
peekaboo agent "Open Notes and create a TODO list with three items"

# Run as an MCP server (Codex, Claude Code, Cursor)
npx -y @steipete/peekaboo

# Minimal MCP client config snippet:
# {
#   "mcpServers": {
#     "peekaboo": {
#       "command": "npx",
#       "args": ["-y", "@steipete/peekaboo"],
#       "env": {
#         "PEEKABOO_AI_PROVIDERS": "openai/gpt-5.5,anthropic/claude-opus-4-7"
#       }
#     }
#   }
# }
```

## Shell completions

Peekaboo can generate shell-native completions directly from the same Commander metadata that powers the CLI help and docs:

```
# Current shell (recommended)
eval "$(peekaboo completions $SHELL)"

# Explicit shells
eval "$(peekaboo completions zsh)"
eval "$(peekaboo completions bash)"
peekaboo completions fish | source
```

For persistent setup and troubleshooting, see [docs/commands/completions.md](docs/commands/completions.md).

## Background vs foreground input

When Peekaboo can resolve the target process through `--app`, `--pid`, `--window-id`, or snapshot metadata, `click`, `type`, `press`, `hotkey`, and `paste` default to **background** delivery. Background delivery posts process-targeted input without bringing the target app to the front, so scripts can interact with Safari, Notes, Terminal, and other apps without stealing focus.

Use `--foreground` when an app accepts input only in its focused key window, when you need a real foreground mouse event, or when you intentionally want to drive the current focus. Focus flags such as `--space-switch` and `--bring-to-current-space` also imply foreground delivery. Background input requires the process that sends the events to hold the event-synthesizing permission; if `permissions status` reports it as missing, run `peekaboo permissions request-event-synthesizing`.

```
# Background: target Safari without activating it
peekaboo click "Address and search bar" --app Safari
peekaboo type "github.com/openclaw/Peekaboo" --app Safari --return

# Foreground: focus Safari first for apps/fields that reject background input
peekaboo click "Address and search bar" --app Safari --foreground
peekaboo type "github.com/openclaw/Peekaboo" --app Safari --return --foreground
```

| Command | Key flags / subcommands | What it does |
| --- | --- | --- |
| [see](docs/commands/see.md) | `--app`, `--mode screen/window`, `--retina`, `--json` | Capture and annotate the UI; returns a snapshot plus element IDs |
| [click](docs/commands/click.md) | `--on`, `--snapshot`, `---for`, `--coords`, `--foreground` | Click by element ID, label, or coordinates |
| [type](docs/commands/type.md) | `--text`, `--clear`, `--profile`, `--delay`, `--foreground` | Text entry with pacing options |
| [set-value](docs/commands/set-value.md) | `--on`, `--value`, `--snapshot` | Directly set a settable accessibility value |
| [perform-action](docs/commands/perform-action.md) | `--on`, `--action`, `--snapshot` | Invoke a named accessibility action |
| [press](docs/commands/press.md) | key names, `--count`, `--delay`, `--hold`, `--foreground` | Special keys and key sequences |
| [hotkey](docs/commands/hotkey.md) | combos such as `cmd,shift,t`, `--foreground` | Modifier combos (cmd/ctrl/alt/shift) |
| [paste](docs/commands/paste.md) | text/file/image payload, `--restore-delay-ms`, `--foreground` | Paste with clipboard restore |
| [scroll](docs/commands/scroll.md) | `--on`, `--direction up/down`, `--amount` | Scroll a view or element |
| [swipe](docs/commands/swipe.md) | `--from/--to`, `--duration`, `--steps` | Smooth gesture-style drags |
| [drag](docs/commands/drag.md) | `--from/--to`, modifiers, Dock/Trash targets | Drag and drop between elements or coordinates |
| [move](docs/commands/move.md) | `--to`, `--screen-index` | Move the cursor without clicking |
| [window](docs/commands/window.md) | `list`, `move`, `resize`, `focus`, `set-bounds` | Move, resize, and focus windows and Spaces |
| [app](docs/commands/app.md) | `launch`, `quit`, `relaunch`, `switch`, `list` | Launch, quit, relaunch, and switch apps |
| [space](docs/commands/space.md) | `list`, `switch`, `move-window` | List or switch macOS Spaces |
| [menu](docs/commands/menu.md) | `list`, `list-all`, `click`, `click-extra` | List or click app menus and menu extras |
| [menubar](docs/commands/menubar.md) | `list`, `click` | Target status bar items by name or index |
| [dock](docs/commands/dock.md) | `launch`, `right-click`, `hide`, `show`, `list` | Interact with Dock items |
| [dialog](docs/commands/dialog.md) | `list`, `click`, `input`, `file`, `dismiss` | Drive system dialogs (open/save, etc.) |
| [image](docs/commands/image.md) | `--mode screen/window/menu`, `--retina`, `--analyze` | Capture screens, windows, or the menu bar (plus analysis) |
| [list](docs/commands/list.md) | `apps`, `windows`, `screens`, `menubar`, `permissions` | Enumerate apps, windows, screens, and permissions |
| [tools](docs/commands/tools.md) | `--verbose`, `--json`, `--no-sort` | Inspect the native Peekaboo tools |
| [completions](docs/commands/completions.md) | `[shell]` | Generate zsh/bash/fish completion scripts from Commander metadata |
| [config](docs/commands/config.md) | `init`, `show`, `add`, `login`, `models` | Manage credentials, providers, and settings |
| [permissions](docs/commands/permissions.md) | `status`, `grant`, `request-event-synthesizing` | Check or grant the required macOS permissions |
| [run](docs/commands/run.md) | `.peekaboo.json`, `--output`, `--no-fail-fast` | Execute `.peekaboo.json` automation scripts |
| [sleep](docs/commands/sleep.md) | `--duration` (ms) | Millisecond delays between steps |
| [clean](docs/commands/clean.md) | `--all-snapshots`, `--older-than`, `--snapshot` | Prune snapshots and caches |
| [agent](docs/commands/agent.md) | `--model`, `--dry-run`, `--resume`, `--max-steps`, audio | Natural-language multi-step automation |
| [mcp](docs/commands/mcp.md) | `serve` (default) | Run Peekaboo as an MCP server |

## Models and providers

Peekaboo's provider list changes with Tachikoma and the tested model catalog. See [docs/providers.md](docs/providers.md) for the current provider reference, including OpenAI, Anthropic, xAI/Grok, Google Gemini, MiniMax, Ollama, LM Studio, and compatible custom endpoints.

Set providers via `PEEKABOO_AI_PROVIDERS` or `peekaboo config add`.

## Learn more

- Command reference: [docs/commands/](docs/commands/)
- Platform support: [docs/platform-support.md](docs/platform-support.md)
- Architecture: [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)
- Building from source: [docs/building.md](docs/building.md)
- Testing guide: [docs/testing/tools.md](docs/testing/tools.md)
- MCP setup: [docs/commands/mcp.md](docs/commands/mcp.md)
- Permissions: [docs/permissions.md](docs/permissions.md)
- Ollama/local models: [docs/ollama.md](docs/ollama.md)
- Agent chat loop: [docs/agent-chat.md](docs/agent-chat.md)
- Service API reference: [docs/service-api-reference.md](docs/service-api-reference.md)

## Development basics

- Requirements: see [docs/platform-support.md](docs/platform-support.md). Node 22+ is needed only for the npm MCP wrapper and the pnpm helper scripts.
- Install dependencies: `pnpm install`, then `pnpm run build:cli` or `pnpm run test:safe`.
- Lint/format: `pnpm run lint && pnpm run format`.

## License

MIT
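## Example: choosing the delivery mode in a script

The background and foreground delivery modes described earlier can be selected per call from a small wrapper. The following is a minimal sketch, not part of Peekaboo: the function names `run` and `open_url` and the `PEEKABOO_DRY_RUN` variable are hypothetical helpers introduced for illustration. The only Peekaboo flags it uses are ones that appear in this README (`--app`, `--return`, `--foreground`).

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Peekaboo).

# run CMD...: execute the command, or only print it when PEEKABOO_DRY_RUN=1.
run() {
  if [ "${PEEKABOO_DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# open_url APP URL [background|foreground]
# Clicks the address bar of APP and types URL followed by Return.
# Background delivery is the default; pass "foreground" for apps or
# fields that reject background input.
# The body runs in a subshell so its variables do not leak to the caller.
open_url() (
  app="$1"
  url="$2"
  mode="${3:-background}"
  flag=""
  if [ "$mode" = "foreground" ]; then
    flag="--foreground"
  fi
  # $flag is intentionally unquoted so an empty value adds no argument.
  run peekaboo click "Address and search bar" --app "$app" $flag
  run peekaboo type "$url" --app "$app" --return $flag
)

# Usage (assumes peekaboo is installed and permissions are granted):
#   open_url Safari "github.com/openclaw/Peekaboo"
#   open_url Safari "github.com/openclaw/Peekaboo" foreground
```

Setting `PEEKABOO_DRY_RUN=1` before calling `open_url` prints the two commands instead of sending input, which is a convenient way to review a script before it touches the UI.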
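Tags: AI analysis, Dock operations, GNU General Public License, GUI automation, Homebrew, MCP server, MITM proxy, Node.js, Retina scaling, Space management, Swift, artificial intelligence, pixel-level capture, multi-screen automation, screenshots, app control, accessibility tools, scroll automation, hotkey simulation, user-mode hook bypass, window management, auto-click, auto-type, natural-language agent, menu discovery, visual question answering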