GitHubSecurityLab/seclab-taskflow-agent


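A multi-agent workflow orchestration framework built on the MCP protocol and the OpenAI Agents SDK, which automates collaborative security research tasks through YAML configuration.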


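# GitHub Security Lab Taskflow Agent

The Security Lab Taskflow Agent is an MCP-enabled multi-agent framework.

The Taskflow Agent is built on top of the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/).

## Core Concepts

The Taskflow Agent uses a YAML-based grammar, similar in style to GitHub Workflows, to perform a series of tasks with a set of Agents. Its primary value is as a CLI tool that lets users quickly define and script agentic workflows without writing any code.

Agents are defined through [personalities](examples/personalities/) that receive a [task](examples/taskflows/) to complete, given a set of [tools](src/seclab_taskflow_agent/toolboxes/).

Agents can cooperate to complete sequences of tasks through so-called [taskflows](doc/GRAMMAR.md).

You can find a detailed overview of the taskflow grammar [here](doc/GRAMMAR.md) and example taskflows [here](examples/taskflows/).

## Use Cases and Examples

The Seclab Taskflow Agent framework was primarily designed to fit the iterative, feedback-loop-driven work involved in agentic security research workflows and vulnerability triage tasks.

Its core design idea is the belief that, as frontier model capabilities evolve, focusing on capturing vulnerability patterns at the prompt level will greatly improve and scale security research results.

At GitHub Security Lab we primarily use this framework as a code auditing tool, but it also serves as a more general "Swiss army knife" for exploring agentic workflows. For example, we also use it for automated code scanning alert triage.

The framework includes a [CodeQL](https://codeql.github.com/) MCP server that can be used for agentic code review. For an example of how to have an Agent review C code using a CodeQL database, see the [CVE-2023-2283](examples/taskflows/CVE-2023-2283.yaml) taskflow ([demo video](https://www.youtube.com/watch?v=eRSPSVW8RMo)).

The CodeQL MCP server does not generate CodeQL queries directly. Instead, it provides MCP tools based on CodeQL queries that allow an Agent to navigate and explore code. It uses templated CodeQL queries to provide targeted context for model-driven code analysis.

## Requirements

Python >= 3.10 or Docker

## Configuration

Provide a GitHub token for an account that is entitled to use [GitHub Models](https://models.github.ai) via the `AI_API_TOKEN` environment variable. Further configuration depends on the use case, i.e. on which MCP servers you want to use in your taskflows.

In a terminal, you can add `AI_API_TOKEN` to your environment like this, placing your token after the `=`:

```
export AI_API_TOKEN=
```

Alternatively, if you are using GitHub Codespaces, you can [add a Codespace secret](https://github.com/settings/codespaces/secrets/new) so that `AI_API_TOKEN` is automatically available when working in a Codespace.

Many of the MCP servers in the [seclab-taskflows](https://github.com/GitHubSecurityLab/seclab-taskflows) repo also need an environment variable named `GH_TOKEN` to access the GitHub API. You can use two separate PATs if you prefer, or use one PAT for both purposes like this:

```
export GH_TOKEN=$AI_API_TOKEN
```

We do not recommend storing secrets on disk, but you can persist non-sensitive environment variables by adding a `.env` file to the project root.

Example:

```
# MCP configs
CODEQL_DBS_BASE_PATH="/app/my_data/codeql_databases"
AI_API_ENDPOINT="https://models.github.ai/inference"
```

## Deploying from Source

We use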
[hatch](https://hatch.pypa.io/) to build the project. Download and build like this:

```
git clone https://github.com/GitHubSecurityLab/seclab-taskflow-agent.git
cd seclab-taskflow-agent
python -m venv .venv
source .venv/bin/activate
pip install hatch
hatch build
```

Then run `hatch run main`.

Example: deploying a prompt to an Agent personality:

```
hatch run main -p seclab_taskflow_agent.personalities.assistant 'explain modems to me please'
```

Example: deploying a Taskflow:

```
hatch run main -t examples.taskflows.example
```

Example: deploying a Taskflow with command line global variables:

```
hatch run main -t examples.taskflows.example_globals -g fruit=apples
```

Multiple global variables can be set:

```
hatch run main -t examples.taskflows.example_globals -g fruit=apples -g color=red
```

## Deploying from Docker

You can deploy the Taskflow Agent via its Docker image using `docker/run.sh`.

WARNING: the Agent Docker image is _NOT_ intended as a security boundary. It is strictly a deployment convenience.

The image entrypoint is `__main__.py`, so it operates the same way as invoking the Agent directly from source.

You can find the Docker image for the Seclab Taskflow Agent [here](https://github.com/GitHubSecurityLab/seclab-taskflow-agent/pkgs/container/seclab-taskflow-agent) and see how it is built [here](release_tools/).

Note that this image is based on a public release of the Taskflow Agent, and you must mount any custom taskflows, personalities, or prompts into the image for the Agent to use them.

Optional image mount points for supplying custom data are configured via environment variables:

- Custom data via `MY_DATA`, mounts to `/app/my_data`
- Custom personalities via `MY_PERSONALITIES`, mounts to `/app/personalities/my_personalities`
- Custom taskflows via `MY_TASKFLOWS`, mounts to `/app/taskflows/my_taskflows`
- Custom prompts via `MY_PROMPTS`, mounts to `/app/prompts/my_prompts`
- Custom toolboxes via `MY_TOOLBOXES`, mounts to `/app/toolboxes/my_toolboxes`

See [docker/run.sh](docker/run.sh) for further details.

Example: deploying a Taskflow (example.yaml):

```
docker/run.sh -t example
```

Example: deploying a Taskflow with global variables:

```
docker/run.sh -t example_globals -g fruit=apples
```

Example: deploying a custom taskflow (custom_taskflow.yaml):

```
MY_TASKFLOWS=~/my_taskflows docker/run.sh -t custom_taskflow
```

Example: deploying a custom taskflow (custom_taskflow.yaml) and making local CodeQL databases available to the CodeQL MCP server:

```
MY_TASKFLOWS=~/my_taskflows MY_DATA=~/app/my_data CODEQL_DBS_BASE_PATH=/app/my_data/codeql_databases docker/run.sh -t custom_taskflow
```

For more advanced scenarios, such as making custom MCP server
code available, you can alter the run script to mount your custom code into the image and configure your toolboxes to use that code accordingly.

```
export MY_MCP_SERVERS="$PWD"/mcp_servers
export MY_TOOLBOXES="$PWD"/toolboxes
export MY_PERSONALITIES="$PWD"/personalities
export MY_TASKFLOWS="$PWD"/taskflows
export MY_PROMPTS="$PWD"/prompts
export MY_DATA="$PWD"/data

if [ ! -f ".env" ]; then
    touch ".env"
fi

docker run \
    --mount type=bind,src="$PWD"/.env,dst=/app/.env,ro \
    ${MY_DATA:+--mount type=bind,src=$MY_DATA,dst=/app/my_data} \
    ${MY_MCP_SERVERS:+--mount type=bind,src=$MY_MCP_SERVERS,dst=/app/my_mcp_servers,ro} \
    ${MY_TASKFLOWS:+--mount type=bind,src=$MY_TASKFLOWS,dst=/app/taskflows/my_taskflows,ro} \
    ${MY_TOOLBOXES:+--mount type=bind,src=$MY_TOOLBOXES,dst=/app/toolboxes/my_toolboxes,ro} \
    ${MY_PROMPTS:+--mount type=bind,src=$MY_PROMPTS,dst=/app/prompts/my_prompts,ro} \
    ${MY_PERSONALITIES:+--mount type=bind,src=$MY_PERSONALITIES,dst=/app/personalities/my_personalities,ro} \
    "ghcr.io/githubsecuritylab/seclab-taskflow-agent" "$@"
```

## General YAML File Headers

Every YAML file used by the Seclab Taskflow Agent must include a header like this:

```
seclab-taskflow-agent:
  version: "1.0"
  filetype: taskflow
```

The `version` number in the header is currently 1. This means the file uses version 1 of the seclab-taskflow-agent syntax. If we ever need to make a major change to the syntax, we will update the version number, which should let us make changes without breaking backwards compatibility. The version can be specified as an integer, a float, or a string.

The `filetype` determines whether the file defines a personality, a toolbox, and so on. This means that different types of files can be stored in the same directory. The `filetype` can be one of the following:

- taskflow
- personality
- toolbox
- prompt
- model_config

The following sections explain what each type of file does and which features it offers.

## Personalities

The core characteristics of a single Agent. Configured through YAML files whose `filetype` is `personality`.

These are system-prompt-level instructions.

Example:

```
# personalities define the system level instructions for the agent
seclab-taskflow-agent:
  version: 1
  filetype: personality

personality: |
  You are a simple echo bot. You use echo tools to echo things.

task: |
  Echo user inputs using the echo tools.
# personality toolboxes map to the mcp servers made available to this Agent
toolboxes:
  - seclab_taskflow_agent.toolboxes.echo
```

In the above, the `personality` and `task` fields specify the system prompt to use whenever this `personality` is used. The `toolboxes` are the tools available to this `personality`. The `toolboxes` should be a list of files whose `filetype` is `toolbox`. (See the [Import paths](#import-paths) section for how to reference other files.)

Personalities can be used in two ways. First, a personality can be used standalone with a prompt input from the command line:

```
hatch run main -p examples.personalities.echo 'echo this message'
```

In this case, the `personality` and `task` from [`examples/personalities/echo.yaml`](examples/personalities/echo.yaml) are used as the system prompt, while the user argument `echo this message` is used as the user prompt. In this use case, the only tools this personality has access to are the `toolboxes` specified in the file.

Personalities can also be used in a `taskflow` to perform tasks. This is done by adding the `personality` to the `agents` field in a `taskflow` file:

```
taskflow:
  - task:
      ...
      agents:
        - personalities.assistant
      user_prompt: |
        Fetch all the open pull requests from `github/codeql` github repository.
        You do not need to provide a summary of the results.
      toolboxes:
        - seclab_taskflow_agent.toolboxes.github_official
```

In this case, the `personality` specified in `agents` provides the system prompt, and the user prompt is specified in the task's `user_prompt` field. One important difference here is that the `toolboxes` specified in the `task` override the `toolboxes` that the `personality` has access to. So in the example above, `personalities.assistant` has access to the `seclab_taskflow_agent.toolboxes.github_official` toolbox instead of its own toolboxes. Note that the personality's toolboxes are *overridden*, not extended: whenever a `task` provides a `toolboxes` field, the provided toolboxes are used and the `personality` loses access to its own. For example:

```
taskflow:
  - task:
      ...
      agents:
        - examples.personalities.echo
      user_prompt: |
        echo this
      toolboxes:
        - seclab_taskflow_agent.toolboxes.github_official
```

In the `task` above, `examples.personalities.echo` can only access `seclab_taskflow_agent.toolboxes.github_official` and no longer has access to the `seclab_taskflow_agent.toolboxes.echo` `toolbox` (unless it is also added to the `task` `toolboxes`).

## Toolboxes

MCP servers that provide tools. Configured through YAML files whose `filetype` is `toolbox`. These files provide the type and parameters needed to start an MCP server.

For example, to start a stdio MCP server that is implemented in a Python file:

```
# stdio mcp server configuration
seclab-taskflow-agent:
  version: 1
  filetype: toolbox

server_params:
  kind: stdio
  command: python
  args: ["-m", "seclab_taskflow_agent.mcp_servers.echo.echo"]
  env:
    TEST: value
```

In the above, `command` and `args` are simply the command and arguments that should be run to start the MCP server. Environment variables can be passed using the `env` field.

[Streamable](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http) is also supported by setting `kind` to `streamable`:

```
server_params:
  kind: streamable
  url: https://api.githubcopilot.com/mcp/
  #See https://github.com/github/github-mcp-server/blob/main/docs/remote-server.md
  headers:
    Authorization: "Bearer {{ env('GH_TOKEN') }}"
  optional_headers:
    X-MCP-Toolsets: "{{ env('GITHUB_MCP_TOOLSETS') }}"
    X-MCP-Readonly: "{{ env('GITHUB_MCP_READONLY') }}"
```

You can force certain tools within a `toolbox` to require user confirmation before they run. This is useful when a tool may perform irreversible actions and should ask for user approval before use. To do this, include the name of the tool (function) in the MCP server in the `confirm` section:

```
server_params:
  kind: stdio
  ...
# the list of tools that the framework should confirm with the user before executing
# use this to guard potentially dangerous functions in MCP servers
confirm:
  - memcache_clear_cache
```

## Taskflows

A sequence of interdependent tasks performed by a set of Agents. Configured through YAML files whose `filetype` is `taskflow`.

Taskflows support many features, which are described in detail [here](doc/GRAMMAR.md).

Example:

```
seclab-taskflow-agent:
  version: 1
  filetype: taskflow

taskflow:
  - task:
      # taskflows can optionally choose any of the models supported by your API for a task
      model: gpt-4.1
      # taskflows can optionally limit the max allowed number of Agent task loop
      # iterations to complete a task, this defaults to 50 when not provided
      max_steps: 20
      must_complete: true
      # taskflows can set a primary (first entry) and handoff (additional entries) agent
      agents:
        - seclab_taskflow_agent.personalities.c_auditer
        - examples.personalities.fruit_expert
      user_prompt: |
        Store an example vulnerable C program that uses `strcpy` in the
        `vulnerable_c_example` memory key and explain why `strcpy`
        is insecure in the C programming language. Do this before handing off
        to any other agent.

        Finally, why are apples and oranges healthy to eat?

      # taskflows can set temporary environment variables, these support the general
      # "{{ env('FROM_EXISTING_ENVIRONMENT') }}" pattern we use elsewhere as well
      # these environment variables can then be made available to any stdio mcp server
      # through its respective yaml configuration, see memcache.yaml for an example
      # you can use these to override top-level environment variables on a per-task basis
      env:
        MEMCACHE_STATE_DIR: "example_taskflow/"
        MEMCACHE_BACKEND: "dictionary_file"
      # taskflows can optionally override personality toolboxes, in this example
      # this normally only has the memcache toolbox, but we extend it here with
      # the CodeQL toolbox
      toolboxes:
        - seclab_taskflow_agent.toolboxes.memcache
        - seclab_taskflow_agent.toolboxes.codeql
  - task:
      must_complete: true
      model: gpt-4.1
      agents:
        - seclab_taskflow_agent.personalities.c_auditer
      user_prompt: |
        Retrieve C code for security review from the `vulnerable_c_example`
        memory key and perform a review.
        Clear the memory cache when you're done.
      env:
        MEMCACHE_STATE_DIR: "example_taskflow/"
        MEMCACHE_BACKEND: "dictionary_file"
      toolboxes:
        - seclab_taskflow_agent.toolboxes.memcache
      # headless mode does not prompt for tool call confirms configured for a server
      # note: this will auto-allow, if you want control over potentially dangerous
      # tool calls, then you should NOT run a task in headless mode (default: false)
      headless: true
  - task:
      # tasks can also run shell scripts that return e.g. json output for repeat prompt iterable
      must_complete: true
      run: |
        echo '["apple", "banana", "orange"]'
  - task:
      repeat_prompt: true
      agents:
        - seclab_taskflow_agent.personalities.assistant
      user_prompt: |
        What kind of fruit is {{ RESULT }}?
```

Taskflows support [Agent handoffs](https://openai.github.io/openai-agents-python/handoffs/). Handoffs are useful for implementing triage patterns, where the primary Agent can decide to hand a task off to any subsequent Agent in the `agents` list.

For other useful Taskflow patterns, such as repeatable and asynchronous templated prompts, see the [taskflow examples](examples/taskflows).

You can run a taskflow from the command line like this:

```
hatch run main -t examples.taskflows.CVE-2023-2283
```

## Prompts

Prompts are configured through YAML files whose `filetype` is `prompt`. They define a reusable prompt that can be referenced in `taskflow` files.

They contain only one field, the `prompt` field, which is used to replace any `{{ PROMPTS_ }}` template parameter in a taskflow that names the prompt file. For example, the following `prompt`:

```
seclab-taskflow-agent:
  version: 1
  filetype: prompt

prompt: |
  Tell me more about bananas as well.
```

replaces any `{{ PROMPTS_examples.prompts.example_prompt }}` template parameter found in the `user_prompt` section of a taskflow, so that:

```
  - task:
      agents:
        - examples.personalities.fruit_expert
      user_prompt: |
        Tell me more about apples.

        {{ PROMPTS_examples.prompts.example_prompt }}
```

becomes:

```
  - task:
      agents:
        - examples.personalities.fruit_expert
      user_prompt: |
        Tell me more about apples.

        Tell me more about bananas as well.
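```

## Model configs

Model configs are configured through YAML files whose `filetype` is `model_config`. They provide a way to configure model versions.

```
seclab-taskflow-agent:
  version: 1
  filetype: model_config
models:
  gpt_latest: gpt-5
```

A `model_config` file can be referenced in a `taskflow`, after which the values defined in `models` can be used throughout that taskflow.

```
model_config: examples.model_configs.model_config

taskflow:
  - task:
      model: gpt_latest
```

The model version can then be updated by changing `gpt_latest` in the `model_config` file, and the change applies to every taskflow that uses that config.

In addition, model-specific parameters can be provided via `model_config`. To do so, define a `model_settings` section in the `model_config` file. This section must be a dictionary keyed by model name:

```
model_settings:
  gpt_latest:
    temperature: 1
    reasoning:
      effort: high
```

You do not need to set parameters for every model defined in the `models` section. Models without explicit settings fall back to the default values. However, every entry in this section must belong to one of the models specified in the `models` section, otherwise an error is raised:

```
model_settings:
  new_model:
    ...
```

The above results in an error because `new_model` is not defined in the `models` section. Model parameters can also be set per task, and any settings defined in a task override the settings in the config.

## Passing Environment Variables

Files of type `taskflow` and `toolbox` allow environment variables to be passed using the `env` field:

```
server_params:
  ...
  env:
    CODEQL_DBS_BASE_PATH: "{{ env('CODEQL_DBS_BASE_PATH') }}"
    # prevent git repo operations on gh codeql executions
    GH_NO_UPDATE_NOTIFIER: "disable"
```

For a `toolbox`, `env` can be used inside `server_params`. A template of the form `{{ env('ENV_VARIABLE_NAME') }}` can be used to pass the value of an environment variable from the current process to the MCP server. So in the above, the MCP server runs with `GH_NO_UPDATE_NOTIFIER=disable`, and the template parameter `{{ env('CODEQL_DBS_BASE_PATH') }}` is replaced by the value of the `CODEQL_DBS_BASE_PATH` environment variable in the current process before being passed to the MCP server.

Similarly, environment variables can be passed to a `task` in a `taskflow`:

```
taskflow:
  - task:
      must_complete: true
      agents:
        - seclab_taskflow_agent.personalities.assistant
      user_prompt: |
        Store the json array ["apples", "oranges", "bananas"] in the `fruits` memory key.
      env:
        MEMCACHE_STATE_DIR: "example_taskflow/"
        MEMCACHE_BACKEND: "dictionary_file"
```

This overrides the `MEMCACHE_STATE_DIR` and `MEMCACHE_BACKEND` environment variables for that task only. The `{{ env('ENV_VARIABLE_NAME') }}` template can be used here as well.

Note that when using the `{{ env('ENV_VARIABLE_NAME') }}` template, `ENV_VARIABLE_NAME` must be the name of an environment variable in the current process.

## Import paths

YAML files often need to refer to each other. For example, a taskflow can reference a personality like this:

```
taskflow:
  - task:
      ...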
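      agents:
        - seclab_taskflow_agent.personalities.assistant
```

We use Python's import system, so a name like `seclab_taskflow_agent.personalities.assistant` is resolved to a YAML file using Python's import rules. One benefit of this is that it makes it easy to bundle taskflows as Python packages and share them on PyPI.

It works as follows:

1. A name like `seclab_taskflow_agent.personalities.assistant` is split (at the last `.` character) into a package name (`seclab_taskflow_agent.personalities`) and a file name (`assistant`).
2. Python's [`importlib.resources.files`](https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files) is used to resolve the package name into a directory name.
3. The extension `.yaml` is added to the file name: `assistant.yaml`.
4. The YAML file is loaded from the directory returned by `importlib.resources.files`.

The exact code that implements this can be found in [`available_tools.py`](src/seclab_taskflow_agent/available_tools.py).

## Background

SecLab Taskflow Agent is an experimental agentic framework maintained by [GitHub Security Lab](https://securitylab.github.com/). We are using it to experiment with AI Agents for security purposes, such as auditing code for vulnerabilities or triaging issues.

We would love to hear your feedback. Please [create an issue](https://github.com/GitHubSecurityLab/seclab-taskflow-agent/issues/new/choose) to send us a feature request or bug report. We also welcome pull requests (see our [contribution guidelines](./CONTRIBUTING.md) for more information if you wish to contribute).

## License

This project is licensed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license. See the [LICENSE](./LICENSE) file for the full terms.

## Maintainers

[CODEOWNERS](./CODEOWNERS)

## Support

[SUPPORT](./SUPPORT.md)

## Acknowledgements

Security Lab team members [Man Yue Mo](https://github.com/m-y-mo) and [Peter Stöckli](https://github.com/p-) contributed heavily to the testing and development of this framework. We also thank the rest of the Security Lab team for helpful discussions and feedback.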
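
As a companion to the Import paths section, the four resolution steps listed there can be sketched in Python roughly as follows. This is a minimal illustration under assumptions, not the code from `available_tools.py`: the helper names `split_import_path` and `resolve_yaml_path` are hypothetical, and the sketch stops at computing the file location rather than loading or validating the YAML.

```python
from importlib.resources import files


def split_import_path(name: str) -> tuple[str, str]:
    """Step 1: split a dotted name at the last '.' into (package, file name).

    Hypothetical helper, named here for illustration only.
    """
    package, _, filename = name.rpartition(".")
    if not package or not filename:
        raise ValueError(f"expected a dotted import path, got {name!r}")
    return package, filename


def resolve_yaml_path(name: str):
    """Steps 2-4: locate '<file name>.yaml' inside the named package.

    Hypothetical helper. Returns a Traversable; callers could read it
    with .read_text() if the file exists.
    """
    package, filename = split_import_path(name)
    package_dir = files(package)  # step 2: package name -> directory
    return package_dir.joinpath(filename + ".yaml")  # steps 3 and 4


if __name__ == "__main__":
    # Splitting needs no installed package.
    print(split_import_path("seclab_taskflow_agent.personalities.assistant"))
```

With the `seclab_taskflow_agent` package installed, `resolve_yaml_path("seclab_taskflow_agent.personalities.assistant")` would point at `assistant.yaml` inside the package's `personalities` directory. Because `files` imports the package, an unknown package name raises `ModuleNotFoundError` before any file lookup happens.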
