kariemoorman/ghostbit

GitHub: kariemoorman/ghostbit

A multi-format steganography toolkit built on modern cryptographic standards, supporting audio and image carriers, with LLM integration.


GH0STB1T:
A Multi-format Steganography Toolkit

## Why?

Architecture Modernization

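This implementation is a complete architectural migration from platform-dependent Java and .NET codebases to a unified Python solution, with the following advantages:

- **Platform independence:** Removes platform-specific runtime dependencies, so the toolkit is portable across heterogeneous computing environments (Windows, macOS, and Linux).
- **Memory efficiency:** Eliminates JVM heap overhead and lowers baseline memory consumption, allowing efficient operation on resource-constrained systems while retaining full functionality.
- **Auditability:** Replaces platform-specific cryptographic APIs and closed-source runtime components with open-source Python cryptography, enabling independent security audits and transparent verification of the implementation.
- **Type safety:** Targets Python 3.13+ to take advantage of modern type annotations and static analysis across the codebase.

These improvements reduce deployment complexity and computational overhead, supporting reliable and efficient operation for both human operators and automated LLM-driven workflows in resource-constrained environments. (See also the [EFF Coders' Rights Project Reverse Engineering FAQ](https://www.eff.org/issues/coders/reverse-engineering-faq#faq5).)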

Security Upgrade

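Long-standing steganography tools (e.g., [OpenStego](https://www.openstego.com/), [DeepSound](https://github.com/Jpinsoft/DeepSound), [SilentEye](https://github.com/achorein/silenteye/)) use outdated cryptographic primitives that leave hidden data vulnerable to attack. These tools rely on weak key derivation functions (KDFs), such as hashing the password directly with MD5, SHA-1, or SHA-256, which provides no resistance to brute-force attacks. They also rely on legacy encryption modes such as AES-CBC without authentication. Such modes provide confidentiality only, leaving the payload open to undetected modification, bit-flipping attacks, and padding oracle attacks under common error-handling patterns. This implementation combines existing steganography protocols with modern, audited cryptographic standards to keep hidden information secure:

| Component | Algorithm | Parameters | Security Properties |
|-----------|-----------|------------|---------------------|
| **Key derivation** | Argon2id | 64 MB memory, 3 iterations, parallelism = 4 | Memory-hard function; hybrid design that resists side-channel attacks |
| **Encryption** | AES-256-GCM | 96-bit random nonce, 128-bit authentication tag | Authenticated encryption with associated data (AEAD); confidentiality, integrity, and authenticity in a single operation |
| **Salt** | Random | 128 bits, unique per file | Prevents rainbow-table attacks; keys are unique even when passwords are identical |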
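The role of the per-file salt in the key-derivation row can be illustrated with the standard library alone. The sketch below is not GH0STB1T's code: Python ships no Argon2id, so it substitutes `hashlib.scrypt` (a different memory-hard KDF) as a stand-in, and the `derive_key` helper and its cost parameters are assumptions chosen for illustration.

```python
import hashlib
import secrets

SALT_BYTES = 16  # 128-bit salt, matching the table above
KEY_BYTES = 32   # 256-bit key, the size AES-256 expects


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key with scrypt (stand-in for Argon2id)."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost, illustrative (about 16 MiB of memory)
        r=8,
        p=1,
        dklen=KEY_BYTES,
    )


salt_a = secrets.token_bytes(SALT_BYTES)
salt_b = secrets.token_bytes(SALT_BYTES)

key_a = derive_key("same password", salt_a)
key_b = derive_key("same password", salt_b)

# Same password, different salts: the derived keys differ.
print(key_a != key_b)
# Same password, same salt: derivation is repeatable, which decoding relies on.
print(derive_key("same password", salt_a) == key_a)
```

Because the salt must be available again at decode time, a scheme like this stores it alongside the ciphertext; the salt is not secret, only unique.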

What does this mean?

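**Standards-based cryptography** - Moves to standardized, widely reviewed algorithms: AES-256-GCM is a NIST-approved mode (SP 800-38D), and Argon2id is specified in RFC 9106. Primitives of this kind are used in TLS 1.3, Signal, Bitwarden, and enterprise security systems.

**Brute-force-resistant passwords** - With a legacy bare SHA-256 hash, an attacker can test billions of passwords per second on modern GPUs. Argon2id is memory-hard by design and slows the attacker to roughly thousands of tests per second, making brute-force attacks far more expensive. Even a short (8-character) random password gains years of brute-force resistance at that rate.

**Tamper detection and integrity verification** - Legacy AES-CBC allows undetected tampering, bit-flipping attacks, and payload manipulation. AES-GCM cryptographically authenticates every byte of hidden data. Any modification, even a single bit flip, causes decryption to fail immediately. Altering the data without detection is computationally infeasible.

**No padding oracle vulnerabilities** - Legacy AES-CBC with PKCS#7 padding is vulnerable to adaptive chosen-ciphertext attacks: an attacker can decrypt data without the password by observing error messages. AES-GCM uses authenticated encryption: no padding, no oracle, constant-time failure.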
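The brute-force comparison above is simple arithmetic, sketched below. The guess rates are the round figures quoted in the text ("billions" and "thousands" per second); the password policy (8 characters drawn from a-z and 0-9) and the `seconds_to_exhaust` helper are assumptions for illustration, and real attack rates depend on the attacker's hardware and the KDF parameters.

```python
def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of the given length."""
    return alphabet_size**length / guesses_per_second


SECONDS_PER_YEAR = 365 * 24 * 3600

# Hypothetical policy: 8 random characters from a-z and 0-9 (36 symbols).
ALPHABET, LENGTH = 36, 8

fast_hash = seconds_to_exhaust(ALPHABET, LENGTH, guesses_per_second=1e9)  # bare hash on GPUs
slow_kdf = seconds_to_exhaust(ALPHABET, LENGTH, guesses_per_second=1e3)   # memory-hard KDF

print(f"bare hash:       {fast_hash / 60:.0f} minutes")
print(f"memory-hard KDF: {slow_kdf / SECONDS_PER_YEAR:.0f} years")
```

Under these assumptions the full keyspace falls in about 47 minutes against a bare hash and about 89 years against the slower KDF. The ratio between the two is simply the ratio of the guess rates, so the benefit shrinks for passwords that appear in dictionaries.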

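## Features

- **Multimedia steganography:** multi-format support across audio, image, and video
  - Audio: WAV / MP3 / FLAC / M4A / AIFF
  - Image: BMP / PNG / JPEG / WEBP / TIFF / SVG / GIF
- **Strong encryption:** AES-GCM with Argon2id key derivation for embedded files (see [Security Upgrade](#why))
- **CLI:** easy-to-use command-line interface (see [CLI](#-cli))
- **API:** project integration through a Python API (see [API](#-python-api))
- **Docker:** containerized deployment support (see [Docker](#-docker))
- **LLM integration:** built-in skill system for LLM-driven workflows (see [LLM Integration](#llm-integration))
- **Cross-platform compatibility:** macOS, Linux, Windows

## Installation

Requirements

- Python 3.13+
- FFmpeg (for audio format conversion)

GitHub Release

Download the latest `.whl` file from [Releases](https://github.com/kariemoorman/ghostbit/releases), or install directly from the repository:

```
pip install git+https://github.com/kariemoorman/ghostbit.git@latest
```

Development Build

Install from source for development or to access the latest features:

```
git clone https://github.com/kariemoorman/ghostbit.git
cd ghostbit
pip install -e ".[dev]"
```

## Usage

### CLI

The GH0STB1T CLI supports quick encode/decode/analyze operations directly from the terminal.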

Encode (hide files)

```
# Audio
ghostbit audio encode -i -s -q {low,normal,high} -o . -p

# Image
ghostbit image encode -i -s -p
```

Calculate carrier capacity

```
# Audio
ghostbit audio capacity -i -q {low,normal,high}

# Image
ghostbit image capacity -i
```

Decode (extract files)

```
# Audio
ghostbit audio decode -i -p

# Image
ghostbit image decode -i -p
```

Analyze files

```
# Audio
ghostbit audio analyze -i

# Image
ghostbit image analyze -i
```

Create test files

```
# Create audio files for testing
ghostbit audio test -o test_audio

# Create image files for testing
ghostbit image test -o test_images
```
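When the CLI is driven from a script, it helps to build the argument list explicitly instead of concatenating a shell string. The sketch below assembles a `capacity` command from the flags shown above; the `capacity_command` helper is hypothetical, and it assumes that `-i` takes the carrier path and that `-q` applies to audio carriers only, as the examples suggest.

```python
import shlex
from typing import Optional


def capacity_command(kind: str, carrier: str, quality: Optional[str] = None) -> list[str]:
    """Build the argv list for `ghostbit <kind> capacity`."""
    if kind not in {"audio", "image"}:
        raise ValueError(f"unsupported carrier type: {kind}")
    cmd = ["ghostbit", kind, "capacity", "-i", carrier]
    if quality is not None:
        if kind != "audio":
            raise ValueError("quality modes are shown for audio carriers only")
        if quality not in {"low", "normal", "high"}:
            raise ValueError(f"unknown quality mode: {quality}")
        cmd += ["-q", quality]
    return cmd


cmd = capacity_command("audio", "my song.wav", "normal")
print(shlex.join(cmd))  # shell-quoted form, for logging only
```

The list can then be passed to `subprocess.run(cmd, check=True)` on a machine where `ghostbit` is on the `PATH`; passing a list avoids quoting problems with file names that contain spaces.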

### Python API

GH0STB1T provides a Python API for integration into existing applications and workflows.

Encode (hide files)

```
from ghostbit.audiostego import AudioMultiFormatCoder, EncodeMode

# Initialize the coder
coder = AudioMultiFormatCoder()

# Encode files
coder.encode_files_multi_format(
    carrier_file="music.wav",
    secret_files=["document.pdf", "image.jpg"],
    output_file="output.wav",
    quality_mode=EncodeMode.NORMAL_QUALITY,
    password="optional_password"
)
```

```
# Encode with progress callbacks
from ghostbit.audiostego import AudioMultiFormatCoder

coder = AudioMultiFormatCoder()

# Encoding progress
def on_encode_progress():
    print(".", end="", flush=True)

coder.on_encoded_element = on_encode_progress

coder.encode_files_multi_format(
    carrier_file="carrier.wav",
    secret_files=["secret.pdf"],
    output_file="output.wav"
)
```
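The progress callback in the example above takes no arguments, so any zero-argument callable can be used. A minimal sketch of a reusable counter follows; the `ProgressCounter` class is hypothetical and not part of GH0STB1T, and the loop merely stands in for the coder invoking the callback.

```python
import sys


class ProgressCounter:
    """Zero-argument callable that counts processed elements."""

    def __init__(self, label: str, every: int = 100, stream=sys.stderr) -> None:
        self.label = label
        self.every = every    # report once per `every` calls
        self.stream = stream
        self.count = 0

    def __call__(self) -> None:
        self.count += 1
        if self.count % self.every == 0:
            print(f"{self.label}: {self.count} elements", file=self.stream)


progress = ProgressCounter("encode", every=2)
for _ in range(5):  # stand-in for the coder calling the callback
    progress()
print(progress.count)
```

Assuming the callback contract shown above, it would be attached with `coder.on_encoded_element = ProgressCounter("encode")`. Reporting to `stderr` keeps progress output separate from any data written to `stdout`.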

Calculate carrier capacity

```
from ghostbit.audiostego import AudioMultiFormatCoder, BaseFileInfoItem, EncodeMode

coder = AudioMultiFormatCoder()
wav_file = coder._convert_to_wav("carrier_file.flac")

def get_capacity(wav_file, encode_mode):
    base_file = BaseFileInfoItem(
        full_path=wav_file,
        encode_mode=encode_mode,
        wav_head_length=44,
    )
    return base_file.max_inner_files_size

capacity_bytes = get_capacity(wav_file, EncodeMode.NORMAL_QUALITY)
print(f"Maximum capacity: {capacity_bytes / (1024*1024):.2f} MB")
```

```
from ghostbit.audiostego import AudioMultiFormatCoder, EncodeMode
import os

coder = AudioMultiFormatCoder()

# Check capacity under each quality mode
# (reuses get_capacity() from the previous example)
carrier = "long_audio.wav"
secret_file = "large_video.mp4"
secret_size = os.path.getsize(secret_file) / (1024 * 1024)
print(f"Secret file size: {secret_size:.2f} MB")

for mode in [EncodeMode.LOW_QUALITY, EncodeMode.NORMAL_QUALITY, EncodeMode.HIGH_QUALITY]:
    capacity = get_capacity(carrier, mode) / (1024 * 1024)
    fits = "✅ FITS" if capacity >= secret_size else "❌ TOO LARGE"
    print(f"{mode.name}: {capacity:.2f} MB capacity - {fits}")
```
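The `wav_head_length=44` argument above corresponds to the size of a canonical PCM WAV header. The sketch below checks that figure with the standard `wave` module and computes a rough capacity bound. It does not reproduce GH0STB1T's capacity calculation: the one-bit-per-sample scheme and both helper functions are assumptions used only to show how carrier size bounds payload size.

```python
import os
import tempfile
import wave


def pcm_payload_bytes(path: str) -> int:
    """Size of the raw sample data in a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() * wav.getnchannels() * wav.getsampwidth()


def naive_lsb_capacity(path: str, bits_per_sample: int = 1) -> int:
    """Bytes that fit if `bits_per_sample` low bits of every sample are reused."""
    with wave.open(path, "rb") as wav:
        samples = wav.getnframes() * wav.getnchannels()
    return samples * bits_per_sample // 8


with tempfile.TemporaryDirectory() as tmp:
    # Build a 1-second, 16-bit, mono, 44.1 kHz silent carrier to measure.
    carrier = os.path.join(tmp, "carrier.wav")
    with wave.open(carrier, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(44100)
        wav.writeframes(b"\x00\x00" * 44100)

    data_bytes = pcm_payload_bytes(carrier)
    header_bytes = os.path.getsize(carrier) - data_bytes
    capacity = naive_lsb_capacity(carrier)

print(f"sample data: {data_bytes} B, header: {header_bytes} B, naive capacity: {capacity} B")
```

Files written by other tools may carry extra chunks (metadata, for example), in which case the sample data does not start at byte 44.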

Decode (extract files)

```
from ghostbit.audiostego import AudioMultiFormatCoder

# Initialize the coder
coder = AudioMultiFormatCoder()

# Decode files
coder.decode_files_multi_format(
    encoded_file="output.wav",
    output_dir="extracted/",
    password="optional_password"
)
```

```
# Decode with progress callbacks
from ghostbit.audiostego import AudioMultiFormatCoder

coder = AudioMultiFormatCoder()

# Decoding progress
def on_decode_progress():
    print(".", end="", flush=True)

coder.on_decoded_element = on_decode_progress

coder.decode_files_multi_format(
    encoded_file="output.wav",
    output_dir="extracted/"
)
```

Password protection

```
from ghostbit.audiostego import AudioMultiFormatCoder, KeyRequiredEventArgs

coder = AudioMultiFormatCoder()

# Handle password requests during decoding
def request_password(args: KeyRequiredEventArgs):
    password = input(f"Enter password (version {args.h22_version}): ")
    if password:
        args.key = password
    else:
        args.cancel = True  # Cancel operation

coder.on_key_required = request_password

coder.decode_files_multi_format(
    encoded_file="encrypted_output.wav",
    output_dir="extracted/"
)
```

```
# Password-protected multiple files
from ghostbit.audiostego import AudioMultiFormatCoder, EncodeMode

coder = AudioMultiFormatCoder()

# Encode with a password
coder.encode_files_multi_format(
    carrier_file="music.mp3",
    secret_files=[
        "report.pdf",
        "spreadsheet.xlsx",
        "presentation.pptx"
    ],
    output_file="encoded_music.mp3",
    quality_mode=EncodeMode.HIGH_QUALITY,
    password="SuperSecure123!"
)
print("✅ Multiple files encrypted and hidden!")

# Decode
coder.decode_files_multi_format(
    encoded_file="encoded_music.mp3",
    output_dir="extracted_files/",
    password="SuperSecure123!"
)
print("✅ Files extracted successfully!")
```
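The examples above use `input()` and literal passwords for brevity. A sketch of a handler that prompts without echoing is shown below; `make_key_handler` is a hypothetical helper, and a `SimpleNamespace` stands in for `KeyRequiredEventArgs` so the snippet runs without GH0STB1T installed. It relies only on the `key` and `cancel` attributes used in the example above.

```python
from getpass import getpass
from types import SimpleNamespace
from typing import Callable


def make_key_handler(ask: Callable[[str], str] = getpass) -> Callable[[object], None]:
    """Return a handler that sets `key`, or sets `cancel` on empty input."""

    def handler(args) -> None:
        password = ask("Enter password: ")
        if password:
            args.key = password
        else:
            args.cancel = True  # empty input cancels the operation

    return handler


# Stand-in event object; the lambda replaces the interactive prompt for the demo.
fake_args = SimpleNamespace(key=None, cancel=False)
make_key_handler(ask=lambda prompt: "s3cret")(fake_args)
print(fake_args.key is not None, fake_args.cancel)
```

With the real coder this would be wired up as `coder.on_key_required = make_key_handler()`, which keeps the password out of source files and shell history.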

### Docker

GH0STB1T can be deployed with Docker for an isolated, reproducible environment.

Initial setup

1. **Clone the repository:**

   ```
   git clone https://github.com/kariemoorman/ghostbit.git
   cd ghostbit
   ```

2. **Create local `input` and `output` directories:** These local directories are mapped to directories inside the Docker container, which keeps file access contained.

   ```
   mkdir input output
   ```

3. **Place files in the `input/` directory:**

   ```
   cp /path/to/carrier.wav input/
   cp /path/to/secret.pdf input/
   ```

Build and run

```
# Build and start the container
docker-compose up -d ghostbit

# Encode files
docker-compose exec ghostbit ghostbit audio encode -i input/carrier.wav -f /input/secret.pdf -o encoded.wav -p

# Decode files
docker-compose exec ghostbit ghostbit audio decode -i output/encoded.wav -p

# Check capacity
docker-compose exec ghostbit ghostbit audio capacity -i input/carrier.wav -q high

# Analyze files
docker-compose exec ghostbit ghostbit audio analyze -i output/encoded.wav -p
```

Cleanup

```
# Stop the container
docker-compose stop

# Remove the container
docker-compose down

# Remove the container and images
docker-compose down --rmi all
```

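## LLM Integration

### MCP Server

GH0STB1T includes an MCP Server for standardized, secure integration with LLM-based systems. It supports 10 tools:

[GH0STB1T MCP Server](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/mcp_server)

| Audio | Image |
|--|--|
| `audio_encode` | `image_encode` |
| `audio_decode` | `image_decode` |
| `audio_capacity` | `image_capacity` |
| `audio_analyze` | `image_analyze` |
| `generate_audio_carrier` | `generate_image_carrier` |

Security hardening measures include:

- Hardened password handling, so that passwords never pass through the AI model's context
- Type-annotated tool parameters, with JSON Schema generated automatically by FastMCP
- Input sanitization: rejection of null bytes and control characters, path normalization, rejection of shell metacharacters, and pattern validation
- Filesystem sandboxing that restricts I/O to designated, resolvable paths
- Symlink rejection on all input files
- Prompt-injection defenses, including filename sanitization
- Error sanitization, so that errors are converted to safe, category-level messages
- Audit logging and password scrubbing
- STDIO transport using the JSON-RPC protocol
- Stateless tool design (a new coder instance per call)
- Resource-exhaustion prevention, including file size limits and stateless tool calls

Setup instructions for the MCP Server are given in the sections that follow.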
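Several of the hardening measures listed above (rejection of control characters and shell metacharacters, path normalization, sandboxing, symlink rejection) can be sketched with the standard library. This is an illustration of the general technique, not the MCP Server's actual code: the `validate_path` function, its character set, and its error messages are assumptions, and the character set is tuned for POSIX-style paths.

```python
import tempfile
from pathlib import Path

# Common shell metacharacters; control characters are checked separately.
_SHELL_META = set(";&|`$<>(){}!*?\"'")


def validate_path(raw: str, allowed_dirs: list[Path]) -> Path:
    """Return the resolved path, or raise ValueError if any check fails."""
    if any(ord(ch) < 32 or ord(ch) == 127 or ch in _SHELL_META for ch in raw):
        raise ValueError("path contains forbidden characters")
    candidate = Path(raw).expanduser()
    if candidate.is_symlink():
        raise ValueError("symbolic links are not accepted")
    resolved = candidate.resolve()  # collapses '..' and follows parent links
    if not any(resolved.is_relative_to(root.resolve()) for root in allowed_dirs):
        raise ValueError("path is outside the allowed directories")
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    sandbox = Path(tmp) / "output"
    sandbox.mkdir()
    inside = sandbox / "carrier.wav"
    inside.write_bytes(b"")

    ok = validate_path(str(inside), [sandbox])

    results = []
    for bad in (f"{sandbox}/../escape.wav", f"{inside}; rm -rf /", "bad\x00name.wav"):
        try:
            validate_path(bad, [sandbox])
            results.append("accepted")
        except ValueError as exc:
            results.append(str(exc))

print(ok.name)
for line in results:
    print(line)
```

Checking containment after `resolve()` matters: comparing the raw string against the sandbox prefix would accept `output/../escape.wav`, while the resolved path is correctly rejected.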

Password management

Unlike the CLI, the MCP Server needs an additional security layer to keep the LLM from accessing the password used to encrypt and decrypt the secret files encoded in digital media. Users must therefore prepare a password file first, either encrypted with SOPS or stored as plaintext with owner-only permissions.

### SOPS

```
brew install sops age
```

```
# Create the directory
mkdir -p ~/.config/sops/age

# Generate a key
age-keygen -o ~/.config/sops/age/keys.txt
chmod 600 ~/.config/sops/age/keys.txt
chmod 700 ~/.config/sops/age
chmod 700 ~/.config/sops

# Now get the public key
AGE_PUB=$(grep "public key" ~/.config/sops/age/keys.txt | awk '{print $NF}')

# Create the config in the home directory
cat > ~/.sops.yaml << EOF
creation_rules:
  - age: "$AGE_PUB"
EOF
```

```
# Create the password file
echo -n "demo123" > ~/.ghostbit-pw.txt

# Encrypt the file using SOPS + age
sops -e ~/.ghostbit-pw.txt > ~/.ghostbit-pw.enc
```

```
# Should print your password
sops -d ~/.ghostbit-pw.enc
```

```
password_file='~/.ghostbit-pw.enc'
```

### Plaintext

```
echo -n "demo123" > ~/.ghostbit-password
chmod 600 ~/.ghostbit-password
```

```
password_file="~/.ghostbit-password"
```
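For the plaintext option, a consumer of the password file can verify the permissions before trusting the contents. The sketch below shows one way to do that on POSIX systems; `read_password_file` is a hypothetical helper and is not taken from the MCP Server's source.

```python
import os
import stat
import tempfile
from pathlib import Path


def read_password_file(path: str) -> str:
    """Read a password file, refusing files accessible to group or others."""
    p = Path(path).expanduser()
    mode = stat.S_IMODE(p.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{p} is accessible to group/others (mode {mode:o})")
    return p.read_text(encoding="utf-8").rstrip("\n")


with tempfile.TemporaryDirectory() as tmp:
    pw_file = os.path.join(tmp, "ghostbit-password")
    with open(pw_file, "w", encoding="utf-8") as fh:
        fh.write("demo123")

    os.chmod(pw_file, 0o600)  # owner read/write only: accepted
    password = read_password_file(pw_file)

    os.chmod(pw_file, 0o644)  # world-readable: rejected
    try:
        read_password_file(pw_file)
        rejected = False
    except PermissionError:
        rejected = True

print(password == "demo123", rejected)
```

Stripping a trailing newline makes the helper tolerant of files created with plain `echo` instead of `echo -n`.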

Connecting the MCP Server

Make sure the GH0STB1T MCP package is installed in a virtual environment:

```
cd /path/to/ghostbit
python -m venv .ghostbit-venv
source .ghostbit-venv/bin/activate
pip install -e '.[mcp]'
```

### LM Studio

Option 1: generic command in `mcp.json`

```
{
  "mcpServers": {
    "ghostbit": {
      "command": "/path/to/ghostbit/.ghostbit-venv/bin/ghostbit-mcp"
    }
  }
}
```

Option 2: generic command with `GHOSTBIT_ALLOWED_DIRS` in `mcp.json`

```
{
  "mcpServers": {
    "ghostbit": {
      "command": "/path/to/ghostbit/.ghostbit-venv/bin/ghostbit-mcp",
      "env": {
        "GHOSTBIT_ALLOWED_DIRS": "/path/to/ghostbit/output:/path/to/ghostbit/tests/testcases"
      }
    }
  }
}
```
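Generating the configuration from a script avoids hand-editing JSON when the allowed directories change. The sketch below reproduces the two layouts shown above; `build_mcp_config` is a hypothetical helper, and joining directories with `:` follows the example value rather than any documented rule.

```python
import json
from typing import Optional


def build_mcp_config(command: str, allowed_dirs: Optional[list[str]] = None) -> dict:
    """Build an mcp.json structure for the ghostbit server entry."""
    server: dict = {"command": command}
    if allowed_dirs:
        # Colon-separated list, as in the GHOSTBIT_ALLOWED_DIRS example above.
        server["env"] = {"GHOSTBIT_ALLOWED_DIRS": ":".join(allowed_dirs)}
    return {"mcpServers": {"ghostbit": server}}


config = build_mcp_config(
    "/path/to/ghostbit/.ghostbit-venv/bin/ghostbit-mcp",
    ["/path/to/ghostbit/output", "/path/to/ghostbit/tests/testcases"],
)
print(json.dumps(config, indent=2))
```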

Example prompts

### Audio

1. Generate a carrier audio file
   * "Use generate_audio_carrier to create a 5-second, 440 Hz, 44100 sample rate, mono WAV file at /path/to/outputdir/carrier.wav"
2. Check capacity
   * "Use audio_capacity to check how much data /path/to/output/carrier.wav can hide at 'normal' quality"
3. Encode a secret file
   * "Use audio_encode to hide /path/to/directory/test_document.txt in /path/to/outputdir/carrier.wav, save the output to /path/to/output/encoded_audio.wav, quality 'normal', password_file='~/.ghostbit-password'"
4. Analyze hidden data
   * "Use audio_analyze on /path/to/output/encoded_audio.wav, password_file='~/.ghostbit-password'"
5. Decode and extract
   * "Use audio_decode on /path/to/output/encoded_audio.wav, output to /path/to/output/decoded, password_file='~/.ghostbit-password'"
6. Full end-to-end (single prompt)
   * "Generate a 10-second 440Hz WAV carrier at /path/to/output/stego_carrier.wav, then check its capacity in all three quality modes (low, normal, high), then hide /path/to/test_document.txt in it using quality 'low' and password_file='~/.ghostbit-password', save to /path/to/output/stego_audio.wav, then analyze the result"

### Image

1. Generate a carrier image
   * "Use generate_image_carrier to create an 800-wide, 600-high PNG image with a gradient pattern at /path/to/output/carrier.png"
2. Check capacity
   * "Use image_capacity to check how much data /path/to/output/carrier.png can hide"
3. Encode a secret file
   * "Use image_encode to hide /path/to/testcases/test_document.txt in /path/to/output/carrier.png, save the output to /path/to/output/encoded_image, password_file='~/.ghostbit-pw.enc'"
4. Analyze hidden data
   * "Use image_analyze on /path/to/output/encoded_image/carrier.png"
5. Decode and extract
   * "Use image_decode on /path/to/output/carrier.png, output to /path/to/output/decoded_image, password_file='~/.ghostbit-pw.enc'"
6. Full end-to-end (single prompt)
   * "Generate a 1000x1000 PNG with a wave pattern at /path/to/output/stego_cover.png, then check its capacity, then hide /path/to/test_document.txt in it using password_file='~/.ghostbit-pw.enc', save to /path/to/output/encoded, then analyze the result"

### Skills

GH0STB1T includes a skill system designed for integration with LLMs and AI assistants.

### Available Skills

GH0STB1T provides dedicated skill documents:

1. [**Audio Steganography**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/audiostego/skills/steganography/SKILL.md) - Complete usage guide with examples
2. [**Audio Capacity**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/audiostego/skills/capacity/SKILL.md) - Capacity planning and optimization strategies
3. [**Audio Troubleshooting**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/audiostego/skills/troubleshooting/SKILL.md) - Common problems and solutions
4. [**Image Steganography**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/imagestego/skills/steganography/SKILL.md) - Complete usage guide with examples
5. [**Image Capacity**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/imagestego/skills/capacity/SKILL.md) - Capacity planning and optimization strategies
6. [**Image Troubleshooting**](https://github.com/kariemoorman/ghostbit/blob/main/src/ghostbit/imagestego/skills/troubleshooting/SKILL.md) - Common problems and solutions

### LLM Quick Start

Get the documentation

```
from ghostbit.audiostego import get_audio_llm_context

# Get the full documentation, formatted for LLMs
context = get_audio_llm_context()

# Use it in your LLM prompt
prompt = f"""
You are an expert in audio steganography using AudioStego.

{context}

User: How do I hide a 5MB PDF in a 10-minute WAV file with maximum security?
Please provide a complete Python example with security best practices.
"""

# Send the prompt to your LLM
# response = your_llm_api(prompt)
```

Load a specific skill

```
from ghostbit.audiostego import load_audio_skill

# Load a specific skill
stego_skill = load_audio_skill("steganography")

# Get the skill content
print(stego_skill.content)

# Get examples from the skill
examples = stego_skill.get_examples()
for example in examples:
    print(f"Language: {example['language']}")
    print(f"Description: {example['description']}")
    print(f"Code:\n{example['code']}\n")

# Get a specific section
best_practices = stego_skill.get_section("Best Practices")
print(best_practices)
```

Create a prompt template

```
from ghostbit.audiostego import get_audio_llm_context

# Prepare the context
skills_context = get_audio_llm_context()

# The question to answer (example value; replace with real user input)
user_question = "How do I hide several PDF files in one WAV carrier?"

# Build a detailed prompt
prompt = f"""
You are an expert Python developer specializing in audio steganography.

CONTEXT:
{skills_context}

TASK:
The user wants to create a secure file hiding system for sensitive documents.

Requirements:
- Hide multiple PDF files in a single audio carrier
- Use strong encryption with user-provided passwords
- Show progress during encoding/decoding
- Handle errors gracefully
- Verify file integrity after extraction

USER QUESTION:
{user_question}

Please provide:
1. Complete working code
2. Security considerations
3. Error handling strategy
4. Usage example

Format your response as:
- Code blocks with explanations
- Security notes
- Example usage
"""

# Send to the LLM API
# response = llm_api.generate(prompt)
```

Integrate with the Anthropic Claude API

```
import anthropic
from ghostbit.audiostego import get_audio_llm_context

client = anthropic.Anthropic(api_key="your-api-key")
context = get_audio_llm_context()

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=2048,
    system=f"You are an AudioStego expert. {context}",
    messages=[
        {"role": "user", "content": "Show me how to use AudioStego with error handling"}
    ]
)
print(message.content[0].text)
```

Integrate with the OpenAI API

```
from openai import OpenAI
from ghostbit.audiostego import get_audio_llm_context

client = OpenAI(api_key="your-api-key")
context = get_audio_llm_context()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"You are an AudioStego expert. {context}"},
        {"role": "user", "content": "How do I encode files with maximum security?"}
    ]
)
print(response.choices[0].message.content)
```

## Troubleshooting

- **GitHub Issues:** [Report bugs or request features](https://github.com/kariemoorman/ghostbit/issues)
- **Discussions:** [Ask questions and share ideas](https://github.com/kariemoorman/ghostbit/discussions)
- **Documentation:** [Full API reference](https://github.com/kariemoorman/ghostbit/wiki)

## Contributing

Contributions are welcome! To get started, see [CONTRIBUTING.md](https://github.com/kariemoorman/ghostbit/blob/main/CONTRIBUTING.md).

## Citation

GH0STB1T is a free, open-source tool for education and research. If you use GH0STB1T in your research or project, please cite it as:

```
@software{audiostego2026,
  author = {Karie Moorman},
  title = {GH0STB1T: A Multi-format Steganography Toolkit for Python},
  year = {2026},
  url = {https://github.com/kariemoorman/ghostbit},
  version = {0.0.1}
}
```

**APA format:**

```
Moorman, Karie. (2026). GH0STB1T: A Multi-format Steganography Toolkit for Python (Version 0.0.1) [Computer software]. https://github.com/kariemoorman/ghostbit
```

## License

This project is licensed under the [Apache License 2.0](LICENSE).
