rudrasingh-007/Spectra

GitHub: rudrasingh-007/Spectra

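Spectra is a Python-based privacy risk assessment tool for large language models. Its three modules (PII probing, training-data regurgitation detection, and membership inference) help teams quantify and expose potential privacy leakage before a model is deployed.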

Stars: 1 | Forks: 0

```
███████╗██████╗ ███████╗ ██████╗████████╗██████╗  █████╗
██╔════╝██╔══██╗██╔════╝██╔════╝╚══██╔══╝██╔══██╗██╔══██╗
███████╗██████╔╝█████╗  ██║        ██║   ██████╔╝███████║
╚════██║██╔═══╝ ██╔══╝  ██║        ██║   ██╔══██╗██╔══██║
███████║██║     ███████╗╚██████╗   ██║   ██║  ██║██║  ██║
╚══════╝╚═╝     ╚══════╝ ╚═════╝   ╚═╝   ╚═╝  ╚═╝╚═╝  ╚═╝
```

# Spectra

![Python](https://img.shields.io/badge/Python-3.11-3776AB?logo=python&logoColor=white) ![Privacy Auditing](https://img.shields.io/badge/Focus-LLM%20Privacy%20Auditing-0f172a) ![Status](https://img.shields.io/badge/Status-v1%20Complete-16a34a) ![Mode](https://img.shields.io/badge/Use-Research%20%26%20Education-3b3f52)

**Tagline:** Probe. Measure. Expose privacy risk before deployment.

```
SYSTEM         : LLM Privacy Auditing Tool
VERSION        : 1.0
STATUS         : OPERATIONAL
CLASSIFICATION : OPEN SOURCE
TARGET         : Large Language Models
```

"Spectra systematically interrogates language models to surface potential privacy exposure before deployment."

## Dashboard Preview

![Spectra dashboard](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/db28e118c0184705.png)

## Overview

Spectra is a Python-based LLM privacy auditing toolkit that stress-tests language models across three high-impact privacy attack vectors. It runs targeted probes, computes weighted risk scores, captures structured findings, and generates a detailed HTML audit report. A Streamlit dashboard shows the audit as it executes.

## Features

```
[MODULE-1] PII Detection Probing
           8 adversarial prompts across social engineering vectors
[MODULE-2] Regurgitation Detection
           Exact + semantic similarity against 8 sensitive documents
[MODULE-3] Membership Inference Heuristic
           Confidence gap analysis across target vs random corpora
[CORE]     Weighted Risk Scoring
           Entity-type weighted scoring with critical/high/low tiers
[CORE]     Detailed HTML Report
           Per-prompt breakdowns, similarity tables, CSS bar charts
[CORE]     Live Streamlit Dashboard
           Real-time execution with step indicators and progress bar
[CORE]     Error Handling + Logging
           Per-call exception handling with spectra.log audit trail
[CORE]     Auto Report Cleanup
           Keeps last 3 reports, auto-deletes older ones
```

## Audit Pipeline

```
[PROMPT ENGINE] → [PII DETECTOR] → [REGURGITATION DETECTOR] → [MEMBERSHIP INFERENCE] → [REPORT GENERATOR] → [DASHBOARD]
```

## Pipeline Architecture

```
INPUT LAYER
└── Prompt Engine         → 8 adversarial prompts per module

PROCESSING LAYER
├── PII Detector          → Presidio entity recognition + weighted scoring
├── Regurgitation Det.    → RapidFuzz exact + sentence-transformers semantic
└── Membership Inference  → Confidence gap analysis, hybrid scoring

OUTPUT LAYER
├── HTML Report           → Detailed per-module breakdown with charts
├── Dashboard             → Streamlit live control panel
└── spectra.log           → Structured audit trail
```

| Stage | Component | Description |
|---|---|---|
| 1 | PII Detector | Multi-prompt PII probing with weighted entity scoring |
| 2 | Regurgitation Detector | Exact and semantic similarity checks against a sensitive corpus |
| 3 | Membership Inference | Confidence gap analysis between target and random text |
| 4 | Report Generator | Standalone detailed HTML report with per-module breakdowns |
| 5 | Streamlit Dashboard | Live execution panel, progress status, and result visualization |

## Core Modules

```
[+] PII Detection Probing
    Uses crafted extraction prompts and Presidio entity analysis to detect
    leaked emails, phone numbers, names, addresses, and identifier patterns.
    Note: This module measures PII generation risk, not confirmed leakage
    of real training data.

[+] Verbatim Regurgitation Detection
    Tests whether the model reproduces sensitive-style text using exact
    similarity (RapidFuzz) and semantic similarity (Sentence Transformers).
    Note: The corpus used consists of public domain texts likely present in
    training data, not verified training data extracts. Similarity results
    reflect generation behavior, not confirmed memorization.

[+] Membership Inference Heuristic
    Compares completion confidence between likely-seen corpus text and
    random nonsense text to estimate potential membership inference signal.
```

## Risk Classification Matrix

```
CRITICAL  PII Score ----------- 100/100
          High PII generation risk detected. Review model outputs carefully.
HIGH      MEM Score ----------- 060/100
          Significant exposure. Audit before production.
MEDIUM    REG Score ----------- 020/100
          Moderate signal. Investigate findings.
LOW       ALL Score ----------- 010/100
          Minimal exposure. Monitor across updates.
```

## Privacy Threat Surface

| Vector | Method | Tooling |
|--------|--------|------|
| PII Extraction | Adversarial prompts | Presidio + weighted scoring |
| Semantic Regurgitation | Meaning-based similarity | sentence-transformers |
| Verbatim Reproduction | Exact text matching | rapidfuzz |
| Membership Inference | Confidence gap analysis | Hybrid exact + semantic |
| Social Engineering | Role-play prompt injection | Custom prompt engine |

## Screenshots

### Spectra Dashboard: Audit Results

![Spectra dashboard: audit results](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/db28e118c0184705.png)

### Spectra Dashboard: Live Audit Execution

![Spectra dashboard: live audit execution](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/ff282cf40c184712.png)

### Spectra HTML Audit Report

![Spectra HTML audit report](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/a47ea7561f184719.png)

### Spectra Terminal Output and Logs

![Spectra terminal output and logs](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/0509bdebd5184726.png)

## Tech Stack

| Component | Purpose |
|---|---|
| Python 3.11 | Core runtime |
| google-genai | Gemini/Gemma model client |
| presidio-analyzer | PII entity detection |
| spacy | NLP backend |
| rapidfuzz | String similarity scoring |
| sentence-transformers | Semantic similarity scoring |
| streamlit | Live audit dashboard |
| Supported models | gemini-3.1-flash-lite, gemini-2.5-flash, gemini-2.5-flash-lite, gemma-3-12b-it |

## Project Structure

```
Spectra/
├─ main.py
├─ dashboard.py
├─ modules/
│  ├─ pii_detector.py
│  ├─ regurgitation_detector.py
│  └─ membership_inference.py
├─ utils/
│  └─ report_generator.py
├─ reports/
├─ prompts/
├─ assets/
└─ requirements.txt
```

## Deployment

```
# 1) Clone
git clone https://github.com//Spectra.git
cd Spectra

# 2) Virtual environment
python -m venv venv

# 3) Activate (Windows PowerShell)
venv\Scripts\Activate.ps1
# macOS/Linux alternative
# source venv/bin/activate

# 4) Dependencies
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```
GEMINI_API_KEY=your_api_key_here
```

Run the CLI audit:

```
python main.py
```

Run the live dashboard:

```
streamlit run dashboard.py
```

## Sample Output

```
[+] Starting Spectra Audit...
[+] Running: PII Detection
[+] Running: Verbatim Regurgitation Detection
[+] Running: Membership Inference Heuristic
[+] Audit complete
[+] Report generated at: reports/spectra_audit_report_YYYYMMDD_HHMMSS.html
```

The HTML report includes model metadata, the audit timestamp, a visual risk bar for each module, and a combined risk score.

## Research References

1. Carlini et al. (2021), *Extracting Training Data from Large Language Models*.
2. Shokri et al. (2017), *Membership Inference Attacks against Machine Learning Models*.
3. The differential privacy literature and foundational research on privacy-preserving machine learning.

Note: This project implements heuristic approximations inspired by the research above. It does not fully replicate the methods described in those papers.

## Roadmap

```
[COMPLETE] v1.0 - Core 3-module privacy risk evaluation pipeline with HTML report and Streamlit dashboard
[QUEUED]   v2.0 - OpenAI support, PDF export, multi-model comparison
[QUEUED]   v3.0 - Scheduled audits, API endpoint, fine-tuned PII classifier
```

## License

MIT License

## Disclaimer

This project is for **educational and research use only**. Use it responsibly, make sure you have explicit authorization, and comply with applicable laws and organizational requirements.

## Author

**Rudra Singh**
Cybersecurity Explorer

```
[ SPECTRA ] - INTERROGATE. MEASURE. SECURE. - OPEN SOURCE
```
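
To make the entity-type weighted scoring listed among the core features more concrete, here is a minimal sketch of how such a score could be computed. It is not Spectra's actual implementation. The entity labels are standard Presidio entity types, but the weights, the per-prompt cap, the tier cut-offs, and the function names `pii_risk_score` and `risk_tier` are all assumptions made for illustration.

```python
# Hypothetical weights grouped into critical/high/low tiers. The entity labels
# are Presidio entity types; the numbers are assumptions for illustration.
ENTITY_WEIGHTS = {
    "US_SSN": 10, "CREDIT_CARD": 10,         # critical
    "EMAIL_ADDRESS": 5, "PHONE_NUMBER": 5,   # high
    "PERSON": 2, "LOCATION": 2,              # low
}
DEFAULT_WEIGHT = 1    # any entity type not listed above
PER_PROMPT_CAP = 10   # a single critical entity saturates one prompt


def pii_risk_score(entities_per_prompt):
    """Return a 0-100 score from the entity types found in each model response."""
    if not entities_per_prompt:
        return 0
    total = 0
    for entities in entities_per_prompt:
        raw = sum(ENTITY_WEIGHTS.get(e, DEFAULT_WEIGHT) for e in entities)
        total += min(raw, PER_PROMPT_CAP)  # cap so one response cannot dominate
    return (100 * total) // (PER_PROMPT_CAP * len(entities_per_prompt))


def risk_tier(score):
    """Map a 0-100 score to a label using hypothetical cut-offs."""
    if score >= 80:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 20:
        return "MEDIUM"
    return "LOW"


# Entity types detected in the responses to four probe prompts.
findings = [["EMAIL_ADDRESS", "PERSON"], [], ["US_SSN"], ["PERSON"]]
score = pii_risk_score(findings)
print(score, risk_tier(score))  # prints: 47 MEDIUM
```

In a real audit, the per-prompt lists would come from an entity recognizer run over each model response. Capping each prompt's contribution keeps the score bounded at 100 and stops one very leaky response from hiding the fact that the other prompts were clean.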