rudrasingh-007/Spectra
GitHub: rudrasingh-007/Spectra
Spectra 是一个基于 Python 的大语言模型隐私风险评估工具,通过 PII 探测、训练数据复现检测和成员推断三大模块,帮助团队在模型部署前量化并暴露潜在的隐私泄漏风险。
Stars: 1 | Forks: 0
```
███████╗██████╗ ███████╗ ██████╗████████╗██████╗ █████╗
██╔════╝██╔══██╗██╔════╝██╔════╝╚══██╔══╝██╔══██╗██╔══██╗
███████╗██████╔╝█████╗ ██║ ██║ ██████╔╝███████║
╚════██║██╔═══╝ ██╔══╝ ██║ ██║ ██╔══██╗██╔══██║
███████║██║ ███████╗╚██████╗ ██║ ██║ ██║██║ ██║
╚══════╝╚═╝ ╚══════╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝
```
# Spectra




**标语:** 探测。衡量。在部署前暴露隐私风险。
```
SYSTEM : LLM Privacy Auditing Tool
VERSION : 1.0
STATUS : OPERATIONAL
CLASSIFICATION : OPEN SOURCE
TARGET : Large Language Models
```
"Spectra 系统地对语言模型进行 interrogates (审问),以便在部署前揭示潜在的隐私暴露问题。"
## 仪表盘预览

## 概述
Spectra 是一个基于 Python 的 LLM 隐私审计工具包,通过三种高影响力的隐私攻击向量对语言模型进行压力测试。它运行针对性的探测,计算加权风险评分,捕获结构化发现,并生成包含基于 Streamlit 的实时执行可视性的详细 HTML 审计报告。
## 功能特性
```
[MODULE-1] PII Detection Probing 8 adversarial prompts across social engineering vectors
[MODULE-2] Regurgitation Detection Exact + semantic similarity against 8 sensitive documents
[MODULE-3] Membership Inference Heuristic Confidence gap analysis across target vs random corpora
[CORE] Weighted Risk Scoring Entity-type weighted scoring with critical/high/low tiers
[CORE] Detailed HTML Report Per-prompt breakdowns, similarity tables, CSS bar charts
[CORE] Live Streamlit Dashboard Real-time execution with step indicators and progress bar
[CORE] Error Handling + Logging Per-call exception handling with spectra.log audit trail
[CORE] Auto Report Cleanup Keeps last 3 reports, auto-deletes older ones
```
## 审计流水线
```
[PROMPT ENGINE] → [PII DETECTOR] → [REGURGITATION DETECTOR] → [MEMBERSHIP INFERENCE] → [REPORT GENERATOR] → [DASHBOARD]
```
## 流水线架构
```
INPUT LAYER
└── Prompt Engine → 8 adversarial prompts per module
PROCESSING LAYER
├── PII Detector → Presidio entity recognition + weighted scoring
├── Regurgitation Det. → RapidFuzz exact + sentence-transformers semantic
└── Membership Inference → Confidence gap analysis, hybrid scoring
OUTPUT LAYER
├── HTML Report → Detailed per-module breakdown with charts
├── Dashboard → Streamlit live control panel
└── spectra.log → Structured audit trail
```
| 阶段 | 组件 | 描述 |
|---|---|---|
| 1 | PII Detector | 带有加权实体评分的多提示 PII 探测 |
| 2 | Regurgitation Detector | 针对敏感语料库的精确和语义相似度检查 |
| 3 | Membership Inference | 目标与随机置信度差距分析 |
| 4 | Report Generator | 包含模块细分的独立详细 HTML 报告 |
| 5 | Streamlit Dashboard | 实时执行面板、进度状态和结果可视化 |
## 核心模块
```
[+] PII Detection Probing
Uses crafted extraction prompts and Presidio entity analysis to detect leaked emails,
phone numbers, names, addresses, and identifier patterns.
Note: This module measures PII generation risk, not confirmed leakage of real training data.
[+] Verbatim Regurgitation Detection
Tests whether the model reproduces sensitive-style text using exact similarity
(RapidFuzz) and semantic similarity (Sentence Transformers).
Note: The corpus used consists of public domain texts likely present in training data, not verified training data extracts. Similarity results reflect generation behavior, not confirmed memorization.
[+] Membership Inference Heuristic
Compares completion confidence between likely-seen corpus text and random nonsense
text to estimate potential membership inference signal.
```
## 风险分类矩阵
```
CRITICAL PII Score ----------- 100/100 High PII generation risk detected. Review model outputs carefully.
HIGH MEM Score ----------- 060/100 Significant exposure. Audit before production.
MEDIUM REG Score ----------- 020/100 Moderate signal. Investigate findings.
LOW ALL Score ----------- 010/100 Minimal exposure. Monitor across updates.
```
## 隐私威胁面
| 向量 | 方法 | 工具 |
|--------|--------|------|
| PII Extraction | 对抗性提示 | Presidio + 加权评分 |
| Semantic Regurgitation | 基于含义的相似度 | sentence-transformers |
| Verbatim Reproduction | 精确文本匹配 | rapidfuzz |
| Membership Inference | 置信度差距分析 | 混合精确 + 语义 |
| Social Engineering | 角色扮演提示注入 | 自定义提示引擎 |
## 截图
### Spectra 仪表盘 — 审计结果

### Spectra 仪表盘 — 实时审计执行

### Spectra HTML 审计报告

### Spectra 终端输出及日志

## 技术栈
| 组件 | 用途 |
|---|---|
| Python 3.11 | 核心运行时 |
| google-genai | Gemini/Gemma 模型客户端 |
| presidio-analyzer | PII 实体检测 |
| spacy | NLP 后端支持 |
| rapidfuzz | 字符串相似度评分 |
| sentence-transformers | 语义相似度评分 |
| streamlit | 实时审计仪表盘 |
| 支持的模型 | gemini-3.1-flash-lite, gemini-2.5-flash, gemini-2.5-flash-lite, gemma-3-12b-it |
## 系统结构
```
Spectra/
├─ main.py
├─ dashboard.py
├─ modules/
│ ├─ pii_detector.py
│ ├─ regurgitation_detector.py
│ └─ membership_inference.py
├─ utils/
│ └─ report_generator.py
├─ reports/
├─ prompts/
├─ assets/
└─ requirements.txt
```
## 部署
```
# 1) Clone
git clone https://github.com//Spectra.git
cd Spectra
# 2) Virtual environment
python -m venv venv
# 3) Activate (Windows PowerShell)
venv\Scripts\Activate.ps1
# macOS/Linux 替代方案
# source venv/bin/activate
# 4) Dependencies
pip install -r requirements.txt
```
在项目根目录创建一个 `.env` 文件:
```
GEMINI_API_KEY=your_api_key_here
```
运行 CLI 审计:
```
python main.py
```
运行实时仪表盘:
```
streamlit run dashboard.py
```
## 示例输出
```
[+] Starting Spectra Audit...
[+] Running: PII Detection
[+] Running: Verbatim Regurgitation Detection
[+] Running: Membership Inference Heuristic
[+] Audit complete
[+] Report generated at: reports/spectra_audit_report_YYYYMMDD_HHMMSS.html
```
HTML 报告包含模型元数据、审计时间戳、每个模块的可视化风险条以及综合风险评分。
## 研究参考
1. Carlini 等人 (2021),*Extracting Training Data from Large Language Models*。
2. Shokri 等人 (2017),*Membership Inference Attacks against Machine Learning Models*。
3. 差分隐私文献和基础性的隐私保护机器学习研究。
注意:本项目实现了受上述研究启发的启发式近似。它并没有完全复制这些论文中描述的加密方法。
## 路线图
```
[COMPLETE] v1.0 — Core 3-module privacy risk evaluation pipeline with HTML report and Streamlit dashboard
[QUEUED] v2.0 — OpenAI support, PDF export, multi-model comparison
[QUEUED] v3.0 — Scheduled audits, API endpoint, fine-tuned PII classifier
```
## 许可证
MIT License
## 免责声明
本项目仅供**教育和研究使用**。请负责任地使用,并确保获得明确的授权,同时遵守适用的法律和组织要求。
## 作者
**Rudra Singh**
网络安全探索者
```
[ SPECTRA ] — INTERROGATE. MEASURE. SECURE. — OPEN SOURCE
```
标签:AI安全审计, CISA项目, IP 地址批量处理, LLM隐私评估, PII生成风险, Python, Streamlit, 个人隐私信息泄露, 人工智能安全, 代码生成, 合规性, 大语言模型安全, 密码管理, 密钥泄露防护, 成员推理攻击, 无后门, 机器学习安全, 机密管理, 模型安全测试, 深度学习, 渗透测试工具, 红队评估, 网络安全, 网络安全, 训练数据重提取, 访问控制, 逆向工具, 隐私保护, 隐私保护, 隐私风险检测