DurgaRamireddy/AI-Powered-Alert-Triage-with-Claude-API

GitHub: DurgaRamireddy/AI-Powered-Alert-Triage-with-Claude-API

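An AI-powered SOC alert triage and comparison platform that uses the Claude API to evaluate the accuracy and hallucination behavior of AI-assisted triage on real attack traffic.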


# AI-Powered Alert Triage (Claude API)

**Tools:** Python 3 · Claude API (claude-opus-4-5) · Splunk Enterprise · Impacket · CrackMapExec · VMware Workstation · Ubuntu · Windows Server 2022 · Windows 10 · Kali Linux

**MITRE ATT&CK:** T1558.003 · T1558.004 · T1021.002 · T1550.002

**Type:** Home Lab · Blue Team · SOC Analysis · AI-Assisted Triage · SIEM · Threat Detection

---

## Summary

Built an AI-powered SOC triage system that analyzes Splunk alerts, compares AI decisions with those of a human analyst, and identifies hallucination risks in realistic scenarios.

---

## Overview

This project builds an end-to-end AI-assisted SOC triage pipeline. SIEM alerts generated by real attacks against a home Active Directory lab are normalized, processed by the Claude API acting as a Tier 1 SOC analyst, and displayed in a custom analyst dashboard that places AI and manual triage results side by side. The goal is to answer a practical SOC question: **Can an AI model reliably triage SIEM alerts, and where does it fail?**

**The project covers the full workflow:**

1. Run Kerberoasting, AS-REP Roasting, and lateral movement attacks against a live AD environment to generate real SIEM alerts
2. Export 38 alerts from Splunk and normalize them to a consistent JSON schema
3. Build a Python triage engine that calls the Claude API with a structured SOC analyst system prompt
4. Receive structured JSON triage output for each alert (severity, verdict, MITRE mapping, escalation decision, evidence summary)
5. Show AI triage and manual analyst triage side by side in a custom dark-themed dashboard
6. Test AI failure cases with deliberately degraded alerts and document where the model hallucinates

---

## Lab Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    VMware Host-Only Network                     │
│                        192.168.255.0/24                         │
│                                                                 │
│  ┌──────────────────────┐      ┌──────────────────────────────┐ │
│  │  Windows Server 2022 │      │  Ubuntu 22.04                │ │
│  │  WIN-I4UHLQF702E     │      │  Splunk Web                  │ │
│  │  DC: lab.local       │      │  Forwarder                   │ │
│  │  192.168.255.130     │      │  (internal services)         │ │
│  │  AD DS / DNS / KDC   │      │                              │ │
│  └──────────────────────┘      └──────────────────────────────┘ │
│           │                               ▲                     │
│           │ Domain Auth                   │ Windows Logs        │
│           ▼                               │                     │
│  ┌──────────────────────┐                 │                     │
│  │  Windows 10          │─────────────────┘                     │
│  │  DESKTOP-4FNNSE5     │  Universal Forwarder                  │
│  │  192.168.255.132     │                                       │
│  │  Corp-PC01           │                                       │
│  └──────────────────────┘                                       │
│                                                                 │
│  ┌──────────────────────┐                                       │
│  │  Kali Linux          │  Runs attacks → generates alerts      │
│  │  192.168.255.135     │  Impacket · CrackMapExec · BloodHound │
│  └──────────────────────┘                                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │  Claude API (latest model)
                                 ▼
                     ┌───────────────────────┐
                     │     Python Triage     │
                     │        Engine         │
                     │   triage_engine.py    │
                     └───────────────────────┘
                                 │
                                 ▼
                     ┌───────────────────────┐
                     │   Analyst Dashboard   │
                     │    dashboard.html     │
                     │  AI vs Manual Triage  │
                     └───────────────────────┘
```

| VM | Operating System | IP | Role |
|---|---|---|---|
| WIN-I4UHLQF702E | Windows Server 2022 | 192.168.255.130 | Domain controller (lab.local) |
| DESKTOP-4FNNSE5 | Windows 10 22H2 | 192.168.255.132 | Corp-PC01 (domain-joined) |
| Ubuntu VM | Ubuntu 22.04 | 192.168.255.131 | Splunk Enterprise server |
| Kali Linux | Kali Rolling | 192.168.255.135 | Attacker machine |

**Domain:** `lab.local`

**Domain users:** `jsmith` (standard user), `mjones` (standard user), `Administrator` (domain admin)

---

## Phase 1 - Attack Simulation and Alert Generation

### Attacks Run from Kali Linux

```
# Kerberoasting - request RC4-encrypted TGS tickets for user SPNs
python3 GetUserSPNs.py LAB.local/jsmith:Password \
  -dc-ip 192.168.255.130 -request

# AS-REP Roasting - target accounts without Kerberos pre-authentication
python3 GetNPUsers.py LAB.local/ \
  -usersfile /etc/passwd -dc-ip 192.168.255.130

# Lateral Movement - SMB network logon via CrackMapExec
crackmapexec smb 192.168.255.130 192.168.255.132 \
  -u jsmith -p Password
```

### Splunk Alert Counts

| Attack | Event Code | Alerts Generated |
|---|---|---|
| Kerberoasting | 4769 | 18 |
| AS-REP Roasting | 4768 | 7 |
| Lateral Movement (SMB) | 4624 | 13 |
| **Total** | | **38** |

### SPL Queries Used

```
-- Kerberoasting detection
index=* EventCode=4769
| table _time, ComputerName, Account_Name, Service_Name, Client_Address, Ticket_Encryption_Type

-- AS-REP Roasting detection
index=* EventCode=4768
| table _time, ComputerName, Account_Name, Client_Address, Status, Pre_Authentication_Type

-- Lateral Movement detection
index=* EventCode=4624 Logon_Type=3 Account_Name=jsmith
| table _time, ComputerName, Account_Name, Logon_Type, Workstation_Name, IpAddress
```

---

## Phase 2 - Alert Normalization

All 38 alerts were exported from Splunk as NDJSON and normalized to a consistent schema with `normalize.py`.

### Normalized Alert Schema

```
{
  "alert_id": "uuid",
  "attack_type": "Kerberoasting | AS-REP Roasting | Lateral Movement",
  "timestamp": "ISO 8601 from Splunk _time",
  "event_code": "4769 | 4768 | 4624 | unknown",
  "source_host": "ComputerName",
  "account": "Account_Name",
  "source_ip": "Client_Address or IpAddress",
  "service_name": "Service_Name or null",
  "logon_type": "Logon_Type or null",
  "ticket_encryption_type": "0x17 (RC4) | 0x12 (AES-256) | null",
  "raw_message": "_raw Splunk event",
  "log_source": "Splunk/WinEventLog:Security"
}
```

**Normalization issues resolved:**

- Splunk field names use underscores (`Ticket_Encryption_Type`, not `TicketEncryptionType`)
- The lateral movement filter needs an explicit `Account_Name=jsmith`, because machine accounts have no `IpAddress` value
- Splunk exports NDJSON by default, so the normalizer detects and handles both NDJSON and JSON array formats

---

## Phase 3 - AI Triage Engine

### System Prompt (Analyst Persona)

```
SYSTEM_PROMPT = """
You are a Tier 1 SOC analyst assistant. You analyze SIEM alerts and
produce structured triage reports.

For every alert, respond ONLY with valid JSON:
{
  "severity": "Critical | High | Medium | Low",
  "verdict": "True Positive | Likely True Positive | Requires Investigation | Likely False Positive",
  "mitre_tactic": "TA00XX - Tactic Name",
  "mitre_technique": "T1XXX - Technique Name",
  "mitre_confidence": "High | Medium | Low",
  "false_positive_probability": integer 0-100,
  "escalate": true or false,
  "escalation_reason": "string or null",
  "evidence_summary": "2-3 sentence analyst summary",
  "recommended_next_steps": ["step 1", "step 2", "step 3"],
  "analyst_notes": "caveats, confidence gaps, missing data"
}

Rules:
- Base analysis strictly on alert fields provided
- Lower confidence when fields are null or missing; never invent data
- Never fabricate IOCs, IPs, or account names not in the alert
- Kerberoasting = T1558.003. AS-REP Roasting = T1558.004.
  Lateral Movement = T1550.002 or T1021
"""
```

### Triage Engine

```
import anthropic
import json
import time
from dotenv import load_dotenv

load_dotenv()  # loads ANTHROPIC_API_KEY from a local .env file
client = anthropic.Anthropic()

def triage_alert(alert: dict) -> dict:
    """Send one normalized alert to Claude and return the parsed triage JSON."""
    message = client.messages.create(
        model="claude-opus-4-5",  # model ID as given in the tools list
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Triage this SIEM alert:\n\n{json.dumps(alert, indent=2)}"
        }]
    )
    raw = message.content[0].text
    # Strip Markdown code fences in case the model wraps its JSON in them
    clean = raw.replace("```json", "").replace("```", "").strip()
    try:
        return json.loads(clean)
    except json.JSONDecodeError:
        # Keep the raw text so parse failures can be reviewed later
        return {"error": "Failed to parse AI response", "raw": raw}
```

### AI Triage Results Summary

| Attack Type | AI Verdict | Count | Correct? |
|---|---|---|---|
| Kerberoasting (jsmith, RC4 0x17) | High / Likely True Positive | 2 | ✅ Yes |
| Kerberoasting (machine accounts, AES 0x12) | Low / Likely False Positive | 16 | ✅ Yes |
| AS-REP Roasting (jsmith, external IP) | High / Requires Investigation | 2 | ✅ Yes |
| AS-REP Roasting (machine accounts) | Low / Likely False Positive | 5 | ✅ Yes |
| Lateral Movement (jsmith, Logon Type 3) | High / Requires Investigation | 13 | ✅ Yes |

**The AI correctly distinguished:**

- RC4 (0x17) as a Kerberoasting indicator from AES (0x12) as normal traffic
- User-account SPNs as attack targets from machine-account SPNs as normal domain behavior
- Network logons by jsmith from the Kali IP as suspicious from the DC authenticating to itself as benign

---

## Phase 4 - Analyst Dashboard

A single-file HTML dashboard (`dashboard/dashboard.html`), served by Python's built-in HTTP server, displays all 38 triage results.

```
# Serve the dashboard
cd ~/soc-triage-ai
python3 -m http.server 8080
# Open: http://localhost:8080/dashboard/dashboard.html
```

**Dashboard features:**

- Alert queue (left panel): 38 alerts with severity badges and escalation indicators
- AI triage panel: severity, verdict, MITRE tactic/technique, confidence, false positive percentage, escalation reason, evidence summary, recommended next steps
- Manual analyst triage panel: dropdowns and input fields for severity, verdict, escalation, and free-text notes
- AI vs manual analyst comparison rows: ✅ match / ❌ mismatch for each field, with a notes input for explaining the reasoning

---

## Phase 5 - AI Failure Analysis

Three deliberately degraded test cases were fed to the triage engine to identify failure modes.

### Test Cases

| Alert ID | Description | Purpose |
|---|---|---|
| test-001 | All fields null/unknown | Test behavior with zero data |
| test-002 | Minimal data: only the raw message "User logged on." | Test handling of ambiguous logs |
| test-003 | Realistic SQL SPN with AES (0x12) encryption | Test encryption type knowledge |

### Failure Found: test-003 (Encryption Type Hallucination)

**Alert:** `jsmith` requested a TGS for `MSSQLSvc/sqlserver.lab.local:1433` with encryption type `0x12`

**AI output:** Likely True Positive, High severity, High MITRE confidence

**AI analyst note (verbatim):**

**What went wrong:** The AI inverted the encryption type mapping. `0x12` is AES-256, the strongest
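Kerberos encryption type, and it represents entirely normal traffic; `0x17` is RC4, the actual Kerberoasting indicator. The AI stated the reverse **and did so with high confidence**, which would cause a false escalation in a real SOC.

**Correct analyst verdict:** False Positive. A named SPN using AES-256 is normal domain behavior.

### Other Test Results

| Test | AI Verdict | Failed? | Notes |
|---|---|---|---|
| test-001 (empty) | Low / Requires Investigation | ✅ No | Correctly lowered confidence |
| test-002 (ambiguous) | High / Requires Investigation | ✅ No | Reasonable but not actionable: garbage in, garbage out |
| test-003 (AES SPN) | High / Likely True Positive | ❌ Hallucination | Wrong encryption mapping, stated with high confidence |

### Key Findings

---

## Screenshots

### Alert Queue and AI Triage

![SOC triage dashboard](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/1f82c8b3ec222212.png)

![AI triage details](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/5c3d6b9d9e222214.png)

### Dashboard: AI vs Manual Analyst Comparison

![High severity comparison](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/c4f34796d8222216.png)

![Medium severity comparison](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/8cef843363222217.png)

![Low severity comparison](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/17c4a35167222219.png)

![Dashboard overview](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/a8e2e3584b222220.png)

### Triage Results: JSON Output

![Triage results 1](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/e55f0410c3222222.png)

![Triage results 2](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/c60e5d4e88222224.png)

### AI Failure Tests

![Failure tests](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/29b3cae521222226.png)

---

## AI Failure Analysis

Full documentation of the AI hallucination case, the test methodology, and recommendations for production use:

📄 [failure_analysis.md](failure_analysis.md)

---

## Detection Logic: Analyst Reference

### Kerberoasting (T1558.003)

| Indicator | Malicious | Normal |
|---|---|---|
| Encryption type | 0x17 (RC4) | 0x12 / 0x11 (AES) |
| Requesting account | User account | Machine account ($) |
| Target SPN | User-named SPN | Machine-account SPN |
| Source IP | External workstation | Localhost (::1) |

### AS-REP Roasting (T1558.004)

| Indicator | Malicious | Normal |
|---|---|---|
| Target account | User account with DONT_REQ_PREAUTH enabled | Machine account |
| Source IP | External IP | Localhost (::1) |
| Volume | Multiple accounts targeted | Single self-authentication |

### Lateral Movement (T1021 / T1550.002)

| Indicator | Suspicious | Normal |
|---|---|---|
| Logon type | 3 (network) from an unexpected source | 3 from a known admin host |
| Account | Standard user on a sensitive host | Administrator on an admin host |
| Timing | Outside working hours, bursts | Working hours, single event |
| Source | Kali / attacker IP | Known internal workstation |

---

## Skills Demonstrated

- Active Directory attack simulation (Kerberoasting, AS-REP Roasting, and lateral movement via Impacket and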
  CrackMapExec)
- Splunk SPL query writing and alert export
- Alert normalization and schema design in Python
- Claude API integration with enforced structured JSON output
- System prompt engineering for a SOC analyst persona
- MITRE ATT&CK technique mapping (T1558.003, T1558.004, T1021.002, T1550.002)
- AI triage auditing: identifying hallucinations and failure modes
- Custom SOC dashboard development (HTML/CSS/JavaScript)
- Detection gap analysis and analyst annotation

---

## Cost and Infrastructure

| Resource | Cost |
|---|---|
| Anthropic API (38 alerts + failure tests) | ~$0.58 |
| Splunk Enterprise (trial) | $0 |
| VMware Workstation | Existing license |
| Total project cost | < $1 |

---

## References

- [MITRE ATT&CK Framework](https://attack.mitre.org)
- [Anthropic Claude API documentation](https://docs.anthropic.com)
- [Splunk SPL reference](https://docs.splunk.com/Documentation/Splunk/latest/SearchReference)
- [Windows Security Event IDs](https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/)
- [Impacket suite](https://github.com/fortra/impacket)

**Author:** Durga Sai Sri Ramireddy | MS Cybersecurity, University of Houston

[![LinkedIn](https://img.shields.io/badge/-LinkedIn-0072b1?style=flat&logo=linkedin&logoColor=white)](https://linkedin.com/in/durga-ramireddy) [![GitHub](https://img.shields.io/badge/-GitHub-181717?style=flat&logo=github&logoColor=white)](https://github.com/DurgaRamireddy)

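The normalization phase mentions that the normalizer accepts both NDJSON and JSON array exports. The repository's `normalize.py` is not shown here, so the sketch below is only one possible way to do that detection and field mapping. The helper names are hypothetical, and unwrapping a `result` key is an assumption about the export shape rather than something the source confirms.

```python
import json
import uuid


def load_splunk_export(text: str) -> list[dict]:
    """Parse an export that is either a JSON array or NDJSON (one object per line)."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # JSON array: one document holding every record
        return [item for item in json.loads(stripped) if isinstance(item, dict)]
    # NDJSON: parse each non-empty line on its own
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


def normalize_record(record: dict, attack_type: str) -> dict:
    """Map Splunk field names onto the normalized alert schema."""
    # Assumption: some exports nest the event fields under a "result" key.
    inner = record.get("result")
    fields = inner if isinstance(inner, dict) else record
    return {
        "alert_id": str(uuid.uuid4()),
        "attack_type": attack_type,
        "timestamp": fields.get("_time"),
        "event_code": fields.get("EventCode", "unknown"),
        "source_host": fields.get("ComputerName"),
        "account": fields.get("Account_Name"),
        # 4768/4769 events carry Client_Address, 4624 events carry IpAddress
        "source_ip": fields.get("Client_Address") or fields.get("IpAddress"),
        "service_name": fields.get("Service_Name"),
        "logon_type": fields.get("Logon_Type"),
        "ticket_encryption_type": fields.get("Ticket_Encryption_Type"),
        "raw_message": fields.get("_raw"),
        "log_source": "Splunk/WinEventLog:Security",
    }


if __name__ == "__main__":
    ndjson = '{"EventCode": "4769", "Account_Name": "jsmith"}\n'
    for rec in load_splunk_export(ndjson):
        print(json.dumps(normalize_record(rec, "Kerberoasting"), indent=2))
```

Missing fields are left as `None` rather than filled with defaults, which matches the system prompt's rule that the model should lower its confidence when fields are null.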