# SNN-IDS: Hardware-Verified SNN-Equivalent Intrusion Detection on a General-Purpose MCU NPU
[DOI: 10.5281/zenodo.18906060](https://doi.org/10.5281/zenodo.18906060)
[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[Python](https://www.python.org/)
[Target: STM32N6570-DK](https://www.st.com/en/evaluation-tools/stm32n6570-dk.html)
[Key Results](#key-results)
**The first hardware-verified deployment of an INT8-quantized ANN (approximately equivalent to a T=1 SNN) for real-time network intrusion detection on a general-purpose MCU NPU.**
## Key Results
| Metric | NSL-KDD (5-class) | UNSW-NB15 (10-class) |
|--------|-------------------|----------------------|
| **Overall accuracy** | 78.86 ± 1.32% | 64.75 ± 0.61% |
| **Macro F1** | 59.20 ± 2.80% | 40.29 ± 0.90% |
| **Inference latency** | 0.46 ms | **0.29 ms** |
| **NPU execution** | 5 HW + 1 Hyb + 2 SW | 4 HW (100% NPU) |
| **Flash / RAM** | 137.7 KB / 1.25 KB | 120.6 KB / 0.50 KB |
| **Evaluation** | 10 seeds, mean ± std | 10 seeds, mean ± std |
**Target board:** STM32N6570-DK (ARM Cortex-M55 @ 800 MHz + Neural-ART NPU, 600 GOPS INT8)
## Novelty
To our knowledge, this work is:
1. **The first IDS deployment on an ARM Cortex-M NPU (Neural-ART)**: prior MCU-class IDS work targeted the MAX78000 (Ngo et al., 2022), an AI-specialized MCU with a fixed-function CNN accelerator
2. **The first empirical validation of T=1 SNN-ANN equivalence on a commodity NPU chip**: final-prediction agreement between the FP32 and INT8 models reaches 99%
3. **The first QCFS activation compiled for a Neural-ART target**: the Floor operator is confirmed to fall back to CPU execution, adding 17.6% latency
## Theoretical Basis
A single-timestep (T=1) SNN with zero initial membrane potential produces a forward pass that is approximately equivalent to an INT8-quantized ANN with ReLU activations:
```
T=1 SNN inference ≈ INT8 Quantized ANN inference
```
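A minimal numpy sketch of this equivalence (illustrative, not code from the repo): at L=1, a shift-free QCFS-style activation coincides exactly with one integrate-and-fire step that starts from zero membrane potential. The common +1/2 pre-shift of Bu et al.'s QCFS is deliberately omitted here to match the zero-initial-potential case stated above.

```python
import numpy as np

def qcfs(x, lam=1.0, L=4):
    # shift-free QCFS-style activation: a clipped ReLU quantized to L levels
    # (the +1/2 pre-shift of Bu et al. is omitted to match the
    #  zero-initial-membrane-potential case described above)
    return lam / L * np.clip(np.floor(x * L / lam), 0, L)

def t1_if(x, theta=1.0):
    # one integrate-and-fire step, membrane starts at 0:
    # v = 0 + x; emit a spike of magnitude theta iff v >= theta
    return theta * (x >= theta).astype(x.dtype)

x = np.linspace(-1.0, 2.0, 13)
# at L=1 the quantized activation and the T=1 IF step coincide exactly
assert np.allclose(t1_if(x, theta=1.0), qcfs(x, lam=1.0, L=1))
```

Increasing L refines the staircase toward a clipped ReLU, which is what the INT8 quantizer approximates as well.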
Key references:
- Bu et al., "Optimal ANN-SNN Conversion" (QCFS), **ICLR 2022**
- Jiang et al., "Unified Optimization Framework", **ICML 2023**
- Bu et al., "Inference-Scale Complexity", **CVPR 2025**
## Architecture
```
IDS_MLP: Linear(d→256) → BN → σ → Linear(256→256) → BN → σ → Linear(256→128) → BN → σ → Linear(128→C)
```
- `d` = 41 (NSL-KDD) or 34 (UNSW-NB15), `C` = 5 or 10
- `σ` = ReLU (Path B) or QCFS with L=4 (Path A)
- BatchNorm is fused into the preceding Linear at export time → the ONNX graph contains only `Gemm` + `Relu`
- 111,365 (NSL-KDD) or 110,218 (UNSW-NB15) parameters
- Inverse-frequency class weighting to handle the severe class imbalance
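The BatchNorm fusion mentioned above folds each inference-mode BN into the preceding Linear, which is what leaves only `Gemm` + `Relu` in the exported graph. A small numpy sketch with randomly generated stand-in parameters (the repo's actual export lives in `src/export_onnx.py`):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 41, 256  # NSL-KDD input dim -> first hidden layer

# Linear parameters and inference-mode BatchNorm statistics (random stand-ins)
W = rng.normal(size=(d_out, d_in))
b = rng.normal(size=d_out)
gamma, beta = rng.normal(size=d_out), rng.normal(size=d_out)
mu = rng.normal(size=d_out)
var, eps = rng.uniform(0.5, 2.0, size=d_out), 1e-5

# BN(Wx + b) = gamma * (Wx + b - mu) / sqrt(var + eps) + beta = W'x + b'
s = gamma / np.sqrt(var + eps)
W_fused = s[:, None] * W
b_fused = s * (b - mu) + beta

x = rng.normal(size=d_in)
y_ref = gamma * (W @ x + b - mu) / np.sqrt(var + eps) + beta
# a single Gemm with the fused weights reproduces Linear followed by BN
assert np.allclose(W_fused @ x + b_fused, y_ref)
```

Because the fusion is exact (a reparameterization, not an approximation), it costs no accuracy and removes one op per layer before quantization.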
## NPU Hardware Benchmarks
All models were benchmarked on the STM32N6570-DK via ST Edge AI Developer Cloud v4.0.0:
| Model | Dataset | Inference | HW | Hyb | SW | Flash | RAM |
|-------|---------|-----------|---:|----:|---:|-------|-----|
| ReLU FP32 (CPU) | NSL-KDD | 1.24 ms | 0 | 0 | 11 | 466.4 KB | 2.17 KB |
| **ReLU INT8 (NPU)** | **NSL-KDD** | **0.46 ms (2.7×)** | 5 | 1 | 2 | 137.7 KB | 1.25 KB |
| ReLU FP32 (CPU) | UNSW-NB15 | 1.23 ms | 0 | 0 | 11 | 461.9 KB | 2.14 KB |
| **ReLU INT8 (NPU)** | **UNSW-NB15** | **0.29 ms (4.2×)** | 4 | 0 | 0 | 120.6 KB | 0.50 KB |
| QCFS INT8 | NSL-KDD | 0.54 ms | 13 | 1 | 14 | 138.0 KB | 2.00 KB |
Key findings:
- **The NPU delivers a 2.7–4.2× speedup over CPU-only execution of the same model**
- **The Floor operator is not supported by the Neural-ART NPU**: it falls back to the CPU and executes as `Floor(float)`
- **ReLU INT8 is the optimal NPU path**: every Gemm+Relu pair runs on the NPU, with no CPU fallback for activations
- **Tree-based models (RF, XGBoost) cannot run on the STM32N6**: `TreeEnsembleClassifier` is rejected by ST Edge AI Core ("NOT IMPLEMENTED")
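To illustrate why the Gemm+Relu path stays NPU-friendly, here is a hedged numpy sketch of symmetric per-tensor INT8 post-training quantization: the matmul accumulates in int32 and needs only one rescale before ReLU, with no float-domain op (like Floor) inside the layer. The scheme ST Edge AI Core actually applies (per-channel scales, zero points, calibration) may differ.

```python
import numpy as np

def quantize_int8(t):
    # symmetric per-tensor INT8 quantization (illustrative PTQ, not ST's exact scheme)
    scale = float(np.max(np.abs(t))) / 127.0
    q = np.clip(np.round(t / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 4)).astype(np.float32)
x = rng.normal(size=4).astype(np.float32)

qW, sW = quantize_int8(W)
qx, sx = quantize_int8(x)

# integer Gemm accumulated in int32, rescaled once, then ReLU:
# the whole layer is integer arithmetic plus a single multiply
acc = qW.astype(np.int32) @ qx.astype(np.int32)
y_int8 = np.maximum(acc * (sW * sx), 0.0)
y_fp32 = np.maximum(W @ x, 0.0)
assert np.allclose(y_int8, y_fp32, atol=0.2)
```

A QCFS layer, by contrast, introduces an explicit Floor in the float domain, which is the op the compiler reports as a CPU fallback.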
## Reproduction
```
# Setup
python3 -m venv snn-ids-env
source snn-ids-env/bin/activate
pip install -r requirements.txt
# Download datasets
mkdir -p data
# NSL-KDD: place KDDTrain+.txt and KDDTest+.txt in data/
# UNSW-NB15: place the parquet files in data/
# Run experiments
make train # Train ReLU model (single seed, NSL-KDD)
make multiseed # 10-seed experiment (NSL-KDD, ReLU vs QCFS)
make unsw # 10-seed experiment (UNSW-NB15)
make unsw-export # ONNX + INT8 for UNSW-NB15
make tree-baseline # RF + XGBoost baselines
make layerwise # FP32 vs INT8 layer-wise analysis
make quant-ablation # 24-config quantization ablation
make paper # Compile LaTeX paper
# NPU validation (requires a browser)
# Upload models/*.onnx to https://stedgeai-dc.st.com
# Select target: STM32N6570-DK → Benchmark
```
## Project Structure
```
├── src/
│ ├── train.py # ReLU model training (Path B)
│ ├── train_qcfs.py # QCFS model training (Path A)
│ ├── export_onnx.py # ONNX export with BN fusion (NSL-KDD)
│ ├── export_qcfs_onnx.py # QCFS ONNX export
│ ├── export_unsw_onnx.py # ONNX export + INT8 PTQ (UNSW-NB15)
│ ├── quantize.py # INT8 PTQ (ReLU, NSL-KDD)
│ ├── quantize_qcfs.py # INT8 PTQ (QCFS)
│ ├── quantize_ablation.py # 24-config quantization ablation
│ ├── experiment_multiseed.py # 10-seed experiment (NSL-KDD)
│ ├── experiment_unsw.py # 10-seed experiment (UNSW-NB15)
│ ├── tree_baseline.py # RF + XGBoost baselines
│ └── layerwise_analysis.py # FP32 vs INT8 layer-wise comparison
├── results/
│ ├── multiseed_experiment.json # NSL-KDD 10-seed results
│ ├── unsw_multiseed_experiment.json # UNSW-NB15 10-seed results
│ ├── tree_baselines.json # RF + XGBoost results
│ ├── layerwise_analysis.json # FP32 vs INT8 analysis
│ ├── quantize_ablation.json # Quantization ablation
│ └── related_work_table.json # Related work comparison
├── paper/ # LaTeX paper
├── configs/
│ └── default.yaml # Experiment configuration
├── docs/
│ ├── ADR-001-SNN-NPU-GoNoGo-Verification.md
│ └── SNN_RTOS_Telecom_Analysis.md
├── CITATION.cff # Citation metadata
├── requirements.txt # Pinned dependencies
├── Makefile # One-command reproducibility
└── LICENSE # Apache 2.0
```
## Citation
```
@software{tsai2026snnids,
title = {SNN-IDS: Deploying SNN-Equivalent Intrusion Detection on a Commodity MCU NPU},
author = {Tsai, Hsiu-Chi},
year = {2026},
url = {https://github.com/thc1006/SpikeIDS-MCU},
doi = {10.5281/zenodo.18906060},
version = {1.0.0}
}
```
## References
- **QCFS Activation**: Bu et al., "Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks," *ICLR 2022*.
- **Unified ANN-SNN Framework**: Jiang et al., "A Unified Optimization Framework of ANN-SNN Conversion," *ICML 2023*.
- **Inference-Scale Complexity**: Bu et al., "Inference-Scale Complexity in ANN-SNN Conversion," *CVPR 2025*.
- **NSL-KDD Dataset**: Tavallaee et al., *IEEE CISDA*, 2009.
- **UNSW-NB15 Dataset**: Moustafa & Slay, *MilCIS*, 2015.
- **Neural-ART NPU**: STMicroelectronics, STM32N6 Application Note UM3225.
- **HH-NIDS (MAX78000)**: Ngo et al., *Future Internet* 15(1):9, 2022.
- **Akida IDS**: Zahm et al., *CSIAC*, 2024.
## License
Apache License 2.0. See [LICENSE](LICENSE).
Tags: ARM Cortex-M55, INT8 quantization, MCU, MLP, Neural-ART, NPU, NSL-KDD, QCFS activation, SNN, SNN-ANN equivalence, STM32N6, T=1, UNSW-NB15, intrusion detection system, real-time inference, embedded security, microcontrollers, machine learning, deep learning, IoT security, hardware acceleration, network traffic analysis, spiking neural networks, edge computing