DavidNguyen1812/acoustic-side-channel-on-keyboard
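A reproduction of a keyboard acoustic side channel attack built on the CoAtNet deep learning model. It predicts the letters and digits a user types from phone or Zoom recordings.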
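# Acoustic Side Channel Attack on Keyboard Keystrokes
This project is based on the paper "Acoustic Side Channel Attack on Keyboards" by Joshua Harrison and his co-authors. The attack collects raw .wav audio of typing on a 2021 MacBook Pro, recorded by a phone and by Zoom, uses an FFT to isolate the individual keystroke sounds, converts them to spectrograms, and feeds them to a pre-trained CoAtNet model that predicts which laptop key was pressed (36 possible classes, [a-z0-9]).
Two models have been trained and evaluated so far, each in two versions:
1. Zoom audio model (V1 and V2), for audio recorded through Zoom
2. Phone audio model (V1 and V2), for audio recorded on a phone
## Repository structure: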
```
├── AudioPreprocessing
│ ├── 0-9 # Keystrokes extracted by running the extraction program
│ ├── 0-9-clean.wav # Audio of all the extracted keystrokes combined
│ ├── 0-9.wav # Original audio before extraction
│ ├── RecordingKeystrokeIsolator.ipynb # Keystroke isolator code
│ └── keystrokeExtractionLogic # Data flow of how the keystroke isolator works
├── CoATNet # Four pre-trained CoAtNet models
├── InferencePhrase # Inference stage of the models
│ ├── 0-9 # Isolated keystrokes from a sample attack recording
│ ├── Models # The 4 pre-trained CoAtNet models and their training + eval results
│ ├── 0-9.wav # Original attack recording
│ └── InferencePhrase.ipynb # Jupyter notebook that runs the models in the inference phase to predict keystrokes from attacker recordings
├── Keystroke-Datasets # Clean keystroke dataset obtained from /Botacin-s-Lab/EchoCrypt/
├── Reading # The original Harrison et al. paper that lays the foundation of this project
└── README.md # This file
```
## Keystroke dataset:
The original dataset is available at this [link](https://github.com/Botacin-s-Lab/EchoCrypt/tree/main/dataset)
## Pipeline:
**Training and validation of the CoAtNet model**
```
Clean Keystroke Data -> Spectrogram Transformation -> Building the CoAtNet Architecture -> Training and Validation -> Performance Metrics
```
**Carrying out the actual attack**
```
Raw Audio Recording -> FFT Algorithm to Extract Keystroke Audio -> Spectrogram Transformation -> Pre-Trained CoAtNet -> Prediction
```
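Both pipelines pass each isolated keystroke through a spectrogram transformation. As a rough illustration of that step, the sketch below computes a log-mel spectrogram with NumPy only. It is not the notebooks' code, which may rely on librosa or torchaudio. The helper names and the parameter values (`n_fft=1024`, `hop=256`, `n_mels=64`) are assumptions chosen for the example, and the random clip merely stands in for a real keystroke.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style mel formula.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Log-mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    if signal.size < n_fft:
        signal = np.pad(signal, (0, n_fft - signal.size))
    starts = range(0, signal.size - n_fft + 1, hop)
    frames = np.stack([signal[s:s + n_fft] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)  # small offset avoids log(0)


sr = 44100
rng = np.random.default_rng(0)
clip = rng.standard_normal(sr * 3 // 10)  # 0.3 s of noise standing in for one keystroke
spec = mel_spectrogram(clip, sr)
print(spec.shape)  # (n_mels, n_frames)
```

The resulting 2-D array is what would then be converted to a tensor and treated as an image-like input by the model.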
## Key features:
**Training and validation stage**
```
1. Uses a clean keystroke dataset (a-z, 0-9) recorded on a 2021 MacBook Pro, obtained from https://github.com/Botacin-s-Lab/EchoCrypt/tree/main/dataset
2. Transforms the isolated keystrokes into Mel spectrograms and Torch tensors.
3. Uses the CoAtNet architecture, which combines convolution and transformer layers, to train on the tensors.
4. Tracks the performance of the model using:
   Loss
   Accuracy
   Precision
   Recall
   Weighted F1 Score
   Confusion Matrix
5. Uses early stopping to prevent overfitting and detects training plateaus to avoid unnecessary training epochs.
6. Automatically saves the best model for the later inference stage.
7. Generates training, validation, confusion matrix, precision, recall, and F1 score visualizations across the training and validation process.
```
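The early stopping and best-model saving described in the list above can be sketched roughly as follows. This is a minimal illustration, not the notebooks' implementation: the `EarlyStopping` class, its `patience` and `min_delta` parameters, and the validation losses are all hypothetical.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: a real training loop would save a checkpoint here.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, one per epoch, purely for illustration.
val_losses = [2.9, 2.4, 2.1, 2.2, 2.15, 2.3, 2.25, 2.4]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best epoch {stopper.best_epoch}")
```

Keeping the checkpoint from the best epoch rather than the last one is what lets the inference stage reuse the model from before it started to overfit.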
**Carrying out the actual attack**
```
1. Extracts keystroke sounds from the recorded audio using the Fast Fourier Transform (FFT) algorithm to enhance the SNR.
2. Transforms the isolated keystrokes into Mel spectrograms and Torch tensors.
3. Passes the tensors to the pre-trained CoAtNet models to obtain the keystroke predictions.
```
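One common way to isolate keystrokes with an FFT is to sum the FFT magnitudes of short overlapping frames into an energy curve and cut a fixed-length clip wherever that energy crosses a threshold. The sketch below illustrates the idea on a synthetic recording. It is an assumption about the general approach, not a copy of `RecordingKeystrokeIsolator.ipynb`; the function name, `threshold_ratio`, `clip_seconds`, and the synthetic bursts are all hypothetical.

```python
import numpy as np


def isolate_keystrokes(signal, sr, n_fft=1024, hop=256,
                       threshold_ratio=0.2, clip_seconds=0.3):
    """Return (start, end) sample indices of candidate keystrokes."""
    window = np.hanning(n_fft)
    starts = np.arange(0, signal.size - n_fft + 1, hop)
    if starts.size == 0:
        return []
    # Energy per frame: sum of FFT magnitudes across all frequency bins.
    energy = np.array([
        np.abs(np.fft.rfft(signal[s:s + n_fft] * window)).sum() for s in starts
    ])
    threshold = threshold_ratio * energy.max()
    clip_len = int(clip_seconds * sr)
    segments = []
    next_free = 0  # first sample not already covered by a previous clip
    for s, e in zip(starts, energy):
        if e >= threshold and s >= next_free:
            end = min(int(s) + clip_len, signal.size)
            segments.append((int(s), end))
            next_free = end
    return segments


# Synthetic recording: faint noise plus three decaying 3 kHz bursts.
sr = 16000
rng = np.random.default_rng(0)
signal = 0.001 * rng.standard_normal(3 * sr)
t = np.arange(800) / sr
burst = np.sin(2 * np.pi * 3000 * t) * np.exp(-t / 0.01)
onsets = [8000, 24000, 40000]
for onset in onsets:
    signal[onset:onset + burst.size] += burst

segments = isolate_keystrokes(signal, sr)
print(segments)
```

A fixed threshold like this is sensitive to typing speed and loudness: quiet keystrokes fall below it and fast ones can land inside the previous clip, so real recordings usually need the threshold tuned per recording.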
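## Model training and evaluation results:
**Phone model**\
Training accuracy: 0.5486\
Validation accuracy: 0.5333\
Precision: 0.5539\
Recall: 0.5333\
F1: 0.5360

**Zoom model**\
Training accuracy: 0.6153\
Validation accuracy: 0.3667\
Precision: 0.3579\
Recall: 0.3667\
F1: 0.3532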
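For reference, support-weighted precision, recall, and F1 can be computed as in the sketch below, assuming the reported scores are weighted averages over the 36 classes (the feature list mentions a weighted F1 score). The function name and the toy labels are hypothetical. One property worth noting: support-weighted recall always equals overall accuracy, which is consistent with the recall and validation accuracy values above being identical for both models.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1), averaged with class-support weights."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    return accuracy, precision, recall, f1


# Toy labels for illustration only; these are not the project's data.
y_true = list("aaabbbccc1")
y_pred = list("aabbbcccc1")
accuracy, precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

With scikit-learn installed, `precision_recall_fscore_support(y_true, y_pred, average="weighted")` computes the same three averages.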
## Requirements:
$ Jupyter Notebook installed\
$ Python version 3.11+\
$ pip install torch torchvision torchaudio\
$ pip install librosa numpy scikit-learn matplotlib tqdm\
$ An iPhone 16 with a recording app (Voice Memos)\
$ The Zoom app with recording enabled and background noise suppression set to **MEDIUM**\
$ 2021 MacBook Pro
## Usage:
Follow the Jupyter Notebook in each directory.
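## Known issues:
1. The key set we handle is incomplete: there is no space bar, no period (.), and no underscore. As a result, this code may fail in a scenario where you are in a Zoom meeting trying to eavesdrop on someone typing login credentials, since most passwords nowadays require characters such as a comma, a period, or an underscore. Other factors can also affect the model's predictions, such as typing speed, the loudness of the keystrokes, and background noise (fans or air conditioning).
2. We did not use the same setup as the original authors, who placed a microfiber cloth under the phone.
3. The isolation script sometimes fails to extract every keystroke (for example, if a sentence has 35 valid characters, it may capture only 32).
4. We lacked the computing power to increase model complexity and improve accuracy.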
## Contributors:
1. David Nguyen - DavidNguyen1812
2. Sarah Soliman - sarahsolimans