marysubaja/SideChannelAttack
GitHub: marysubaja/SideChannelAttack
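A correlation power analysis attack tool targeting the inverse S-Box of AES decryption. It recovers the key byte by byte from power traces using the Hamming weight model and the Pearson correlation coefficient.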
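# Side-Channel Attack
A differential power analysis (DPA) attack on the Advanced Encryption Standard (AES), specifically targeting the inverse S-Box in the decryption process.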
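## 1. Introduction
**Differential Power Analysis (DPA)** is a form of side-channel attack (SCA) that exploits the statistical relationship between the power consumption of a hardware device (such as a smart card or microprocessor) and the intermediate data it is processing.
Whereas traditional cryptanalysis treats the algorithm as a mathematical "black box", DPA looks at the physical implementation. The provided code implements the **Correlation Power Analysis (CPA)** variant: it uses the **Hamming weight** model to correlate predicted power consumption with the recorded traces and recover the secret AES key byte by byte.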
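
The Hamming weight model mentioned above can be illustrated with a few lines of Python. This is a minimal sketch: the names `HAMMING_WEIGHT` and `predicted_leakage` are hypothetical and are not taken from the repository, which keeps its own precomputed table.

```python
# Precompute the Hamming weight (number of bits set to 1) of every byte value.
HAMMING_WEIGHT = [bin(value).count("1") for value in range(256)]


def predicted_leakage(intermediate: int) -> int:
    """Return the modelled power contribution of one intermediate byte."""
    return HAMMING_WEIGHT[intermediate & 0xFF]


# Under this model, a byte with more set bits is predicted to draw more power.
print(predicted_leakage(0x00))  # 0
print(predicted_leakage(0xEB))  # 6
print(predicted_leakage(0xFF))  # 8
```

The table lookup replaces a per-byte bit count, which matters when the model is evaluated for every trace and every key guess.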
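## 2. Literature Survey
The field of power analysis was pioneered by **Paul Kocher** in 1999, who demonstrated that even tiny variations in power consumption can leak secrets. Key milestones include:
* **Simple Power Analysis (SPA):** Directly inspecting a power trace to identify instructions (for example, telling a "square" from a "multiply" in RSA).
* **Differential Power Analysis (DPA):** Using statistical methods (difference of means) to extract a signal from noise.
* **Correlation Power Analysis (CPA):** An evolution of DPA (proposed by Brier et al. in 2004) that uses the **Pearson correlation coefficient** to match power traces against a power model such as the Hamming weight. This is the method used in the provided code.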
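
For reference, the Pearson correlation coefficient used by CPA can be written as follows. The notation is an assumption introduced here, not taken from the repository: $h_{i,k}$ is the predicted leakage for trace $i$ under key guess $k$, $p_{i,t}$ is the measured power of trace $i$ at time sample $t$, $N$ is the number of traces, and bars denote means over the traces.

$$
r_{k,t} = \frac{\sum_{i=1}^{N} \left(h_{i,k} - \bar{h}_{k}\right)\left(p_{i,t} - \bar{p}_{t}\right)}{\sqrt{\sum_{i=1}^{N} \left(h_{i,k} - \bar{h}_{k}\right)^{2}}\,\sqrt{\sum_{i=1}^{N} \left(p_{i,t} - \bar{p}_{t}\right)^{2}}}
$$

The value lies between $-1$ and $+1$; the key guess $k$ whose $|r_{k,t}|$ is largest at some sample $t$ is the most likely candidate.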
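## 3. Applications
* **Security audits:** Hardware manufacturers use scripts like this to test chips for leakage before mass production.
* **Forensics:** Extracting keys from locked or protected hardware devices.
* **Smart card security:** Testing the resilience of credit cards and SIM cards against physical tampering.
* **Cryptographic research:** Developing and validating countermeasures such as masking or shuffling.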
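## 4. Algorithm: Correlation Power Analysis (CPA)
The algorithm follows a structured statistical procedure:
1. **Data collection:** Capture the power traces ($iTraces$) and the corresponding ciphertexts ($iCiphertext$).
2. **Power modelling:** For every possible key guess ($0$ to $255$), predict the power consumption from the Hamming weight of the inverse S-Box output.
3. **Statistical correlation:** Use Pearson correlation to compare the predictions with the measured power at every time sample.
4. **Key extraction:** The guess with the highest absolute correlation peak is taken as the most likely secret key byte.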
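
The four steps above can be sketched end to end on synthetic data. This is a minimal sketch under stated assumptions, not the repository's `perform_dpa`: the function names (`gf_mul`, `build_inverse_sbox`, `cpa_byte`), the simulated traces, and the noise level are all made up for illustration. The key byte `0xEB` and the leak position (sample 24) are borrowed from the run described in the results section only so the toy data resembles it. The inverse S-Box is derived from the AES definition instead of being hard-coded, and the correlation is normalized so that values fall between -1 and +1.

```python
import numpy as np


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_inverse_sbox() -> np.ndarray:
    """Derive the AES S-Box from its definition and return the inverted table."""
    inv_sbox = np.zeros(256, dtype=np.uint8)
    for x in range(256):
        # Multiplicative inverse in GF(2^8); 0 has none and maps to 0.
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        # Affine step: XOR the inverse with four left rotations, then 0x63.
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_sbox[s ^ 0x63] = x
    return inv_sbox


INV_SBOX = build_inverse_sbox()
HW = np.array([bin(v).count("1") for v in range(256)], dtype=np.float64)


def cpa_byte(ciphertext_bytes: np.ndarray, traces: np.ndarray):
    """Run CPA for one key byte; return (best_guess, correlation matrix)."""
    guesses = np.arange(256, dtype=np.uint8)
    # Step 2: hypothesis[i, k] = HW(InvSBox(ciphertext[i] XOR k)).
    hyp = HW[INV_SBOX[ciphertext_bytes[:, None] ^ guesses[None, :]]]
    # Step 3: centre both matrices, multiply, then normalise to Pearson's r.
    hyp_c = hyp - hyp.mean(axis=0)
    tr_c = traces - traces.mean(axis=0)
    cov = hyp_c.T @ tr_c  # shape: (256 guesses, n_samples)
    norm = np.outer(np.linalg.norm(hyp_c, axis=0), np.linalg.norm(tr_c, axis=0))
    corr = cov / norm
    # Step 4: pick the guess with the largest absolute peak over all samples.
    best = int(np.argmax(np.abs(corr).max(axis=1)))
    return best, corr


# Step 1 (simulated): Gaussian noise plus a Hamming weight leak at one sample.
rng = np.random.default_rng(0)
n_traces, n_samples, true_key, leak_sample = 500, 100, 0xEB, 24
ct = rng.integers(0, 256, size=n_traces, dtype=np.uint8)
traces = rng.normal(0.0, 1.0, size=(n_traces, n_samples))
# The leak is subtracted so the peak is negative, as in the described plot.
traces[:, leak_sample] -= HW[INV_SBOX[ct ^ np.uint8(true_key)]]

best, corr = cpa_byte(ct, traces)
peak = int(np.argmax(np.abs(corr[best])))
print(f"recovered byte: {best:#04x}, peak at sample {peak}")
```

Computing all 256 hypotheses with one indexed lookup and all correlations with one matrix product is what keeps the attack fast; a full attack would repeat `cpa_byte` for each of the 16 ciphertext byte positions.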
## 5. Pseudocode
```
FUNCTION perform_dpa(Ciphertexts, Traces):
    Initialize RecoveredKeys[16]
    Precompute HammingWeightTable for 0-255
    FOR each ByteIndex from 0 to 15:
        FOR each KeyGuess from 0 to 255:
            FOR each TraceIndex:
                Value = Ciphertexts[TraceIndex, ByteIndex] XOR KeyGuess
                IntermediateState = InverseSBox(Value)
                Hypothesis[TraceIndex, KeyGuess] = HammingWeight(IntermediateState)
        # Center the data (subtract the mean of each column)
        Center(Hypothesis)
        Center(Traces)
        # Calculate correlation (unnormalized, one row per key guess)
        CorrelationMatrix = MatrixMultiply(Hypothesis.Transpose, Traces)
        # Find the key guess whose row holds the largest absolute value
        BestGuess = RowIndexOfMax(Absolute(CorrelationMatrix))
        RecoveredKeys[ByteIndex] = BestGuess
    RETURN RecoveredKeys
```
## 6. Explanation of the Provided Code
### A. Constants and Tables
* `inv_s`: The inverse S-Box is used because the attack targets the **decryption** path, where the ciphertext is XORed with a round key and passed through the inverse substitution layer. In a standard AES-128 implementation the key bytes recovered this way belong to the round key applied to the ciphertext (the last round key of encryption); the original key follows by running the key schedule backwards.
* `hamming_weight_8bit_table`: A precomputed list of how many "1" bits each byte value contains. This is a common proxy for power consumption: the model assumes that the power drawn grows with the number of bits set to "1".
### B. The `perform_dpa` Function
* **Centering:** The code subtracts the mean from the traces and from the hypotheses. With both centered, the numerator of the Pearson correlation formula (the covariance) reduces to a plain matrix multiplication, which speeds up the calculation significantly. As described, the product is not divided by the standard deviations, so the values are scaled covariances rather than coefficients between $-1$ and $+1$.
* **Hypothesis Matrix:** It generates $256$ predictions for every single trace, one per key guess.
* **Matrix Multiplication:** `np.dot(hyp_hw.T, mean_traces)` is the "engine". It computes the correlation between all 256 guesses and all time samples at once.
* **Visualization:** It uses `matplotlib` to plot the correlation. In a successful attack, the correct key guess shows a distinct, high spike compared to the noise of the incorrect guesses.
## 7. Conclusion
The implementation demonstrates how side-channel leakage can bypass the mathematical strength of AES. Even if the AES algorithm is "unbreakable" by brute force, an unprotected physical implementation can leak enough information through power consumption to recover a 128-bit key in seconds from only a few hundred traces. The efficiency of this code relies on **vectorized NumPy operations**, which let it handle large datasets.
## 8. Extensions and Future Research
* **Multi-Variate Attacks:** Attacking multiple points in time simultaneously to bypass certain protections.
* **Deep Learning SCA:** Using Convolutional Neural Networks (CNNs) to recover keys from traces that are "desynchronized" (where the spikes do not line up perfectly).
* **Countermeasures:**
  * **Masking:** Combining the data with a fresh random mask so that the Hamming weight of the processed value is no longer predictable.
  * **Hiding:** Inserting "dummy" cycles or random delays to jitter the traces in time.
* **Targeting Other Layers:** Extending the attack to the `MixColumns` or `AddRoundKey` layers for different AES implementations.
## 9. Results
The figure is the visual output of the **Correlation Power Analysis (CPA)** step within the DPA algorithm. It shows how well the mathematical "guesses" about the key match the physical reality of the power consumption.
### The X and Y Axes
* **X-axis (Time Samples):** Represents the specific moments in time during the AES operation. Because the run used `num_test_samples = 100`, the plot spans samples 0 to 100.
* **Y-axis (Correlation):** Represents the statistical correlation between the Hamming weight hypothesis and the power trace. A higher absolute value (positive or negative) indicates a stronger match.
### The Gray "Noise" Lines
The gray background is a "spaghetti plot" of the correlation traces for all **incorrect key guesses**.
* Because a single byte has 256 possible values, the code calculates 256 correlation lines.
* For the 255 wrong keys, the power consumption hypothesis is essentially random relative to the actual data, so their correlation stays low and oscillates randomly around zero.
### The Red Line
The red line represents the **correct key guess** (in this case, `0xeb`).
* **The Spike:** Note the sharp downward spike around **Sample 24**. This is the most important feature of the graph.
* At this point in time, the hardware was likely processing the inverse S-Box operation. Because the hypothesis for `0xeb` matches what the hardware actually did, the correlation stands out clearly from the background noise.
### Why Is the Spike Negative?
In power analysis, a negative spike is just as valid as a positive one. Correlation can range from $-1$ to $+1$. A strong negative correlation simply means that as the predicted Hamming weight increases, the measured power decreases (or vice versa), for example because of the polarity of the measurement setup or the way the specific CMOS logic switches.
### Summary of Results
In this specific run:
* **The Algorithm's Logic:** The code looked at all 256 lines, found that the line for `0xeb` reached the highest absolute magnitude (approx. **-130** in this scaled result), and concluded that `0xeb` is the secret key byte.
* **Success Metric:** If the red line were buried deep inside the gray mass, the attack would have failed (and would likely need more traces to reduce the noise). Since the red spike is clearly visible, the attack was successful.
### How to Prevent an Attack Like This
To stop this algorithm from working, engineers use several "shielding" techniques:
1. **Masking (the "math cloak").** This is the most common defense. Instead of processing the real data ($X$), the chip generates a random number ($M$) and processes $(X \oplus M)$.
   * **Why it works:** The power consumption now correlates with a randomized value that is statistically independent of the attacker's hypothesis. The spikes in the graph would disappear into the gray noise.
2. **Hiding (the "noise generator").** The chip can be designed so that its power draw depends as little as possible on whether it is processing a 0 or a 1.
   * **Current flattening:** Using special hardware (such as dual-rail logic) so that every operation ideally consumes the same amount of power.
   * **Noise insertion:** Adding a random noise generator to the chip to "drown out" the real signal, making it much harder for the correlation to produce a spike.
3. **Shuffling and jitter (the "timing shell game").** The algorithm relies on the spike occurring at a fixed time (such as Sample 24 in the figure).
   * **Shuffling:** The chip processes the 16 bytes of AES in a different, random order every time.
   * **Dummy operations:** The chip performs "fake" computations at random intervals.
   * **Result:** The leakage moves around from trace to trace, so it is spread over many time samples and the correlation peak at any single sample shrinks.
4. **Protocol-level limits.** Some systems hinder DPA simply by limiting the number of operations an attacker can observe.
   * Since DPA requires hundreds or thousands of power traces to "see" through the noise, a smart card might lock itself permanently if it detects the same key being used too many times in a suspicious manner.
### Summary Table
| Feature | This Code (The Attack) | The Prevention (The Defense) |
| --- | --- | --- |
| Goal | Extract the secret key. | Keep the key hidden. |
| Method | Correlates power to data. | Breaks the link between power and data. |
| Visibility | Looks for "spikes" in a graph. | Tries to make the graph look like flat noise. |
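
The effect of the masking countermeasure described above can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a model of any real device: the helper names (`hamming_weight`, `pearson`, `simulate`), the Gaussian noise level, and the number of traces are hypothetical, and the attacker is assumed to observe only the leakage of the masked value $X \oplus M$.

```python
import random


def hamming_weight(value: int) -> int:
    """Count the bits set to 1 in a byte."""
    return bin(value & 0xFF).count("1")


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(masked: bool, n_traces: int = 5000, seed: int = 1) -> float:
    """Correlate the attacker's prediction HW(X) with the simulated leakage."""
    rng = random.Random(seed)
    predictions, leakage = [], []
    for _ in range(n_traces):
        x = rng.randrange(256)  # intermediate value the attacker predicts
        mask = rng.randrange(256) if masked else 0  # fresh mask per trace
        predictions.append(hamming_weight(x))
        # The device leaks the Hamming weight of what it actually processes.
        leakage.append(hamming_weight(x ^ mask) + rng.gauss(0.0, 1.0))
    return pearson(predictions, leakage)


unmasked_corr = simulate(masked=False)
masked_corr = simulate(masked=True)
print(f"unmasked: {unmasked_corr:+.3f}   masked: {masked_corr:+.3f}")
```

Without a mask the prediction tracks the leakage closely; with a fresh uniform mask per trace the correlation collapses toward zero. A real masked implementation must also handle the mask itself somewhere, which is why higher-order attacks that combine several leakage points remain a concern.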