rylanrodrigues/ai-log-analysis-agent
GitHub: rylanrodrigues/ai-log-analysis-agent
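A machine-learning-based full-stack log analysis system that helps analysts classify logs efficiently and automate checks on the accuracy of security alerts.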
# AI Log Analysis Agent
AI Log Analysis Agent is a full-stack web application designed to help analyze log data more efficiently. The project focuses on identifying different types of logs and supporting the verification of alert results.
The system can classify logs into categories such as system logs, application logs, and security logs. It can also help determine whether an alert is a true positive, true negative, false positive, or false negative. This can be useful in situations where large amounts of log data need to be reviewed and security or system alerts must be checked carefully.
## Project Overview
Modern applications, servers, and security tools generate a large amount of log data. Reviewing these logs manually can take a lot of time, especially when alerts need to be verified. Some alerts may point to real issues, while others may be incorrect or harmless.
This project was created to make the log review process easier. It provides a simple web interface where users can upload log data, analyze individual log messages, train a model, and view previous results.
The goal of the project is not to replace human review, but to assist analysts by giving them a faster way to organize and understand log data.
## Key Features
- Upload log datasets
- Analyze individual log messages
- Classify logs as system, application, or security logs
- Predict whether an alert is a true positive, true negative, false positive, or false negative
- Train a machine learning model
- View model accuracy and performance
- Store analyzed results in a PostgreSQL database
- View previous analysis results through a simple dashboard
## Technologies Used
### Python
Python is used for the backend and machine learning logic. It handles the main application code, data processing, model training, and prediction.
### FastAPI
FastAPI is used to build the backend API. It provides the routes used by the frontend to upload logs, analyze log messages, train the model, and fetch stored results.
### PostgreSQL
PostgreSQL is used as the database for the project. It stores uploaded logs, predictions, confidence scores, and review results.
### SQLAlchemy
SQLAlchemy is used to connect the Python backend with the PostgreSQL database. It helps define database tables and makes it easier to read and write data from Python.
### Scikit-learn
Scikit-learn is used for the machine learning model. The model is trained using sample log data and reviewed examples so it can predict log types and alert verification results.
### HTML, CSS, and JavaScript
The frontend is built using plain HTML, CSS, and JavaScript. No frontend framework is used. This keeps the project simple, lightweight, and easier to understand.
### Uvicorn
Uvicorn is used to run the FastAPI application locally during development.
## Frontend Pages
The frontend is split into multiple pages to keep the dashboard simple and organized.
### Dashboard
The dashboard gives a basic overview of the project and provides navigation to the main sections of the application.
### Model Training
The model training page allows the user to train the machine learning model and view performance details such as accuracy, precision, recall, and F1 score.
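For reference, the four metrics listed above can all be derived from the same confusion counts. The sketch below is illustrative only: the function name `binary_metrics` and the example counts are assumptions, and the project itself may compute these values with scikit-learn instead.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged, how much was a real issue.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real issues, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 40 real issues caught, 45 normal events left alone,
# 5 false alarms, and 10 missed issues.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these made-up counts, accuracy is 85/100 and recall is 40/50, which shows how a model can look accurate overall while still missing a fifth of the real issues.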
### Upload Dataset
The upload page allows the user to upload a dataset of logs for analysis.
### Analyze Logs
The analyze page allows the user to enter a single log message and receive a prediction. The result includes the log type, alert verification result, confidence score, and explanation.
### Results
The results page shows previously analyzed logs that have been saved in the database.
## Log Categories
The project currently supports the following log categories:
- System log
- Application log
- Security log
## Alert Verification Categories
The project can classify alert results into the following categories:
- True Positive
- True Negative
- False Positive
- False Negative
### True Positive
A true positive means the system correctly detected a real issue.
Example: An alert is raised for a login attack, and the attack actually happened.
### True Negative
A true negative means the system correctly identified that there was no issue.
Example: A normal login happens, and no alert is raised.
### False Positive
A false positive means the system reported an issue, but there was no real problem.
Example: Normal user activity is incorrectly marked as suspicious.
### False Negative
A false negative means the system missed a real issue.
Example: An actual attack happens, but the system does not detect it.
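The four definitions above reduce to two yes/no questions: was an alert raised, and was there a real issue? The sketch below encodes that mapping for cases where the ground truth is already known. The names `AlertOutcome` and `verify_alert` are hypothetical and not taken from the project, whose model predicts these categories from log text rather than from known answers.

```python
from enum import Enum


class AlertOutcome(str, Enum):
    TRUE_POSITIVE = "True Positive"
    TRUE_NEGATIVE = "True Negative"
    FALSE_POSITIVE = "False Positive"
    FALSE_NEGATIVE = "False Negative"


def verify_alert(alert_raised: bool, issue_was_real: bool) -> AlertOutcome:
    """Map (alert raised?, real issue?) to one of the four categories."""
    if alert_raised and issue_was_real:
        return AlertOutcome.TRUE_POSITIVE
    if alert_raised and not issue_was_real:
        return AlertOutcome.FALSE_POSITIVE
    if not alert_raised and issue_was_real:
        return AlertOutcome.FALSE_NEGATIVE
    return AlertOutcome.TRUE_NEGATIVE


# Normal user activity that was flagged as suspicious.
outcome = verify_alert(alert_raised=True, issue_was_real=False)
```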
## How to Run the Project
### 1. Open the project folder
```bash
cd "/Users/rylanrodrigues/Documents/Autonomous AI Log Analysis Agent"
```
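### 2. Start PostgreSQL
```bash
brew services start postgresql@16
```
### 3. Create the database
```bash
createdb log_analysis
```
If the database already exists, you do not need to create it again.
### 4. Create the environment file
```bash
cp .env.example .env
```
The `.env` file stores the database connection string.
Example: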
```
DATABASE_URL=postgresql+psycopg://your_username@localhost:5432/log_analysis
APP_NAME=AI Log Analysis Agent
```
If needed, replace `your_username` with the username on your local machine.
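As a rough illustration of how the backend might consume these two values, the sketch below reads them from a mapping of environment variables. It assumes the contents of `.env` have already been loaded into the environment (this README does not show how the project does that), and the names `Settings` and `load_settings` are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    app_name: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from environment variables, failing fast if required ones are missing."""
    env = os.environ if env is None else env
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set; copy .env.example to .env first")
    # APP_NAME is treated as optional here, with the README's value as the default.
    return Settings(
        database_url=database_url,
        app_name=env.get("APP_NAME", "AI Log Analysis Agent"),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings(
    {"DATABASE_URL": "postgresql+psycopg://your_username@localhost:5432/log_analysis"}
)
```

Failing at startup when `DATABASE_URL` is missing tends to be easier to debug than a connection error raised later by the first database query.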
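### 5. Activate the virtual environment
If the `.venv` folder does not exist yet, create it first with `python -m venv .venv`.
```bash
source .venv/bin/activate
```
### 6. Install the required dependencies
```bash
python -m pip install -r requirements.txt
```
### 7. Run the application
```bash
uvicorn app.main:app --reload
```
### 8. Open the application
Open the following URL in your browser: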
```
http://127.0.0.1:8000
```
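## Using the Project
After starting the application, open the dashboard in your browser. Use the navigation bar at the top to switch between pages.
Use the Upload Dataset page to upload log files.
Use the Analyze Logs page to test a single log message.
Use the Model Training page to train the machine learning model and check its performance.
Use the Results page to view logs that have already been analyzed and saved.
## Project Structure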
```
Autonomous AI Log Analysis Agent/
│
├── app/
│ ├── static/
│ │ ├── index.html
│ │ ├── model.html
│ │ ├── upload.html
│ │ ├── analyze.html
│ │ ├── results.html
│ │ ├── styles.css
│ │ └── app.js
│ │
│ ├── analyzer.py
│ ├── config.py
│ ├── database.py
│ ├── main.py
│ ├── models.py
│ ├── parsers.py
│ └── schemas.py
│
├── tests/
│ └── test_analyzer.py
│
├── .env.example
├── .gitignore
├── README.md
└── requirements.txt
```
## Important Files
### `app/main.py`
This is the main backend file. It creates the FastAPI application and defines the API routes used by the frontend.
### `app/analyzer.py`
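This file contains the machine learning logic. It trains the model and generates predictions for log type classification and alert verification.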
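The README does not say which algorithm the project uses, so the following is only a sketch of one common approach to text classification with scikit-learn: TF-IDF features fed into logistic regression. The training messages and labels are invented for the example and are far too few for real use.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set; real training data would be much larger.
messages = [
    "kernel: CPU temperature above threshold",
    "systemd: Started daily cleanup of temporary directories",
    "app: NullPointerException in OrderService.checkout",
    "app: request GET /api/orders completed in 120 ms",
    "sshd: Failed password for root from 203.0.113.7",
    "firewall: blocked inbound connection on port 23",
]
labels = ["system", "system", "application", "application", "security", "security"]

# The pipeline turns raw text into TF-IDF vectors, then fits the classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(messages, labels)

new_message = "sshd: Failed password for admin from 198.51.100.4"
predicted_type = model.predict([new_message])[0]
# The highest class probability can serve as a rough confidence score.
confidence = float(model.predict_proba([new_message]).max())
```

A second pipeline trained on reviewed examples could predict the alert verification category in the same way, assuming labeled outcomes are available.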
### `app/database.py`
This file manages the database connection.
### `app/models.py`
This file defines the database tables used by the project.
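A table definition for stored results might look roughly like the sketch below. The class name `AnalyzedLog`, the table name, and the columns are hypothetical, chosen to match the fields the README says are stored. An in-memory SQLite database is used only so the example runs without a PostgreSQL server.

```python
from sqlalchemy import Column, DateTime, Float, Integer, String, Text, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class AnalyzedLog(Base):
    """Hypothetical table holding one analyzed log message."""

    __tablename__ = "analyzed_logs"

    id = Column(Integer, primary_key=True)
    message = Column(Text, nullable=False)
    log_type = Column(String(32), nullable=False)
    verification = Column(String(32), nullable=False)
    confidence = Column(Float, nullable=False)
    created_at = Column(DateTime, server_default=func.now())


# In-memory SQLite keeps the sketch self-contained; the project itself
# would pass its PostgreSQL DATABASE_URL to create_engine instead.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(
        AnalyzedLog(
            message="sshd: Failed password for root from 203.0.113.7",
            log_type="security",
            verification="True Positive",
            confidence=0.91,  # made-up value for illustration
        )
    )
    session.commit()
    stored = session.query(AnalyzedLog).one()
    stored_type = stored.log_type
```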
### `app/schemas.py`
This file defines the data formats used by the API.
### `app/parsers.py`
This file contains helper logic for reading and processing uploaded log files.
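The README does not specify which log formats are supported, so the sketch below assumes a simple line shape of timestamp, level, and message, and falls back to treating the whole line as the message when that shape does not match. `ParsedLog`, `parse_log_text`, and the regular expression are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterator, Optional

# Assumed line shape: "2024-05-01 12:00:00 ERROR message text".
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.+)$"
)


@dataclass
class ParsedLog:
    message: str
    timestamp: Optional[str] = None
    level: Optional[str] = None


def parse_log_text(text: str) -> Iterator[ParsedLog]:
    """Yield one ParsedLog per non-empty line of an uploaded file."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = LINE_PATTERN.match(line)
        if match:
            yield ParsedLog(match["message"], match["timestamp"], match["level"])
        else:
            # Unknown format: keep the raw line so nothing is dropped.
            yield ParsedLog(message=line)


sample = (
    "2024-05-01 12:00:00 ERROR Failed password for root from 10.0.0.5\n"
    "\n"
    "service restarted\n"
)
logs = list(parse_log_text(sample))
```

Keeping unparsed lines instead of discarding them means an unfamiliar log format still reaches the classifier, just without the extra timestamp and level fields.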
### `app/static/`
This folder contains the frontend files, including the HTML pages, CSS styles, and JavaScript behavior.
### `requirements.txt`
This file lists all the Python packages required to run the project.
### `.env.example`
This file provides an example environment configuration.
### `.gitignore`
This file tells Git which files should not be uploaded to GitHub, such as `.env`, `.venv`, and cache files.
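## Notes
The `.env` file contains private configuration and should not be uploaded to GitHub. The repository should include only the `.env.example` file.
This project uses a basic machine learning model and sample training data. For production use, the model should be trained on a larger and more realistic dataset.
## Future Improvements
Possible future improvements include:
- User authentication
- Charts and visual reports
- Support for more log formats
- A larger training dataset
- More thorough model evaluation
- Search and filtering for results
- CSV export
- Role-based access control for analysts and administrators
## Conclusion
AI Log Analysis Agent is a full-stack project that combines a backend API, machine learning, database storage, and a simple web dashboard. It is designed to support log analysis by helping users classify logs and review alert results more efficiently.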