Singh-Manit/SSH-LOG-ANALYZER
# log_analyzer
A Python command-line tool for parsing authentication logs, detecting brute-force login attempts, and flagging suspicious IPs. Built as a learning project to understand how SOC analysts approach log triage.
## Background
I built this after reading about how a large chunk of SOC work involves going through auth logs manually or with basic grep commands to find failed login patterns. The goal was to automate the repetitive parts — counting failures per IP, checking if they happened close together in time, figuring out if the same IP hit multiple accounts — and surface only what actually needs attention.
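The sliding window approach for brute-force detection came from thinking about how a simple total-failure counter ignores timing. Five failures spread across a whole day of logs look the same to it as five failures in one minute, so it either flags users who occasionally mistype a password or needs a threshold high enough that a short burst slips under it. Checking how many failures happened within any 5-minute span isolates the bursts. The trade-off is that an attacker who sends one attempt every 10 minutes over 8 hours never has more than one failure in a window, so that slow pattern is not caught by this check.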
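A minimal sketch of the sliding-window count, not the script's actual code. The function name `max_failures_in_window` is hypothetical, and treating two failures exactly one window apart as belonging to different windows is an assumption.

```python
from datetime import datetime, timedelta

def max_failures_in_window(timestamps, window_seconds=300):
    """Return the largest number of failures that fall inside any single window."""
    ordered = sorted(timestamps)
    window = timedelta(seconds=window_seconds)
    best = 0
    start = 0  # index of the oldest failure still inside the window
    for end, current in enumerate(ordered):
        # Drop failures that are a full window or more older than the current one.
        while current - ordered[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best

base = datetime(2025, 1, 10, 8, 0, 0)
burst = [base + timedelta(seconds=10 * i) for i in range(6)]   # 6 failures in under a minute
spread = [base + timedelta(minutes=10 * i) for i in range(6)]  # the same 6 failures, 10 minutes apart

print(max_failures_in_window(burst))   # 6
print(max_failures_in_window(spread))  # 1
```

Both lists contain six failures, so a plain total would treat them identically. Only the windowed count separates the burst from the spread-out attempts.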
## What it detects
**Brute force** — more than N failed logins from the same IP within a configurable time window. Default is 5 failures in 5 minutes. Severity scales with how far over the threshold the count is.
**Credential stuffing** — a single IP targeting 10 or more distinct usernames. This is the pattern you see when someone is running a list of leaked credentials rather than targeting one specific account.
**Post-failure success** — a successful login from an IP that previously had multiple failures. This is the most actionable alert because it might mean the attacker actually got in.
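The credential stuffing and post-failure success checks can be sketched in one pass over the parsed events. This is an illustration under stated assumptions, not the script's implementation: the `(ip, username, succeeded)` tuple shape, the function name `find_suspicious_ips`, counting only usernames seen in failed attempts, and reading "multiple failures" as two or more are all hypothetical choices.

```python
from collections import defaultdict

def find_suspicious_ips(events, min_usernames=10, min_failures=2):
    """events: iterable of (ip, username, succeeded) tuples in log order."""
    usernames = defaultdict(set)   # ip -> distinct usernames seen in failed logins
    failures = defaultdict(int)    # ip -> running count of failed logins
    success_after_failures = set()

    for ip, username, succeeded in events:
        if succeeded:
            # A success only matters here if enough failures came first.
            if failures[ip] >= min_failures:
                success_after_failures.add(ip)
        else:
            failures[ip] += 1
            usernames[ip].add(username)

    stuffing = {ip for ip, names in usernames.items() if len(names) >= min_usernames}
    return stuffing, success_after_failures

events = [("203.0.113.9", f"user{i}", False) for i in range(10)]
events += [("192.168.1.50", "root", False)] * 6
events += [("192.168.1.50", "root", True)]

stuffing, post_failure = find_suspicious_ips(events)
print(stuffing)      # {'203.0.113.9'}
print(post_failure)  # {'192.168.1.50'}
```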
## Requirements
- Python 3.8 or newer
- No external libraries required — uses only the standard library
## Installation
    git clone https://github.com/Singh-Manit/SSH-LOG-ANALYZER.git
    cd SSH-LOG-ANALYZER
That's it. No pip install needed.
## Usage
### Run on a real log file
    python log_analyzer.py /var/log/auth.log
### Run the built-in demo (no file needed)
    python log_analyzer.py
This uses a hardcoded sample log embedded in the script, which is useful for testing or showing someone how the tool works without needing access to an actual server.
### JSON output
    python log_analyzer.py /var/log/auth.log --json
Outputs structured JSON instead of the terminal report. Useful if you want to pipe results into another tool or store them somewhere.
### Change detection thresholds
    # Flag after 3 failures instead of 5
    python log_analyzer.py /var/log/auth.log --threshold 3

    # Use a 60-second window instead of 5 minutes
    python log_analyzer.py /var/log/auth.log --window 60

    # Both at once
    python log_analyzer.py /var/log/auth.log --threshold 3 --window 60
### Full help
    python log_analyzer.py --help
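For reference, a command line like the one above could be declared with `argparse` roughly as follows. The flag names and defaults come from this README. The help strings, the `build_parser` name, and the exact argument layout are assumptions, and the real script may differ.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        prog="log_analyzer.py",
        description="Parse authentication logs and flag suspicious IPs.",
    )
    # Optional positional: when omitted, the tool falls back to its built-in demo log.
    parser.add_argument("logfile", nargs="?", default=None, help="path to a log file")
    parser.add_argument("--threshold", type=int, default=5, help="failures before an IP is flagged")
    parser.add_argument("--window", type=int, default=300, help="time window in seconds")
    parser.add_argument("--json", action="store_true", help="emit JSON instead of the terminal report")
    return parser

args = build_parser().parse_args(["/var/log/auth.log", "--threshold", "3", "--window", "60"])
print(args.logfile, args.threshold, args.window, args.json)
```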
## Log formats supported
The parser handles three formats:
- **sshd** — standard OpenSSH auth log lines (`Failed password for user from IP`)
- **Apache/Nginx access logs** — combined log format, flags 401 and 403 responses
- **Generic failed auth** — a catch-all regex that looks for keywords like FAILED, UNAUTHORIZED, 401, 403 near an IP address
If a line does not match any pattern it is skipped. The summary at the top of the report shows how many lines were parsed vs. total, so you can tell if the format is being picked up correctly.
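To show what skipping unmatched lines looks like in practice, here is a sketch of a matcher for the sshd format only. The regex and the `parse_line` helper are illustrative assumptions rather than the patterns the script uses, and the sample lines are made up.

```python
import re

# Matches lines such as:
#   Jan 10 08:01:24 host sshd[1234]: Failed password for invalid user admin from 192.168.1.50 port 22 ssh2
SSHD_FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)

def parse_line(line):
    """Return (user, ip) for a failed sshd login, or None if the line does not match."""
    match = SSHD_FAILED.search(line)
    if match is None:
        return None
    return match.group("user"), match.group("ip")

lines = [
    "Jan 10 08:01:24 host sshd[1234]: Failed password for invalid user admin from 192.168.1.50 port 22 ssh2",
    "Jan 10 08:01:30 host sshd[1234]: Failed password for root from 192.168.1.50 port 22 ssh2",
    "Jan 10 08:02:00 host cron[99]: session opened for user root",
]
parsed = [p for p in map(parse_line, lines) if p is not None]
print(f"parsed {len(parsed)} of {len(lines)} lines")  # parsed 2 of 3 lines
```

The cron line is dropped without an error, which is why the parsed/total figures are worth checking.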
## Output explained
### Terminal report
    Lines processed : 2847
    Lines parsed    : 2841
    Unique IPs seen : 312
    Alerts raised   : 4
    -- ALERTS -------------------------------------------------------
    [CRITICAL] [2025-01-10 08:01:24] BRUTE_FORCE -- 192.168.1.50: 6 failed logins in 300s window
    [HIGH    ] [2025-01-10 08:05:18] CREDENTIAL_STUFFING -- 203.0.113.9: 10 distinct accounts targeted
    [MEDIUM  ] [2025-01-10 08:01:24] SUCCESS_AFTER_FAILURES -- 192.168.1.50: login succeeded after 6 failures
Alerts are sorted by severity (CRITICAL first). Each alert shows the timestamp of the last relevant event for that IP, not the first one.
### JSON output
    {
      "meta": {
        "generated_at": "2025-01-10T08:15:00",
        "total_lines": 2847,
        "parsed_lines": 2841,
        "unique_ips": 312
      },
      "alerts": [...],
      "top_ips": [...]
    }
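A short sketch of consuming this JSON from another Python script, using only the `meta` fields shown above. The `alerts` and `top_ips` entries are elided in the example, so they are left as empty lists here rather than guessed at. `parse_ratio` is a hypothetical helper name.

```python
import json

def parse_ratio(report):
    """Fraction of input lines that matched a known log format."""
    meta = report["meta"]
    if meta["total_lines"] == 0:
        return 0.0
    return meta["parsed_lines"] / meta["total_lines"]

# Stand-in for the tool's --json output; alerts/top_ips contents are not specified here.
raw = """
{
  "meta": {"generated_at": "2025-01-10T08:15:00", "total_lines": 2847, "parsed_lines": 2841, "unique_ips": 312},
  "alerts": [],
  "top_ips": []
}
"""
report = json.loads(raw)
print(f"{parse_ratio(report):.1%} of lines parsed")  # 99.8% of lines parsed
```

A ratio well below 100% is a hint that the log format is not being recognized.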
## Sample log file
`sample_auth.log` is included in the repo. It has examples of each attack pattern the tool detects plus some normal traffic. Good for testing changes to thresholds without needing access to a live server.

    python log_analyzer.py sample_auth.log
## Project structure
    log_analyzer/
        log_analyzer.py     main script
        sample_auth.log     example log with known attack patterns
        README.md           this file
## Limitations
**No GeoIP lookup.** The tool works entirely offline and has no way to map IPs to countries or ASNs. Adding that would require the MaxMind GeoLite2 database or a paid API, which I want to explore later.
**Regex-based parsing breaks on non-standard formats.** If your sshd or Apache is configured to produce a different log format than the defaults, lines will be silently skipped. The parsed/total ratio in the output is the indicator to watch.
**Timestamps without years.** Standard syslog format (used by sshd) does not include the year in timestamps. The script assumes the current year, which means logs from December analyzed in January will have incorrect timestamps for window calculations.
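One possible workaround, sketched here and not implemented in the tool: try the current year first and fall back to the previous year if that would place the entry in the future. The helper name `parse_syslog_timestamp` is hypothetical, and `%b` assumes English month abbreviations (the default C locale).

```python
from datetime import datetime

def parse_syslog_timestamp(text, now):
    """Parse a yearless syslog timestamp, choosing the latest year that is not in the future."""
    for year in (now.year, now.year - 1):
        try:
            candidate = datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # e.g. 'Feb 29' combined with a non-leap year
        if candidate <= now:
            return candidate
    raise ValueError(f"cannot place timestamp in a plausible year: {text!r}")

now = datetime(2025, 1, 2, 9, 0, 0)
print(parse_syslog_timestamp("Dec 31 23:59:10", now))  # 2024-12-31 23:59:10
print(parse_syslog_timestamp("Jan  2 08:00:00", now))  # 2025-01-02 08:00:00
```

This only handles logs less than a year old. Older archives would still need the year supplied some other way.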
**IPv6 not supported.** The IP matching regex only handles IPv4 addresses. Modern systems with IPv6 enabled will have those addresses skipped.
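A sketch of one way to accept both address families using the standard library's `ipaddress` module in place of an IPv4-only regex. Splitting on whitespace is an assumption that fits sshd lines (`from <ip> port <n>`), and `extract_ips` is a hypothetical helper, not part of the script.

```python
import ipaddress

def extract_ips(line):
    """Return every whitespace-separated token that is a valid IPv4 or IPv6 address."""
    found = []
    for token in line.split():
        try:
            found.append(ipaddress.ip_address(token))
        except ValueError:
            continue  # not an IP address (timestamps, ports, words, ...)
    return found

v6_line = "Jan 10 08:01:24 host sshd[1234]: Failed password for root from 2001:db8::1 port 22 ssh2"
v4_line = "Jan 10 08:01:30 host sshd[1234]: Failed password for root from 192.168.1.50 port 22 ssh2"

print([str(ip) for ip in extract_ips(v6_line)])  # ['2001:db8::1']
print([str(ip) for ip in extract_ips(v4_line)])  # ['192.168.1.50']
```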
**Not a replacement for a real SIEM.** This is a single-host, single-file tool. It has no correlation across multiple log sources, no persistent database, no alerting integration, and no dashboard. It is meant for quick manual analysis or learning, not production monitoring.
**False positives on shared IPs.** If your organization uses a proxy or NAT gateway, many users may share one external IP. The tool will flag that IP for brute force or credential stuffing even if the traffic is legitimate. Whitelisting is not currently implemented.
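If a whitelist were added, a check along these lines could run before any alert is raised. This is a sketch only: the networks listed are placeholders, and `WHITELIST` and `is_whitelisted` are hypothetical names.

```python
import ipaddress

# Placeholder entries; a real list would hold your own proxy or NAT gateway addresses.
WHITELIST = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("198.51.100.7/32"),
]

def is_whitelisted(ip_text, networks=WHITELIST):
    """Return True if the address falls inside any whitelisted network."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip in network for network in networks)

print(is_whitelisted("10.1.2.3"))      # True
print(is_whitelisted("203.0.113.9"))   # False
```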
## Possible improvements
- Add a whitelist for known-good IPs
- GeoIP integration to flag high-risk countries
- SQLite backend for persistent tracking across multiple log files
- Email or Slack alert on CRITICAL findings
- Support for IPv6 addresses
- Configurable output file path for the JSON report