GitHub: erichutchins/geoipsed
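A high-performance, Rust-based tool for decorating IP addresses with geolocation data. It stream-processes large volumes of logs and adds City, Country, ASN, and other metadata on the fly.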
# geoipsed
_Fast, inline geolocation decoration of IPv4 and IPv6 addresses written in Rust_
IP geolocation enriches logs with City, Country, ASN, and timezone metadata. `geoipsed` finds and decorates IP addresses in-place, leaving existing context intact—perfect for incident response and network analysis.
## Quick Start
```
cargo install geoipsed
echo "Connection from 81.2.69.205 to 175.16.199.37" | geoipsed
```
Output:
```
Connection from <81.2.69.205|AS0_|GB|London> to <175.16.199.37|AS0_|CN|Changchun>
```
## Features
- IPv4 and IPv6 support with strict validation
- City, Country, ASN, and timezone metadata
- Flexible templating via `-t/--template`
- Inline decoration or JSON output modes (`--tag`, `--tag-files`)
- Fine-grained filtering: `--all`, `--no-private`, `--no-loopback`, `--no-broadcast`
- Color support with `-C/--color`
- Streaming input (stdin or multiple files)
- ~100x faster than Python implementations
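The filtering flags map onto standard address categories. As a rough illustration only (not how `geoipsed` implements it), the sketch below uses Python's standard `ipaddress` module to sort addresses into comparable buckets. The `category` and `keep` helpers and their parameter names are hypothetical, and the exact ranges `geoipsed` treats as private or broadcast may differ.

```python
import ipaddress

IPV4_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def category(text):
    """Return one coarse category for an IP address string (hypothetical helper)."""
    addr = ipaddress.ip_address(text)  # raises ValueError on invalid input
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:
        return "link-local"
    if addr == IPV4_BROADCAST:
        return "broadcast"
    if addr.is_private:
        return "private"
    return "public"


def keep(text, no_private=False, no_loopback=False, no_broadcast=False):
    """Decide whether to keep an address, in the spirit of the --no-* flags."""
    kind = category(text)
    if no_private and kind == "private":
        return False
    if no_loopback and kind == "loopback":
        return False
    # The help text groups broadcast and link-local under --no-broadcast.
    if no_broadcast and kind in ("broadcast", "link-local"):
        return False
    return True


addrs = ["81.2.69.205", "10.0.0.1", "127.0.0.1", "169.254.1.1"]
public_only = [
    a for a in addrs
    if keep(a, no_private=True, no_loopback=True, no_broadcast=True)
]
print(public_only)  # ['81.2.69.205']
```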
## Databases
Supports MaxMind (default), IP2Location, and IPinfo MMDB formats. Specify the database directory with `-I` or the `GEOIP_MMDB_DIR` environment variable.
## Usage
```
geoipsed --help
Inline decoration of IPv4 and IPv6 address geolocations

Usage: geoipsed [OPTIONS] [FILE]...

Arguments:
  [FILE]...  Input file(s) to process. Leave empty or use "-" to read from stdin

Options:
  -o, --only-matching   Show only nonempty parts of lines that match
  -C, --color           Use markers to highlight the matching strings [default: auto] [possible values: always, never, auto]
  -t, --template        Specify the format of the IP address decoration. Use the --list-templates option to see which fields are available. Field names are enclosed in {}, for example "{field1} any fixed string {field2} & {field3}"
      --tag             Output matches as JSON with tag information for each line
      --tag-files       Output matches as JSON with tag information for entire files
      --all             Include all types of IP addresses in matches
      --no-private      Exclude private IP addresses from matches
      --no-loopback     Exclude loopback IP addresses from matches
      --no-broadcast    Exclude broadcast/link-local IP addresses from matches
      --only-routable   Only include internet-routable IP addresses (requires valid ASN entry)
      --provider        Specify the MMDB provider to use (default: maxmind) [default: maxmind]
  -I                    Specify directory containing the MMDB database files [env: GEOIP_MMDB_DIR=]
      --list-providers  List available MMDB providers and their required files
  -L, --list-templates  Display a list of available template substitution parameters to use in --template format string
  -h, --help            Print help
  -V, --version         Print version
```
## Examples
```
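# Decoration mode
geoipsed access.log
# Only the matching IPs (with decoration)
geoipsed -o access.log
# Custom template
geoipsed -t "{ip} in {country_iso}" access.log
# Filtering: public IPs only
geoipsed --no-private --no-loopback --no-broadcast access.log
# Advanced: JSON output with match ranges and before/after decoration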
geoipsed --tag access.log
```
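A template is a string with `{field}` placeholders that are filled in for each matched address. The sketch below mimics that idea with Python's `str.format_map`; it is an illustration, not `geoipsed`'s implementation. The `LOOKUP` table is a hypothetical stand-in for an MMDB query, with country codes copied from the sample output near the top of this README, and the `decorate` helper is made up for the example. Only the `{ip}` and `{country_iso}` fields from the custom-template example are used.

```python
import re

# Hypothetical stand-in for an MMDB lookup; a real tool would query a database.
LOOKUP = {
    "81.2.69.205": {"country_iso": "GB"},
    "175.16.199.37": {"country_iso": "CN"},
}

# Loose IPv4 candidate pattern; it does not validate octet ranges.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def decorate(line, template):
    """Replace each known IP in `line` with the rendered template."""
    def render(match):
        ip = match.group(0)
        fields = LOOKUP.get(ip)
        if fields is None:
            return ip  # leave addresses without metadata untouched
        return template.format_map({"ip": ip, **fields})

    return IPV4_CANDIDATE.sub(render, line)


print(decorate("Connection from 81.2.69.205 to 175.16.199.37",
               "{ip} in {country_iso}"))
# Connection from 81.2.69.205 in GB to 175.16.199.37 in CN
```

In this sketch a template that names a field missing from the lookup result raises `KeyError`; how `geoipsed` handles missing fields is not covered here.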
### Extracting IPs only
For scenarios where you only need a raw list of IP addresses (like `grep -o` but faster and with IP validation), use the standalone `justips` tool:
```
cargo install justips
justips access.log
```
`justips` is a specialized, zero-dependency version of the extraction engine that is ~45% faster than `ripgrep` for finding IPs.
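To make the "`grep -o` with IP validation" idea concrete, here is a minimal Python sketch: a loose regex finds IPv4-looking candidates, and the standard `ipaddress` module rejects the invalid ones. This only illustrates the concept and is not how `justips` works internally. The `extract_ipv4` helper is hypothetical, and IPv6 is omitted for brevity.

```python
import ipaddress
import re

# Matches anything shaped like a dotted quad, including invalid ones.
CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ipv4(text):
    """Return syntactically valid IPv4 addresses in order of appearance."""
    found = []
    for match in CANDIDATE.finditer(text):
        candidate = match.group(0)
        try:
            ipaddress.IPv4Address(candidate)  # raises ValueError if invalid
        except ValueError:
            continue
        found.append(candidate)
    return found


print(extract_ipv4("ok 81.2.69.205, bad 999.1.1.1, ok 10.0.0.1"))
# ['81.2.69.205', '10.0.0.1']
```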
## Performance
`geoipsed` is highly optimized for sequential IP extraction, even outperforming `ripgrep` itself for this specific task.
Benchmarked against a **1.7GB Suricata log** (15.4M lines, 30.7M IP matches):
| Tool | Mode | Time | Throughput | Speedup |
| :--- | :--- | :---: | :---: | :---: |
| **`justips`** | **Parallel mmap + DFA** | **857ms** | **~2 GiB/s** | **7.2x** |
| `ripgrep` | `rg -ao` (v4/v6 regex) | 6.17s | ~275 MiB/s | Baseline |
| Python (`re`) | `IPRE.sub()` (baseline) | 431s | ~4 MiB/s | 0.01x |
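The Speedup column is the baseline (`ripgrep`) time divided by each tool's time. A small Python check of that arithmetic, using only the figures in the table (the `speedup` helper is just for illustration):

```python
def speedup(baseline_seconds, tool_seconds):
    """How many times faster a tool is than the baseline."""
    return baseline_seconds / tool_seconds


RIPGREP_SECONDS = 6.17   # baseline row
JUSTIPS_SECONDS = 0.857  # 857ms
PYTHON_SECONDS = 431.0

print(round(speedup(RIPGREP_SECONDS, JUSTIPS_SECONDS), 1))  # 7.2
print(round(speedup(RIPGREP_SECONDS, PYTHON_SECONDS), 2))   # 0.01

# 30.7M matches in 857ms works out to roughly 36 million matches per second.
matches_per_second = 30.7e6 / JUSTIPS_SECONDS
```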
For raw IP extraction (no geolocation), use the standalone [`justips`](crates/justips/) tool — it uses parallel mmap processing and is purpose-built for maximum throughput.
**Why is the DFA so fast?** While `ripgrep` is a world-class general search tool, `geoipsed` and `justips` use a specialized, compile-time DFA generated via `regex-automata`. This allows parsing and validating every `IpAddr` during the scan faster than a general regex engine can match the raw text.
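To illustrate how validation can be folded into the automaton itself, here is a tiny hand-written DFA in Python that accepts only decimal octets 0 to 255. It is a teaching sketch, not the `regex-automata` DFA the project generates at compile time. The state names are made up, and rejecting leading zeros is an assumption about what strict validation means here.

```python
def _digits(chars, target):
    """Build transitions sending each character in `chars` to `target`."""
    return {c: target for c in chars}


# Transition table for decimal octets 0-255 without leading zeros.
# Every state except "start" is accepting.
OCTET_DFA = {
    "start": {"0": "zero", "1": "one", "2": "two", **_digits("3456789", "high")},
    "zero": {},                                   # "0" may not be extended
    "one": _digits("0123456789", "two_digits"),   # 10-19, may become 100-199
    "two": {
        **_digits("01234", "two_digits"),         # 20-24, may become 200-249
        "5": "two_five",                          # 25, may become 250-255
        **_digits("6789", "done"),                # 26-29, no third digit
    },
    "two_five": _digits("012345", "done"),        # 250-255
    "two_digits": _digits("0123456789", "done"),  # any third digit is fine
    "high": _digits("0123456789", "done"),        # 30-99, no third digit
    "done": {},
}


def is_octet(text):
    """Run the DFA over `text`; accept if it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = OCTET_DFA[state].get(ch)
        if state is None:
            return False  # no transition: rejected on this character
    return state != "start"


def is_ipv4(text):
    """Check a dotted quad by running the octet DFA on each part."""
    parts = text.split(".")
    return len(parts) == 4 and all(is_octet(p) for p in parts)


print(is_ipv4("81.2.69.205"), is_ipv4("999.1.1.1"))  # True False
```

Because the range check lives in the transition table, an input such as `256` is rejected on the character that makes it invalid, with no separate parse step.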
## Workspace Crates
| Crate | Description |
| :--- | :--- |
| [`ip-extract`](crates/ip-extract/) | Zero-copy IP extraction library — compile-time DFA, defang support, builder pattern |
| [`justips`](crates/justips/) | Standalone CLI for fast IP extraction — parallel mmap, built-in dedup (`-u`, `-U`) |
| [`ipextract`](crates/ipextract-py/) | Python bindings (PyO3 + maturin) — stable ABI, published to [PyPI](https://pypi.org/project/ipextract/) |
## Documentation
Full documentation, architecture details, and benchmarks are available at [GitHub Pages](https://erichutchins.github.io/geoipsed/).
## Contributing
See [CLAUDE.md](CLAUDE.md) for project conventions and coding patterns.