edygert/js_unshroud

GitHub: edygert/js_unshroud

A Playwright-based headless JavaScript monitoring and analysis tool for capturing web page runtime events and analyzing malicious script behavior.


# js_unshroud

A headless JavaScript monitoring and analysis tool that instruments web pages to capture network requests, storage operations, console logs, and other runtime events.

## Prerequisites

- [Bun](https://bun.com) runtime (v1.3.5 or later)
- Node.js (for Playwright's browser dependencies)

## Installation

### Linux/macOS Setup

Follow these steps to set up the development environment on Linux or macOS:

1. **Install Bun runtime:**

       curl -fsSL https://bun.com/install | bash

2. **Reload your shell configuration:**

       # For bash
       . ~/.bashrc

       # For zsh
       . ~/.zshrc

3. **Verify Bun installation:**

       bun --version

4. **Clone the repository and install dependencies:**

       git clone <repository-url>
       cd js_unshroud
       bun install

5. **Install Playwright test dependencies:**

       npm install @playwright/test

6. **Install Chromium browser for Playwright:**

       npx playwright install chromium

7. **Verify installation by running tests:**

       bun test

8. **Build the standalone executable:**

       bun run build

After completing these steps, you'll have a fully functional development environment with the compiled binary at `dist/js_unshroud`.

### Windows Setup

Follow these steps to set up the development environment on Windows:

1. **Install Bun runtime:** Open PowerShell and run:

       powershell -c "irm bun.sh/install.ps1 | iex"

2. **Restart PowerShell** to reload environment variables

3. **Verify Bun installation:**

       bun --version

4. **Clone the repository and install dependencies:**

       git clone <repository-url>
       cd js_unshroud
       bun install

5. **Install Playwright test dependencies:**

       npm install @playwright/test

6. **Install Chromium browser for Playwright:**

       npx playwright install chromium

7. **Verify installation by running tests:**

       bun test

8. **Build the standalone executable:**

       bun run build

After completing these steps, you'll have a fully functional development environment with the compiled binary at `dist/js_unshroud-windows-x64.exe`.

**Note for Windows users:**

- The shell scripts (`*.sh` files) require WSL, Git Bash, or a similar Unix-like environment
- All `bun` commands work natively in PowerShell or Command Prompt
- The compiled executable embeds the Bun runtime (no Bun/Node.js needed at runtime), but still requires its `node_modules/playwright-core` and `instrumentation/` siblings plus a Chromium browser. Use `bun run release:windows` to produce a portable bundle (see [Packaging and Distribution](#packaging-and-distribution))

## Building

Build the compiled executable:

    bun run build

This creates a binary at `dist/js_unshroud`. The binary embeds the Bun runtime (no Bun or Node.js needed at runtime), but it is **not** a single self-contained file: it requires `playwright-core` and the `instrumentation/` scripts to be present on disk next to it (see [Packaging and Distribution](#packaging-and-distribution)). Building alone is enough for local use from the project root; to produce a portable artifact for another machine, use the release scripts below.

### Portable release builds

    bun run release:linux      # dist/js_unshroud-linux-x64.tar.gz
    bun run release:macos      # dist/js_unshroud-macos-x64.tar.gz
    bun run release:macos-arm  # dist/js_unshroud-macos-arm64.tar.gz
    bun run release:windows    # dist/js_unshroud-windows-x64.tar.gz
    bun run release:all        # all of the above

Each `release:*` script builds the binary with `playwright-core` marked external (so no build-machine path is baked into the executable) and packages a tarball whose layout is:

    js_unshroud-<target>/
    ├── js_unshroud-<target>            # the binary
    ├── node_modules/playwright-core/   # vendored Playwright (resolved relative to the binary)
    └── instrumentation/                # browser-context hook scripts

Extract the tarball anywhere and run the binary from inside it. You can also symlink it onto `PATH`: the binary resolves its dependencies via the real path of the executable, so symlinks work.
Browser binaries are supplied separately via `PLAYWRIGHT_BROWSERS_PATH`; see [Packaging and Distribution](#packaging-and-distribution).

## Development

For development, you can run the TypeScript source directly:

    bun run dev

## Running

### Using the built binary (recommended for production)

**Linux/macOS:**

    ./dist/js_unshroud run --url https://example.com --out events.jsonl

**Windows (PowerShell):**

    .\dist\js_unshroud-windows-x64.exe run --url https://example.com --out events.jsonl

**Windows (Command Prompt):**

    dist\js_unshroud-windows-x64.exe run --url https://example.com --out events.jsonl

### Using TypeScript source (development)

Works the same on all platforms:

    bun run dev --url https://example.com --out events.jsonl

### Capture command options

- `--url <url>`: Required. The URL to monitor
- `--out <file>`: Required. Output file path (written in JSONL format)
- `--config <file>`: Optional. Path to instrumentation configuration JSON file

## Analyzing Captured Events

After capturing events, use the `analyze` subcommand to generate human-readable reports from the JSONL output.

### Usage

    js_unshroud analyze --input <file> [options]

### Analyze command options

**Required:**

- `--input <file>`: Path to JSONL events file

**Optional:**

- `--format <format>`: Output format (default: `text`)
  - `text`: Human-readable timeline with timestamps and event summaries
  - `json`: Structured JSON output for programmatic consumption
  - `stats`: Event statistics summary (counts, time span, breakdown)
- `--output <file>`: Write to file instead of stdout (default: stdout)

### Examples

    # Capture events
    js_unshroud --url https://example.com --out events.jsonl

    # Analyze: human-readable timeline to stdout
    js_unshroud analyze --input events.jsonl

    # Analyze: JSON format to file
    js_unshroud analyze --input events.jsonl --format json --output timeline.json

    # Analyze: statistics summary
    js_unshroud analyze --input events.jsonl --format stats

    # Pipeline-friendly: search for network events
    js_unshroud analyze --input events.jsonl | grep "GET"

    # Capture and analyze in sequence
    js_unshroud --url https://example.com --out events.jsonl && \
      js_unshroud analyze --input events.jsonl --format stats

### Workflow

**Typical malware analysis workflow:**

1. **Capture** - Monitor JavaScript execution and capture events:

       js_unshroud --url https://malicious-site.com --out malware.jsonl

2. **Analyze** - Generate a timeline to understand the attack flow:

       js_unshroud analyze --input malware.jsonl

3. **Triage** - Get statistics to identify suspicious patterns:

       js_unshroud analyze --input malware.jsonl --format stats

The analyzer supports all 26 event types captured by the tool, including code execution, network requests, cryptographic operations, download detection, and advanced attack patterns.

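
To make the `stats` format more concrete, here is a minimal sketch of the kind of summary it describes (counts, time span, breakdown by type). This is not the tool's implementation: the `CapturedEvent` and `StatsSummary` shapes and the `parseJsonl` and `summarize` helpers are hypothetical, and only the `type` and `timestamp` fields are assumed, based on the sample events shown elsewhere in this README.

```typescript
// Hypothetical event shape: only the two fields this sketch relies on.
interface CapturedEvent {
  type: string;
  timestamp: number; // milliseconds since epoch, as in the README's sample events
}

// Hypothetical summary shape, loosely modeled on "counts, time span, breakdown".
interface StatsSummary {
  total: number;
  byType: Record<string, number>;
  timeSpanMs: number;
}

// Parse JSONL text (one JSON object per line), skipping blank lines.
function parseJsonl(text: string): CapturedEvent[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as CapturedEvent);
}

// Count events per type and measure the span between the earliest and latest timestamps.
function summarize(events: CapturedEvent[]): StatsSummary {
  const byType: Record<string, number> = {};
  let earliest = Infinity;
  let latest = -Infinity;
  for (const event of events) {
    byType[event.type] = (byType[event.type] ?? 0) + 1;
    earliest = Math.min(earliest, event.timestamp);
    latest = Math.max(latest, event.timestamp);
  }
  const timeSpanMs = events.length > 0 ? latest - earliest : 0;
  return { total: events.length, byType, timeSpanMs };
}

// Usage with three made-up events.
const sampleJsonl = [
  '{"type":"network","timestamp":1640995200000}',
  '{"type":"console","timestamp":1640995200250}',
  '{"type":"network","timestamp":1640995200700}',
].join("\n");

const stats = summarize(parseJsonl(sampleJsonl));
```

Because the summary needs the earliest and latest timestamps across the whole file, a summarizer like this has to see every event before it can report.
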
### Anti-Analysis Detection Workflow

When malware uses `debugger;` statements as an anti-analysis technique, follow this two-pass workflow:

**Pass 1: Detect Anti-Analysis Techniques**

    # Run with debugger detection enabled (default)
    js_unshroud --url https://malicious-site.com --out initial.jsonl

Check for debugger events:

    js_unshroud query --input initial.jsonl --type debugger

If debugger events are detected, the malware is using anti-analysis checks. Note the location and proceed to Pass 2.

**Pass 2: Capture Full Behavior Without Detection Risk**

    # Disable debugger detection to avoid timing-based detection
    echo '{"enableDebuggerDetection": false}' > no-debugger.json
    js_unshroud --url https://malicious-site.com --out full-behavior.jsonl --config no-debugger.json

**Rationale:**

- Pass 1 confirms the malware uses `debugger;` statements and captures the location (valuable intelligence)
- Pass 2 disables debugger detection to eliminate timing-based detection risk (~1-5ms overhead)
- After disabling, subsequent `debugger` statements execute at native speed (~0ms), avoiding detection
- The information about anti-analysis techniques is already gathered from Pass 1

**Note:** The fire-and-forget optimization (implemented by default) already minimizes detection risk by only capturing the first debugger location and disabling the domain immediately. However, for maximum stealth against sophisticated malware with aggressive timing checks (thresholds below 3ms), disable debugger detection entirely in Pass 2.

## Querying Events

The `query` subcommand enables targeted filtering of captured events, allowing malware analysts to quickly find specific patterns without loading entire event logs into memory.

### Usage

    js_unshroud query --input <file> [FILTERS] [OPTIONS]

### Query Command Options

**Required:**

- `--input <file>`: Path to JSONL events file

**Filter Options:**

- `--type <types>`: Event types (comma-separated, e.g., `network,console,code_execution`)
- `--method <method>`: HTTP method for network events (GET, POST, etc.)
- `--url <url>`: Exact URL match for network events
- `--url-regex <pattern>`: Regex URL match for network events (e.g., `api\.evil\.com`)
- `--status <code>`: HTTP status code for network events
- `--level <level>`: Console level for console events (log, warn, error, info, debug)
- `--storage-type <type>`: Storage type for storage events (localStorage, sessionStorage)
- `--operation <operation>`: Storage operation (set, get, remove, clear)
- `--correlation-id <id>`: Match events with a specific correlation ID

**Output Options:**

- `--format <format>`: Output format (default: `jsonl`)
  - `jsonl`: One JSON event per line (pipeline-friendly, streamable)
  - `count`: Print only the count of matching events (fast reconnaissance)
- `--output <file>`: Write to file instead of stdout (default: stdout)

### Examples

    # Find all network requests
    js_unshroud query --input events.jsonl --type network

    # Find POST requests to suspicious domains
    js_unshroud query --input events.jsonl --type network --method POST --url-regex "\\.ru$"

    # Count code execution events (fast)
    js_unshroud query --input events.jsonl --type code_execution --format count

    # Find localStorage operations
    js_unshroud query --input events.jsonl --type storage --storage-type localStorage --operation set

    # Find console errors
    js_unshroud query --input events.jsonl --type console --level error

    # Query and save to file
    js_unshroud query --input events.jsonl --type network --method POST --output suspicious.jsonl

    # Pipeline: query → analyze
    js_unshroud query --input events.jsonl --type code_execution | \
      js_unshroud analyze --input - --format stats

    # Multi-type query
    js_unshroud query --input events.jsonl --type "network,storage,code_execution"

    # Combine multiple filters
    js_unshroud query --input events.jsonl \
      --type network \
      --method GET \
      --url-regex "api\.example\.com" \
      --status 200

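
The filters above combine with AND semantics in the examples (type, method, URL pattern, and status must all match). The sketch below illustrates that idea on JSONL lines held in memory; it is an assumption-laden illustration, not the tool's `QueryFilter` code. The `EventSketch` and `FilterSketch` shapes, the `method`, `url`, and `status` field names, and the `matchesFilter` and `queryJsonl` helpers are all hypothetical.

```typescript
// Hypothetical event shape; the real event schema may use different field names.
interface EventSketch {
  type: string;
  method?: string;
  url?: string;
  status?: number;
}

// Hypothetical filter, mirroring --type, --method, --url-regex, and --status.
interface FilterSketch {
  types?: string[];
  method?: string;
  urlRegex?: RegExp; // use a regex without the "g" flag so test() stays stateless
  status?: number;
}

// An event matches only if it satisfies every filter that was supplied.
function matchesFilter(event: EventSketch, filter: FilterSketch): boolean {
  if (filter.types !== undefined && !filter.types.includes(event.type)) return false;
  if (filter.method !== undefined && event.method !== filter.method) return false;
  if (filter.urlRegex !== undefined) {
    if (event.url === undefined || !filter.urlRegex.test(event.url)) return false;
  }
  if (filter.status !== undefined && event.status !== filter.status) return false;
  return true;
}

// Parse each line independently and keep only the matching events.
function queryJsonl(lines: string[], filter: FilterSketch): EventSketch[] {
  const matched: EventSketch[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line) as EventSketch;
    if (matchesFilter(event, filter)) matched.push(event);
  }
  return matched;
}

// Usage: roughly `--type network --method POST --url-regex "\\.ru$"` on made-up data.
const sampleLines = [
  '{"type":"network","method":"POST","url":"https://api.evil.ru","status":200}',
  '{"type":"network","method":"GET","url":"https://example.com/","status":200}',
  '{"type":"console","level":"error"}',
];

const hits = queryJsonl(sampleLines, {
  types: ["network"],
  method: "POST",
  urlRegex: /\.ru$/,
});
const hitCount = hits.length; // what a count-style output would report
```

Each line is evaluated on its own, which is what makes a line-at-a-time, low-memory implementation possible.
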

### Query vs Analyze

| Feature | query | analyze |
|---------|-------|---------|
| **Purpose** | Filter/search specific events | Format events as timeline/stats |
| **Filtering** | Full QueryFilter support | No filtering (loads all events) |
| **Output** | Raw events (JSONL) or count | Timeline text/JSON or stats summary |
| **Use Case** | "Show me X events" | "Summarize what happened" |
| **Memory** | Streams (low memory) | Buffers all events |
| **Pipeline** | Output can pipe to analyze | End of pipeline |

**Workflow:** `capture → query (filter) → analyze (format)`

### Malware Analysis Workflows

**Triage Workflow:**

    # 1. Quick reconnaissance - count suspicious event types
    js_unshroud query --input malware.jsonl --type code_execution --format count
    js_unshroud query --input malware.jsonl --type cryptojs --format count

    # 2. Extract suspicious events
    js_unshroud query --input malware.jsonl --type code_execution --output suspicious.jsonl

    # 3. Analyze the filtered subset
    js_unshroud analyze --input suspicious.jsonl --format text

**Network Exfiltration Investigation:**

    # Find all POST requests (potential data exfiltration)
    js_unshroud query --input malware.jsonl --type network --method POST

    # Find requests to foreign TLDs
    js_unshroud query --input malware.jsonl --type network --url-regex "\\.ru$|\\.cn$"

    # Count suspicious network activity
    js_unshroud query --input malware.jsonl --type network --method POST --format count

**Obfuscation Analysis:**

    # Find Base64 encoding operations
    js_unshroud query --input malware.jsonl --type encoding

    # Find CryptoJS decryption operations
    js_unshroud query --input malware.jsonl --type cryptojs --operation decrypt

    # Combine with analyze for timeline
    js_unshroud query --input malware.jsonl --type "code_execution,encoding,cryptojs" | \
      js_unshroud analyze --input - --format text

### Correlate Command

Post-capture correlation analysis to find related event patterns using custom correlation rules.

#### Usage

    js_unshroud correlate --input <file> [OPTIONS]

#### Required Options

- `--input <file>`: Path to JSONL events file

#### Optional Options

- `--rules-file <file>`: Path to correlation rules JSON file
  - Default: Checks `./correlation_rules.json`, then `/correlation_rules.json`
- `--rules <names>`: Apply only the specified correlation rules (comma-delimited list; default: apply all rules)
- `--format <format>`: Output format (default: `text`)
  - `text`: Human-readable correlation chains with timestamps and event summaries
  - `json`: Structured JSON output for programmatic consumption
- `--output <file>`: Write to file instead of stdout (default: stdout)

#### Correlation Rules

Correlation rules define patterns to detect in event streams. The default rules file (`correlation_rules.json`) includes:

- **storage-to-network**: Local storage writes followed by network requests (data exfiltration pattern)
- **network-request-response**: Network request-response pairs (correlationId matching)
- **error-chains**: Network failures followed by error events
- **timer-to-network**: Timer executions followed by network activity (delayed beaconing)

#### Custom Rules File Format

Create a JSON file with this schema:

    {
      "rules": [
        {
          "name": "crypto-to-network",
          "description": "CryptoJS decryption followed by network exfiltration",
          "patterns": {
            "type": "sequence",
            "events": ["cryptojs", "network"],
            "maxTimeGap": 3000,
            "correlationField": "sessionId"
          }
        }
      ]
    }

**Rule Schema:**

- `name` (string, required): Unique rule identifier
- `description` (string, required): Human-readable explanation
- `patterns` (object, required):
  - `type` (string, required): Either `"sequence"` (events must occur in order) or `"group"` (events can occur in any order)
  - `events` (array, required): Event types to correlate (e.g., `["storage", "network"]`)
  - `maxTimeGap` (number, optional): Maximum time gap in milliseconds between events in the correlation
  - `correlationField` (string, optional): Field to correlate by: `sessionId`, `correlationId`, or `url` (default: `sessionId`)
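
As an illustration of how a `"sequence"` rule with `maxTimeGap` might be evaluated, the sketch below groups events by `sessionId` and greedily looks for the rule's event types in order. This is one possible reading of the schema, not the tool's correlation engine: the `EventRecord` and `SequenceRule` shapes and the `findSequences` helper are hypothetical, `maxTimeGap` is interpreted here as the gap between consecutive events in a chain, and only `sessionId` is supported as the correlation field.

```typescript
// Hypothetical shapes, reduced to the fields this sketch needs.
interface EventRecord {
  type: string;
  timestamp: number; // milliseconds
  sessionId: string;
}

interface SequenceRule {
  name: string;
  events: string[]; // event types that must occur in this order
  maxTimeGap?: number; // assumed: max ms between consecutive events in a chain
}

function findSequences(events: EventRecord[], rule: SequenceRule): EventRecord[][] {
  // Group events by the correlation field (sessionId in this sketch).
  const groups = new Map<string, EventRecord[]>();
  for (const event of events) {
    const group = groups.get(event.sessionId) ?? [];
    group.push(event);
    groups.set(event.sessionId, group);
  }

  const chains: EventRecord[][] = [];
  groups.forEach((group) => {
    const ordered = [...group].sort((a, b) => a.timestamp - b.timestamp);
    let chain: EventRecord[] = [];
    for (const event of ordered) {
      const previous = chain[chain.length - 1];
      const gapExceeded =
        previous !== undefined &&
        rule.maxTimeGap !== undefined &&
        event.timestamp - previous.timestamp > rule.maxTimeGap;
      if (gapExceeded) chain = []; // too much time has passed: drop the partial chain

      // Extend the chain only when the event is the next expected type.
      if (event.type === rule.events[chain.length]) {
        chain.push(event);
        if (chain.length === rule.events.length) {
          chains.push(chain);
          chain = [];
        }
      }
    }
  });
  return chains;
}

// Usage: a storage write followed by a network request within 3 seconds.
const sampleEvents: EventRecord[] = [
  { type: "storage", timestamp: 1000, sessionId: "s1" },
  { type: "network", timestamp: 1500, sessionId: "s1" },
  { type: "storage", timestamp: 2000, sessionId: "s2" },
  { type: "network", timestamp: 9000, sessionId: "s2" }, // 7000 ms later: too slow
];

const chains = findSequences(sampleEvents, {
  name: "storage-to-network",
  events: ["storage", "network"],
  maxTimeGap: 3000,
});
```

The greedy scan is a simplification: it keeps the earliest candidate start and ignores overlapping chains, which a fuller implementation might report separately.
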

#### Examples

    # Find all correlations using default rules
    js_unshroud correlate --input events.jsonl

    # Use custom rules file
    js_unshroud correlate --input events.jsonl --rules-file my_rules.json

    # Find only storage-to-network correlations
    js_unshroud correlate --input events.jsonl --rules storage-to-network

    # Find multiple specific correlations
    js_unshroud correlate --input events.jsonl --rules storage-to-network,timer-to-network

    # Output as JSON and save to file
    js_unshroud correlate --input events.jsonl --format json --output chains.json

    # Combine custom rules with rule filter
    js_unshroud correlate --input events.jsonl --rules-file my_rules.json --rules crypto-to-network,eval-chain

#### Relationship to Other Commands

The correlate command is a **parallel post-processing tool** alongside analyze and query:

    capture (run) → events.jsonl
                        ├─→ analyze: Format events as timeline or statistics
                        ├─→ query: Filter events by criteria
                        └─→ correlate: Find event correlation patterns

These commands operate on the same JSONL file but produce different outputs:

- **analyze**: Timeline text/JSON, statistics
- **query**: Filtered JSONL events (pipeable to analyze)
- **correlate**: Correlation chains text/JSON (standalone analysis)

#### Use Cases

**Malware Analysis:**

- Detect data exfiltration patterns (storage → network)
- Find obfuscation chains (encoding → eval → network)
- Identify delayed execution (timer → code_execution)
- Track fingerprinting flows (fingerprinting → storage → network)

**Behavioral Pattern Detection:**

- Correlate error events with network failures
- Find repeated API call patterns
- Detect Service Worker lifecycle anomalies
- Track multi-stage code execution chains

### Configuration

You can optionally provide a configuration file to control what instrumentation is enabled:

    {
      "enableConsole": true,
      "enableNetwork": true,
      "enableStorage": true,
      "enableWebSocket": true,
      "enableTimer": false,
      "enableError": true,
      "enableDOM": false,
      "enableCodeExecution": true,
      "enableEncoding": true,
      "enableCryptoJS": true,
      "enableDebuggerDetection": true,
      "enableDownloadDetection": true,
      "enableClipboard": false,
      "enableBlobTracking": false,
      "enableArtifactCollection": false,
      "artifactDirectory": "./artifacts",
      "artifactTypes": {
        "pageSnapshot": true,
        "downloads": true,
        "codeExecution": true,
        "encoding": true,
        "cryptojs": true,
        "clipboard": true,
        "workers": true,
        "iframes": true
      },
      "maxArtifactSize": 10485760,
      "monitoringTimeoutSeconds": 15,
      "outputMode": "file",
      "udpLogging": {
        "enabled": false,
        "host": "127.0.0.1",
        "port": 514
      },
      "debug": false
    }

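
The two-pass workflow earlier in this README passes a config file containing only `{"enableDebuggerDetection": false}`, which suggests that omitted options fall back to their defaults. The sketch below shows one common way to implement that behavior. It is an assumption about the design, not the tool's loader: `CaptureConfigSketch`, `DEFAULTS`, and `resolveConfig` are hypothetical names, and only a handful of options are included, with default values taken from this README.

```typescript
// Hypothetical subset of the configuration; defaults are the ones documented in this README.
interface CaptureConfigSketch {
  enableConsole: boolean;
  enableNetwork: boolean;
  enableTimer: boolean;
  enableDOM: boolean;
  enableDebuggerDetection: boolean;
}

const DEFAULTS: CaptureConfigSketch = {
  enableConsole: true,
  enableNetwork: true,
  enableTimer: false,
  enableDOM: false,
  enableDebuggerDetection: true,
};

// Overlay the user's (possibly partial) JSON on top of the defaults.
function resolveConfig(userJson: string): CaptureConfigSketch {
  const overrides = JSON.parse(userJson) as Partial<CaptureConfigSketch>;
  return { ...DEFAULTS, ...overrides };
}

// Usage: the Pass 2 config from the anti-analysis workflow.
const passTwoConfig = resolveConfig('{"enableDebuggerDetection": false}');
```

A shallow spread like this is enough for flat boolean options; nested objects such as `artifactTypes` or `udpLogging` would need a per-key merge so that a partial nested object does not discard the remaining nested defaults.
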

Configuration options:

**Event Capture:**

- `enableConsole`: Capture console.log, console.warn, console.error, etc. (default: `true`)
- `enableNetwork`: Capture XMLHttpRequest and fetch network requests (default: `true`)
- `enableStorage`: Capture localStorage and sessionStorage operations (default: `true`)
- `enableWebSocket`: Capture WebSocket connections and messages (default: `true`)
- `enableTimer`: Capture setTimeout, setInterval operations (default: `false`)
- `enableError`: Capture JavaScript errors and exceptions (default: `true`)
- `enableDOM`: Capture DOM mutation events (default: `false`)
- `enableFingerprinting`: Capture canvas fingerprinting, WebGL properties, and navigator probes (default: `false`)
- `enableObjectTracking`: Enable proxy-based tracking of specific JavaScript objects (default: `false`)
- `enableHeadlessMitigation`: Enable countermeasures against headless browser detection (default: `false`)
- `enableServiceWorker`: Capture Service Worker registration, lifecycle, and messaging (default: `false`)
- `enableCodeExecution`: Capture eval(), Function(), and dynamic code execution (default: `true`)
- `enableEncoding`: Capture atob/btoa, fromCharCode, URI encoding/decoding (default: `true`)
- `enableCryptoJS`: Capture CryptoJS library encryption/decryption (AES, DES, TripleDES, RC4, Rabbit) (default: `true`)
- `enableDebuggerDetection`: Detect and automatically resume from `debugger;` statements - common anti-analysis technique (default: `true`). Uses fire-and-forget optimization: captures only the first debugger location (~1-5ms latency), then disables the domain so subsequent debuggers run at native speed (~0ms). For maximum stealth against sophisticated malware, disable this setting after confirming debugger usage in an initial run (see Anti-Analysis Detection Workflow).
- `enableWorkers`: Capture Web Worker and SharedWorker creation, messaging, and errors (default: `false`)
- `enableModules`: Capture ES module script injection via […]

[…]

      "scriptCount": 1,
      "scripts": ["alert('XSS')"],
      "element": "iframe#malicious-frame",
      "stackTrace": "injectIframe@https://example.com/inject.js:20:3"
    }

**Performance Monitoring Events:**

    {
      "id": "perf_1234567890_001",
      "timestamp": 1640995200700,
      "sessionId": "session_1640995200_abc123",
      "type": "performance_stats",
      "method": "periodic_report",
      "operation": "performance_monitoring",
      "uptime": 30000,
      "totalEventsProcessed": 1250,
      "eventsAccepted": 1200,
      "eventsRejected": 50,
      "eventsDeduplicated": 50,
      "acceptanceRate": "96.00%"
    }

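
The `acceptanceRate` in the sample above is consistent with `eventsAccepted / totalEventsProcessed` expressed as a percentage with two decimals (1200 / 1250 = 96.00%). A minimal sketch of that arithmetic follows; the `PerformanceCounters` shape and `acceptanceRate` helper are hypothetical, and returning `"0.00%"` when nothing has been processed is an assumption rather than documented behavior.

```typescript
// Hypothetical counters, named after the fields in the sample performance event.
interface PerformanceCounters {
  totalEventsProcessed: number;
  eventsAccepted: number;
}

// Format accepted/total as a percentage string with two decimal places.
function acceptanceRate(counters: PerformanceCounters): string {
  if (counters.totalEventsProcessed === 0) return "0.00%"; // assumed behavior for an empty run
  const ratio = counters.eventsAccepted / counters.totalEventsProcessed;
  return `${(ratio * 100).toFixed(2)}%`;
}

// Usage with the numbers from the sample event.
const sampleRate = acceptanceRate({ totalEventsProcessed: 1250, eventsAccepted: 1200 });
```
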

**Headless Mitigation Events:**

    {
      "id": "evt_1234567890_006",
      "timestamp": 1640995200600,
      "sessionId": "session_1640995200_abc123",
      "type": "headless_mitigation",
      "method": "navigator.hardwareConcurrency",
      "operation": "value_override",
      "originalValue": 2,
      "newValue": 8,
      "stackTrace": "checkCPU@https://example.com/detection.js:15:5"
    }

Note: `navigator.webdriver` is NOT overridden and will not generate events. The Chrome flag `--disable-blink-features=AutomationControlled` prevents the property from being created, making it completely undetectable.
