# 🔷 PRISM

**AI-powered prompt reverse engineering, model fingerprinting, and LLM benchmarking platform.**

PRISM analyzes LLM outputs to infer the prompts that likely produced them, estimate which model generated a response, and benchmark LLM performance across standardized tasks, all with zero external dependencies.
## ✨ Features
### 🔍 Prompt Reverse Engineering
- Detect prompt type (creative, analytical, code, conversational, instruction-following, factual)
- Extract formatting constraints (code blocks, lists, headers, JSON, etc.)
- Identify refusal patterns and confidence levels
- Analyze response structure (paragraphs, sentence length, code presence)
- Compute complexity scores
### 🪪 Model Fingerprinting
- Identify which LLM produced a given output
- Match against built-in model profiles (GPT-4o, Claude 3.5, Gemini Pro, Llama 3, DeepSeek, Mistral)
- Analyze stylistic markers: formality, hedging, code quality, repetition rate
- Rank candidate models by confidence score
### 📊 LLM Benchmarking
- Run standardized tasks (summary, code, reasoning, creative, factual)
- Heuristic scoring against multiple criteria per task
- Multi-model comparison with leaderboard generation
- Formatted benchmark reports
## 📦 Installation
```bash
git clone https://github.com/Punkrose/Prism.git
cd Prism
npm install   # No dependencies to install; this only sets up the project
```
## 🚀 Quick Start
### CLI Usage
```bash
# Reverse engineer a prompt from a response
node bin/prism.js reverse "Here is a function that reverses a string..."

# Fingerprint which model produced output
node bin/prism.js fingerprint "Certainly! I'd be happy to help. Here's a detailed explanation..."

# Run benchmarks with simulated provider
node bin/prism.js bench --tasks 5

# Show project info
node bin/prism.js info

# Run from file
node bin/prism.js reverse ./response.txt
```
### API Usage
```javascript
const { PromptReverseEngineer, ModelFingerprinter, LLMBenchmark } = require('./src');

// ── Prompt Reverse Engineering ──
const engineer = new PromptReverseEngineer({ detail: 'full' });
const analysis = engineer.analyze('Your LLM response text here...');
console.log(analysis.promptType);   // e.g. 'code'
console.log(analysis.complexity);   // e.g. { score: 65, level: 'moderate' }
console.log(analysis.constraints);  // e.g. ['code-blocks', 'bullet-lists']
console.log(analysis.inference);    // Human-readable summary

// ── Model Fingerprinting ──
const fingerprinter = new ModelFingerprinter();
const result = fingerprinter.fingerprint('Certainly! Here\'s a helpful response...');
console.log(result.detected);    // e.g. 'claude-3.5'
console.log(result.confidence);  // e.g. 0.72
console.log(result.ranked);      // [{ id, score, markers, styleMatch }, ...]

// ── Benchmarking ──
// Top-level `await` is not available in a CommonJS module, so wrap the run in an async function.
(async () => {
  const benchmark = new LLMBenchmark();
  const results = await benchmark.run(async (prompt) => {
    // Call your LLM API here; `myLLM` is a placeholder for your own client.
    return await myLLM.generate(prompt);
  });
  console.log(benchmark.report(results));
})();
```
## 🔧 How It Works
### Prompt Reverse Engineering
The `PromptReverseEngineer` analyzes text responses to infer the likely prompt structure:
1. **Tokenization** — Breaks response into typed tokens (words, numbers, punctuation, code)
2. **Pattern Detection** — Identifies structural markers (code blocks, lists, headers, JSON)
3. **Classification** — Uses weighted signal scoring to classify prompt type across 6 categories
4. **Refusal Detection** — Matches against 20+ known refusal patterns with confidence scoring
5. **Complexity Scoring** — Combines length, vocabulary diversity, structural complexity, and sentence length
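
To make the classification and complexity steps above more concrete, here is a minimal sketch of weighted signal scoring and a vocabulary-diversity measure. It is not taken from PRISM's source: the `SIGNALS` table, its patterns and weights, and the function names `classifyPromptType` and `vocabularyDiversity` are all assumptions chosen for illustration, and only three of the six categories are shown.

```javascript
// Hypothetical signal table: each prompt type lists regex signals with weights.
// Patterns and weights are illustrative assumptions, not PRISM's real values.
const SIGNALS = {
  code: [
    { pattern: /`{3}/, weight: 3 },                       // fenced code block
    { pattern: /\bfunction\b|=>|\breturn\b/, weight: 2 }, // code-like keywords
  ],
  creative: [
    { pattern: /\bonce upon a time\b/i, weight: 3 },
    { pattern: /\b(poem|story|haiku)\b/i, weight: 2 },
  ],
  factual: [
    { pattern: /\b(is defined as|refers to|according to)\b/i, weight: 2 },
    { pattern: /\b\d{4}\b/, weight: 1 },                  // four-digit number, e.g. a year
  ],
};

// Sum the weights of every matching signal and return the best-scoring type.
function classifyPromptType(text) {
  let best = { type: 'unknown', score: 0 };
  for (const [type, signals] of Object.entries(SIGNALS)) {
    const score = signals.reduce(
      (sum, s) => sum + (s.pattern.test(text) ? s.weight : 0),
      0
    );
    if (score > best.score) best = { type, score };
  }
  return best;
}

// Vocabulary diversity as a type-token ratio, one possible complexity input.
function vocabularyDiversity(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  return words.length === 0 ? 0 : new Set(words).size / words.length;
}
```

A response that matches no signal falls back to `'unknown'` with a score of 0, which keeps the classifier from guessing when there is no evidence.
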
### Model Fingerprinting
The `ModelFingerprinter` identifies the likely source model:
1. **Marker Extraction** — Searches for known model-specific phrases and patterns
2. **Style Analysis** — Computes metrics: avg sentence length, formality (0-1), hedging (0-1), code quality (0-1), repetition rate (0-1)
3. **Profile Matching** — Compares extracted markers and style metrics against built-in model profiles
4. **Ranking** — Scores each model on a 50/50 weighted combination of marker matches and style similarity
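
The style analysis step can be sketched as follows. This is one plausible way to compute a few of the listed metrics without dependencies, not PRISM's actual implementation: the `HEDGES` word list, the sentence splitter, and the exact formulas are assumptions.

```javascript
// Hypothetical hedging vocabulary; a real list would likely be larger.
const HEDGES = ['might', 'may', 'perhaps', 'possibly', 'could', 'likely'];

// Split on sentence-ending punctuation; crude but dependency-free.
function splitSentences(text) {
  return text.split(/[.!?]+/).map((s) => s.trim()).filter(Boolean);
}

function styleMetrics(text) {
  const sentences = splitSentences(text);
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  if (words.length === 0) {
    return { avgSentenceLength: 0, hedging: 0, repetitionRate: 0 };
  }
  const sentenceCount = Math.max(1, sentences.length);
  const hedgeCount = words.filter((w) => HEDGES.includes(w)).length;
  return {
    // Mean number of words per sentence.
    avgSentenceLength: words.length / sentenceCount,
    // Hedge words per sentence, clamped to [0, 1].
    hedging: Math.min(1, hedgeCount / sentenceCount),
    // Fraction of words that repeat an earlier word, in [0, 1).
    repetitionRate: 1 - new Set(words).size / words.length,
  };
}
```

For example, `styleMetrics('It might rain. It may snow.')` sees two sentences, six words, and two hedge words, so `avgSentenceLength` is 3 and `hedging` is 1.
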
### LLM Benchmarking
The `LLMBenchmark` evaluates LLM performance:
1. **Task Execution** — Runs standardized prompts through a provider function
2. **Heuristic Scoring** — Scores each response against task-specific criteria using pattern matching and structural analysis
3. **Comparison** — Produces ranked leaderboards across multiple models
4. **Reporting** — Generates human-readable benchmark reports with scores, latencies, and rankings
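
The execution and comparison steps above could look roughly like the sketch below. The task shape (`id`, `prompt`, `score`), the result fields, and the function names `runBenchmark` and `leaderboard` are hypothetical and are not PRISM's API; the only detail taken from this README is that a provider is an async function that receives a prompt and returns text.

```javascript
// Run each task through a provider function and time it.
// The task and result shapes are illustrative assumptions.
async function runBenchmark(provider, tasks) {
  const results = [];
  for (const task of tasks) {
    const start = Date.now();
    const output = await provider(task.prompt);
    results.push({
      task: task.id,
      score: task.score(output),      // heuristic score in [0, 1]
      latencyMs: Date.now() - start,  // wall-clock latency of the provider call
    });
  }
  return results;
}

// Rank models by mean score, highest first.
// Expects every model to have at least one result.
function leaderboard(resultsByModel) {
  return Object.entries(resultsByModel)
    .map(([model, results]) => ({
      model,
      mean: results.reduce((sum, r) => sum + r.score, 0) / results.length,
    }))
    .sort((a, b) => b.mean - a.mean);
}
```

Running tasks sequentially keeps the latency measurements from interfering with each other, at the cost of a longer total run.
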
## 🪪 Default Model Profiles
| Model | ID | Key Markers | Formality | Hedging |
|-------|-----|-------------|-----------|---------|
| GPT-4o | `gpt-4o` | "As an AI", "cutoff" | 0.70 | 0.30 |
| Claude 3.5 | `claude-3.5` | "Certainly", "I'd be happy" | 0.80 | 0.40 |
| Gemini Pro | `gemini-pro` | "Great question", "Here are" | 0.60 | 0.50 |
| Llama 3 | `llama-3` | "As a language model" | 0.50 | 0.20 |
| DeepSeek | `deepseek` | "Let me think", "First, Second, Third," | 0.60 | 0.35 |
| Mistral | `mistral` | "In summary", "The answer is" | 0.55 | 0.25 |
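
As a rough illustration of how rows like these might feed the 50/50 ranking described earlier, the sketch below encodes two of the profiles as plain objects and combines a marker score with a style score. The object shape, the similarity formula, and the function names are assumptions; only the marker phrases and the formality and hedging values come from the table.

```javascript
// Two rows from the profile table, expressed as plain objects.
const PROFILES = [
  { id: 'gpt-4o', markers: ['As an AI', 'cutoff'], formality: 0.7, hedging: 0.3 },
  { id: 'claude-3.5', markers: ['Certainly', "I'd be happy"], formality: 0.8, hedging: 0.4 },
];

// Fraction of a profile's markers found in the text (case-insensitive).
function markerScore(text, profile) {
  const lower = text.toLowerCase();
  const hits = profile.markers.filter((m) => lower.includes(m.toLowerCase()));
  return hits.length / profile.markers.length;
}

// Style similarity: 1 minus the mean absolute difference of the two metrics.
function styleScore(style, profile) {
  const diff =
    Math.abs(style.formality - profile.formality) +
    Math.abs(style.hedging - profile.hedging);
  return 1 - diff / 2;
}

// Equal-weight combination of both scores, sorted best first.
function rankModels(text, style) {
  return PROFILES
    .map((p) => ({
      id: p.id,
      score: 0.5 * markerScore(text, p) + 0.5 * styleScore(style, p),
    }))
    .sort((a, b) => b.score - a.score);
}
```

Calling `rankModels("Certainly! I'd be happy to help.", { formality: 0.8, hedging: 0.4 })` places `claude-3.5` first, since both of its markers match and the style metrics equal its profile values.
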
## 📋 Default Benchmark Tasks
| Task | Prompt | Criteria |
|------|--------|----------|
| Summary | Summarize a pangram passage | conciseness, accuracy |
| Code | Reverse a string without `.reverse()` | correctness, efficiency, readability |
| Reasoning | Logic puzzle about roses and flowers | logic, clarity |
| Creative | Write a haiku about AI | creativity, structure |
| Factual | Three laws of thermodynamics | accuracy, completeness |
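
Heuristic scoring against criteria like these can be approximated with simple pattern checks. The sketch below covers the Code and Creative tasks only; the check names (`definesFunction`, `avoidsReverse`, `threeLines`) are hypothetical simplifications and do not correspond one-to-one to the criteria PRISM uses.

```javascript
// Hypothetical checks for two of the tasks in the table.
const CHECKS = {
  code: {
    // The response appears to define a function.
    definesFunction: (out) => /\bfunction\b|=>|\bdef\b/.test(out),
    // The response respects the "no .reverse()" constraint.
    avoidsReverse: (out) => !/\.reverse\s*\(/.test(out),
  },
  creative: {
    // A haiku is conventionally written as three lines.
    threeLines: (out) =>
      out.trim().split('\n').filter((line) => line.trim()).length === 3,
  },
};

// Score is the fraction of a task's checks that pass, in [0, 1].
// Unknown task IDs score 0.
function scoreResponse(taskId, output) {
  const checks = Object.values(CHECKS[taskId] || {});
  if (checks.length === 0) return 0;
  return checks.filter((check) => check(output)).length / checks.length;
}
```

Checks of this kind measure surface form rather than true correctness, so a response can pass them and still be wrong; they are cheap signals, not a substitute for executing the code or reviewing the output.
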
## 🧪 Testing
```bash
# Run all tests
npm test

# Run demo
npm run demo
```
## 📁 Project Structure
```
prism/
├── bin/
│   └── prism.js              # CLI entry point
├── src/
│   ├── index.js              # Main exports
│   ├── utils.js              # Shared utilities
│   ├── reverse.js            # Prompt reverse engineering
│   ├── fingerprint.js        # Model fingerprinting
│   └── benchmark.js          # LLM benchmarking
├── test/
│   ├── reverse.test.js       # Reverse engineering tests
│   ├── fingerprint.test.js   # Fingerprinting tests
│   └── benchmark.test.js     # Benchmarking tests
├── examples/
│   └── demo.js               # Full demo script
├── package.json
├── LICENSE
└── README.md
```
## 📜 License
MIT License — Copyright (c) 2026 Punkrose
See [LICENSE](./LICENSE) for details.