Nevetan/Shield-Scan

GitHub: Nevetan/Shield-Scan


# ShieldScan: AI-Powered Website Threat Detector

A Chrome extension + Python backend that analyses any website for phishing, scams, and malware indicators using multi-signal detection and Llama 3.3 70B via Groq.

## Architecture

    Chrome Extension (JS)
      └── popup.html / popup.js
            │  extracts URL + page content
            ▼
    FastAPI Backend (Python)  ←── localhost:8000
      ├── URL feature extraction (entropy, subdomains, TLD, impersonation)
      ├── Content analysis (keywords, urgency language, form fields)
      ├── WHOIS domain age lookup
      └── Llama 3.3 70B via Groq (free AI verdict + explanation)

## Setup

### 1. Get a free Groq API key

Go to **console.groq.com** → sign in with Google → **API Keys → Create API key**.

### 2. Backend

    cd backend
    pip install -r requirements.txt

Open `main.py` and paste your Groq key on line 22:

    client = Groq(api_key="YOUR_GROQ_API_KEY")

Start the server:

    uvicorn main:app --reload --port 8000

Verify it's running:

    curl http://localhost:8000/health

### 3. Chrome Extension

1. Open Chrome → `chrome://extensions`
2. Enable **Developer mode** (top right toggle)
3. Click **Load unpacked**
4. Select the `extension/` folder

The ShieldScan icon will appear in your toolbar. Keep the terminal running in the background, because the extension needs the server to work.

## Features Analysed

### URL-level signals

| Feature | Why it matters |
|---|---|
| Shannon entropy of domain | High entropy = randomly generated = suspicious |
| Subdomain count | Deep subdomains often used in phishing |
| Brand impersonation | e.g. `paypal.evil.com` |
| IP address instead of domain | Legitimate sites use domain names |
| Suspicious TLD | `.tk`, `.xyz`, `.icu` etc. |
| Hyphens & digit ratio | `my-paypa1-secure.com` patterns |
| Hex encoding | URL obfuscation technique |

### Content signals

| Feature | Why it matters |
|---|---|
| Suspicious keyword matching | 30+ phishing/scam phrases |
| Brand name presence | Combined with non-brand domain = suspicious |
| Urgency language | Pressure tactics are a scam hallmark |
| Password/card input fields | Credential harvesting detection |
| External link analysis | Phishing kits link to multiple external domains |

### Domain signals

| Feature | Why it matters |
|---|---|
| WHOIS domain age | Newly registered domains = high risk |
| SSL certificate | No HTTPS = baseline red flag |

## API

### `POST /analyse`

**Request:**

    {
      "url": "https://example.com",
      "page_text": "...",
      "page_title": "...",
      "links": ["https://..."]
    }

**Response:**

    {
      "threat_level": "warning",
      "risk_score": 54,
      "summary": "This site was registered 12 days ago and contains urgency language asking users to verify their PayPal account, despite being hosted on a non-PayPal domain.",
      "signals": [
        {"message": "Domain registered only 12 days ago", "severity": "high"},
        {"message": "Contains 'verify your account' phishing language", "severity": "high"},
        {"message": "References PayPal but domain is unrelated", "severity": "high"},
        {"message": "HTTPS present", "severity": "low"}
      ],
      "has_ssl": true,
      "domain_age_days": 12
    }

## Author

Nevetan Uthayachandran
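
## Example: computing URL-level signals

Several of the URL-level signals listed under Features Analysed can be approximated with the Python standard library alone. The following is a minimal sketch and not the code in `main.py`: the names `shannon_entropy`, `url_signals`, and `SUSPICIOUS_TLDS` are hypothetical, the TLD set contains only the three examples named above, and the subdomain count naively assumes a single-label public suffix (so `example.co.uk` would be over-counted).

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlparse

# Assumption: only the TLDs named in the README; the real list is not shown there.
SUSPICIOUS_TLDS = {"tk", "xyz", "icu"}


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def url_signals(url: str) -> dict:
    """Extract a few URL-level features (illustrative, not exhaustive)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # An IP literal in place of a domain name is itself a signal.
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False

    labels = host.split(".") if host and not is_ip else []
    return {
        "entropy": shannon_entropy(host),
        # Naive: everything left of "registered-name.tld" counts as a subdomain.
        "subdomain_count": max(len(labels) - 2, 0),
        "is_ip": is_ip,
        "suspicious_tld": bool(labels) and labels[-1] in SUSPICIOUS_TLDS,
        "hyphen_count": host.count("-"),
        "digit_ratio": sum(c.isdigit() for c in host) / len(host) if host else 0.0,
        "has_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    print(url_signals("https://my-paypa1-secure.login.example.tk/verify"))
```

A backend along these lines would presumably turn such raw features into the `signals` entries and `risk_score` shown in the API response; how ShieldScan weights them is not documented here.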
Tags: backend development