harivelan-ux/Web-Vulnerability-Scanner

# 🔍 Web Vulnerability Scanner: Learning Project

## What This Tool Does

A Python-based web vulnerability scanner that:

- 🕷️ **Crawls** a website to discover all internal links
- 📋 **Extracts** HTML forms and their input fields
- 💉 **Tests** for SQL Injection (error-based detection)
- 🎭 **Tests** for Reflected XSS (payload reflection detection)
- 📊 **Reports** findings in the terminal and optionally saves to a file

## Project Structure

```
web_scanner/
│
├── main.py              ← Entry point. Parses CLI args, orchestrates the scan.
│
├── crawler.py           ← Link discovery using BFS graph traversal.
│                          Collects all internal URLs on the target site.
│
├── forms.py             ← HTML form extraction and submission.
│                          Finds <form> tags, reads inputs, submits with payloads.
│
├── vulnerabilities.py   ← Core testing logic.
│                          SQL Injection: looks for DB error messages.
│                          XSS: checks if payload is reflected unescaped.
│
├── reporter.py          ← Output handling.
│                          Prints colored terminal output, saves to text file.
│
└── requirements.txt     ← Python package dependencies.
```

## Installation

### 1. Clone / download the project

```bash
cd web_scanner/
```

### 2. (Recommended) Create a virtual environment

```bash
python -m venv venv
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
```

### 3. Install dependencies

```bash
pip install -r requirements.txt
```

## Usage

### Basic scan (crawls up to 30 pages):

```bash
python main.py -u http://localhost/dvwa
python main.py -u http://altoro.testfire.net/
```

### Save results to a file:

```bash
python main.py -u http://localhost/dvwa -o results.txt
```

### Verbose mode (see every payload tested):

```bash
python main.py -u http://localhost/dvwa --verbose
```

### Scan a single page without crawling:

```bash
python main.py -u http://localhost/dvwa/login.php --no-crawl
```

### Limit crawl to 10 pages:

```bash
python main.py -u http://localhost/dvwa --max-pages 10
```

### Only test for XSS (skip SQL injection):

```bash
python main.py -u http://localhost/dvwa --skip-sqli
```

### Full help:

```bash
python main.py --help
```

## All CLI Options

| Flag          | Short | Description                            | Default |
|---------------|-------|----------------------------------------|---------|
| `--url`       | `-u`  | Target URL to scan (**required**)      | -       |
| `--output`    | `-o`  | Save report to this file               | None    |
| `--max-pages` |       | Max pages to crawl                     | 30      |
| `--no-crawl`  |       | Skip crawling, only test the given URL | False   |
| `--verbose`   | `-v`  | Show each payload as it's tested       | False   |
| `--skip-sqli` |       | Disable SQL Injection testing          | False   |
| `--skip-xss`  |       | Disable XSS testing                    | False   |

## Safe Testing Environments

**Never test on real websites without permission.** Use one of these:

| Environment | URL | Notes |
|-------------|-----|-------|
| **DVWA** (Damn Vulnerable Web Application) | http://dvwa.co.uk | Best for beginners. Set Security to "Low" |
| **WebGoat** | https://owasp.org/www-project-webgoat/ | OWASP project, very educational |
| **bWAPP** | http://www.itsecgames.com/ | 100+ bugs, great variety |
| **Your own Flask app** | localhost | Build a deliberately vulnerable form to test |

### Quick DVWA Docker setup:

```bash
docker run --rm -it -p 80:80 vulnerables/web-dvwa
# Visit: http://localhost/
# Default login: admin / password
# Set Security Level to: Low (in DVWA Security tab)
```

## How the Vulnerabilities Work

### SQL Injection

When a web app takes user input and passes it directly into a SQL query without sanitization:

```sql
-- Intended query:
SELECT * FROM users WHERE username='alice' AND password='secret'

-- After injection with "' OR '1'='1' --":
SELECT * FROM users WHERE username='' OR '1'='1' -- ' AND password=''
--                                  ^^^^^^^^^^^^ always true!
--                                               ^^ comments out the rest
```

This scanner looks for **SQL error messages** in the response (e.g., "mysql_fetch_array()", "syntax error") that indicate the query broke, confirming the injection point.

### XSS (Cross-Site Scripting)

When a web app reflects user input back to the page without HTML-encoding it:

```html
You searched for: <script>alert(1)</script>
```

This scanner checks if the **raw payload appears in the response**, a sign it wasn't escaped.
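The two detection ideas described above can be sketched in a few lines. This is only an illustration of the technique, not the code in `vulnerabilities.py`: the function names, the `SQL_ERROR_PATTERNS` list, and the sample response bodies are all assumptions made for the example, and a real scanner would likely use a longer signature list.

```python
import html
import re

# Hypothetical signature list: fragments that some database drivers print
# when a query breaks. The project's actual list may differ.
SQL_ERROR_PATTERNS = [
    r"you have an error in your sql syntax",  # MySQL
    r"mysql_fetch_array\(\)",                  # PHP + MySQL
    r"unclosed quotation mark",                # SQL Server
    r"syntax error",                           # generic
]

# Same payload as in the XSS example above.
XSS_PAYLOAD = "<script>alert(1)</script>"


def find_sql_errors(body: str) -> list[str]:
    """Return every pattern in SQL_ERROR_PATTERNS that matches the body."""
    lowered = body.lower()
    return [p for p in SQL_ERROR_PATTERNS if re.search(p, lowered)]


def is_reflected_unescaped(body: str, payload: str = XSS_PAYLOAD) -> bool:
    """True when the raw payload appears verbatim in the response body."""
    return payload in body


if __name__ == "__main__":
    # Made-up response bodies standing in for real HTTP responses.
    broken_query_page = "Warning: mysql_fetch_array() expects parameter 1"
    escaped_page = "You searched for: " + html.escape(XSS_PAYLOAD)
    vulnerable_page = "You searched for: " + XSS_PAYLOAD

    print(find_sql_errors(broken_query_page))
    print(is_reflected_unescaped(escaped_page))
    print(is_reflected_unescaped(vulnerable_page))
```

Both checks are plain string matching on the response body, which is why they only catch error-based SQL injection and reflected XSS: a page that hides database errors, or that stores the payload and shows it elsewhere, would pass unnoticed.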
## Limitations (Good to Know as a Learner)

| Limitation | Why It Exists | How to Fix |
|------------|---------------|------------|
| Only detects **reflected** XSS | Stored XSS requires checking other pages after injection | Add a second-pass check after form submission |
| Only detects **error-based** SQL injection | Blind SQLi requires timing or boolean comparisons | Implement time-based SQLi checks (`SLEEP(5)`) |
| No JavaScript execution | DOM XSS only appears in browser | Integrate Selenium or Playwright |
| No authentication flow | Can't log in to test authenticated pages | Add a `--login-url`, `--username`, `--password` option |
| Single-threaded | Slow on large sites | Use `concurrent.futures.ThreadPoolExecutor` |

## Key Python Concepts You'll Learn

- **`requests.Session`**: persistent HTTP sessions with cookies
- **`BeautifulSoup`**: HTML parsing and element extraction
- **`urllib.parse`**: URL manipulation (joining, parsing, encoding)
- **`argparse`**: command-line interface building
- **BFS graph traversal**: used in the crawler
- **Modular code design**: each file has one responsibility
- **Type hints**: `def func(url: str) -> list[dict]:`
- **ANSI color codes**: colored terminal output

## Extending the Scanner (Next Steps)

Once comfortable with this code, try adding:

1. **Multi-threading**: scan multiple URLs simultaneously
2. **Authentication support**: log in before scanning
3. **CSRF token detection**: identify forms with/without CSRF protection
4. **Header injection**: test `User-Agent`, `Referer` headers
5. **Directory brute-force**: discover hidden paths
6. **JSON/HTML report output**: structured reporting
7. **Rate limiting**: be polite to the server
8. **Proxy support**: route through Burp Suite for packet inspection

## Author

**HARI VELAN**

## Disclaimer

This tool is built for **educational purposes only**. The authors are not responsible for any misuse. Always obtain explicit written authorization before testing any system you do not own. Treat security testing like you would entering someone's house: you need a key (permission), not a lockpick.