Joyaljose0/LLM-prompt-injection-detector

GitHub: Joyaljose0/LLM-prompt-injection-detector

Stars: 0 | Forks: 0

# LLM Prompt Injection Detector This project builds an autonomous classifier designed to detect and block malicious prompt injections in AI agent pipelines. As companies deploy LLM agents that process external untrusted data (like web pages, documents, and tool outputs), they become vulnerable to adversarial instructions hidden in that content. This system intercepts inputs, scores them for malicious intent, and blocks them before they ever reach the executing agent. ## Project Architecture The architecture consists of three main stages: 1. **External Content Ingestion**: Simulates inputs from web pages, emails, or user prompts. 2. **Injection Detector**: A system that utilizes a sequence classification model (e.g., a fine-tuned BERT model) to assign an injection probability score. 3. **Decision & Routing**: - If the score exceeds the threshold (`> 0.5`), the prompt is **Blocked and Logged** for analyst review. - If safe, the prompt is passed to the **LLM Agent** for normal execution. ## Features - **FastAPI Backend**: High-performance REST API for processing and scoring text. - **Detector Model Wrapper**: A fallback heuristic detector and a HuggingFace BERT fine-tuning script (`train.py`) to train the model on synthetic injection pairs. - **LangChain Simulated Agent**: A mock agent demonstrating safe execution pathways. - **Premium Frontend Demo**: A visually stunning, dark-mode dashboard built in Vanilla JS/HTML/CSS with dynamic UI animations mapping out the system architecture. ## Getting Started ### Prerequisites - Python 3.8+ - Node.js (Optional, only if you wish to serve the static frontend via a dev server, otherwise just open `index.html` in a browser) ### Installation 1. Clone the repository: git clone https://github.com/Joyaljose0/LLM-prompt-injection-detector.git cd LLM-prompt-injection-detector 2. Install the backend dependencies: pip install -r backend/requirements.txt ### Running the Application 1. **Start the FastAPI Backend**: uvicorn backend.api:app --reload The API will be available at `http://127.0.0.1:8000`. 2. **Open the Frontend Demo**: Simply open `frontend/index.html` in your web browser. ### Training the Model (Optional) To train a `bert-tiny` model on the provided synthetic dataset: python backend/train.py This will train the model and save it to `backend/injection_model/`. The detector will automatically load it upon backend restart. ## Tech Stack - **AI/ML**: PyTorch, HuggingFace Transformers, scikit-learn - **Backend**: FastAPI, Uvicorn, Pydantic, LangChain - **Frontend**: HTML5, Vanilla CSS (Glassmorphism design), Vanilla JavaScript ## License MIT