mrzasad/complianceShield-pro

GitHub: mrzasad/complianceShield-pro

Stars: 0 | Forks: 0

# 🛡️ ComplianceShield: PECA/GDPR Data Pipeline

A production-grade **Streamlit application** that intercepts raw data, audits it against the **PECA 2016** (Pakistan Electronic Crimes Act) and **GDPR** compliance frameworks, encrypts sensitive PII fields, and produces an immutable structured audit log.

## Architecture

```
                            Raw Data Source
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────┐
│  Stage 1 · INTERCEPT                                     │
│  DataInterceptor: SHA-256 checksum, batch ID, metadata   │
└──────────────────────────────────┬───────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────┐
│  Stage 2 · AUDIT (GDPR + PECA)                           │
│  ComplianceAuditor: PII detection, rule matching,        │
│  violation scoring, CRITICAL/HIGH/MEDIUM/LOW risk rating │
└──────────────────────────────────┬───────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────┐
│  Stage 3 · ENCRYPT                                       │
│  DataEncryptor: AES-256-GCM (or Fernet / RSA-OAEP+AES)   │
│  All PII fields replaced with ENC:                       │
└──────────────────────────────────┬───────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────┐
│  Stage 4 · LOG                                           │
│  ComplianceLogger: append-only JSON structured log,      │
│  exportable as JSON Lines (SIEM) or CSV                  │
└──────────────────────────────────────────────────────────┘
```

## Compliance Frameworks

### GDPR

| Rule ID | Article | Field | Risk |
|---------|---------|-------|------|
| GDPR-ART25-001 | Art. 25 – Data Minimisation | national_id | HIGH |
| GDPR-ART32-001 | Art. 32 – Security of Processing | credit_card | CRITICAL |
| GDPR-ART35-001 | Art. 35 – DPIA Required | dob | MEDIUM |
| GDPR-ART5-001 | Art. 5 – Purpose Limitation | ip_address | MEDIUM |
| GDPR-ART5-002 | Art. 5 – Lawfulness | email | MEDIUM |

### PECA 2016

| Rule ID | Section | Field | Risk |
|---------|---------|-------|------|
| PECA-SEC14-001 | Sec. 14 – Identity Information | national_id | HIGH |
| PECA-SEC18-001 | Sec. 18 – Data Protection | phone | MEDIUM |
| PECA-SEC34-001 | Sec. 34 – Dignity/Privacy | dob | LOW |
| PECA-SEC14-002 | Sec. 14 – Identity Information | credit_card | CRITICAL |

## Encryption

| Algorithm | Key Size | Mode | Notes |
|-----------|----------|------|-------|
| AES-256-GCM | 256-bit | Authenticated | Default, recommended |
| Fernet (AES-128-CBC) | 128-bit | HMAC-SHA256 | Simple symmetric |
| RSA-OAEP + AES | 2048-bit RSA + 256-bit AES | Hybrid | Key wrapping |

Encrypted values format: `ENC:`

## Quick Start

### Local (Python)

```bash
pip install -r requirements.txt
streamlit run app.py
```

### Docker

```bash
docker compose up --build
# Open http://localhost:8501
```

### Docker (manual)

```bash
docker build -t complianceshield .
docker run -p 8501:8501 complianceshield
```

## Production Extensions

### Apache Spark

Replace `DataInterceptor.intercept()` with a PySpark job:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ComplianceShield").getOrCreate()
df = spark.read.json("s3a://raw-data/landing/")
# compliance_audit_udf must be defined by the pipeline: it receives an
# iterator of Rows for one partition and yields the audited Rows.
df = df.rdd.mapPartitions(compliance_audit_udf).toDF()
df.write.format("delta").mode("append").save("s3a://processed/compliant/")
```

### Apache Airflow DAG

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# intercept, audit_records, encrypt_fields and write_audit_log are the stage
# callables; they must be imported or defined before this DAG is parsed.
with DAG(
    "compliance_pipeline",
    schedule="@hourly",
    start_date=datetime(2024, 1, 1),  # placeholder; Airflow 2.x requires a start_date
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=intercept)
    audit = PythonOperator(task_id="audit", python_callable=audit_records)
    encrypt = PythonOperator(task_id="encrypt", python_callable=encrypt_fields)
    log_task = PythonOperator(task_id="log", python_callable=write_audit_log)

    ingest >> audit >> encrypt >> log_task
```

### Key Management (Production)

- Store AES keys in **Azure Key Vault** or **AWS KMS**
- Implement 90-day automatic key rotation
- Use **Hardware Security Modules (HSM)** for RSA private keys
- Log all key access events to the compliance audit trail

## File Structure

```
compliance_pipeline/
├── app.py                  # Streamlit UI
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── README.md
└── pipeline/
    ├── __init__.py
    ├── interceptor.py      # Stage 1: Data interception + checksums
    ├── auditor.py          # Stage 2: GDPR/PECA rule engine
    ├── encryptor.py        # Stage 3: AES-256-GCM encryption
    ├── logger.py           # Stage 4: Structured audit logging
    └── spark_engine.py     # Spark/Airflow execution simulation
```
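
## Example Sketch: Intercept and Audit

The first two stages described under Architecture (checksum and batch ID on intake, then rule matching with a risk rating) can be illustrated with a small standalone sketch. This is not the code in `pipeline/interceptor.py` or `pipeline/auditor.py`; the function names `intercept` and `audit`, the returned dictionary layouts, and the choice to flag a rule whenever its field is present and non-empty are all assumptions made for illustration. The rule IDs, fields, and risk levels are copied from the Compliance Frameworks tables.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

# (rule_id, field, risk) triples transcribed from the GDPR and PECA tables.
RULES = [
    ("GDPR-ART25-001", "national_id", "HIGH"),
    ("GDPR-ART32-001", "credit_card", "CRITICAL"),
    ("GDPR-ART35-001", "dob", "MEDIUM"),
    ("GDPR-ART5-001", "ip_address", "MEDIUM"),
    ("GDPR-ART5-002", "email", "MEDIUM"),
    ("PECA-SEC14-001", "national_id", "HIGH"),
    ("PECA-SEC18-001", "phone", "MEDIUM"),
    ("PECA-SEC34-001", "dob", "LOW"),
    ("PECA-SEC14-002", "credit_card", "CRITICAL"),
]

# Risk levels ordered from least to most severe.
RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def intercept(records):
    """Hypothetical Stage 1: wrap a batch with a SHA-256 checksum and metadata."""
    # Serialise with sorted keys so the same records always hash identically.
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return {
        "batch_id": str(uuid.uuid4()),
        "checksum": hashlib.sha256(payload).hexdigest(),
        "record_count": len(records),
        "received_at": datetime.now(timezone.utc).isoformat(),
        "records": records,
    }


def audit(record):
    """Hypothetical Stage 2: match one record against RULES by field name."""
    # Assumption: a rule is triggered when its field is present and non-empty.
    violations = [
        {"rule_id": rule_id, "field": field, "risk": risk}
        for rule_id, field, risk in RULES
        if record.get(field) not in (None, "")
    ]
    # The record's overall rating is the most severe risk among its violations.
    overall = max(
        (v["risk"] for v in violations), key=RISK_ORDER.index, default=None
    )
    return {"violations": violations, "risk": overall}


if __name__ == "__main__":
    batch = intercept([{"email": "user@example.com", "credit_card": "4111111111111111"}])
    for rec in batch["records"]:
        print(batch["batch_id"], audit(rec)["risk"])
```

Matching on field names keeps the sketch short; a real auditor would more likely inspect values as well (for example with patterns for card numbers or national IDs), since PII often appears under unexpected column names.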