wisdom99/ai-incident-response-agent

GitHub: wisdom99/ai-incident-response-agent

# AI Incident Response Agent

Foundation for a Spring Boot-based AI Incident Response Agent that accepts incident events, runs them through a staged response workflow, and records operational decisions with strong observability defaults.

## Technology Stack

- Java 25
- Spring Boot
- PostgreSQL
- Flyway
- Docker Compose
- Spring Boot Actuator
- Jakarta Bean Validation
- Structured JSON logging

## High-Level Architecture

The service is organized as a workflow pipeline with explicit boundaries between each operational concern:

1. `incident ingestion`
   Receives validated incident payloads from upstream systems and turns them into internal commands.
2. `diagnosis`
   Produces an initial classification and probable-cause assessment for the incident.
3. `policy`
   Applies guardrails and determines whether automated execution is permitted.
4. `execution`
   Prepares or triggers the remediation plan allowed by policy.
5. `verification`
   Confirms whether the execution outcome should be monitored, retried, or escalated.
6. `audit`
   Captures the workflow trail so decisions remain traceable and reviewable.

The orchestration entry point is `IncidentResponseWorkflowService`, which wires the stages together while keeping each stage separately testable and replaceable.

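The staged pipeline described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the repository's implementation: every type here (`Incident`, `Diagnosis`, `PolicyDecision`, `ExecutionResult`, `Verification`, `Summary`, the stage interfaces, and `Workflow`) is a hypothetical stand-in, and the stage rules in `demo` are made up purely to show how a policy decision flows into execution and verification.

```java
import java.util.ArrayList;
import java.util.List;

class WorkflowSketch {

    // Hypothetical value types; the real classes in the repository may look different.
    record Incident(String externalIncidentId, String sourceSystem, String severity, String summary) {}
    record Diagnosis(String classification, String probableCause) {}
    record PolicyDecision(boolean automationPermitted, String reason) {}
    record ExecutionResult(boolean triggered, String plan) {}
    enum Verification { MONITOR, RETRY, ESCALATE }
    record Summary(String incidentId, Diagnosis diagnosis, PolicyDecision policy,
                   ExecutionResult execution, Verification verification) {}

    // One small interface per stage keeps each concern separately testable and replaceable.
    interface DiagnosisStage { Diagnosis diagnose(Incident incident); }
    interface PolicyStage { PolicyDecision evaluate(Incident incident, Diagnosis diagnosis); }
    interface ExecutionStage { ExecutionResult execute(Incident incident, PolicyDecision decision); }
    interface VerificationStage { Verification verify(ExecutionResult result); }
    interface AuditStage { void record(String incidentId, String stage, String outcome); }

    // Hypothetical orchestrator: runs the stages in order and audits each outcome.
    record Workflow(DiagnosisStage diagnosis, PolicyStage policy, ExecutionStage execution,
                    VerificationStage verification, AuditStage audit) {

        Summary handle(Incident incident) {
            String id = incident.externalIncidentId();
            Diagnosis d = diagnosis.diagnose(incident);
            audit.record(id, "diagnosis", d.classification());
            PolicyDecision p = policy.evaluate(incident, d);
            audit.record(id, "policy", p.automationPermitted() ? "permitted" : "denied");
            ExecutionResult e = execution.execute(incident, p);
            audit.record(id, "execution", e.triggered() ? "triggered" : "prepared-only");
            Verification v = verification.verify(e);
            audit.record(id, "verification", v.name());
            return new Summary(id, d, p, e, v);
        }
    }

    // Wires the workflow with illustrative (made-up) rules and runs one incident through it.
    static Summary demo(List<String> trail) {
        Workflow workflow = new Workflow(
                incident -> new Diagnosis("error-rate-spike", "unknown"),
                (incident, diagnosis) -> "HIGH".equals(incident.severity())
                        ? new PolicyDecision(false, "high severity requires human approval")
                        : new PolicyDecision(true, "low risk"),
                (incident, decision) -> new ExecutionResult(decision.automationPermitted(), "restart-service"),
                result -> result.triggered() ? Verification.MONITOR : Verification.ESCALATE,
                (incidentId, stage, outcome) -> trail.add(stage + "=" + outcome));
        return workflow.handle(new Incident(
                "inc-1001", "pagerduty", "HIGH", "API error rate increased beyond threshold"));
    }

    public static void main(String[] args) {
        List<String> trail = new ArrayList<>();
        Summary summary = demo(trail);
        System.out.println(summary);
        trail.forEach(System.out::println);
    }
}
```

Because each stage is a single-method interface, a test can swap any stage for a lambda, as `demo` does, without touching the others.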
## Package Layout

    src/main/java/com/wisdom99/aiincidentresponse
    ├── AiIncidentResponseAgentApplication.java
    ├── orchestration
    │   ├── IncidentResponseSummary.java
    │   └── IncidentResponseWorkflowService.java
    ├── incident/ingestion
    │   ├── api
    │   └── application
    ├── diagnosis/application
    ├── policy/application
    ├── execution/application
    ├── verification/application
    └── audit/application

## Current Foundation

This scaffold includes:

- A Spring Boot application entry point
- A starter ingestion API at `POST /api/v1/incidents`
- Validation on the inbound payload
- Workflow orchestration across diagnosis, policy, execution, verification, and audit services
- PostgreSQL datasource configuration
- Flyway database migrations
- Docker Compose for local Postgres
- Actuator health and metrics endpoints
- Structured JSON console logging

## Database

Flyway creates two starter tables:

- `incident_record`
  Stores inbound incident metadata and the original payload
- `audit_event`
  Stores workflow stage outcomes for traceability

## Local Setup

### Prerequisites

- Java 25 installed locally
- Maven installed locally
- Docker Desktop or a compatible Docker runtime

### 1. Start PostgreSQL

    docker compose up -d

### 2. Configure environment

Copy values from `.env.example` into your shell or your preferred environment manager:

    export DB_HOST=localhost
    export DB_PORT=5432
    export DB_NAME=incident_response
    export DB_USERNAME=incident
    export DB_PASSWORD=incident
    export SERVER_PORT=8080

### 3. Run the application

    mvn spring-boot:run

### 4. Verify the service

Health endpoint:

    curl http://localhost:8080/actuator/health

Sample incident ingestion request:

    curl --request POST http://localhost:8080/api/v1/incidents \
      --header "Content-Type: application/json" \
      --data '{
        "externalIncidentId": "inc-1001",
        "sourceSystem": "pagerduty",
        "severity": "HIGH",
        "summary": "API error rate increased beyond threshold",
        "payload": {
          "service": "payments-api",
          "region": "eu-west-1",
          "errorRate": 17.3
        }
      }'

## Observability

- Actuator endpoints are exposed for `health`, `info`, `metrics`, `prometheus`, and `loggers`
- Console logging is configured for structured JSON output
- MDC fields such as `incidentId` and `sourceSystem` are attached during ingestion

## Suggested Next Steps

- Persist incident and audit records through repositories
- Add asynchronous execution and verification workers
- Introduce AI-backed diagnosis and policy engines behind clear ports
- Add integration tests for ingestion, Flyway, and PostgreSQL startup
- Add authentication and authorization around execution-capable endpoints
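
The sample ingestion request from the "Verify the service" step can also be built from Java, which may be useful when writing the integration tests listed above. The following is a minimal sketch using the JDK's built-in `java.net.http` client, assuming the service listens on `localhost:8080` as in the local setup. The class `IngestionRequestSketch` and its methods are hypothetical and not part of the repository, and since this README does not describe the response format, `send` only returns the raw response.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class IngestionRequestSketch {

    // Same JSON body as the curl sample in "Verify the service".
    static final String SAMPLE_BODY = """
            {
              "externalIncidentId": "inc-1001",
              "sourceSystem": "pagerduty",
              "severity": "HIGH",
              "summary": "API error rate increased beyond threshold",
              "payload": {
                "service": "payments-api",
                "region": "eu-west-1",
                "errorRate": 17.3
              }
            }
            """;

    // Builds the POST request without sending it, so it can be inspected in tests.
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/v1/incidents"))
                .timeout(Duration.ofSeconds(10)) // arbitrary timeout chosen for this sketch
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // Sends the request; this needs the application from step 3 to be running.
    static HttpResponse<String> send(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        return client.send(request, HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) {
        // Only builds and prints the request, so this runs without a live service.
        HttpRequest request = buildRequest("http://localhost:8080", SAMPLE_BODY);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Calling `send(buildRequest("http://localhost:8080", SAMPLE_BODY))` against a running instance should be equivalent to the curl command, assuming the default `SERVER_PORT=8080`.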