Security Orchestration, Automation, and Response (SOAR) platforms automate incident response workflows through playbooks. LLM integration transforms static playbooks into adaptive, intelligent workflows that can reason about incidents, make context-aware decisions, and handle novel scenarios that rigid automation cannot address.
AI-powered playbooks combine the reliability of structured automation with the flexibility of LLM reasoning. This guide covers integration patterns, playbook design, and implementation strategies for building intelligent SOAR workflows. For foundational concepts on using LLMs in security contexts, see AI Orchestration for Security.
AI-Enhanced Playbook Architecture
Traditional SOAR playbooks rely on deterministic logic that struggles with ambiguous or novel situations. AI-enhanced playbooks introduce reasoning capabilities at critical decision points, enabling workflows to adapt based on incident context rather than following rigid paths.
Integration Patterns
The following patterns represent different approaches to incorporating AI into SOAR workflows, ranging from minimal integration to fully adaptive systems. Each pattern balances automation benefits against the need for human oversight.
| Pattern | Description | Automation Level | Human Involvement |
|---|---|---|---|
| AI-assisted triage | LLM classifies, prioritizes | High | Review escalations |
| AI decision points | LLM chooses workflow branch | Medium | Approve decisions |
| AI enrichment | LLM summarizes, correlates | High | Consume outputs |
| AI response generation | LLM drafts actions | Low | Approve all actions |
| Hybrid automation | Mix of deterministic + AI | Variable | Risk-based |
Playbook Components
When enhancing existing playbooks with AI capabilities, each component can be augmented independently. This modular approach allows teams to introduce AI gradually, validating performance at each stage before expanding automation scope.
| Component | Traditional | AI-Enhanced |
|---|---|---|
| Trigger | Static rules | + Anomaly detection |
| Enrichment | API lookups | + LLM correlation |
| Decision | If/then logic | + LLM reasoning |
| Action | Predefined steps | + Adaptive response |
| Documentation | Templates | + LLM-generated reports |
Integrating LLMs with SOAR platforms requires careful consideration of the platform’s extensibility model, existing ecosystem integrations, and security requirements. Most platforms support custom actions through Python or webhook-based integrations, making LLM API calls straightforward to implement.
Each SOAR platform offers different mechanisms for custom integrations. The choice often depends on existing infrastructure investments and the level of flexibility required for AI workflows.
| Platform | AI Integration | Custom Actions | Considerations |
|---|---|---|---|
| Splunk SOAR | Python actions | Full flexibility | Splunk ecosystem |
| Palo Alto Cortex XSOAR | Python/JS actions | Marketplace | Cortex integration |
| Microsoft Sentinel | Logic Apps | Azure ecosystem | Native Copilot |
| Tines | No-code actions | Webhook-based | Simplicity |
| Swimlane | Python actions | Low-code | Flexibility |
Integration Architecture
A well-designed integration architecture separates concerns across distinct layers. This separation ensures that AI components can be updated, tested, or replaced without disrupting core automation workflows. For guidance on connecting AI components with security tools, refer to AI Security Tooling Integration.
| Layer | Function | Implementation |
|---|---|---|
| Trigger layer | Initiate playbook | Alert ingestion, schedule |
| Orchestration layer | Workflow control | SOAR engine |
| AI layer | LLM processing | API integration |
| Action layer | Execute responses | Tool integrations |
| Audit layer | Logging, compliance | SOAR + external logging |
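
The layer separation above can be sketched in a few lines. This is a minimal illustration, not a SOAR vendor API: the `Playbook` class, the `stub_llm` function, and the action names are all hypothetical, and the AI layer is reduced to any callable that returns a verdict dictionary so it can be swapped without touching the other layers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Alert:
    alert_id: str
    raw: str


@dataclass
class Playbook:
    # AI layer: any callable that maps alert text to a verdict dict (hypothetical interface).
    ai_layer: Callable[[str], dict]
    # Action layer: action name -> tool integration callable.
    actions: dict[str, Callable[[Alert], str]]
    # Audit layer: append-only record of every decision and its result.
    audit_log: list[dict] = field(default_factory=list)

    def run(self, alert: Alert) -> str:
        """Orchestration layer: call the AI layer, dispatch the action, record the outcome."""
        verdict = self.ai_layer(alert.raw)
        action_name = verdict.get("action", "human_review")
        if action_name not in self.actions:
            # Unknown or unsupported action: fall back to a human.
            action_name = "human_review"
        result = self.actions[action_name](alert)
        self.audit_log.append(
            {"alert_id": alert.alert_id, "verdict": verdict,
             "action": action_name, "result": result}
        )
        return result


def stub_llm(text: str) -> dict:
    # Stand-in for a real LLM API call; replace with the provider client of your choice.
    return {"action": "isolate_host" if "ransomware" in text.lower() else "human_review"}


playbook = Playbook(
    ai_layer=stub_llm,
    actions={
        "isolate_host": lambda a: f"isolated host for {a.alert_id}",
        "human_review": lambda a: f"queued {a.alert_id} for analyst",
    },
)
```

Because the AI layer is just a callable, it can be replaced with a different model or a test double while the action and audit layers stay unchanged.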
AI Decision Points
Effective AI-powered playbooks require well-defined decision points where LLMs evaluate context and determine workflow direction. Each decision type carries different risk levels and requires appropriate confidence thresholds and fallback mechanisms.
Decision Types
Different decision types require varying levels of confidence before automated action. Classification decisions can tolerate slightly lower confidence since misclassification is usually recoverable, while response selection requires high confidence given the potential impact of automated actions.
| Decision Type | LLM Role | Confidence Threshold | Fallback |
|---|---|---|---|
| Classification | Categorize incident | > 90% | Human review |
| Severity assessment | Determine priority | > 85% | Conservative default |
| Response selection | Choose action path | > 95% | Human approval |
| Escalation | Determine escalation need | > 80% | Always escalate |
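
One way to encode the table above is a per-decision policy that pairs each threshold with its fallback. The sketch below assumes confidence is expressed as a fraction between 0 and 1; the names `DecisionPolicy`, `POLICIES`, and `resolve`, and the fallback labels, are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionPolicy:
    threshold: float  # the LLM answer is accepted only above this confidence
    fallback: str     # what the playbook does otherwise


# Values mirror the decision-type table; keys and fallback labels are hypothetical.
POLICIES = {
    "classification": DecisionPolicy(0.90, "human_review"),
    "severity_assessment": DecisionPolicy(0.85, "conservative_default"),
    "response_selection": DecisionPolicy(0.95, "human_approval"),
    "escalation": DecisionPolicy(0.80, "always_escalate"),
}


def resolve(decision_type: str, llm_answer: str, confidence: float) -> str:
    """Return the LLM's answer if it clears the threshold, else the fallback."""
    policy = POLICIES[decision_type]
    if confidence > policy.threshold:
        return llm_answer
    return policy.fallback
```

For example, `resolve("response_selection", "isolate_host", 0.93)` yields `"human_approval"`, while the same confidence on a classification decision passes the LLM's answer through.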
Confidence-Based Routing
LLM responses should include confidence scores that drive workflow routing decisions. Implementing confidence-based routing ensures that uncertain decisions receive appropriate human oversight while allowing high-confidence decisions to proceed automatically.
| Confidence Level | Action | Rationale |
|---|---|---|
| High (> 95%) | Auto-execute | Reliable decision |
| Medium (70-95%) | Execute with logging | Monitor for issues |
| Low (50-70%) | Human review | Uncertain decision |
| Very Low (< 50%) | Escalate | Requires expertise |
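
A minimal routing sketch follows. It assumes the LLM is prompted to return JSON containing a numeric `confidence` field between 0 and 1 (an assumption, not a standard response format), and it treats the bands as non-overlapping so that the low band covers 50% up to 70%. Anything that cannot be parsed is escalated rather than trusted.

```python
import json


def route(confidence: float) -> str:
    """Map a confidence score in [0, 1] to a workflow route."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence > 0.95:
        return "auto_execute"
    if confidence >= 0.70:
        return "execute_with_logging"
    if confidence >= 0.50:
        return "human_review"
    return "escalate"


def route_llm_response(raw: str) -> str:
    """Parse a JSON LLM response and route it; malformed output is escalated."""
    try:
        confidence = float(json.loads(raw)["confidence"])
        return route(confidence)
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, a missing field, a non-numeric value, or an out-of-range score.
        return "escalate"
```

Failing closed on malformed output keeps a broken or truncated model response from being mistaken for a high-confidence decision.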
Playbook Design Patterns
Well-designed AI playbooks follow established patterns that balance automation efficiency with safety requirements. The following patterns demonstrate how to structure adaptive workflows for common security operations scenarios. For techniques on crafting effective prompts within these workflows, see Prompt Engineering for Security.
Adaptive Triage Playbook
The adaptive triage playbook enhances traditional alert processing with LLM capabilities at each stage. This pattern is particularly effective for handling the volume and variety of alerts that overwhelm purely rule-based approaches.
| Stage | Traditional | AI-Enhanced |
|---|---|---|
| 1. Alert ingestion | Parse alert fields | + LLM context extraction |
| 2. Enrichment | IOC lookups | + LLM correlation analysis |
| 3. Classification | Rule-based | + LLM reasoning |
| 4. Prioritization | Severity field | + LLM impact assessment |
| 5. Routing | Static rules | + LLM team matching |
Incident Investigation Playbook
Investigation playbooks benefit significantly from LLM integration due to the reasoning required to connect disparate evidence and construct coherent incident narratives. The AI assists analysts rather than replacing them, providing structured analysis that accelerates human decision-making.
| Stage | AI Capability | Output |
|---|---|---|
| Scope determination | Analyze initial indicators | Affected systems list |
| Evidence collection | Identify relevant data sources | Collection plan |
| Timeline construction | Correlate events | Incident timeline |
| Root cause analysis | Reason about causation | Hypothesis ranking |
| Impact assessment | Evaluate business impact | Impact summary |
Human-in-the-Loop Controls
Human oversight remains essential for AI-powered security automation. The level of oversight should correspond to the reversibility and impact of automated actions. Designing appropriate approval gates and escalation paths ensures that AI augments human capabilities without introducing unacceptable risk. For more on implementing safety controls, see AI Guardrails and Safety.
Approval Gates
Approval requirements should be proportional to action risk. Read-only operations can proceed automatically, while irreversible or high-impact actions require explicit human approval. This graduated approach maximizes automation benefits while maintaining necessary controls.
| Action Category | Approval Requirement | Timeout Behavior |
|---|---|---|
| Read-only enrichment | None | Auto-proceed |
| Reversible containment | Optional | Auto-proceed with logging |
| Irreversible containment | Required | Escalate |
| Remediation | Required | Hold |
| Communication | Required | Hold |
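
The approval table can be expressed as a small lookup that also captures timeout behavior. In this sketch the category keys, the `Approval` enum, and the returned step names are hypothetical; `approved` is `True` or `False` once an analyst responds and `None` when the approval window has expired.

```python
from enum import Enum
from typing import Optional


class Approval(Enum):
    NONE = "none"
    OPTIONAL = "optional"
    REQUIRED = "required"


# (approval requirement, behavior when the approval window times out)
GATES = {
    "read_only_enrichment": (Approval.NONE, "auto_proceed"),
    "reversible_containment": (Approval.OPTIONAL, "auto_proceed_with_logging"),
    "irreversible_containment": (Approval.REQUIRED, "escalate"),
    "remediation": (Approval.REQUIRED, "hold"),
    "communication": (Approval.REQUIRED, "hold"),
}


def next_step(category: str, approved: Optional[bool]) -> str:
    """Decide what the playbook does next for an action in the given category."""
    requirement, on_timeout = GATES[category]
    if requirement is Approval.NONE:
        return "proceed"  # no gate: the analyst is never asked
    if approved is True:
        return "proceed"
    if approved is False:
        return "cancel"
    return on_timeout     # no answer within the window
```

An unknown category raises `KeyError`, which is deliberate here: an action that has not been classified should not run by default.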
Escalation Triggers
Automated escalation ensures that situations beyond AI capability or authority receive appropriate human attention. Clear escalation paths prevent automation failures from causing delayed response to critical incidents.
| Trigger | Condition | Escalation Path |
|---|---|---|
| Low confidence | AI confidence < threshold | Senior analyst |
| High severity | Critical/High incidents | Incident commander |
| Novel pattern | No similar historical incidents | Threat intel team |
| Timeout | No response within SLA | Management |
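
The triggers above can be evaluated together, since one incident may match several of them. The following sketch assumes a simple `IncidentState` record; its field names, the default confidence threshold of 0.80, and the 30-minute SLA are placeholders to adapt to your environment.

```python
from dataclasses import dataclass


@dataclass
class IncidentState:
    confidence: float        # AI confidence in [0, 1]
    severity: str            # e.g. "critical", "high", "medium", "low"
    similar_incidents: int   # count of similar historical incidents found
    minutes_waiting: float   # time spent waiting for a human response


def escalation_paths(
    state: IncidentState,
    confidence_threshold: float = 0.80,  # placeholder threshold
    sla_minutes: float = 30.0,           # placeholder SLA
) -> list[str]:
    """Return every escalation path whose trigger condition is met."""
    paths = []
    if state.confidence < confidence_threshold:
        paths.append("senior_analyst")
    if state.severity.lower() in {"critical", "high"}:
        paths.append("incident_commander")
    if state.similar_incidents == 0:
        paths.append("threat_intel_team")
    if state.minutes_waiting > sla_minutes:
        paths.append("management")
    return paths
```

Returning a list rather than the first match lets the orchestration layer notify several teams at once, for example when a critical incident also shows a novel pattern.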
Quality and Monitoring
Continuous monitoring of AI playbook performance enables iterative improvement and early detection of degradation. Track both operational metrics and AI-specific indicators to maintain visibility into automation effectiveness. For comprehensive approaches to AI system monitoring, see AI Observability and Monitoring.
| Metric | Description | Target |
|---|---|---|
| Automation rate | % incidents fully automated | > 60% |
| Decision accuracy | Correct AI decisions | > 95% |
| MTTR | Mean time to respond | Reduce by 50% |
| False positive rate | Incorrect automated actions | < 1% |
| Human override rate | Analyst corrections | < 10% |
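
Several of these metrics can be computed directly from the decision log. The sketch below assumes each logged decision has been labeled after the fact with three booleans; the `DecisionRecord` fields and function names are illustrative, and MTTR and false positive rate are omitted because they need timing and action-level data not modeled here.

```python
import operator
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    fully_automated: bool  # incident closed with no human step
    ai_correct: bool       # AI decision judged correct on review
    overridden: bool       # an analyst changed the AI decision


def summarize(records: list[DecisionRecord]) -> dict[str, float]:
    """Compute rate metrics over a non-empty list of decision records."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    return {
        "automation_rate": sum(r.fully_automated for r in records) / n,
        "decision_accuracy": sum(r.ai_correct for r in records) / n,
        "human_override_rate": sum(r.overridden for r in records) / n,
    }


# Targets taken from the metrics table: (comparison, target value).
TARGETS = {
    "automation_rate": (operator.gt, 0.60),
    "decision_accuracy": (operator.gt, 0.95),
    "human_override_rate": (operator.lt, 0.10),
}


def unmet_targets(summary: dict[str, float]) -> list[str]:
    """List the metrics that currently miss their target."""
    return [
        name for name, (meets, target) in TARGETS.items()
        if not meets(summary[name], target)
    ]
```

Running `unmet_targets(summarize(records))` on a rolling window of recent decisions gives a simple degradation signal to alert on.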
Anti-Patterns to Avoid
When implementing AI-powered SOAR playbooks, several common mistakes can undermine effectiveness or introduce security risks. Recognizing and avoiding these patterns is essential for successful deployment.
- Full automation without oversight: AI can make mistakes. Always include human checkpoints for critical actions.
- Ignoring confidence scores: Low-confidence decisions need human review. Implement confidence-based routing.
- Static prompts: Playbook context changes. Use dynamic prompts with current incident data.
- Missing audit trails: AI decisions must be logged. Maintain complete decision history.
- Over-reliance on AI: Some decisions require human judgment. Know when to escalate.
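
Two of these anti-patterns, static prompts and missing audit trails, have straightforward code-level remedies. The sketch below uses only the Python standard library; the prompt wording, the incident field names (`alert_type`, `host`, `indicators`), and the audit record layout are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from string import Template

# Prompt template filled with current incident data on every run (wording is illustrative).
PROMPT = Template(
    "You are assisting a SOC analyst.\n"
    "Alert type: $alert_type\n"
    "Affected host: $host\n"
    "Observed indicators: $indicators\n"
    "Classify the incident and report a confidence between 0 and 1."
)


def build_prompt(incident: dict) -> str:
    """Render the prompt from the incident's current fields."""
    return PROMPT.substitute(
        alert_type=incident["alert_type"],
        host=incident["host"],
        indicators=", ".join(incident["indicators"]),
    )


def audit_entry(incident_id: str, prompt: str, decision: str, confidence: float) -> str:
    """Serialize one AI decision as a JSON line suitable for an append-only log."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "incident_id": incident_id,
            "prompt": prompt,
            "decision": decision,
            "confidence": confidence,
        },
        sort_keys=True,
    )
```

Logging the rendered prompt alongside the decision makes it possible to reconstruct later exactly what the model was shown when it chose an action.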