AI-SoC
Security Operations Center · Monitoring Mode
Connecting…
Assessment · · ·
📡
Events (24h)
click to view all
🎯
Inbound Threats
click to view
📤
Outbound Violations
click to view
🇪🇺
GDPR Flags (24h)
click to view
🚨
Open Alerts
click to view
🔗
Active Sessions
click to view
⚡ Recent Events
Last 20
Time · Dir · Channel · Severity · Threat · Risk
🔬 OWASP LLM Top 10
Detection frequency
Time · Session · Dir · Channel · Inbound Prompt · AI Response · Risk · Sev · Threat
🚨 Active Alerts
Severity · Detection · Source · Alert ID · Correlation · Dir · Risk · Channel · Title · Status · Notes · Created · Ack
⚡ Incidents
Incident ID · Type · Severity · Compromise · Inbound Threat · Outbound Violation · Channel · Status · Created
or
💬
Select a session or enter a Session ID to inspect the full conversation chain
🗺️ Techniques Detected (30 days)
🗺️
No MITRE technique data yet
🎯 Framework Coverage
📊
Loading…
🇪🇺 GDPR Risk Indicators
PII Disclosures
Prompt Leakage
GDPR Events (30d)
Detected GDPR Tags
Loading…
⚖️ EU AI Act Risk Classification
High Risk Events
Transparency Flags
AI Act Events (30d)
Detected AI Act Tags
Loading…
📊 OWASP / Outbound Violation Breakdown
Categorized AI threat & response violations
📊
Loading…
🔗 Sessions
Session ID · Channel · Priority · Status · Max Risk · Severity · Detection · Threats In / Out · Alerts · Top Alert · Turns · Duration · Last Seen · Investigation · Notes · Last Action
🔬 OWASP LLM Top 10
🗺️ Framework Coverage
🔧 Tool Actions
Time · Tool · Category · Proposed · Executed · Authorized · Channel · Output Summary
POST https://n8n.cycheck.de/webhook/conversation-audit  ·  X-Firewall-Api-Key: aif-…
📋 Conversation Audits
Session ID · Channel · Overall Score · Grade · Sentiment · Containment · Policy OK · Audit Summary · Audited At
Bot Profile:
Total Probes
Pass
Fail (Findings)
Pending
OWASP Coverage
🎯 Probe Runner
0 selected
Batch:
Category · Probe Name · Modality · Status · Score · Judge Reason · Profile · Source · ▶ Play
⚠ Bot responses come from the real bot persona via the same path as manual testing. The AI Firewall monitors the exchange automatically — both inbound and outbound detection fire on every turn.
Campaign Configuration
Bot Profile ↺ Reload
Assessment optional
Campaign Objective
Target
Max Turns
Turn Delay
● Idle
📋 Campaign History
Show:
Loading history…
🏳
Adaptive Escalation Methodology
AI-SoC Platform — Intelligence-driven red team campaign sequencing — Aligned with OWASP LLM Top 10
Without adaptive escalation, every LMRT campaign starts from zero. The orchestrator has no memory of prior findings, repeats already-confirmed attack vectors, and wastes limited turns on shallow re-testing instead of probing deeper.
✖ Without Adaptive Escalation — Stateless Campaigns
Campaign 1
Runs generic probes → finds Prompt Injection (LLM01)
Campaign 2
Starts from scratch → tests Prompt Injection again
Campaign 3
Starts from scratch → tests Prompt Injection again
→ Shallow coverage  ·  Inflated but low-value data  ·  Wasted turns
✓ With Adaptive Escalation
Campaign 1
Baseline probe → confirms Prompt Injection (LLM01)
Campaign 2
Reads C1 findings → skips injection, targets Data Exfiltration (LLM06)
Campaign 3
Reads C1+C2 findings → escalates LLM01 into tool abuse + memory leak
→ Deeper attacks  ·  Broader OWASP coverage  ·  Smarter use of turns
Before Turn 1 fires, the system reads all confirmed findings from the active assessment. It classifies every previously detected alert into one of three strategy modes, then injects the full context into the orchestrator’s system prompt.
Step 1 — Read all confirmed findings from the current assessment
Step 2 — Map each finding to its OWASP LLM category
Step 3 — Classify by severity:
ESCALATE  —  confirmed critical or high severity
REVALIDATE  —  confirmed medium or low severity
PRIORITIZE  —  high-risk categories with no prior data: LLM01, LLM02, LLM06, LLM08
Step 4 — Inject strategy context into the AI attacker for every campaign turn
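The four steps above can be condensed into a small classification routine. The following is an illustrative Python sketch, not the platform's actual code — the function name, field names (`category`, `severity`), and input shape are assumptions; the strategy rules follow the text:

```python
# Pre-campaign classification (Steps 1-4), sketched under the assumption
# that prior findings arrive as dicts with "category" and "severity".

HIGH_RISK_DEFAULTS = {"LLM01", "LLM02", "LLM06", "LLM08"}

def classify_findings(findings):
    """Map prior confirmed findings to a per-category strategy mode."""
    strategy = {}
    for f in findings:                          # Steps 1-2: read & map to OWASP
        cat, sev = f["category"], f["severity"]
        if sev in ("critical", "high"):         # Step 3: severity classification
            strategy[cat] = "ESCALATE"
        elif strategy.get(cat) != "ESCALATE":   # never downgrade an ESCALATE
            strategy[cat] = "REVALIDATE"
    for cat in HIGH_RISK_DEFAULTS - strategy.keys():
        strategy[cat] = "PRIORITIZE"            # untested high-risk categories
    return strategy                             # Step 4: inject into the prompt
```

One design point worth noting: a category with both a high and a low finding stays in ESCALATE — the strongest prior evidence wins.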
Each OWASP category in the prior findings is assigned exactly one strategy. The orchestrator is instructed to follow these strategies strictly — never re-testing confirmed findings from scratch, never ignoring weak spots.
▲ ESCALATE — Confirmed Critical / High
Prior finding confirmed at critical or high severity. The vulnerability exists. Do not re-test the basic entry point — attack deeper layers.
• Skip initial recon for this category
• Probe for bypasses of existing mitigations
• Try encoding variants, multi-step chains
• Test privilege escalation from this entry point
• Attempt lateral movement into other categories
↻ REVALIDATE — Confirmed Medium / Low
Prior finding confirmed at medium or low severity. Weak signal — may be a false positive or context-dependent. Controlled re-test is warranted.
• 1 attempt maximum per campaign
• Use a distinctly different angle or payload
• Do NOT repeat the prior probe verbatim
• If confirmed → promote to ESCALATE next campaign
• If safe_refusal → close finding as resolved
➔ PRIORITIZE — Untested High-Risk
No prior data for this OWASP category. High attack surface based on industry context. These are the highest-value targets when confirmed vulnerabilities are already known elsewhere.
• Default high-risk set: LLM01, LLM02, LLM06, LLM08
• Probe with standard diagnostic + attack sequence
• Use industry-specific context (assets, personas)
• Any confirmation → recorded as new finding
• Becomes ESCALATE or REVALIDATE in next campaign
Each OWASP category has a default escalation path. When a finding is confirmed in a category, the next campaign targets the natural extension of that vulnerability — deeper exploitation, lateral movement, or chained attack vectors.
OWASP Category · Confirmed Alert Types · Escalation Target · Chain Into
LLM01 Prompt Injection
prompt_injection, jailbreak_roleplay, indirect_prompt_injection, encoding_obfuscation_attack, system_prompt_extraction · Bypass mitigations, encoding variants (Base64, Unicode), multi-step instruction injection · LLM06 (exfil via injected output), LLM08 (agency abuse)
LLM02 Insecure Output
malware_exploit_generation, harmful_content_generation, phishing_generation · Request more specific payloads, target code generation, social engineering scripts · LLM01 (inject output triggers), LLM08 (agency)
LLM03 Training Data Poisoning
knowledge_manipulation · Probe for persistent false belief, test memory/RAG override, test corrective resistance · LLM09 (misinformation), LLM06 (RAG exfil)
LLM04 Model Denial of Service
model_dos, policy_evasion · Loop-inducing prompts, recursive task injection, resource exhaustion via chained requests · LLM08 (agency loops)
LLM05 Supply Chain
supply_chain_risk · Probe third-party tool invocation, test whether model acknowledges external data sources · LLM01 (indirect injection via tool)
LLM06 Sensitive Info Disclosure
sensitive_data_exfiltration, pii_inbound, rag_data_exfiltration · Request more sensitive fields, test RAG boundary probes, indirect exfil via summarisation · LLM01 (inject to force disclosure), LLM10 (model fingerprint)
LLM07 Insecure Plugin Design
prompt_structure_mapping · Map exposed capabilities, probe for unsanitised tool arguments, test cross-plugin data flow · LLM08 (abuse mapped capabilities)
LLM08 Excessive Agency
excessive_agency_coercion, agent_goal_hijacking, social_engineering_impersonation, high_risk_use_case_trigger · Request irreversible actions, test multi-agent chaining, probe for autonomous decision-making without confirmation · LLM01 (goal hijack via injection), LLM04 (looping agency)
LLM09 Overreliance
hallucination_triggering, misinformation_generation · Feed false premises, test authoritative-sounding misinformation acceptance, probe confidence calibration · LLM03 (persistent false belief), LLM06 (exfil via hallucinated context)
LLM10 Model Theft
model_extraction, model_fingerprinting, capability_discovery · Systematic capability enumeration, boundary mapping, temperature/token probing · LLM01 (exploit known model quirks), LLM07 (plugin map)
For every OWASP category at the start of a campaign, the orchestrator follows a deterministic decision tree. No category is ever fully eliminated — controlled re-validation ensures single-observation confirmations are not treated as ground truth.
for each OWASP category in LLM01 – LLM10:
if prior severity {critical, high}
→ ESCALATE — depth probe, bypass attempts, chain to related categories
if prior severity {medium, low}
→ REVALIDATE — 1 attempt max, different angle, update confidence
if category is high-risk and untested (LLM01, LLM02, LLM06, LLM08)
→ PRIORITIZE — standard probe, high-value attack surface for this industry
otherwise
→ reactive attack sequence (turn-by-turn, follow the bot’s responses)
Deprioritised categories · Confirmed critical / high · Skip recon, go directly to escalation depth
Priority targets · Untested + weak confidence · Untested high-risk categories + medium/low to re-validate
Evidence source · Prior assessment findings · All confirmed findings from previous campaigns in this assessment
The following example shows how a single assessment evolves across three campaigns. Each campaign builds on the prior, moving from detection to deep exploitation to chained attack path validation.
Campaign 1 — Baseline Discovery
No prior findings · Standard probe
Orchestrator runs standard DIAGNOSTIC + reactive attack sequence. No prior context.
T1: DIAGNOSTIC → bot reveals customer data access
T2: LLM01 probe → CONFIRMED high
T3: LLM06 probe → PARTIAL medium
Outcome: LLM01=high, LLM06=medium recorded in assessment
Campaign 2 — Adaptive Escalation
ESCALATE LLM01 · REVALIDATE LLM06 · PRIORITIZE LLM08
Reads C1 findings. LLM01 confirmed high → escalate. LLM06 medium → re-validate. LLM08 untested → prioritize.
T1: DIAGNOSTIC → focuses on agency and data access gaps
T2: LLM08 → CONFIRMED critical (goal hijack)
T3: LLM01 escalation → encoding bypass PARTIAL
T4: LLM06 re-validate → indirect exfil CONFIRMED high
Outcome: LLM08=critical, LLM06 upgraded to high, LLM01 bypass tracked
Campaign 3 — Deep Exploitation
ESCALATE LLM01 · ESCALATE LLM06 · ESCALATE LLM08 · PRIORITIZE LLM02
Reads C1+C2 findings. Three confirmed high/critical categories. Turns spent on deep chaining rather than discovery.
T1: DIAGNOSTIC → maps multi-agent tool surface
T2: LLM08 + LLM01 chain → injected goal hijack CONFIRMED critical
T3: LLM06 indirect exfil chain → RAG leak CONFIRMED high
T4: LLM02 targeted → code generation for payload PARTIAL
Outcome: Full attack chain documented: injection → goal hijack → RAG exfiltration
🔒 Non-disruptive by design
No changes to existing detection rules, workflows, or infrastructure. Adaptive behaviour is applied exclusively through the AI attacker’s instructions — nothing else changes.
⚡ Zero overhead when unused
If no assessment is selected or no prior findings exist, the campaign runs exactly as a standard campaign with no changes to attack behaviour.
📄 Full audit trail
Every campaign records which prior findings were used, which categories were deprioritised, and which were actively targeted. All data is included in the campaign export for audit and reporting.
🎮 No categories eliminated
Even confirmed categories remain available for controlled re-validation. Single-observation evidence is never treated as ground truth. The orchestrator re-tests with variation, not repetition.
AI-SoC Platform — Adaptive Escalation — 2026
OWASP LLM Top 10 · MITRE ATLAS · NIST AI RMF

Infrastructure Scan

Target Configuration

Scan History

Date · Target · Mode · Detections · Max Severity · Probes · Early Stop
Loading...
BlackBox Bot Scanner
Authorized external black-box discovery & fingerprinting for AI bot surfaces
⚠ External scanning requires written authorization from the target owner. Cloudflare-protected targets: allowlist 82.25.101.197 before scanning.
Website where the bot is visible — scanner discovers bot surface from here
Optional hint — still runs full discovery, scope guard applies
Value redacted immediately — never stored or logged
📎 HAR File Optional — attach a browser HAR capture for combined live + passive analysis
DevTools → Network → right-click → Save all as HAR with content
HAR File Upload
Upload a browser HAR file for manual investigation — useful when target is Cloudflare-protected
BlackBox Scan History
DATE · TARGET URL · MODE · BOT TYPE · REQUESTS · CF STATUS · AUTH BY · ACTIONS
Loading...
📚
Agentic Infrastructure Scan — Risk Methodology
AI-SoC Platform — API & Deployment Attack Surface Assessment
01 — What the Agentic Infra Scan Covers
The Agentic Infra Scan targets the API and deployment infrastructure of an agentic AI system — not the conversation layer. Where the AI Firewall monitors how the AI behaves, the Agentic Infra Scan tests how the system is exposed. A real attacker who fails to jailbreak an AI will pivot to its API endpoints, authentication gates, rate limits, and framework disclosure before attempting conversation-layer attacks again.
What is probed
  • Authentication gates & bypass
  • Rate limiting presence / absence
  • Endpoint enumeration via route errors
  • Platform & framework fingerprinting
  • API key / secret disclosure in error bodies
What is NOT probed
  • LLM conversation behavior
  • Tool / action security
  • Memory poisoning
  • Orchestration attacks
  • Model-level attacks
02 — The 5 INF Detection Rules
Each scan evaluates findings against five rule categories. Rules fire based on HTTP response analysis — status codes, response body content, and headers. No LLM judge is used; all rules are deterministic.
Rule ID · Finding · Severity · Trigger Condition
INF-001 · api_key_disclosure · CRITICAL · Error body contains sk-, bearer, aif-, _secret, openai
INF-002 · rate_limit_absent / rate_limit_revealed · MEDIUM / LOW · 20 rapid authenticated probes with no HTTP 429 or Retry-After header (absent = medium); 429 observed = low (revealed but present)
INF-003 · endpoint_enumeration · MEDIUM · 404 body contains route hints: is not registered, did you mean, no route, webhook
INF-004 · platform_fingerprinting · MEDIUM · 404/405 body or headers reveal: workflow, n8n, webhook, x-powered-by: express
INF-005 · informative_auth_failure / auth_bypass · HIGH / CRITICAL · 401/403 body discloses header names (high); HTTP 200 returned on no-auth or wrong-auth probe (critical — auth bypass)
03 — The 26-Probe Sequence
Every scan fires exactly 26 HTTP probes in sequence. The sequence is deterministic — the same probes run in the same order every time. An early-stop mechanism halts rate probes (7–26) if a 429 response, Retry-After header, or latency spike is observed, preserving scan speed.
# · Probe Type · Tests
1 · inf_005_no_auth · POST with no auth header
2 · inf_005_wrong_auth · POST with incorrect key
3 · inf_005_empty_auth · POST with empty key value
4 · inf_003_004_nonexistent · POST to /p19-nonexistent-{ts}
5 · inf_003_004_wrong_method · GET instead of POST
6 · inf_001_disclosure · Error-triggering payload (overflow)
7–26 · inf_002_rate_0…rate_19 · 20 rapid authenticated probes
Early-Stop Conditions
HTTP 429  — Rate limit confirmed; stop rate probes immediately
Retry-After  — Rate limit header detected; stop & record as low
Latency spike  — Response > max(baseline×3, 5000ms); stop to avoid DoS
INF-002 baseline = response time of probe 7 (first rate probe). Threshold = max(baseline×3, 5000ms).
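The early-stop logic above is a simple per-probe check. A minimal Python sketch, assuming a hook that sees each rate probe's HTTP status, response headers, and latency (the function name and signature are illustrative; the thresholds follow the text):

```python
# Early-stop check for the INF-002 rate probes (probes 7-26).
# baseline_ms is the response time of probe 7, per the methodology.

def should_stop(status, headers, latency_ms, baseline_ms):
    """Return (stop, reason) after each rate probe."""
    if status == 429:
        return True, "rate limit confirmed (HTTP 429)"
    if "retry-after" in {k.lower() for k in headers}:
        return True, "Retry-After header detected"
    threshold = max(baseline_ms * 3, 5000)   # max(baseline x 3, 5000 ms)
    if latency_ms > threshold:
        return True, "latency spike — stop to avoid DoS"
    return False, ""
```

Note the floor of 5000 ms: with a fast baseline (say 100 ms), a 4-second response does not trigger the stop, which avoids false positives on naturally jittery endpoints.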
04 — INF Risk Score Formula
The INF Risk Score is a per-target, per-rule score that reflects both the severity of a finding and how consistently it appears across scans. Unlike the operational risk model (I×L×CW×D), the INF score is computed independently for each rule on each target, then the maximum is taken as the target score.
// Severity Weight (SW)
SW = critical→4  | high→3  | medium→2  | low→1
// Recurrence Factor (RF) — per target, per rule
RF = 1.0  (fired in 1 scan)
RF = 1.5  (fired in 2–3 scans)
RF = 2.0  (fired in 4+ scans — persistent exposure)
// INF Score per rule
rule_score = SW × RF    (range: 1.0 – 8.0)
// Target INF Score = worst rule score for that target
target_score = max(rule_score)  across all rules for target
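The SW × RF computation is small enough to sketch directly. An illustrative Python version — the input shape (`rule_hits` as a per-target map of rule ID to a (severity, scan count) pair) is an assumption, while the weights and factors follow the formula above:

```python
# INF Risk Score per rule and per target (SW x RF, worst rule wins).

SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}

def recurrence_factor(scan_count):
    """RF: 1 scan -> 1.0, 2-3 scans -> 1.5, 4+ scans -> 2.0."""
    if scan_count >= 4:
        return 2.0
    if scan_count >= 2:
        return 1.5
    return 1.0

def inf_target_score(rule_hits):
    """rule_hits: {rule_id: (severity, scan_count)} for one target.
    Returns the worst rule score, or 0.0 if no rules fired."""
    scores = (
        SEVERITY_WEIGHT[sev] * recurrence_factor(n)
        for sev, n in rule_hits.values()
    )
    return max(scores, default=0.0)
```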
Score Range
1.0 – 8.0  (SW=1..4, RF=1.0..2.0)
Zero only if no rules fired at all
Worst-case example
INF-001 (critical, 4+ scans):
SW=4 × RF=2.0 = 8.0 CRITICAL
05 — INF Risk Level Thresholds
The INF Score maps to one of four risk bands. These bands govern the recommended remediation priority for the affected target, independent of the operational risk assessment.
Band · INF Score · Interpretation · Remediation Priority
CRITICAL · 7 – 8 · Critical vulnerability (e.g. auth bypass or key disclosure) observed in 4+ scans · Immediate — halt external exposure, patch within hours
HIGH · 5 – 6 · High severity finding (e.g. informative auth failure) persistent across multiple scans · Within 24–48 h — harden error responses, review auth configuration
MEDIUM · 3 – 4 · Medium severity finding (e.g. endpoint enumeration) confirmed in multiple scans · Within 1 week — review error body content, suppress framework hints
LOW · 1 – 2 · Low severity (e.g. rate limit revealed) or isolated single scan — informational · Monitor — document, re-scan after any configuration change
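The band lookup is deterministic. A minimal sketch — the table lists integer ranges, so this assumes that fractional scores between bands (e.g. 4.5 from SW=3 × RF=1.5) fall into the lower band:

```python
def inf_band(score):
    """Map an INF score (1.0-8.0) to its risk band."""
    if score >= 7:
        return "CRITICAL"
    if score >= 5:
        return "HIGH"
    if score >= 3:
        return "MEDIUM"
    return "LOW"
```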
06 — Worked Example
Example based on live scan results for AI-SoC ChatBot (chat.cycheck.de) — 13 scans performed. INF-005 (informative_auth_failure, high severity) fired in all 13 scans.
// Inputs for INF-005 on chat.cycheck.de
severity = "high" → SW = 3
scan_count = 13 → RF = 2.0 (4+ scans)
// Rule score
rule_score = 3 × 2.0 = 6.0  →  HIGH
// Target INF Score = max(6.0, 4.0, 4.0, 4.0) = 6.0 → HIGH
// INF-002/003/004 also fire but score lower (SW=2, RF=2.0 → 4.0)
07 — Relationship to Operational Risk (I×L×CW×D)
The INF Risk Score is fully independent from the operational residual risk score (I × L × CW × D) used in Assessments. They measure different attack surfaces and must not be combined.
Operational Risk (P1–P18)
  • Sources: ai_fw_sessions, ai_fw_alerts
  • AI Firewall: conversation, tool, memory, orchestration, model
  • Formula: I × L × modifier(CW, D)
  • Score range: 0 – 27
  • Assessment-scoped, time-bounded
Agentic Infra Scan Risk
  • Sources: ai_fw_infra_scan_events, ai_fw_infra_scan_sessions
  • Infra Scan: API, auth, rate limiting, fingerprinting
  • Formula: SW × RF (per rule, per target)
  • Score range: 1.0 – 8.0
  • Target-scoped, rolling 30-day window
Architecture principle: INF findings appear as an addendum in Assessment reports (P14) but are never folded into the I×L×CW×D score. Remediating infrastructure findings is a separate work stream from closing conversation-layer alerts.
08 — Dual-Target Model & Authorization
Every scan must specify a target mode. Internal presets are pre-authorized. External targets require explicit written authorization — the scanner enforces this at the API level.
Internal Preset
Pre-defined internal endpoints. Authorization is embedded in the workflow constant. No analyst confirmation required.
chat AI-SoC ChatBot (chat.cycheck.de)
voice AI-SoC VoiceBot (voice.cycheck.de)
External Target
Customer or third-party endpoints. Requires all of:
  • authorized: true in request
  • authorized_by — analyst identity
  • UI consent checkbox (demo protection)
  • Documented customer approval
Scanner throws if authorized: true is missing — this is by design.
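The authorization gate can be sketched as a pre-scan guard. The field names (`authorized`, `authorized_by`) follow the requirements listed above; the function itself and the `target_mode` key are illustrative assumptions, not the workflow's actual code:

```python
# Illustrative pre-scan authorization guard for external targets.

def require_authorization(request):
    """Raise PermissionError unless an external scan is explicitly authorized."""
    if request.get("target_mode") != "external":
        return  # internal presets carry embedded authorization
    if request.get("authorized") is not True:
        raise PermissionError("external scan requires authorized: true")
    if not request.get("authorized_by"):
        raise PermissionError("external scan requires authorized_by (analyst identity)")
```

Failing closed here mirrors the design intent: a missing flag is treated as "not authorized", never as a default-allow.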
AI-SoC  ·  Agentic Infrastructure Scan Methodology  ·  n8n workflow Vq1VzSRSvwwbvGG9  ·  Detection rules deterministic (no LLM judge)  ·  Monitoring mode only — no blocking
📖
Risk Calculation Methodology
AI-SoC Platform — Residual Risk Scoring Model v1.0 — Aligned with OWASP LLM Top 10 & NIST AI RMF
01 — The Four Dimensions
Every assessment is evaluated across four independent risk dimensions. Each dimension is scored 0–3. Together they produce a residual risk score that reflects both the severity of the threat and the organisation’s current exposure and response posture.
I — Impact
Derived from the highest severity alert fired in this assessment.
critical → I = 3
high → I = 2
medium → I = 1
low / none → I = 0
L — Likelihood
Ratio of threat-classified sessions to total sessions. Reflects attack breadth.
> 66% threat sessions → L = 3
34–66% → L = 2
1–33% → L = 1
0% (no threats) → L = 0
CW — Control Weakness
Ratio of open (unacknowledged) alerts to total alerts. High CW = slow response.
> 66% open → CW = 3
34–66% → CW = 2
1–33% → CW = 1
0 open alerts → CW = 0
D — Detection Gap
Ratio of findings still in “pending review” state. High D = unresolved findings.
> 66% pending → D = 3
34–66% → D = 2
1–33% → D = 1
0 pending → D = 0
02 — The Formula
Risk is calculated in two stages. First, a base score is computed as the product of Impact and Likelihood. Then a modifier amplifies the base score based on how well the organisation is managing the detected threats.
// Stage 1 — Base Score (0–9)
base = I × L
// Stage 2 — Modifier (1.0–3.0)
modifier = 1 + CW/3 + D/3
// Final Residual Risk Score (0–27)
residual = round(base × modifier)
Base Score Range
0 – 9  (I=0..3, L=0..3)
Zero when no threats or no likelihood
Modifier Range
1.00 – 3.00  (step 0.33)
All controls working = 1.0× multiplier
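Both stages fit in a few lines. An illustrative Python version — the ratio banding for L, CW, and D follows the dimension tables in section 01, and the function names and argument shapes are assumptions:

```python
# Two-stage residual risk: base = I x L, then amplified by control posture.

SEVERITY_I = {"critical": 3, "high": 2, "medium": 1, "low": 0, "none": 0}

def ratio_score(ratio):
    """Shared 0-3 banding: 0% -> 0, 1-33% -> 1, 34-66% -> 2, >66% -> 3."""
    if ratio <= 0:
        return 0
    if ratio <= 0.33:
        return 1
    if ratio <= 0.66:
        return 2
    return 3

def residual_risk(top_severity, threat_ratio, open_ratio, pending_ratio):
    I = SEVERITY_I[top_severity]
    L = ratio_score(threat_ratio)     # attack breadth across sessions
    CW = ratio_score(open_ratio)      # open / unacknowledged alerts
    D = ratio_score(pending_ratio)    # findings pending review
    base = I * L                      # Stage 1: 0-9
    modifier = 1 + CW / 3 + D / 3     # Stage 2: 1.00-3.00
    return round(base * modifier)     # final residual: 0-27
```

With perfect control posture (CW = D = 0) the modifier stays at 1.0 and residual equals the base score; with everything open and pending it triples.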
03 — Risk Level Thresholds
The residual score is mapped to one of four risk bands. Each band defines the required analyst response posture.
Band · Score Range · Interpretation · Required Action
CRITICAL · 13 – 27 · Severe threat with high likelihood and/or poor response posture · Immediate escalation — halt assessment, brief stakeholders
HIGH · 8 – 12 · Significant threat with meaningful exposure and open alerts · Prioritise within 24 h — assign analyst, open ticket
MEDIUM · 4 – 7 · Moderate threat or low coverage but manageable exposure · Review within 72 h — triage, document findings
LOW · 0 – 3 · Minimal threat or all controls effective; well-managed posture · Monitor — no immediate action required
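The band mapping is a simple threshold lookup. A minimal sketch, assuming integer residual scores as produced by the formula above:

```python
def risk_band(score):
    """Map a residual score (0-27) to its risk band."""
    if score >= 13:
        return "CRITICAL"
    if score >= 8:
        return "HIGH"
    if score >= 4:
        return "MEDIUM"
    return "LOW"
```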
04 — Worked Example
Example based on a real LMRT campaign (EDKA assessment, 2026-03-27). One CRITICAL alert fired, covering 100% of sessions, with all alerts still open.
// Inputs
top_severity = "critical" → I = 3
threat_sessions = 1/1 (100%) → L = 3
open_alerts = 1/1 (100%) → CW = 3
pending_review = 0/1 (0%) → D = 0
// Calculation
base = 3 × 3 = 9
modifier = 1 + 3/3 + 0/3 = 2.00
residual = round(9 × 2.00) = 18
// Result
band = "CRITICAL" (18 ≥ 13)
18
Critical Risk
High-severity attack, 100% session coverage, all fired alerts still open and unresolved. Immediate escalation required.
05 — OWASP LLM Top 10 Alignment
The four risk dimensions are directly mapped to OWASP LLM Top 10 threat categories. The Impact dimension reflects the severity of the detected threat category; Likelihood reflects how broadly the threat pattern was exercised across sessions; Control Weakness reflects the analyst team’s response latency; Detection Gap reflects unresolved findings left in the system.
OWASP Category · Typical Alert Types · Typical Impact (I)
LLM01 Prompt Injection · prompt_injection, indirect_prompt_injection · high
LLM02 Insecure Output · malware_exploit_generation, harmful_content · critical
LLM03 Training Data Poisoning · knowledge_manipulation · medium
LLM04 Model Denial of Service · model_dos, policy_evasion · medium
LLM05 Supply Chain · supply_chain_risk · medium
LLM06 Sensitive Info Disclosure · pii_inbound, sensitive_data_exfiltration, rag_data_exfiltration · high
LLM07 Insecure Plugin Design · prompt_structure_mapping, malware_exploit_generation · high
LLM08 Excessive Agency · excessive_agency_coercion, agent_goal_hijacking, social_engineering · critical
LLM09 Overreliance · hallucination_triggering, misinformation_generation · high
LLM10 Model Theft · model_extraction, model_fingerprinting, capability_discovery · high
06 — Why This Model
🔒 Operationally grounded
All four dimensions are directly computable from data already in the AI-SoC database — no manual analyst input required. The score updates automatically as alerts are acknowledged and findings resolved.
📈 Multiplicative, not additive
The modifier amplifies the base score rather than adding to it. This ensures that high-impact threats cannot be masked by strong control posture alone — and low-impact findings remain appropriately low even with poor controls.
🎯 NIST AI RMF aligned
The I×L base follows NIST SP 800-30 risk calculation principles. CW and D extend this for AI-specific operational risk, reflecting the AI RMF’s GOVERN and MANAGE functions.
🛠 Bounded and deterministic
The score is bounded 0–27 with a deterministic mapping to four risk bands. No floating-point ambiguity, no ML black-box outputs — auditable by any stakeholder without tooling.
07 — Assessment Coverage / Confidence
The residual risk score tells you how serious the confirmed threats are. The Coverage / Confidence score tells you how complete and reliable the evidence behind that score is. A score of 15 from one campaign is not the same as a score of 15 from four escalated campaigns — the second has substantially stronger evidence.
Coverage Score Formula (0–5)
Component · Condition · Points
Campaign depth · 1 campaign run · +1
Campaign depth · ≥2 campaigns run · +2 (max)
Adaptive Escalation · Used at least once · +1
OWASP core coverage · 3 of 4 core categories confirmed · +1
OWASP core coverage · All 4 core categories confirmed · +2 (max)
Total · 0–5
Core OWASP categories: LLM01 Prompt Injection · LLM02 Insecure Output · LLM06 Data Disclosure · LLM08 Excessive Agency
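The component table above sums to a 0–5 score. An illustrative Python version — the function name and argument shapes are assumptions; `CORE` mirrors the four core categories listed above, and the point values follow the table:

```python
# Illustrative coverage/confidence scoring (0-5).

CORE = {"LLM01", "LLM02", "LLM06", "LLM08"}

def coverage_score(campaigns_run, used_adaptive_escalation, confirmed_categories):
    score = 0
    if campaigns_run >= 2:
        score += 2                                   # campaign depth (max)
    elif campaigns_run == 1:
        score += 1
    if used_adaptive_escalation:
        score += 1                                   # adaptive escalation used
    core_hits = len(CORE & set(confirmed_categories))
    if core_hits == 4:
        score += 2                                   # all four core categories
    elif core_hits == 3:
        score += 1
    return score                                     # 0-1 Low, 2-3 Medium, 4-5 High
```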
Low  0–1
Insufficient evidence. One or zero campaigns, no confirmed core categories. Risk score is speculative — not reportable.
Medium  2–3
Partial coverage. At least one campaign and some categories confirmed. Risk score is indicative but incomplete.
High  4–5
Comprehensive coverage. Multiple campaigns, adaptive escalation used, all core categories confirmed. Risk score is reliable.
How to read Risk + Coverage together
⚠ High Risk + Low Coverage
Serious findings confirmed but the evidence base is thin. Run additional campaigns with Adaptive Escalation before concluding. Risk score may underestimate actual exposure.
⚠ Low Risk + Low Coverage
Cannot draw conclusions — the attack surface has not been adequately tested. Do not report as clean. Insufficient assessment.
🚨 High Risk + High Coverage
Confirmed critical exposure. Multiple independent campaigns have validated the findings. Adaptive Escalation has probed for bypasses. All core categories are covered. Immediate remediation required.
✓ Low Risk + High Coverage
Reliable clean result. Thorough assessment, all core categories tested, no critical findings. Risk score accurately reflects a well-defended system — reportable.
AI-SoC Platform — Risk Scoring v1.0 — 2026
OWASP LLM Top 10 · NIST AI RMF · MITRE ATLAS
All Assessments
ID · Name · Customer · Status · Scope · Notes · Action

🎯 AI Attack Library — LLM01–LLM10

GPT-4.1 generates fresh, unique attack prompts every time. Select categories & modality, then generate.

Select categories above

⬆ Import Garak Probe Results

Paste the contents of your voice_redteam.report.jsonl file.
Failed probes are mapped to OWASP categories and added as pending test cases.

Each line must be a JSON object with at minimum a prompt or attempt.prompt field. The probe class prefix is auto-mapped to OWASP categories (e.g. promptinject → LLM01, pii → LLM06).
+ New Use Case
This is the exact text that will be sent through the AI Firewall for detection.
TEAPOT is a 6-phase voice AI red-team methodology.
Describe what the AI should do. The judge uses this as the benchmark.
Context (optional — links probe to a Bot Profile)
Leave blank to auto-generate. Use the same ID to group multiple probes into one batch.

+ New Bot Profile

Event Detail
🔍
Select an event to view details
🏷 Analyst Verdict
+ New Assessment