Smartflow Compliance Retrospector Guide

What is the Compliance Retrospector?

The Compliance Retrospector is a separate Python service that runs alongside Smartflow and performs deep compliance analysis on your stored VAS logs after the fact, independently of the real-time proxy.

Runs as its own service

The retrospector is a Python FastAPI service (smartflow-retrospector) deployed alongside the other Smartflow containers. It has its own scheduled job runner (APScheduler) and exposes results through the API server — it does not share threads or memory with the proxy.

Reads directly from your VAS log archive

It reads from the same MongoDB database that stores your VAS logs. It uses a watermark cursor so it only processes new logs since the last run — efficient and incremental with no risk of double-counting.
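The watermark idea can be sketched in a few lines. This is a minimal illustration over an in-memory list, not the retrospector's actual code: the `process_new_logs` function, the `timestamp` field name, and the sample dates are assumptions made for the example.

```python
from datetime import datetime, timezone


def process_new_logs(logs, watermark, handle):
    """Handle only logs newer than the watermark and return the advanced watermark."""
    new_logs = sorted(
        (log for log in logs if log["timestamp"] > watermark),
        key=lambda log: log["timestamp"],
    )
    for log in new_logs:
        handle(log)
        # Advance only after the log has been handled, so a crash never skips one.
        watermark = log["timestamp"]
    return watermark


# Illustrative data: two logs, processed by two consecutive runs.
logs = [
    {"id": "a", "timestamp": datetime(2025, 4, 7, 9, 0, tzinfo=timezone.utc)},
    {"id": "b", "timestamp": datetime(2025, 4, 7, 9, 5, tzinfo=timezone.utc)},
]
seen = []
mark = datetime(2025, 4, 7, 8, 59, tzinfo=timezone.utc)
mark = process_new_logs(logs, mark, lambda log: seen.append(log["id"]))
mark = process_new_logs(logs, mark, lambda log: seen.append(log["id"]))  # nothing new
```

The second run finds no log newer than the stored watermark, which is what prevents double-counting.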

Loads real ML models

On startup the retrospector downloads and caches Presidio (PII/PHI detection), sentence-transformers (semantic embeddings for attack similarity), and toxic-bert (harmful content classifier). Models are stored in a persistent volume and only downloaded once.

Writes violations back to MongoDB

All retroactive violations are written to a dedicated retroactive_violations collection in MongoDB, tagged with which pass found them, severity, and confidence score. They appear in the Post-Scan tab of the Compliance dashboard.

Why post-scan catches what real-time misses

Real-time compliance scanning must make a decision in under 50ms per request. That forces trade-offs. The retrospector has no such constraint.

Real-time scanner

Operates in ≤50ms per request

Fast regex, lightweight ML, keyword lists. Catches obvious PII, known blocked phrases, and policy keyword matches. Must decide before the response reaches the user.

Speed-limited · Single request · No context

Retrospector

Runs asynchronously, no time limit

Full Presidio NLP, neural embeddings, anomaly detection across days of conversation history, and statistical aggregates that only make sense when you look at patterns over time.

Deep NLP · Multi-turn context · Org-level patterns

Example: A jailbreak attempt spread across 6 messages over 20 minutes — each message looks harmless in isolation. The real-time scanner passes each one. The retrospector's conversation pass detects the escalating pattern and flags the entire conversation as a multi-turn attack.

Pass 1 — Response PII & Harmful Content Scan

Runs every 15 minutes. Scans the last 15 minutes of VAS logs using Presidio and toxic-bert to detect PII in LLM responses and harmful/toxic output that the real-time scanner may have scored below the blocking threshold.

A

Presidio PII / PHI detection

Microsoft Presidio analyses every LLM response text for names, email addresses, phone numbers, SSNs, credit card numbers, API secrets, HIPAA-defined PHI, and more. It uses Named Entity Recognition plus custom regex recognizers tuned for AI output patterns. Any entity above a confidence threshold is flagged.

B

Toxic-bert harmful content classifier

Every response is scored by unitary/toxic-bert across six dimensions: toxicity, severe toxicity, obscenity, threat, insult, and identity attack. Responses scoring above the configured threshold (0.65 by default) are flagged as potentially harmful output.
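As a rough sketch of the threshold check, assuming the classifier output has already been collected into a dictionary of per-dimension scores: the dictionary keys, the `flagged_dimensions` helper, and the sample scores are illustrative and may not match the labels the model actually emits.

```python
TOXIC_THRESHOLD = 0.65  # default cutoff stated in this guide


def flagged_dimensions(scores, threshold=TOXIC_THRESHOLD):
    """Return the dimensions whose score is strictly above the threshold."""
    return sorted(label for label, score in scores.items() if score > threshold)


# Hypothetical scores for one response (key names are assumptions).
scores = {
    "toxicity": 0.81,
    "severe_toxicity": 0.12,
    "obscenity": 0.70,
    "threat": 0.03,
    "insult": 0.65,  # exactly at the threshold, so not flagged
    "identity_attack": 0.01,
}
flags = flagged_dimensions(scores)
is_harmful = bool(flags)
```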

C

Severity scoring

Each finding is tagged CRITICAL, HIGH, MEDIUM, or LOW based on entity type (SSN/PHI = CRITICAL) and classifier score. The result is written to retroactive_violations with the source pass, user ID, and a snippet of the offending text.
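One way to picture the tagging logic is shown below. Only the rule that SSN/PHI findings are CRITICAL comes from this guide; the `severity` function, the `PHI` entity label, and the numeric cutoffs for HIGH and MEDIUM are placeholders chosen for illustration.

```python
# Entity types treated as CRITICAL regardless of score (guide: SSN/PHI = CRITICAL).
# "PHI" is a hypothetical label; "US_SSN" is a Presidio entity type.
CRITICAL_ENTITIES = {"US_SSN", "PHI"}


def severity(entity_type=None, score=0.0):
    """Map a finding to CRITICAL, HIGH, MEDIUM, or LOW."""
    if entity_type in CRITICAL_ENTITIES:
        return "CRITICAL"
    # The cutoffs below are illustrative assumptions, not documented values.
    if score >= 0.85:
        return "HIGH"
    if score >= 0.65:
        return "MEDIUM"
    return "LOW"


examples = [
    severity("US_SSN", 0.40),
    severity(score=0.90),
    severity("EMAIL_ADDRESS", 0.70),
    severity(score=0.20),
]
```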

Why this matters: The real-time scanner's PII check uses lightweight regex and may miss obfuscated or context-dependent PII (e.g. an SSN written in words). Presidio's full NLP pipeline, running without a time limit, catches a significantly broader range of entity types.

Pass 2 — Intelligent Full Rescan

Runs every 15 minutes. Identifies VAS log entries that the real-time scanner flagged as borderline (scored just below your blocking threshold) and triggers a full compliance analysis using the Rust compliance API, which has more time to run deeper checks.

A

High-risk log selection

The retrospector queries VAS logs for entries where the inline compliance score is above 0.3 (configurable) but below the blocking threshold. These are the ones where something was suspicious but not definitive enough to block live.
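The selection rule amounts to a range filter on the inline score. The sketch below expresses it both as a MongoDB-style filter document and as a plain predicate; the `compliance_score` field name and the blocking threshold of 0.8 are assumptions for the example, not documented values.

```python
def borderline_filter(blocking_threshold, lower_bound=0.3):
    """Build a MongoDB-style filter for scores strictly between the two bounds."""
    return {"compliance_score": {"$gt": lower_bound, "$lt": blocking_threshold}}


def is_borderline(log, blocking_threshold, lower_bound=0.3):
    """Same rule as a predicate, useful for checking individual log entries."""
    return lower_bound < log["compliance_score"] < blocking_threshold


BLOCKING_THRESHOLD = 0.8  # hypothetical value
query = borderline_filter(BLOCKING_THRESHOLD)
selected = [
    log["id"]
    for log in [
        {"id": 1, "compliance_score": 0.10},  # clearly fine
        {"id": 2, "compliance_score": 0.55},  # suspicious but not blocked
        {"id": 3, "compliance_score": 0.92},  # already blocked live
    ]
    if is_borderline(log, BLOCKING_THRESHOLD)
]
```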

B

Full compliance API call

For each selected log, the retrospector calls your Smartflow compliance API (/api/compliance/scan) with the full prompt and response text. The compliance engine runs its complete policy evaluation with no latency budget — returning a definitive verdict and detailed findings.

C

Violations promoted

Any borderline log that the full scan confirms as a violation is written to retroactive_violations with pass: "full_rescan". Your team can review and dismiss or escalate from the Post-Scan tab.

Pass 3 — Multi-Turn Conversation Analysis

Runs every 30 minutes. Groups VAS logs by session/conversation, then uses sentence-transformer embeddings to detect escalating attack patterns across a full conversation that look benign message-by-message.

A

Conversation grouping

Logs are grouped by session_id (or user + rolling 2-hour window if no session ID is present). Each group forms a conversation timeline that is analysed as a unit.
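A possible shape for the grouping step is sketched here. It assumes "rolling 2-hour window" means that a sessionless log joins the user's current group when it arrives within two hours of that user's previous sessionless log; the field names `session_id`, `user`, and `timestamp` are also assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=2)


def group_conversations(logs):
    """Group logs by session_id, falling back to user plus a rolling time window."""
    groups = defaultdict(list)
    last_seen = {}  # user -> (timestamp of previous sessionless log, window index)
    for log in sorted(logs, key=lambda entry: entry["timestamp"]):
        if log.get("session_id"):
            key = ("session", log["session_id"])
        else:
            user, ts = log["user"], log["timestamp"]
            prev_ts, index = last_seen.get(user, (None, 0))
            if prev_ts is not None and ts - prev_ts > WINDOW:
                index += 1  # gap too long: start a new conversation
            last_seen[user] = (ts, index)
            key = ("user", user, index)
        groups[key].append(log)
    return dict(groups)


t0 = datetime(2025, 4, 7, 9, 0)
logs = [
    {"session_id": "s1", "user": "u1", "timestamp": t0},
    {"session_id": "s1", "user": "u1", "timestamp": t0 + timedelta(minutes=10)},
    {"user": "u2", "timestamp": t0},
    {"user": "u2", "timestamp": t0 + timedelta(minutes=90)},
    {"user": "u2", "timestamp": t0 + timedelta(hours=4)},
]
groups = group_conversations(logs)
sizes = {key: len(items) for key, items in groups.items()}
```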

B

Semantic similarity to attack categories

Each message is embedded using all-MiniLM-L6-v2. The embeddings are compared against pre-defined centroids for known attack categories: jailbreak escalation, prompt injection, social engineering, context manipulation, and role-play exploitation. A rising similarity trajectory across the conversation triggers a flag.
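The comparison against category centroids can be illustrated with plain cosine similarity. The three-dimensional vectors below are toy values standing in for real embeddings (all-MiniLM-L6-v2 produces 384-dimensional vectors), and the `closest_category` helper and centroid values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_category(embedding, centroids):
    """Return (category, similarity) for the most similar attack centroid."""
    return max(
        ((name, cosine(embedding, centroid)) for name, centroid in centroids.items()),
        key=lambda pair: pair[1],
    )


# Toy centroids; real ones would be derived from labelled attack examples.
centroids = {
    "jailbreak_escalation": [1.0, 0.0, 0.0],
    "prompt_injection": [0.0, 1.0, 0.0],
}
category, similarity = closest_category([0.9, 0.1, 0.0], centroids)
```

Computing this per message yields the similarity series that the trajectory check operates on.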

C

Escalation trajectory detection

If the last 3 messages have consistently higher attack similarity than the first 3, and the peak similarity exceeds 0.55, the conversation is flagged as a potential multi-turn attack. The full conversation ID and a trajectory score are recorded in the violation.
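The rule above can be sketched as a small function over the per-message similarity series. Two interpretations are assumed here: "consistently higher" is read as every one of the last three values exceeding every one of the first three, and conversations shorter than six messages are skipped so the two windows never overlap.

```python
def is_escalating(similarities, peak_threshold=0.55):
    """Flag a conversation whose attack similarity rises from start to end."""
    if len(similarities) < 6:
        return False  # assumption: first-3 and last-3 windows must not overlap
    first, last = similarities[:3], similarities[-3:]
    return min(last) > max(first) and max(similarities) > peak_threshold


rising = is_escalating([0.10, 0.15, 0.12, 0.30, 0.48, 0.61, 0.72])
low_peak = is_escalating([0.10, 0.15, 0.12, 0.30, 0.40, 0.50])
noisy = is_escalating([0.60, 0.20, 0.10, 0.30, 0.20, 0.70])
```

Only the first series is flagged: the second never exceeds the 0.55 peak, and the third starts high rather than escalating.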

Typical catches: Gradual jailbreak (starting with a harmless request and slowly shifting context over many turns), incremental PII harvesting, and multi-step prompt injection where the payload is split across several messages.

Pass 4 — User Behavioural Anomaly Detection

Runs every hour. Looks at per-user request patterns over a sliding 24-hour window and flags statistically unusual behaviour — volume spikes, off-hours usage, and repeated policy probing.

A

Request volume spikes

The pass calculates each user's hourly request rate over the last 7 days to establish a baseline. If today's peak hour is more than 3× the baseline, the user is flagged for abnormal volume — a common indicator of automated scripting or credential compromise.
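A minimal sketch of the spike check, assuming the baseline is the mean of the user's hourly request counts over the previous 7 days (the guide does not say whether a mean, median, or another statistic is used). The `volume_spike` name and the sample counts are illustrative.

```python
from statistics import mean


def volume_spike(baseline_hourly_counts, today_hourly_counts, multiplier=3.0):
    """True when today's busiest hour exceeds multiplier times the baseline rate."""
    if not baseline_hourly_counts or not today_hourly_counts:
        return False
    baseline = mean(baseline_hourly_counts)  # assumption: mean hourly rate
    return max(today_hourly_counts) > multiplier * baseline


week = [4] * (7 * 24)           # a steady 4 requests per hour for 7 days
spike = volume_spike(week, [3, 5, 13, 2])    # peak 13 > 3 x 4
normal = volume_spike(week, [3, 5, 12, 2])   # peak 12 is not above 12
```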

B

Off-hours usage

Requests made between 11pm and 5am local time (based on the org timezone, defaulting to UTC) are counted. If a user with no history of off-hours activity makes more than 10 off-hours requests in a 24-hour window, the user is flagged as anomalous.
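The off-hours rule could look like the following. It assumes timestamps are timezone-aware and that the user's off-hours history is already known as a boolean; the function names and the `has_off_hours_history` parameter are hypothetical.

```python
from datetime import datetime, timezone


def is_off_hours(ts, org_tz=timezone.utc):
    """True when the timestamp falls between 11pm and 5am in the org timezone."""
    hour = ts.astimezone(org_tz).hour
    return hour >= 23 or hour < 5


def off_hours_anomaly(timestamps, has_off_hours_history, org_tz=timezone.utc, limit=10):
    """Flag users with more than `limit` off-hours requests and no prior history."""
    count = sum(is_off_hours(ts, org_tz) for ts in timestamps)
    return count > limit and not has_off_hours_history


# Eleven requests shortly after 2am UTC within one 24-hour window.
night = [datetime(2025, 4, 7, 2, minute, tzinfo=timezone.utc) for minute in range(11)]
new_behaviour = off_hours_anomaly(night, has_off_hours_history=False)
usual_behaviour = off_hours_anomaly(night, has_off_hours_history=True)
```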

C

Policy probing patterns

If more than 25% of a user's recent requests were flagged by any scanner, the user is flagged as potentially probing policy limits, that is, testing what the guardrails will and won't allow.
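Expressed as code, the probing check is a simple ratio test. The `is_probing` helper is a sketch; how "recent requests" is windowed is not specified in this guide, so the counts are passed in directly.

```python
def is_probing(flagged_requests, total_requests, max_ratio=0.25):
    """True when more than max_ratio of a user's recent requests were flagged."""
    if total_requests <= 0:
        return False
    return flagged_requests / total_requests > max_ratio


over = is_probing(6, 20)      # 30% flagged
at_limit = is_probing(5, 20)  # exactly 25%, not above the limit
```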

Pass 5 — Nightly Organisational Compliance Report

Runs once per night at 2am. Aggregates all compliance data across the last 24 hours and the rolling 7 and 30 day windows to produce an org-level compliance health report stored in MongoDB and surfaced in the Post-Scan tab.

A

Violation rate trending

The report calculates the overall violation rate for the last 24h, 7d, and 30d and tracks the trend (improving, stable, worsening). A worsening trend over 7 days is highlighted in amber on the Post-Scan summary card.
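The guide does not state how a trend is classified, so the sketch below is one plausible approach rather than the report's actual logic: it compares a recent rate against a longer-window rate and treats changes within a 10% tolerance as stable. The function names and the tolerance are assumptions.

```python
def violation_rate(violations, requests):
    """Violations as a fraction of requests (0.0 when there were no requests)."""
    return violations / requests if requests else 0.0


def trend(recent_rate, baseline_rate, tolerance=0.10):
    """Classify the relative change between two rates."""
    if baseline_rate == 0:
        return "stable" if recent_rate == 0 else "worsening"
    change = (recent_rate - baseline_rate) / baseline_rate
    if change > tolerance:
        return "worsening"
    if change < -tolerance:
        return "improving"
    return "stable"


rate_24h = violation_rate(30, 1000)    # 3% over the last 24 hours
rate_7d = violation_rate(140, 7000)    # 2% over the last 7 days
direction = trend(rate_24h, rate_7d)
```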

B

Top violation categories

The report ranks violation types by frequency — PII exposure, toxic content, policy breach, jailbreak attempt — so administrators know where to focus attention. Each category includes a count and a rate per 1,000 requests.
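The ranking and the per-1,000 rate can be computed with a counter, as in this sketch. The category labels and the `rank_categories` helper are illustrative; the rate is simply 1000 × count / total requests.

```python
from collections import Counter


def rank_categories(violation_types, total_requests):
    """Rank violation categories by count and add a rate per 1,000 requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    counts = Counter(violation_types)
    return [
        {"category": name, "count": n, "per_1000": round(1000 * n / total_requests, 2)}
        for name, n in counts.most_common()
    ]


# Hypothetical day: 4 violations across 2,000 requests.
ranking = rank_categories(
    ["pii_exposure", "toxic_content", "pii_exposure", "pii_exposure"],
    total_requests=2000,
)
```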

C

Most active users & departments

The report surfaces the users and (where AD group data is available) departments with the highest violation rates. This is the data your compliance officer needs for quarterly attestation or regulatory examination packages.

Reading results in the Post-Scan tab

All retroactive violations appear in the Post-Scan tab inside the Compliance screen (/intelligent_compliance.html). The tab is marked with a microscope icon.

Example view: https://ai.acmecorp.com/dashboard/intelligent_compliance.html

Tabs: Overview | VAS Logs | Violations | Frameworks | Post-Scan

Summary cards: 14 Critical / High | 31 Medium | 8 Reviewed | 37 Pending

Timestamp	User	Pass	Severity	Summary	Status
Apr 7 09:14	[email protected]	response_scan	CRITICAL	SSN detected in LLM response — 9-digit pattern	Pending
Apr 7 08:52	[email protected]	conversation	HIGH	Multi-turn jailbreak escalation (6 turns, score 0.72)	Pending
Apr 6 23:41	[email protected]	user_patterns	MEDIUM	Off-hours spike: 47 requests 2am–3am UTC	Dismissed

Click Detail to see the full violation record

The detail modal shows the exact prompt and response text, the specific entity or pattern detected, the confidence score, the VAS log ID, and all metadata. You can copy the VAS log ID to search for the original request in the Traces tab.

Click Review to mark as reviewed or dismiss

The review modal lets you set the status to Reviewed (confirmed violation), Dismissed (false positive), or Escalated. You can add a note that is stored with the record for audit trail purposes.

Latest Nightly Report card

At the bottom of the Post-Scan tab is a card showing the latest nightly report — overall violation rate, top violation type, 7-day trend, and most-flagged user. Click Full Report to expand all categories and departments.

Triggering passes manually

Every pass can be triggered on demand via the API. This is useful after a security incident, after configuring a new policy, or just to get results immediately without waiting for the next scheduled run.

1

Trigger a specific pass

POST to /api/compliance/retroactive/trigger/{pass_name} with your admin key. Valid pass names are: response_scan, full_rescan, conversation, user_patterns, statistical.

          # Trigger the PII response scan immediately
          curl -X POST https://ai.acmecorp.com/api/compliance/retroactive/trigger/response_scan \
            -H "X-Admin-Key: your-admin-key"

2

Check service status

GET /api/compliance/retroactive/status returns the last run time, status, and violation count for each pass. Useful for monitoring.

          curl https://ai.acmecorp.com/api/compliance/retroactive/status \
            -H "X-Admin-Key: your-admin-key"

3

Fetch violations via API

GET /api/compliance/retroactive?severity=CRITICAL&status=pending&limit=50 returns paginated violations filtered by severity, pass, and review status. The dashboard Post-Scan tab uses this endpoint directly.
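For scripted access, the same query can be built with the Python standard library. This sketch only constructs the request and does not send it; the `violations_request` helper is hypothetical, and it assumes this endpoint accepts the same X-Admin-Key header as the trigger and status endpoints.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://ai.acmecorp.com"  # example host used in this guide


def violations_request(admin_key, severity="CRITICAL", status="pending", limit=50):
    """Build a GET request for filtered retroactive violations."""
    query = urlencode({"severity": severity, "status": status, "limit": limit})
    return Request(
        f"{BASE_URL}/api/compliance/retroactive?{query}",
        headers={"X-Admin-Key": admin_key},  # assumption: same auth as other endpoints
        method="GET",
    )


req = violations_request("your-admin-key")
# To send it, pass `req` to urllib.request.urlopen and JSON-decode the response body.
```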

Configuration: Pass schedules, thresholds (PII confidence, toxic score cutoff, anomaly multiplier, off-hours window), and the MongoDB/Compliance API URLs are all controlled via environment variables on the smartflow-retrospector container. See retrospector/config.py for the full list. No restart is required for schedule changes — they take effect on the next scheduler tick.