Smartflow 1.6 released — Any-to-any provider routing, Trace UI, Prometheus /metrics, OIDC SSO dashboard config, 2 bug fixes. Read release notes →
Documentation & Support Portal

Everything you need to build with Smartflow

Complete guides, API reference, SDK documentation, architecture diagrams, and direct support — all in one place.

Quick Start · Platform Overview · API Reference
37 Platform Features
4-Phase Semantic Cache
3 Protocols Supported (LLM · MCP · A2A)
0.90 BERT Similarity Threshold
K8s Helm + cert-manager Validated
3.2× Faster than OpenAI at 40 RPS with cache
Up and running in minutes
Point any existing SDK at Smartflow — zero code changes required.
OpenAI SDK
Anthropic SDK
cURL
Python (async)
Helm / K8s
Integration Tests
# Any OpenAI SDK client — zero code changes required
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR_SMARTFLOW_HOST/v1",
    api_key="sk-sf-your-virtual-key"   # issued by your Smartflow admin
)

response = client.chat.completions.create(
    model="gpt-4.1",    # or "claude-sonnet-4-6", "gemini-2.0-flash", any routed model
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
# Native Anthropic SDK — point base_url at Smartflow, nothing else changes
import os
from anthropic import Anthropic

# Or set env: ANTHROPIC_BASE_URL=https://YOUR_SMARTFLOW_HOST/anthropic
client = Anthropic(
    base_url="https://YOUR_SMARTFLOW_HOST/anthropic",
    api_key="sk-sf-your-virtual-key"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
# OpenAI-compatible endpoint (all models)
curl https://YOUR_SMARTFLOW_HOST/v1/chat/completions \
  -H "Authorization: Bearer sk-sf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Native Anthropic Messages API
curl https://YOUR_SMARTFLOW_HOST/anthropic/v1/messages \
  -H "x-api-key: sk-sf-your-virtual-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# Async streaming with the standard OpenAI AsyncClient
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://YOUR_SMARTFLOW_HOST/v1",
    api_key="sk-sf-your-virtual-key"
)

async def main():
    # Streaming chat completion
    stream = await client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Explain quantum entanglement"}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())
# Contact your Smartflow account team for the Helm chart and values file.
# Example values — replace placeholders with your actual configuration:

# values.yaml
secrets:
  openaiApiKey:      "sk-..."        # your OpenAI key
  anthropicApiKey:   "sk-ant-..."    # your Anthropic key
  googleApiKey:      "AIza..."       # optional: Google Gemini

ingress:
  host:  "smartflow.your-domain.com"
  tls:   true

# Deploy
helm upgrade --install smartflow ./smartflow-chart \
  --namespace smartflow --create-namespace \
  -f values.yaml

# Verify all pods are running
kubectl get pods -n smartflow
kubectl rollout status deployment/smartflow-proxy -n smartflow
#!/usr/bin/env bash
# Smartflow Integration Test Script
# Fill in your deployment details, then run:  bash smartflow_test.sh

export SMARTFLOW_HOST="https://your-smartflow.example.com"  # no trailing slash
export VIRTUAL_KEY="sk-sf-your-virtual-key"

echo "=== Test 1: Health Check ==="
curl -sf "$SMARTFLOW_HOST/health" | python3 -m json.tool

echo "=== Test 2: OpenAI-compatible chat ==="
curl -s "$SMARTFLOW_HOST/v1/chat/completions" \
  -H "Authorization: Bearer $VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1","messages":[{"role":"user","content":"Reply with one word: PASS"}]}' \
  | python3 -m json.tool

echo "=== Test 3: Anthropic native messages ==="
curl -s "$SMARTFLOW_HOST/anthropic/v1/messages" \
  -H "x-api-key: $VIRTUAL_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":32,"messages":[{"role":"user","content":"Reply: PASS"}]}' \
  | python3 -m json.tool

echo "=== Test 4: Available models ==="
curl -s -H "Authorization: Bearer $VIRTUAL_KEY" \
  "$SMARTFLOW_HOST/v1/models" | python3 -m json.tool

echo "=== Test 5: Cache stats ==="
curl -s -H "Authorization: Bearer $VIRTUAL_KEY" \
  "$SMARTFLOW_HOST/api/metacache/stats" | python3 -m json.tool
What is a Virtual Key?

A virtual key (sk-sf-...) is the only credential your application needs. Your Smartflow admin issues it — no provider keys ever leave the server.

Provider keys stay server-side
OpenAI, Anthropic, and Gemini keys are stored encrypted in Smartflow. Your app never sees them.
Policy & guardrails enforced
Each virtual key is tied to model allow-lists, rate limits, budget caps, and compliance guardrails set by your admin.
Scoped & auditable
Lock a key to a specific team, user, or app. Every request is logged with the virtual key as identity for full audit trails.
Instant revoke & rotate
Rotate or revoke a virtual key without touching a single application. Provider keys remain unchanged.
To get a virtual key: contact your Smartflow admin, or use POST /api/auth/virtual-keys if you have admin access.
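For admins, issuing a key is a single POST. A minimal sketch using Python's standard library — the payload fields below (`team`, `models`, `budget_usd`) are illustrative assumptions, not the documented schema, so check the full API reference for the exact field names:

```python
# Hypothetical sketch: issue a virtual key via POST /api/auth/virtual-keys.
# The payload field names (team, models, budget_usd) are assumptions.
import json
import urllib.request

def build_virtual_key_request(host: str, admin_key: str, payload: dict) -> urllib.request.Request:
    """Construct (but do not send) the key-issuance request."""
    return urllib.request.Request(
        url=f"{host}/api/auth/virtual-keys",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_virtual_key_request(
    "https://YOUR_SMARTFLOW_HOST",
    "sk-sf-admin-key",
    {"team": "data-science", "models": ["gpt-4.1"], "budget_usd": 100},
)
# Send with urllib.request.urlopen(req) against a reachable deployment.
```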
Platform guides & references
Everything from platform overview to deep-dive feature guides.
Platform Overview
Complete technical overview of all Smartflow capabilities — proxy, caching, MCP gateway, A2A orchestration, policy engine, and enterprise identity.
Guide
Platform Capabilities
Detailed breakdown of all 37 features — OpenAI & Anthropic drop-ins, 4-phase BERT cache, local models, policy engine, K8s deployment, and more.
Guide
Cache Ecosystem
The three-layer cost reduction strategy: 4-phase MetaCache with VectorLite BERT KNN, in-flight prompt compression, and transparent LLM-side prompt cache injection.
Guide
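At its core, the phase-4 KNN lookup reduces to a nearest-neighbour search gated by the 0.90 similarity threshold quoted above. A toy sketch of that decision — the real cache uses VectorLite-backed BERT embeddings; the three-element vectors here are stand-ins:

```python
import math

SIMILARITY_THRESHOLD = 0.90  # Smartflow's default BERT similarity cutoff

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cache_hit(query_vec, cached_vecs):
    """Index of the nearest cached embedding if it clears the threshold, else None."""
    best_idx, best_sim = None, SIMILARITY_THRESHOLD
    for i, vec in enumerate(cached_vecs):
        sim = cosine_similarity(query_vec, vec)
        if sim >= best_sim:
            best_idx, best_sim = i, sim
    return best_idx

# Stand-in embeddings: a near-duplicate prompt hits, a dissimilar one misses.
cached = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(cache_hit([0.98, 0.05, 0.0], cached))  # → 0 (near-duplicate of entry 0)
print(cache_hit([0.5, 0.5, 0.7], cached))    # → None (below threshold)
```

A hit returns the cached completion directly; a miss falls through to the later cache phases and, ultimately, the upstream provider.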
SSO & Identity Guide NEW
Complete SSO passthrough guide — Entra ID OIDC, SAML 2.0, Kerberos SPNEGO, trusted proxy headers. UI & CLI config with real-world examples for every deployment pattern.
Enterprise
SSO & Unified Identity
Microsoft Entra ID SSO integration, group sync, App Role mapping to Smartflow roles, and zero-touch team provisioning from directory membership.
Enterprise
Dashboard User Guide & Walkthroughs
Step-by-step visual walkthroughs for every key task — SSO setup, adding users, building policies, compliance testing, cache monitoring, and trace log troubleshooting.
Walkthroughs
SafeChat Enterprise Guide
End-to-end configuration guide for SafeChat Enterprise — deployment, policy setup, virtual keys, compliance dashboards, and integration patterns.
Enterprise
SafeChat v3 Usage Guide
Practical usage guide for SafeChat v3 — authentication, routing, model selection, compliance controls, and SDK patterns for developers.
Guide
Docker & Kubernetes Best Practices
Production deployment guide — resource tuning, NGINX ingress buffering, cert-manager TLS, CPU throttle prevention, and a full operational runbook for both Docker Compose and Kubernetes.
Infrastructure
Complete endpoint documentation
All proxy, management, compliance, MCP, A2A, and routing endpoints.
View Full API Reference
Smartflow Python SDK
Async and sync clients, type-safe responses, and built-in helpers for compliance, cache stats, and VAS audit logs.
SDK Reference
Full method reference for SmartflowClient and SyncSmartflowClient — chat, completions, Anthropic native, compliance, cache, VAS logs, MCP, A2A, and more.
Reference
Cursor & IDE Setup
Route all AI coding traffic (Cursor, VS Code Continue, Aider, Windsurf, JetBrains) through Smartflow for policy enforcement, cost tracking, and virtual key management.
Guide
Edge Chrome Extension
Automatically intercept browser-based LLM API calls via the Smartflow Edge extension. Enroll devices, enforce MAESTRO policies, and track per-user usage without any code changes.
Guide
Install
pip install smartflow-sdk
Python 3.9+ · Async + Sync · v0.3.0
Deployment diagrams
Five reference architectures from single-node on-prem to full enterprise cloud orchestration.
Arch 1: On-Premises · Arch 2: Hybrid Cloud · Arch 3: Full Cloud · Arch 4: Hybrid LLM Routing · Arch 5: Enterprise Orchestration · All Diagrams Index
What's new
Latest
Smartflow 1.6
Any-to-any provider routing · March 2026 model table · Per-request Trace UI · Prometheus /metrics · OIDC SSO dashboard config · Full auth-code flow · 2 bug fixes.
Smartflow 1.5
Phase 4 VectorLite BERT semantic KNN cache · Kubernetes/Helm production validation · Key store deadlock fix · Async compliance refactor · MAESTRO UUID guard. Default model: claude-sonnet-4-6.
Smartflow 1.3 / 1.4
MCP SSE + STDIO transports · Per-request cache controls · OAuth PKCE · A2A agent gateway · Entra ID SSO · Guardrail policy groups with inheritance · 15 new features.
How Smartflow compares
Independent technical whitepapers examining how Smartflow Enterprise stacks up against leading alternatives — written for enterprise procurement teams, architects, and engineering leaders.
Smartflow vs LiteLLM
Enterprise AI gateway vs. developer routing library. Covers semantic caching ROI, enterprise SSO, policy engine depth, compliance tooling, and why LiteLLM is a starting point rather than a destination for regulated deployments.
Whitepaper Jan 14, 2026
Smartflow vs TrueFoundry
Purpose-built AI gateway vs. broad MLOps platform. Examines AI gateway depth, semantic caching, compliance readiness, TCO analysis, and why a specialised gateway outperforms a general-purpose MLOps platform for LLM governance.
Whitepaper Feb 7, 2026
Smartflow vs OpenRouter
Data sovereignty vs. cloud aggregation. Covers HIPAA, FERPA, GDPR compliance risk, data flow architecture, per-user identity, semantic caching cost savings, and why regulated organisations cannot use cloud AI aggregators.
Whitepaper Mar 11, 2026
Load testing results & scaling analysis
Independent, reproducible load tests comparing Smartflow Enterprise to direct OpenAI API calls across a range of request rates — with methodology, raw numbers, and horizontal scaling analysis.
Horizontal Scaling & Latency Analysis
Systematic RPS comparison (5–80 RPS) of Smartflow Enterprise vs direct OpenAI on a 2-node Kubernetes cluster. With cache: 3.2× faster than OpenAI at 40 RPS. Without cache: flat ~190ms proxy overhead through 60 RPS. Near-linear horizontal scaling at constant latency. Includes Python vs Rust gateway analysis.
Whitepaper K8s Load Testing Apr 3, 2026
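The headline numbers combine in a simple way: expected per-request latency is the hit-rate-weighted average of cache-hit latency and pass-through latency (upstream plus proxy overhead). A back-of-envelope sketch — the 60% hit rate and the absolute latencies below are illustrative assumptions, not figures from the whitepaper:

```python
def blended_latency(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Expected per-request latency given a cache hit rate."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

# Illustrative assumptions: 60% semantic-cache hit rate, ~50ms cache hits,
# ~2000ms upstream completion plus ~190ms proxy overhead on misses.
upstream_ms = 2000.0
proxy_overhead_ms = 190.0
print(blended_latency(0.60, 50.0, upstream_ms + proxy_overhead_ms))  # → 906.0
```

Under those assumptions the cache cuts mean latency from ~2190ms to ~906ms; the actual gain in any deployment depends entirely on the observed hit rate.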
Built for financial services & compliance-driven AI
Deep-dive technical papers on Smartflow's governance capabilities for banks, broker-dealers, and regulated fintech — covering agent identity, information barrier enforcement, and regulatory examination readiness.
AI Governance for Regulated Industries
How Smartflow solves the three hardest AI problems in financial services: cryptographic agent identity & delegated authority (AIDA), information barrier enforcement, and automated regulatory examination evidence packages for OCC, FINRA, FFIEC, and EU AI Act.
Whitepaper SR 11-7 FINRA 3110 EU AI Act Mar 26, 2026
Sample AI Examination Reports
See what Smartflow actually produces — realistic sample output from the Regulatory Examination Suite. SR 11-7, FINRA 3110, EU AI Act, and Comprehensive packages with full AI inventory, findings, and audit evidence.
Sample Reports SR 11-7 FINRA 3110 EU AI Act Mar 26, 2026
Get help
Multiple channels — from quick self-serve to direct enterprise support.
Email Support
Technical issues, billing, and account questions handled by the engineering team directly.
support@langsmart.ai
Enterprise Support Portal
Bug reports, feature requests, and escalations handled directly by the LangSmart engineering team.
Contact engineering
Integration Tests
Download ready-to-run test scripts for your deployment. Fill in your host and virtual key — done.
Bash script · Python script · View inline
Full Documentation
Platform overview, API reference, SDK docs, and architecture guides all available above.
Browse docs