Smartflow 1.6 released — Any-to-any provider routing, Trace UI, Prometheus /metrics, OIDC SSO dashboard config, 2 bug fixes. Read release notes →
Documentation & Support Portal

Everything you need to build with Smartflow

Complete guides, API reference, SDK documentation, architecture diagrams, and direct support — all in one place.

Quick Start · Platform Overview · API Reference
37 Platform Features
4-Phase Semantic Cache
3 Protocols Supported (LLM · MCP · A2A)
0.90 BERT Similarity Threshold
K8s Helm + cert-manager Validated
3.2× Faster than OpenAI at 40 RPS with cache
Up and running in minutes
Point any existing SDK at Smartflow — zero code changes required.
OpenAI SDK
Anthropic SDK
cURL
Python (async)
Helm / K8s
Integration Tests
# Any OpenAI SDK client — zero code changes required
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR_SMARTFLOW_HOST/v1",
    api_key="sk-sf-your-virtual-key"   # issued by your Smartflow admin
)

response = client.chat.completions.create(
    model="gpt-4.1",    # or "claude-sonnet-4-6", "gemini-2.0-flash", any routed model
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
# Native Anthropic SDK — point base_url at Smartflow, nothing else changes
import os
from anthropic import Anthropic

# Or set env: ANTHROPIC_BASE_URL=https://YOUR_SMARTFLOW_HOST/anthropic
client = Anthropic(
    base_url="https://YOUR_SMARTFLOW_HOST/anthropic",
    api_key="sk-sf-your-virtual-key"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
# OpenAI-compatible endpoint (all models)
curl https://YOUR_SMARTFLOW_HOST/v1/chat/completions \
  -H "Authorization: Bearer sk-sf-your-virtual-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Native Anthropic Messages API
curl https://YOUR_SMARTFLOW_HOST/anthropic/v1/messages \
  -H "x-api-key: sk-sf-your-virtual-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# Async streaming with the standard OpenAI AsyncClient
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://YOUR_SMARTFLOW_HOST/v1",
    api_key="sk-sf-your-virtual-key"
)

async def main():
    # Streaming chat completion
    stream = await client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Explain quantum entanglement"}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())
# Contact your Smartflow account team for the Helm chart and values file.
# Example values — replace placeholders with your actual configuration:

# values.yaml
secrets:
  openaiApiKey:      "sk-..."        # your OpenAI key
  anthropicApiKey:   "sk-ant-..."    # your Anthropic key
  googleApiKey:      "AIza..."       # optional: Google Gemini

ingress:
  host:  "smartflow.your-domain.com"
  tls:   true

# Deploy
helm upgrade --install smartflow ./smartflow-chart \
  --namespace smartflow --create-namespace \
  -f values.yaml

# Verify all pods are running
kubectl get pods -n smartflow
kubectl rollout status deployment/smartflow-proxy -n smartflow
#!/usr/bin/env bash
# Smartflow Integration Test Script
# Fill in your deployment details, then run:  bash smartflow_test.sh

export SMARTFLOW_HOST="https://your-smartflow.example.com"  # no trailing slash
export VIRTUAL_KEY="sk-sf-your-virtual-key"

echo "=== Test 1: Health Check ==="
curl -sf "$SMARTFLOW_HOST/health" | python3 -m json.tool

echo "=== Test 2: OpenAI-compatible chat ==="
curl -s "$SMARTFLOW_HOST/v1/chat/completions" \
  -H "Authorization: Bearer $VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1","messages":[{"role":"user","content":"Reply with one word: PASS"}]}' \
  | python3 -m json.tool

echo "=== Test 3: Anthropic native messages ==="
curl -s "$SMARTFLOW_HOST/anthropic/v1/messages" \
  -H "x-api-key: $VIRTUAL_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":32,"messages":[{"role":"user","content":"Reply: PASS"}]}' \
  | python3 -m json.tool

echo "=== Test 4: Available models ==="
curl -s -H "Authorization: Bearer $VIRTUAL_KEY" \
  "$SMARTFLOW_HOST/v1/models" | python3 -m json.tool

echo "=== Test 5: Cache stats ==="
curl -s -H "Authorization: Bearer $VIRTUAL_KEY" \
  "$SMARTFLOW_HOST/api/metacache/stats" | python3 -m json.tool
What is a Virtual Key?

A virtual key (sk-sf-...) is the only credential your application needs. Your Smartflow admin issues it — no provider keys ever leave the server.

Provider keys stay server-side
OpenAI, Anthropic, and Gemini keys are stored encrypted in Smartflow. Your app never sees them.
Policy & guardrails enforced
Each virtual key is tied to model allow-lists, rate limits, budget caps, and compliance guardrails set by your admin.
Scoped & auditable
Lock a key to a specific team, user, or app. Every request is logged with the virtual key as identity for full audit trails.
Instant revoke & rotate
Rotate or revoke a virtual key without touching a single application. Provider keys remain unchanged.
To get a virtual key: contact your Smartflow admin, or use POST /api/auth/virtual-keys if you have admin access.
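For admins, issuing a key is a single POST. A minimal sketch using Python's standard library — the payload fields below (`team`, `models`, `budget_usd`) are illustrative assumptions, not the documented schema, so check the full API reference for the exact field names:

```python
# Hypothetical sketch: issue a virtual key via POST /api/auth/virtual-keys.
# The payload field names (team, models, budget_usd) are assumptions.
import json
import urllib.request

def build_virtual_key_request(host: str, admin_key: str, payload: dict) -> urllib.request.Request:
    """Construct (but do not send) the key-issuance request."""
    return urllib.request.Request(
        url=f"{host}/api/auth/virtual-keys",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_virtual_key_request(
    "https://YOUR_SMARTFLOW_HOST",
    "sk-sf-admin-key",
    {"team": "data-science", "models": ["gpt-4.1"], "budget_usd": 100},
)
# Send with urllib.request.urlopen(req) against a reachable deployment.
```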
Platform guides & references
Everything from platform overview to deep-dive feature guides.
Platform Overview
Complete technical overview of all Smartflow capabilities — proxy, caching, MCP gateway, A2A orchestration, policy engine, and enterprise identity.
Guide
Platform Capabilities
Detailed breakdown of all 37 features — OpenAI & Anthropic drop-ins, 4-phase BERT cache, local models, policy engine, K8s deployment, and more.
Guide
Cache Ecosystem
The three-layer cost reduction strategy: 4-phase MetaCache with VectorLite BERT KNN, in-flight prompt compression, and transparent LLM-side prompt cache injection.
Guide
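At its core, the phase-4 KNN lookup reduces to a nearest-neighbour search gated by the 0.90 similarity threshold quoted above. A toy sketch of that decision — the real cache uses VectorLite-backed BERT embeddings; the three-element vectors here are stand-ins:

```python
import math

SIMILARITY_THRESHOLD = 0.90  # Smartflow's default BERT similarity cutoff

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cache_hit(query_vec, cached_vecs):
    """Index of the nearest cached embedding if it clears the threshold, else None."""
    best_idx, best_sim = None, SIMILARITY_THRESHOLD
    for i, vec in enumerate(cached_vecs):
        sim = cosine_similarity(query_vec, vec)
        if sim >= best_sim:
            best_idx, best_sim = i, sim
    return best_idx

# Stand-in embeddings: a near-duplicate prompt hits, a dissimilar one misses.
cached = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(cache_hit([0.98, 0.05, 0.0], cached))  # → 0 (near-duplicate of entry 0)
print(cache_hit([0.5, 0.5, 0.7], cached))    # → None (below threshold)
```

A hit returns the cached completion directly; a miss falls through to the later cache phases and, ultimately, the upstream provider.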
SSO & Identity Guide NEW
Complete SSO passthrough guide — Entra ID OIDC, SAML 2.0, Kerberos SPNEGO, trusted proxy headers. UI & CLI config with real-world examples for every deployment pattern.
Enterprise
SSO & Unified Identity
Microsoft Entra ID SSO integration, group sync, App Role mapping to Smartflow roles, and zero-touch team provisioning from directory membership.
Enterprise
Dashboard User Guide & Walkthroughs
Step-by-step visual walkthroughs for every key task — SSO setup, adding users, building policies, compliance testing, cache monitoring, and trace log troubleshooting.
Walkthroughs
SafeChat Enterprise Guide
End-to-end configuration guide for SafeChat Enterprise — deployment, policy setup, virtual keys, compliance dashboards, and integration patterns.
Enterprise
SafeChat v3 Usage Guide
Practical usage guide for SafeChat v3 — authentication, routing, model selection, compliance controls, and SDK patterns for developers.
Guide
Docker & Kubernetes Best Practices
Production deployment guide — resource tuning, NGINX ingress buffering, cert-manager TLS, CPU throttle prevention, and a full operational runbook for both Docker Compose and Kubernetes.
Infrastructure
Complete endpoint documentation
All proxy, management, compliance, MCP, A2A, and routing endpoints.
View Full API Reference
Smartflow Python SDK
Async and sync clients, type-safe responses, and built-in helpers for compliance, cache stats, and VAS audit logs.
SDK Reference
Full method reference for SmartflowClient and SyncSmartflowClient — chat, completions, Anthropic native, compliance, cache, VAS logs, MCP, A2A, and more.
Reference
Cursor & IDE Setup
Route all AI coding traffic (Cursor, VS Code Continue, Aider, Windsurf, JetBrains) through Smartflow for policy enforcement, cost tracking, and virtual key management.
Guide
Edge Chrome Extension
Automatically intercept browser-based LLM API calls via the Smartflow Edge extension. Enroll devices, enforce MAESTRO policies, and track per-user usage without any code changes.
Guide
Install
pip install smartflow-sdk
Python 3.9+ · Async + Sync · v0.3.0
Deployment diagrams
Five reference architectures from single-node on-prem to full enterprise cloud orchestration.
Arch 1: On-Premises · Arch 2: Hybrid Cloud · Arch 3: Full Cloud · Arch 4: Hybrid LLM Routing · Arch 5: Enterprise Orchestration · All Diagrams Index
What's new
Latest
Smartflow 1.6
Any-to-any provider routing · March 2026 model table · Per-request Trace UI · Prometheus /metrics · OIDC SSO dashboard config · Full auth-code flow · 2 bug fixes.
Smartflow 1.5
Phase 4 VectorLite BERT semantic KNN cache · Kubernetes/Helm production validation · Key store deadlock fix · Async compliance refactor · MAESTRO UUID guard. Default model: claude-sonnet-4-6.
Smartflow 1.3 / 1.4
MCP SSE + STDIO transports · Per-request cache controls · OAuth PKCE · A2A agent gateway · Entra ID SSO · Guardrail policy groups with inheritance · 15 new features.
How Smartflow compares
Independent technical whitepapers examining how Smartflow Enterprise stacks up against leading alternatives — written for enterprise procurement teams, architects, and engineering leaders.
Smartflow vs LiteLLM
Enterprise AI gateway vs. developer routing library. Covers semantic caching ROI, enterprise SSO, policy engine depth, compliance tooling, and why LiteLLM is a starting point rather than a destination for regulated deployments.
Whitepaper Jan 14, 2026
Smartflow vs TrueFoundry
Purpose-built AI gateway vs. broad MLOps platform. Examines AI gateway depth, semantic caching, compliance readiness, TCO analysis, and why a specialised gateway outperforms a general-purpose MLOps platform for LLM governance.
Whitepaper Feb 7, 2026
Smartflow vs OpenRouter
Data sovereignty vs. cloud aggregation. Covers HIPAA, FERPA, GDPR compliance risk, data flow architecture, per-user identity, semantic caching cost savings, and why regulated organisations cannot use cloud AI aggregators.
Whitepaper Mar 11, 2026
Load testing results & scaling analysis
Independent, reproducible load tests comparing Smartflow Enterprise to direct OpenAI API calls across a range of request rates — with methodology, raw numbers, and horizontal scaling analysis.
Horizontal Scaling & Latency Analysis
Systematic RPS comparison (5–80 RPS) of Smartflow Enterprise vs direct OpenAI on a 2-node Kubernetes cluster. With cache: 3.2× faster than OpenAI at 40 RPS. Without cache: flat ~190ms proxy overhead through 60 RPS. Near-linear horizontal scaling at constant latency. Includes Python vs Rust gateway analysis.
Whitepaper K8s Load Testing Apr 3, 2026
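The headline numbers combine in a simple way: expected per-request latency is the hit-rate-weighted average of cache-hit latency and pass-through latency (upstream plus proxy overhead). A back-of-envelope sketch — the 60% hit rate and the absolute latencies below are illustrative assumptions, not figures from the whitepaper:

```python
def blended_latency(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Expected per-request latency given a cache hit rate."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

# Illustrative assumptions: 60% semantic-cache hit rate, ~50ms cache hits,
# ~2000ms upstream completion plus ~190ms proxy overhead on misses.
upstream_ms = 2000.0
proxy_overhead_ms = 190.0
print(blended_latency(0.60, 50.0, upstream_ms + proxy_overhead_ms))  # → 906.0
```

Under those assumptions the cache cuts mean latency from ~2190ms to ~906ms; the actual gain in any deployment depends entirely on the observed hit rate.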
Built for financial services & compliance-driven AI
Deep-dive technical papers on Smartflow's governance capabilities for banks, broker-dealers, and regulated fintech — covering agent identity, information barrier enforcement, and regulatory examination readiness.
AI Governance for Regulated Industries
How Smartflow solves the three hardest AI problems in financial services: cryptographic agent identity & delegated authority (AIDA), information barrier enforcement, and automated regulatory examination evidence packages for OCC, FINRA, FFIEC, and EU AI Act.
Whitepaper SR 11-7 FINRA 3110 EU AI Act Mar 26, 2026
Sample AI Examination Reports
See what Smartflow actually produces — realistic sample output from the Regulatory Examination Suite. SR 11-7, FINRA 3110, EU AI Act, and Comprehensive packages with full AI inventory, findings, and audit evidence.
Sample Reports SR 11-7 FINRA 3110 EU AI Act Mar 26, 2026
Get help
Multiple channels — from quick self-serve to direct enterprise support.
Email Support
Technical issues, billing, and account questions handled by the engineering team directly.
support@langsmart.ai
Enterprise Support Portal
Bug reports, feature requests, and escalations handled directly by the LangSmart engineering team.
Contact engineering
Integration Tests
Download ready-to-run test scripts for your deployment. Fill in your host and virtual key — done.
Bash script · Python script · View inline
Full Documentation
Platform overview, API reference, SDK docs, and architecture guides all available above.
Browse docs