Choose Your Plan

All plans include all 25 safety domains, real-time violation detection, and deterministic safety guarantees. Pay per token -- scale as you grow.

Shared Inference

$2.50 /1M tokens

No base fee

CPU-powered inference on shared infrastructure. Ideal for development, testing, and light production workloads.

  • All 25 safety domains
  • Pay-per-use ($2.50/1M tokens)
  • Up to 5 seats
  • CPU inference (~30-50ms latency)
  • Dashboard & usage analytics
  • Standard rate limits (60 rpm)
  • Email support
  • Best-effort availability
Most Popular

Dedicated

$5.00 /1M tokens

+ $1,500/mo base fee

Dedicated GPU with isolated database. Continual learning on your domain data. 5-10x faster inference than Shared.

  • All 25 safety domains
  • Tokens billed by usage ($5.00/1M)
  • Up to 50 seats
  • Dedicated GPU inference (~5-10ms latency)
  • Isolated database + node pool
  • Continual learning on your data
  • Priority rate limits (300 rpm)
  • Custom domain gates
  • Slack + email support
  • 99.5% uptime SLA

Enterprise Security

$8.50 /1M tokens

+ $3,500/mo base fee

High-performance GPU with isolated VPC, HA database, and expanded audit and governance controls. BAA may be available for qualifying healthcare workloads.

  • All 25 safety domains
  • Tokens billed by usage ($8.50/1M)
  • Unlimited seats
  • High-performance GPU inference (~3-5ms latency)
  • Isolated HA infrastructure + VPC
  • Mapped to internal security control baseline
  • BAA may be available for qualifying healthcare use cases
  • Audit trail & explainability controls
  • Dedicated support engineer
  • 99.9% uptime SLA

Infrastructure at Every Tier

Every plan runs SolaceSentry's custom 350M-parameter transformer with 4 judge transformers and a dual-model courtroom. Higher tiers unlock dedicated GPU compute and isolated infrastructure.

Feature             Shared          Dedicated            Enterprise
Compute             CPU-Optimized   Dedicated GPU        High-Performance GPU
Inference Latency   ~30-50ms        ~5-10ms              ~3-5ms
Included Tokens     1M/mo           10M/mo               100M/mo
Infrastructure      Shared pool     Isolated node + DB   HA cluster + VPC
Continual Learning  --              Yes                  Yes
Uptime SLA          Best-effort     99.5%                99.9%

Add-On

Predictions

AI-powered cost forecasting with Bayesian analysis, regime detection, and Monte Carlo scenario planning across all 25 safety domains.

  • Token cost prediction with Hierarchical Bayes
  • Regime change-point detection
  • Monte Carlo scenario analysis
  • 25-domain partial pooling
$249 /mo

Available for Dedicated & Enterprise

Frequently Asked Questions

What counts as a token?

Tokens are the units processed by our inference engine. Both input and output tokens are counted. Our custom BPE tokenizer is optimized for safety-domain vocabulary.

Can I switch plans later?

Yes. You can upgrade or downgrade at any time. Changes take effect at the start of your next billing cycle. No lock-in contracts.

What does Enterprise Security include?

Enterprise Security includes isolated infrastructure, expanded audit and governance controls, encryption, and support for contract review. A BAA may be available for qualifying healthcare use cases.

What's the difference between Shared and Dedicated inference?

Shared inference runs on CPU-optimized pooled infrastructure (~30-50ms latency). Dedicated and Enterprise tiers each run on their own dedicated GPU, giving you 5-10x faster inference, isolated compute, and continual learning on your data.

How does token billing work?

All plans are usage-based: you pay per 1M tokens consumed at your tier's rate (Shared: $2.50, Dedicated: $5.00, Enterprise: $8.50). Dedicated and Enterprise tiers also have a monthly base fee that covers the dedicated GPU infrastructure. There are no free included tokens on any plan.
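
To make the arithmetic concrete, here is a minimal Python sketch of a monthly estimate. It is an illustration, not SolaceSentry's billing code. The rates and base fees are the ones listed on this page. The names `PLANS` and `monthly_cost` are hypothetical. It assumes that input and output tokens are summed, that partial millions are billed proportionally, and that no tokens are included free. If your plan carries an included allowance, subtract it first. The Predictions add-on is not counted.

```python
from decimal import Decimal

# Rates and base fees in USD, copied from the plan descriptions on this page.
PLANS = {
    "shared": {"rate_per_million": Decimal("2.50"), "base_fee": Decimal("0")},
    "dedicated": {"rate_per_million": Decimal("5.00"), "base_fee": Decimal("1500")},
    "enterprise": {"rate_per_million": Decimal("8.50"), "base_fee": Decimal("3500")},
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def monthly_cost(plan: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate one month's bill in USD for a plan and token volume.

    Assumption: usage is prorated linearly, so 500,000 tokens cost half
    the per-1M rate. Actual rounding and minimums may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = PLANS[plan]
    # Both input and output tokens count toward usage.
    billable = Decimal(input_tokens + output_tokens)
    usage = billable / TOKENS_PER_MILLION * tier["rate_per_million"]
    # Round the total to whole cents.
    return (tier["base_fee"] + usage).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for name in PLANS:
        print(name, monthly_cost(name, 8_000_000, 2_000_000))
```

For a month with 8M input and 2M output tokens, this estimate comes to $25.00 on Shared, $1,550.00 on Dedicated, and $3,585.00 on Enterprise.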