Choose Your Plan
All plans include all 25 safety domains, real-time violation detection, and deterministic safety guarantees. Pay per token -- scale as you grow.
Shared Inference
No base fee
CPU-powered inference on shared infrastructure. Ideal for development, testing, and light production workloads.
- All 25 safety domains
- Pay-per-use ($2.50/1M tokens)
- Up to 5 seats
- CPU inference (~30-50ms latency)
- Dashboard & usage analytics
- Standard rate limits (60 rpm)
- Email support
- Best-effort availability
Dedicated
+ $1,500/mo base fee
Dedicated GPU with an isolated database. Continual learning on your domain data. 5-10x faster inference than Shared.
- All 25 safety domains
- Usage-based token billing ($5.00/1M tokens)
- Up to 50 seats
- Dedicated GPU inference (~5-10ms latency)
- Isolated database + node pool
- Continual learning on your data
- Priority rate limits (300 rpm)
- Custom domain gates
- Slack + email support
- 99.5% uptime SLA
Enterprise Security
+ $3,500/mo base fee
High-performance GPU with isolated VPC, HA database, and expanded audit and governance controls. BAA may be available for qualifying healthcare workloads.
- All 25 safety domains
- Usage-based token billing ($8.50/1M tokens)
- Unlimited seats
- High-performance GPU inference (~3-5ms latency)
- Isolated HA infrastructure + VPC
- Mapped to internal security control baseline
- BAA may be available for qualifying healthcare use cases
- Audit trail & explainability controls
- Dedicated support engineer
- 99.9% uptime SLA
Infrastructure at Every Tier
Every plan runs SolaceSentry's custom 350M-parameter transformer with four judge transformers and a dual-model courtroom. Higher tiers unlock dedicated GPU compute and isolated infrastructure.
| Feature | Shared | Dedicated | Enterprise |
|---|---|---|---|
| Compute | CPU-Optimized | Dedicated GPU | High-Performance GPU |
| Inference Latency | ~30-50ms | ~5-10ms | ~3-5ms |
| Token Rate | $2.50/1M | $5.00/1M | $8.50/1M |
| Infrastructure | Shared pool | Isolated node + DB | HA cluster + VPC |
| Continual Learning | -- | Yes | Yes |
| Uptime SLA | Best-effort | 99.5% | 99.9% |
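To make the SLA percentages in the table concrete, the sketch below converts an uptime percentage into a monthly downtime budget. It assumes a 30-day measurement window and no exclusions (such as scheduled maintenance); the page does not state how the SLA is actually measured, so treat the function name and the window as illustrative assumptions.

```python
def downtime_budget_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime, in minutes, that an uptime SLA permits per window.

    Assumption: a flat `days`-long window with no maintenance exclusions.
    """
    total_minutes = days * 24 * 60
    # The unavailable fraction is whatever the SLA does not cover.
    return total_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    for tier, sla in (("Dedicated", 99.5), ("Enterprise", 99.9)):
        print(f"{tier}: about {downtime_budget_minutes(sla):.1f} minutes per 30 days")
```

Under these assumptions, 99.5% allows roughly 3.6 hours of downtime in a 30-day month, and 99.9% allows roughly 43 minutes.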
Predictions
AI-powered cost forecasting with Bayesian analysis, regime detection, and Monte Carlo scenario planning across all 25 safety domains.
- Token cost prediction with Hierarchical Bayes
- Regime change-point detection
- Monte Carlo scenario analysis
- 25-domain partial pooling
Available for Dedicated & Enterprise
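As a rough illustration of what Monte Carlo scenario analysis means for token costs, the sketch below samples a month's token volume many times and prices each draw with a base fee plus a per-1M-token rate. This is not SolaceSentry's forecasting model: the normal-noise usage model, the function names, and the example inputs are all assumptions made for the illustration, and it leaves out the Bayesian and regime-detection components entirely.

```python
import random


def simulate_monthly_costs(mean_tokens_m, rel_std, rate_per_m, base_fee,
                           n_runs=10_000, seed=0):
    """Return a sorted list of simulated monthly costs in USD.

    mean_tokens_m: expected monthly volume, in millions of tokens.
    rel_std: relative standard deviation of that volume (hypothetical model).
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    costs = []
    for _ in range(n_runs):
        # Assumed usage model: Gaussian noise around the mean, clipped at zero.
        tokens_m = max(0.0, rng.gauss(mean_tokens_m, rel_std * mean_tokens_m))
        costs.append(base_fee + tokens_m * rate_per_m)
    costs.sort()
    return costs


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list, q in [0, 100]."""
    idx = round(q / 100 * (len(sorted_values) - 1))
    return sorted_values[min(len(sorted_values) - 1, max(0, idx))]


if __name__ == "__main__":
    # Example inputs (assumed): 200M tokens/month, 30% spread, Dedicated pricing.
    costs = simulate_monthly_costs(200, 0.3, rate_per_m=5.00, base_fee=1500)
    for q in (5, 50, 95):
        print(f"p{q}: ${percentile(costs, q):,.0f}")
```

Reading the 5th, 50th, and 95th percentiles gives a low, typical, and high spending scenario rather than a single point estimate.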
Frequently Asked Questions
What counts as a token?
Tokens are the units processed by our inference engine. Both input and output tokens are counted. Our custom BPE tokenizer is optimized for safety-domain vocabulary.
Can I switch plans later?
Yes. You can upgrade or downgrade at any time. Changes take effect at the start of your next billing cycle. No lock-in contracts.
What does Enterprise Security include?
Enterprise Security includes isolated infrastructure, expanded audit and governance controls, encryption, and support for contract review. A BAA may be available for qualifying healthcare use cases.
What's the difference between Shared and Dedicated inference?
Shared inference runs on CPU-optimized pooled infrastructure (~30-50ms latency). Dedicated and Enterprise tiers include their own dedicated GPU -- giving you 5-10x faster inference, isolated compute, and continual learning on your data.
How does token billing work?
All plans are usage-based — you pay per 1M tokens consumed at your tier's rate (Shared: $2.50, Dedicated: $5.00, Enterprise: $8.50). Dedicated and Enterprise tiers also have a monthly base fee that covers the dedicated GPU infrastructure. There are no free included tokens on any plan.
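The billing rule above can be sketched as a small estimator. The rates and base fees come from this page; the `Plan` class, the dictionary keys, and the assumption that usage is prorated linearly per token (with no rounding to whole millions) are illustrative choices, not documented billing behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    base_fee_usd: float          # monthly base fee
    rate_usd_per_million: float  # price per 1M tokens


# Rates and fees as listed on this page.
PLANS = {
    "shared": Plan("Shared Inference", 0.0, 2.50),
    "dedicated": Plan("Dedicated", 1500.0, 5.00),
    "enterprise": Plan("Enterprise Security", 3500.0, 8.50),
}


def monthly_cost(plan_key: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate one month's bill: base fee plus usage at the tier's rate."""
    plan = PLANS[plan_key]
    # Both input and output tokens count toward billed usage.
    billed_tokens = input_tokens + output_tokens
    # Assumption: linear proration, no rounding up to whole millions.
    return plan.base_fee_usd + billed_tokens / 1_000_000 * plan.rate_usd_per_million


if __name__ == "__main__":
    # 40M input + 10M output tokens in a month
    for key in PLANS:
        print(f"{PLANS[key].name}: ${monthly_cost(key, 40_000_000, 10_000_000):,.2f}")
```

For 50M total tokens in a month, this works out to $125.00 on Shared, $1,750.00 on Dedicated, and $3,925.00 on Enterprise Security.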
Select Safety Domains
Choose the domains you need. All plans include access to your selected domains.