Pricing
Simple, transparent pricing with no hidden fees. Pay only for what you use.
Free models available! Get started at zero cost with our free-tier models:
- Chat: Llama 3.3 70B via Groq (FREE)
- Embeddings: BGE-M3 via HuggingFace (FREE)
- Moderation: Llama Guard 3 via Groq (FREE)
Subscription Tiers
Free
$0/month
- 100K tokens/month
- 10 requests/minute
- Community support
- All models available
Developer
$24/mo (billed annually)
- 5M tokens/month
- 100 requests/minute
- Email support (48h)
- Priority processing
Startup
$84/mo (billed annually)
- 25M tokens/month
- 500 requests/minute
- Priority email support (24h)
- Advanced analytics
Enterprise
Custom pricing
- Unlimited tokens
- Custom rate limits
- Dedicated support (4h)
- SLA guarantees
- Custom models
Model Pricing
All models are billed per million tokens (input + output combined):
Chat Models
| Model | Price per Million Tokens |
|---|---|
| llama-3.3-70b (via Groq) | FREE |
| llama-3.1-8b | $0.10 |
| llama-3.1-70b | $0.90 |
| mistral-7b | $0.10 |
| qwen2-7b | $0.10 |
| gemma-2-9b | $0.15 |
| phi-3-mini | $0.08 |
Embedding Models
| Model | Price per Million Tokens |
|---|---|
| bge-m3 | FREE |
| e5-large-v2 | $0.01 |
| bge-base-en | $0.01 |
| jina-embeddings-v2 | $0.02 |
| nomic-embed-text | $0.01 |
| gte-large | $0.01 |
Safety Models
| Model | Price per Million Tokens |
|---|---|
| llama-guard-3-8b (via Groq) | FREE |
| llama-guard-3 | $0.20 |
| shieldgemma | $0.15 |
| bge-reranker-v2 | $0.05 |
| jina-reranker | $0.08 |
How Billing Works
Token-Based Billing
You’re billed for total tokens (input + output). For example:
- Input: 500 tokens
- Output: 1,000 tokens
- Total: 1,500 tokens
- Model: llama-3.1-8b ($0.10/M)
- Cost: 1,500 × $0.10 / 1,000,000 = **$0.00015**
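The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: `request_cost` is a hypothetical helper name, not part of any SDK, and the price is the llama-3.1-8b rate from the Chat Models table.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

def request_cost(input_tokens: int, output_tokens: int, price_per_million: str) -> Decimal:
    """Return the cost in USD of one request, billing input and output tokens together."""
    total_tokens = input_tokens + output_tokens
    return Decimal(total_tokens) * Decimal(price_per_million) / TOKENS_PER_MILLION

# The worked example: 500 input + 1,000 output tokens on llama-3.1-8b ($0.10/M).
cost = request_cost(500, 1_000, "0.10")
print(f"${cost:.5f}")  # $0.00015
```

Prices are passed as strings and handled with `Decimal` so that very small per-request amounts do not pick up floating-point noise.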
Subscription vs. Pay-as-you-go
| Feature | Subscription | Pay-as-you-go |
|---|---|---|
| Monthly tokens | Included | From wallet |
| Overage | Charged to wallet | Charged to wallet |
| Rate limits | By tier | By tier |
| Rollover | No | N/A |
Overage Pricing
If you exceed your tier’s monthly tokens, additional usage is charged to your wallet at model prices.
Rate Limits
| Tier | Requests/Minute | Tokens/Minute |
|---|---|---|
| Free | 10 | 100,000 |
| Developer | 100 | 1,000,000 |
| Startup | 500 | 5,000,000 |
| Enterprise | Custom | Custom |
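To stay under these limits from the client side, requests can be throttled before they are sent. The sketch below assumes a sliding 60-second window over both requests and tokens. How the service actually measures its window is not stated on this page, and `TierThrottle` is a hypothetical class, not part of any SDK.

```python
import time
from collections import deque

class TierThrottle:
    """Client-side sliding 60-second window over requests and tokens (assumed model)."""

    def __init__(self, requests_per_minute, tokens_per_minute, clock=time.monotonic):
        self.rpm = requests_per_minute
        self.tpm = tokens_per_minute
        self.clock = clock          # injectable so the window can be tested without sleeping
        self.events = deque()       # (timestamp, tokens) for each admitted request

    def _prune(self, now):
        # Drop requests that have aged out of the 60-second window.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Admit and record the request if it fits both limits; otherwise return False."""
        now = self.clock()
        self._prune(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) + 1 > self.rpm or used_tokens + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        return True

# Free tier limits from the table: 10 requests/minute, 100,000 tokens/minute.
throttle = TierThrottle(requests_per_minute=10, tokens_per_minute=100_000)
allowed = [throttle.try_acquire(1_500) for _ in range(12)]
print(allowed.count(True))  # 10
```

When `try_acquire` returns `False`, the caller can wait briefly and retry rather than sending a request that is likely to be rejected.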
Cost Calculator
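A rough estimate of monthly overage can be scripted from the figures on this page. This is a sketch under stated assumptions: `monthly_overage` is a hypothetical helper, the subscription fee itself is not included, all usage is assumed to be on a single model, and whether free models count against the monthly allowance is not stated here.

```python
from decimal import Decimal

# Included monthly tokens per tier, from the Subscription Tiers section.
INCLUDED_TOKENS = {"free": 100_000, "developer": 5_000_000, "startup": 25_000_000}

def monthly_overage(tier: str, tokens_used: int, price_per_million: str) -> Decimal:
    """Return the wallet charge in USD for tokens beyond the tier's monthly allowance."""
    extra_tokens = max(0, tokens_used - INCLUDED_TOKENS[tier])
    return Decimal(extra_tokens) * Decimal(price_per_million) / Decimal(1_000_000)

# Example: Developer tier, 8M tokens in a month on llama-3.1-70b ($0.90/M).
# 3M tokens fall outside the 5M allowance.
overage = monthly_overage("developer", 8_000_000, "0.90")
print(f"${overage:.2f}")  # $2.70
```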
Comparison with OpenAI
| Use Case | OpenAI | Assisters | Savings |
|---|---|---|---|
| 1M chat tokens (GPT-4) | ~$30 | FREE (Llama 3.3 70B) | 100% |
| 1M chat tokens (GPT-3.5) | ~$2 | $0.10 | 95% |
| 1M embeddings | ~$0.13 | FREE (BGE-M3) | 100% |
| 1M moderation | ~$0.002 | FREE (Llama Guard 3) | 100% |
FAQ
What counts as a token?
Tokens are pieces of words. Roughly:
- 1 token ≈ 4 characters in English
- 1 token ≈ 0.75 words
- 100 tokens ≈ 75 words
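These rules of thumb translate directly into a quick estimator. The sketch below is only an approximation for English text. Actual counts depend on each model's tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule."""
    if not text:
        return 0
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token estimate using the ~0.75 words per token rule."""
    return round(word_count / 0.75)

print(estimate_tokens_from_words(75))  # 100
```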
Do unused tokens roll over?
No, subscription tokens reset monthly on your billing date. Consider upgrading if you consistently exceed your limit.
Can I downgrade my plan?
Yes, you can downgrade at any time. The change takes effect at the next billing cycle.
How does annual billing work?
Annual plans are billed once per year and include a 17% discount. Tokens still reset monthly.
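As an illustration of the discount arithmetic, the sketch below assumes the 17% is taken off twelve times the month-to-month price and that the result is rounded to the nearest cent. Both points are assumptions, and the $10.00 price is a made-up input, not a real tier price.

```python
from decimal import Decimal, ROUND_HALF_UP

ANNUAL_DISCOUNT = Decimal("0.17")  # discount stated in the answer above

def annual_charge(monthly_price: str) -> Decimal:
    """Single yearly charge in USD for a plan, given its month-to-month price."""
    full_year = Decimal(monthly_price) * 12
    discounted = full_year * (1 - ANNUAL_DISCOUNT)
    return discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Hypothetical $10.00/month plan: 12 x 10.00 = 120.00, less 17%.
print(annual_charge("10.00"))  # 99.60
```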
What payment methods do you accept?
We accept all major credit cards through Stripe. Enterprise customers can pay via invoice.