Pricing

Simple, transparent pricing with no hidden fees. Pay only for what you use.

Free models are available. Get started at no cost with these free-tier models:

Chat: Llama 3.3 70B via Groq (FREE)
Embeddings: BGE-M3 via HuggingFace (FREE)
Moderation: Llama Guard 3 via Groq (FREE)

Subscription Tiers

Free

$0/month

100K tokens/month
10 requests/minute
Community support
All models available

Perfect for testing and small projects.

Developer

$29/month (or $24/mo annually)

5M tokens/month
100 requests/minute
Email support (48h)
Priority processing

Ideal for indie developers and startups.

Startup

$99/month (or $84/mo annually)

25M tokens/month
500 requests/minute
Priority email support (24h)
Advanced analytics

For growing teams and products.

Enterprise

Custom pricing

Unlimited tokens
Custom rate limits
Dedicated support (4h)
SLA guarantees
Custom models

Contact sales for volume pricing.

Model Pricing

All models are billed per million tokens (input + output combined):

Chat Models

Model	Price per Million Tokens
`llama-3.3-70b` (via Groq)	FREE
`llama-3.1-8b`	$0.10
`llama-3.1-70b`	$0.90
`mistral-7b`	$0.10
`qwen2-7b`	$0.10
`gemma-2-9b`	$0.15
`phi-3-mini`	$0.08

Embedding Models

Model	Price per Million Tokens
`bge-m3`	FREE
`e5-large-v2`	$0.01
`bge-base-en`	$0.01
`jina-embeddings-v2`	$0.02
`nomic-embed-text`	$0.01
`gte-large`	$0.01

Safety Models

Model	Price per Million Tokens
`llama-guard-3-8b` (via Groq)	FREE
`llama-guard-3`	$0.20
`shieldgemma`	$0.15
`bge-reranker-v2`	$0.05
`jina-reranker`	$0.08

How Billing Works

Token-Based Billing

You’re billed for total tokens (input + output):

Cost = total_tokens × price_per_million / 1,000,000

Example:

Input: 500 tokens
Output: 1,000 tokens
Total: 1,500 tokens
Model: llama-3.1-8b ($0.10/M)
Cost: 1,500 × $0.10 / 1,000,000 = $0.00015

Subscription vs. Pay-as-you-go

Feature	Subscription	Pay-as-you-go
Monthly tokens	Included	From wallet
Overage	Charged to wallet	Charged to wallet
Rate limits	By tier	By tier
Rollover	No	N/A

Overage Pricing

If you exceed your tier’s monthly tokens, additional usage is charged to your wallet at model prices.
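As a rough illustration of the rule above, the sketch below computes the wallet charge for usage beyond a tier's monthly allowance. The `INCLUDED_TOKENS` table and the `overage_cost` helper are hypothetical names introduced here, not part of any SDK. The sketch also assumes all overage tokens are billed at a single model's price; mixed-model usage would need a per-model sum.

```python
# Included monthly tokens per tier, copied from the Subscription Tiers section.
# (Hypothetical helper table; not an official API.)
INCLUDED_TOKENS = {
    "free": 100_000,
    "developer": 5_000_000,
    "startup": 25_000_000,
}


def overage_cost(tier: str, tokens_used: int, price_per_million: float) -> float:
    """Return the wallet charge for tokens beyond the tier's monthly allowance."""
    overage_tokens = max(0, tokens_used - INCLUDED_TOKENS[tier])
    return overage_tokens * price_per_million / 1_000_000


# Example: Developer tier, 6M tokens used on llama-3.1-8b ($0.10/M).
# 1M tokens fall outside the 5M allowance.
print(f"Overage: ${overage_cost('developer', 6_000_000, 0.10):.2f}")
```

Usage at or below the allowance yields a charge of zero, since the overage is clamped with `max(0, ...)`.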

Rate Limits

Tier	Requests/Minute	Tokens/Minute
Free	10	100,000
Developer	100	1,000,000
Startup	500	5,000,000
Enterprise	Custom	Custom
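To stay under the requests-per-minute limits in the table above, a client can throttle itself before sending. The following is a minimal sliding-window sketch, not an official client feature: `RequestThrottle` is a hypothetical class, it only tracks request counts (not the tokens-per-minute limit), and how the service responds to exceeded limits is not covered here.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side sliding-window limiter: at most `max_requests` per `window` seconds."""

    def __init__(self, max_requests: int, window: float = 60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._sent = deque()   # timestamps of requests inside the window

    def acquire(self) -> None:
        """Block until another request may be sent, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self.window - (now - self._sent[0]))


throttle = RequestThrottle(max_requests=10)  # Free tier: 10 requests/minute
throttle.acquire()  # call this before each API request
```

A sliding window is used here because it never allows more than the limit in any 60-second span; a token bucket would be an equally reasonable choice if short bursts are acceptable.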

Cost Calculator

Estimate your monthly costs:

def estimate_monthly_cost(
    requests_per_day: int,
    avg_tokens_per_request: int,
    price_per_million: float
) -> float:
    daily_tokens = requests_per_day * avg_tokens_per_request
    monthly_tokens = daily_tokens * 30
    cost = monthly_tokens * price_per_million / 1_000_000
    return cost

# Example: 1000 requests/day, 500 tokens each, $0.10/M
cost = estimate_monthly_cost(1000, 500, 0.10)
print(f"Monthly cost: ${cost:.2f}")  # $1.50

Comparison with OpenAI

Use Case	OpenAI	Assisters	Savings
1M chat tokens (GPT-4)	~$30	FREE (Llama 3.3 70B)	100%
1M chat tokens (GPT-3.5)	~$2	$0.10	95%
1M embeddings	~$0.13	FREE (BGE-M3)	100%
1M moderation	~$0.002	FREE (Llama Guard 3)	100%

FAQ

What counts as a token?

Tokens are pieces of words. Roughly:

1 token ≈ 4 characters in English
1 token ≈ 0.75 words
100 tokens ≈ 75 words

Both input and output tokens are counted.
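For quick budgeting, the rule of thumb above can be turned into a rough estimator. This is only an approximation based on the 4-characters-per-token heuristic; actual counts depend on each model's tokenizer, and `estimate_tokens` is a hypothetical helper name.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / 4)


prompt = "Summarize the following article in three bullet points."
print(f"Estimated input tokens: {estimate_tokens(prompt)}")
```

Remember to add the expected output length to the estimate, since output tokens are billed as well.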

Do unused tokens roll over?

No, subscription tokens reset monthly on your billing date. Consider upgrading if you consistently exceed your limit.

Can I downgrade my plan?

Yes, you can downgrade at any time. The change takes effect at the next billing cycle.

How does annual billing work?

Annual plans are billed once per year at a discounted rate. Based on the prices listed above, the discount is about 17% on Developer and about 15% on Startup. Tokens still reset monthly.
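The annual discount can be checked directly from the tier prices listed under Subscription Tiers. The sketch below is plain arithmetic on those listed prices; `annual_savings` is a hypothetical helper, and the prices are the only inputs taken from this page.

```python
def annual_savings(monthly_price: float, annual_monthly_price: float) -> tuple[float, float]:
    """Return (dollars saved per year, discount as a percentage)."""
    saved = (monthly_price - annual_monthly_price) * 12
    percent = (monthly_price - annual_monthly_price) / monthly_price * 100
    return saved, percent


# (tier, month-to-month price, effective monthly price when billed annually)
for name, monthly, annual in [("Developer", 29, 24), ("Startup", 99, 84)]:
    saved, percent = annual_savings(monthly, annual)
    print(f"{name}: save ${saved:.0f}/year ({percent:.1f}%)")
```

With the listed prices this works out to $60 per year (17.2%) on Developer and $180 per year (15.2%) on Startup.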

What payment methods do you accept?

We accept all major credit cards through Stripe. Enterprise customers can pay via invoice.

Get Started

Sign Up Free

Contact Sales