Introduction to Assisters API

Assisters API provides OpenAI-compatible endpoints for accessing open-source AI models. Whether you’re building chatbots, search systems, or content moderation tools, our API offers a simple, cost-effective solution.

What is Assisters API?

Assisters API is a unified interface for accessing multiple open-source AI models through a single API. We handle the infrastructure, scaling, and optimization so you can focus on building great products.

OpenAI Compatible

Use the same code you write for OpenAI; just change the base URL.

18+ Models

Access Llama, Mistral, Qwen, and more through one API

Pay Per Token

Only pay for what you use, starting at $0.08 per million tokens

99.9% Uptime

Enterprise-grade reliability with SLA guarantees

Key Features

Chat Completions

Generate conversational responses with streaming support. Perfect for chatbots, assistants, and interactive applications.
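# `client` is an OpenAI SDK client whose base URL points at Assisters API.
response = client.chat.completions.create(
    model="llama-3.1-8b",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,  # return the reply incrementally instead of all at once
)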
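With stream=True, the OpenAI Python SDK returns an iterator of chunks rather than a single response, and each chunk carries a piece of text in choices[0].delta.content. The sketch below shows one way to assemble those pieces, assuming Assisters API follows the same chunk layout (consistent with the OpenAI-compatibility claim above, but not confirmed here). The names collect_stream and fake_chunk are illustrative, and the fake chunks exist only so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Concatenate the text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        # Opening and closing chunks often have content=None.
        if delta:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in chunk with the assumed attribute layout, for offline use only."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo = [fake_chunk(None), fake_chunk("Quantum "), fake_chunk("computing"), fake_chunk(None)]
print(collect_stream(demo))  # prints "Quantum computing"
```

In a real application, the same function would be called on the object returned by client.chat.completions.create(..., stream=True); to display text as it arrives, print each delta inside the loop instead of joining at the end.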

Text Embeddings

Create vector representations for semantic search, clustering, and similarity matching.
response = client.embeddings.create(
    model="e5-large-v2",
    input="Your text here"
)
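Semantic search over embeddings usually comes down to comparing vectors, most often with cosine similarity. The following is a minimal sketch of that comparison in plain Python. The helper names are illustrative, and the toy two-dimensional vectors stand in for real embeddings, which in the OpenAI SDK's response shape would be read from response.data[i].embedding (assumed to apply here as well).

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_by_similarity(query, docs)
print([index for index, _ in ranking])  # prints [1, 2, 0]
```

For large collections, a vector database or a NumPy-based implementation would be a more practical choice than this pure-Python loop.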

Content Moderation

Detect harmful, inappropriate, or policy-violating content before it reaches your users.
response = client.moderations.create(
    model="llama-guard-3",
    input="Content to check"
)
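A moderation result is typically used as a gate: content is published only if nothing was flagged. The sketch below assumes the response follows the OpenAI moderation shape, a results list whose items expose a boolean flagged field. Whether the Llama Guard-backed endpoint returns exactly that shape is an assumption, and is_allowed and fake_response are illustrative names.

```python
from types import SimpleNamespace


def is_allowed(moderation_response) -> bool:
    """Return True only if no result in the response is flagged."""
    return not any(result.flagged for result in moderation_response.results)


def fake_response(*flags):
    """Stand-in response with the assumed layout, for offline use only."""
    return SimpleNamespace(results=[SimpleNamespace(flagged=f) for f in flags])


print(is_allowed(fake_response(False)))        # prints True
print(is_allowed(fake_response(False, True)))  # prints False
```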

Document Reranking

Improve search results by reranking documents based on relevance to a query.
response = client.rerank.create(
    model="bge-reranker-v2",
    query="search query",
    documents=["doc1", "doc2", "doc3"]
)

Available Models

We offer models across four categories:
Category   | Models                          | Starting Price
Chat       | Llama 3.1, Mistral, Qwen, Phi-3 | $0.08/M tokens
Embeddings | E5, BGE, Jina, Nomic            | $0.01/M tokens
Moderation | Llama Guard, ShieldGemma        | $0.15/M tokens
Reranking  | BGE Reranker, Jina Reranker     | $0.05/M tokens
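Per-token pricing makes a rough budget easy to compute: divide the token count by one million and multiply by the rate. The sketch below uses the starting prices listed above, so it yields a lower bound; individual models may cost more. The function and dictionary names are illustrative.

```python
# Starting prices in USD per million tokens, taken from the table above.
STARTING_PRICE_PER_MILLION = {
    "chat": 0.08,
    "embeddings": 0.01,
    "moderation": 0.15,
    "reranking": 0.05,
}


def estimate_min_cost(category: str, tokens: int) -> float:
    """Lower-bound cost in USD for a token count at the category's starting price."""
    return tokens / 1_000_000 * STARTING_PRICE_PER_MILLION[category]


# 2.5 million chat tokens at $0.08 per million.
print(f"${estimate_min_cost('chat', 2_500_000):.2f}")  # prints "$0.20"
```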

View All Models

Browse our complete model catalog with detailed specifications

Use Cases

Build conversational AI that understands context and provides helpful responses. Our chat models support multi-turn conversations with streaming for real-time interaction.
Automatically detect harmful content, spam, or policy violations before they reach your platform. Protect your users and brand.
Combine embeddings with chat completions to build retrieval-augmented generation systems that answer questions from your knowledge base.
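For the retrieval-augmented generation case, the step that joins retrieval to generation is prompt assembly: the passages found through embedding search are placed in the prompt ahead of the user's question. Here is a minimal sketch of that step. The function name, the prompt wording, and the sample passages are illustrative choices, not a prescribed format.

```python
from typing import Dict, List


def build_rag_messages(question: str, passages: List[str]) -> List[Dict[str, str]]:
    """Build a chat `messages` list that grounds the answer in retrieved passages."""
    # Number the passages so the model can refer to them.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Answer using only the numbered context passages. "
        "If the answer is not in them, say you do not know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


# Hypothetical passages, as if returned by an embedding search.
messages = build_rag_messages(
    "How long do refunds take?",
    ["Refunds are processed within 5 business days.", "Support is available 24/7."],
)
print(messages[0]["content"])
```

The resulting list can be passed as the messages argument of a chat completion call like the one shown under Chat Completions.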

Getting Started

Ready to build? Follow these steps:
1. Create an Account
   Sign up at assisters.dev to get your free API key with 100K tokens.

2. Install the SDK
   Use the official OpenAI SDK - it works with Assisters API out of the box.

3. Make Your First Call
   Follow our quickstart guide to send your first request.
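The steps above can be sketched as a small client factory. The OpenAI Python SDK accepts api_key and base_url arguments, which is where the Assisters values go. This page does not state the actual base URL, so the sketch reads it from an environment variable. The variable names ASSISTERS_API_KEY and ASSISTERS_BASE_URL and both helper functions are assumptions made for illustration, so take the real values from the quickstart guide.

```python
import os
from typing import Dict, Mapping


def client_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the OpenAI client arguments from (hypothetical) environment variables."""
    required = ("ASSISTERS_API_KEY", "ASSISTERS_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["ASSISTERS_API_KEY"],
        "base_url": env["ASSISTERS_BASE_URL"],
    }


def make_client(env: Mapping[str, str] = os.environ):
    """Create an OpenAI SDK client pointed at the configured base URL."""
    # Imported lazily so the settings helper works without the SDK installed.
    from openai import OpenAI  # requires `pip install openai`

    return OpenAI(**client_settings(env))
```

Calling make_client() once at startup and reusing the returned client keeps the key and URL out of the rest of the code.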

Support & Resources