Pricing based on actual usage.
There is no required monthly platform subscription. LLMLab charges only for the tokens and system activity you actually use.
Costs follow workflow execution, model usage, codebase parsing, retrieval, storage, hosted surfaces, and optional GPU activity. LLMLab's own support-agent setup used less than $0.10 in platform usage.
| Service | Price | Billing unit | Notes |
|---|---|---|---|
| Platform activity | | | |
| Workflow run | $0.003 | per run | Structured workflow execution overhead |
| Preset generation | $0.001 | per preset | Template and preset generation overhead |
| Codebase parsing | $0.01 | per 100K lines parsed | Source parsing for codebase ingestion |
| Hosted API tokens | | | |
| openai:gpt-5.4 | $4.225 input / $25.35 output | per 1M tokens | OpenAI flagship hosted path† |
| openai:gpt-5.4-mini | $1.2675 input / $7.605 output | per 1M tokens | Lower-cost OpenAI hosted path† |
| anthropic:claude-sonnet-4-0 | $5.07 input / $25.35 output | per 1M tokens | Anthropic Sonnet hosted path† |
| anthropic:claude-opus-4-1 | $25.35 input / $126.75 output | per 1M tokens | Anthropic highest-cost hosted path† |
| google:gemini-2.5-pro | $4.225 input / $25.35 output | per 1M tokens | Google hosted pro path† |
| google:gemini-2.5-flash | $0.507 input / $4.225 output | per 1M tokens | Faster lower-cost Google hosted path† |
| google:gemini-2.5-flash-lite | $0.169 input / $0.676 output | per 1M tokens | Lowest-cost Google flash hosted path† |
| deepseek:deepseek-chat | $0.4563 input / $1.859 output | per 1M tokens | DeepSeek chat hosted path† |
| deepseek:deepseek-reasoner | $0.9295 input / $3.7011 output | per 1M tokens | DeepSeek reasoning hosted path† |
| xai:grok-4 | $5.07 input / $25.35 output | per 1M tokens | xAI Grok hosted path† |
| mistral:mistral-medium-2508 | $0.676 input / $3.38 output | per 1M tokens | Mistral medium hosted path† |
| mistral:mistral-small-2603 | $0.2535 input / $1.014 output | per 1M tokens | Mistral small hosted path† |
| Embeddings and reranking† | | | |
| sentence-transformers/all-minilm-l6-v2 | $1.00 | per 1M tokens | Hosted dense embedding model |
| intfloat/multilingual-e5-small | $1.00 | per 1M tokens | Hosted multilingual embedding model |
| qdrant/bm25 | $0.40 | per 1M tokens | Hosted sparse lexical embedding |
| voyage:rerank-2.5 / voyage:voyage-rerank-2.5 | $0.10 | per 1K documents | Hosted reranker usage |
| GPU compute‡ | | | |
| Budget GPU tier | $0.60 | per GPU-hour | RTX 3070 Ti / RTX 3080(Ti) / T4 |
| Mid GPU tier | $1.20 | per GPU-hour | RTX 3090(Ti) / L4 / A10 |
| High GPU tier | $4.00 | per GPU-hour | RTX 4090 / A100 40GB / A40 / L40 |
| Ultra GPU tier | $10.00 | per GPU-hour | A100 80GB / H100 / H200 class |
| Storage | | | |
| Vector storage | $0.60 | per GiB-month | Persistent vector index footprint |
| Hosted model storage | $0.60 | per GiB-month | Persistent hosted model storage |
* Amounts are rounded up to the next whole $0.01 increment across charge groups.
† User-provided tokens are not billed by LLMLab. Provider rates are charged separately by the provider.
‡ One of these GPU options will be selected depending on availability and current cloud pricing.
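To make the per-unit rates concrete, here is a minimal Python sketch that estimates a charge from token counts and workflow runs. The function name `estimate_charge` and the usage figures are hypothetical, and the rounding step is an assumption: the footnote says amounts are rounded up to the next $0.01 across charge groups, so this sketch rounds the combined total up once, which may not match how an actual invoice groups charges.

```python
from decimal import Decimal, ROUND_CEILING

# Rates copied from the pricing table (USD per 1M tokens: input, output).
TOKEN_RATES_PER_1M = {
    "openai:gpt-5.4-mini": (Decimal("1.2675"), Decimal("7.605")),
    "google:gemini-2.5-flash": (Decimal("0.507"), Decimal("4.225")),
}
WORKFLOW_RUN = Decimal("0.003")  # USD per workflow run, from the table


def estimate_charge(model, input_tokens, output_tokens, workflow_runs):
    """Estimate a charge in USD for hosted tokens plus workflow runs.

    Hypothetical helper: it covers only two charge types and ignores
    retrieval, storage, GPU time, and anything else on the table.
    """
    in_rate, out_rate = TOKEN_RATES_PER_1M[model]
    million = Decimal(1_000_000)
    raw = (
        in_rate * input_tokens / million
        + out_rate * output_tokens / million
        + WORKFLOW_RUN * workflow_runs
    )
    # Assumption: round the combined total up to the next whole cent.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


print(estimate_charge("google:gemini-2.5-flash", 200_000, 50_000, 100))
```

`Decimal` is used instead of floats so that sub-cent rates such as $0.003 per run add up exactly before the round-up is applied.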
Usage-based does not mean uncontrolled.
A public support integration needs pricing controls, abuse protection, and visibility. LLMLab is designed to keep spend tied to real support value.
Monitor suspicious traffic patterns and spam-like usage against a public integration or organization surface.
LLMLab supports protective behavior when traffic appears to abuse an exposed integration, assistant, or organization surface.
Usage across the actual system surface.
LLMLab pricing is designed around actual platform activity: workflow runs, model-backed responses, context ingestion, codebase parsing, retrieval, answer memory, web integration interactions, hosted model paths, and optional model infrastructure.
See how far $5 can go with a usage-based plan
Use the free credit to build workflows, support agents, assistants, and operational AI systems, and see how far $5 goes before you need to reach for a credit card.
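As a rough illustration of how far the $5 credit might stretch, the sketch below counts how many support conversations it could cover on `google:gemini-2.5-flash-lite`. The per-conversation profile (2,000 input tokens, 500 output tokens, one workflow run) is a hypothetical assumption, not a measured figure, and the estimate ignores retrieval, storage, and rounding, so treat the result as an upper bound.

```python
from decimal import Decimal

# Rates copied from the pricing table for google:gemini-2.5-flash-lite (USD).
INPUT_RATE_PER_1M = Decimal("0.169")
OUTPUT_RATE_PER_1M = Decimal("0.676")
WORKFLOW_RUN = Decimal("0.003")  # USD per workflow run


def conversations_for_credit(credit, input_tokens, output_tokens):
    """Count whole conversations a credit covers.

    Hypothetical helper: assumes each conversation costs one workflow run
    plus the given token counts, and nothing else.
    """
    million = Decimal(1_000_000)
    per_conversation = (
        INPUT_RATE_PER_1M * input_tokens / million
        + OUTPUT_RATE_PER_1M * output_tokens / million
        + WORKFLOW_RUN
    )
    # int() truncates, so a partially funded conversation is not counted.
    return int(credit / per_conversation)


print(conversations_for_credit(Decimal("5"), 2_000, 500))
```

With this profile the workflow-run fee dominates the per-conversation cost, so switching to a more expensive model changes the result more than trimming a few hundred tokens would.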