How does LLM API pricing work?

LLM APIs charge per token, with separate rates for input tokens (your prompts) and output tokens (model responses). Prices are quoted per million tokens. For example, Claude Sonnet 4.5 costs $3/1M input tokens and $15/1M output tokens. Output tokens are typically more expensive because each one is generated with its own pass through the model. Most providers offer pay-as-you-go billing with no minimum commitment.
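As a rough illustration of per-token billing, the sketch below computes the cost of a single call from token counts and per-million rates. The function name `request_cost` and the token counts are assumptions made for this example; the rates are the Claude Sonnet 4.5 figures quoted above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the dollar cost of one API call at per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical call: 2,000 prompt tokens and 500 response tokens
# at $3 per 1M input tokens and $15 per 1M output tokens.
cost = request_cost(2_000, 500, 3.00, 15.00)
print(f"${cost:.4f}")  # $0.0135
```

Even though the response is a quarter of the prompt's length, it accounts for more than half of the cost in this example.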

Which LLM API is the cheapest in 2026?

DeepSeek V3.2 is among the cheapest high-quality options at $0.14/$0.28 per 1M tokens (input/output), delivering quality scores rivaling much more expensive models. Google's Gemini 2.5 Flash is also extremely affordable. For free options, many providers offer free tiers with rate limits — DeepSeek R1, Llama 3.3 70B, and Gemma 3 are available at zero cost on OpenRouter.

How much does it cost to use the OpenAI API vs Claude API?

OpenAI's flagship GPT-5 costs roughly $10/$30 per 1M tokens (input/output), while Anthropic's Claude Opus 4.6 costs $5/$25. For mid-tier models, GPT-4.1 ($2/$8) competes with Claude Sonnet 4.5 ($3/$15). At the budget tier, GPT-5 Nano and Claude Haiku 4.5 are both at or under $5/1M output tokens. Exact prices change frequently, so use the comparison table for current rates.

Is DeepSeek API really cheaper than OpenAI?

Yes, significantly. DeepSeek V3.2 output tokens cost roughly $0.28/1M compared to GPT-5's $30/1M — over 100x cheaper. Even comparing to GPT-4.1 ($8/1M output), DeepSeek is ~29x cheaper. The trade-off is that DeepSeek scores lower on quality benchmarks, though the gap has been narrowing. For many use cases like chatbots and content generation, DeepSeek offers excellent value.

What is the best LLM API for the price?

The 'value score' in our comparison (quality points per dollar of output cost) highlights models that deliver the most quality per dollar spent. DeepSeek V3.2, Gemini 2.5 Flash, and MiniMax models consistently top value rankings. For premium quality without breaking the bank, Claude Sonnet 4.5 and GPT-4.1 offer strong quality at mid-range prices.

LAST UPDATED: MAY 28, 2026

LLM API Pricing Comparison

Compare pricing across 315+ LLM APIs from OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, and more. Sortable by quality, price, or value score.

Pricing TLDR

• Budget models from $0.07/M input tokens; premium models up to $180/M output tokens
• Quality scores from 0-100 based on independent benchmarks (Theozard)
• Value score = quality per dollar of output cost — find the best bang for your buck

Official pricing: OpenRouter API (live pricing)

Quality Scores: Theozard

LLM API Cost Comparison — Monthly Pricing

Model                                     | Context | Quality | In / 1M | Out / 1M | Value | Monthly Cost
------------------------------------------|---------|---------|---------|----------|-------|-------------
openai/gpt-5.5                            | 1.1M    | 100     | $5.00   | $30.00   | 3.3   | $20.00
openai/gpt-5.5-pro                        | 1.1M    | 100     | $30.00  | $180.00  | 0.6   | $120.00
google/gemini-3.1-pro-preview-customtools | 1.0M    | -       | $2.00   | $12.00   | 7.9   | $8.00
google/gemini-3.1-pro-preview             | 1.0M    | -       | $2.00   | $12.00   | 7.9   | $8.00
anthropic/claude-opus-4.7                 | 1.0M    | -       | $5.00   | $25.00   | 3.8   | $17.50
anthropic/claude-opus-4                   | 200K    | -       | $15.00  | $75.00   | 1.3   | $52.50
anthropic/claude-opus-4.7-fast            | 1.0M    | -       | $30.00  | $150.00  | 0.6   | $105.00
openai/gpt-5.4                            | 1.1M    | -       | $2.50   | $15.00   | 6.3   | $10.00
openai/gpt-5.4-pro                        | 1.1M    | -       | $30.00  | $180.00  | 0.5   | $120.00
google/gemini-3.5-flash                   | 1.0M    | -       | $1.50   | $9.00    | 10.2  | $6.00
xiaomi/mimo-v2.5-pro                      | 1.0M    | -       | $0.44   | $0.87    | 102.3 | $0.87
moonshotai/kimi-k2.6                      | 262K    | -       | $0.73   | $3.49    | 25.5  | $2.48
openai/gpt-5.3-codex                      | 400K    | -       | $1.75   | $14.00   | 6.4   | $8.75
x-ai/grok-4.3                             | 1.0M    | -       | $1.25   | $2.50    | 35.2  | $2.50
anthropic/claude-opus-4.6                 | 1.0M    | -       | $5.00   | $25.00   | 3.5   | $17.50
anthropic/claude-opus-4.6-fast            | 1.0M    | -       | $30.00  | $150.00  | 0.6   | $105.00
deepseek/deepseek-v4-pro                  | 1.0M    | -       | $0.44   | $0.87    | 98.8  | $0.87
qwen/qwen3.6-max-preview                  | 262K    | -       | $1.04   | $6.24    | 13.8  | $4.16
anthropic/claude-sonnet-4.6               | 1.0M    | -       | $3.00   | $15.00   | 5.7   | $10.50
anthropic/claude-sonnet-4                 | 1.0M    | -       | $3.00   | $15.00   | 5.7   | $10.50
z-ai/glm-5.1                              | 203K    | -       | $0.98   | $3.08    | 27.6  | $2.52
openai/gpt-5.2-chat                       | 128K    | -       | $1.75   | $14.00   | 6.1   | $8.75
openai/gpt-5.2                            | 400K    | -       | $1.75   | $14.00   | 6.1   | $8.75
openai/gpt-5.2-pro                        | 400K    | -       | $21.00  | $168.00  | 0.5   | $105.00
z-ai/glm-5                                | 203K    | -       | $0.60   | $1.92    | 43.2  | $1.56
qwen/qwen3.6-plus                         | 1.0M    | -       | $0.33   | $1.95    | 42.6  | $1.30
anthropic/claude-opus-4.5                 | 200K    | -       | $5.00   | $25.00   | 3.3   | $17.50
minimax/minimax-m2.7                      | 205K    | -       | $0.28   | $1.20    | 68.3  | $0.88
x-ai/grok-4.20                            | 2.0M    | -       | $1.25   | $2.50    | 32.8  | $2.50
xiaomi/mimo-v2-pro                        | 1.0M    | -       | $1.00   | $3.00    | 27.3  | $2.50
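
The Monthly Cost figures listed above appear consistent with a workload totalling 1M input tokens and 0.5M output tokens per month (for openai/gpt-5.5: $5.00 + 0.5 × $30.00 = $20.00). That workload is inferred from the numbers, not stated on the page. A minimal sketch of the calculation, with a hypothetical `monthly_cost` helper that takes per-call token counts and a call volume:

```python
def monthly_cost(input_tokens_per_call: int, output_tokens_per_call: int,
                 calls_per_month: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate monthly spend in dollars from per-call usage and per-1M rates."""
    total_input = input_tokens_per_call * calls_per_month
    total_output = output_tokens_per_call * calls_per_month
    return (total_input * input_price_per_m
            + total_output * output_price_per_m) / 1_000_000


# Assumed workload: 1,000 calls/month, 1,000 input and 500 output tokens each,
# i.e. 1M input and 0.5M output tokens in total.
print(f"${monthly_cost(1_000, 500, 1_000, 5.00, 30.00):.2f}")  # openai/gpt-5.5
print(f"${monthly_cost(1_000, 500, 1_000, 5.00, 25.00):.2f}")  # anthropic/claude-opus-4.7
```

Swapping in your own token counts and call volume gives a comparable estimate for any row.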

Spending across 315+ OpenRouter models?

Monitor your OpenRouter credits and usage in real-time.

Privacy-first desktop app. No sign-up required.

Best Value LLM APIs — Quality Per Dollar

Value score = quality points per $1 of output cost (per 1M tokens). Higher is better. These models deliver the most capability per dollar spent.
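
The definition above can be sketched in a few lines. The function name `value_score` is an assumption for this example; the inputs are the two rows of the comparison that list a quality score (openai/gpt-5.5 and openai/gpt-5.5-pro, both quality 100).

```python
def value_score(quality: float, output_price_per_m: float) -> float:
    """Quality points per $1 of output cost (price given per 1M output tokens)."""
    if output_price_per_m <= 0:
        # Free models have no meaningful quality-per-dollar ratio.
        raise ValueError("output price must be positive")
    return quality / output_price_per_m


print(round(value_score(100, 30.00), 1))   # 3.3, matches openai/gpt-5.5
print(round(value_score(100, 180.00), 1))  # 0.6, matches openai/gpt-5.5-pro
```

Because the score divides by price, very cheap models can post very high scores even with modest quality.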

Model                       | Provider    | Quality | Output / 1M | Value Score
----------------------------|-------------|---------|-------------|------------
inclusionAI: Ling-2.6-flash | inclusionai | -       | $0.03       | 1433.3
Meta: Llama 3.1 8B Instruct |             |         |             |

About LLM API Pricing

What is LLM API Pricing?

LLM APIs let you integrate large language models into your applications via HTTP requests. Every major AI provider — OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI — offers API access to their models with per-token pricing. You pay separately for input tokens (your prompts) and output tokens (model responses), quoted per million tokens.

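Input vs Output Token Pricing: Input tokens (prompts, context) are cheaper because the model processes the whole prompt in a single parallel pass. Output tokens (completions) cost more, roughly 2-8x for the models listed above, because each generated token requires its own forward pass through the model. For prompt-heavy workloads, trimming prompt length is usually the simplest way to cut cost; for generation-heavy workloads, capping output length matters more.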
Quality-Price Tradeoff: More expensive models generally deliver higher quality responses. Our quality scores (0-100) let you compare: Claude Opus 4.6 scores 100 at $25/1M output, while DeepSeek V3.2 scores 79 at $0.28/1M. The right choice depends on your quality requirements.
Context Window Costs: Larger context windows let you send more data per request but increase token costs. A 200K context model processing long documents costs proportionally more in input tokens than a short chatbot interaction. Choose context size based on your actual needs.
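
To see how much more a given model charges for output than for input, divide the two rates. The sketch below does this for four rows copied from the comparison above; the dictionary layout and the name `output_input_ratio` are assumptions for illustration.

```python
# (input price, output price) in dollars per 1M tokens, from the comparison above.
PRICES = {
    "x-ai/grok-4.3": (1.25, 2.50),
    "anthropic/claude-opus-4.7": (5.00, 25.00),
    "openai/gpt-5.5": (5.00, 30.00),
    "openai/gpt-5.2": (1.75, 14.00),
}


def output_input_ratio(input_price: float, output_price: float) -> float:
    """How many times more an output token costs than an input token."""
    return output_price / input_price


for model, (in_price, out_price) in PRICES.items():
    print(f"{model}: {output_input_ratio(in_price, out_price):.1f}x")
```

A high ratio means response length dominates the bill, so limiting output tokens matters more for that model than trimming the prompt.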

When to Use LLM API Pricing

Different use cases call for different models. Match your quality requirements to your budget using the value score to find the optimal model.

Ideal for

• Chatbots and conversational AI: mid-tier models like Sonnet or GPT-4.1 offer the best quality/cost balance
• Code generation: specialized models like DeepSeek Coder or Codex variants optimize for code tasks
• Bulk content processing: budget models like Gemini Flash or DeepSeek V3 handle volume at minimal cost
• Complex reasoning tasks: premium models like Opus 4.6 or GPT-5 justify their cost for hard problems
• Prototyping: free tier models let you build without spending anything

Not ideal for

• Real-time applications needing sub-100ms latency (consider edge-deployed models)
• Tasks that don't need language understanding (use traditional algorithms instead)
• Processing sensitive data with compliance requirements (check each provider's data policies)

LLM API Monthly Cost Estimates

Hobby / Prototyping

$0-10/mo

• Free tier models

• < 1K requests/day

• Testing & development

Startup / MVP

$50-300/mo

• Mid-tier models (Sonnet, GPT-4.1)

• 5-20K requests/day

• Single product

Growth

$300-2,000/mo

• Mix of premium & budget models

• 20-100K requests/day

• Multiple use cases

Enterprise

$2,000+/mo

• Premium models for quality-critical tasks

• 100K+ requests/day

• Model fallback chains

5 LLM API Cost Optimization Tips

Use a Model Cascade

Route easy queries to cheap models (Haiku, Flash, GPT-5 Nano) and escalate to expensive ones (Opus, GPT-5) only when needed. A classifier model can decide the routing. Savings depend on how many queries escalate; 60-80% compared with using premium models for everything is typical.
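
A minimal sketch of such a cascade follows. Everything in it is illustrative: the model names are placeholders, `is_hard` is a toy keyword heuristic standing in for a real classifier model, and the prices ($5 and $25 per 1M output tokens) are example figures, not quotes.

```python
from typing import Callable

CHEAP_MODEL = "budget-model"      # hypothetical placeholder name
PREMIUM_MODEL = "premium-model"   # hypothetical placeholder name


def is_hard(query: str) -> bool:
    """Toy stand-in for a classifier model: flag long or reasoning-heavy queries."""
    keywords = ("prove", "step by step", "analyze", "debug")
    text = query.lower()
    return len(text.split()) > 100 or any(k in text for k in keywords)


def route(query: str, classifier: Callable[[str], bool] = is_hard) -> str:
    """Pick the model for a query: premium only when the classifier says so."""
    return PREMIUM_MODEL if classifier(query) else CHEAP_MODEL


def blended_price(escalation_rate: float, cheap_price: float,
                  premium_price: float) -> float:
    """Average price per 1M tokens when a fraction of traffic escalates."""
    return escalation_rate * premium_price + (1 - escalation_rate) * cheap_price


print(route("What is the capital of France?"))   # budget-model
print(route("Debug this failing unit test"))     # premium-model

# Assumed: 20% of queries escalate, $5 vs $25 per 1M output tokens.
blended = blended_price(0.20, 5.00, 25.00)
print(f"saving vs premium-only: {1 - blended / 25.00:.0%}")  # 64%
```

The saving is driven almost entirely by the escalation rate, so it is worth measuring how often the classifier actually escalates on real traffic.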

Optimize Prompt Length

Input tokens cost money. Strip unnecessary context, use concise system prompts, and avoid sending full documents when a summary suffices. Input cost scales linearly with prompt length, so halving the prompt halves the input cost.

Cache Frequent Requests

If you make similar API calls repeatedly, cache responses. Many providers also offer prompt caching features that reduce costs for repeated system prompts. Anthropic's prompt caching can save up to 90% on cached tokens.
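
For exact repeats, a client-side response cache is one simple option. The sketch below is an in-memory example only: `ResponseCache` and `fake_model` are hypothetical names, and `fake_model` stands in for a real API client. It is separate from provider-side prompt caching, which is configured through each provider's own API.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Cache responses in memory, keyed by a hash of (model, prompt)."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1          # served locally, no API cost
            return self._store[key]
        self.misses += 1            # paid call, then remembered
        response = self._call_model(model, prompt)
        self._store[key] = response
        return response


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real API call (assumption for this example)."""
    return f"[{model}] answer to: {prompt}"


cache = ResponseCache(fake_model)
cache.complete("budget-model", "What are your opening hours?")
cache.complete("budget-model", "What are your opening hours?")
print(cache.hits, cache.misses)  # 1 1
```

This only helps when prompts repeat byte-for-byte; a production cache would also need expiry and a size limit.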

Compare Value Scores, Not Just Prices

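The cheapest model isn't always the best choice. By value score alone, a model at $0.50/1M output with quality score 30 (value 60) outranks one at $2/1M with quality score 70 (value 35), yet it may be too weak for your task. Decide the minimum quality you need first, then use the value score column to find the sweet spot among the models that meet it.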
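
One way to apply this is to set a quality floor and then rank the remaining models by value score. The sketch below uses three made-up models; the names, prices, and quality scores are hypothetical and chosen only to show the selection logic.

```python
from typing import List, Optional

# Hypothetical catalogue: output price in dollars per 1M tokens, quality on 0-100.
MODELS = [
    {"name": "small", "output_price": 0.40, "quality": 35},
    {"name": "mid", "output_price": 2.00, "quality": 70},
    {"name": "large", "output_price": 25.00, "quality": 95},
]


def best_value(models: List[dict], min_quality: float) -> Optional[dict]:
    """Return the highest value-score model that meets the quality floor."""
    eligible = [m for m in models if m["quality"] >= min_quality]
    if not eligible:
        return None
    return max(eligible, key=lambda m: m["quality"] / m["output_price"])


print(best_value(MODELS, 0)["name"])    # small
print(best_value(MODELS, 60)["name"])   # mid
print(best_value(MODELS, 99))           # None
```

With no floor the cheapest model wins on value; raising the floor to 60 shifts the pick to the mid-priced model.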

Monitor Per-Model Spending

Track costs per model and per use case with CostGoat. Identify which models consume the most budget, find opportunities to downgrade specific workflows, and catch cost spikes early before they become expensive surprises.

Start Tracking Your LLM API Spending

Monitor spending across OpenAI, Anthropic, Google, OpenRouter, and other LLM providers — all from one menubar app.

Privacy-first desktop app. 7-day free trial, no sign-up required.

LLM API Pricing FAQ

Common questions about LLM API costs, pricing models, and how to save money
