What is Gemini API pricing per token?

Pricing per 1M tokens (input / output). Current flagships: Gemini 3.1 Pro Preview $2 / $12 (input doubles, output rises ~50% above 200K tokens) and Gemini 3.5 Flash $1.50 / $9. Mid-tier: Gemini 3 Flash Preview $0.50 / $3 and the new Gemini 3.1 Flash-Lite $0.25 / $1.50. Cheapest: Gemini 2.5 Flash-Lite $0.10 / $0.40. Batch API applies a 50% discount and context caching saves up to 90% on cached input.

What is the difference between Gemini 3.1 Pro, 3.5 Flash, and Flash-Lite?

Gemini 3.1 Pro Preview is the flagship — best for hard reasoning, complex coding, and long-context work (quality 95, paid-only). Gemini 3.5 Flash is the speed-meets-frontier tier with native grounding and search (quality 92, free-tier eligible). Gemini 3.1 Flash-Lite is the cost-effective newcomer for high-volume production (quality 56) — meaningfully pricier than the legacy 2.5 Flash-Lite (2.5x input / 3.75x output) but on the current Gemini 3.x architecture.

Is the Gemini API cheaper than OpenAI or Claude?

For comparable quality, yes. Gemini 3.1 Pro Preview at $2 / $12 matches Claude Opus 4.7 on Theozard's benchmark (both scoring 95) while costing 60% less on input and 52% less on output than Opus ($5 / $25), and 60% less on both than GPT-5.5 ($5 / $30). Gemini 3.5 Flash at $1.50 / $9 is half the input price of Claude Sonnet 4.6 ($3 / $15), 40% cheaper on output, and scores higher on benchmarks (92 vs 86). Generous free tiers on most Gemini models make prototyping essentially free.
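
The percentage differences can be checked directly from the per-1M-token prices quoted above. The snippet below is a minimal sketch; the helper name `percent_cheaper` is illustrative and not part of any SDK.

```python
def percent_cheaper(price: float, other_price: float) -> float:
    """Percentage by which `price` undercuts `other_price`."""
    return round((other_price - price) / other_price * 100, 1)


# (input, output) in USD per 1M tokens, as quoted in the answer above.
GEMINI_31_PRO = (2.00, 12.00)
COMPETITORS = {
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
}

for name, (other_in, other_out) in COMPETITORS.items():
    in_saving = percent_cheaper(GEMINI_31_PRO[0], other_in)
    out_saving = percent_cheaper(GEMINI_31_PRO[1], other_out)
    print(f"vs {name}: input {in_saving}% cheaper, output {out_saving}% cheaper")
```

Input and output savings differ, so the blended saving for a given workload depends on its input-to-output token ratio.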

Is Gemini API free?

Yes, most Gemini models have a generous, rate-limited free tier. Gemini 3.5 Flash, 3 Flash Preview, 3.1 Flash-Lite, 2.5 Pro/Flash/Flash-Lite, and the legacy 2.0 Flash/Flash-Lite are all free at the API level, subject to rate limits and to Google using your requests to improve its products. Gemini 3.1 Pro Preview is paid-only. Google Search grounding includes 5,000 free prompts per month across the Gemini 3.x family.

How do I get a Gemini API key?

Go to Google AI Studio (aistudio.google.com), sign in with your Google account, click 'Get API key' in the left sidebar, and create a new key. No credit card required for the free tier — API keys work immediately across all eligible Gemini models.
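
Once a key exists, a first request might look like the sketch below. It uses only the Python standard library against the public `generateContent` REST endpoint. The environment variable name `GEMINI_API_KEY` and the helper names are assumptions made for illustration; the model ID is one of those listed on this page.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(api_key: str, model: str, prompt: str) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(api_key, model, prompt)) as resp:
        return json.load(resp)


def first_text(reply: dict) -> str:
    """Extract the text of the first candidate from a response."""
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

A call such as `generate(os.environ["GEMINI_API_KEY"], "gemini-2.5-flash-lite", "Say hello")` returns a dictionary whose `usageMetadata` field reports prompt and output token counts, which are the quantities that billing is based on.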

LAST UPDATED: MAY 21, 2026

Gemini API Pricing Calculator & Complete Cost Guide

Calculate Gemini API costs for text and chat models per token and per month. Compare the full Gemini 3.x family with batch & cache discounts.

Pricing TLDR

• Generous free tiers for most models (rate-limited) · 3.1 Pro Preview is paid-only
• Pay-per-token: 2.5 Flash-Lite ($0.10/$0.40) · 3.1 Flash-Lite ($0.25/$1.50) · 3.5 Flash ($1.50/$9) · 3.1 Pro ($2/$12) per M tokens
• 50% batch discount · 90% context caching savings · Audio input 2-7x text pricing

Official pricing: Google AI · Quality scores: Theozard

Gemini API Cost Calculator - Monthly Pricing

| Model | Model ID | Input (per 1M tokens) | Output (per 1M tokens) | Monthly Cost |
| --- | --- | --- | --- | --- |
| Gemini 3.1 Pro Preview | gemini-3.1-pro-preview | $2.00 | $12.00 | $8.00 |
| Gemini 3.5 Flash | gemini-3.5-flash | $1.50 | $9.00 | $6.00 |
| Gemini 3 Flash Preview | gemini-3-flash-preview | $0.50 | $3.00 | $2.00 |
| Gemini 2.5 Pro | gemini-2.5-pro | $1.25 | $10.00 | $6.25 |
| Gemini 3.1 Flash-Lite | gemini-3.1-flash-lite | $0.25 | $1.50 | $1.00 |
| Gemini 2.5 Flash | gemini-2.5-flash | $0.30 | $2.50 | $1.55 |
| Gemini 2.0 Flash | gemini-2.0-flash | $0.10 | $0.40 | $0.30 |
| Gemini 2.5 Flash-Lite | gemini-2.5-flash-lite | $0.10 | $0.40 | $0.30 |
| Gemini 2.0 Flash-Lite | gemini-2.0-flash-lite | $0.08 | $0.30 | $0.23 |
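
The monthly figures listed above are consistent with a workload of 1M input tokens and 500K output tokens per month (for example, 1,000 calls of 1,000 input and 500 output tokens each). That workload is inferred from the numbers rather than stated on the page, so treat it as an assumption. A minimal sketch of the same arithmetic:

```python
# USD per 1M tokens (input, output), copied from the listing above.
PRICES = {
    "gemini-3.1-pro-preview": (2.00, 12.00),
    "gemini-3.5-flash": (1.50, 9.00),
    "gemini-3-flash-preview": (0.50, 3.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-3.1-flash-lite": (0.25, 1.50),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.0-flash": (0.10, 0.40),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.0-flash-lite": (0.08, 0.30),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 calls_per_month: int) -> float:
    """Monthly USD cost for a fixed per-call token profile, rounded to cents."""
    in_price, out_price = PRICES[model]
    total_in_millions = input_tokens * calls_per_month / 1_000_000
    total_out_millions = output_tokens * calls_per_month / 1_000_000
    return round(total_in_millions * in_price + total_out_millions * out_price, 2)


for model_id in PRICES:
    # Assumed workload: 1,000 calls/month, 1,000 input + 500 output tokens each.
    print(f"{model_id}: ${monthly_cost(model_id, 1_000, 500, 1_000):.2f}")
```

Changing the three workload arguments gives an estimate for any other traffic profile. Batch, caching, and long-context adjustments are not included here.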

Tracking Gemini API costs?

Monitor your Gemini API usage and costs in real-time.

Privacy-first desktop app. No sign-up required.

About Gemini API

What is Gemini API?

The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. The current flagships are Gemini 3.1 Pro Preview (top-tier reasoning) and Gemini 3.5 Flash (speed-meets-frontier with native grounding), backed by mid-tier 3.1 Flash-Lite and several prior-gen 2.5/2.0 options.

• Multiple Model Tiers: Choose from Gemini 3.1 Pro Preview (current flagship, paid-only), Gemini 3.5 Flash (frontier + speed with grounding), Gemini 3 Flash Preview (prior-gen Flash), Gemini 3.1 Flash-Lite (newest cost-effective tier), Gemini 2.5 Pro/Flash/Flash-Lite (prior gen), and legacy Gemini 2.0 Flash/Flash-Lite (deprecating 2026-06-01).
• Massive Context Windows: All current Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, then input 2x / output 1.5x above 200K tokens.
• Token-Based Pricing: Pay only for what you use, with separate pricing for input and output tokens. Output typically costs 4-8x as much as input. Generous free tiers are available for most models; Gemini 3.1 Pro Preview is the only current model that is paid-only.
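
Because pricing is per token while content is usually measured in words, a rough conversion helps with estimates. The sketch below uses only the ratio given above (1M tokens ≈ 750K words). Real token counts depend on the tokenizer and the language of the text, so this is an approximation, and the function names are illustrative.

```python
WORDS_PER_MILLION_TOKENS = 750_000  # from "1M token context windows (~750K words)"


def words_to_tokens(words: int) -> int:
    """Approximate token count for a given number of words."""
    return round(words * 1_000_000 / WORDS_PER_MILLION_TOKENS)


def input_cost_usd(words: int, input_price_per_million: float) -> float:
    """Approximate input cost in USD for a document of `words` words."""
    return words_to_tokens(words) / 1_000_000 * input_price_per_million


# A 3,000-word document is roughly 4,000 tokens.
print(words_to_tokens(3_000))
# Input cost of a 150,000-word document on a $2.00 / 1M-token model.
print(round(input_cost_usd(150_000, 2.00), 2))
```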

When to Use Gemini API

Start with Gemini 2.5 Flash-Lite or 3.1 Flash-Lite for simple tasks and cost-effective production. Move to Gemini 3.5 Flash when you need frontier intelligence at sub-Pro pricing, and reserve Gemini 3.1 Pro Preview for the hardest reasoning tasks. Use batch processing and context caching to significantly reduce costs for production workloads.

Ideal for

Customer support chatbots with natural conversation flow
Code completion and review in development environments
Large document summarization and analysis (1M context)
Creative content generation for marketing and blogs
Grounded responses using Google Search integration

Not ideal for

Real-time applications requiring <100ms latency
Simple pattern matching tasks (use cheaper alternatives)
Applications needing guaranteed deterministic outputs
Tasks requiring specialized domain knowledge without context

Gemini API Pricing Breakdown

Free Tier

Most Gemini models offer generous free tiers with rate limits. Free tier usage is subject to Google using your data to improve products. No credit card required to get started.

Gemini 3.5 Flash, 3 Flash Preview, 3.1 Flash-Lite: Free with rate limits
Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
Gemini 2.0 Flash, 2.0 Flash-Lite: Free (deprecating 2026-06-01)
Google Search grounding: 5,000 free prompts/month across the Gemini 3.x family
Gemini 3.1 Pro Preview: Paid tier only (no free access)

Cost Optimization Features

Batch API (50% Discount)

Process non-urgent workloads asynchronously at half price. Example: Gemini 3.1 Pro Preview drops to $1.00/$6.00 per million tokens (vs $2/$12). Gemini 2.5 Pro drops to $0.625/$5. Perfect for data processing and content generation. Not available on free tier.
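
The batch arithmetic is a flat 50% reduction on both input and output rates. A minimal sketch, with illustrative function names, that reproduces the figures above and prices a whole job:

```python
BATCH_DISCOUNT = 0.50  # 50% off, per the description above


def batch_prices(input_price: float, output_price: float) -> tuple:
    """Return (input, output) prices per 1M tokens after the batch discount."""
    factor = 1 - BATCH_DISCOUNT
    return input_price * factor, output_price * factor


def job_cost(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float, batch: bool = False) -> float:
    """USD cost of a job, optionally at batch rates."""
    if batch:
        input_price, output_price = batch_prices(input_price, output_price)
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price


print(batch_prices(2.00, 12.00))   # Gemini 3.1 Pro Preview
print(batch_prices(1.25, 10.00))   # Gemini 2.5 Pro
print(job_cost(50_000_000, 10_000_000, 2.00, 12.00, batch=True))
```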

Context Caching (90% Savings)

Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.

Grounding with Google Search

Get up-to-date information by grounding responses with Google Search. Gemini 3.x family: 5,000 prompts/month free (shared across models), then $14/1K queries. Gemini 2.x models: $35/1K prompts after the daily free quota. Google Maps grounding: $14/1K queries on Gemini 3.x, $25/1K prompts on Gemini 2.x.

Tiered Context Pricing

Pro models (3.1 Pro Preview, 2.5 Pro) have standard pricing for prompts ≤200K tokens. Above 200K, input doubles and output increases ~50%. Flash and Flash-Lite models have flat pricing regardless of context length.
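
A sketch of the tiered rule for a single Pro request follows. It assumes the higher rates apply to the whole request once the prompt exceeds 200K tokens, and it uses the multipliers quoted on this page (2x input, 1.5x output). Check the official pricing table for exact long-context rates.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # prompt tokens
INPUT_MULTIPLIER = 2.0
OUTPUT_MULTIPLIER = 1.5


def pro_request_cost(prompt_tokens: int, output_tokens: int,
                     input_price: float, output_price: float) -> float:
    """USD cost of one request on a Pro model with tiered context pricing."""
    if prompt_tokens > LONG_CONTEXT_THRESHOLD:
        # Assumption: the whole request is billed at the long-context rates.
        input_price *= INPUT_MULTIPLIER
        output_price *= OUTPUT_MULTIPLIER
    return prompt_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price


# Gemini 3.1 Pro Preview at $2 / $12 per 1M tokens.
print(round(pro_request_cost(100_000, 10_000, 2.00, 12.00), 2))  # below threshold
print(round(pro_request_cost(300_000, 10_000, 2.00, 12.00), 2))  # above threshold
```

Under these assumptions, tripling the prompt from 100K to 300K tokens raises the cost of the request by more than four times, because both rates step up once the threshold is crossed.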

Audio Input Pricing Premiums

Audio input costs 2-7x as much as text input, depending on the model. Gemini 3 Flash: 2x ($1.00 vs $0.50). Gemini 2.5 Flash: 3.33x ($1.00 vs $0.30). Gemini 2.5 Flash-Lite: 3x ($0.30 vs $0.10). Gemini 2.0 Flash: 7x ($0.70 vs $0.10). Audio output is priced the same as text output.
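
The multipliers above follow directly from the per-1M-token prices. A small sketch that recomputes them and prices a mixed audio/text input; the dictionary and function names are illustrative:

```python
# model: (audio input, text input) in USD per 1M tokens, from the figures above.
AUDIO_VS_TEXT = {
    "Gemini 3 Flash": (1.00, 0.50),
    "Gemini 2.5 Flash": (1.00, 0.30),
    "Gemini 2.5 Flash-Lite": (0.30, 0.10),
    "Gemini 2.0 Flash": (0.70, 0.10),
}


def audio_premium(model: str) -> float:
    """Audio input price as a multiple of text input price, to two decimals."""
    audio, text = AUDIO_VS_TEXT[model]
    return round(audio / text, 2)


def mixed_input_cost(model: str, audio_tokens: int, text_tokens: int) -> float:
    """USD input cost for a request mixing audio and text tokens."""
    audio, text = AUDIO_VS_TEXT[model]
    return audio_tokens / 1e6 * audio + text_tokens / 1e6 * text


for name in AUDIO_VS_TEXT:
    print(name, audio_premium(name))
```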

Gemini API Monthly Cost Estimates

Light Use: $0-30/mo
• Personal projects
• <1K requests/day
• Flash-Lite free tier

Medium Use: $30-150/mo
• Small apps
• 1-5K requests/day
• Flash or Flash-Lite

Heavy Use: $150-800/mo
• Production apps
• 5-20K requests/day
• Mixed models

Enterprise: $800+/mo
• Large scale
• 20K+ requests/day
• Vertex AI available

6 Gemini API Cost Optimization Tips

Use Context Caching

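Save up to 90% on repeated content by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price, plus storage at $1-4.50/M tokens/hour. Example: across 10K requests with an 80% cache hit rate, the cached tokens in roughly 8K of those requests are billed at one tenth of the input price, which matters most on Pro models, where input is priced highest.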
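
To put numbers on the 10K-request example, the sketch below prices the shared (cacheable) part of the prompt only. Several inputs are assumptions chosen for illustration: a 50K-token shared document, Gemini 3.1 Pro Preview input at $2.00/M, storage at the top of the quoted range ($4.50/M tokens/hour), and a one-hour cache lifetime.

```python
CACHE_READ_FRACTION = 0.10  # cache reads cost 10% of the base input price


def cached_input_cost(requests: int, hit_rate: float, cached_tokens: int,
                      input_price: float, storage_price_per_hour: float,
                      storage_hours: float) -> float:
    """USD cost of the shared prompt portion across all requests, with caching."""
    millions = cached_tokens / 1_000_000
    hits = requests * hit_rate
    misses = requests - hits
    read_cost = hits * millions * input_price * CACHE_READ_FRACTION
    miss_cost = misses * millions * input_price
    storage_cost = millions * storage_price_per_hour * storage_hours
    return read_cost + miss_cost + storage_cost


def uncached_input_cost(requests: int, cached_tokens: int, input_price: float) -> float:
    """USD cost of resending the same shared tokens on every request."""
    return requests * cached_tokens / 1_000_000 * input_price


with_cache = cached_input_cost(10_000, 0.80, 50_000, 2.00, 4.50, 1)
without_cache = uncached_input_cost(10_000, 50_000, 2.00)
print(f"without cache: ${without_cache:.2f}, with cache: ${with_cache:.2f}")
print(f"saving: {(1 - with_cache / without_cache) * 100:.0f}%")
```

Under these assumptions the saving is about 72% rather than the headline 90%, because the 20% of requests that miss the cache still pay full price and storage adds a small fixed cost.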

Enable Batch API

Get a 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 3.1 Pro Preview drops to $1.00/$6.00 per M tokens (vs $2/$12 standard); 2.5 Pro drops to $0.625/$5. Well suited to data processing, content generation, and analysis tasks.

Start with Flash-Lite

Use the cheapest models (Gemini 2.5 Flash-Lite at $0.10/$0.40 or the newer 3.1 Flash-Lite at $0.25/$1.50 per M tokens) for classification and routing. Only escalate to 3.5 Flash or 3.1 Pro Preview when necessary for complex tasks.
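
One way to implement "escalate only when necessary" is to try tiers from cheapest to most expensive and stop at the first acceptable answer. The sketch below is hypothetical: `call_model` and `is_good_enough` are placeholders the caller supplies (for example, an API wrapper and a validation check), and the tier order is an assumption based on the prices on this page.

```python
from typing import Callable, Tuple

# Cheapest to most expensive, using model IDs listed on this page.
TIERS = ["gemini-2.5-flash-lite", "gemini-3.5-flash", "gemini-3.1-pro-preview"]


def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],   # (model_id, prompt) -> answer text
    is_good_enough: Callable[[str], bool],   # caller-defined quality check
) -> Tuple[str, str]:
    """Return (model_id, answer) from the cheapest tier that passes the check.

    If no tier passes, the answer from the most expensive tier is returned.
    """
    answer = ""
    for model_id in TIERS:
        answer = call_model(model_id, prompt)
        if is_good_enough(answer):
            return model_id, answer
    return TIERS[-1], answer
```

Escalation pays for every failed attempt as well as the final one, so it tends to save money only when most requests succeed on the cheapest tier.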

Leverage Free Tiers

Most Gemini models have generous free tiers with rate limits. Use free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.

Optimize Context Length

Pro models charge 2x input and ~1.5x output for prompts >200K tokens. Keep prompts under 200K when possible, or use Flash models which have flat pricing regardless of context length for long document processing.

Monitor Gemini API Token Usage

Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.

Gemini Model Selection Guide

| Use Case | Recommended Model | Monthly Cost (Est.) | Why This Model? |
| --- | --- | --- | --- |
| Customer Support Chat | Gemini 2.5 Flash-Lite (cheapest available) | ~$5-30 | Lowest cost ($0.10/$0.40), free tier available |
| Code Generation | Gemini 3.1 Pro Preview (current Pro flagship) | ~$120-500 | Top quality (95), 1M context, replaces 3 Pro Preview |
| Research & Analysis | Gemini 3.1 Pro Preview (highest quality) | ~$100-500 | Most powerful multimodal understanding, best for hard reasoning |
| Content Writing | Gemini 3.5 Flash (current Flash flagship) | ~$60-300 | Quality 92, much cheaper than Pro, free tier available |
| Data Extraction | Gemini 2.5 Flash-Lite + Batch (best value) | ~$10-50 | Cheapest option ($0.05/$0.20 with batch discount) |
| Grounded Search | Gemini 3.5 Flash + Google Search | ~$70-250 | Native grounding strength, 5K free queries/month |

Google Search Grounding Pricing

| Model / Feature | Free Quota | Paid Rate | Notes |
| --- | --- | --- | --- |
| Gemini 3.x family (3.1 Pro, 3.5 Flash, 3 Flash, 3.1 Flash-Lite) | 5,000 prompts/mo (shared) | $14 / 1K queries | Free quota shared across all Gemini 3.x models · per-query pricing |
| Gemini 2.5 Pro | 1,500 RPD (paid tier only) | $35 / 1K prompts | Grounding not available on the free tier |
| Gemini 2.5 Flash / Flash-Lite | 500 RPD (Free) / 1,500 RPD (Paid) | $35 / 1K prompts | Shared limit between Flash & Flash-Lite |
| Gemini 2.0 Flash | 500 RPD (Free) / 1,500 RPD (Paid) | $35 / 1K prompts | Deprecating 2026-06-01 |
| Google Maps Grounding (Gemini 3.x) | Shares 5,000/mo with Search | $14 / 1K queries | Per-query pricing on Gemini 3.x |
| Google Maps Grounding (Gemini 2.5 Pro) | 10,000 RPD (free) | $25 / 1K prompts | Highest Maps free quota of any model |
| Google Maps Grounding (Gemini 2.5 Flash / 2.0 Flash) | 1,500 RPD (free) | $25 / 1K prompts | Location-based queries on prior-gen Flash |

RPD: Requests Per Day. Each prompt can generate multiple search queries. Gemini 3.x models use per-query pricing with a shared 5,000-prompts/month free quota; Gemini 2.x models use per-prompt pricing with daily limits.
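
A rough estimate of grounding charges can be sketched as below. Two simplifications are assumptions, not statements from the pricing page: for Gemini 3.x the free quota (stated in prompts) and the paid rate (stated in queries) are treated as the same unit, and for Gemini 2.x every day is assumed to have the same prompt volume over a 30-day month.

```python
def grounding_cost_3x(queries_per_month: int, free_quota: int = 5_000,
                      rate_per_1k: float = 14.00) -> float:
    """Monthly USD cost of Search grounding on Gemini 3.x (simplified)."""
    billable = max(0, queries_per_month - free_quota)
    return billable / 1_000 * rate_per_1k


def grounding_cost_2x(prompts_per_day: int, free_rpd: int = 1_500,
                      rate_per_1k: float = 35.00, days: int = 30) -> float:
    """Monthly USD cost of Search grounding on Gemini 2.x paid tier (simplified)."""
    billable_per_day = max(0, prompts_per_day - free_rpd)
    return billable_per_day * days / 1_000 * rate_per_1k


print(grounding_cost_3x(20_000))   # 15,000 billable queries
print(grounding_cost_2x(2_500))    # 1,000 billable prompts per day
```

Because one prompt can trigger several search queries, actual Gemini 3.x charges may be higher than a per-prompt count suggests.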

Start Tracking Your Gemini API Spending

Track usage across Gemini 3.1 Pro, 3.5 Flash, Flash-Lite, and prior-gen variants from your menubar. Get alerts when usage spikes.

Gemini API Pricing FAQ

Common questions about Gemini API costs, billing, and optimization
