Gemini API Pricing Calculator & Complete Cost Guide
Calculate Gemini API costs for text and chat models per token and per month. Compare the full Gemini 3.x family with batch & cache discounts.
Pricing TLDR
- Generous free tiers for most models (rate-limited) · 3.1 Pro Preview is paid-only
- Pay-per-token: 2.5 Flash-Lite ($0.10/$0.40) · 3.1 Flash-Lite ($0.25/$1.50) · 3.5 Flash ($1.50/$9) · 3.1 Pro ($2/$12) per M tokens
- 50% batch discount · 90% context caching savings · Audio input 2-7x text pricing
Gemini API Cost Calculator - Monthly Pricing
- Gemini 3.1 Pro Preview (gemini-3.1-pro-preview): $2.00 input / $12.00 output per 1M tokens
- Gemini 3.5 Flash (gemini-3.5-flash): $1.50 input / $9.00 output per 1M tokens
- Gemini 3 Flash Preview (gemini-3-flash-preview): $0.50 input / $3.00 output per 1M tokens
- Gemini 2.5 Pro (gemini-2.5-pro): $1.25 input / $10.00 output per 1M tokens
- Gemini 3.1 Flash-Lite (gemini-3.1-flash-lite): $0.25 input / $1.50 output per 1M tokens
- Gemini 2.5 Flash (gemini-2.5-flash): $0.30 input / $2.50 output per 1M tokens
- Gemini 2.0 Flash (gemini-2.0-flash): $0.10 input / $0.40 output per 1M tokens
- Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite): $0.10 input / $0.40 output per 1M tokens
- Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite): $0.08 input / $0.30 output per 1M tokens
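The arithmetic behind a monthly estimate can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the `PRICES` table and the `monthly_cost` function are hypothetical names, the prices are copied from the per-model figures above, and the workload numbers are made up.

```python
# USD per 1M tokens as (input, output), copied from the per-model prices above.
PRICES = {
    "gemini-3.1-pro-preview": (2.00, 12.00),
    "gemini-3.5-flash": (1.50, 9.00),
    "gemini-3-flash-preview": (0.50, 3.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-3.1-flash-lite": (0.25, 1.50),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.0-flash": (0.10, 0.40),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.0-flash-lite": (0.08, 0.30),
}


def monthly_cost(model, input_tokens, output_tokens, calls_per_month):
    """Estimated monthly USD cost at standard (non-batch, uncached) rates.

    input_tokens and output_tokens are per-call averages.
    """
    in_price, out_price = PRICES[model]
    per_call = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_call * calls_per_month


# Hypothetical workload: 1,000 input + 500 output tokens, 10,000 calls per month.
print(f"${monthly_cost('gemini-3.5-flash', 1_000, 500, 10_000):.2f}")  # $60.00
```

The sketch ignores free-tier allowances, long-context tiers on Pro models, and audio input, all of which change the result.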

About Gemini API
What is Gemini API?
The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. The current flagships are Gemini 3.1 Pro Preview (top-tier reasoning) and Gemini 3.5 Flash (speed-meets-frontier with native grounding), backed by mid-tier 3.1 Flash-Lite and several prior-gen 2.5/2.0 options.
- Multiple Model Tiers: Choose from Gemini 3.1 Pro Preview (current flagship, paid-only), Gemini 3.5 Flash (frontier + speed with grounding), Gemini 3 Flash Preview (prior-gen Flash), Gemini 3.1 Flash-Lite (newest cost-effective tier), Gemini 2.5 Pro/Flash/Flash-Lite (prior gen), and legacy Gemini 2.0 Flash/Flash-Lite (deprecating 2026-06-01).
- Massive Context Windows: All current Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, then input 2x / output 1.5x above 200K tokens.
- Token-Based Pricing: Pay only for what you use, with separate prices for input and output tokens. Output typically costs about 4-8x as much as input. Generous free tiers are available for most models; Gemini 3.1 Pro Preview is the only current model that is paid-only.
When to Use Gemini API
Start with Gemini 2.5 Flash-Lite or 3.1 Flash-Lite for simple tasks and cost-effective production, move to Gemini 3.5 Flash when you need frontier intelligence at sub-Pro pricing, and reserve Gemini 3.1 Pro Preview for the hardest reasoning tasks. Use batch processing and context caching to significantly reduce costs for production workloads.
Ideal for
- Customer support chatbots with natural conversation flow
- Code completion and review in development environments
- Large document summarization and analysis (1M context)
- Creative content generation for marketing and blogs
- Grounded responses using Google Search integration
Not ideal for
- Real-time applications requiring <100ms latency
- Simple pattern matching tasks (use cheaper alternatives)
- Applications needing guaranteed deterministic outputs
- Tasks requiring specialized domain knowledge without context
Gemini API Pricing Breakdown
Free Tier
Most Gemini models offer generous free tiers with rate limits. On the free tier, Google may use your data to improve its products. No credit card is required to get started.
- Gemini 3.5 Flash, 3 Flash Preview, 3.1 Flash-Lite: Free with rate limits
- Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
- Gemini 2.0 Flash, 2.0 Flash-Lite: Free (deprecating 2026-06-01)
- Google Search grounding: 5,000 free prompts/month across the Gemini 3.x family
- Gemini 3.1 Pro Preview: Paid tier only (no free access)
Cost Optimization Features
Batch API (50% Discount)
Process non-urgent workloads asynchronously at half price. Example: Gemini 3.1 Pro Preview drops to $1.00/$6.00 per million tokens (vs $2/$12). Gemini 2.5 Pro drops to $0.625/$5. Perfect for data processing and content generation. Not available on free tier.
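As a quick check of the batch figures quoted above, here is a minimal sketch of the discount arithmetic. The `batch_price` helper is a hypothetical name used only for illustration.

```python
BATCH_DISCOUNT = 0.50  # 50% off standard prices, as stated above


def batch_price(input_price, output_price):
    """Return (input, output) USD per 1M tokens after the batch discount."""
    factor = 1 - BATCH_DISCOUNT
    return input_price * factor, output_price * factor


print(batch_price(2.00, 12.00))  # Gemini 3.1 Pro Preview -> (1.0, 6.0)
print(batch_price(1.25, 10.00))  # Gemini 2.5 Pro -> (0.625, 5.0)
```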
Context Caching (90% Savings)
Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.
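Whether caching pays off depends on how often the cached content is read while it is stored. The sketch below compares the two cases under stated assumptions: the function names are hypothetical, the workload is made up, and the cost of the initial cache write is not modeled.

```python
CACHE_READ_FACTOR = 0.10  # cache reads cost 10% of the base input price


def cached_prompt_cost(cached_tokens, reads, input_price, storage_price, hours):
    """USD cost of serving a cached prefix `reads` times over `hours` hours.

    input_price is USD per 1M tokens; storage_price is USD per 1M tokens
    per hour. The initial cache write is not included (assumption).
    """
    millions = cached_tokens / 1_000_000
    read_cost = millions * input_price * CACHE_READ_FACTOR * reads
    storage_cost = millions * storage_price * hours
    return read_cost + storage_cost


def uncached_prompt_cost(cached_tokens, reads, input_price):
    """USD cost of resending the same tokens at the full input price."""
    return cached_tokens / 1_000_000 * input_price * reads


# Hypothetical: a 100K-token document read 200 times within one hour,
# $2.00 input price, $4.50 storage (the top of the range quoted above).
with_cache = cached_prompt_cost(100_000, 200, 2.00, 4.50, hours=1)
without_cache = uncached_prompt_cost(100_000, 200, 2.00)
print(f"cached: ${with_cache:.2f}, uncached: ${without_cache:.2f}")
# cached: $4.45, uncached: $40.00
```

With only a handful of reads per hour, the storage fee can outweigh the read discount, so the comparison is worth running for each workload.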
Grounding with Google Search
Get up-to-date information by grounding responses with Google Search. Gemini 3.x family: 5,000 prompts/month free (shared), then $14/1K queries. Gemini 2.x models: $35/1K prompts after free quota. Google Maps grounding: $14/1K prompts on Gemini 3.x, $25/1K on Gemini 2.x.
Tiered Context Pricing
Pro models (3.1 Pro Preview, 2.5 Pro) have standard pricing for prompts ≤200K tokens. Above 200K, input doubles and output increases ~50%. Flash and Flash-Lite models have flat pricing regardless of context length.
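The tiered scheme can be expressed as a small function. This is a sketch under one explicit assumption: once the prompt exceeds 200K tokens, the whole request is billed at the higher rates, not just the tokens beyond the threshold. The name `pro_request_cost` is hypothetical.

```python
TIER_THRESHOLD = 200_000  # prompt tokens


def pro_request_cost(prompt_tokens, output_tokens, input_price, output_price):
    """USD cost of one Pro request under the tiered pricing described above.

    Assumption: the higher rates apply to the entire request once the
    prompt exceeds the threshold.
    """
    if prompt_tokens > TIER_THRESHOLD:
        input_price *= 2.0   # input doubles
        output_price *= 1.5  # output rises by about 50%
    return (prompt_tokens * input_price + output_tokens * output_price) / 1_000_000


# Gemini 3.1 Pro Preview at $2/$12 with 2,000 output tokens.
print(f"{pro_request_cost(150_000, 2_000, 2.00, 12.00):.3f}")  # 0.324
print(f"{pro_request_cost(300_000, 2_000, 2.00, 12.00):.3f}")  # 1.236
```

In this example, doubling the prompt length roughly quadruples the cost of the request, because both the token count and the per-token rate go up.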
Audio Input Pricing Premiums
Audio input costs 2-7x as much as text input, depending on the model. Gemini 3 Flash: 2x ($1.00 vs $0.50). Gemini 2.5 Flash: 3.33x ($1.00 vs $0.30). Gemini 2.5 Flash-Lite: 3x ($0.30 vs $0.10). Gemini 2.0 Flash: 7x ($0.70 vs $0.10). Audio output is priced the same as text output.
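The multipliers above follow directly from the quoted prices. A short sketch that reproduces them (the `AUDIO_PRICES` table and `audio_premium` helper are hypothetical names):

```python
# (text input, audio input) in USD per 1M tokens, from the paragraph above.
AUDIO_PRICES = {
    "Gemini 3 Flash": (0.50, 1.00),
    "Gemini 2.5 Flash": (0.30, 1.00),
    "Gemini 2.5 Flash-Lite": (0.10, 0.30),
    "Gemini 2.0 Flash": (0.10, 0.70),
}


def audio_premium(text_price, audio_price):
    """How many times more audio input costs than text input."""
    return audio_price / text_price


for model, (text, audio) in AUDIO_PRICES.items():
    print(f"{model}: {audio_premium(text, audio):.2f}x")
```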
Gemini API Monthly Cost Estimates
- Light Use ($0-30/mo): Personal projects · <1K requests/day · Flash-Lite free tier
- Medium Use ($30-150/mo): Small apps · 1-5K requests/day · Flash or Flash-Lite
- Heavy Use ($150-800/mo): Production apps · 5-20K requests/day · Mixed models
- Enterprise ($800+/mo): Large scale · 20K+ requests/day · Vertex AI available
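To place a workload in one of these bands, multiply daily requests by an average cost per request and compare the result with the band boundaries. The sketch below assumes a 30-day month and treats each upper bound as exclusive; both are assumptions, as are the function names and the example cost per request.

```python
def monthly_from_daily(requests_per_day, cost_per_request, days=30):
    """Estimated monthly USD spend (assumes a 30-day month by default)."""
    return requests_per_day * cost_per_request * days


def usage_tier(monthly_usd):
    """Map monthly spend to the bands above (upper bounds treated as exclusive)."""
    if monthly_usd < 30:
        return "Light Use"
    if monthly_usd < 150:
        return "Medium Use"
    if monthly_usd < 800:
        return "Heavy Use"
    return "Enterprise"


# Hypothetical: 3,000 requests/day at an average of $0.001 per request.
spend = monthly_from_daily(3_000, 0.001)
print(f"${spend:.2f}/mo -> {usage_tier(spend)}")  # $90.00/mo -> Medium Use
```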
6 Gemini API Cost Optimization Tips
Use Context Caching
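Save 90% on repeated content by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price. Storage costs $1-4.50/M tokens/hour. Example: with 10K requests and an 80% cache hit rate, 8K requests pay one tenth of the standard input price for the cached portion of the prompt; the savings are largest on higher-priced models such as the Pro tier.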
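A partial hit rate lowers the blended input price without delivering the full 90%. A minimal sketch, assuming `hit_rate` is the share of input tokens served from cache and leaving storage fees out (`effective_input_price` is a hypothetical name):

```python
def effective_input_price(base_price, hit_rate, read_factor=0.10):
    """Blended USD per 1M input tokens when `hit_rate` of the input tokens
    are cache reads billed at `read_factor` of the base price.

    Storage fees are not included (assumption).
    """
    return base_price * ((1 - hit_rate) + hit_rate * read_factor)


# Gemini 3.1 Pro Preview input at $2.00 with 80% of input tokens cached.
blended = effective_input_price(2.00, 0.80)
saving = 1 - blended / 2.00
print(f"${blended:.2f} per 1M input tokens ({saving:.0%} saved)")
# $0.56 per 1M input tokens (72% saved)
```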
Enable Batch API
Get 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 3.1 Pro Preview drops to $1.00/$6.00 per M tokens (vs $2/$12 standard); 2.5 Pro drops to $0.625/$5. Perfect for data processing, content generation, and analysis tasks.
Start with Flash-Lite
Use the cheapest models (Gemini 2.5 Flash-Lite at $0.10/$0.40 or the newer 3.1 Flash-Lite at $0.25/$1.50 per M tokens) for classification and routing. Only escalate to 3.5 Flash or 3.1 Pro Preview when necessary for complex tasks.
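One way to apply this tip is an escalation ladder that tries the cheapest model first and moves up only when the answer fails a quality check. The sketch below is illustrative only: `call_model` and `is_good_enough` are hypothetical callables that you supply, and no real API client is used.

```python
# Cheapest first; model IDs taken from the pricing list in this guide.
MODEL_LADDER = [
    "gemini-2.5-flash-lite",
    "gemini-3.5-flash",
    "gemini-3.1-pro-preview",
]


def answer_with_escalation(prompt, call_model, is_good_enough):
    """Try models cheapest-first and return (model, answer) for the first
    answer that passes the check. If none passes, return the last
    model's answer."""
    answer = None
    for model in MODEL_LADDER:
        answer = call_model(model, prompt)
        if is_good_enough(answer):
            return model, answer
    return MODEL_LADDER[-1], answer


# Stub model call so the sketch runs without network access: the cheapest
# model returns an empty answer, the others return a placeholder string.
def fake_call(model, prompt):
    return "" if model == "gemini-2.5-flash-lite" else f"{model}: ok"


model, answer = answer_with_escalation("Classify this ticket", fake_call, bool)
print(model)  # gemini-3.5-flash
```

Failed attempts are still billed, so escalation only saves money when most requests succeed on the cheaper models.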
Leverage Free Tiers
Most Gemini models have generous free tiers with rate limits. Use free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.
Optimize Context Length
Pro models charge 2x input and ~1.5x output for prompts >200K tokens. Keep prompts at or under 200K tokens when possible; for long-document processing, use Flash models, which have flat pricing regardless of context length.
Monitor Gemini API Token Usage
Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.
Gemini Model Selection Guide
Use case: recommended model (positioning) · estimated monthly cost · why this model
- Customer Support Chat: Gemini 2.5 Flash-Lite (cheapest available) · ~$5-30 · Lowest cost ($0.10/$0.40), free tier available
- Code Generation: Gemini 3.1 Pro Preview (current Pro flagship) · ~$120-500 · Top quality (95), 1M context, replaces 3 Pro Preview
- Research & Analysis: Gemini 3.1 Pro Preview (highest quality) · ~$100-500 · Most powerful multimodal understanding, best for hard reasoning
- Content Writing: Gemini 3.5 Flash (current Flash flagship) · ~$60-300 · Quality 92, much cheaper than Pro, free tier available
- Data Extraction: Gemini 2.5 Flash-Lite + Batch (best value) · ~$10-50 · Cheapest option ($0.05/$0.20 with batch discount)
- Grounded Search: Gemini 3.5 Flash + Google Search · ~$70-250 · Native grounding strength, 5K free queries/month
Google Search Grounding Pricing
- Gemini 3.x family (3.1 Pro, 3.5 Flash, 3 Flash, 3.1 Flash-Lite): Free quota shared across all Gemini 3.x models · per-query pricing
- Gemini 2.5 Pro: Grounding not available on the free tier
- Gemini 2.5 Flash / Flash-Lite: Shared limit between Flash & Flash-Lite
- Gemini 2.0 Flash: Deprecating 2026-06-01
- Google Maps Grounding (Gemini 3.x): Per-query pricing on Gemini 3.x
- Google Maps Grounding (Gemini 2.5 Pro): Highest Maps free quota of any model
- Google Maps Grounding (Gemini 2.5 Flash / 2.0 Flash): Location-based queries on prior-gen Flash
RPD: Requests Per Day. Each prompt can generate multiple search queries. Gemini 3.x models use per-query pricing with a shared 5,000-prompts/month free quota; Gemini 2.x models use per-prompt pricing with daily limits.
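Because one prompt can trigger several search queries, the two billing models diverge as queries per prompt grow. The sketch below compares them using the Google Search rates quoted in this guide ($14 per 1K queries on Gemini 3.x, $35 per 1K prompts on Gemini 2.x). How the free quota is consumed is an assumption here, and the function names and workload are hypothetical.

```python
def grounding_cost_3x(prompts, queries_per_prompt,
                      free_prompts=5_000, rate_per_1k_queries=14.0):
    """Gemini 3.x: billed per search query after the monthly free quota.

    Assumption: the first `free_prompts` prompts are entirely free and
    every query from later prompts is billed.
    """
    billable_prompts = max(0, prompts - free_prompts)
    return billable_prompts * queries_per_prompt * rate_per_1k_queries / 1_000


def grounding_cost_2x(prompts, free_prompts, rate_per_1k_prompts=35.0):
    """Gemini 2.x: billed per grounded prompt, however many queries it runs.

    `free_prompts` is the monthly total of the daily free allowance,
    supplied by the caller.
    """
    return max(0, prompts - free_prompts) * rate_per_1k_prompts / 1_000


# Hypothetical month: 20,000 grounded prompts, 2 search queries per prompt,
# and no free allowance counted on the 2.x side.
print(grounding_cost_3x(20_000, 2))               # 420.0
print(grounding_cost_2x(20_000, free_prompts=0))  # 700.0
```

Under these assumptions, per-query billing on Gemini 3.x stays cheaper until prompts average more than 2.5 queries each, ignoring the free quota.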

