
CostGoat

LAST UPDATED: MAY 21, 2026

Gemini API Pricing Calculator & Complete Cost Guide

Calculate Gemini API costs for text and chat models per token and per month. Compare the full Gemini 3.x family with batch & cache discounts.


Pricing TLDR

  • Generous free tiers for most models (rate-limited) · 3.1 Pro Preview is paid-only
  • Pay-per-token: 2.5 Flash-Lite ($0.10/$0.40) · 3.1 Flash-Lite ($0.25/$1.50) · 3.5 Flash ($1.50/$9) · 3.1 Pro ($2/$12) per M tokens
  • 50% batch discount · 90% context caching savings · Audio input 2-7x text pricing

Official pricing: Google AI · Quality scores: Theozard

Gemini API Cost Calculator - Monthly Pricing


Prices are per 1M tokens. Each monthly cost shown equals the input price plus half the output price, which is consistent with a workload of 1M input tokens and 500K output tokens per month.

  • Gemini 3.1 Pro Preview (gemini-3.1-pro-preview): Context 1M · Quality 95 · In $2.00 / Out $12.00 · Monthly cost $8.00
  • Gemini 3.5 Flash (gemini-3.5-flash): Context 1M · Quality 92 · In $1.50 / Out $9.00 · Monthly cost $6.00
  • Gemini 3 Flash Preview (gemini-3-flash-preview): Context 1M · Quality 77 · In $0.50 / Out $3.00 · Monthly cost $2.00
  • Gemini 2.5 Pro (gemini-2.5-pro): Context 1M · Quality 57 · In $1.25 / Out $10.00 · Monthly cost $6.25
  • Gemini 3.1 Flash-Lite (gemini-3.1-flash-lite): Context 1M · Quality 56 · In $0.25 / Out $1.50 · Monthly cost $1.00
  • Gemini 2.5 Flash (gemini-2.5-flash): Context 1M · Quality 45 · In $0.30 / Out $2.50 · Monthly cost $1.55
  • Gemini 2.0 Flash (gemini-2.0-flash): Context 1M · Quality 31 · In $0.10 / Out $0.40 · Monthly cost $0.30
  • Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite): Context 1M · Quality 29 · In $0.10 / Out $0.40 · Monthly cost $0.30
  • Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite): Context 1M · Quality 24 · In $0.08 / Out $0.30 · Monthly cost $0.23


About Gemini API

What is Gemini API?

The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. The current flagships are Gemini 3.1 Pro Preview (top-tier reasoning) and Gemini 3.5 Flash (speed-meets-frontier with native grounding), backed by mid-tier 3.1 Flash-Lite and several prior-gen 2.5/2.0 options.

  • Multiple Model Tiers: Choose from Gemini 3.1 Pro Preview (current flagship, paid-only), Gemini 3.5 Flash (frontier + speed with grounding), Gemini 3 Flash Preview (prior-gen Flash), Gemini 3.1 Flash-Lite (newest cost-effective tier), Gemini 2.5 Pro/Flash/Flash-Lite (prior gen), and legacy Gemini 2.0 Flash/Flash-Lite (deprecating 2026-06-01).
  • Massive Context Windows: All current Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, then input 2x / output 1.5x above 200K tokens.
  • Token-Based Pricing: Pay only for what you use, with separate prices for input and output tokens. Output typically costs 4-8x as much as input. Generous free tiers are available for most models; Gemini 3.1 Pro Preview is the only current model that is paid-only.
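The token-based pricing described above reduces to a simple formula: tokens times the per-million price, summed over input and output. Here is a minimal Python sketch of that arithmetic. The `PRICES` table and the `token_cost` function are illustrative names, not part of any Google SDK, and the prices are copied from this page.

```python
# USD per 1M tokens (input, output), copied from the pricing list on this page.
PRICES = {
    "gemini-3.1-pro-preview": (2.00, 12.00),
    "gemini-3.5-flash": (1.50, 9.00),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost at standard rates (no batch or cache discount)."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 1M input tokens and 500K output tokens in a month.
print(token_cost("gemini-3.1-pro-preview", 1_000_000, 500_000))  # 8.0
```

For that workload the result matches the $8.00 monthly cost the calculator shows for Gemini 3.1 Pro Preview.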

When to Use Gemini API

Start with Gemini 2.5 Flash-Lite or 3.1 Flash-Lite for simple tasks and cost-effective production. Move to Gemini 3.5 Flash when you need frontier intelligence at sub-Pro pricing, and reserve Gemini 3.1 Pro Preview for the hardest reasoning tasks. Use batch processing and context caching to significantly reduce costs for production workloads.

Ideal for

  • Customer support chatbots with natural conversation flow
  • Code completion and review in development environments
  • Large document summarization and analysis (1M context)
  • Creative content generation for marketing and blogs
  • Grounded responses using Google Search integration

Not ideal for

  • Real-time applications requiring <100ms latency
  • Simple pattern matching tasks (use cheaper alternatives)
  • Applications needing guaranteed deterministic outputs
  • Tasks requiring specialized domain knowledge without context

Gemini API Pricing Breakdown

Free Tier

Most Gemini models offer generous free tiers with rate limits. On the free tier, Google may use your data to improve its products. No credit card is required to get started.

  • Gemini 3.5 Flash, 3 Flash Preview, 3.1 Flash-Lite: Free with rate limits
  • Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
  • Gemini 2.0 Flash, 2.0 Flash-Lite: Free (deprecating 2026-06-01)
  • Google Search grounding: 5,000 free prompts/month across the Gemini 3.x family
  • Gemini 3.1 Pro Preview: Paid tier only (no free access)

Cost Optimization Features

Batch API (50% Discount)

Process non-urgent workloads asynchronously at half price. Example: Gemini 3.1 Pro Preview drops to $1.00/$6.00 per million tokens (vs $2/$12), and Gemini 2.5 Pro drops to $0.625/$5. Well suited to data processing and content generation. Not available on the free tier.
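To see what the batch discount does to a real bill, it helps to model a workload where only part of the traffic can wait. The sketch below assumes the 50% discount applies uniformly to input and output tokens, as the examples above suggest. `batch_price` and `cost_with_batch` are hypothetical helper names.

```python
BATCH_DISCOUNT = 0.50  # 50% off standard rates, per the section above


def batch_price(standard_price_per_m: float) -> float:
    """Per-1M-token price after the batch discount."""
    return standard_price_per_m * (1 - BATCH_DISCOUNT)


def cost_with_batch(standard_cost: float, batch_share: float) -> float:
    """Total cost when `batch_share` (0..1) of the spend moves to the Batch API."""
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    return standard_cost * (1 - BATCH_DISCOUNT * batch_share)


print(batch_price(2.00), batch_price(12.00))  # 1.0 6.0
```

Moving all traffic to batch halves the bill. Moving a fraction of it saves half of that fraction.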

Context Caching (90% Savings)

Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.
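Whether caching pays off depends on how often the cached content is reused relative to how long it is stored. The sketch below compares input-side cost with and without a cache. It is an illustration, not Google's billing logic. It assumes the same cached prefix is reused by every request, and it ignores any charge for the initial cache write, which this page does not specify. All function and parameter names are hypothetical.

```python
CACHE_READ_FRACTION = 0.10  # cache reads are billed at 10% of base input price


def input_cost_no_cache(requests: int, shared_tokens: int, fresh_tokens: int,
                        base_input_price: float) -> float:
    """Input-side USD cost when the shared prefix is re-sent on every request."""
    return requests * (shared_tokens + fresh_tokens) * base_input_price / 1_000_000


def input_cost_with_cache(requests: int, shared_tokens: int, fresh_tokens: int,
                          base_input_price: float, storage_price: float,
                          hours_stored: float) -> float:
    """Input-side USD cost when the shared prefix is cached.

    base_input_price: USD per 1M tokens.
    storage_price: USD per 1M tokens per hour ($1-4.50 depending on model).
    """
    reads = requests * shared_tokens * base_input_price * CACHE_READ_FRACTION
    fresh = requests * fresh_tokens * base_input_price
    storage = shared_tokens * storage_price * hours_stored
    return (reads + fresh + storage) / 1_000_000


# Hypothetical workload: a 100K-token document reused by 1,000 requests in a day,
# each adding 500 fresh tokens, at $2.00 input and $4.50 storage (upper bound).
print(input_cost_no_cache(1000, 100_000, 500, 2.00))
print(input_cost_with_cache(1000, 100_000, 500, 2.00, 4.50, 24))
```

With these assumed numbers the cached version comes to about $31.80 against $201.00 uncached. With only a handful of requests per day the storage fee can outweigh the read savings.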

Grounding with Google Search

Get up-to-date information by grounding responses with Google Search. Gemini 3.x family: 5,000 prompts/month free (shared), then $14/1K queries. Gemini 2.x models: $35/1K prompts after the free quota. Google Maps grounding: $14/1K queries on Gemini 3.x, $25/1K prompts on Gemini 2.x.

Tiered Context Pricing

Pro models (3.1 Pro Preview, 2.5 Pro) have standard pricing for prompts ≤200K tokens. Above 200K, input doubles and output increases ~50%. Flash and Flash-Lite models have flat pricing regardless of context length.
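The 200K threshold creates a price jump that is easy to overlook when estimating long-document workloads. The sketch below assumes that a prompt above 200K tokens moves the whole request (all input and output tokens) to the higher rate, rather than only the tokens beyond the threshold. Treat that as an assumption to check against the official pricing page. `pro_request_cost` is a hypothetical name.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # tokens in the prompt
INPUT_MULTIPLIER = 2.0            # input doubles above the threshold
OUTPUT_MULTIPLIER = 1.5           # output rises about 50% above the threshold


def pro_request_cost(prompt_tokens: int, output_tokens: int,
                     in_price: float, out_price: float) -> float:
    """USD cost of one Pro request; prices are USD per 1M tokens.

    Assumption: the higher rate applies to the entire request once the
    prompt exceeds the threshold.
    """
    long_prompt = prompt_tokens > LONG_CONTEXT_THRESHOLD
    in_rate = in_price * (INPUT_MULTIPLIER if long_prompt else 1.0)
    out_rate = out_price * (OUTPUT_MULTIPLIER if long_prompt else 1.0)
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Gemini 3.1 Pro Preview at $2/$12: a 200K prompt vs a 300K prompt, 10K output each.
print(pro_request_cost(200_000, 10_000, 2.00, 12.00))
print(pro_request_cost(300_000, 10_000, 2.00, 12.00))
```

Under this assumption, growing the prompt by 50% (200K to 300K tokens) raises the request cost from about $0.52 to about $1.38, which is why trimming prompts to stay under the threshold can matter.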

Audio Input Pricing Premiums

Audio input costs 2x to 7x the text input price, depending on the model. Gemini 3 Flash Preview: 2x ($1.00 vs $0.50). Gemini 2.5 Flash: 3.33x ($1.00 vs $0.30). Gemini 2.5 Flash-Lite: 3x ($0.30 vs $0.10). Gemini 2.0 Flash: 7x ($0.70 vs $0.10). Audio output is priced the same as text output.
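The multipliers quoted above follow directly from the per-million prices. This short Python check recomputes them from the figures on this page. `AUDIO_VS_TEXT` and `audio_multiplier` are illustrative names.

```python
# (audio input price, text input price) in USD per 1M tokens, from this page.
AUDIO_VS_TEXT = {
    "Gemini 3 Flash Preview": (1.00, 0.50),
    "Gemini 2.5 Flash": (1.00, 0.30),
    "Gemini 2.5 Flash-Lite": (0.30, 0.10),
    "Gemini 2.0 Flash": (0.70, 0.10),
}


def audio_multiplier(audio_price: float, text_price: float) -> float:
    """How many times more audio input costs than text input, to 2 decimals."""
    return round(audio_price / text_price, 2)


for model, (audio, text) in AUDIO_VS_TEXT.items():
    print(f"{model}: {audio_multiplier(audio, text):.2f}x")
```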

Gemini API Monthly Cost Estimates

  • Light Use ($0-30/mo): Personal projects · <1K requests/day · Flash-Lite free tier
  • Medium Use ($30-150/mo): Small apps · 1-5K requests/day · Flash or Flash-Lite
  • Heavy Use ($150-800/mo): Production apps · 5-20K requests/day · Mixed models
  • Enterprise ($800+/mo): Large scale · 20K+ requests/day · Vertex AI available

6 Gemini API Cost Optimization Tips

1. Use Context Caching

Save up to 90% on repeated input by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price; storage costs $1-4.50 per million tokens per hour. Example: if 80% of the input tokens across 10K requests are served from cache, input is billed at 0.8 × 10% + 0.2 × 100% = 28% of the standard rate, a 72% saving before storage fees.

2. Enable Batch API

Get a 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 3.1 Pro Preview drops to $1.00/$6.00 per M tokens (vs $2/$12 standard); 2.5 Pro drops to $0.625/$5. Well suited to data processing, content generation, and analysis tasks.

3. Start with Flash-Lite

Use the cheapest models (Gemini 2.5 Flash-Lite at $0.10/$0.40 or the newer 3.1 Flash-Lite at $0.25/$1.50 per M tokens) for classification and routing. Only escalate to 3.5 Flash or 3.1 Pro Preview when necessary for complex tasks.
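A rough way to judge whether escalation is worth it is to compute the blended cost per request. The sketch below assumes every request is first handled by the cheap model and a fraction is then re-run on the premium model. It also assumes, for simplicity, that both passes consume the same token counts. Both function names are hypothetical.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """USD cost of one request; prices are USD per 1M tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(escalation_rate: float, cheap_cost: float, premium_cost: float) -> float:
    """Average USD cost per request when a share of requests is escalated.

    Every request pays for the cheap model; escalated ones also pay for premium.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return cheap_cost + escalation_rate * premium_cost


# Hypothetical request shape: 1,000 input tokens and 300 output tokens.
lite = request_cost(1_000, 300, 0.10, 0.40)   # Gemini 2.5 Flash-Lite
pro = request_cost(1_000, 300, 2.00, 12.00)   # Gemini 3.1 Pro Preview
print(blended_cost(0.10, lite, pro) * 1_000)  # USD per 1,000 requests
```

With 10% of requests escalated, this works out to about $0.78 per 1,000 requests, against $5.60 if every request went straight to Pro.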

4. Leverage Free Tiers

Most Gemini models have generous free tiers with rate limits. Use the free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.

5. Optimize Context Length

Pro models charge 2x input and ~1.5x output for prompts above 200K tokens. Keep prompts under 200K when possible. For long-document processing, consider Flash models, which have flat pricing regardless of context length.

6. Monitor Gemini API Token Usage

Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.

Gemini Model Selection Guide

Each entry lists the use case, recommended model, estimated monthly cost, and why that model fits.

  • Customer Support Chat: Gemini 2.5 Flash-Lite (cheapest available) · ~$5-30/mo · Lowest cost ($0.10/$0.40), free tier available
  • Code Generation: Gemini 3.1 Pro Preview (current Pro flagship) · ~$120-500/mo · Top quality (95), 1M context, replaces 3 Pro Preview
  • Research & Analysis: Gemini 3.1 Pro Preview (highest quality) · ~$100-500/mo · Most powerful multimodal understanding, best for hard reasoning
  • Content Writing: Gemini 3.5 Flash (current Flash flagship) · ~$60-300/mo · Quality 92, much cheaper than Pro, free tier available
  • Data Extraction: Gemini 2.5 Flash-Lite + Batch (best value) · ~$10-50/mo · Cheapest option ($0.05/$0.20 with batch discount)
  • Grounded Search: Gemini 3.5 Flash + Google Search · ~$70-250/mo · Native grounding strength, 5K free prompts/month

Google Search Grounding Pricing

  • Gemini 3.x family (3.1 Pro, 3.5 Flash, 3 Flash, 3.1 Flash-Lite): Free quota 5,000 prompts/mo (shared) · Paid rate $14 / 1K queries · Free quota shared across all Gemini 3.x models; per-query pricing
  • Gemini 2.5 Pro: Free quota 1,500 RPD (paid tier only) · Paid rate $35 / 1K prompts · Grounding not available on the free tier
  • Gemini 2.5 Flash / Flash-Lite: Free quota 500 RPD (Free) / 1,500 RPD (Paid) · Paid rate $35 / 1K prompts · Shared limit between Flash & Flash-Lite
  • Gemini 2.0 Flash: Free quota 500 RPD (Free) / 1,500 RPD (Paid) · Paid rate $35 / 1K prompts · Deprecating 2026-06-01
  • Google Maps Grounding (Gemini 3.x): Free quota shares 5,000/mo with Search · Paid rate $14 / 1K queries · Per-query pricing on Gemini 3.x
  • Google Maps Grounding (Gemini 2.5 Pro): Free quota 10,000 RPD (free) · Paid rate $25 / 1K prompts · Highest Maps free quota of any model
  • Google Maps Grounding (Gemini 2.5 Flash / 2.0 Flash): Free quota 1,500 RPD (free) · Paid rate $25 / 1K prompts · Location-based queries on prior-gen Flash

RPD: Requests Per Day. Each prompt can generate multiple search queries. Gemini 3.x models use per-query pricing with a shared 5,000-prompts/month free quota; Gemini 2.x models use per-prompt pricing with daily limits.
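Because the two generations bill grounding differently, a rough estimator needs two formulas. The sketch below is an approximation under stated assumptions. For Gemini 3.x it assumes prompts beyond the 5,000 free ones are billed for every search query they generate, and the average queries per prompt is a number you must estimate yourself. For Gemini 2.x it assumes prompts beyond the daily free quota are billed per prompt. Neither function mirrors an official API.

```python
def search_cost_gemini_3x(prompts_per_month: int, avg_queries_per_prompt: float,
                          free_prompts: int = 5_000,
                          price_per_1k_queries: float = 14.0) -> float:
    """Estimated monthly USD for Search grounding on Gemini 3.x (per-query billing).

    Assumption: only prompts beyond the free quota generate billable queries.
    """
    billable_prompts = max(0, prompts_per_month - free_prompts)
    return billable_prompts * avg_queries_per_prompt * price_per_1k_queries / 1000


def search_cost_gemini_2x(prompts_per_day: int, free_rpd: int = 1_500,
                          days: int = 30,
                          price_per_1k_prompts: float = 35.0) -> float:
    """Estimated monthly USD for Search grounding on Gemini 2.x (per-prompt billing)."""
    billable_per_day = max(0, prompts_per_day - free_rpd)
    return billable_per_day * days * price_per_1k_prompts / 1000


# Hypothetical traffic: 20,000 grounded prompts/month at 2 queries each on 3.x,
# and 2,000 grounded prompts/day on a paid-tier 2.x Flash model.
print(search_cost_gemini_3x(20_000, 2))  # 420.0
print(search_cost_gemini_2x(2_000))      # 525.0
```

The per-query model makes Gemini 3.x costs sensitive to how many searches each prompt triggers, so it is worth measuring that average before relying on the estimate.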




© 2026 CostGoat. All rights reserved.


Affiliate disclosure: Some links earn CostGoat a commission or credit when you sign up — no extra cost to you.