How much does the Gemini API cost?

Google Gemini API pricing in 2026: Gemini 2.5 Flash is free (with rate limits) up to 1,500 requests/day; on the paid tier it costs $0.075 per million input tokens and $0.30 per million output tokens. Gemini 2.5 Pro costs $1.25 per million input tokens and $10.00 per million output tokens for prompts under 200K tokens. Gemini 1.5 Flash is free up to its limits, then $0.075/$0.30 per MTok. All models include a free tier via Google AI Studio.

Is the Gemini API free?

Yes, the Gemini API has a free tier in Google AI Studio: Gemini 2.5 Flash allows 1,500 requests per day and 1 million tokens per minute at no cost. Gemini 2.5 Pro allows 50 requests per day free. For production applications, you'll need a paid account via Google Cloud (Vertex AI) or Google AI Studio with billing enabled. Rate limits are significantly higher on paid plans.

What is Gemini 2.5 Flash and how much does it cost?

Gemini 2.5 Flash is Google's fastest and most cost-efficient model, optimized for high-volume tasks. In Google AI Studio it is free up to 1,500 requests/day. Paid pricing is $0.075 per million input tokens (text/image/audio/video) and $0.30 per million output tokens. In thinking mode (for complex reasoning), thinking tokens are billed at $3.50/MTok. Context window: 1M tokens.
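To see how thinking tokens change the bill, here is a minimal sketch that prices a single Flash request. The rates are the ones quoted in this article. The function name and the assumption that thinking tokens are billed separately from visible output tokens are illustrative, not taken from Google's documentation.

```python
# Rates quoted in this article (USD per million tokens); treat them as assumptions.
FLASH_INPUT_PER_MTOK = 0.075
FLASH_OUTPUT_PER_MTOK = 0.30
FLASH_THINKING_PER_MTOK = 3.50


def flash_request_cost(input_tokens: int, output_tokens: int, thinking_tokens: int = 0) -> float:
    """Estimate the USD cost of one Gemini 2.5 Flash request.

    Assumes thinking tokens are billed separately from visible output tokens.
    """
    total = (
        input_tokens * FLASH_INPUT_PER_MTOK
        + output_tokens * FLASH_OUTPUT_PER_MTOK
        + thinking_tokens * FLASH_THINKING_PER_MTOK
    )
    return total / 1_000_000


plain = flash_request_cost(2_000, 500)
with_thinking = flash_request_cost(2_000, 500, thinking_tokens=3_000)
print(f"without thinking: ${plain:.6f}")
print(f"with thinking:    ${with_thinking:.6f}")
```

In this example the 3,000 thinking tokens account for almost all of the cost, so it can be worth measuring how many thinking tokens a workload actually generates before enabling the mode everywhere.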

Gemini vs OpenAI GPT-4o: which is cheaper?

Comparing models of similar quality: Gemini 2.5 Flash ($0.075 input) costs half as much as GPT-4o mini ($0.15 input) on input tokens. Gemini 2.5 Pro ($1.25 input) is half the price of GPT-4o ($2.50 input) on input tokens, although both charge $10.00 per million output tokens. Gemini also offers a 1M token context window vs GPT-4o's 128K, which matters for large document processing. However, OpenAI generally has better third-party library support and ecosystem maturity.

What is the difference between Google AI Studio and Vertex AI for the Gemini API?

Google AI Studio is the developer-friendly route to the Gemini API (served from generativelanguage.googleapis.com), with a free tier and simple setup. It is best for prototyping and smaller projects. Vertex AI is the enterprise-grade Google Cloud offering with more features (fine-tuning, model garden, batch jobs), SLAs, VPC support, and enterprise compliance. Pricing is similar, but Vertex AI has additional Google Cloud charges for some features. Most startups start with AI Studio and migrate to Vertex AI at scale.
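The practical difference shows up in the endpoint and the credentials. The sketch below builds, but does not send, a REST request for an AI Studio API key, and formats the corresponding Vertex AI URL. The project ID, region, and placeholder key are hypothetical, and the URL shapes reflect the public REST conventions as best understood here, so check them against the current API reference before relying on them.

```python
import json
import urllib.request

# REST host for the Gemini API when using a Google AI Studio API key.
AI_STUDIO_HOST = "https://generativelanguage.googleapis.com"


def build_ai_studio_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request authenticated by API key."""
    url = f"{AI_STUDIO_HOST}/v1beta/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def vertex_ai_url(project: str, location: str, model: str) -> str:
    """Return the regional Vertex AI generateContent URL (uses OAuth, not an API key)."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:generateContent"
    )


# "YOUR_API_KEY" and "my-project" are placeholders, not real credentials.
req = build_ai_studio_request("YOUR_API_KEY", "gemini-2.5-flash", "Say hello")
print(req.full_url)
print(vertex_ai_url("my-project", "europe-west4", "gemini-2.5-flash"))
```

The Vertex AI URL carries the project and region, which is what makes regional data residency and IAM-based access control possible there.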

Does Gemini support long context and how is it priced?

Yes. Gemini 2.5 Pro and Flash both support 1 million token context windows (vs GPT-4o's 128K). For Gemini 2.5 Pro, prompts over 200K tokens are priced higher: $2.50 input (vs $1.25 under 200K) and $15.00 output (vs $10.00 under 200K) per million tokens. Context caching is available to reduce costs on repeated long contexts: cached Pro input tokens cost about $0.31/MTok instead of $1.25, plus a storage charge of $4.50/MTok per hour.
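The long-context surcharge is easiest to reason about as a per-request tier switch. The following sketch uses the Gemini 2.5 Pro rates from the paid pricing table in this article and assumes the whole request is billed at the higher rate once the prompt exceeds 200K tokens. The function and constant names are illustrative.

```python
# Gemini 2.5 Pro rates from this article's paid pricing table (USD per million tokens).
PRO_TIERS = {
    "standard": {"input": 1.25, "output": 10.00},  # prompts up to 200K tokens
    "long": {"input": 2.50, "output": 15.00},      # prompts over 200K tokens
}
LONG_CONTEXT_THRESHOLD = 200_000


def pro_request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request.

    Assumes the tier is chosen per request from the prompt size and applies
    to both input and output tokens of that request.
    """
    tier = "long" if prompt_tokens > LONG_CONTEXT_THRESHOLD else "standard"
    rates = PRO_TIERS[tier]
    return (prompt_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


short_doc = pro_request_cost(50_000, 2_000)
long_doc = pro_request_cost(500_000, 2_000)
print(f"50K-token prompt:  ${short_doc:.4f}")
print(f"500K-token prompt: ${long_doc:.4f}")
```

Under these assumptions, splitting a very large input into chunks that stay below the threshold halves the input rate, at the price of losing cross-chunk context.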

Google Gemini API Pricing 2026

Complete pricing for Gemini 2.5 Pro, Flash, and 1.5 models — with free tier details, real cost scenarios, and comparison to OpenAI and Claude.

Gemini API Free Tier (Google AI Studio)

Google offers a generous free tier for the Gemini API through Google AI Studio. No credit card required to start.

Model	Free Requests/Day	Free RPM Limit	Free TPM Limit
Gemini 2.5 Flash	1,500/day	10 RPM	1M TPM
Gemini 2.5 Pro	50/day	2 RPM	32K TPM
Gemini 1.5 Flash	1,500/day	15 RPM	1M TPM
Gemini 1.5 Pro	50/day	2 RPM	32K TPM
Gemini 1.0 Pro	Unlimited	15 RPM	32K TPM

Prototyping tip: For most side projects and early-stage apps under 1,500 requests/day, Gemini 2.5 Flash is completely free. The 1M TPM limit is unusually generous, so large documents fit within the per-minute token budget, although the 10 RPM cap still applies.
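A quick way to check whether a workload fits the free tier is to compare it against all three limits at once. This is a minimal sketch using the limits from the table above. The model keys, the dataclass, and the function name are made up for illustration, and peak tokens per minute are approximated as peak requests per minute times tokens per request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeTierLimits:
    requests_per_day: int
    requests_per_minute: int
    tokens_per_minute: int


# Free tier limits as listed in this article's table.
FREE_LIMITS = {
    "gemini-2.5-flash": FreeTierLimits(1_500, 10, 1_000_000),
    "gemini-2.5-pro": FreeTierLimits(50, 2, 32_000),
}


def exceeded_limits(model: str, requests_per_day: int,
                    peak_requests_per_minute: int, tokens_per_request: int) -> list:
    """Return the names of the free tier limits a workload would exceed."""
    limits = FREE_LIMITS[model]
    exceeded = []
    if requests_per_day > limits.requests_per_day:
        exceeded.append("requests/day")
    if peak_requests_per_minute > limits.requests_per_minute:
        exceeded.append("RPM")
    if peak_requests_per_minute * tokens_per_request > limits.tokens_per_minute:
        exceeded.append("TPM")
    return exceeded


flash_check = exceeded_limits("gemini-2.5-flash", 1_400, 5, 20_000)
pro_check = exceeded_limits("gemini-2.5-pro", 40, 2, 50_000)
print("Flash:", flash_check or "fits the free tier")
print("Pro:  ", pro_check or "fits the free tier")
```

An empty list means the workload fits. The Pro example stays under the daily cap but breaks the 32K TPM limit, which is the constraint that usually bites first for long prompts.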

Gemini API Paid Pricing

Once you exceed free tier limits or need higher rate limits, billing is per million tokens. Enable billing in Google AI Studio or use Vertex AI.

Model	Input (per MTok)	Output (per MTok)	Context Window
Gemini 2.5 Flash (text/image/audio/video)	$0.075	$0.30	1M tokens
Gemini 2.5 Flash, thinking mode (complex reasoning)	$0.075	$3.50 (thinking tokens)	1M tokens
Gemini 2.5 Pro (≤200K context)	$1.25	$10.00	1M tokens
Gemini 2.5 Pro (>200K context)	$2.50	$15.00	1M tokens
Gemini 1.5 Flash (≤128K context)	$0.075	$0.30	1M tokens
Gemini 1.5 Pro (≤128K context)	$1.25	$5.00	2M tokens
Gemini Embedding 004	$0.00	N/A	2K tokens/request
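The table translates into a monthly estimate with one multiplication per token type. Below is a minimal sketch of that calculation. The dictionary keys and function name are illustrative, the rates are the base-tier rows from the table above, and the long-context and thinking surcharges are deliberately left out.

```python
# (input, output) rates in USD per million tokens, base tiers from the table above.
PRICES = {
    "gemini-2.5-flash": (0.075, 0.30),
    "gemini-2.5-pro": (1.25, 10.00),   # prompts up to 200K tokens
    "gemini-1.5-flash": (0.075, 0.30),
    "gemini-1.5-pro": (1.25, 5.00),
}


def monthly_cost(model: str, requests_per_month: int,
                 avg_input_tokens: int, avg_output_tokens: int) -> float:
    """Estimate the monthly USD cost from request volume and average token counts."""
    input_rate, output_rate = PRICES[model]
    input_mtok = requests_per_month * avg_input_tokens / 1_000_000
    output_mtok = requests_per_month * avg_output_tokens / 1_000_000
    return input_mtok * input_rate + output_mtok * output_rate


chatbot = monthly_cost("gemini-2.5-flash", 100_000, 600, 250)
print(f"Estimated monthly cost: ${chatbot:.2f}")
```

With 100,000 requests averaging 600 input and 250 output tokens, the Flash estimate comes to about $12 per month. Output tokens make up most of it even though there are fewer of them.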

Context caching: For applications that repeatedly send the same large system prompt or documents, Gemini Context Caching reduces costs significantly. Cached tokens cost $0.01875/MTok (Flash) or $0.31/MTok (Pro), one quarter of the regular input price. Storage is billed separately for as long as the cache is kept: $1.00/MTok/hour (Flash) or $4.50/MTok/hour (Pro).
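Because storage is billed by the hour, caching only pays off above a certain request rate. The sketch below derives that break-even point from the rates quoted above, assuming every request reuses the full cached prefix. The function name is illustrative. The prefix size cancels out: both the per-request saving and the hourly storage charge scale with the number of cached tokens.

```python
import math


def breakeven_requests_per_hour(input_rate: float, cached_rate: float,
                                storage_rate_per_hour: float) -> float:
    """Requests per hour at which caching costs the same as resending the prefix.

    All rates are USD per million tokens. Storage is per million tokens per hour.
    """
    saving_per_request = input_rate - cached_rate
    return storage_rate_per_hour / saving_per_request


# Rates quoted in this article. The Pro cached rate is a quarter of $1.25.
flash_breakeven = breakeven_requests_per_hour(0.075, 0.01875, 1.00)
pro_breakeven = breakeven_requests_per_hour(1.25, 0.3125, 4.50)

print(f"Flash: caching pays off above {math.ceil(flash_breakeven)} requests/hour")
print(f"Pro:   caching pays off above {math.ceil(pro_breakeven)} requests/hour")
```

On these numbers, Flash needs roughly 18 requests per hour against the same prefix before caching saves money, and Pro needs about 5.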


Real Cost Scenarios

Startup Chatbot: 100K conversations/month (Gemini 2.5 Flash)

100K conversations × 600 input tokens avg: 60M input tokens

100K conversations × 250 output tokens avg: 25M output tokens

Flash at $0.075 input + $0.30 output per MTok: $4.50 + $7.50

Total monthly API cost: ~$12.00

Long Document Analysis: 10K legal/research docs/month (Gemini 2.5 Pro)

10K docs × 50,000 tokens avg (long docs): 500M input tokens

10K docs × 2,000 output tokens (analysis): 20M output tokens

Pro ≤200K tier (each 50,000-token prompt is below the 200K threshold) at $1.25 input + $10.00 output per MTok: $625 + $200

Total monthly API cost: ~$825

Early-Stage App (Free Tier): under 1,500 req/day

44,000 requests/month (≈1,470/day): within free limit

Any token amount within rate limits: free

Total monthly API cost: $0.00

RAG Application: 200K queries/month with caching (Flash)

System prompt: 5,000 tokens (cached for all requests), billed at the cached input rate of $0.01875/MTok

200K queries × 5,000 cached tokens = 1B cached tokens: $18.75

200K queries × 300 non-cached input + 400 output tokens: $4.50 + $24.00

Total monthly API cost: ~$47.25, excluding cache storage (about $3.60/month if the 5,000-token cache is kept for all 720 hours at $1.00/MTok/hour)
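The RAG arithmetic can be reproduced line by line, which also makes it easy to compare against the same workload without caching. This sketch uses the Flash rates from this article. The 720-hour month and the assumption that the cache is kept alive continuously are illustrative choices, not figures from the scenario.

```python
QUERIES = 200_000
CACHED_PROMPT_TOKENS = 5_000
FRESH_INPUT_TOKENS = 300
OUTPUT_TOKENS = 400

# Flash rates quoted in this article (USD per million tokens).
INPUT_RATE = 0.075
CACHED_RATE = 0.01875
OUTPUT_RATE = 0.30
STORAGE_RATE_PER_HOUR = 1.00
HOURS_PER_MONTH = 720  # assumption: a 30-day month with the cache always alive


def mtok(tokens: int) -> float:
    """Convert a token count to millions of tokens."""
    return tokens / 1_000_000


cached_cost = mtok(QUERIES * CACHED_PROMPT_TOKENS) * CACHED_RATE
fresh_cost = mtok(QUERIES * FRESH_INPUT_TOKENS) * INPUT_RATE
output_cost = mtok(QUERIES * OUTPUT_TOKENS) * OUTPUT_RATE
storage_cost = mtok(CACHED_PROMPT_TOKENS) * STORAGE_RATE_PER_HOUR * HOURS_PER_MONTH

with_cache = cached_cost + fresh_cost + output_cost
without_cache = mtok(QUERIES * (CACHED_PROMPT_TOKENS + FRESH_INPUT_TOKENS)) * INPUT_RATE + output_cost

print(f"with caching (tokens only): ${with_cache:.2f}")
print(f"cache storage:              ${storage_cost:.2f}")
print(f"without caching:            ${without_cache:.2f}")
```

Even after adding storage, the cached setup costs roughly half of the uncached one ($103.50) at these rates.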

Gemini vs OpenAI vs Claude API Pricing

Provider / Model	Input (per MTok)	Output (per MTok)	Context
Gemini 2.5 Flash (Google)	$0.075	$0.30	1M tokens
GPT-4o mini (OpenAI)	$0.15	$0.60	128K
Claude 3.5 Haiku (Anthropic)	$0.80	$4.00	200K
Gemini 2.5 Pro (Google)	$1.25	$10.00	1M tokens
GPT-4o (OpenAI)	$2.50	$10.00	128K
Claude 3.5 Sonnet (Anthropic)	$3.00	$15.00	200K
o3 (OpenAI)	$10.00	$40.00	200K
Claude 3 Opus (Anthropic)	$15.00	$75.00	200K

Gemini Flash is the cheapest option for most tasks: its input price is half that of GPT-4o mini and roughly a tenth of Claude 3.5 Haiku's. For a full feature and benchmark comparison, see OpenAI vs Claude API pricing.
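Input price alone can overstate or understate the gap, since real workloads pay for output too. The sketch below compares blended cost for a fixed workload, using prices copied from the comparison table. The 60M input / 25M output mix is an assumption chosen to match the chatbot scenario, and the function names are illustrative.

```python
# (input, output) prices in USD per million tokens, copied from the comparison table.
PRICES = {
    "Gemini 2.5 Flash": (0.075, 0.30),
    "GPT-4o mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "GPT-4o": (2.50, 10.00),
}


def blended_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """USD cost of a workload given in millions of input and output tokens."""
    input_rate, output_rate = PRICES[model]
    return input_mtok * input_rate + output_mtok * output_rate


def cost_ratio(model: str, baseline: str, input_mtok: float, output_mtok: float) -> float:
    """How many times more expensive `model` is than `baseline` for the workload."""
    return blended_cost(model, input_mtok, output_mtok) / blended_cost(baseline, input_mtok, output_mtok)


# Assumed workload: 60M input tokens and 25M output tokens per month.
for model, baseline in [
    ("GPT-4o mini", "Gemini 2.5 Flash"),
    ("Claude 3.5 Haiku", "Gemini 2.5 Flash"),
    ("GPT-4o", "Gemini 2.5 Pro"),
]:
    ratio = cost_ratio(model, baseline, 60, 25)
    print(f"{model} costs {ratio:.2f}x as much as {baseline}")
```

For this mix, the Flash advantage over GPT-4o mini stays at 2x, while the Pro advantage over GPT-4o shrinks to about 1.2x because both charge the same output rate.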

When to Choose Gemini API

Use Case	Recommendation	Reason
Side project / prototype	Gemini 2.5 Flash (Free)	1,500 req/day free — zero cost to ship v1
Cost-sensitive production app	Gemini 2.5 Flash (Paid)	$0.075/MTok is best price among major providers
Long document processing	Gemini 2.5 Pro	1M token context window handles entire books/codebases
Multimodal (image + video + text)	Gemini 2.5 Flash	Native multimodal at same price as text-only tasks
Google Cloud / GCP integration	Vertex AI (Gemini)	Single billing, VPC, enterprise SLA, IAM integration
Enterprise compliance (EU data residency)	Vertex AI + region selection	Vertex AI supports regional data residency; AI Studio doesn't

Google Gemini API Price History

Date	Change	Impact
Dec 2023	Gemini Pro launches (API)	Free in preview; first public Gemini API access
May 2024	Gemini 1.5 Flash launches	$0.35/MTok input — 10x cheaper than Gemini 1.5 Pro
Jul 2024	Gemini 1.5 Flash price cut	Cut from $0.35 to $0.075/MTok input — 79% reduction
Sep 2024	Free tier expanded	1,500 req/day free (was 60); 1M TPM free added
Feb 2025	Gemini 2.0 Flash launches	Same price as 1.5 Flash, significantly better quality
Mar 2025	Gemini 2.5 Pro launches	$1.25/MTok input — top benchmark performance, competitive price
Apr 2025	Gemini 2.5 Flash launches	Replaces 2.0 Flash; same price, 2.5 quality with thinking mode
Ongoing	Context caching added	Cached tokens 4x cheaper — major savings for repeated prompts
