AI API Cost Calculator
📅 Prices updated Mar 2026
Calculate the exact cost of using ChatGPT, Claude, Gemini, DeepSeek and other AI APIs. Paste your text to count tokens automatically, enter your daily volume, and see your monthly spend — all prices are pre-filled but fully editable.
Quick Presets
✎ Prices per 1M tokens (editable — update anytime)
[Interactive calculator: set your input tokens (the prompt/context sent to the model), output tokens (the model's response length), and daily API call volume to see an estimated cost breakdown (per request, per day, per month, and per year), with input and output cost per request shown separately. All figures in USD.]
Cost for 500 input + 300 output tokens (a typical short API call). Sorted cheapest first.
All prices in USD. Last updated March 2026.
| Model | Input $/1M | Output $/1M | Cost / request | Monthly (100/day) |
|---|---|---|---|---|
Sources: Official provider pricing pages (OpenAI, Anthropic, Google, DeepSeek). Prices change frequently — always verify before committing to a budget.
[Token estimator: paste any text to see its character count, word count, estimated token count (chars ÷ 4), share of a 128k context window used, and the cost to send the text as input.]
This is only the input cost. Output tokens are charged separately based on response length.
What Is a Token? A Simple Guide
When you send a message to an AI model via its API, your text is split into small pieces called tokens before the model processes it. A token is roughly 4 characters or about 0.75 of a word in English. The word "calculator" is about 2 tokens. "Hello!" is 2 tokens. A 100-word paragraph is approximately 130–150 tokens.
AI providers charge separately for input tokens (the text you send, including your system prompt, conversation history, and question) and output tokens (the text the model generates in response). Output tokens typically cost 3–5x more than input tokens because generating each token requires a full computation pass through the model.
💫 Token Rule of Thumb
1 token ≈ 4 characters ≈ 0.75 words. A tweet (280 chars) ≈ 70 tokens. A page of text (500 words) ≈ 667 tokens. A short novel (80,000 words) ≈ 107,000 tokens.
💷 Why Costs Add Up
In a multi-turn chatbot, every message includes the full conversation history as context. After 10 exchanges, your 11th message might carry 3,000 tokens of prior chat — all billed as input tokens.
🏭 Context Windows
Every model has a maximum context window — the total tokens it can process at once (input + output). GPT-4o supports 128k tokens. Claude 3.5 supports 200k tokens. Going over the limit causes errors or truncation.
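The rules of thumb above can be sketched in a few lines. This is only the chars ÷ 4 approximation this page uses; a real tokenizer (such as OpenAI's tiktoken) gives exact counts, which can differ noticeably for code and non-English text.

```python
def estimate_tokens(text: str) -> int:
    """Estimate token count using the 1-token-per-4-characters heuristic."""
    return -(-len(text) // 4)  # ceiling division


def context_usage(text: str, window: int = 128_000) -> float:
    """Fraction of a context window (default: GPT-4o's 128k) the text fills."""
    return estimate_tokens(text) / window


tweet = "x" * 280  # a max-length tweet
print(estimate_tokens(tweet))            # 70 tokens, per the rule of thumb
print(f"{context_usage(tweet):.4%} of a 128k window")
```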
AI API Pricing Explained: Input vs Output Tokens
All major AI providers use the same basic pricing structure: you pay per million tokens, with different rates for input and output. Here is how the current landscape breaks down:
| Model | Input $/1M tokens | Output $/1M tokens | Best for |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | General-purpose, high quality |
| GPT-4o mini | $0.15 | $0.60 | High-volume, budget-sensitive apps |
| Claude Sonnet 4 | $3.00 | $15.00 | Complex reasoning, long documents |
| Claude Haiku 3.5 | $0.80 | $4.00 | Fast, affordable Anthropic option |
| Gemini 2.5 Pro | $1.25 | $5.00 | Long context, Google ecosystem |
| Gemini 2.5 Flash | $0.075 | $0.30 | Cheapest quality option |
| DeepSeek V3 | $0.27 | $1.10 | Cheapest capable model |
Prices as of March 2026. Always verify at the official provider pricing page before committing to a budget. Prices change frequently.
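The per-request arithmetic behind this table is simple: tokens × price ÷ 1,000,000 for each side of the call, then summed. A minimal sketch using the March 2026 prices above (the figures will drift, so treat them as placeholders):

```python
PRICES = {  # model: (input $/1M tokens, output $/1M tokens), March 2026
    "GPT-4o":           (2.50, 10.00),
    "GPT-4o mini":      (0.15,  0.60),
    "Claude Sonnet 4":  (3.00, 15.00),
    "Claude Haiku 3.5": (0.80,  4.00),
    "Gemini 2.5 Pro":   (1.25,  5.00),
    "Gemini 2.5 Flash": (0.075, 0.30),
    "DeepSeek V3":      (0.27,  1.10),
}


def request_cost(model: str, in_tok: int = 500, out_tok: int = 300) -> float:
    """Cost in USD of one call: a 500-input, 300-output 'typical short request'."""
    in_price, out_price = PRICES[model]
    return in_tok * in_price / 1e6 + out_tok * out_price / 1e6


# Sorted cheapest first, like the Compare All Models tab
for model in sorted(PRICES, key=request_cost):
    print(f"{model:18} ${request_cost(model):.6f} per request")
```

For GPT-4o this gives $0.00425 per request, matching the "less than half a cent" figure quoted in the FAQ below.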
How to Reduce Your AI API Costs
For developers and businesses, API costs can grow quickly at scale. Here are the most effective ways to reduce your spend:
- Use smaller models for simpler tasks: GPT-4o mini costs ~17x less than GPT-4o. For tasks like classification, extraction, or short responses, a smaller model often performs identically. Route easy tasks to cheap models and reserve expensive models for complex reasoning.
- Prompt caching: If your system prompt is long and repeated across many requests (common in chatbots), providers offer discounted cached-input pricing (roughly 90% off cache reads on Anthropic, about 50% off cached input on OpenAI). This alone can cut costs by 40–60% for many applications.
- Limit output length: Output tokens cost 3–5x more than input tokens. Set a max_tokens limit in your API calls to prevent runaway responses. For most tasks, a 500-token output is sufficient.
- Trim your system prompt: Many developers write verbose system prompts that add 500–2000 tokens to every single request. Audit and compress your system prompt — every 100 tokens you remove saves money on every API call you ever make.
- Batch requests: For non-realtime workloads (bulk document processing, overnight analysis), OpenAI and Anthropic offer batch APIs at 50% discount. If your task doesn't need instant results, batching can halve your bill.
- Use streaming wisely: Streaming doesn't change token cost, but it improves perceived speed and lets you stop generation early if needed, saving output tokens on long responses.
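To see how the caching and batching levers above play out, here is a rough monthly-cost sketch. The 90% cache discount and 50% batch discount are assumptions modelled on typical provider offers, not guaranteed rates; check the actual pricing pages.

```python
def monthly_cost(in_tok, out_tok, req_per_day, in_price, out_price,
                 cached_in_tok=0, cache_discount=0.90, batch_discount=0.0):
    """Estimated monthly spend in USD. Prices are $/1M tokens; cached_in_tok
    is the portion of each request's input served from the prompt cache."""
    uncached = in_tok - cached_in_tok
    per_req = (uncached * in_price
               + cached_in_tok * in_price * (1 - cache_discount)
               + out_tok * out_price) / 1e6
    return per_req * req_per_day * 30 * (1 - batch_discount)


# Hypothetical chatbot: 2,000 input tokens per call (1,500 of it a cacheable
# system prompt), 300-token replies, 1,000 requests/day on GPT-4o mini
base = monthly_cost(2000, 300, 1000, 0.15, 0.60)
cached = monthly_cost(2000, 300, 1000, 0.15, 0.60, cached_in_tok=1500)
print(f"no caching:   ${base:.2f}/mo")
print(f"with caching: ${cached:.2f}/mo")
```

In this scenario caching cuts the bill from about $14.40 to about $8.33 per month, a ~42% saving, consistent with the 40–60% range cited above.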
Real-World Cost Examples
To help calibrate expectations, here are typical monthly API costs for common use cases at different scales:
| Use Case | Volume | Suggested Model | Est. Monthly Cost |
|---|---|---|---|
| Personal AI assistant | 10 req/day | GPT-4o | ~$1–$3/mo |
| Small business chatbot | 500 req/day | GPT-4o mini | ~$2–$8/mo |
| Content generation tool | 200 req/day | Claude Sonnet | ~$40–$90/mo |
| High-volume classification | 10,000 req/day | Gemini Flash | ~$5–$20/mo |
| Document analysis (large) | 100 req/day, 5k tokens each | Claude Sonnet | ~$200–$500/mo |
Which AI API Should You Choose?
The right model depends on your use case, quality requirements, and budget:
- Best quality, don't mind cost: Claude Opus 4 or GPT-4o for complex reasoning, nuanced writing, and difficult analysis tasks.
- Best balance of quality and cost: Claude Haiku 3.5 or GPT-4o mini — both deliver strong performance at 10–20x lower cost than flagship models.
- Absolute cheapest: DeepSeek V3 at $0.27/$1.10 per million tokens, or Gemini 2.5 Flash at $0.075/$0.30. Suitable for high-volume simple tasks.
- Long documents and large context: Gemini 2.5 Pro (2M context window) or Claude Sonnet (200k context) are best for processing full books, codebases, or large datasets.
- Privacy and no vendor lock-in: Self-hosted Llama 3.1 via Groq, Together.ai, or your own infrastructure. No per-token billing, but requires infrastructure management.
✓ How to Use This Calculator
- In the Cost Calculator tab: select your model, adjust token counts and daily request volume, and see per-request, daily, monthly, and yearly costs instantly.
- Prices are pre-filled with current rates but fully editable — if you have custom pricing or rates change, just update the fields directly.
- In the Token Estimator tab: paste any text to instantly see its character count, word count, estimated token count, and input cost across models.
- Use Quick Presets to auto-fill settings for common use cases like chatbots, document summarisers, or high-volume apps.
- In the Compare All Models tab: see the cost of a typical request across all supported models, sorted cheapest first.
Frequently Asked Questions
How much does the ChatGPT API cost per 1000 tokens?
GPT-4o costs $2.50 per million input tokens ($0.0025 per 1,000) and $10.00 per million output tokens ($0.010 per 1,000). A typical short exchange (500 input + 300 output tokens) costs about $0.0042 — less than half a cent.
What is a token in AI APIs?
A token is approximately 4 characters or 0.75 words in English. The phrase "Hello, how are you?" is about 6 tokens. A 500-word article is roughly 667 tokens. Non-English languages and code can use more tokens per word than English.
Which AI API is cheapest in 2026?
DeepSeek V3 and Gemini 2.5 Flash are currently the cheapest capable models, both under $0.30/million input tokens. For applications where cost is the primary concern and latency is acceptable, these are strong choices. DeepSeek is especially competitive at $0.27/$1.10 per million input/output tokens.
Why are output tokens more expensive than input tokens?
Input tokens are processed in one parallel step. Generating each output token requires a full sequential forward pass through the model — far more compute intensive. This is why output prices are typically 3–5x higher than input prices across all major providers.
How many tokens is a typical ChatGPT conversation?
A short single exchange uses 150–500 tokens. A detailed technical question with a long response uses 1,000–3,000 tokens. In multi-turn conversations, all prior messages are sent as context each time, so costs grow with conversation length. After 10 exchanges, a single API call might carry 5,000+ tokens.
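The history growth described here is easy to model. Assuming illustrative sizes (a 100-token system prompt, 200-token user messages, 300-token replies), the input tokens billed per call grow linearly with the turn number:

```python
def input_tokens_at_turn(turn, system=100, user=200, assistant=300):
    """Input tokens billed on the turn-th API call (1-indexed): the system
    prompt, all prior user/assistant exchanges, plus the new message."""
    history = (turn - 1) * (user + assistant)
    return system + history + user


for t in (1, 5, 11):
    print(f"turn {t:2}: {input_tokens_at_turn(t):,} input tokens")
```

By the 11th call this toy conversation already carries 5,300 input tokens, matching the "5,000+ tokens after 10 exchanges" figure above.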
Do prices change frequently?
Yes — prices have been falling rapidly. OpenAI reduced GPT-4o mini pricing multiple times in 2024–2025. Google and Anthropic have also cut prices as competition intensified. We update this tool's default prices regularly, but always verify the latest rates on the official provider pricing page before budgeting a production project.
Related Tools
- AI Energy & Carbon Footprint Calculator: how much energy and CO2 does your AI usage produce?
- Currency Converter: convert API costs from USD to your local currency
- Digital Carbon Footprint Calculator: CO2 from streaming, emails, and downloads
- Discount & Percentage Calculator: calculate savings from batch API discounts
References & Data Sources
- OpenAI API pricing — platform.openai.com/docs/pricing
- Anthropic Claude pricing — anthropic.com/pricing
- Google Gemini API pricing — ai.google.dev/pricing
- DeepSeek API pricing — platform.deepseek.com/api-docs/pricing
- Groq (Llama hosting) pricing — groq.com/pricing
- Token counting methodology: 1 token ≈ 4 characters (OpenAI tokenizer documentation)
Prices shown are standard public API rates as of March 2026. Enterprise agreements, volume discounts, and promotional rates may differ. Always verify at the official provider page before committing to a production budget.