How Much Does DeepSeek Cost? | Real Price Math

DeepSeek is free for chat, while API use is pay-as-you-go from $0.028 to $0.87 per 1M tokens at current listed rates.

If you searched for “How much does DeepSeek cost?”, the short answer is this: casual chat can cost nothing, and developer access is billed by token volume. The price changes with how much text you send, how much text the model returns, and whether repeated input qualifies for cheaper cache-hit billing.

The practical split is simple. Readers who want a chatbot should check the free web or mobile app. Builders who want API access should budget by the million tokens, then run small tests before adding a large balance.

DeepSeek Cost By Plan Type And Model

DeepSeek has two main cost paths: consumer chat and API usage. The public chat path is the easy one. DeepSeek’s own site offers free access for regular chat, which makes it the lowest-friction way to try the model.

The API path is different. It is not a flat monthly plan. You add funds, send requests, and DeepSeek deducts cost from your balance based on tokens. DeepSeek’s models and pricing page lists prices per 1 million input or output tokens, plus separate rates for cache-hit and cache-miss input.

What Free Chat Usually Means

Free chat works well for normal prompts, writing drafts, coding questions, and research-style tasks where you don’t need your own app to call the model. There may still be usage limits, slower periods, or account restrictions, but there is no public consumer subscription price listed on DeepSeek’s own main site.

This matters because many paid “DeepSeek” sites are not DeepSeek itself. Some are wrappers that charge for access to models through their own interface. That can be fine when you want their extra features, but don’t treat those prices as DeepSeek’s direct price.

How API Billing Works

API pricing depends on three numbers: fresh input tokens, cached input tokens, and output tokens. DeepSeek says tokens are the billing unit, and its token usage notes explain that a token can be a word, number, symbol, or piece of text.

At the listed rates, DeepSeek-V4-Flash costs $0.028 per 1M cache-hit input tokens, $0.14 per 1M cache-miss input tokens, and $0.28 per 1M output tokens. DeepSeek-V4-Pro costs $0.03625 per 1M cache-hit input tokens, $0.435 per 1M cache-miss input tokens, and $0.87 per 1M output tokens during the listed 75% promo window that ends May 5, 2026 at 15:59 UTC.

Those tiny rates can hide real spend when an app runs many calls. A request with 2,000 fresh input tokens and 800 output tokens on V4-Flash costs $0.000504. Ten thousand of those same requests cost $5.04.
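The arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, not an official calculator: the rates are copied from the figures quoted in this article, and the keys `v4-flash` and `v4-pro` are labels chosen here for illustration, not confirmed API model identifiers.

```python
# Listed rates in USD per 1M tokens, copied from the figures quoted above.
# The V4-Pro values are the promo rates; check the pricing page before relying on them.
# The dictionary keys are illustrative labels, not official API model names.
RATES = {
    "v4-flash": {"cache_hit": 0.028, "cache_miss": 0.14, "output": 0.28},
    "v4-pro": {"cache_hit": 0.03625, "cache_miss": 0.435, "output": 0.87},
}


def request_cost(model, input_tokens, output_tokens, cached=False):
    """Return the USD cost of one request at the listed per-1M-token rates."""
    rates = RATES[model]
    input_rate = rates["cache_hit"] if cached else rates["cache_miss"]
    return (input_tokens * input_rate + output_tokens * rates["output"]) / 1_000_000


one_call = request_cost("v4-flash", 2_000, 800)
print(f"one call: ${one_call:.6f}")                # one call: $0.000504
print(f"10,000 calls: ${one_call * 10_000:.2f}")   # 10,000 calls: $5.04
```

Swapping in your own token counts gives a quick per-call figure before any real traffic is sent.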

Price Examples For Common DeepSeek Jobs

The table below uses the current V4-Flash rates unless the row names V4-Pro. It assumes cache-miss input except where the row says cache hit. Real bills can move when prompts grow, answers run long, or repeated text gets cached.

| DeepSeek Use | How Billing Works | Sample Spend |
| --- | --- | --- |
| Web or mobile chat | No listed consumer subscription on DeepSeek’s own page | $0 before any third-party add-on |
| Short customer email reply | 1K input + 500 output on V4-Flash | $2.80 per 10K API calls |
| Article draft planning | 3K input + 1.5K output on V4-Flash | $8.40 per 10K API calls |
| Code explanation | 6K input + 2K output on V4-Flash | $14.00 per 10K API calls |
| Long file with repeated prompt | 50K input + 2K output; cache miss, then cache hit | $7.56, then $1.96 per 1K calls |
| Reasoning-heavy task | 10K input + 5K output on V4-Pro promo | $8.70 per 1K API calls |
| Long answer generation | 5K input + 10K output on V4-Flash | $3.50 per 1K API calls |
| Large batch cleanup | 20K input + 1K output on V4-Flash | $3.08 per 1K API calls |

What Makes Your Bill Rise

Per token, output is the priciest line item. Cache-miss input on V4-Flash is cheap at $0.14 per 1M tokens, but output costs twice that at $0.28. Long answers, verbose drafts, and repeated retry loops can raise the bill more than a single large prompt.

Context size matters too. A million-token context window sounds generous, but sending huge files when only a few paragraphs are needed wastes money. Trim boilerplate, remove repeated headers, and pass only the text the model needs for the task.
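One way to trim repeated headers and footers before sending text is sketched below. It is a rough illustration under stated assumptions: the function name `strip_repeated_lines` and the sample pages are made up for this example, and character count is used only as a loose stand-in for token count.

```python
from collections import Counter


def strip_repeated_lines(pages):
    """Drop lines that appear on every page, such as shared headers and footers."""
    # Count each distinct non-blank line once per page.
    seen = Counter()
    for page in pages:
        seen.update({line.strip() for line in page.splitlines() if line.strip()})
    # Only treat a line as boilerplate when there is more than one page to compare.
    repeated = {
        line for line, count in seen.items()
        if len(pages) > 1 and count == len(pages)
    }
    cleaned = []
    for page in pages:
        kept = [line for line in page.splitlines() if line.strip() not in repeated]
        cleaned.append("\n".join(kept))
    return cleaned


# Made-up sample pages that share a navigation header and a footer.
pages = [
    "ACME Docs | Home | Contact\nRefunds take 5 days.\n(c) ACME",
    "ACME Docs | Home | Contact\nShipping is free over $50.\n(c) ACME",
]
cleaned = strip_repeated_lines(pages)
before = sum(len(page) for page in pages)
after = sum(len(page) for page in cleaned)
print(cleaned)
print(f"characters sent: {before} -> {after}")
```

A filter like this is blunt: it would also drop a meaningful sentence that happens to appear on every page, so spot-check the cleaned text before relying on it.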

Cache Hit Versus Cache Miss

Cache miss means DeepSeek processes input as new. Cache hit means repeated input can be billed at a lower rate. This is useful when an app sends the same long instruction block, policy text, or product catalog across many requests.

Do not count on every request being cached. Design your budget with cache-miss rates first, then treat cache-hit pricing as savings when your traffic pattern earns it.
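The miss-first budgeting rule can be sketched as follows. This is a simplified model, assuming the V4-Flash rates listed above and a hypothetical `hit_ratio` parameter that stands for the fraction of input tokens billed at the cache-hit rate; real caching behavior may not split this evenly.

```python
def monthly_budget(calls, input_tokens, output_tokens, hit_ratio=0.0,
                   hit_rate=0.028, miss_rate=0.14, output_rate=0.28):
    """Estimate USD spend; rates are USD per 1M tokens (V4-Flash as listed above)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Blend the two input rates by the share of input tokens assumed to hit the cache.
    input_rate = hit_ratio * hit_rate + (1.0 - hit_ratio) * miss_rate
    per_call = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return calls * per_call


# Budget on the all-miss figure, then treat any cache hits as savings.
worst_case = monthly_budget(1_000, 50_000, 2_000)                 # every input token misses
best_case = monthly_budget(1_000, 50_000, 2_000, hit_ratio=1.0)   # every input token hits
print(f"budget: ${worst_case:.2f}, possible savings: ${worst_case - best_case:.2f}")
# budget: $7.56, possible savings: $5.60
```

Planning around `worst_case` means an unexpectedly cold cache cannot push spend past the budget.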

Ways To Cut DeepSeek API Spend

Small edits in prompt design can change monthly cost. The goal is not to make prompts tiny at any price. The goal is to send the right text, ask for the right length, and avoid paying for repeated clutter.

| Cost Move | Why It Works | Where It Fits |
| --- | --- | --- |
| Start with V4-Flash | Its listed token prices are lower than V4-Pro | Drafting, coding, extraction, chat apps |
| Reuse long instructions | Repeated input may qualify for cache-hit pricing | Apps with stable system prompts |
| Set answer length limits | Output tokens are billed on top of input | Briefs, labels, product fields |
| Send clean source text | Menus, footers, and duplicate blocks add tokens | File reading and page extraction |
| Log usage by feature | One costly feature can hide inside a cheap app | Dashboards, agents, internal tools |
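Logging usage by feature can be as simple as the sketch below. The record layout (feature name, cache-miss input tokens, cache-hit input tokens, output tokens) is an assumption made for this example; in practice you would fill it from the token counts your own app records for each call. Rates are the V4-Flash figures listed earlier.

```python
from collections import defaultdict

# V4-Flash rates in USD per 1M tokens, as listed earlier in this article.
MISS_RATE, HIT_RATE, OUTPUT_RATE = 0.14, 0.028, 0.28


def cost_by_feature(records):
    """Sum estimated USD spend per feature, most expensive feature first."""
    totals = defaultdict(float)
    for feature, miss_tokens, hit_tokens, output_tokens in records:
        totals[feature] += (
            miss_tokens * MISS_RATE
            + hit_tokens * HIT_RATE
            + output_tokens * OUTPUT_RATE
        ) / 1_000_000
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log: (feature, cache-miss input, cache-hit input, output) per call.
records = [
    ("chat", 1_000, 0, 500),
    ("chat", 1_000, 0, 500),
    ("file_summary", 50_000, 0, 2_000),
    ("file_summary", 0, 50_000, 2_000),
]
for feature, usd in cost_by_feature(records):
    print(f"{feature}: ${usd:.5f}")
```

Sorting by spend puts the costly feature at the top of the report, even when it accounts for only a few calls.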

When Paid DeepSeek Access Makes Sense

Use the free chat product when you only need answers inside DeepSeek’s own interface. It is the sane starting point for learning the model, testing writing style, or solving one-off tasks.

Use the API when DeepSeek needs to sit inside your product, workflow, or private tool. API access makes sense when you need automation, batch jobs, saved logs, your own interface, or calls from code.

Budget Rule For Small Projects

For a small prototype, add a low balance and track token use for a few days. Multiply daily spend by 30, then add a safety margin for retries and longer answers. If your app sends large files, test with real files rather than short demos.
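The prototype rule above can be written down as a short projection. This is a sketch under stated assumptions: the 25% default safety margin and the sample daily totals are placeholders chosen for illustration, not figures from DeepSeek.

```python
def project_monthly(daily_spend, safety_margin=0.25, days=30):
    """Project monthly USD spend from observed daily totals plus a safety margin."""
    if not daily_spend:
        raise ValueError("need at least one day of observed spend")
    average = sum(daily_spend) / len(daily_spend)
    # Scale the daily average to a month, then pad for retries and longer answers.
    return average * days * (1.0 + safety_margin)


# Made-up daily totals in USD from a few days of prototype traffic.
observed = [0.42, 0.55, 0.38, 0.61]
print(f"projected monthly spend: ${project_monthly(observed):.2f}")
```

Raise `safety_margin` if the test days used short demo inputs rather than the large files the app will see in production.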

For a team tool, separate costs by feature. Chat, extraction, coding, and long-form generation have different token patterns. One messy prompt can cost more than a clean batch of short calls.

Final Price Check Before You Add Funds

DeepSeek can cost nothing for regular chat, but API users pay by input and output tokens. V4-Flash is the cheaper current choice for most price-sensitive work. V4-Pro costs more, even during its listed promo, and should be reserved for tasks where the higher rate pays for itself.

Before you add a larger balance, write down three numbers: average input tokens, average output tokens, and expected calls per month. That small bit of math will tell you whether DeepSeek costs cents, dollars, or more for your use case.
