TL;DR — DeepSeek ships an Anthropic-compatible API endpoint that lets Claude Code treat DeepSeek V4 Pro like a drop-in replacement for Claude Opus. The setup is eight environment variables, and it works — including tool calling, sub-agent spawning, and native web search. At $0.435/M input tokens (permanent price after the initial launch promo), it’s roughly 4–17× cheaper than Claude Opus 4.7. This is a practical guide based on a real setup we run daily.
Why This Matters
Claude Code is Anthropic’s terminal-based AI coding agent. It reads your codebase, runs bash commands, spawns sub-agents, and writes edits — all through Anthropic’s API. The problem: it only speaks Anthropic’s message format. You can’t point it at OpenAI, Gemini, or a local Ollama instance without a translation layer.
DeepSeek solved this the obvious way: they built an Anthropic-compatible API endpoint at https://api.deepseek.com/anthropic and documented the Claude Code integration on day one. No proxy, no wrapper, no SDK fork. Just environment variables.
We’ve been running this setup in production — managing a multi-project workspace with six active sub-projects, MCP servers, and daily coding sessions. Here’s what works, what doesn’t, and the exact configuration.
Step 1: Get a DeepSeek API Key
Sign up at platform.deepseek.com and create an API key. DeepSeek uses prepaid balance — top up what you need, no subscription.
Two models matter for Claude Code:
| Model | Role | Context | Max Output | Input (cache miss) | Output |
|---|---|---|---|---|---|
deepseek-v4-pro |
Heavy reasoning, main agent | 1M tokens | 384K | $0.435/M | $0.87/M |
deepseek-v4-flash |
Sub-agents, fast tasks | 1M tokens | 384K | $0.14/M | $0.28/M |
After the initial launch promotion ends (2026-05-31), DeepSeek permanently adjusts the official price to 1/4 of the original — so $0.435/M input becomes the new normal, not a temporary deal. That’s still 34× cheaper than Claude Opus 4.7 ($15/M input, ~$75/M output) on input tokens. V4 Flash stays as-is.
Cache hits are absurdly cheap: $0.003625/M for V4 Pro and $0.0028/M for V4 Flash. Claude Code generates a lot of repetitive context (system prompts, CLAUDE.md files, tool definitions), so cache hits dominate real usage.
Step 2: Configure Environment Variables
Create a shell script (we call ours claude.sh) and source it before launching Claude Code:
|
|
Then launch:
|
|
What each variable does:
| Variable | Purpose |
|---|---|
ANTHROPIC_BASE_URL |
Redirects all API calls to DeepSeek’s Anthropic-compatible endpoint |
ANTHROPIC_AUTH_TOKEN |
Your DeepSeek API key (not ANTHROPIC_API_KEY — Claude Code uses AUTH_TOKEN) |
ANTHROPIC_MODEL |
Default model for the main agent loop |
ANTHROPIC_DEFAULT_OPUS_MODEL |
What Claude Code uses when it internally requests Opus |
ANTHROPIC_DEFAULT_SONNET_MODEL |
What Claude Code uses when it internally requests Sonnet |
ANTHROPIC_DEFAULT_HAIKU_MODEL |
What Claude Code uses when it internally requests Haiku |
CLAUDE_CODE_SUBAGENT_MODEL |
Model for spawned sub-agents (Explore, Plan, etc.) |
CLAUDE_CODE_EFFORT_LEVEL |
Thinking budget — max gives the model the most reasoning tokens |
The [1m] suffix on model names is a DeepSeek convention for requesting the 1M-token context window. Without it, you get the default context length.
Step 3: Understand Model Mapping
DeepSeek does automatic model name mapping. When Claude Code internally requests claude-opus-4-7, DeepSeek’s API maps it:
claude-opus-* → deepseek-v4-pro
claude-sonnet-* → deepseek-v4-flash
claude-haiku-* → deepseek-v4-flash
This means you don’t need to patch Claude Code’s source. When the agent decides it needs “Opus-level” reasoning, DeepSeek routes it to V4 Pro. When it wants Haiku for a fast sub-agent, it gets V4 Flash.
We explicitly set the model variables anyway (rather than relying on mapping) because it gives us control over which model handles sub-agents. V4 Flash is fast enough for search and file-reading sub-agents, and it’s 3× cheaper than V4 Pro on input.
What Actually Works
Tool Calling
DeepSeek’s Anthropic API fully supports tool_use and tool_result message types. Claude Code’s entire agent loop is built on tool calling — Read, Write, Edit, Bash, Grep, Glob — and all of it works.
Message: array, type = "tool_use"
- id: Fully Supported
- input: Fully Supported
- name: Fully Supported
- cache_control: Ignored
Message: array, type = "tool_result"
- tool_use_id: Fully Supported
- content: Fully Supported
- is_error: Ignored
Sub-Agent Spawning
Claude Code spawns specialized sub-agents (Explore for file search, Plan for architecture, etc.) using the Agent tool. Each sub-agent is itself a tool-calling loop with restricted permissions. This works on DeepSeek — we’ve tested multi-agent sessions where the main V4 Pro agent spawns V4 Flash sub-agents for file search, and the results flow back correctly.
Web Search (the surprising part)
This is the feature that caught us off guard. DeepSeek’s API natively supports Claude Code’s built-in Web Search tool. When the model determines your question needs web results, it invokes the search tool through DeepSeek’s own search infrastructure — not Anthropic’s.
From DeepSeek’s documentation:
“The DeepSeek API natively supports the Web Search feature in Claude Code. When using Claude Code, if the model determines that your question requires a web search, it will invoke the Web Search tool and perform the search through the API provided by DeepSeek.”
In practice: ask Claude Code “what’s the latest version of LangGraph?” and it will trigger a web search, get results, and summarize them — all through DeepSeek. The web_search_tool_result message type is fully supported in the API.
Cost caveat: Each web search triggers additional LLM API calls to summarize the retrieved content. DeepSeek bills these as normal token usage. A single search-then-summarize cycle might consume 5–20K extra input tokens.
Thinking Mode
DeepSeek V4 supports thinking mode (extended reasoning). The thinking field in the API is supported, though budget_tokens is ignored — DeepSeek manages its own reasoning budget internally. Setting CLAUDE_CODE_EFFORT_LEVEL=max gives the model maximum latitude to think.
Streaming
Fully supported. Responses stream token-by-token just like native Claude.
What Doesn’t Work
Being honest about limitations matters. DeepSeek’s Anthropic API is not a perfect clone — it’s a pragmatic subset.
No Image or Document Input
array, type = "image" → Not Supported
array, type = "document" → Not Supported
You can’t paste screenshots or upload PDFs through Claude Code when using DeepSeek. If your workflow involves vision tasks (analyzing UI mockups, reading diagrams), you need native Claude or a vision-capable model for those sessions.
No Prompt Caching
cache_control → Ignored (on tools, messages, and tool results)
DeepSeek has its own context caching (cache hits are priced separately), but the Anthropic-compatible endpoint ignores cache_control markers. Caching happens at DeepSeek’s discretion based on content similarity, not explicit breakpoints.
No MCP Tool Passthrough
array, type = "mcp_tool_use" → Not Supported
array, type = "mcp_tool_result" → Not Supported
MCP (Model Context Protocol) tools work differently — they’re handled client-side by Claude Code, not server-side by the API. So MCP tools like SearXNG, filesystem watchers, or database connectors still work because Claude Code intercepts them before they hit the API. The “not supported” here means DeepSeek’s API won’t process MCP messages natively, which doesn’t affect actual functionality.
Minor Field Ignorances
top_k— ignoredanthropic-beta/anthropic-versionheaders — ignoredstop_sequences— fully supportedcontainer,mcp_servers,service_tier— ignored
None of these affect core Claude Code functionality.
Cost Comparison
A realistic Claude Code session: ~500K input tokens (system prompt + context + tool definitions) and ~50K output tokens.
| Provider | Input Cost | Output Cost | Session Total |
|---|---|---|---|
| Claude Opus 4.7 (direct) | ~$7.50 | ~$3.75 | ~$11.25 |
| DeepSeek V4 Pro | ~$0.22 | ~$0.04 | ~$0.26 |
| DeepSeek V4 Flash (sub-agents) | ~$0.07 | ~$0.01 | ~$0.08 |
With the main agent on V4 Pro and sub-agents on V4 Flash, a typical mixed session costs around $0.15–0.30. That’s roughly 30–70× cheaper than Claude Opus direct — and it’s the permanent price, not a limited-time promo.
The cache hit pricing makes repetitive sessions (same project, same CLAUDE.md, same tool definitions) even cheaper. Our workspace loads ~80K tokens of context on every session start — most of that hits cache at $0.003625/M.
Real-World Tips
Use a wrapper script
Don’t export environment variables in your .bashrc globally — you’ll accidentally use DeepSeek for tools that need native Claude (like vision tasks). We use a claude.sh script:
|
|
Run it with bash claude.sh or source claude.sh && claude.
The ANTHROPIC_USER_ID matters
DeepSeek supports the user_id metadata field for rate limit isolation. Setting ANTHROPIC_USER_ID ensures your requests are bucketed separately from other users on the same API key — useful if you share a key across projects.
V4 Flash for routine work
If you’re doing routine file editing, formatting, or batch operations, swap ANTHROPIC_MODEL to deepseek-v4-flash. It’s 3× cheaper and fast enough for non-reasoning tasks. Save V4 Pro for architecture decisions, debugging, and complex multi-step problems.
The Bottom Line
DeepSeek’s Anthropic-compatible API is the most seamless third-party Claude Code integration available. No proxy server, no SDK patches, no feature gaps on the things that matter (tool calling, sub-agents, web search). The only real limitation is vision — if you need image input, you still need native Claude.
For pure coding work, the cost savings are dramatic enough that there’s no reason not to try it. Eight environment variables, one API key, and you’re running.
The Configuration
|
|
References
- DeepSeek API: Integrate with Claude Code
- DeepSeek API: Anthropic API Compatibility
- DeepSeek API: Models & Pricing
- Claude Code Official Documentation
Built with: Claude Code (latest), DeepSeek V4 Pro + V4 Flash, Node.js 22. Written from a real multi-project workspace running this setup daily.