
DeepSeek V4 Flash

DeepSeek V4 Flash is the efficiency-tier model in DeepSeek's V4 series, released on April 23, 2026. It pairs a hybrid attention architecture with a 1M-token context window and supports reasoning, tool use, and implicit caching.

Reasoning · Tool Use · Implicit Caching
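index.ts
import { streamText } from 'ai';
// Request a streamed completion from DeepSeek V4 Flash.
const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?',
});
// Print the response text as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}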

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
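As a rough illustration of setting a provider preference, the sketch below builds an ordered list of provider slugs: preferred providers first, the remaining ones as fallbacks. The slugs and the `preferProviders` helper are hypothetical; copy the real slugs from the provider list. It assumes the gateway accepts such a list as an `order` array under `providerOptions.gateway`, which should be checked against the docs.

```typescript
// Shape of the routing preference this sketch produces (an assumption, not a confirmed API type).
type GatewayRouting = { order: string[] };

// Hypothetical helper: put preferred slugs first, then append the remaining
// available providers so they can still act as fallbacks.
function preferProviders(preferred: string[], available: string[]): GatewayRouting {
  const known = new Set(available);
  // Drop duplicates and any preferred slug that is not actually available.
  const head = [...new Set(preferred)].filter((slug) => known.has(slug));
  const tail = available.filter((slug) => !head.includes(slug));
  return { order: [...head, ...tail] };
}

// Hypothetical slugs for illustration only.
const routing = preferProviders(
  ['novita', 'fireworks'],
  ['deepseek', 'novita', 'deepinfra', 'fireworks'],
);

console.log(routing.order);
```

The resulting object could then be passed alongside `model` and `prompt`, for example as `providerOptions: { gateway: routing }`, if the gateway supports that option.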

| Provider | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeek | 1M | 1.7s | 112 tps | $0.14/M | $0.28/M | $0.0/M | | | +1 | | | 04/23/2026 |
| Novita AI | 1M | 0.9s | 163 tps | $0.14/M | $0.28/M | $0.03/M | | | +3 | | | 04/23/2026 |
| DeepInfra | 1M | 1.2s | 25 tps | $0.14/M | $0.28/M | $0.03/M | | | +3 | | | 04/23/2026 |
| Blackbox | 1M | | | $0.14/M | $0.28/M | $0.0/M | | | +1 | | | 04/23/2026 |
| Fireworks | 1M | 1.7s | 135 tps | $0.14/M | $0.28/M | $0.0/M | | | +1 | | | 04/23/2026 |
| Azure | 1M | 1.7s | 110 tps | $0.19/M | $0.51/M | | | | +1 | | | 04/23/2026 |
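To make the per-million-token prices concrete, here is a minimal sketch that estimates the cost of a single request from the listed input and output rates. The `estimateCostUSD` helper and the token counts are illustrative assumptions, and the estimate ignores cache reads, cache writes, and any other fees.

```typescript
// Price per one million tokens, in USD.
interface TokenPrice {
  inputPerM: number;
  outputPerM: number;
}

// Rates taken from the provider listing; the object keys are illustrative labels, not provider slugs.
const PRICES: Record<string, TokenPrice> = {
  deepseek: { inputPerM: 0.14, outputPerM: 0.28 },
  azure: { inputPerM: 0.19, outputPerM: 0.51 },
};

// Hypothetical helper: cost = tokens / 1,000,000 * price per million, summed over input and output.
function estimateCostUSD(inputTokens: number, outputTokens: number, price: TokenPrice): number {
  const inputCost = (inputTokens / 1_000_000) * price.inputPerM;
  const outputCost = (outputTokens / 1_000_000) * price.outputPerM;
  return inputCost + outputCost;
}

// Example request: 200,000 input tokens and 50,000 output tokens.
for (const [name, price] of Object.entries(PRICES)) {
  console.log(`${name}: $${estimateCostUSD(200_000, 50_000, price).toFixed(4)}`);
}
```

For this example request, the estimate comes to about $0.042 at the $0.14/$0.28 rates and about $0.0635 at the Azure rates.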