DeepSeek V4 Flash

DeepSeek V4 Flash is DeepSeek's April 23, 2026 efficiency-tier model in the V4 series. It pairs a hybrid attention architecture with a context window of 1.0M tokens and supports reasoning, tool use, and implicit caching.

ReasoningTool UseImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	ZDR	No Training	Release Date

DeepSeek

1.7s

112tps

$0.14/M

$0.28/M

Read:$0.0/M

Write:—

—

04/23/2026

Novita AI

0.9s

163tps

$0.14/M

$0.28/M

Read:$0.03/M

Write:—

—

04/23/2026

DeepInfra

1.2s

25tps

$0.14/M

$0.28/M

Read:$0.03/M

Write:—

—

04/23/2026

Blackbox

$0.14/M

$0.28/M

Read:$0.0/M

Write:—

—

04/23/2026

Fireworks

1.7s

135tps

$0.14/M

$0.28/M

Read:$0.0/M

Write:—

—

04/23/2026

Azure

1.7s

110tps

$0.19/M

$0.51/M

—

04/23/2026

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

DeepSeek V4 Flash

Providers