DeepSeek API, paid in USDT.
No Visa. No Alipay. No KYC.
OpenAI-compatible drop-in for Cursor, Cline, OpenClaw — same DeepSeek you'd get direct, at the same discount price (passthrough), but accessible to anyone with a USDT wallet. Built for developers whose payment rails keep getting declined upstream.
USDT, no Visa needed
Top up via USDT TRC-20. The way to pay if your bank keeps declining DeepSeek's Stripe checkout. 3% top-up fee, that's it.
One balance, many models
DeepSeek today, Qwen / GLM / Hermes-3 coming. One USDT top-up, one bill, no juggling accounts across upstream providers.
OpenAI-compatible
Drop into the OpenAI SDK. Change base_url and key — done. Works with LangChain, AutoGen, CrewAI, OpenClaw, Cline, Continue.
Shared with GPU rental
Same USDT balance powers GPU rentals AND API calls. New users get $0.60 signup credit to try both. One account, two products.
Passthrough pricing
You pay exactly what we pay the upstream model provider. Zero per-token markup. The only fee on top is 3% when you top up your USDT balance.
| Model | Model ID | $/1M input | Cached input | $/1M output | Best for | Status |
|---|---|---|---|---|---|---|
| DeepSeek V4 Flash | deepseek-v4-flash | $0.14 | $0.0028 | $0.28 | Code, general chat, agents (non-thinking mode) | Live |
| DeepSeek V4 Pro | deepseek-v4-pro | $0.435 | $0.003625 | $0.87 | Chain-of-thought reasoning, complex agents (thinking mode) | Live |
| Qwen3 Plus | qwen-plus | $0.40 | — | $2.40 | Long context, multilingual | Coming |
| Qwen3 Max | qwen-max | $1.20 | — | $6.00 | Top-tier reasoning, multimodal | Coming |
| GLM-4.7 | glm-4.7 | $0.60 | — | $2.20 | Chinese-language tasks, RAG | Coming |
All prices in USD per million tokens. Cache-hit pricing applies automatically when the upstream cache fires; we pass the discount straight through to you.
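To make the cache discount concrete, here is a minimal sketch of how one request's cost follows from the table above. The prices are copied from the table. The function name and the idea that cached tokens are reported as a subset of input tokens are assumptions for illustration, not a documented billing formula.

```python
# USD per 1M tokens, copied from the pricing table above (live models only).
PRICES = {
    "deepseek-v4-flash": {"input": 0.14, "cached_input": 0.0028, "output": 0.28},
    "deepseek-v4-pro": {"input": 0.435, "cached_input": 0.003625, "output": 0.87},
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Return the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens that hit the
    upstream cache and is billed at the cached-input rate instead.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cached_input"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000
```

For example, `request_cost("deepseek-v4-flash", 18_000_000, 2_000_000)` gives about $3.08 with no cache hits. Passing a non-zero `cached_tokens` lowers the input portion accordingly.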
DeepSeek V3 output, side by side
| Provider | $/1M output | Notes |
|---|---|---|
| DeepSeek direct | $0.28 | Best price, requires international card, no aggregator |
| CloudGPU AI Gateway | $0.28 | Passthrough, USDT top-up, multi-model abstraction |
| OpenRouter | $0.28 | +5.5% top-up fee, no APAC optimization |
| Together AI | ~$0.88 | ~3× markup over upstream |
| Fireworks AI | ~$0.90 | ~3× markup over upstream |
Competitor rates observed May 2026. Verify on their pricing pages before locking in your budget.
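Because the top-up fee is the only difference between the three $0.28 rows, it helps to fold it into the per-token price. The sketch below assumes the fee is charged on top of the credited amount, so $1.00 of balance costs $1.00 × (1 + fee), and that going direct carries no top-up fee. Check each provider's checkout for the exact mechanics.

```python
OUTPUT_PRICE = 0.28  # USD per 1M output tokens, from the table above

# Top-up fees as stated on this page; 0.0 for direct is an assumption.
TOPUP_FEES = {
    "DeepSeek direct": 0.0,
    "CloudGPU AI Gateway": 0.03,
    "OpenRouter": 0.055,
}


def effective_price(list_price: float, topup_fee: float) -> float:
    """Price per 1M tokens once the top-up fee is folded in.

    Assumption: the fee is added on top of the credited amount.
    """
    return list_price * (1 + topup_fee)


for provider, fee in TOPUP_FEES.items():
    price = effective_price(OUTPUT_PRICE, fee)
    print(f"{provider}: ${price:.4f} per 1M output tokens")
```

Under that assumption the effective output prices are $0.2800, $0.2884, and $0.2954 per million tokens respectively.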
What you would actually pay
Five realistic monthly workloads, billed at each provider's public price. Token counts are typical patterns we see in the open-source agent ecosystem.
| Workload | Where it shows up | Monthly tokens | cloudgpu | gpt-4o-mini | gpt-4o | Claude Sonnet |
|---|---|---|---|---|---|---|
| IDE code completion (deepseek-chat · ~400 completions/day) | Cursor, Continue.dev, Tabby, Codeium-like inline autocomplete | 18M in + 2M out | $3.08 | $3.90 | $65 | $84 |
| RAG production API (deepseek-chat · 200 queries/day) | LangChain, LlamaIndex, Haystack with 6K-token retrieved context | 26M in + 2M out | $4.19 | $5.02 | $84 | $106 |
| Multi-agent code task (deepseek-reasoner · 5 sessions/day) | AutoGen, CrewAI, OpenClaw (4 agents bouncing per feature) | 9M in + 2M out | $5.27 | $2.31 (mini tier, weaker quality) | $39 | $51 |
| Autonomous coding agent (deepseek-reasoner · full workday) | Cline, aider, Claude-Code-style terminal agents reading whole repos | 7M in + 1M out | $3.44 | $1.39 (mini tier, weaker quality) | $23 | $30 |
| Function-calling automation (deepseek-chat · 1000 calls/day) | Hermes-3 style pipelines, GPT-Researcher, n8n + LLM nodes | 66M in + 11M out | $12.32 | $16.50 | $275 | $363 |
Chat-tier workloads (completion, RAG, simple agents)
deepseek-chat sits ~20% under gpt-4o-mini on output-heavy work and matches on input-heavy. Quality is comparable for non-reasoning tasks. The savings get larger as token volume grows.
Reasoning workloads (planning, complex agents, autonomous coding)
deepseek-reasoner punches at gpt-4o / Claude Sonnet quality (per HumanEval, MATH, BigCodeBench) for 5-10× less. gpt-4o-mini at this scale gives up too much accuracy — agents loop, RAG hallucinates, code patches don't apply. That's the real cost ratio.
Competitor prices: OpenAI gpt-4o-mini $0.15/$0.60, gpt-4o $2.50/$10.00; Anthropic Claude Sonnet 4 $3.00/$15.00 per million tokens (input/output) as of May 2026. Verify on each provider's pricing page before locking in your budget. Token counts are illustrative — your actual usage will vary.
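The monthly figures are plain arithmetic on the per-million prices quoted above, so you can rerun them with your own token counts. This sketch assumes deepseek-chat is billed at the deepseek-v4-flash rate ($0.14 / $0.28), which is what the first table row implies. The provider labels and function name are illustrative.

```python
# (input, output) prices in USD per 1M tokens, as listed on this page.
PROVIDERS = {
    "cloudgpu (deepseek-chat)": (0.14, 0.28),  # assumed equal to deepseek-v4-flash
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
}


def monthly_cost(input_millions: float, output_millions: float) -> dict[str, float]:
    """Return the monthly bill per provider, rounded to cents."""
    return {
        name: round(input_millions * price_in + output_millions * price_out, 2)
        for name, (price_in, price_out) in PROVIDERS.items()
    }
```

`monthly_cost(18, 2)` reproduces the IDE code completion row ($3.08, $3.90, $65, $84), and `monthly_cost(66, 11)` reproduces the function-calling row. Rows whose token counts are rounded in the table can differ from this calculation by a few percent.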
Open-source starter · MIT
Deploy a WhatsApp Business autoresponder in 10 minutes
Production-ready Node.js bot — Meta WhatsApp Cloud API + DeepSeek-V4 via cloudgpu. Editable faq.yml for the shop owner, $0 cost on FAQ hits and ~$0.0001 per AI reply. One-click deploy to Railway or Fly.io.
Built for small businesses in markets where Stripe-based AI subscriptions keep getting declined. Indonesia, Vietnam, Brazil, Mexico, Nigeria, India, the Philippines — anywhere WhatsApp is the primary customer-comms channel and the going SaaS option starts at $40/mo.
5-minute quick start
curl https://cloudgpu.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cgw-sk-..." \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Common questions
If I have a working Visa, why not just go direct to DeepSeek?
Honest answer: if your card works at platform.deepseek.com and you only ever need DeepSeek, going direct is fine — you save the 3% top-up fee and we add a hop. We exist for developers whose cards keep declining (common for users in CN, SEA, Latam, Africa), users who want USDT-only payment, and developers who want a single account that bills DeepSeek + Qwen + GLM + more (rolling out). If none of those describe you, go direct.
How does this compare to OpenRouter?
OpenRouter wins on model breadth (290+ models). We win on focus (Chinese open-source models, no Western model padding) and on USDT payment (OpenRouter takes USDT but their fee structure is 5.5% vs our 3%). For Chinese model users specifically, we're cheaper. For Western model users (GPT-4, Claude, Llama via providers), OpenRouter has them and we don't.
Is my data sent to mainland China?
No. Requests route through Singapore / Hong Kong edge nodes to upstream model providers. We do not log prompt or completion content; we record token counts and request metadata only. The service is in beta; a production-grade data processing agreement (DPA) will be available on request once a corporate entity is established.
What payment methods do you accept?
USDT (TRC-20) only at v0. Top up your balance with USDT; per-request costs deduct in USD. The 3% top-up fee covers gas, conversion and platform overhead. Credit cards and cryptocurrencies other than USDT are on the roadmap.
What about rate limits?
100 requests per minute per API key, 10,000 requests per day per account. These are anti-abuse defaults for the public beta and can be raised on request once you're past the trial.
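If you run many parallel agents, a client-side throttle keeps you under the per-key limit instead of relying on rejected requests. The sketch below is one way to do it with a rolling window. Whether the server counts a fixed or a rolling minute is not stated here, so treat the window behaviour as an assumption. The class name is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0, clock=time.monotonic):
        self.limit = limit          # 100 requests per minute per API key
        self.window = window
        self.clock = clock          # injectable so the limiter can be tested
        self._stamps = deque()      # timestamps of calls still inside the window

    def _prune(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if the window is full."""
        now = self.clock()
        self._prune(now)
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True

    def wait_time(self) -> float:
        """Seconds until a slot frees up (0.0 if one is free now)."""
        now = self.clock()
        self._prune(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])
```

Call `try_acquire()` before each request and sleep for `wait_time()` when it returns `False`. The daily account-wide limit would need a separate counter shared across your keys.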
Will the OpenAI SDK work?
Yes. Use the OpenAI Python or Node.js SDK and override base_url to https://cloudgpu.app/v1. The /v1/chat/completions surface is request-compatible. Anthropic SDK compatibility is on the roadmap.
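As a rough sketch of that override with the OpenAI Python SDK (v1-style client): the base URL and model ID come from this page, while the `CLOUDGPU_API_KEY` environment variable name and the helper functions are assumptions made for the example.

```python
import os

BASE_URL = "https://cloudgpu.app/v1"  # from the answer above
MODEL = "deepseek-v4-flash"           # model ID from the pricing table


def build_messages(prompt: str) -> list[dict]:
    """Wrap a prompt in the chat-completions message format."""
    return [{"role": "user", "content": prompt}]


def ask(prompt: str) -> str:
    """Send one prompt through the gateway and return the reply text."""
    # Imported lazily so this module loads even without the openai package.
    from openai import OpenAI

    client = OpenAI(
        base_url=BASE_URL,
        # Hypothetical variable name; store your cgw-sk-... key wherever you prefer.
        api_key=os.environ["CLOUDGPU_API_KEY"],
    )
    response = client.chat.completions.create(
        model=MODEL,
        messages=build_messages(prompt),
    )
    return response.choices[0].message.content
```

With the key exported, `print(ask("Hello"))` sends the same request as the curl quick start.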
$0.60 free credit on signup
Enough for ~4M input tokens (or ~2M output tokens) on deepseek-v4-flash. No credit card. Same balance covers GPU rentals.
Create your API key