AI API Gateway · Beta

DeepSeek API, paid in USDT.
No VISA. No Alipay. No KYC.

OpenAI-compatible drop-in for Cursor, Cline, and OpenClaw: the same DeepSeek you'd get direct, at the same passthrough price, but accessible to anyone with a USDT wallet. Built for developers whose cards keep getting declined upstream.

USDT, no Visa needed

Top up via USDT TRC-20. The way to pay if your bank keeps declining DeepSeek's Stripe checkout. 3% top-up fee, that's it.

One balance, many models

DeepSeek today, Qwen / GLM / Hermes-3 coming. One USDT top-up, one bill, no juggling accounts across upstream providers.

OpenAI-compatible

Drop into the OpenAI SDK. Change base_url and key — done. Works with LangChain, AutoGen, CrewAI, OpenClaw, Cline, Continue.

Shared with GPU rental

Same USDT balance powers GPU rentals AND API calls. New users get $0.60 signup credit to try both. One account, two products.

Passthrough pricing

You pay exactly what we pay the upstream model provider. Zero per-token markup. The only fee on top is 3% when you top up your USDT balance.

| Model | Model ID | $/1M input | Cached input | $/1M output | Best for | Status |
|---|---|---|---|---|---|---|
| DeepSeek V4 Flash | deepseek-v4-flash | $0.14 | $0.0028 | $0.28 | Code, general chat, agents (non-thinking mode) | Live |
| DeepSeek V4 Pro | deepseek-v4-pro | $0.435 | $0.003625 | $0.87 | Chain-of-thought reasoning, complex agents (thinking mode) | Live |
| Qwen3 Plus | qwen-plus | $0.40 | - | $2.40 | Long context, multilingual | Coming |
| Qwen3 Max | qwen-max | $1.20 | - | $6.00 | Top-tier reasoning, multimodal | Coming |
| GLM-4.7 | glm-4.7 | $0.60 | - | $2.20 | Chinese-language tasks, RAG | Coming |

All prices in USD per million tokens. Cache-hit pricing applies automatically when the upstream cache fires; we pass the discount straight through to you.
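As a rough illustration of how cache-hit pricing changes a single request, here is a minimal sketch using the two live rows of the table. The function name and the billing split are assumptions, not the gateway's documented formula: it treats `cached_tokens` as the cache-hit share of the input, billed at the cached rate, with the remaining input billed at the standard rate.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table (live models only).
PRICES = {
    "deepseek-v4-flash": {
        "input": Decimal("0.14"),
        "cached": Decimal("0.0028"),
        "output": Decimal("0.28"),
    },
    "deepseek-v4-pro": {
        "input": Decimal("0.435"),
        "cached": Decimal("0.003625"),
        "output": Decimal("0.87"),
    },
}
MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the cache-hit part of input_tokens and is
    billed at the cached rate; the rest of the input is billed at the
    standard input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cached"]
        + output_tokens * price["output"]
    )
    return total / MILLION


# A 6,000-token prompt with a 500-token reply, without and with a cache hit
# on 5,000 of the prompt tokens.
print(request_cost("deepseek-v4-flash", 6000, 500))
print(request_cost("deepseek-v4-flash", 6000, 500, cached_tokens=5000))
```

`Decimal` is used instead of floats because the per-request amounts are fractions of a cent, where float rounding noise is easy to misread.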

DeepSeek V3 output, side by side

| Provider | $/1M output | Notes |
|---|---|---|
| DeepSeek direct | $0.28 | Best price, requires international card, no aggregator |
| CloudGPU AI Gateway | $0.28 | Passthrough, USDT topup, multi-model abstraction |
| OpenRouter | $0.28 | +5.5% top-up fee, no APAC optimization |
| Together AI | ~$0.88 | ~3× markup over upstream |
| Fireworks AI | ~$0.90 | ~3× markup over upstream |

Competitor rates observed May 2026. Verify on their pricing pages before locking in your budget.

What you would actually pay

Five realistic monthly workloads, billed at each provider's public price. Token counts are typical patterns we see in the open-source agent ecosystem.

| Workload | Where it shows up | Monthly tokens | cloudgpu | gpt-4o-mini | gpt-4o | Claude Sonnet |
|---|---|---|---|---|---|---|
| IDE code completion (deepseek-chat · ~400 completions/day) | Cursor, Continue.dev, Tabby, Codeium-like inline autocomplete | 18M in + 2M out | $3.08 | $3.90 | $65 | $84 |
| RAG production API (deepseek-chat · 200 queries/day) | LangChain, LlamaIndex, Haystack with 6K-token retrieved context | 26M in + 2M out | $4.19 | $5.02 | $84 | $106 |
| Multi-agent code task (deepseek-reasoner · 5 sessions/day) | AutoGen, CrewAI, OpenClaw (4 agents bouncing per feature) | 9M in + 2M out | $5.27 | $2.31 (mini tier, weaker quality) | $39 | $51 |
| Autonomous coding agent (deepseek-reasoner · full workday) | Cline, aider, Claude-Code-style terminal agents reading whole repos | 7M in + 1M out | $3.44 | $1.39 (mini tier, weaker quality) | $23 | $30 |
| Function-calling automation (deepseek-chat · 1000 calls/day) | Hermes-3 style pipelines, GPT-Researcher, n8n + LLM nodes | 66M in + 11M out | $12.32 | $16.50 | $275 | $363 |

Chat-tier workloads (completion, RAG, simple agents)

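deepseek-chat comes in roughly 17-25% under gpt-4o-mini on the three chat-tier rows above. The gap is widest on output-heavy work and narrows on input-heavy work, where per-token prices are nearly identical. Quality is comparable for non-reasoning tasks, and the absolute savings grow with token volume.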

Reasoning workloads (planning, complex agents, autonomous coding)

deepseek-reasoner punches at gpt-4o / Claude Sonnet quality (per HumanEval, MATH, BigCodeBench) for 5-10× less. gpt-4o-mini at this scale gives up too much accuracy — agents loop, RAG hallucinates, code patches don't apply. That's the real cost ratio.

Competitor prices: OpenAI gpt-4o-mini $0.15/$0.60, gpt-4o $2.50/$10.00; Anthropic Claude Sonnet 4 $3.00/$15.00 per million tokens (input/output) as of May 2026. Verify on each provider's pricing page before locking in your budget. Token counts are illustrative — your actual usage will vary.
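To plug in your own token counts, the arithmetic behind the workload table can be sketched as below. Competitor rates are the ones quoted in the note above. The `cloudgpu deepseek-chat` rate of $0.14/$0.28 is an assumption: it is the DeepSeek V4 Flash passthrough price, and it reproduces the IDE completion and function-calling rows. Other rows may differ by a few cents because the published token counts are illustrative.

```python
# USD per 1M tokens as (input, output).
# The cloudgpu rate is an assumption: deepseek-chat billed at the
# DeepSeek V4 Flash passthrough price from the pricing table.
RATES = {
    "cloudgpu deepseek-chat": (0.14, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
}


def monthly_cost(rate, input_millions, output_millions):
    """Monthly USD cost for token volumes given in millions of tokens."""
    input_price, output_price = rate
    return round(input_millions * input_price + output_millions * output_price, 2)


def compare(input_millions, output_millions):
    """Return the monthly cost of one workload at every provider in RATES."""
    return {
        name: monthly_cost(rate, input_millions, output_millions)
        for name, rate in RATES.items()
    }


# IDE code completion row: 18M input + 2M output tokens per month.
print(compare(18, 2))
# Function-calling automation row: 66M input + 11M output tokens per month.
print(compare(66, 11))
```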

Open-source starter · MIT

Deploy a WhatsApp Business autoresponder in 10 minutes

Production-ready Node.js bot — Meta WhatsApp Cloud API + DeepSeek-V4 via cloudgpu. Editable faq.yml for the shop owner, $0 cost on FAQ hits and ~$0.0001 per AI reply. One-click deploy to Railway or Fly.io.

Built for small businesses in markets where Stripe-based AI subscriptions keep getting declined. Indonesia, Vietnam, Brazil, Mexico, Nigeria, India, the Philippines — anywhere WhatsApp is the primary customer-comms channel and the going SaaS option starts at $40/mo.

5-minute quick start

curl https://cloudgpu.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cgw-sk-..." \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Common questions

If I have a working Visa, why not just go direct to DeepSeek?

Honest answer: if your card works at platform.deepseek.com and you only ever need DeepSeek, going direct is fine — you save the 3% top-up fee and we add a hop. We exist for developers whose cards keep declining (common for users in CN, SEA, Latam, Africa), users who want USDT-only payment, and developers who want a single account that bills DeepSeek + Qwen + GLM + more (rolling out). If none of those describe you, go direct.

How does this compare to OpenRouter?

OpenRouter wins on model breadth (290+ models). We win on focus (Chinese open-source models, no Western model padding) and on USDT payment (OpenRouter takes USDT but their fee structure is 5.5% vs our 3%). For Chinese model users specifically, we're cheaper. For Western model users (GPT-4, Claude, Llama via providers), OpenRouter has them and we don't.

Is my data sent to mainland China?

No. Requests route through Singapore / Hong Kong edge nodes to upstream model providers. We do not log prompt or completion content; we record token counts and request metadata only. Beta status — production-grade DPA available on request once a corporate entity is established.

What payment methods do you accept?

USDT (TRC-20) only at v0. Top up your balance with USDT; per-request costs deduct in USD. The 3% top-up fee covers gas, conversion and platform overhead. Credit cards and other cryptocurrencies are on the roadmap.
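For budgeting, the fee arithmetic might look like the sketch below. It rests on two assumptions the text does not confirm: that 1 USDT is credited as 1 USD, and that the 3% is deducted from the deposit rather than charged on top. Both function names are illustrative.

```python
from decimal import Decimal

TOP_UP_FEE = Decimal("0.03")  # the 3% top-up fee stated above


def credited_balance(deposit_usdt):
    """USD balance credited for a deposit.

    Assumptions: 1 USDT == 1 USD, and the fee is taken out of the deposit.
    """
    return Decimal(str(deposit_usdt)) * (1 - TOP_UP_FEE)


def deposit_needed(target_usd):
    """USDT to send so that target_usd lands in the balance (same assumptions)."""
    return Decimal(str(target_usd)) / (1 - TOP_UP_FEE)


print(credited_balance(100))  # balance credited for a 100 USDT deposit
print(deposit_needed(97))     # deposit required for a 97 USD balance
```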

What about rate limits?

100 requests per minute per API key, 10,000 requests per day per account. These are anti-abuse defaults for the public beta and can be raised on request once you're past the trial.
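To stay under the per-minute limit from the client side, one option is a sliding-window throttle like the sketch below. This is a generic technique, not something the gateway provides; the class name and the injectable `clock` parameter are illustrative.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: allow at most max_calls per window seconds."""

    def __init__(self, max_calls=100, window=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls inside the window

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window - (now - self.calls[0])

    def try_acquire(self):
        """Record a call and return True if allowed, else return False."""
        if self.wait_time() > 0:
            return False
        self.calls.append(self.clock())
        return True


limiter = SlidingWindowLimiter(max_calls=100, window=60.0)
if limiter.try_acquire():
    pass  # send the request here
else:
    time.sleep(limiter.wait_time())
```

A second instance with `max_calls=10_000` and `window=86_400` could approximate the daily cap, assuming the day is counted as a rolling 24 hours, which the text does not specify.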

Will the OpenAI SDK work?

Yes. Use the OpenAI Python or Node.js SDK and override base_url to https://cloudgpu.app/v1. The /v1/chat/completions surface is request-compatible. Anthropic SDK compatibility is on the roadmap.
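A minimal sketch with the OpenAI Python SDK (`pip install openai`) might look like this. The base URL and model ID come from this page; the `CLOUDGPU_API_KEY` environment variable and the helper names are hypothetical.

```python
import os

BASE_URL = "https://cloudgpu.app/v1"  # gateway endpoint from the answer above


def make_client(api_key=None):
    """Build an OpenAI SDK client pointed at the gateway."""
    # Imported here so ask() below can be used with any compatible client.
    from openai import OpenAI

    # CLOUDGPU_API_KEY is a hypothetical variable name; use your own.
    return OpenAI(base_url=BASE_URL, api_key=api_key or os.environ["CLOUDGPU_API_KEY"])


def ask(client, prompt, model="deepseek-v4-flash"):
    """Send one user message and return the text of the first choice."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Usage, once a key is set in the environment:
#   print(ask(make_client(), "Hello"))
```

Keeping the client construction separate from `ask` means the same helper should work unchanged if you later point `base_url` at a different OpenAI-compatible endpoint.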

$0.60 free credit on signup

Enough for roughly 4M input tokens on deepseek-v4-flash at $0.14 per 1M. No credit card. Same balance covers GPU rentals.

Create your API key