AI API Gateway · Beta

DeepSeek API, paid in USDT.
No VISA. No Alipay. No KYC.

OpenAI-compatible drop-in for Cursor, Cline, OpenClaw — same DeepSeek you'd get direct, at the same discount price (passthrough), but accessible to anyone with a USDT wallet. Built for developers whose payment rails keep getting declined upstream.

Get your API key · Read the docs

USDT, no Visa needed

Top up via USDT TRC-20. The way to pay if your bank keeps declining DeepSeek's Stripe checkout. 3% top-up fee, that's it.

One balance, many models

DeepSeek today, Qwen / GLM / Hermes-3 coming. One USDT top-up, one bill, no juggling accounts across upstream providers.

OpenAI-compatible

Drop into the OpenAI SDK. Change base_url and key — done. Works with LangChain, AutoGen, CrewAI, OpenClaw, Cline, Continue.

Shared with GPU rental

Same USDT balance powers GPU rentals AND API calls. New users get $0.60 signup credit to try both. One account, two products.

Passthrough pricing

You pay exactly what we pay the upstream model provider. Zero per-token markup. The only fee on top is 3% when you top up your USDT balance.
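A rough sketch of how the 3% top-up fee feeds into what you effectively pay per token. This page does not say whether the fee is deducted from the deposit or added on top, so the helpers below assume it is deducted and that 1 USDT is credited as 1 USD. Both function names and both assumptions are illustrative, not the gateway's documented billing logic.

```python
TOP_UP_FEE = 0.03  # 3% top-up fee stated on this page


def credited_balance(deposit_usdt: float, fee: float = TOP_UP_FEE) -> float:
    """USD balance after a top-up.

    ASSUMPTION: the fee is deducted from the deposit and 1 USDT == 1 USD.
    """
    return deposit_usdt * (1 - fee)


def effective_price(list_price: float, fee: float = TOP_UP_FEE) -> float:
    """List price per 1M tokens with the top-up fee folded in (same assumption)."""
    return list_price / (1 - fee)


if __name__ == "__main__":
    # Balance credited for a 100 USDT deposit.
    print(round(credited_balance(100.0), 2))
    # deepseek-v4-flash output price per 1M tokens, fee included.
    print(round(effective_price(0.28), 4))
```

Under that assumption the fee raises the effective price by a little over 3%, which is the number to use when comparing against a provider that charges a different top-up fee.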

Model	Model ID	$/1M input	Cached input	$/1M output	Best for	Status
DeepSeek V4 Flash	deepseek-v4-flash	$0.14	$0.0028	$0.28	Code, general chat, agents (non-thinking mode)	Live
DeepSeek V4 Pro	deepseek-v4-pro	$0.435	$0.003625	$0.87	Chain-of-thought reasoning, complex agents (thinking mode)	Live
Qwen3 Plus	qwen-plus	$0.40	—	$2.40	Long context, multilingual	Coming
Qwen3 Max	qwen-max	$1.20	—	$6.00	Top-tier reasoning, multimodal	Coming
GLM-4.7	glm-4.7	$0.60	—	$2.20	Chinese-language tasks, RAG	Live

All prices in USD per million tokens. Cache-hit pricing applies automatically when the upstream cache fires; we pass the discount straight through to you.
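To make the cache-hit discount concrete, here is a minimal per-request cost estimator built from the two live DeepSeek rows of the table above. It assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate; the function name and that billing split are our illustration, not a documented formula.

```python
# USD per 1M tokens, copied from the pricing table on this page.
PRICES = {
    "deepseek-v4-flash": {"input": 0.14, "cached_input": 0.0028, "output": 0.28},
    "deepseek-v4-pro": {"input": 0.435, "cached_input": 0.003625, "output": 0.87},
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    ASSUMPTION: cached_tokens is the part of input_tokens served from the
    upstream cache, billed at the cached rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    uncached = input_tokens - cached_tokens
    total = (uncached * price["input"]
             + cached_tokens * price["cached_input"]
             + output_tokens * price["output"])
    return total / 1_000_000


if __name__ == "__main__":
    # A 6K-token prompt with a 500-token reply, without and with a cache hit.
    print(f"{request_cost('deepseek-v4-flash', 6000, 500):.6f}")
    print(f"{request_cost('deepseek-v4-flash', 6000, 500, cached_tokens=5000):.6f}")
```

With these numbers a cache hit on 5,000 of the 6,000 prompt tokens cuts the request from about $0.00098 to about $0.00029, which is why long, repeated system prompts benefit most.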

DeepSeek V3 output, side by side

Provider	$/1M output	Notes
DeepSeek direct	$0.28	Best price, requires international card, no aggregator
CloudGPU AI Gateway	$0.28	Passthrough, USDT top-up, multi-model abstraction
OpenRouter	$0.28	+5.5% top-up fee, no APAC optimization
Together AI	~$0.88	~3× markup over upstream
Fireworks AI	~$0.90	~3× markup over upstream

Competitor rates observed May 2026. Verify on their pricing pages before locking in your budget.

What you would actually pay

Five realistic monthly workloads, billed at each provider's public price. Token counts are typical patterns we see in the open-source agent ecosystem.

Workload (model · volume)	Where it shows up	Monthly tokens	cloudgpu	gpt-4o-mini	gpt-4o	Claude Sonnet
IDE code completion (deepseek-chat · ~400 completions/day)	Cursor, Continue.dev, Tabby, Codeium-like inline autocomplete	18M in + 2M out	$3.08	$3.90	$65	$84
RAG production API (deepseek-chat · 200 queries/day)	LangChain, LlamaIndex, Haystack with 6K-token retrieved context	26M in + 2M out	$4.19	$5.02	$84	$106
Multi-agent code task (deepseek-reasoner · 5 sessions/day)	AutoGen, CrewAI, OpenClaw, with 4 agents bouncing per feature	9M in + 2M out	$5.27	$2.31 (mini tier, weaker quality)	$39	$51
Autonomous coding agent (deepseek-reasoner · full workday)	Cline, aider, Claude-Code-style terminal agents reading whole repos	7M in + 1M out	$3.44	$1.39 (mini tier, weaker quality)	$23	$30
Function-calling automation (deepseek-chat · 1000 calls/day)	Hermes-3 style pipelines, GPT-Researcher, n8n + LLM nodes	66M in + 11M out	$12.32	$16.50	$275	$363

Chat-tier workloads (completion, RAG, simple agents)

deepseek-chat sits ~20% under gpt-4o-mini on output-heavy work and matches on input-heavy. Quality is comparable for non-reasoning tasks. The savings get larger as token volume grows.

Reasoning workloads (planning, complex agents, autonomous coding)

deepseek-reasoner performs at roughly gpt-4o / Claude Sonnet level (per HumanEval, MATH, BigCodeBench) for 5-10× less. gpt-4o-mini is cheaper on paper for these workloads, but it gives up too much accuracy: agents loop, RAG hallucinates, code patches don't apply. That's the real cost ratio.

Competitor prices: OpenAI gpt-4o-mini $0.15/$0.60, gpt-4o $2.50/$10.00; Anthropic Claude Sonnet 4 $3.00/$15.00 per million tokens (input/output) as of May 2026. Verify on each provider's pricing page before locking in your budget. Token counts are illustrative — your actual usage will vary.
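The comparison table can be reproduced with a few lines of arithmetic. The sketch below uses the competitor prices quoted above and assumes deepseek-chat is billed at the deepseek-v4-flash rate ($0.14 in / $0.28 out), which is the rate that reproduces the $3.08 figure for the IDE completion row. The dictionary keys and function names are illustrative.

```python
# (input, output) prices in USD per 1M tokens.
# ASSUMPTION: deepseek-chat is billed at the deepseek-v4-flash rate.
PRICES = {
    "cloudgpu deepseek-chat": (0.14, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
}


def monthly_cost(input_millions: float, output_millions: float,
                 price: tuple[float, float]) -> float:
    """Monthly USD cost for a workload measured in millions of tokens."""
    in_price, out_price = price
    return input_millions * in_price + output_millions * out_price


def compare(input_millions: float, output_millions: float) -> dict[str, float]:
    """Cost of the same workload at every provider, rounded to cents."""
    return {
        name: round(monthly_cost(input_millions, output_millions, price), 2)
        for name, price in PRICES.items()
    }


if __name__ == "__main__":
    # IDE code completion row: 18M input + 2M output tokens per month.
    for name, cost in compare(18, 2).items():
        print(f"{name}: ${cost:.2f}")
```

Swap in your own monthly token counts to see where your workload lands; the other rows of the table use rounded token counts, so they will match only approximately.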

Open-source starter · MIT

Deploy a WhatsApp Business autoresponder in 10 minutes

Production-ready Node.js bot — Meta WhatsApp Cloud API + DeepSeek-V4 via cloudgpu. Editable faq.yml for the shop owner, $0 cost on FAQ hits and ~$0.0001 per AI reply. One-click deploy to Railway or Fly.io.

View on GitHub → Deploy on Railway → Deploy on Fly.io →

Built for small businesses in markets where Stripe-based AI subscriptions keep getting declined. Indonesia, Vietnam, Brazil, Mexico, Nigeria, India, the Philippines — anywhere WhatsApp is the primary customer-comms channel and the going SaaS option starts at $40/mo.

5-minute quick start

curl https://cloudgpu.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cgw-sk-..." \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Full docs → Python, Node.js, OpenAI SDK migration

Common questions

If I have a working Visa, why not just go direct to DeepSeek?

Honest answer: if your card works at platform.deepseek.com and you only ever need DeepSeek, going direct is fine. You save the 3% top-up fee and skip the extra network hop we add. We exist for developers whose cards keep declining (common for users in CN, SEA, LatAm, Africa), users who want USDT-only payment, and developers who want a single account that bills DeepSeek + Qwen + GLM + more (rolling out). If none of those describe you, go direct.

How does this compare to OpenRouter?

OpenRouter wins on model breadth (290+ models). We win on focus (Chinese open-source models, no Western model padding) and on USDT payment (OpenRouter takes USDT, but its top-up fee is 5.5% vs our 3%). For Chinese model users specifically, we're cheaper. For Western model users (GPT-4, Claude, Llama via providers), OpenRouter has them and we don't.

Is my data sent to mainland China?

No. Requests route through Singapore / Hong Kong edge nodes to upstream model providers. We do not log prompt or completion content; we record token counts and request metadata only. Beta status — production-grade DPA available on request once a corporate entity is established.

What payment methods do you accept?

USDT (TRC-20) only at v0. Top up your balance with USDT; per-request costs deduct in USD. The 3% top-up fee covers gas, conversion and platform overhead. Credit cards and cryptocurrencies other than USDT are on the roadmap.

What about rate limits?

100 requests per minute per API key, 10,000 requests per day per account. These are anti-abuse defaults for the public beta and can be raised on request once you're past the trial.
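If you run agents in a loop, it can help to throttle on the client side so you never hit the per-minute limit. Below is a minimal sliding-window limiter sized to the 100 requests per minute default. The class name is ours, it does not track the 10,000 per day account cap, and it is a sketch rather than anything the gateway ships.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window of `period` seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        while (delay := self.wait_time()) > 0:
            self.sleep(delay)
        self.calls.append(self.clock())


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()  # 100 calls per 60 s
    limiter.acquire()                 # call this before each API request
    print(len(limiter.calls))
```

Use one limiter per API key, since the per-minute limit is applied per key.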

Will the OpenAI SDK work?

Yes. Use the OpenAI Python or Node.js SDK and override base_url to https://cloudgpu.app/v1. The /v1/chat/completions surface is request-compatible. Anthropic SDK compatibility is on the roadmap.
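A minimal sketch of the base_url override with the OpenAI Python SDK (version 1.x, installed via `pip install openai`). The environment variable name CLOUDGPU_API_KEY and the helper names are our choices for illustration; only the base URL and the model ID come from this page.

```python
import os

BASE_URL = "https://cloudgpu.app/v1"  # gateway endpoint from this page


def client_kwargs(api_key=None):
    """Keyword arguments for openai.OpenAI(...).

    CLOUDGPU_API_KEY is a hypothetical environment variable name.
    """
    return {
        "base_url": BASE_URL,
        "api_key": api_key or os.environ["CLOUDGPU_API_KEY"],
    }


def ask(prompt, model="deepseek-v4-flash", api_key=None):
    """Send one user message and return the reply text."""
    # Imported here so the module loads even without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**client_kwargs(api_key))
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Usage (performs a real, billed request):
#   print(ask("Hello"))
```

Existing OpenAI SDK code should only need the two values in client_kwargs changed; the rest of the call stays the same.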

$0.60 free credit on signup

Enough for roughly 4M input tokens (or about 2M output tokens) on deepseek-v4-flash. No credit card. Same balance covers GPU rentals.

Create your API key
