Simple, Transparent Pricing

Pay only for what you use. No hidden fees. No egress charges. Cancel anytime.

Up to 60% cheaper

than RunPod on the same GPU models

Multi-Region Infrastructure

GPUs in APAC and beyond. Low latency for developers in India, Southeast Asia, Latin America, and Europe.

60-second deploy

Browser-based. No SSH, no setup.

GPU Model	Stock	CloudGPU	Vast.ai	RunPod
RTX 3090 24G (Supplied via P2P agents)	Low stock	$0.20/hr	$0.22/hr	$0.22/hr
RTX 4090 24G	Low stock	$0.32/hr	$0.32/hr	$0.34/hr
RTX 4090D 48G (China-spec 4090, 48 GB VRAM)	Low stock	$0.59/hr	N/A	N/A
RTX 5090 32G	Low stock	$0.49/hr	N/A	N/A
A100 40G	Low stock	$0.89/hr	$0.80/hr	$1.89/hr
L20 48G (China-spec Ada, big VRAM)	Low stock	$0.89/hr	N/A	$1.20/hr
L40 48G	Low stock	$0.79/hr	N/A	$1.19/hr
L40S 48G	Low stock	$1.19/hr	$1.65/hr	$1.89/hr
H20 96G (China-spec Hopper, huge HBM3)	Low stock	$1.59/hr	N/A	N/A
Ascend 910B 64G (NPU; CANN / MindSpore / PyTorch-NPU only)	Low stock	$0.85/hr	N/A	N/A
A800 80G (China-spec A100)	Low stock	$1.39/hr	N/A	$2.17/hr
H800 80G (China-spec H100, NVLink capped at 400 GB/s)	Low stock	$3.49/hr	N/A	$3.99/hr

Vast.ai and RunPod rates were observed in April 2026 for the equivalent SKU. "N/A" means the competitor does not list that GPU in its catalog.
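To check the comparison yourself, here is a minimal sketch that computes the percentage saved per GPU from a few rows of the table above. The `RATES` dictionary and the `savings_pct` helper are illustrative names, not part of any CloudGPU API, and only a subset of rows is copied in.

```python
# Hourly rates in USD, copied from a few rows of the pricing table.
# None stands for "N/A" (the competitor does not list that GPU).
RATES = {
    "RTX 3090 24G": {"CloudGPU": 0.20, "Vast.ai": 0.22, "RunPod": 0.22},
    "RTX 4090 24G": {"CloudGPU": 0.32, "Vast.ai": 0.32, "RunPod": 0.34},
    "A100 40G": {"CloudGPU": 0.89, "Vast.ai": 0.80, "RunPod": 1.89},
    "L40S 48G": {"CloudGPU": 1.19, "Vast.ai": 1.65, "RunPod": 1.89},
    "H800 80G": {"CloudGPU": 3.49, "Vast.ai": None, "RunPod": 3.99},
}


def savings_pct(gpu, competitor, rates=RATES):
    """Percent saved on CloudGPU relative to a competitor.

    Returns None when the competitor has no listing. A negative value
    means the competitor is cheaper for that GPU.
    """
    ours = rates[gpu]["CloudGPU"]
    theirs = rates[gpu][competitor]
    if theirs is None:
        return None
    return (theirs - ours) / theirs * 100


if __name__ == "__main__":
    for gpu in RATES:
        for competitor in ("Vast.ai", "RunPod"):
            pct = savings_pct(gpu, competitor)
            label = "N/A" if pct is None else f"{pct:.1f}%"
            print(f"{gpu} vs {competitor}: {label}")
```

Savings vary by model, so it is worth running the numbers for the specific GPU you plan to rent.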

Performance & Workload Fit

Price alone doesn't tell you which GPU is right. L20, H20, and Ascend 910B are China-spec cards that most US clouds don't sell, and their VRAM and bandwidth profiles are worth comparing.

GPU	Arch	VRAM	FP16 TFLOPS	Mem BW	Best For
RTX 3090 24G (Supplied via P2P agents)	Ampere	24 GB	36	936 GB/s	LLM inference ≤ 7B, Stable Diffusion, cheap dev box
RTX 4090 24G	Ada	24 GB	73	1008 GB/s	Fine-tuning 7B-13B, Flux / SDXL, fast inference
RTX 4090D 48G (China-spec 4090, 48 GB VRAM)	Ada	48 GB	73	1008 GB/s	Fine-tuning 13B-30B at higher batch sizes, 70B quantized inference
RTX 5090 32G	Blackwell	32 GB	104	1792 GB/s	Latest consumer flagship: Blackwell features, SD/Flux speed
A100 40G	Ampere	40 GB	312	1555 GB/s	LLM training / fine-tune 13B-30B, research
L20 48G (China-spec Ada, big VRAM)	Ada	48 GB	119	864 GB/s	32B inference, multi-model serving, VRAM-heavy workloads
L40 48G	Ada	48 GB	90	864 GB/s	32B inference, visualization + compute combined
L40S 48G	Ada	48 GB	183	864 GB/s	High-end inference + light training, better than L40
H20 96G (China-spec Hopper, huge HBM3)	Hopper	96 GB	148	4000 GB/s	70B-120B inference, long-context, MoE models
Ascend 910B 64G (NPU; CANN / MindSpore / PyTorch-NPU only)	Ascend	64 GB	320	1200 GB/s	Domestic-compliant training, MindSpore/PyTorch-NPU workloads
A800 80G (China-spec A100)	Ampere	80 GB	312	2039 GB/s	FP32/FP16 training 30B-70B, large-memory research
H800 80G (China-spec H100, NVLink capped at 400 GB/s)	Hopper	80 GB	989	3350 GB/s	Frontier training, 70B+ fine-tuning, FP8 inference

VRAM determines model size

24 GB fits 7B at FP16 / 30B quantized. 48 GB fits 13B at FP16 / 70B quantized. 96 GB comfortably runs 70B at 8-bit; 70B at FP16 needs about 140 GB for weights alone, so it takes more than one card.
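The sizing rule behind these numbers is roughly parameters times bytes per parameter, plus headroom for the KV cache and activations. The sketch below is a rough estimate under stated assumptions: the 20% overhead factor is a placeholder, "quantized" is taken to mean 4-bit weights, and the function names are illustrative.

```python
# Bytes needed per parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params_billion, precision):
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion, precision, vram_gb, overhead=0.2):
    """True if weights plus an assumed fractional overhead fit in vram_gb.

    overhead=0.2 is an assumption covering KV cache and activations;
    long contexts or large batches need more.
    """
    needed = weight_gb(params_billion, precision) * (1 + overhead)
    return needed <= vram_gb


if __name__ == "__main__":
    print(fits(7, "fp16", 24))    # 7B at FP16 on a 24 GB card
    print(fits(30, "int4", 24))   # 30B at 4-bit on a 24 GB card
    print(fits(13, "fp16", 48))   # 13B at FP16 on a 48 GB card
    print(fits(70, "int4", 48))   # 70B at 4-bit on a 48 GB card
    print(fits(70, "int8", 96))   # 70B at 8-bit on a 96 GB card
```

Treat the result as a first filter only; real usage depends on context length, batch size, and the serving framework.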

Mem bandwidth = tokens/sec

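For LLM inference, token generation is usually limited by memory bandwidth rather than TFLOPS, because each new token requires reading the model weights from VRAM. H20's 4 TB/s beats the A100 40G's 1.6 TB/s for serving large models.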
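A back-of-the-envelope way to see this: for single-stream decoding, every generated token reads all weights once, so tokens per second cannot exceed bandwidth divided by model size. The sketch below uses the Mem BW column from the table; it is a theoretical ceiling, not a benchmark, and the function name is illustrative.

```python
def max_decode_tokens_per_sec(mem_bw_gb_s, params_billion, bytes_per_param=2.0):
    """Upper bound on single-stream decode speed.

    Assumes each token requires one full read of the weights and ignores
    KV-cache traffic, kernel overhead, and batching, so real throughput
    per stream will be lower.
    """
    model_gb = params_billion * bytes_per_param
    return mem_bw_gb_s / model_gb


if __name__ == "__main__":
    # 13B model at FP16 (about 26 GB of weights), bandwidths from the table.
    for name, bw in (("A100 40G", 1555), ("H20 96G", 4000)):
        ceiling = max_decode_tokens_per_sec(bw, 13)
        print(f"{name}: at most {ceiling:.0f} tokens/sec per stream")
```

Batching changes the picture, since several requests share one pass over the weights, but the ranking between cards tends to follow bandwidth.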

FP16 TFLOPS = training speed

A100's 312 TFLOPS shines for fine-tuning. The 910B's 320 TFLOPS is a paper figure; actual speed depends on the maturity of the CANN/MindSpore ecosystem.
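To turn a TFLOPS figure into a rough time estimate, a common rule of thumb puts training compute at about 6 FLOPs per parameter per token. The sketch below applies that rule with a utilization factor (`mfu`) standing in for how much of the peak the software stack actually delivers. The 0.4 and 0.2 values are assumptions for illustration, not measurements of any card, and the estimate covers compute only, not whether the job fits in memory.

```python
def training_hours(params_billion, tokens_billion, peak_tflops, mfu=0.4):
    """Estimated hours to train on one accelerator.

    Uses the ~6 * parameters * tokens approximation for total FLOPs.
    mfu is the assumed fraction of peak throughput actually sustained.
    """
    total_flops = 6 * (params_billion * 1e9) * (tokens_billion * 1e9)
    sustained_flops_per_sec = peak_tflops * 1e12 * mfu
    return total_flops / sustained_flops_per_sec / 3600


if __name__ == "__main__":
    # 7B parameters, 1B tokens, peak figures from the table.
    print(f"A100 at 40% utilization: {training_hours(7, 1, 312):.1f} h")
    # Hypothetical: a similar peak figure with half the utilization.
    print(f"910B at 20% utilization: {training_hours(7, 1, 320, mfu=0.2):.1f} h")
```

The second line shows why a similar peak number does not guarantee similar speed: halving utilization roughly doubles the run time.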

Cost Calculator

GPU Model: RTX 4090 24G

GPU Count: 1x

Hours per Day: 8h

Number of Days: 7 days

1x RTX 4090 24G · 56 total hours

$17.92

Save $1.12 vs RunPod

6% cheaper than RunPod
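The arithmetic behind this example is simple enough to reproduce. The sketch below is not the calculator's actual code; it assumes cost scales linearly with GPU count and hours, and uses `Decimal` to avoid floating-point rounding on currency. The function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def rental_cost(rate_per_hr, gpu_count, hours_per_day, days):
    """Return (hours, total cost in USD) for a rental.

    rate_per_hr is passed as a string such as "0.32" so Decimal keeps
    it exact. Linear scaling with gpu_count is an assumption.
    """
    hours = hours_per_day * days
    total = Decimal(rate_per_hr) * gpu_count * hours
    return hours, total.quantize(CENT, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    hours, ours = rental_cost("0.32", 1, 8, 7)   # CloudGPU RTX 4090 rate
    _, runpod = rental_cost("0.34", 1, 8, 7)     # RunPod RTX 4090 rate
    saving = runpod - ours
    pct = (saving / runpod * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    print(f"{hours} total hours: ${ours}, save ${saving} ({pct}% cheaper)")
    # prints "56 total hours: $17.92, save $1.12 (6% cheaper)"
```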


Pricing Plans

On-Demand

Pay as you go. No commitment.

Market Rate

- Per-hour billing
- Cancel anytime
- No minimum spend
- All GPU models


Monthly (Best Value)

Commit monthly, save 30%.

30% off hourly rates

- Everything in On-Demand
- 30% discount on all GPUs
- Priority provisioning
- Email support
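As a quick illustration of what the monthly discount does to an hourly rate, here is a minimal sketch. It assumes the 30% is applied directly to the listed on-demand rate; how the billed amount is rounded is not stated on this page, so the three-decimal rounding here is an assumption, as is the function name.

```python
from decimal import Decimal


def monthly_rate(on_demand_rate, discount=Decimal("0.30")):
    """Hourly rate after the monthly-plan discount.

    on_demand_rate is a string such as "0.32". Rounding to three
    decimals is an assumption for display purposes only.
    """
    discounted = Decimal(on_demand_rate) * (1 - discount)
    return discounted.quantize(Decimal("0.001"))


if __name__ == "__main__":
    for gpu, rate in (("RTX 4090 24G", "0.32"), ("A100 40G", "0.89")):
        print(f"{gpu}: ${rate}/hr on demand, ${monthly_rate(rate)}/hr monthly")
```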

Prepaid Credits

Buy credits in bulk, get bonus.

$100 → $110

10% bonus credits

- 10% bonus on all top-ups
- Credits never expire
- Use on any GPU model
- Transferable balance
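A 10% bonus is not quite the same as a 10% discount, so it can help to see both numbers. The sketch below works through the $100 to $110 example; the function names are illustrative.

```python
from decimal import Decimal


def credits_received(top_up_usd, bonus=Decimal("0.10")):
    """Credits granted for a top-up, including the bonus."""
    return (Decimal(top_up_usd) * (1 + bonus)).quantize(Decimal("0.01"))


def effective_discount_pct(bonus=Decimal("0.10")):
    """Equivalent price discount for a given bonus rate, in percent."""
    return (bonus / (1 + bonus) * 100).quantize(Decimal("0.1"))


if __name__ == "__main__":
    credits = credits_received("100")
    print(f"$100 buys ${credits} in credits")
    print(f"Equivalent discount: {effective_discount_pct()}%")
    # GPU-hours those credits buy at the RTX 4090 on-demand rate.
    print(f"RTX 4090 hours: {credits / Decimal('0.32')}")
```

In other words, paying $100 for $110 of usage works out to roughly a 9.1% saving on each hour.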


No hidden fees · No egress charges · Per-hour billing · Cancel anytime