GPU · dedicated cards
A dedicated GPU from ₹44.554 an hour.
Every GPU instance gets a whole NVIDIA RTX Pro Blackwell card, not a time-slice of one. The VM underneath is NVMe-backed, the meter is hourly, and the rate on this page is the rate on the invoice.
GPU access starts with an email to support@excloud.dev; details are below.
nv2a.xlarge · RTX 4500 Pro Blackwell · 32 GiB VRAM
- one hour: ₹44.554
- a working day, 8 h: ₹356.43
- around the clock, 24 h: ₹1,069.30
The cards
Three cards, sized by VRAM.
Pick by the model you want to fit. All three are the current RTX Pro Blackwell generation, and the docs list the same workloads for each: LLMs, GPU inference, and professional visualization.
| Card | Instance | vCPU | RAM | VRAM | On-demand |
|---|---|---|---|---|---|
| RTX 4500 Pro Blackwell | nv2a.xlarge | 4 | 16 GiB | 32 GiB | ₹44.554/hr |
| RTX 5000 Pro Blackwell | nv3a.2xlarge | 8 | 32 GiB | 48 GiB | ₹63.849/hr |
| RTX 6000 Pro Blackwell | nv1a.4xlarge | 16 | 64 GiB | 96 GiB | ₹126.784/hr |
Disks are EBS, priced like everywhere else on Excloud: ₹4/GB·mo for NVMe block storage, egress flat at ₹1/GiB, ingress free. Every rate lives on the rate card.
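To see what disk and transfer add to a bill, here is a minimal sketch using only the three rates quoted above. The function name and the example workload are hypothetical, and the sketch assumes whole months of storage; how partial months are prorated is not stated on this page.

```python
from decimal import Decimal

# Rates in INR as quoted on this page.
DISK_INR_PER_GB_MONTH = Decimal("4")  # NVMe block storage
EGRESS_INR_PER_GIB = Decimal("1")     # flat egress
INGRESS_INR_PER_GIB = Decimal("0")    # ingress is free


def storage_and_transfer_cost(disk_gb, months, egress_gib, ingress_gib=0):
    """Return disk plus transfer cost in INR (hypothetical helper).

    Assumes storage is billed for whole months; proration is an assumption
    this page does not confirm either way.
    """
    disk = DISK_INR_PER_GB_MONTH * Decimal(str(disk_gb)) * Decimal(str(months))
    egress = EGRESS_INR_PER_GIB * Decimal(str(egress_gib))
    ingress = INGRESS_INR_PER_GIB * Decimal(str(ingress_gib))
    return disk + egress + ingress


# Example workload (made up): a 200 GB disk for one month,
# 500 GiB of data uploaded, 50 GiB downloaded.
print(storage_and_transfer_cost(disk_gb=200, months=1, egress_gib=50, ingress_gib=500))
```

In this example the disk accounts for ₹800 and egress for ₹50; the 500 GiB upload adds nothing.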
The math
Price out the run before you start it.
Hourly billing means a GPU job has a known cost before you submit it. An eight-hour fine-tune on an nv2a.xlarge is 8 × ₹44.554, which comes to ₹356.43. If the run finishes early, so does the bill.
The 96 GiB card works the same way. A full day on an nv1a.4xlarge is 24 × ₹126.784 = ₹3,042.82, and nothing on top of it except the disk and whatever you move out.
8 h fine-tune · nv2a.xlarge: 8 × ₹44.554 = ₹356.43
24 h inference · nv1a.4xlarge: 24 × ₹126.784 = ₹3,042.82
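The same arithmetic can be scripted before a run is submitted. This is a sketch, not an official calculator: the rates are copied from the table on this page, `run_cost` is a hypothetical helper, and rounding half-up to two decimals is an assumption about how the invoice rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

# On-demand hourly rates in INR, copied from the table on this page.
HOURLY_RATE_INR = {
    "nv2a.xlarge": Decimal("44.554"),
    "nv3a.2xlarge": Decimal("63.849"),
    "nv1a.4xlarge": Decimal("126.784"),
}


def run_cost(instance, hours):
    """Return hours x hourly rate in INR, rounded to two decimals.

    Half-up rounding is an assumption; the page does not say how
    invoices round.
    """
    raw = HOURLY_RATE_INR[instance] * Decimal(str(hours))
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(run_cost("nv2a.xlarge", 8))    # 356.43
print(run_cost("nv1a.4xlarge", 24))  # 3042.82
```

Disk and egress are billed separately, so they are left out of this figure.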
Access
GPU access starts with an email, and that's deliberate.
Every GPU here is a physical card in our racks in Mumbai, and we'd rather hand them to people who will actually use them than let a signup script reserve the lot. So new accounts start with zero GPU quota, and you write to us to raise it. Tell us roughly what you're running and which card you want; a human reads it and replies.
What people run
Models on your card, or tokens on ours.
Self-host with Ollama
The 32 and 48 GiB cards are a comfortable fit for open-weight models served through Ollama; the 96 GiB card takes the larger quantizations. The docs walk through the whole setup on an Excloud VM.
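Once Ollama is running on the instance, a model can be queried over its local HTTP API. The sketch below assumes Ollama's default address (`localhost:11434`) and its `/api/generate` endpoint; the model tag is a placeholder for whatever you have pulled, and the helper names are hypothetical. It is not taken from the Excloud docs.

```python
import json
import urllib.request

# Ollama's default local address; change it if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming generate request (the model tag is a placeholder)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1:8b"):
    """Send the request and return the generated text.

    Needs a running Ollama server with the model already pulled.
    """
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Say hello in one sentence.")` on the VM returns the model's reply as a string.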
Visualization and rendering
The same cards the docs list for inference also carry professional visualization work. A dedicated card means your viewport isn't sharing the GPU with a stranger's training job.
Skip the card entirely
If you only need tokens, our hosted Qwen3.6-27B costs ₹20 per million input tokens and ₹60 per million output tokens. No quota request, no idle hours.
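Token pricing is just as easy to estimate ahead of time. A minimal sketch using the two rates above; `hosted_token_cost` and the example token counts are hypothetical.

```python
from decimal import Decimal

# Hosted model rates in INR per million tokens, as quoted on this page.
INPUT_INR_PER_MILLION = Decimal("20")
OUTPUT_INR_PER_MILLION = Decimal("60")
MILLION = Decimal("1000000")


def hosted_token_cost(input_tokens, output_tokens):
    """Return the cost in INR for the given token counts (hypothetical helper)."""
    total = (INPUT_INR_PER_MILLION * input_tokens
             + OUTPUT_INR_PER_MILLION * output_tokens)
    return total / MILLION


# Example (made up): 2 million input tokens and 500,000 output tokens.
print(hosted_token_cost(2_000_000, 500_000))  # 70
```

For that example the input side costs ₹40 and the output side ₹30.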
Get started
Tell us what you want to run.
One email opens your GPU quota. After that, a four-hour evening of experiments on the smallest card costs about ₹178 (4 × ₹44.554), and stopping the instance ends the bill.