GPU · dedicated cards

A dedicated GPU from ₹44.554 an hour.

Every GPU instance gets a whole NVIDIA RTX Pro Blackwell card, not a time-slice of one. The VM underneath is NVMe-backed, the meter is hourly, and the rate on this page is the rate on the invoice.

GPU access starts with an email to support@excloud.dev; details are below.

nv2a.xlarge · RTX 4500 Pro Blackwell · 32 GiB VRAM

one hour: ₹44.554
a working day (8 h): ₹356.43
around the clock (24 h): ₹1,069.30

Stop the instance when the run finishes and the meter stops with it.

The cards

Three cards, sized by VRAM.

Pick by the model you want to fit. All three are the current RTX Pro Blackwell generation, and the docs list the same workloads for each: LLMs, GPU inference, and professional visualization.

Card                   | Instance     | vCPU | RAM    | VRAM   | On-demand
RTX 4500 Pro Blackwell | nv2a.xlarge  | 4    | 16 GiB | 32 GiB | ₹44.554/hr
RTX 5000 Pro Blackwell | nv3a.2xlarge | 8    | 32 GiB | 48 GiB | ₹63.849/hr
RTX 6000 Pro Blackwell | nv1a.4xlarge | 16   | 64 GiB | 96 GiB | ₹126.784/hr

Disks are EBS, priced like everywhere else on Excloud: ₹4/GB·mo for NVMe block storage, egress flat at ₹1/GiB, ingress free. Every rate lives on the rate card.

The math

Price out the run before you start it.

Hourly billing means a GPU job has a known cost before you submit it. An eight-hour fine-tune on an nv2a.xlarge is 8 × ₹44.554, which comes to ₹356.43. If the run finishes early, so does the bill.

The 96 GiB card works the same way. A full day on an nv1a.4xlarge is 24 × ₹126.784 = ₹3,042.82, with nothing on top except disk storage and whatever data you move out.

8 h fine-tune · nv2a.xlarge: 8 × ₹44.554 = ₹356.43

24 h inference · nv1a.4xlarge: 24 × ₹126.784 = ₹3,042.82
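The arithmetic above is simple enough to script before you submit a job. The sketch below is one way to do it in Python; the rates are copied from the table on this page, while the function name, the whole-hour input, and the round-to-paise rule are assumptions made for illustration, not a description of how the invoice is computed.

```python
from decimal import Decimal, ROUND_HALF_UP

# On-demand hourly rates in INR, copied from the table on this page.
HOURLY_RATE_INR = {
    "nv2a.xlarge": Decimal("44.554"),
    "nv3a.2xlarge": Decimal("63.849"),
    "nv1a.4xlarge": Decimal("126.784"),
}


def run_cost(instance: str, hours: int) -> Decimal:
    """Return hours x hourly rate for one instance, in INR.

    Assumption: whole hours only, rounded half-up to two decimals.
    Disk and egress charges are not included.
    """
    cost = HOURLY_RATE_INR[instance] * Decimal(hours)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(run_cost("nv2a.xlarge", 8))    # 356.43
print(run_cost("nv1a.4xlarge", 24))  # 3042.82
```

Decimal is used instead of float so the three-decimal rates multiply exactly and the rounding step is explicit.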

Access

GPU access starts with an email, and that's deliberate.

Every GPU here is a physical card in our racks in Mumbai, and we'd rather hand them to people who will actually use them than let a signup script reserve the lot. So new accounts start with zero GPU quota, and you write to us to raise it. Tell us roughly what you're running and which card you want; a human reads it and replies.

What people run

Models on your card, or tokens on ours.

Self-host with Ollama

The 32 and 48 GiB cards are a comfortable fit for open-weight models served through Ollama; the 96 GiB card takes the larger quantizations. The docs walk through the whole setup on an Excloud VM.
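Once Ollama is running on the VM, a model can be queried over its local HTTP API. The following is a minimal sketch, assuming Ollama is listening on its default port 11434 on the same machine and that a model has already been pulled; the helper names are hypothetical, and the model name you pass in is whatever you have installed.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate with the name of a model you have pulled and a short prompt returns the completion as a string. With "stream" set to false, Ollama replies with a single JSON object rather than a sequence of chunks, which keeps the client to a few lines.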

Visualization and rendering

The same cards the docs list for inference also carry professional visualization work. A dedicated card means your viewport isn't sharing the GPU with a stranger's training job.

Skip the card entirely

If you only need tokens, our hosted Qwen3.6-27B costs ₹20 per million input tokens and ₹60 per million output. No quota request, no idle hours.
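Token pricing can be estimated the same way as GPU hours. This is a small sketch using the two per-million rates quoted above; the function name and the example token counts are made up for illustration.

```python
from decimal import Decimal

# Hosted model rates in INR per million tokens, as quoted on this page.
INPUT_INR_PER_MILLION = Decimal("20")
OUTPUT_INR_PER_MILLION = Decimal("60")


def token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the INR cost for a given number of input and output tokens."""
    total = (input_tokens * INPUT_INR_PER_MILLION
             + output_tokens * OUTPUT_INR_PER_MILLION)
    return total / Decimal(1_000_000)


# Hypothetical workload: 2 million tokens in, 500,000 tokens out.
print(token_cost(2_000_000, 500_000))  # 70
```

At these rates the example workload comes to ₹70, which you can compare against the hourly card price for the same job.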

Get started

Tell us what you want to run.

One email opens your GPU quota. After that, an evening of experiments on the smallest card costs about ₹178 for four hours, and stopping the instance ends the bill.