Full transparency: I built CloudGPUTracker because I was frustrated with comparing GPU prices. This test was done before the site existed. No provider paid me. Some of them probably hate me after this.

Why I Did This (And Why You Should Care)

Three months ago, I had a problem. My team needed to fine-tune an LLM, and our cloud bill was getting... uncomfortable. We checked AWS's A100 instance pricing, winced, and realized we'd been paying something stupid like $8/hour for V100s because nobody had time to shop around for a cheaper cloud GPU for deep learning.

I figured there had to be a better way. So I made a spreadsheet, pulled in Azure's GPU offerings for comparison, opened accounts at 12 different providers, and started logging everything. Every dollar. Every crash. Every "your instance is being preempted" email at 3 AM.

The Setup: How I Tested

I wasn't going to trust their marketing pages. Here's what I actually did:

  • Same workload everywhere: A standard LLM fine-tuning job (Llama 2 7B, ~6 hours per run)
  • Same monitoring: Logged actual GPU utilization, not just "is it running"
  • Real bills only: No "credits" or "free trial" nonsense. I paid real money.
  • 3-month period: November 2025 to January 2026

Total spent: $4,213.47 across all providers. My accountant asked some questions.
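The "same monitoring" bullet above boils down to polling `nvidia-smi` on a timer and logging the numbers. Here's a minimal sketch of that idea (the query fields are nvidia-smi's own; the polling loop and return shape are my simplification, not a full tool):

```python
import subprocess
import time

def parse_smi_csv(line: str) -> tuple[float, float]:
    """Parse one line of `nvidia-smi --query-gpu=utilization.gpu,memory.used
    --format=csv,noheader,nounits` into (utilization_percent, memory_used_mib)."""
    util, mem = (field.strip() for field in line.split(","))
    return float(util), float(mem)

def sample_gpu(interval_s: int = 60, samples: int = 3) -> list[tuple[float, float]]:
    """Poll the GPU a few times; requires nvidia-smi on PATH."""
    readings = []
    for _ in range(samples):
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=utilization.gpu,memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        readings.append(parse_smi_csv(out.splitlines()[0]))
        time.sleep(interval_s)
    return readings
```

Dump those readings to a CSV with timestamps and you can tell the difference between "the instance is up" and "the GPU is actually doing work," which turned out to matter a lot later (see Salad).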

The Raw Numbers

Before I tell you stories, here's the data. All prices are for H100 80GB instances, on-demand (not spot), as of January 2026:

| Provider | Price/hour | Availability | My Rating |
| --- | --- | --- | --- |
| Vast.ai | $0.73-0.85 | 60% | ★★★☆☆ |
| RunPod | $0.89 | 85% | ★★★★☆ |
| Lambda Labs | $0.99 | 95% | ★★★★★ |
| Thunder Compute | $0.66 | 50% | ★★★☆☆ |
| TensorDock | $0.75 | 65% | ★★★★☆ |
| CoreWeave | $1.10 | 90% | ★★★★☆ |
| Genesis Cloud | $1.15 | 80% | ★★★☆☆ |
| Paperspace | $1.68 | 95% | ★★★★☆ |
| Crusoe Cloud | $1.30 | 85% | ★★★★☆ |
| Salad | $0.42* | 40% | ★★☆☆☆ |
| Nebius | $0.95 | 70% | ★★★☆☆ |
| FluidStack | $1.05 | 75% | ★★★☆☆ |

* Salad's pricing is weird. More on that below.
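One way to read that table: a cheap instance you can't get isn't cheap. As a crude heuristic (my own toy model, not an industry metric; it ignores retry overhead and preemption), divide the sticker price by the availability rate to estimate what an hour you can actually use costs:

```python
# Price/hour and availability taken from the table above.
providers = {
    "Vast.ai": (0.73, 0.60),
    "Lambda Labs": (0.99, 0.95),
    "Thunder Compute": (0.66, 0.50),
}

def effective_hourly_cost(price: float, availability: float) -> float:
    """Crude 'expected cost of a usable hour': sticker price / availability."""
    return round(price / availability, 2)

for name, (price, avail) in providers.items():
    print(f"{name}: ${effective_hourly_cost(price, avail)}/usable hour")
```

By this measure, Vast.ai's $0.73 behaves more like $1.22, and Lambda's $0.99 at 95% availability (~$1.04 effective) is much closer to the "cheap" options than the sticker prices suggest.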

The Stories (Where It Gets Interesting)

Vast.ai: Cheap When You Can Get It

Vast.ai had the best prices. Hands down. $0.73/hour for an H100 is almost half what AWS charges.

But here's the thing they don't advertise: you can't actually get the machines. I tried to provision an H100 47 times over 3 months. Success rate? 28%.

When it worked, it was great. When it didn't, I was scrambling at 11 PM trying to find compute for a deadline. Not fun.

"Best price, worst availability. Good for experimentation, terrible for production."

Lambda Labs: Boring in the Best Way

Lambda Labs is... fine. That's actually high praise in this industry.

Their price ($0.99/h) isn't the cheapest. But when I clicked "launch instance," it actually launched. Every single time. No "we're out of capacity" messages. No mysterious crashes. Just a working GPU.

I ran 23 jobs on Lambda over 3 months. Zero unexpected interruptions. That's honestly remarkable.

RunPod: The Spot Instance Lottery

I wanted to love RunPod. Their spot prices are insane—I've seen H100s at $0.40/hour. That's cheaper than a sandwich in San Francisco.

But spot instances are a gamble. I documented every interruption:

  • Week 1: 2 interruptions (annoying but manageable)
  • Week 2: 1 interruption (okay, getting better)
  • Week 3: 4 interruptions including one at hour 17 of an 18-hour job (I rage-quit)

My advice? Use RunPod spot for development and testing. For anything production-critical, pay for on-demand or use Lambda.
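The checkpointing I keep mentioning doesn't have to be fancy. Here's a framework-agnostic sketch of the pattern (for real training you'd persist model and optimizer state with something like `torch.save`; the JSON dict here is a stand-in):

```python
import json
import os

CKPT = "checkpoint.json"

def save_checkpoint(step: int, state: dict, path: str = CKPT) -> None:
    """Write to a temp file, then atomically rename, so a preemption
    mid-write never leaves a corrupt half-checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path: str = CKPT) -> tuple[int, dict]:
    """Return (step, state) from the last checkpoint, or (0, {}) on a fresh start."""
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["step"], ckpt["state"]

# The loop resumes from wherever the last preemption left off.
start, state = load_checkpoint()
for step in range(start, 100):
    state["loss"] = 1.0 / (step + 1)   # stand-in for actual training work
    if step % 10 == 0:
        save_checkpoint(step, state)
```

Checkpoint every N steps, not every N hours: an interruption at hour 17 of an 18-hour job then costs you minutes of recompute instead of the whole run.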

Thunder Compute & TensorDock: The New Challengers

I also tested the newcomers, Thunder Compute and TensorDock. Thunder Compute impressed me with an incredible $0.66/hour rate for A100 equivalents, but their UI is still rough around the edges. TensorDock was solid: a marketplace similar to Vast.ai, but with slightly better vetting of hosts. If you're chasing the absolute cheapest cloud GPU for deep learning, these two are worth a look when Vast.ai is out of stock.

Salad: What Is Even Happening

Salad is... weird. They're a "distributed cloud" which basically means you're renting someone's gaming PC. The prices are incredible ($0.42/h) but the experience is unpredictable.

One time I got a machine that clearly had a bitcoin miner running in the background. GPU utilization was stuck at 40% no matter what I did. Another time, the machine just... disappeared mid-training. Not "preempted." Just gone.

I can't recommend Salad for serious work. But if you're a student trying to learn on a $50 budget? Maybe.

The Hidden Costs Nobody Talks About

Here's what the headline prices don't tell you:

1. Egress Fees Will Get You

CoreWeave seemed like a good deal at $1.10/h. Then I got my first bill with a $47 "data transfer" charge. Turns out downloading your checkpoints counts against you. I didn't budget for that.

Fix: Set up persistent storage and don't download models repeatedly. Or use providers with generous egress allowances (Lambda gives you 1TB/month).
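To budget for this, a back-of-envelope estimator helps. The $/GB rate below is purely illustrative (I'm not quoting any provider's actual egress price; check their pricing page), but the shape of the math is the point:

```python
def egress_cost(gb_downloaded: float, rate_per_gb: float) -> float:
    """Data-transfer charge for pulling `gb_downloaded` GB off the provider."""
    return round(gb_downloaded * rate_per_gb, 2)

# A 7B model checkpoint in fp16 is roughly 14 GB. Downloading one after
# each of 25 runs, at a hypothetical $0.09/GB:
print(egress_cost(25 * 14, 0.09))  # 350 GB of transfer
```

A few hundred GB of checkpoint downloads at typical big-cloud rates is exactly how a surprise $47 line item shows up on an otherwise small bill.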

2. Setup Time Isn't Free

Some providers start charging the moment you click "launch," even while the machine is still booting. I timed one provider at 8 minutes of boot time before I could even SSH in. At $2/hour, that's about $0.27 just to start the machine.

3. The "Stupid Tax"

I left an instance running over a 3-day weekend. Cost: $72. For absolutely nothing. Most providers don't have auto-shutdown.

Fix: Set calendar reminders. Or use [this script I wrote](/blog/auto-shutdown-script) that kills idle instances.
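The linked script is the real thing; the core decision fits in a few lines. This sketch (my simplification, with made-up thresholds) only fires after several consecutive near-idle utilization readings, so a brief data-loading stall doesn't kill a live job:

```python
def should_shutdown(util_samples: list[float],
                    idle_threshold_pct: float = 5.0,
                    consecutive_required: int = 3) -> bool:
    """True once the last N GPU-utilization samples are all below the threshold."""
    if len(util_samples) < consecutive_required:
        return False
    return all(u < idle_threshold_pct
               for u in util_samples[-consecutive_required:])

# A busy GPU that briefly dips doesn't trigger; a truly idle one does.
assert not should_shutdown([92.0, 3.0, 88.0, 2.0])
assert should_shutdown([90.0, 1.0, 0.0, 2.0])
```

In practice you'd feed this from the same `nvidia-smi` polling used for monitoring, and call your provider's API (or plain `shutdown -h`) when it returns True.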

My Actual Recommendations

I'm not going to give you vague "it depends" advice. Here are my actual picks:

🥇 Best Overall: Lambda Labs

When to use: Production training, client work, anything where reliability matters

Price: $0.99/hour (H100)

Why: It just works. Every time. Their support actually responds. Worth the extra $0.10/hour over the cheap options.

🥈 Best Budget Option: RunPod (Spot)

When to use: Development, experimentation, hyperparameter tuning

Price: $0.40-0.60/hour (spot H100)

Why: When it works, it's the cheapest by far. Just have checkpointing set up and don't use it for deadline-critical work.

🥉 Best for Enterprises: CoreWeave

When to use: You need contracts, SLAs, and someone to call at 2 AM

Price: $1.10+/hour (but negotiate)

Why: Real enterprise support. But watch those egress fees.

What I'd Do Differently

If I were starting over:

  1. Start with Lambda. Don't waste time on the cheap providers until you know your workload.
  2. Set up checkpointing first. Before you run any long job. I learned this the hard way.
  3. Budget 20% extra for "stupid mistakes." Leaving instances on, egress fees, failed jobs you have to re-run.
  4. Don't chase the lowest price. A $0.73/hour instance that crashes twice costs more than a $0.99 one that works.

The Tool I Built (Shameless Plug)

Doing this manually sucked. So I built CloudGPUTracker to track prices across all these providers automatically.

It won't tell you which provider is "best" (that's what this post is for). But it will tell you who's cheapest right now, and that's half the battle.

Want to see current prices?

Check Live H100 Prices →

Last updated: February 14, 2026. Prices and availability change constantly. I'll update this quarterly. If you spot something wrong, email me.