llama3.1:8b on A6000, A100, H100 — 2026-04-18
2026-04-18T03:48:11.000Z → 2026-04-18T03:53:53.844Z
On 2026-04-18, llama3.1:8b ran on three GPUs: A6000, A100, and H100. The H100 finished first at 192.5 tok/s, 25× the A6000's 7.7 tok/s. The surprise: the headline GPU was also the cheapest per million tokens, at $3.59 versus $12.63 for the A6000, about 3.5× less. The A100, second on throughput at 14.4 tok/s, was the most expensive at $15.05 per million tokens.
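The headline ratios can be rechecked from the podium figures. The sketch below is only illustrative: the `results` dictionary and the `ratio` helper are names introduced here, not part of the benchmark tooling, and the numbers are copied from this report.

```python
# Figures copied from the podium: average tok/s and $ per 1M tokens.
results = {
    "H100": {"tok_s": 192.5, "usd_per_mtok": 3.59},
    "A100": {"tok_s": 14.4, "usd_per_mtok": 15.05},
    "A6000": {"tok_s": 7.7, "usd_per_mtok": 12.63},
}


def ratio(numerator_gpu: str, denominator_gpu: str, key: str) -> float:
    """Return results[numerator_gpu][key] / results[denominator_gpu][key]."""
    return results[numerator_gpu][key] / results[denominator_gpu][key]


# Throughput advantage of the H100 over the A6000.
speedup = ratio("H100", "A6000", "tok_s")

# How many times more the A6000 and A100 cost per million tokens.
a6000_cost_ratio = ratio("A6000", "H100", "usd_per_mtok")
a100_cost_ratio = ratio("A100", "H100", "usd_per_mtok")

print(f"H100 vs A6000 throughput: {speedup:.1f}x")
print(f"A6000 vs H100 cost per 1M tok: {a6000_cost_ratio:.1f}x")
print(f"A100 vs H100 cost per 1M tok: {a100_cost_ratio:.1f}x")
```

Keeping both metrics in one structure makes it easy to see that the throughput ranking (H100, A100, A6000) and the cost ranking (H100, A6000, A100) differ.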
Podium
H100, 1st (thunder-h100)
- peak tok/s: 192.5
- avg tok/s: 192.5
- $ / 1M tok: $3.59
A100, 2nd (thunder-a100)
- peak tok/s: 14.4
- avg tok/s: 14.4
- $ / 1M tok: $15.05
A6000, 3rd (thunder-a6000)
- peak tok/s: 7.7
- avg tok/s: 7.7
- $ / 1M tok: $12.63
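The report does not say how the $ / 1M tok figure is computed. A common approach, assumed here, is to divide an hourly rental price by the number of tokens generated in an hour at the average throughput. Under that assumption, the sketch below inverts the formula to estimate the hourly rate each podium entry would imply. The function names and the `podium` dictionary are hypothetical, and the resulting hourly rates are back-calculated estimates, not prices stated in this report.

```python
SECONDS_PER_HOUR = 3600


def usd_per_million_tokens(hourly_usd: float, tok_per_s: float) -> float:
    """Cost of 1M tokens, ASSUMING cost = hourly price / tokens per hour."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return hourly_usd / tokens_per_hour * 1_000_000


def implied_hourly_usd(usd_per_mtok: float, tok_per_s: float) -> float:
    """Invert the formula above: hourly price implied by a $ / 1M tok figure."""
    return usd_per_mtok * tok_per_s * SECONDS_PER_HOUR / 1_000_000


# (avg tok/s, $ / 1M tok) copied from the podium.
podium = {
    "H100": (192.5, 3.59),
    "A100": (14.4, 15.05),
    "A6000": (7.7, 12.63),
}

# Hourly rate each entry would imply under the stated assumption.
implied = {
    gpu: implied_hourly_usd(cost, tok_s) for gpu, (tok_s, cost) in podium.items()
}

for gpu, rate in implied.items():
    print(f"{gpu}: about ${rate:.2f}/hour implied")
```

If the assumption holds, the implied rates come out near $2.49, $0.78, and $0.35 per hour for the H100, A100, and A6000. That would explain the result: the H100 costs several times more per hour but generates tokens far faster, so its cost per token is lowest.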