llama3.1:8b on A6000, A100, H100 — 2026-04-17

2026-04-17T21:56:19.000Z → 2026-04-17T21:57:39.630Z

On 2026-04-17, llama3.1:8b was benchmarked on three GPUs: A6000, A100, and H100. The H100 finished first at 191.4 tok/s, about 26× faster than the A6000 (7.3 tok/s). The surprise: the headline GPU was also the cheapest per million tokens, at $3.61 versus $13.32 for the A6000, roughly 3.7× less. The A100 placed second on throughput (13.7 tok/s) but was the most expensive per million tokens at $15.82.
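The ratios quoted in the summary can be re-derived from the reported figures. The snippet below is a minimal sketch for checking that arithmetic; the dictionary layout and the `ratio` helper are illustrative names, not part of the benchmark tooling.

```python
# Figures copied from this report (average tok/s and $ per 1M tokens).
results = {
    "H100": {"tok_per_s": 191.4, "usd_per_mtok": 3.61},
    "A100": {"tok_per_s": 13.7, "usd_per_mtok": 15.82},
    "A6000": {"tok_per_s": 7.3, "usd_per_mtok": 13.32},
}


def ratio(numerator_gpu, denominator_gpu, key):
    """Ratio of one GPU's metric to another's (hypothetical helper)."""
    return results[numerator_gpu][key] / results[denominator_gpu][key]


# Throughput advantage of the H100 over the A6000.
speedup = ratio("H100", "A6000", "tok_per_s")

# How many times more the A6000 costs per million tokens than the H100.
cost_ratio = ratio("A6000", "H100", "usd_per_mtok")

# Rank the GPUs by each metric independently.
fastest = max(results, key=lambda gpu: results[gpu]["tok_per_s"])
cheapest = min(results, key=lambda gpu: results[gpu]["usd_per_mtok"])
priciest = max(results, key=lambda gpu: results[gpu]["usd_per_mtok"])

print(f"H100 vs A6000 throughput: {speedup:.1f}x")
print(f"A6000 vs H100 cost per 1M tokens: {cost_ratio:.1f}x")
print(f"fastest={fastest}, cheapest={cheapest}, priciest={priciest}")
```

Ranking throughput and cost separately makes it clear that the two orderings differ: the podium order follows tok/s, not price.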

Podium

1st: H100 (thunder-h100)
  peak tok/s: 191.4
  avg tok/s: 191.4
  $ / 1M tok: $3.61

2nd: A100 (thunder-a100)
  peak tok/s: 13.7
  avg tok/s: 13.7
  $ / 1M tok: $15.82

3rd: A6000 (thunder-a6000)
  peak tok/s: 7.3
  avg tok/s: 7.3
  $ / 1M tok: $13.32
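The report does not say how the $ / 1M tok column was computed. One common convention, assumed here and not confirmed by the source, is to divide an hourly instance price by the number of tokens generated per hour at the average throughput. The sketch below shows that formula and its inverse; both function names are hypothetical, and the implied hourly rates hold only under this assumption.

```python
SECONDS_PER_HOUR = 3600
TOKENS_PER_MILLION = 1_000_000


def usd_per_million_tokens(hourly_usd, tok_per_s):
    """Cost of 1M tokens, ASSUMING cost = hourly price / tokens per hour."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return hourly_usd / tokens_per_hour * TOKENS_PER_MILLION


def implied_hourly_usd(usd_per_mtok, tok_per_s):
    """Inverse of the formula above: the hourly price it would imply."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return usd_per_mtok * tokens_per_hour / TOKENS_PER_MILLION


# (avg tok/s, $ per 1M tokens) as reported on the podium.
podium = {
    "H100": (191.4, 3.61),
    "A100": (13.7, 15.82),
    "A6000": (7.3, 13.32),
}

# Hourly rates implied by the reported figures under the assumed formula.
implied = {
    gpu: implied_hourly_usd(cost, tok_per_s)
    for gpu, (tok_per_s, cost) in podium.items()
}

for gpu, hourly in implied.items():
    print(f"{gpu}: implied ${hourly:.2f}/hour")
```

If this convention is the one used, it explains the ranking: per-token cost falls as throughput rises, so a GPU with a higher hourly price can still be cheapest per token when its throughput advantage is larger than its price premium.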