llama3.1:8b on A6000, A100, H100 — 2026-04-17

2026-04-17T22:28:31.000Z → 2026-04-17T22:29:15.644Z

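On 2026-04-17, llama3.1:8b ran on three GPUs: A6000, A100, and H100. The H100 finished first at 192.7 tok/s, about 12.5× faster than the A100 (15.4 tok/s) and about 2.8× faster than the A6000 (69.5 tok/s). Per-million-token cost was $3.59 on the H100 versus $14.07 on the A100, so the fastest GPU was also 3.9× cheaper per token than the A100. It was not the cheapest overall: the A6000 came in at $1.40 per million tokens, about 2.6× less than the H100.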
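The ratios in the summary can be rechecked from the podium figures. The snippet below is a minimal sketch: the dictionary layout and the names `results` and `ratio` are illustrative and not part of the benchmark tooling, and the numbers are copied from the podium cards in this report.

```python
# Figures copied from the podium cards; the structure and names are illustrative.
results = {
    "H100": {"tok_s": 192.7, "usd_per_mtok": 3.59},
    "A6000": {"tok_s": 69.5, "usd_per_mtok": 1.40},
    "A100": {"tok_s": 15.4, "usd_per_mtok": 14.07},
}


def ratio(a: str, b: str, key: str) -> float:
    """Return results[a][key] divided by results[b][key]."""
    return results[a][key] / results[b][key]


# Throughput ratios (higher tok/s is better).
speedup_h100_vs_a100 = ratio("H100", "A100", "tok_s")
speedup_h100_vs_a6000 = ratio("H100", "A6000", "tok_s")

# Cost ratios (lower $ / 1M tok is better).
cost_a100_vs_h100 = ratio("A100", "H100", "usd_per_mtok")
cost_h100_vs_a6000 = ratio("H100", "A6000", "usd_per_mtok")

fastest = max(results, key=lambda gpu: results[gpu]["tok_s"])
cheapest = min(results, key=lambda gpu: results[gpu]["usd_per_mtok"])

print(f"H100 vs A100 throughput: {speedup_h100_vs_a100:.1f}x")
print(f"H100 vs A6000 throughput: {speedup_h100_vs_a6000:.1f}x")
print(f"A100 vs H100 cost: {cost_a100_vs_h100:.1f}x")
print(f"H100 vs A6000 cost: {cost_h100_vs_a6000:.1f}x")
print(f"fastest: {fastest}, cheapest per 1M tok: {cheapest}")
```

Keeping throughput and cost as separate keys makes it easy to see that the fastest GPU and the cheapest GPU per token need not be the same card.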

Podium

1st: H100 (thunder-h100)
peak tok/s: 192.7
avg tok/s: 192.7
$ / 1M tok: $3.59

2nd: A6000 (thunder-a6000)
peak tok/s: 69.5
avg tok/s: 69.5
$ / 1M tok: $1.40

3rd: A100 (thunder-a100)
peak tok/s: 15.4
avg tok/s: 15.4
$ / 1M tok: $14.07
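The report does not say how the $ / 1M tok figures were computed. One common convention, assumed here and not confirmed by the source, is to divide an hourly instance price by the number of tokens generated per hour at the average throughput. The sketch below shows that formula and its inverse; the function names are hypothetical, and any hourly rate it produces is back-calculated under this assumption, not a price stated in the report.

```python
SECONDS_PER_HOUR = 3600
TOKENS_PER_MILLION = 1_000_000


def usd_per_million_tokens(usd_per_hour: float, tok_per_s: float) -> float:
    """Cost of 1M tokens, ASSUMING cost = hourly price / tokens generated per hour."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return usd_per_hour / tokens_per_hour * TOKENS_PER_MILLION


def implied_usd_per_hour(usd_per_mtok: float, tok_per_s: float) -> float:
    """Inverse of the formula above: the hourly price implied by a reported cost."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return usd_per_mtok * tokens_per_hour / TOKENS_PER_MILLION


# Reported (avg tok/s, $ / 1M tok) pairs from the podium.
podium = {
    "H100": (192.7, 3.59),
    "A6000": (69.5, 1.40),
    "A100": (15.4, 14.07),
}

for gpu, (tok_s, usd_per_mtok) in podium.items():
    rate = implied_usd_per_hour(usd_per_mtok, tok_s)
    print(f"{gpu}: implied hourly rate under this assumption = ${rate:.2f}/h")
```

Under this assumption the implied rates come out at roughly $2.49/h (H100), $0.35/h (A6000), and $0.78/h (A100). Read this way, the A100's high per-token cost would follow from its low measured throughput (15.4 tok/s) and not from a high hourly price.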