llama3.2:1b on A6000, A100, H100 — 2026-04-18

2026-04-18T02:25:49.000Z → 2026-04-18T02:26:26.555Z

On 2026-04-18, llama3.2:1b ran on three GPUs: A6000, A100, and H100. H100 finished first at 587.8 tok/s, about 22× the A6000's 26.9 tok/s. Cost per million tokens was $1.18 on H100, versus $3.61 on A6000 and $4.13 on A100. The surprise: the headline GPU, H100, was also the cheapest per million tokens, about 3.1× cheaper than A6000.
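The ratios quoted in the summary can be re-derived from the podium figures. The snippet below is a minimal sketch of that arithmetic; the dictionary layout and the `ratio` helper are illustrative names, not part of the benchmark tooling.

```python
# Figures copied from the podium for this run.
results = {
    "H100": {"avg_tok_s": 587.8, "usd_per_1m_tok": 1.18},
    "A100": {"avg_tok_s": 52.5, "usd_per_1m_tok": 4.13},
    "A6000": {"avg_tok_s": 26.9, "usd_per_1m_tok": 3.61},
}


def ratio(numerator_gpu, denominator_gpu, key):
    """Return results[numerator_gpu][key] / results[denominator_gpu][key]."""
    return results[numerator_gpu][key] / results[denominator_gpu][key]


# Throughput advantage of H100 over A6000 (the "22x" in the summary).
speedup = ratio("H100", "A6000", "avg_tok_s")

# How many times more A6000 costs per million tokens than H100 (the "3.1x").
cost_ratio = ratio("A6000", "H100", "usd_per_1m_tok")

# GPU with the lowest cost per million tokens.
cheapest = min(results, key=lambda gpu: results[gpu]["usd_per_1m_tok"])

print(f"H100 vs A6000 throughput: {speedup:.1f}x")
print(f"A6000 vs H100 cost per 1M tokens: {cost_ratio:.1f}x")
print(f"cheapest per 1M tokens: {cheapest}")
```

Note that the ranking by cost differs from the ranking by speed: A100 places second on throughput but is the most expensive per million tokens in this run.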

Podium

1st: H100 (thunder-h100)
peak tok/s: 587.8
avg tok/s: 587.8
$ / 1M tok: $1.18

2nd: A100 (thunder-a100)
peak tok/s: 52.5
avg tok/s: 52.5
$ / 1M tok: $4.13

3rd: A6000 (thunder-a6000)
peak tok/s: 26.9
avg tok/s: 26.9
$ / 1M tok: $3.61
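The page does not state how $ / 1M tok is computed. One plausible reading, which is an assumption and not confirmed by the source, is that it is the hourly rental price divided by the tokens generated in an hour at the average throughput. Under that assumption, the sketch below back-computes the hourly price each row would imply. The function names are hypothetical, and the resulting hourly figures are derived estimates, not reported data.

```python
SECONDS_PER_HOUR = 3600


def usd_per_million_tokens(usd_per_hour, tok_per_s):
    """Cost of one million tokens, ASSUMING cost = hourly price / hourly token output."""
    tokens_per_hour = tok_per_s * SECONDS_PER_HOUR
    return usd_per_hour / tokens_per_hour * 1_000_000


def implied_usd_per_hour(usd_per_1m_tok, tok_per_s):
    """Inverse of the assumed formula: the hourly price a podium row would imply."""
    return usd_per_1m_tok * tok_per_s * SECONDS_PER_HOUR / 1_000_000


# (GPU, avg tok/s, $ per 1M tokens) copied from the podium.
podium = [
    ("H100", 587.8, 1.18),
    ("A100", 52.5, 4.13),
    ("A6000", 26.9, 3.61),
]

for gpu, tok_s, cost in podium:
    rate = implied_usd_per_hour(cost, tok_s)
    print(f"{gpu}: about ${rate:.2f}/hour implied (assumed formula)")
```

If the assumption holds, it would explain the cost ranking: a GPU with a higher hourly price can still be cheapest per token when its throughput advantage is larger than its price premium.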