llama3.2:1b on A100, H100 — 2026-04-18

2026-04-18T04:32:42.000Z → 2026-04-18T04:33:12.962Z

On 2026-04-18, llama3.2:1b was benchmarked on an A100 and an H100. The H100 finished first at 590.4 tok/s, about 11.8× the A100's 50.2 tok/s. Per-million-token cost was $1.17 (H100) vs $4.32 (A100). The surprise: the headline GPU, the H100, was also the cheaper of the two per million tokens, with the A100 costing about 3.7× as much.
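The two ratios in the summary can be re-derived from the podium figures. The sketch below is a minimal check, not part of the benchmark tooling. The dictionary layout and the key names (`avg_tok_s`, `usd_per_1m_tok`) are made up for illustration. Only the four numbers come from the report.

```python
# Figures copied from the podium; the dict layout and key names are
# illustrative assumptions, not the benchmark's own data format.
results = {
    "H100": {"avg_tok_s": 590.4, "usd_per_1m_tok": 1.17},
    "A100": {"avg_tok_s": 50.2, "usd_per_1m_tok": 4.32},
}

# Throughput ratio: how many times the H100's tok/s exceeds the A100's.
speedup = results["H100"]["avg_tok_s"] / results["A100"]["avg_tok_s"]

# Cost ratio: how many times the A100's $/1M tok exceeds the H100's.
cost_ratio = results["A100"]["usd_per_1m_tok"] / results["H100"]["usd_per_1m_tok"]

print(f"H100 throughput vs A100: {speedup:.1f}x")  # 11.8x
print(f"A100 cost vs H100: {cost_ratio:.1f}x")     # 3.7x
```

Dividing in the direction "larger over smaller" keeps both ratios above 1, which avoids the ambiguity of phrases such as "3.7× less".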

Podium

1st: H100 (thunder-h100)
peak tok/s: 590.4
avg tok/s: 590.4
$ / 1M tok: $1.17

2nd: A100 (thunder-a100)
peak tok/s: 50.2
avg tok/s: 50.2
$ / 1M tok: $4.32
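The report does not state how "$ / 1M tok" is computed. A common convention, assumed here, is to divide an hourly instance price by the tokens generated per hour at the average throughput. The sketch below shows that relationship and its inverse. Both function names are hypothetical, and any hourly price recovered this way is an inference from the podium figures, not a quoted rate.

```python
def usd_per_million_tokens(usd_per_hour: float, tok_per_s: float) -> float:
    """Cost of 1M tokens, ASSUMING billing is purely hourly and the GPU
    sustains tok_per_s for the whole hour (an assumption, not from the report)."""
    tokens_per_hour = tok_per_s * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


def implied_usd_per_hour(usd_per_1m_tok: float, tok_per_s: float) -> float:
    """Inverse of the formula above: the hourly price that would produce
    the reported $/1M tok at the reported average throughput."""
    return usd_per_1m_tok * tok_per_s * 3600 / 1_000_000


# Podium figures: (avg tok/s, $ / 1M tok)
podium = {"H100": (590.4, 1.17), "A100": (50.2, 4.32)}

for gpu, (tok_s, cost) in podium.items():
    hourly = implied_usd_per_hour(cost, tok_s)
    print(f"{gpu}: implied ${hourly:.2f}/hour under the hourly-billing assumption")
```

Under this assumption, cost per token scales with hourly price divided by throughput, so a GPU with a higher hourly price can still be cheaper per token if its throughput advantage is larger than its price premium.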