run detail
thunder-a6000-ff768cefcb
A6000 eval suite (backfill from profile.json)
started 3h ago · ended 1h ago · duration 120m · status completed · session thunder-backfill
lanes
| gpu | tok/s | latency (ms) | $/hr | $/1M tokens |
|---|---|---|---|---|
| A6000 (winner) | 5.3 | 34699 | $0.35 | $18.3 |
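The $/1M tokens column appears to follow from the hourly price and the throughput. A minimal sketch of that arithmetic, assuming the figure is simply the hourly cost divided by tokens generated per hour (the dashboard's exact formula is not stated, and the function name `cost_per_million_tokens` is hypothetical):

```python
def cost_per_million_tokens(dollars_per_hour: float, tokens_per_second: float) -> float:
    """Cost of generating one million tokens at a steady throughput.

    Assumption: the lane runs at a constant tok/s for the whole billed hour.
    """
    tokens_per_hour = tokens_per_second * 3600  # 3600 seconds in an hour
    return dollars_per_hour / tokens_per_hour * 1_000_000


# Values taken from the A6000 row of the lanes table.
a6000_cost = cost_per_million_tokens(dollars_per_hour=0.35, tokens_per_second=5.3)
print(f"${a6000_cost:.1f} per 1M tokens")
```

Under this assumption the A6000 row reproduces the listed $18.3.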
eval scores · 120
Rubric-scored quality measurements on this run's model outputs. Higher composite = better.
| model | use case | test | composite | tok/s |
|---|---|---|---|---|
| qwen2.5:14b | chunking | chunk_technical_doc | 94.0 | 2.6 |
| qwen2.5:14b | chunking | chunk_technical_doc | 94.0 | 3.1 |
| qwen2.5:14b | chunking | chunk_mixed_content | 92.0 | 2.7 |
| qwen2.5:14b | chunking | chunk_mixed_content | 92.0 | 2.9 |
| qwen2.5:14b | chunking | chunk_code_narrative | 94.0 | 3.0 |
| qwen2.5:14b | chunking | chunk_code_narrative | 94.0 | 2.8 |
| qwen2.5:14b | chunking | chunk_short_text | 97.0 | 3.5 |
| qwen2.5:14b | chunking | chunk_short_text | 97.0 | 3.4 |
| qwen2.5:14b | search_query | sq_temporal_filter | 84.0 | 3.4 |
| qwen2.5:14b | search_query | sq_temporal_filter | 84.0 | 3.4 |
| qwen2.5:14b | search_query | sq_code_search | 100.0 | 3.4 |
| qwen2.5:14b | search_query | sq_code_search | 84.0 | 3.0 |
| qwen2.5:14b | search_query | sq_multi_source | 96.3 | 2.9 |
| qwen2.5:14b | search_query | sq_multi_source | 96.3 | 3.0 |
| qwen2.5:14b | search_query | sq_memory_recall | 100.0 | 3.1 |
| qwen2.5:14b | search_query | sq_memory_recall | 100.0 | 3.0 |
| qwen2.5:14b | search_query | sq_delta_search | 84.0 | 3.3 |
| qwen2.5:14b | search_query | sq_delta_search | 84.0 | 3.2 |
| qwen2.5:14b | context_synthesis | synth_architecture | 93.2 | 3.2 |
| qwen2.5:14b | context_synthesis | synth_architecture | 92.6 | 3.1 |
| qwen2.5:14b | context_synthesis | synth_dietary | 87.0 | 3.5 |
| qwen2.5:14b | context_synthesis | synth_dietary | 87.0 | 3.6 |
| qwen2.5:14b | context_synthesis | synth_conflicting | 67.0 | 3.5 |
| qwen2.5:14b | context_synthesis | synth_conflicting | 87.0 | 3.5 |
| qwen2.5:14b | memory_extraction | mem_dietary | 90.9 | 3.2 |
| qwen2.5:14b | memory_extraction | mem_dietary | 94.7 | 3.3 |
| qwen2.5:14b | memory_extraction | mem_incident | 100.0 | 3.3 |
| qwen2.5:14b | memory_extraction | mem_incident | 100.0 | 3.4 |
| qwen2.5:14b | memory_extraction | mem_preferences | 100.0 | 3.3 |
| llama3.1:8b | adapter_extraction | adapt_email | 100.0 | 4.7 |
| llama3.1:8b | adapter_extraction | adapt_email | 100.0 | 5.4 |
| llama3.1:8b | adapter_extraction | adapt_imessage | 100.0 | 5.4 |
| llama3.1:8b | adapter_extraction | adapt_imessage | 100.0 | 5.8 |
| llama3.1:8b | adapter_extraction | adapt_code_file | 58.0 | 5.2 |
| llama3.1:8b | adapter_extraction | adapt_code_file | 58.0 | 6.4 |
| llama3.1:8b | adapter_extraction | adapt_voice_memo | 100.0 | 6.2 |
| llama3.1:8b | adapter_extraction | adapt_voice_memo | 100.0 | 6.4 |
| llama3.1:8b | classification | cls_email | 100.0 | 6.4 |
| llama3.1:8b | classification | cls_email | 100.0 | 6.8 |
| llama3.1:8b | classification | cls_imessage | 100.0 | 6.5 |
showing 40 of 120
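Most tests in the table appear twice, and a few repeats disagree (for example sq_code_search and synth_conflicting). One way to read such pairs is to group rows by model and test, then look at the mean composite and the spread between runs. The sketch below does that for a handful of rows copied from the table. The grouping key and the `summarize` helper are assumptions for illustration, not the dashboard's own aggregation.

```python
from collections import defaultdict
from statistics import mean

# (model, test, composite) copied from a few rows of the eval scores table.
rows = [
    ("qwen2.5:14b", "sq_code_search", 100.0),
    ("qwen2.5:14b", "sq_code_search", 84.0),
    ("qwen2.5:14b", "synth_conflicting", 67.0),
    ("qwen2.5:14b", "synth_conflicting", 87.0),
    ("qwen2.5:14b", "chunk_short_text", 97.0),
    ("qwen2.5:14b", "chunk_short_text", 97.0),
]


def summarize(rows):
    """Return {(model, test): (mean composite, max - min)} for repeated runs."""
    grouped = defaultdict(list)
    for model, test, composite in rows:
        grouped[(model, test)].append(composite)
    return {
        key: (mean(scores), max(scores) - min(scores))
        for key, scores in grouped.items()
    }


summary = summarize(rows)
for (model, test), (avg, spread) in summary.items():
    # A nonzero spread means the repeated runs scored differently.
    print(f"{model} {test}: mean={avg:.1f} spread={spread:.1f}")
```

A large spread, such as the 20-point gap on synth_conflicting, suggests the score for that test is sensitive to run-to-run variation and may deserve more repeats before drawing conclusions.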