Hacker News

6 comments

S4phyre•2m
Oh how cool. Always wanted to have a tool like this.
phelm•5m
This is awesome. It would be great to cross-reference some intelligence benchmarks so that I can understand the trade-off between RAM consumption, token rate, and how good the model is.
sxates•13m
Cool thing!
A couple suggestions:
1. I have an M3 Ultra with 256GB of memory, but the options list only goes up to 192GB. The M3 Ultra supports up to 512GB.
2. It'd be great if I could flip this around and choose a model, and then see the performance for all the different processors. Would help with making buying decisions!
GrayShade•10m
This feels a bit pessimistic. Qwen 3.5 35B-A3B runs at 38 t/s tg with llama.cpp (mmap enabled) on my Radeon 6800 XT.
John23832•37m
RTX Pro 6000 is a glaring omission.
- schaefer•16m
  The missing Nvidia Spark workstation is another omission.