6 comments
  • S4phyre2m

    Oh how cool. Always wanted to have a tool like this.

  • phelm5m

    This is awesome. It would be great to cross-reference some intelligence benchmarks so that I can understand the trade-off between RAM consumption, token rate, and how good the model is.

  • sxates13m

    Cool thing!

    A couple suggestions:

    1. I have an M3 Ultra with 256GB of memory, but the options list only goes up to 192GB. The M3 Ultra supports up to 512GB.

    2. It'd be great if I could flip this around and choose a model, and then see the performance for all the different processors. Would help with making buying decisions!

  • GrayShade10m

    This feels a bit pessimistic. Qwen 3.5 35B-A3B runs at 38 t/s tg with llama.cpp (mmap enabled) on my Radeon 6800 XT.

  • John2383237m

    RTX Pro 6000 is a glaring omission.

    • schaefer16m

      The missing Nvidia Spark workstation is another omission.