Hacker News

firloop•1y

Truffle-1 is an AI inference engine designed to run open source models at home (itsalltruffles.com)

25 comments

simonw•1y
The documentation says you start with:
```
    brew install truffle
```
But the Truffle formula currently on Homebrew - https://formulae.brew.sh/formula/truffle - is for something else; it's for an Ethereum testing environment of some sort: https://archive.trufflesuite.com/
- fuddle•1y
  This seems very suspicious to me. There are no details about the founders on their website either.
- cipherself•1y
  Yes, also the text
```
  or download on Mac
```
  is not hyperlinked.
- educaysean•1y
  Ouch. That's not a good sign at all.
- parentheses•1y
  WIP syndrome
- colesantiago•1y
  [flagged]
smoldesu•1y
As a friendly heads-up to the people interested - you can buy the board inside this off eBay for like 300-400 dollars cheaper. "Jetson AGX Orin 64GB" specifically, then you wouldn't have to deal with whatever SaaS middleware the Truffle ships with.
- tmp777•1y
  Keep in mind that you need at least the Jetson board and the carrier board to make it work. The official NVIDIA devkit device with both boards is actually way more expensive than this (Amazon lists it at $1,999).
  The jetson itself is pretty cool though, and nvidia has a bunch of tutorials / cool demos of running LLMs on it, e.g.: https://www.jetson-ai-lab.com/tutorial_live-llava.html
  - throwanem•1y
    $2000 for fast, self-contained CUDA inference doesn't seem too unreasonable. How's it bench next to a 4090 or two?
    - tmp777•1y
      The big upside here is memory: you get 64GB, which means you can easily run 70B models at 4 bits. You'd need 3x4090s for that. And because of how most inference engines work today, the performance of such a setup will actually be slightly lower than 1x4090. You should really be comparing this to the A6000, which has similar performance to a 3090/4090 (depending on the gen) but with 48GB of memory; the A6000 is way more expensive.
      In terms of numbers, jetson agx orin is closer to 3090:
      - jetson agx orin 64gb has 275 TOPS
      - 3090 has 285 TOPS
      - 4090 has 1321 TOPS
      Another big advantage is power: jetson is getting these 275 tops at 60W, vs. 350W for 3090.
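      A rough sketch of the arithmetic behind those figures, using only the numbers quoted above. It counts weight storage alone (KV cache and runtime overhead are ignored, which is an assumption), and the function names are made up for illustration:

```python
# Back-of-the-envelope check of the memory and power figures quoted in the comment.
# Assumption: weights only; KV cache and runtime overhead are not counted.

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def tops_per_watt(tops: float, watts: float) -> float:
    """Quoted TOPS divided by quoted power draw."""
    return tops / watts


weights_gb = weight_memory_gb(70, 4)   # 70B model at 4 bits per parameter
orin_eff = tops_per_watt(275, 60)      # Jetson AGX Orin 64GB figures from the comment
rtx3090_eff = tops_per_watt(285, 350)  # 3090 figures from the comment

print(f"70B at 4-bit, weights only: {weights_gb:.0f} GB")
print(f"Fits in 64 GB (Orin): {weights_gb < 64}")
print(f"Fits in 24 GB (single 4090): {weights_gb < 24}")
print(f"Orin: {orin_eff:.2f} TOPS/W, 3090: {rtx3090_eff:.2f} TOPS/W")
print(f"Efficiency ratio: {orin_eff / rtx3090_eff:.1f}x")
```

      TOPS per watt is only a crude proxy here; real LLM inference speed also depends heavily on memory bandwidth.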
    - ZiiS•1y
      It is from the 3090 generation and has 1/6th the TDP.
- azinman2•1y
  They claim that their middleware is allowing better throughput. Don’t you still need a carrier board?
- renewiltord•1y
  That is actually helpful.
root_axis•1y
The 22 tokens per second claim for Mixtral conveniently fails to mention what type of quantization is going on with that benchmark.
- mmoskal•1y
  They say 200GB/s of memory bandwidth; Mixtral uses about 13B active parameters per token; they claim 22 t/s, so it has to read 22 * 13B parameters per second. That gives (200 * 8) Gbit/s / (22 * 13) Gparam/s, or around 5.6 bits per parameter at most. With overheads, it's probably a 4-bit quant.
  edit: formatting
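  The estimate above can be sketched in a few lines. This assumes decoding is purely memory-bandwidth bound and that every active parameter is read once per token; the function names are hypothetical:

```python
# Upper bound on bits per parameter implied by a tokens/s claim,
# assuming decoding is limited only by memory bandwidth.

def max_bits_per_param(bandwidth_gb_s: float, active_params_billion: float,
                       tokens_per_s: float) -> float:
    bits_per_s = bandwidth_gb_s * 8                      # Gbit/s
    params_per_s = active_params_billion * tokens_per_s  # Gparam/s
    return bits_per_s / params_per_s


def max_tokens_per_s(bandwidth_gb_s: float, active_params_billion: float,
                     bits_per_param: float) -> float:
    """Best-case tokens/s at a given quantization width."""
    return bandwidth_gb_s * 8 / (active_params_billion * bits_per_param)


bound = max_bits_per_param(200, 13, 22)  # figures quoted in the comment
print(f"Max bits per parameter at 22 t/s: {bound:.2f}")
print(f"Best case at 8-bit: {max_tokens_per_s(200, 13, 8):.1f} t/s")
print(f"Best case at 4-bit: {max_tokens_per_s(200, 13, 4):.1f} t/s")
```

  Under these assumptions an 8-bit quant could not reach 22 t/s on 200GB/s, while a 4-bit quant leaves some headroom for overhead.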
flakiness•1y
Looks cute!
> NVIDIA Orin Module
So it looks like it has some NVIDIA chip inside?
https://www.nvidia.com/en-us/autonomous-machines/embedded-sy...
The 64GB Orin module is sold at about $2K on Amazon. https://www.amazon.com/NVIDIA-Jetson-Orin-64GB-Developer/dp/...
swyx•1y
I'm not 100% sure yet what this is for. Is this basically a stand-in for an always-on laptop that is running Mixtral and has an API endpoint? And effectively no different from self-hosting Mixtral on some cloud somewhere?
raj_khare•1y
More details here: https://twitter.com/iamgingertrash/status/176759390225142176...
- tomschwiha•1y
  > Our compiler is not open source. It is optimized for our boards, and thus wouldn't be valuable to OSS anyway
  Feels like a non-argument.
  - simonw•1y
    Yeah. A lot of the value in open source is in letting people see how the thing works, and giving them the freedom to then adapt those solutions to other contexts.
- hobofan•1y
  Yeah, I'd think twice before wiring 1.2k to someone named "simp 4 satoshi" for a product that currently almost exclusively consists of renders.
  EDIT: Oh and the brew install command points to Truffle, the Ethereum dev environment, which has no relation to the product: https://formulae.brew.sh/formula/truffle
parentheses•1y
I want a DIY guide that basically spells out from hardware purchases -> usably running models. I haven't seen one yet.
toisanji•1y
Can it be used for Whisper and ASR as well? I would buy it if it's cheaper.