Nvidia just announced an interesting piece of hardware
There is a vibrant community that runs open large language models (LLMs) locally. So far, though, there has been somewhat of a hardware problem: besides compute power, LLMs require an awkward combination of memory that is both large *and* fast.
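To see why the memory has to be *large*, here is a back-of-envelope calculation of the weight footprint of a hypothetical 70-billion-parameter model at a few common precisions (the quantization names follow the llama.cpp convention; the numbers are illustrative, not tied to any specific product):

```python
# Rough weight-memory footprint of a ~70B-parameter LLM
# at different precisions (illustrative assumption, not a real benchmark).
PARAMS = 70e9  # parameter count (assumed)

BYTES_PER_PARAM = {
    "fp16": 2.0,   # full half-precision
    "q8_0": 1.0,   # ~8-bit quantization
    "q4_0": 0.5,   # ~4-bit quantization
}

for fmt, bpp in BYTES_PER_PARAM.items():
    gib = PARAMS * bpp / 2**30
    print(f"{fmt}: ~{gib:.0f} GiB of weights")
```

Even aggressively quantized, such a model needs tens of gigabytes just for the weights, before any KV cache or activations, which is exactly why 128 GB of unified RAM is interesting.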
The dilemma is expressed with the following (very simplified) table:
what | fast computing | fast memory | large memory | price |
---|---|---|---|---|
random PC/server, no GPU | no | no | yes | thousands of $ |
Nvidia SBCs such as “Jetson” etc. | somewhatish | somewhatish | no | hundreds of $ |
PC + Nvidia consumer GPU cards | yes | yes | no | thousands of $ |
high-end Apple Silicon Mac | somewhat | somewhat | yes | many thousands of $ |
high-end server, no GPU | somewhat | somewhat | yes | tens of thousands of $ |
server + Nvidia datacenter compute cards | yes | yes | yes | tens of thousands of $ |
Basically, to get rid of every “no” you’d need to shell out lots of money or end up renting the hardware. Apple Silicon Macs somehow sneaked in there too, at least as an alternative to high-end servers. A high-end Mac Studio with 128 GB of RAM is still ~$5,000, though, and it’s still only “somewhat” fast for LLMs.
Some people got creative and built “Frankenracks” with stacks of linked Nvidia consumer cards or older datacenter compute cards to increase the memory. Those might be fast and affordable, but they are not really stable or maintainable from a business standpoint.
Basically, no combination hits all the sweet spots.
Until today.
Maybe.
Nvidia just announced “Project DIGITS”, a computer in a small box with an SoC featuring a GPU of the latest generation (“Blackwell”), 20 ARM cores and 128 GB of unified RAM as standard. That immediately checks “fast computing” and “large memory”. The only missing spec so far is memory bandwidth. From the presentation it appears to be LPDDR5X, which very likely will not be as tightly coupled to the GPU as the memory on consumer cards (GDDR7) or datacenter cards (HBM3).
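Why does bandwidth matter so much? During autoregressive decoding, every generated token requires reading (roughly) all model weights once, so memory bandwidth puts a hard ceiling on tokens per second. A sketch, with ballpark bandwidth figures I am assuming for each memory class (none of these are official specs for the DIGITS box):

```python
def max_tokens_per_s(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling on decode speed: all weights are
    streamed from memory once per generated token."""
    return bandwidth_bytes_per_s / model_bytes

# ~70B parameters at ~4-bit quantization -> ~35 GiB of weights (assumed)
model = 35 * 2**30

# Bandwidth classes are rough assumptions, not measured values.
for name, bw in [("LPDDR5X-class (~250 GB/s, assumed)", 250e9),
                 ("GDDR7-class (~1 TB/s, assumed)", 1000e9),
                 ("HBM3-class (~3 TB/s, assumed)", 3000e9)]:
    print(f"{name}: <= {max_tokens_per_s(model, bw):.0f} tokens/s")
```

Under these assumptions, the same model would be bandwidth-capped at single-digit tokens per second on LPDDR5X-class memory but dozens on HBM3-class memory, which is why the unannounced bandwidth number is the one to watch.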
So we get this “new” permutation:
what | fast computing | fast memory | large memory | price |
---|---|---|---|---|
Nvidias new “Project DIGITS” box | yes | somewhat? | yes | thousands of $ |
Unless there’s some catch, such as the memory bandwidth turning out to be very disappointing or some artificial cap in their (proprietary) software support, this looks like a very interesting machine for the local LLM folks!
It will be available starting from May 2025, so the wait isn’t over yet.
Read Nvidia’s press release at