Nvidia just announced an interesting piece of hardware
There is a vibrant community that runs open large language models (LLMs) locally. So far, though, there has been somewhat of a hardware problem: besides compute power, LLMs require an awkward combination of memory that is both large *and* fast.
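To see why the memory has to be *large*, here is a back-of-envelope calculation of the weight footprint of a hypothetical 70-billion-parameter model at a few common precisions (the quantization names follow the llama.cpp convention; the numbers are illustrative, not tied to any specific product):

```python
# Rough weight-memory footprint of a ~70B-parameter LLM
# at different precisions (illustrative assumption, not a real benchmark).
PARAMS = 70e9  # parameter count (assumed)

BYTES_PER_PARAM = {
    "fp16": 2.0,   # full half-precision
    "q8_0": 1.0,   # ~8-bit quantization
    "q4_0": 0.5,   # ~4-bit quantization
}

for fmt, bpp in BYTES_PER_PARAM.items():
    gib = PARAMS * bpp / 2**30
    print(f"{fmt}: ~{gib:.0f} GiB of weights")
```

Even aggressively quantized, such a model needs tens of gigabytes just for the weights, before any KV cache or activations, which is exactly why 128 GB of unified RAM is interesting.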
The dilemma is expressed with the following (very simplified) table:
what | fast computing | fast memory | large memory | price |
---|---|---|---|---|
random PC/server, no GPU | no | no | yes | thousands of $ |
Nvidia SBCs such as “Jetson” etc. | somewhatish | somewhatish | no | hundreds of $ |
PC + Nvidia consumer GPU cards | yes | yes | no | thousands of $ |
high-end Apple Silicon Mac | somewhat | somewhat | yes | many thousands of $ |
high-end server, no GPU | somewhat | somewhat | yes | tens of thousands of $ |
server + Nvidia datacenter compute cards | yes | yes | yes | tens of thousands of $ |
Basically, to get rid of every “no” you’d need to shell out lots of money or end up renting the hardware. Apple Silicon Macs somehow sneaked in there too, at least as an alternative to high-end servers. A high-end Mac Studio with 128 GB of RAM is still ~$5,000, though, and it’s still only “somewhat” fast for LLMs.
Some people got creative and built “Frankenracks” with stacks of linked Nvidia consumer cards or older datacenter compute cards to increase the memory. Those might be fast and affordable, but they are not really stable or maintainable from a business standpoint.
Basically, no combination hits all the sweet spots.
Until today.
Maybe.
Nvidia just announced “Project DIGITS”, a computer in a small box with an SoC featuring a GPU of the latest generation (“Blackwell”), 20 ARM cores and 128 GB of unified RAM as standard. That immediately checks “fast computing” and “large memory”. The only missing spec so far is memory bandwidth. From the presentation it appears to be LPDDR5X, which very likely will not be as tightly coupled to the GPU as the memory on consumer cards (GDDR7) or datacenter cards (HBM3).
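Why does bandwidth matter so much? During autoregressive decoding, every generated token requires reading (roughly) all model weights once, so memory bandwidth puts a hard ceiling on tokens per second. A sketch, with ballpark bandwidth figures I am assuming for each memory class (none of these are official specs for the DIGITS box):

```python
def max_tokens_per_s(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling on decode speed: all weights are
    streamed from memory once per generated token."""
    return bandwidth_bytes_per_s / model_bytes

# ~70B parameters at ~4-bit quantization -> ~35 GiB of weights (assumed)
model = 35 * 2**30

# Bandwidth classes are rough assumptions, not measured values.
for name, bw in [("LPDDR5X-class (~250 GB/s, assumed)", 250e9),
                 ("GDDR7-class (~1 TB/s, assumed)", 1000e9),
                 ("HBM3-class (~3 TB/s, assumed)", 3000e9)]:
    print(f"{name}: <= {max_tokens_per_s(model, bw):.0f} tokens/s")
```

Under these assumptions, the same model would be bandwidth-capped at single-digit tokens per second on LPDDR5X-class memory but dozens on HBM3-class memory, which is why the unannounced bandwidth number is the one to watch.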
So we get this “new” permutation:
what | fast computing | fast memory | large memory | price |
---|---|---|---|---|
Nvidias new “Project DIGITS” box | yes | somewhat? | yes | thousands of $ |
Unless there’s some catch, such as the memory bandwidth turning out to be very disappointing or some artificial cap in their (proprietary) software support, this looks like a very interesting machine for the local LLM folks!
It will be available starting from May 2025, so the wait isn’t over yet.
Read Nvidia’s press release at