M4 Max here w/ 128GB RAM. Can confirm this is the bottleneck.
https://pastebin.com/2wJvWDEH
I was weighing a DGX Spark but thought the M4 would be competitive with equal RAM. Not so much.
However, it will be better for training / fine-tuning type workflows.
In the DGX benchmarks I found, the Spark was mostly beating the M4, though it wasn't cut and dried.
The M4 Max has double the memory bandwidth, so it should be faster for decode (token generation).
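Rough back-of-envelope for why bandwidth matters for decode, as a sketch only: each generated token has to stream roughly all the model weights from memory once, so bandwidth divided by model size gives a ceiling on tokens/s. The bandwidth figures (546 GB/s for the top M4 Max, 273 GB/s for the Spark) are the published specs as I understand them, and the 40 GB model size is a made-up example, not something from the benchmarks above.

```python
# Upper bound on decode speed for a dense model, assuming decode is purely
# memory-bandwidth bound (ignores compute, KV cache reads, and overhead).

def decode_tokens_per_sec_upper_bound(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Each token needs about one full pass over the weights in memory."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb

# Assumed spec-sheet bandwidths in GB/s (not measured values).
M4_MAX_BW_GB_S = 546.0
DGX_SPARK_BW_GB_S = 273.0

# Hypothetical model footprint in GB, e.g. a large quantized model.
MODEL_SIZE_GB = 40.0

if __name__ == "__main__":
    for name, bw in [("M4 Max", M4_MAX_BW_GB_S), ("DGX Spark", DGX_SPARK_BW_GB_S)]:
        bound = decode_tokens_per_sec_upper_bound(bw, MODEL_SIZE_GB)
        print(f"{name}: at most ~{bound:.1f} tok/s")
```

Real numbers land below this ceiling, and prompt processing (prefill) is compute bound, so this estimate says nothing about that side.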