Having half the number of GPUs in a workstation/local-server setup while keeping the same total VRAM might make up for whatever slowdown you'd see from having to run less-optimized code. For instance, running or training a model that required 192GB of VRAM would take four 48GB GPUs versus eight 24GB GPUs.
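A minimal sketch of that card-count arithmetic, just to make the comparison concrete (the 192GB requirement and per-card sizes are only the example figures above, and the helper name is made up):

```python
import math

def gpus_needed(required_vram_gb: float, per_gpu_vram_gb: float) -> int:
    """Minimum number of identical GPUs to hold the given amount of model state."""
    return math.ceil(required_vram_gb / per_gpu_vram_gb)

# Example from the text: the same 192GB fits on half as many 48GB cards as 24GB cards.
print(gpus_needed(192, 48))  # 4
print(gpus_needed(192, 24))  # 8
```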