Quote from: captcha on August 16, 2024, 20:59:38
Well, M3 Max GPU is around a mobile 4070, just as expected.
But the M3 Max NPU is on par with the 4090 in quantized models (the ones you will actually be running locally).
What's surprising is that the iPad M4 NPU and the 8 Gen 3 NPU are absolutely destroying the 4090 in quantized models.
So NPUs are the next big thing, and 4090s for local AI inference will be fading away.

Nice, new insights into what's what are always welcome.
The AI market that GPU vendors are most interested in is servers running full-precision workloads. Nobody is buying a GPU just to run quantized models; those were often run on the CPU anyway, and NPUs will now make that even more efficient.
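For anyone wondering what "quantized" actually means here, a minimal sketch of symmetric int8 weight quantization (purely illustrative; real stacks like llama.cpp or NPU runtimes do much more, e.g. per-block scales and fused int8 matmuls):

```python
# Symmetric int8 quantization sketch: store weights as int8 plus one
# float scale, trading a little precision for 4x less memory vs float32.

def quantize_int8(weights):
    """Map float weights to int8 values plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 + scale."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.03, 0.98]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored value lands within one quantization step of the original.
```

This is why NPUs shine here: integer multiply-accumulate is cheap in silicon, so hardware built around int8/int4 math gets the same model answer for far less power than fp16/fp32 GPU compute.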