Apple Trained its Apple Intelligence Models on Google TPUs, Not NVIDIA GPUs

To train AFM-server, its largest language model at 6.4 billion parameters, Apple used an array of 8,192 TPUv4 …