Is NPU the Reason for Apple Intelligence in the AI Era?
Apple officially announced its entry into the “Apple Intelligence” era this year, in which AI assistants and machine learning will shape the user experience of future iPhones, Macs, and iPads. Unfortunately, not all Apple users are invited to the party. Among iPhones, only two models, the iPhone 15 Pro and 15 Pro Max, can access the new AI features; even the recently released iPhone 15 is left out. It’s surprising that a phone released less than a year ago is already considered outdated. In contrast, Mac users only need an Apple Silicon computer released after 2020. Among all Apple products, the requirements for iPhones are by far the strictest.
What is limiting Apple Intelligence on iPhones? Some speculate that the Neural Processing Unit (NPU) is the bottleneck. Microsoft, Intel, Qualcomm, and other tech giants have all emphasized how important an NPU is for running this wave of generative AI on end-user devices.
So, what is an NPU? In the past, phones and laptops relied mainly on the Central Processing Unit (CPU) and Graphics Processing Unit (GPU) for computation. The CPU excels at complex, general-purpose tasks, while the GPU specializes in graphics. AI workloads, however, play to the strengths of an NPU: its many small cores excel at processing large numbers of repetitive operations in parallel while consuming little energy. This makes the NPU a crucial component for generative-AI computation and, by extension, for the next generation of AI PCs and smartphones.
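The “large number of repetitive tasks” above is, concretely, multiply-accumulate (MAC) arithmetic. A minimal Python sketch of the matrix-vector product at the heart of neural-network inference shows the loop an NPU spreads across its many small cores (the code is illustrative only, not how any NPU is programmed):

```python
# A neural-network layer is essentially a matrix-vector product:
# every output element is the same multiply-accumulate (MAC) loop
# repeated over and over -- exactly the workload an NPU's many
# small cores execute in parallel.

def matvec(weights, x):
    """Naive matrix-vector product: rows of independent MAC loops."""
    out = []
    for row in weights:                 # each row is independent, so
        acc = 0.0                       # it can run on a separate core
        for w, xi in zip(row, x):
            acc += w * xi               # one multiply-accumulate op
        out.append(acc)
    return out

# Toy example: a 2x3 weight matrix applied to a 3-element input.
w = [[1.0, 2.0, 3.0],
     [0.5, 0.5, 0.5]]
x = [1.0, 1.0, 1.0]
print(matvec(w, x))  # [6.0, 1.5]
```

Because each output row is an independent copy of the same simple loop, the work parallelizes trivially, which is why many small, power-efficient cores beat a few large ones for this workload.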
Cristiano Amon, CEO of Qualcomm, emphasized the importance of NPUs in today’s AI computing at the 2024 Taipei International Computer Show. But is the NPU really the reason iPhones below the iPhone 15 Pro cannot access Apple Intelligence?
Andrew Williams, a well-known journalist at WIRED, points out that iPhones have had NPUs since 2017, when the first-generation Apple Neural Engine (ANE) shipped in the iPhone 8 and iPhone X. The ANE was essentially an NPU, showing that Apple was building AI features into iOS long before AI became a buzzword. The latest iPhone 15 series includes an NPU as well: the A16 chip in the standard iPhone 15 delivers 17 TOPS (trillions of operations per second). Analyst Ming-Chi Kuo of TF International Securities notes that although Microsoft defines an AI PC as requiring at least 40 TOPS, Apple needs only about 11 TOPS for on-device AI operations on iPhones, with the remaining computation offloaded to the cloud (Private Cloud Compute).
“It’s possible that Apple Intelligence is not so different from previous AI, but rather has a few additional features,” says Williams.
So the real gatekeeper for Apple Intelligence is not the NPU. The answer may be quite simple: insufficient RAM.
Ming-Chi Kuo points out that only the iPhone 15 Pro and iPhone 15 Pro Max are equipped with 8GB of RAM; the iPhone 15 has just 6GB, which is not enough to support Apple Intelligence’s computational requirements.
Why does RAM matter for AI smartphones? When an AI model runs locally instead of in the cloud, its weights must be loaded into RAM or VRAM (a GPU’s dedicated video memory), and that takes a significant amount of capacity. Take the NVIDIA H200 GPUs used to run ChatGPT: a single card carries 141GB of VRAM, and several hundred such cards are needed to run the service.
Even the highest-spec iPhone 15 Pro Max, with 8GB of RAM, can only handle small, relatively simple AI models. According to Apple’s official website, on-device capabilities include features such as photo review, photo scene recognition, Siri suggestions, voice recognition, and translation. “If we reason a little, we’ll realize that these offline features don’t include generative AI,” says Williams. Similar locally run features already exist on other brands’ smartphones; the exciting generative-AI applications, such as generating images in chats or drafting documents in email, still rely on the cloud.
Ming-Chi Kuo points out that the current Apple Intelligence features require at least a 3-billion-parameter (3B) large language model (LLM), and Apple may later upgrade to a 7B model, which would require even more memory to run. RAM capacity could thus become a differentiating factor between Apple’s high-end and low-end models.
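Kuo’s figures can be sanity-checked with back-of-envelope arithmetic: a model’s RAM footprint is roughly its parameter count times the bytes stored per parameter. A rough sketch (the precision levels are illustrative assumptions, not Apple’s published configuration):

```python
def model_ram_gb(params_billions, bytes_per_param):
    """Rough RAM footprint of a model's weights:
    parameter count x bytes per parameter, converted to GB.
    Ignores activations, KV cache, and runtime overhead."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

# Illustrative precisions (assumptions, not Apple's actual settings):
# fp16 = 2 bytes/param, int4 quantization = 0.5 bytes/param.
for params in (3, 7):
    for label, bpp in (("fp16", 2.0), ("int4", 0.5)):
        print(f"{params}B @ {label}: {model_ram_gb(params, bpp):.1f} GB")
```

Even a 3B model at half precision needs roughly 5.6 GB for its weights alone, leaving almost nothing on a 6GB phone that must also run iOS and every open app; a future 7B model would pressure even 8GB devices unless aggressively quantized.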
There is therefore considerable anticipation around the iPhone 16: will the entire lineup get full access to Apple Intelligence, or will it again be limited to the Pro series? The answer will be revealed this September.
Reference: WIRED, Ming-Chi Kuo on X