- Raspberry Pi AI HAT+ 2 enables local LLMs on Raspberry Pi 5
- Hailo-10H accelerator provides 40 TOPS of INT4 inference performance
- PCIe interface allows high-speed communication with Raspberry Pi 5
Raspberry Pi has taken a significant step forward with the introduction of the AI HAT+ 2, an add-on board designed to bring generative AI workloads to the Raspberry Pi 5.
Previous AI HAT models primarily focused on enhancing computer vision capabilities, such as object detection and scene segmentation.
The new AI HAT+ 2 expands its functionality to support large language models (LLMs) and vision language models (VLMs) that can operate locally, eliminating the need for cloud services or constant internet access.
## Hardware Enhancements for Local Language Models
At the heart of this upgrade is the Hailo-10H neural network accelerator, which boasts 40 TOPS of INT4 inference performance.
In contrast to its predecessor, the AI HAT+ 2 is equipped with 8GB of dedicated onboard memory, allowing for the execution of larger models without utilizing the Raspberry Pi's system RAM.
This advancement enables the direct execution of LLMs and VLMs on the device, ensuring low latency and local data processing—critical for many edge computing applications.
Users can install compatible models using a standard Raspberry Pi distribution and interact with them through familiar interfaces, including browser-based chat applications.
The AI HAT+ 2 connects to the Raspberry Pi 5 via the GPIO header and utilizes the PCIe interface for data transfer, making it incompatible with the Raspberry Pi 4.
This connection supports high-bandwidth data transfer between the accelerator and the host, which is vital for efficiently handling model inputs, outputs, and camera data.
Demonstrations of the AI HAT+ 2 include text-based question answering with Qwen2, code generation using Qwen2.5-Coder, basic translation tasks, and visual scene descriptions from live camera feeds.
These tasks leverage AI tools designed to integrate seamlessly with the Raspberry Pi software ecosystem, including containerized backends and local inference servers.
All processing is conducted on the device itself, without reliance on external computing resources.
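The exact interface depends on which backend is installed, but local inference servers commonly expose an OpenAI-compatible HTTP API. The sketch below shows how a chat prompt might be sent to such a server from Python; the endpoint URL and model name are illustrative assumptions, not documented values for the AI HAT+ 2.

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "qwen2-1.5b-instruct") -> dict:
    """Build an OpenAI-style chat-completion payload.

    The model name is a placeholder; substitute whatever your
    local inference server actually reports as available.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }


def ask(prompt: str, url: str = "http://localhost:8000/v1/chat/completions") -> str:
    """POST the prompt to a local server and return the reply text.

    The URL is a hypothetical default; adjust it to the port your
    containerized backend listens on.
    """
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request never leaves `localhost`, prompts and camera data stay on the device, which is the point of running inference locally.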
The supported models range from one to one and a half billion parameters, which is relatively modest compared to cloud-based systems that can handle much larger scales.
These smaller LLMs are optimized for limited memory and power constraints rather than broad, general-purpose knowledge.
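A quick back-of-the-envelope calculation shows why this parameter range fits the hardware: at INT4, each weight takes half a byte, so a 1.5B-parameter model needs roughly 0.75 GB for its weights alone, comfortably within the 8GB of onboard memory (activations and the KV cache add overhead on top of this, which this sketch ignores).

```python
def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only (no KV cache or activations)."""
    return params * bits_per_weight / 8 / 1e9


# A 1.5B-parameter model quantized to INT4:
print(weight_memory_gb(1.5e9, 4))   # 0.75 GB of weights
# The same model in FP16 would need four times as much:
print(weight_memory_gb(1.5e9, 16))  # 3.0 GB
```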
To mitigate these limitations, the AI HAT+ 2 supports fine-tuning techniques such as Low-Rank Adaptation (LoRA), enabling developers to tailor models to specific tasks while keeping most of the original weights frozen.
Vision models can also be retrained using application-specific datasets via Hailo’s toolchain.
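The core idea behind Low-Rank Adaptation can be sketched in a few lines of NumPy: rather than updating a full weight matrix `W`, training adjusts only two small matrices `B` and `A` whose product forms a low-rank correction. This is an illustrative sketch of the technique itself, not Hailo's toolchain or any specific library's implementation; the initialization values are placeholders for learned parameters.

```python
import numpy as np


def lora_adapt(W: np.ndarray, r: int, alpha: float = 16.0, seed: int = 0):
    """Return (W + (alpha/r) * B @ A, number of trainable parameters).

    W stays frozen; only B (d x r) and A (r x k) would be trained.
    B starts at zero, so the adapted matrix initially equals W,
    which is the standard LoRA initialization.
    """
    d, k = W.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((r, k)) * 0.01  # trainable
    B = np.zeros((d, r))                    # trainable, zero-initialized
    adapted = W + (alpha / r) * B @ A
    return adapted, B.size + A.size


# Trainable-parameter savings for a 1024x1024 layer at rank 8:
W = np.zeros((1024, 1024))
adapted, trainable = lora_adapt(W, r=8)
print(trainable)  # 16384 trainable parameters
print(W.size)     # 1048576 frozen parameters
```

At rank 8, the trainable parameters amount to under 2% of the layer, which is what makes fine-tuning feasible on memory-constrained hardware.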
Priced at $130, the AI HAT+ 2 is positioned above earlier vision-focused accessories while delivering comparable computer vision throughput.
For tasks centered solely on image processing, the upgrade offers limited improvements, as its primary value lies in local LLM execution and applications requiring data privacy.
In summary, this hardware demonstrates that generative AI is now a viable option on Raspberry Pi devices, although challenges related to memory capacity and model size persist.
