At the recent Augmented World Expo, Ziad Asghar, SVP of XR & Spatial Computing at Qualcomm, presented the Snapdragon AR1+ Gen 1 chipset that powers the RayNeo X3 Pro smart glasses. During his talk, he gave the first public demonstration of a Generative AI experience driven by voice input: a Small Language Model (SLM) answered questions with a response time close to that of a cloud-based LLM (around 5 seconds).

Let's begin with a guided comparison between SLMs and LLMs. Until now, most Generative AI applications have relied on LLMs. However, thanks to optimized models such as DeepSeek and steadily improving compute efficiency, researchers can now fit SLMs onto embedded devices. The table below highlights the key…