Megan Crouse
2025-06-23 18:12:00
www.techrepublic.com
Microsoft’s newest small language model for on-device processing has a specific use case: the Windows 11 Settings application.
Mu is the technology behind the AI agent in the Settings menu that allows users to ask natural language questions. With permission, the agent can take actions on its own to solve the problem posed by the user; as such, it needs to be able to interpret and manipulate hundreds of system settings.
Mu is now in preview for some Windows Insiders.
How Mu packs processing power onto relatively compact hardware
In a blog post on June 23, Microsoft revealed how the on-device small language model behind the AI agent in Settings works. Mu was trained on NVIDIA A100 GPUs on Azure Machine Learning. Once deployed, Mu runs on the PC’s Neural Processing Unit (NPU), responding at more than 100 tokens per second.
Mu builds on what Microsoft learned about running small language models on-device from Phi Silica, the model built in 2024 for Windows 11 Copilot+ PCs on Snapdragon X Series laptops.
Choosing an encoder-decoder language model instead of a decoder-only architecture also increases efficiency, according to Microsoft.
“By separating the input tokens from output tokens, Mu’s one-time encoding greatly reduces computation and memory overhead,” Vivek Pradeep, vice president and distinguished engineer in Windows Applied Sciences at Microsoft, wrote in the blog post. “In practice, this translates to lower latency and higher throughput on specialized hardware.”
An encoder-decoder language model is more efficient than a decoder-only model, Microsoft said. Image: Microsoft
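To make the efficiency argument concrete, here is a rough back-of-the-envelope sketch in Python. It counts only how many times a token passes through a transformer layer, and it ignores attention cost, caching, and hardware effects. The function names, layer counts, and token counts are hypothetical values chosen for illustration, not Mu's actual configuration.

```python
def decoder_only_passes(n_input: int, n_output: int, n_layers: int) -> int:
    """Every input and output token goes through every layer of the single stack."""
    return (n_input + n_output) * n_layers


def encoder_decoder_passes(
    n_input: int, n_output: int, enc_layers: int, dec_layers: int
) -> int:
    """Input tokens go through the encoder once; output tokens use only the decoder."""
    return n_input * enc_layers + n_output * dec_layers


# Hypothetical request: a 40-token question answered with 10 tokens.
N_INPUT, N_OUTPUT = 40, 10

# Hypothetical models with the same total depth (12 layers).
baseline = decoder_only_passes(N_INPUT, N_OUTPUT, n_layers=12)
split = encoder_decoder_passes(N_INPUT, N_OUTPUT, enc_layers=8, dec_layers=4)

print(baseline, split)  # 600 360
```

Under these assumed numbers, the split design does fewer layer passes because the input is encoded once and each generated token touches only the smaller decoder stack. Real savings depend on the model and the hardware.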
Mu is optimized for the NPUs on Copilot+ PCs
Over the course of working with NPUs, Microsoft’s developers learned how to shape Mu’s design to fit the processor. This included ensuring the model architecture and parameter shapes aligned with the hardware’s parallelism and memory limits, optimizing the parameter distribution between the encoder and decoder, and enhancing efficiency in other ways.
The parameter count was reduced by using the same set of weights to represent input tokens and to generate output logits, a crucial element in ensuring speedy performance on memory-constrained NPUs.
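The weight-sharing idea can be sketched in a few lines of plain Python. This is a minimal illustration of the general technique (often called weight tying), not Microsoft's implementation. The class name `TiedEmbedding` and the tiny 3-token, 2-dimensional weight table are hypothetical.

```python
from typing import List


class TiedEmbedding:
    """One weight table used both to embed tokens and to score output tokens."""

    def __init__(self, weights: List[List[float]]) -> None:
        # Shape: vocab_size rows, hidden_dim columns.
        self.weights = weights

    def embed(self, token_id: int) -> List[float]:
        # Input side: look up the row for this token.
        return self.weights[token_id]

    def logits(self, hidden: List[float]) -> List[float]:
        # Output side: dot product of the hidden state with every row
        # of the same table gives one score per vocabulary entry.
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.weights]

    def parameter_count(self) -> int:
        return len(self.weights) * len(self.weights[0])


# Hypothetical toy vocabulary of 3 tokens with 2-dimensional embeddings.
table = TiedEmbedding([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

hidden_state = table.embed(2)        # [1.0, 1.0]
scores = table.logits(hidden_state)  # [1.0, 1.0, 2.0]

tied_params = table.parameter_count()  # 6
untied_params = 2 * tied_params        # 12 if input and output tables were separate
```

With separate input and output tables, the toy model would store 12 values instead of 6. On a real vocabulary the same halving applies to what is often one of the largest weight matrices in a small model.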
If a user's request would trigger operations that are unsupported or inefficient on the NPU, Mu avoids those operations.
In addition, changes to the transformer architecture and to model quantization techniques improve power efficiency on the NPU.
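The article does not say which quantization scheme Mu uses. As a generic illustration of the idea, the sketch below shows symmetric 8-bit quantization in Python: floating-point weights are mapped to small integers plus one shared scale factor, which cuts storage and lets integer hardware do the arithmetic. The function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights for illustration.
weights = [1.27, -0.64, 0.0, 0.333]

q_weights, q_scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, q_scale)

print(q_weights)  # [127, -64, 0, 33]
```

Each restored value differs from the original by at most half the scale, which is the usual trade-off: a small loss of precision in exchange for a quarter of the storage of 32-bit floats.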
The AI agent in Settings is available in the Windows 11 Insider Preview Build, accessible to Windows Insiders in the Dev Channel. Only Snapdragon-powered Copilot+ PCs can use it for now, although Microsoft said AMD and Intel-based PCs will gain access at an unspecified date.