Elon Musk’s xAI has made waves in the last week with the release of its Grok-2 large language model (LLM) chatbot, available through an $8-per-month subscription on the social network X.
Now both versions of Grok-2, the flagship model and the smaller Grok-2 mini (designed to be less powerful but faster), analyze information and output responses more quickly, after two xAI developers rewrote the inference code stack from scratch over the last three days.
As xAI developer Igor Babuschkin posted this afternoon on X under his handle @ibab:
“Grok 2 mini is now 2x faster than it was yesterday. In the last three days @lm_zheng and @MalekiSaeed rewrote our inference stack from scratch using SGLang. This has also allowed us to serve the big Grok 2 model, which requires multi-host inference, at a reasonable speed. Both models didn’t just get faster, but also slightly more accurate. Stay tuned for further speed improvements!”
The two developers responsible are Lianmin Zheng and Saeed Maleki, according to Babuschkin’s post.
To rewrite Grok-2’s inference stack, they relied on SGLang, an open-source (Apache 2.0-licensed) system for efficiently executing complex language model programs that its creators say achieves up to 6.4 times higher throughput than existing systems.
SGLang was developed by researchers from Stanford University, UC Berkeley, Texas A&M University and Shanghai Jiao Tong University and integrates a frontend language with a backend runtime to simplify the programming of language model applications.
The system is versatile, supporting a wide range of models, including Llama, Mistral, and LLaVA, and is compatible with both open-weight and API-based models like OpenAI’s GPT-4. SGLang’s ability to optimize execution through automatic cache reuse and parallelism within a single program makes it a powerful tool for developers working with large-scale language models.
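To give a sense of what programming against SGLang looks like, here is a minimal sketch using its Python frontend. The model, endpoint address, prompts, and parameter values are illustrative assumptions for this article, not details of xAI’s actual Grok-2 deployment.

```python
# Minimal SGLang sketch (illustrative only; not xAI's Grok-2 serving code).
# Assumes an SGLang runtime is already serving a chat model locally, e.g.:
#   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
import sglang as sgl


@sgl.function
def summarize_and_tag(s, document):
    # The shared system prompt below is the kind of common prefix that SGLang's
    # automatic cache reuse (RadixAttention) can avoid recomputing across requests.
    s += sgl.system("You are a concise technical assistant.")
    s += sgl.user("Summarize the following text in two sentences:\n" + document)
    s += sgl.assistant(sgl.gen("summary", max_tokens=128))
    s += sgl.user("Now give three topic tags for the same text.")
    s += sgl.assistant(sgl.gen("tags", max_tokens=32))


if __name__ == "__main__":
    # Point the frontend at the running backend runtime.
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    docs = ["First example document...", "Second example document..."]
    # run_batch executes many instances of the program in parallel on the runtime.
    states = summarize_and_tag.run_batch([{"document": d} for d in docs])
    for state in states:
        print(state["summary"], state["tags"])
```

The repeated system prompt and the `run_batch` call illustrate the two features described above: reuse of cached shared prefixes across requests and parallel execution of many program instances on the same runtime.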
Grok-2 and Grok-2-Mini Performance Highlights
Additionally, in the latest update to the third-party LMSYS Chatbot Arena leaderboard, which rates AI model performance, the main Grok-2 has secured the #2 spot with an impressive Arena Score of 1293, based on 6686 votes.
This effectively makes Grok-2 the second most powerful AI model in the world (fittingly, given its name), tied with Google’s Gemini-1.5 Pro model and just behind OpenAI’s latest version of ChatGPT-4o.
Grok-2-mini, which has also benefited from the recent enhancements, has climbed to the #5 position, boasting an Arena Score of 1268 from 7266 votes, just behind GPT-4o mini and Claude 3.5 Sonnet.
Both models are proprietary to xAI.
Grok-2 has distinguished itself particularly in mathematical tasks, where it ranks #1. The model also holds strong positions across various other categories, including Hard Prompts, Coding, and Instruction-following, where it consistently ranks near the top.
This performance places Grok-2 ahead of other prominent models like OpenAI’s GPT-4o (May 2024), which now ranks #4.
Future Developments
According to a response by Babuschkin on X, the main advantage of using Grok-2-mini over the full Grok-2 model is its enhanced speed.
However, Babuschkin pledged that xAI would further improve the processing speed of Grok-2-mini, which could make it an even more attractive option for users seeking high performance with lower computational overhead.
The addition of Grok-2 and Grok-2-mini to the Chatbot Arena leaderboard and their subsequent performance have garnered significant attention within the AI community.
The models’ success is a testament to xAI’s ongoing innovation and its commitment to pushing the boundaries of what AI can achieve.
As xAI continues to refine its models, the AI landscape can expect further enhancements in both speed and accuracy, keeping Grok-2 and Grok-2-mini at the forefront of AI development.