Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac

2024-11-13 03:16:00
simonwillison.net

12th November 2024

There’s a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba’s Qwen research team. On first impression it looks like the buzz is well deserved.

Qwen claim:

Qwen2.5-Coder-32B-Instruct has become the current SOTA open-source code model, matching the coding capabilities of GPT-4o.

That’s a big claim for a 32B model that’s small enough to run on my 64GB MacBook Pro M2. Qwen’s published scores look impressive, comparing favorably with GPT-4o and Claude 3.5 Sonnet (October 2024 edition) across various code-related benchmarks.

How about benchmarks from other researchers? Paul Gauthier’s Aider benchmarks have a great reputation and Paul reports:

The new Qwen 2.5 Coder models did very well on aider’s code editing benchmark. The 32B Instruct model scored in between GPT-4o and 3.5 Haiku.

84% 3.5 Sonnet,
75% 3.5 Haiku,
74% Qwen2.5 Coder 32B,
71% GPT-4o,
69% Qwen2.5 Coder 14B,
58% Qwen2.5 Coder 7B

That was for the Aider “whole edit” benchmark. The “diff” benchmark scores well too, with Qwen2.5 Coder 32B tying with GPT-4o (but a little behind Claude 3.5 Haiku).

Given these scores (and the positive buzz on Reddit) I had to try it for myself.

My attempts to run the Qwen/Qwen2.5-Coder-32B-Instruct-GGUF Q8 using llm-gguf were a bit too slow, because I don’t have that compiled to use my Mac’s GPU at the moment.

But both the Ollama version and the MLX version worked great!

I installed the Ollama version using:

ollama pull qwen2.5-coder:32b

That fetched a 20GB quantized file. I ran a prompt through that using my LLM tool and Sergey Alexandrov’s llm-ollama plugin like this:

llm install llm-ollama
llm models # Confirming the new model is present
llm -m qwen2.5-coder:32b 'python function that takes URL to a CSV file and path to a SQLite database, fetches the CSV with the standard library, creates a table with the right columns and inserts the data'

Here’s the result. The code worked, but I had to work around a frustrating ssl bug first (which wouldn’t have been an issue if I’d allowed the model to use requests or httpx instead of the standard library).
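For reference, here is a minimal sketch of the kind of function that prompt describes. It is not the model's output: the function name `csv_url_to_sqlite`, the default table name `data`, and the choice to store every column as TEXT are assumptions made for illustration. It also assumes every row has the same number of fields as the header, and it does nothing about the SSL issue mentioned above.

```python
import csv
import io
import sqlite3
import urllib.request


def csv_url_to_sqlite(url, db_path, table="data"):
    """Fetch a CSV from url and load it into a table in a SQLite database.

    Hypothetical helper: all columns are created as TEXT, and the first
    CSV row is treated as the header.
    """
    # Fetch using only the standard library, as the prompt requested.
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8-sig")

    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        raise ValueError("CSV is empty")
    header, records = rows[0], rows[1:]

    def quote(name):
        # Quote identifiers so unusual column names cannot break the SQL.
        return '"' + name.replace('"', '""') + '"'

    columns = ", ".join(f"{quote(name)} TEXT" for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"CREATE TABLE IF NOT EXISTS {quote(table)} ({columns})")
            conn.executemany(
                f"INSERT INTO {quote(table)} VALUES ({placeholders})", records
            )
    finally:
        conn.close()
    return len(records)
```

Calling `csv_url_to_sqlite(url, "data.db")` returns the number of rows inserted. Because `urllib.request.urlopen` also accepts `file://` URLs, the function can be tried against a local file without touching the network.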

I also tried running it using MLX, the fast array framework for Apple Silicon, via the mlx-lm library directly, run with uv like this:

uv run --with mlx-lm \
  mlx_lm.generate \
  --model mlx-community/Qwen2.5-Coder-32B-Instruct-8bit \
  --max-tokens 4000 \
  --prompt 'write me a python function that renders a mandelbrot fractal as wide as the current terminal'

That gave me a very satisfying result—when I ran the code it generated in a terminal I got this:

[Image: macOS terminal window displaying a pleasing mandelbrot fractal as ASCII art]
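To give a sense of what that prompt is asking for, here is a minimal sketch of a terminal-width Mandelbrot renderer. It is not the code the model generated: the function name `mandelbrot_ascii`, the character ramp, the iteration limit and the plotted region are all assumptions chosen for illustration.

```python
import shutil


def mandelbrot_ascii(width=None, max_iter=50):
    """Return the Mandelbrot set as lines of ASCII art, width columns wide."""
    if width is None:
        # Use the current terminal width (falls back to 80 if unknown).
        width = shutil.get_terminal_size().columns
    # Terminal cells are roughly twice as tall as they are wide, so use
    # half the rows that the region's aspect ratio would otherwise imply.
    height = max(1, int(width * (2.4 / 3.5) / 2))
    chars = " .:-=+*#%@"  # sparse to dense; '@' marks points that never escape
    lines = []
    for row in range(height):
        y = -1.2 + 2.4 * row / max(1, height - 1)
        line = []
        for col in range(width):
            x = -2.5 + 3.5 * col / max(1, width - 1)
            z, c = 0j, complex(x, y)
            n = 0
            # Iterate z -> z*z + c until it escapes or we give up.
            while abs(z) <= 2 and n < max_iter:
                z = z * z + c
                n += 1
            line.append(chars[n * (len(chars) - 1) // max_iter])
        lines.append("".join(line))
    return lines


if __name__ == "__main__":
    print("\n".join(mandelbrot_ascii()))
```

Passing an explicit `width` makes the output reproducible, which is handy when redirecting to a file rather than a terminal.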

This is a really promising development. 32GB is just small enough that I can run the model on my Mac without having to quit every other application I’m running, and both the speed and the quality of the results feel genuinely competitive with the current best of the hosted models.
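A rough back-of-envelope check on those sizes, as a sketch only: it assumes a nominal 32 billion parameters, counts only the weights (no context cache or runtime overhead), and uses decimal gigabytes. The helper names are hypothetical.

```python
def weights_size_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in decimal GB."""
    return params_billion * bits_per_weight / 8


def implied_bits_per_weight(file_size_gb, params_billion):
    """Average bits per weight implied by a file size in decimal GB."""
    return file_size_gb * 8 / params_billion


# Assumed nominal parameter count for a "32B" model.
PARAMS_BILLION = 32

print(weights_size_gb(PARAMS_BILLION, 8))            # 8-bit weights
print(implied_bits_per_weight(20, PARAMS_BILLION))   # the 20GB Ollama download
```

Under those assumptions the 8-bit weights come to about 32GB, and the 20GB Ollama file works out to roughly 5 bits per weight.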

Given that code assistance is probably around 80% of my LLM usage at the moment this is a meaningfully useful release for how I engage with this class of technology.
