fcanesin 12 minutes ago [-]
Zhipu AI was founded by a superstar Tsinghua professor, did an IPO in January (on the Hong Kong Stock Exchange), hired half of its past research lab, and its stock is up >10x since. This is not a "just distill Claude" thing.
osti 3 hours ago [-]
Given that DeepSwe is one of the very few coding benchmarks worth taking a look at, this achieves a rather excellent result on it (not far from Opus 4.8).
From looking at the results and my own impression of 5.1 and other models, I think this is the best Chinese coding model by a not-insignificant margin.
LaurensBER 3 hours ago [-]
I've been very pleased with its performance over the last few days.
It's definitely not near Opus 4.8 level but it's very impressive nonetheless and it does do design extremely well.
If I have a fully maxed out MacBook Pro, would it make sense to just switch from Opus 4.8 to this? I've never tried running local models for coding...
entrope 38 minutes ago [-]
Hugging Face says this model has 753B parameters, which will need a lot more RAM than a maxed-out MacBook Pro has. With 40B active parameters, running from SSD would need patience.
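To put rough numbers on the RAM point, here is a back-of-the-envelope sketch. Only the 753B figure comes from the comment above. The 128 GB RAM ceiling and the quantization levels are my assumptions, and the function name is made up for illustration. It counts weights only and ignores KV cache and activations.

```python
# Rough weight-footprint estimate. Only TOTAL_PARAMS comes from the thread;
# everything else here is an assumption for illustration.
TOTAL_PARAMS = 753e9     # total parameters, as quoted in the comment above
ASSUMED_RAM_GB = 128     # assumption: unified memory on a maxed-out MacBook Pro


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Size of the weights alone in GB (10^9 bytes); no KV cache, no activations."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(TOTAL_PARAMS, bits)
        ratio = gb / ASSUMED_RAM_GB
        print(f"{bits:>2}-bit weights: {gb:.1f} GB ({ratio:.1f}x the assumed RAM)")
```

Under these assumptions even 4-bit weights come to about 376.5 GB, roughly three times the assumed memory, so most of the model would have to live on disk.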
api 37 minutes ago [-]
I’ve wondered for a while if anyone is working on very wide, channel-parallel SSD storage (kind of like RAID 0) for this purpose. Couple that with a tensor processor and that would be interesting.
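A rough sketch of how much striping could help, if decoding is purely limited by storage bandwidth. Only the 40B active-parameter count comes from the thread. The 4-bit weights, the 7 GB/s per-drive read speed, perfectly linear RAID 0 scaling, and re-reading every active weight from disk for each token (no caching of shared layers or hot experts) are all assumptions, and `tokens_per_second` is a hypothetical helper.

```python
# Bandwidth-bound decode estimate for weights streamed from striped SSDs.
# Only ACTIVE_PARAMS comes from the thread; the rest are illustrative assumptions.
ACTIVE_PARAMS = 40e9          # active parameters per token, as quoted above
ASSUMED_BITS = 4              # assumption: 4-bit quantized weights
ASSUMED_DRIVE_GB_PER_S = 7.0  # assumption: sequential read speed of one drive


def tokens_per_second(active_params: float, bits_per_weight: float,
                      drive_gb_per_s: float, n_drives: int = 1) -> float:
    """Upper bound on tokens/s if every active weight is read from disk per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    # Idealized RAID 0: aggregate bandwidth scales linearly with drive count.
    aggregate_bytes_per_s = drive_gb_per_s * 1e9 * n_drives
    return aggregate_bytes_per_s / bytes_per_token


if __name__ == "__main__":
    for n in (1, 4, 8, 16):
        rate = tokens_per_second(ACTIVE_PARAMS, ASSUMED_BITS,
                                 ASSUMED_DRIVE_GB_PER_S, n)
        print(f"{n:>2} drive(s): about {rate:.2f} tokens/s at best")
```

With these numbers a single drive tops out well under one token per second, and it takes several striped drives to reach a few tokens per second. Real results would depend on caching, access patterns, and controller overhead, none of which are modeled here.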
It's definitely not near Opus 4.8 level but it's very impressive nonetheless and it does do design extremely well.
Better than Opus?