I have GLM and Kimi. Kimi was in most cases better and was my replacement for Claude when I ran out of tokens. Now I'm finding myself using GLM more than Kimi. It's funny that GLM vs Kimi is like Codex vs Claude: GLM and Codex are better for backend, and Kimi and Claude more for frontend.
As Kimi did a huge amount of Claude distillation, that impression seems to be somewhat grounded in the data.
Yeah, it seems they did not align it too much, at least for now. Yesterday it helped me bypass the bot detection on a local marketplace that I wanted to scrape some listings from for my personal alerting system. All the others failed, but GLM 5.1 found a set of parameters and tweaks to make my browser-in-a-container go undetected.
I always jump on the Chinese models when I'm trying to do something that the US ones chastise me for, they're a little more liberal, especially around copyright.
When it works and it's not slow, it can impress. Yesterday it solved something that Kimi K2.5 could not, and Kimi was the best open-source model for me. But it's still slow sometimes. I have Z.ai and Kimi subscriptions for when I run out of tokens on Claude (Max) and Codex (Plus).
I have a feeling it's nearing Opus 4.5 level, if they could fix it going off the rails after around 100k tokens.
Why don't you start a new session or use the /compact command when context gets to 100k tokens?
From my testing it was ok until 145k tokens, the largest context I had before switching to a new session. I think Z.ai officially said it should be good until 200k tokens.
Using it in OpenCode, the context is compacted automatically when it gets too large.
It will also cost OpenAI dearly if they don't communicate clearly, because I for one will push internally to switch from OpenAI (we are on Azure, actually) to Anthropic. And my private account besides.
I think in the West we assume everything is blocked. But for example, if you book an eSIM, you get direct access to Western services when you visit, because traffic is routed through a server elsewhere. Hong Kong is totally different: people there basically use WhatsApp and Google Maps, and everything worked when I was there.
But also yes, the parent is right: HF is more or less inaccessible, and ModelScope is frequently cited as the mirror to use (although many Chinese labs seem to treat HF as the mirror and ModelScope as the "real" origin).
https://snipboard.io/jmGKfM.jpg