Hacker News | tuzemec's comments

I ended up using a UGreen DisplayPort switch and a separate USB switch. It turns out that most KVM switches can't handle USB mice with high polling rates, like the Logitech G502 X Plus, very well. Now I need to press two buttons when switching from the gaming PC to the MacBook, but everything works.

You can enable the LM Studio server and use any OpenAI-compatible harness to use the models running inside it. OpenCode, pi, even Claude and Codex...
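For anyone who hasn't tried it: LM Studio's server speaks the OpenAI chat-completions wire format (by default on http://localhost:1234), so a harness only needs its base URL pointed there. A minimal stdlib-only sketch; the model id is a placeholder for whatever you've loaded:

```python
import json
import urllib.request

# LM Studio's default local endpoint; adjust if you changed the port.
BASE_URL = "http://localhost:1234/v1"

def chat_payload(model: str, prompt: str, max_tokens: int = 256) -> bytes:
    """Build an OpenAI-style /chat/completions request body."""
    return json.dumps({
        "model": model,  # placeholder: use the model id LM Studio shows
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode()

def ask(model: str, prompt: str) -> str:
    """Send one chat turn to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=chat_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Dedicated clients (OpenCode, the official openai package, etc.) work the same way: point the base URL at the local server and supply any non-empty API key.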

I'm currently experimenting with running google/gemma-4-26b-a4b with LM Studio (https://lmstudio.ai/) and OpenCode on an M3 Ultra with 48GB RAM. And it seems to be working. I had to increase the context size to 65536 so the prompts from OpenCode would work, but no other problems so far.

I tried running the same on an M3 Max with less memory, but couldn't increase the context size enough to be useful with OpenCode.

It's also easy to integrate it with Zed via ACP. For now it's mostly simple code review tasks and generating small front-end related code snippets.


I have a similar setup. It might be worth checking out pi-coding-agent [0].

The system prompt and tools have very little overhead (<2k tokens), making the prefill latency feel noticeably snappier compared to OpenCode.

[0] https://www.npmjs.com/package/@mariozechner/pi-coding-agent#...


Just set up Pi after listening to Mario's talk at AIE Europe [0] and have solid initial impressions! Especially on limited hardware like an MB Air, it seems a lot more resource-efficient.

[0] https://www.youtube.com/live/_zdroS0Hc74?t=3633s


Thanks! I just ran a quick test with pi, and it's working a bit faster.

Pi is _really_ good for personal stuff, but since it lacks every single safety feature imaginable, it's not really something one can deploy in a corporate environment :D

I run this model on my AMD RX 7900 XTX with 24GB VRAM, with up to 4 concurrent chats and a 512K context window in total. It is very fast (~100 t/s) and feels instant and very capable, and I've been using Claude Code less and less these days.
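For others sizing this up: context length is usually what eats the VRAM, via the KV cache. A back-of-envelope sketch; the dimensions below are hypothetical round numbers, not the actual specs of any particular model:

```python
def kv_cache_bytes(tokens: int, layers: int = 24, kv_heads: int = 4,
                   head_dim: int = 128, dtype_bytes: int = 1) -> int:
    """Rough KV-cache size: 2 (keys and values) x layers x kv_heads
    x head_dim x bytes-per-element x tokens. GQA (few kv_heads) and a
    quantized cache (1 byte/element) are what make long contexts fit."""
    return 2 * layers * kv_heads * head_dim * dtype_bytes * tokens

# 512K tokens of total context under these assumed dimensions:
gib = kv_cache_bytes(512 * 1024) / 1024**3
print(f"{gib:.1f} GiB")  # 12.0 GiB under these assumptions
```

With fp16 cache and more layers/heads the same context can easily blow past 24GB, which is why cache quantization and GQA matter so much here.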

I do the same thing on a MacBook Pro with an M4 Max and 64GB. I had problems until the most recent LM Studio update (0.4.11+1): tool calling didn't work correctly.

Now both codex and opencode seem to work.


Which do you prefer? And what LM Studio API works best for these tools?

I use the OpenAI API for everything. I think codex is more polished, but I don't really prefer anything: I haven't used them enough. I mostly use Claude Code.

I did the same using the MLX version on an M1 MacBook, with LM Studio integrated into Xcode. I had to up the context size. I ran it against a very modest iOS codebase and it didn't do well; it just petered out at one point. Odd. It's a pretty good chatbot, and maybe it'll work against other code, but it wasn't useful with Xcode for me.

I spun up a GPU on Runpod and tried the 31B at full precision, and it was really impressive. I'm now using it via the Google API, which gives you 1500 requests a day for free, IIRC.

Be very careful about using Google's APIs as a consumer; they have poor rate limiting and ineffective anomaly protection.

I (a hobbyist running a small side project for a dollar or two a month in normal usage, so my account is marked as "individual") got hit with a ~$17,000 bill from Google Cloud because a key got leaked somehow or my homelab got compromised, and the attacker consumed tens of thousands of dollars in Gemini usage in only a few hours. It wasn't even the same Google project as my side project; it was another that hadn't seen activity in a year+.

Google refuses to apply any adjustments. Their billing specialist even mixed up my account with someone else's, refuses to provide further information on why adjustments are being rejected, refuses any escalation, etc. I already filed a complaint with the FTC and the NYS Attorney General, but the rep couldn't care less.

My gripe is not that the key was potentially leaked or compromised and I have to pay for a very expensive "you messed up" mistake; it's that they let an API key rack up tens of thousands of dollars in maybe 4 hours with usage patterns (model selection, generating text vs. images, volume of calls, likely different IPs and user agents and whatnot) completely unlike my normal usage. That's just predatory behavior on an account marked as individual/consumer (not a business).


Agree totally. I'm super paranoid and anxious about this issue. I've seen too many horror stories posted on Reddit. I did set alarms at $10 a day on the account, but those are only alarms and it could be thousands over before I see them.

I think Google did finally implement hard limits this month and I need to go and find that setting, but it's useless if, like you say, they have shitty rate limiting and measurement so that you're way over the limit before they stop you.


Not sure if you already tried but both GLM Flash and Qwen models are much better than Gemma for that in my experience.

I am using a 24GB GPU so it might be different in your case, but I doubt it.


GGUF or MLX? Edit: just tried a community MLX build, and LM Studio said it didn't support loading it yet.

I have 4-5 TypeScript projects and one Python project opened in Zed at any given time (with a bunch of LSPs, ACPs, open terminals, etc.) and I see around 1.2-1.4GB of usage.

I opened just one of the TypeScript projects inside VSCode and saw something like 1GB (combining the helper processes' usage). I'm not using it actively, so no extra plugins and so on.

That's on mac, so I guess it may vary on other systems.


Also Fluke - Risotto. Similar vibes.


That's pretty cool!

Btw, it seems the "How trimming works" screen has some missing translations.


I'm using two external 1440p displays at work and one 32" 4k at home (with a MBP). Mostly front-end development and music production at home.

AeroSpace improved my productivity a lot on that front. My main apps/windows are now bound to an alt+key combination, so I can easily switch without alt+tabbing like in the dark ages.
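For anyone curious, those bindings live in ~/.aerospace.toml. A minimal sketch (the specific keys and workspaces here are made up, not my actual config):

```toml
# ~/.aerospace.toml -- minimal sketch
[mode.main.binding]
# jump straight to a workspace instead of alt+tabbing
alt-1 = 'workspace 1'
alt-2 = 'workspace 2'
# move the focused window to a workspace
alt-shift-1 = 'move-node-to-workspace 1'
alt-shift-2 = 'move-node-to-workspace 2'
```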

All my windows usually take up the full screen - I simply can't stand a window that doesn't fill the entire screen - not sure if that's some kind of OCD. The cognitive load of managing apps and spaces was quite high at the beginning, but now it's just muscle memory. I do recognize that it's not for everyone, but works very well for me.


I own a pair of Focal Bathys - amazing sound, bluetooth connectivity, noise canceling, AND usb-c with DAC. I'm very happy with them.


Went to Zed from Sublime and never looked back (I'm on a Mac). Never touched VSCode apart from small tests to verify that the project settings (format on save, etc.) will work for my colleagues. Can't stand it.

I have at least 3-4 medium-sized front-end projects and at least one large Python project open at all times, and I've never had memory issues.


I can recommend 'Delta-V' by the same author too.


Delta-V was great, but I was disappointed by the bland characters and storytelling of its sequel, Critical Mass. It's easily Daniel Suarez's weakest novel.

The first of his books I read was "Kill Decision", which has more than held up in its predictions of drone warfare and AI advancements. Based on the accuracy of his predictions from that book, and Daemon too, I'm bullish on space mining!

