girvo · 2026-05-10T02:53:59 1778381639

But… it doesn’t matter? Even if it was some very illegal drug, that doesn’t change the fact that this detention system (and Japan’s justice system in general) is quite inhumane.

girvo · 2026-05-09T21:45:27 1778363127

I’ve struggled to get Opus to not write the weirdest possible Rust, ignoring all idioms and so on. Any tips?

nyssos · 2026-05-10T10:25:26 1778408726

Be absolutely ruthless with technical debt. Opus is perfectly capable of producing idiomatic code in any mainstream language you please, but will seize on any opportunity to justify writing basically-Python instead because that's "consistent" with the "convention". Deprive it of that excuse.

girvo · 2026-05-10T21:00:07 1778446807

Yeah that’s basically what I mean! I have no issues wrangling it myself, but now I’m curious how those who are managing “fleets” of agents while shipping four features a day are doing it. They’re not, I’d assume?

antonvs · 2026-05-10T03:05:38 1778382338

Give it coding guidelines. It'll largely try to do what you ask.

Left to itself, it often follows human developers who conceive of their goal as "get the program working, the end justifies the means." Which makes sense because there are a lot of systems like that in the training corpus.

girvo · 2026-05-06T00:42:18 1778028138

Oh that's fascinating. 3.6 27B is pretty damned good, but slow on my DGX Spark-alike. It generates huge reams of thinking before it gets the (usually correct!) answer, so wall-clock time is rough for tasks even at ~20 tok/s.

I'm surprised the 26B-A4B is better? It should be faster too, interesting. I'm excited to try 31B with MTP, because MTP-2 is what makes 27B bearable on the GB10.

What are you using it for? Agent-based coding, or something else?

nzeid · 2026-05-06T19:10:58 1778094658

General purpose, mostly internet research in the form of slow-crawling. (Emphasis on slow - I've ultimately landed on Scrapling's API for seamless content rendering, and I use image support so as not to exclude informative images or weirdly rendered text.)

For coding I don't need image support, so I fill the entire GPU in text-only mode. I don't have a workflow where I send LLMs off to generate thousands of lines of code, but what little coding I did, I did with Qwen3.6, and it was spectacular, as you likely suggest.

girvo · 2026-05-05T21:06:29 1778015189

You can, though. We used pthreads (well, a pthread-compatible API) in production at massive scale on the ESP32-S3.
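For anyone who hasn't tried it, here is a minimal sketch of what that looks like, using only standard POSIX calls (`pthread_create`, `pthread_join`). The names `worker`, `worker_arg_t`, `run_workers` and `NUM_WORKERS` are made up for illustration and are not from our production code. As far as I know, ESP-IDF maps these calls onto FreeRTOS tasks, so check your stack size and priority defaults in the project configuration before relying on it.

```c
#include <pthread.h>

#define NUM_WORKERS 4  /* hypothetical worker count, chosen for the example */

/* Hypothetical per-thread argument: each worker squares its own id. */
typedef struct {
    int id;
    int result;
} worker_arg_t;

/* Thread entry point: writes only to its own argument, so no locking is needed. */
static void *worker(void *p)
{
    worker_arg_t *arg = p;
    arg->result = arg->id * arg->id;
    return NULL;
}

/* Starts NUM_WORKERS threads, joins them, and returns the sum of their
 * results, or -1 if any thread could not be created or joined. */
int run_workers(void)
{
    pthread_t threads[NUM_WORKERS];
    worker_arg_t args[NUM_WORKERS];
    int started = 0;
    int failed = 0;
    int sum = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        args[i].id = i;
        args[i].result = 0;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Join only the threads that actually started. */
    for (int i = 0; i < started; i++) {
        if (pthread_join(threads[i], NULL) != 0) {
            failed = 1;
        }
        sum += args[i].result;
    }

    return failed ? -1 : sum;
}
```

Call `run_workers()` from whatever your entry point is (`main` on a desktop, `app_main` under ESP-IDF). The same source should build on Linux with `-pthread`, which is handy for testing logic off-device.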

girvo · 2026-05-05T13:32:54 1777987974

It’s been quite a few years since I last did PHP, but I wrote my own wedding invite management tool using Laravel and PHP 8.5.

Herd is super neat, reminds me of XAMPP back in the day. FrankenPHP is everything I wanted out of a modern PHP web server.

And the language? Basically the same as I left it, with some nice things added. I kind of miss it :)

girvo · 2026-05-05T13:21:47 1777987307

The dirty secret is all the people talking about shipping 4 features a day etc are just lying about reviewing anything. They don’t review it at all.

spicyusername · 2026-05-06T11:29:59 1778066999

I didn't say shipping a day. I said shipping at the same time.

The review comes at the end, though I truly believe this will go away as well. Agents will also get better at review until they're good enough that no one will want to do it anyways. Good enough is good enough.

swader999 · 2026-05-05T14:17:38 1777990658

I review more thoroughly and faster with Claude than without.

Salgat · 2026-05-05T16:47:58 1777999678

Claude absolutely improves code review quality, but it still misses a lot. It's a second pair of eyes; it doesn't replace or remove the work you have to put in to fully review the code yourself.

It's like saying that you code reviewed faster just because someone else also reviewed the code. That's not how it works.

swader999 · 2026-05-05T16:56:11 1778000171

Agree, and with CC my volume and quality of PR review have substantially increased since 4.5. Without CC for review we would have a ridiculous bottleneck in our dev/qa pipeline.

girvo · 2026-05-05T22:25:23 1778019923

I'm faster, sure, but more thorough, no. The same, because I was already very careful. But it's not a massive win either; 4.7 misses too much still because it would need to read too much of the context each time to understand the architectural problems I'm catching.

It's nice to not have to care about nits and other things that we don't have lints for though, so that's useful.

girvo · 2026-05-05T05:15:15 1777958115

Frenchisms are great! “No A, No B, No C. Just D” in every comment “assisted” by AI isn’t that great :(

girvo · 2026-05-04T21:16:47 1777929407

It’s flagged because it’s obvious AI slop

johncole · 2026-05-08T03:19:57 1778210397

No, because of Russian trolls.

girvo · 2026-05-04T13:19:23 1777900763

I’m flagging them, but it’s not enough. Front page is full of it these days. Sad times.

girvo · 2026-05-04T11:38:02 1777894682

Already sort of exists in the high-end contracting/consulting software dev business, in Australia at least.