Hacker News | withinboredom's comments

Gazelle bikes are pretty fast ... instructions unclear.

For larger codebases ... maybe it will cut down on "let me create a random number wrapper for the 15th time" type problems.

You should already have skills which mention these utilities.

But maybe that’s enough tokens to feed an entire lifetime of user behaviour in for the digital twin dystopia?


"type problems" was doing the heavy lifting there, not literally "this utility".

Ask it about the aether as well. I think it was disproven around that time.

I've had a ceiling fall on me once, and it happened to a friend once while on vacation. Just because it hasn't happened to you doesn't mean it hasn't happened to other people.

Thanks for the anecdote. I don't think it changes the point of the metaphor.

> Thanks for the anecdote.

They're only sharing an anecdote because they are responding to your anecdote about not seeing a ceiling collapse.

> I don't think it changes the point of the metaphor.

If their anecdote is moot, then your anecdote is also moot; if anecdotes can only confirm a conclusion and never disconfirm it, then we've created an unfalsifiable construction with the conclusion baked into its premises.


Sure, I suppose that's something that someone who doesn't understand the discussion might say.

A person who better comprehends what they read might properly contextualize it within the larger conversation, where the point that stands is that LLMs and ceilings are both useful, neither is doomed such that no one should use them, and individual instances of failure are somewhat uncommon and not a reason for others to avoid the category.


> Sure, I suppose that's something that someone who doesn't understand the discussion might say.

I'm going to be frank: you are the person who misunderstands (and you are being rather rude about it). You are responding to an argument no one is making.

To put a fine point on it, you said this:

> Entropy may mean all ceilings collapse eventually, but that doesn't mean we aren't able to make useful ceilings.

But you were responding to a comment saying this:

> Except your ceiling can and will fall on you unless you take preventative measures, entirely due to molecular interactions within the material.

Emphasis added. They are saying maintenance is necessary, not that a safe ceiling is unachievable. It's obviously achievable; we've all seen it achieved.

They further say:

> It boggles the mind to let an LLM have access to a production database without having explicit preventative measures and contingency plans for it deleting it.

Emphasis added. When they say it boggles the mind to deploy an LLM without the proper measures, the implication is that it does make sense to deploy it with the proper measures.
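To make "the proper measures" concrete, here is a minimal sketch of one such guard: a denylist check an agent harness could run before executing any shell command the model proposes. The pattern list and the `run_agent_command` helper are purely illustrative assumptions, not taken from any real tool or from the article.

```python
# Hypothetical sketch of a preventative measure for an LLM agent:
# refuse to execute model-proposed shell commands that match known
# destructive patterns. Everything here is illustrative, not a real API.
import re
import subprocess

DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf\b",              # recursive force-delete
    r"--no-preserve-root",        # the classic footgun flag
    r"\bDROP\s+(TABLE|DATABASE)\b",  # destructive SQL
    r"curl\s+.*-X\s*DELETE",      # HTTP DELETE via curl
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches any known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)

def run_agent_command(command: str) -> str:
    """Run an agent-proposed command only if it passes the guard."""
    if is_destructive(command):
        return "BLOCKED: destructive command requires human approval"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout
```

A denylist like this is obviously incomplete (an allowlist or a read-only database role is stronger), but it illustrates the difference between "deploying with measures" and "deploying without them."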

> ...the point that stands is that LLMs and ceilings are both useful, neither are doomed such that no one should use them, ...

I have not seen a single person in this subthread say that LLMs aren't useful or that they are doomed. People say that. But the people you're talking to haven't.

I try to avoid these petty "I brought the receipts" comments, but I don't like the way you're being snarky to people whose crime is engaging with the premises you set up. The faults you are finding are faults you introduced. I'd appreciate it if you would avoid that in the future.


If that's what you got out of the above conversation that is about as fundamental a misunderstanding as the one at the top of this thread saying "It is fundamental to language modeling that every sequence of tokens is possible". I could say something rude here about both mistakes being made by the same person, but since you brought it up I won't.

If you want to take a comb to it, the comment saying this:

> Except your ceiling can and will fall on you unless you take preventative measures, entirely due to molecular interactions within the material

Was already off the plot. What was being discussed wasn't some specific molecular process, it was the false premise "oh molecules move around randomly so your ceiling might just collapse of its own accord because the beam decided to randomly disintegrate". That's not something that happens.

You said "The sequence of tokens that would destroy your production environment can be produced by your agent, no matter how much prompting you use". This is analogous to "the ceiling could just collapse on you due to random molecular motion, no matter how much maintenance you do or what materials you use".

Make sense now?

Your edit at the bottom of your top comment does better than your original statement.


> What was being discussed wasn't some specific molecular process, it was the false premise "oh molecules move around randomly so your ceiling might just collapse of its own accord because the beam decided to randomly disintegrate". That's not something that happens.

Except it does happen. That's why buildings get condemned and eventually turn to rubble.

To the exact point; I have a product from a couple years ago using an old model from OpenAI. It’s still running and all it does is write a personality report based on scores from the test. I can’t update the model without seriously rewriting the entire prompt system, but the model has degraded over the years as well. Ergo, my product has degraded of its own accord and there is nearly nothing I can do about it. My only choice is to basically finagle newer models into giving the correct output; but they hallucinate at much higher rates than older models.


> I could say something rude here about both mistakes being made by the same person, but since you brought it up I won't.

I'd encourage you to desist from rudeness, not just when people point it out to you, but at all times.

> You said "The sequence of tokens that would destroy your production environment can be produced by your agent, no matter how much prompting you use". This is analogous to "the ceiling could just collapse on you due to random molecular motion, no matter how much maintenance you do or what materials you use".

If prompt engineering is effective (analogous to performing the necessary maintenance and selecting the correct materials), I'm curious what your explanation is for the incident in the article?


> I'd encourage to desist from rudeness, not just when people point it out to you, but at all times.

I desire neither to be inauthentic, nor to suppress my emotions.

> If prompt engineering is effective (analogous to performing the necessary maintenance and selecting the correct materials), I'm curious what your explanation is for the incident in the article?

Keeping with the analogies, the original article doesn't say whether they built the roof properly or if they just used some screws to hold up a piece of quarter-inch plywood and called it a day.

It's no surprise that a terribly built roof may fall down. It's possible to get shoddy materials from a supplier without knowing.

Calling a curl command isn't something that would be within the model's training as "this deletes things don't do it". The fact that this happened is not, to me, evidence that the model might have equally run `sudo rm -rf --no-preserve-root /` under similar circumstances.

It sounds like the phrase "NEVER FUCKING GUESS!" was in the prompt as well, which could easily encourage the model towards "be sure of yourself, take action" instead of the "verify" that was meant.

As mentioned elsewhere in this thread, the fact that the article focuses so strongly on "the model confessed! It admitted it did the wrong thing!" doesn't lead me to put a ton of stock into the capability of the author to be cautious.


Yeah, it also seems very US-centric with "outstanding shares". Other countries don't allow you to have outstanding shares. Also, where is the "reinvest" option during series rounds? Like, what founder doesn't also reinvest to keep their equity during fundraising?

This is US-centric; it's mentioned on the first page, along with the notes and links.

Reinvesting is a minority edge case. I can't think of a single VC-backed company in the Bay Area that could have founders reinvest in shares without causing issues raising.


"...reinvest shares." I don't understand this. I'm guessing it's my lack of knowledge.

I'd expect "reinvest" to mean "supply more capital to obtain more shares" in an attempt to maintain ownership percentage. The founder would have to already be rather wealthy, or be savvy enough to negotiate some other way to maintain their ownership.


You can borrow against your shares if you have a good relationship with the bank. That will allow you to be wealthy enough to buy some shares in the round. Not enough that you won't get diluted, but not as much as being broke. And of course, you have to pay back the loan.
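For anyone who wants the dilution arithmetic behind this spelled out, here's a toy sketch; the share counts, price, and loan-funded purchase are made-up numbers for illustration only.

```python
# Illustrative dilution arithmetic with invented numbers.
# A founder holds 4,000,000 of 10,000,000 pre-round shares (40%).
founder_shares = 4_000_000
pre_round_total = 10_000_000

# The round issues 2,500,000 new shares.
new_shares = 2_500_000
post_round_total = pre_round_total + new_shares

# Without participating, the founder is diluted to 32%:
diluted_pct = founder_shares / post_round_total

# Buying some shares in the round (say, with a loan against existing
# stock) softens the dilution but doesn't eliminate it:
bought = 250_000
softened_pct = (founder_shares + bought) / post_round_total

print(f"{diluted_pct:.0%} without buying vs {softened_pct:.0%} with")
```

The point: unless the founder buys their full pro-rata allocation (here, 40% of the new shares), their ownership percentage still drops, which matches "not enough that you won't get diluted."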

I can only imagine someone looking over my shoulder on vacation to see what I'm posting: "oh, you have a 'close friends' group; why am I not in it?"

Arbitrary labels are great ... until they're not.


Arbitrary labels make it really easy to give groups of close friends silly in-joke names rather than "close friends"...

Thanks. Cloudflare took so long to determine whether or not I was bot-shaped that I just came to the comments.

He literally said it came down to the comment in the SVG. Points for taste, not correctness. Basically.


I didn’t read tfa, but can we also have it be able to distinguish when a vulnerability doesn’t apply? As an open source contributor, people open nonsensical security issues all the time. It’s getting annoying.


Sounds like what my teachers used to say: “a personal problem”. Literally nobody outside FB knows what they’re missing and until they fix that, literally nobody cares.


> Sounds like what my teachers used to say: “a personal problem”.

They don’t sound like a very good teacher.


Judging by the number of adults wandering around thinking their personal problems are everyone else's problem… they were pretty good teachers.

