This article actually explains how this bug in particular could have been avoided. Sure, you may not consider his approach realistic, but it's not at all saying "don't have bugs". In fact, not having formal verification or similar tooling in place would be more like saying "just don't write buggy code".
> This article actually explains how this bug in particular could have been avoided.
Not really. The article is a textbook example of hindsight bias. It's a simplistic analysis of a far more complex problem that goes over the blogger's head, and results in a string of simplistic assertions that fail to address any of the issues. Read up on the definition of Monday morning quarterback.
Not to single you out in particular, but I see this sentiment among programmers a lot and to me it's akin to a structural engineer saying "I laughed out loud when he said they should analyze the forces in the bridge".
Care to elaborate? Perhaps the tools to do this in practice aren't there (which just shows how young the field of software "engineering" really is), but what consensus are you talking about, and how is it an obstacle to verifying code? Most of the web follows standards and protocols, which is actually sort of a prerequisite for communication across different systems...
Basically, the modern web uses orchestration for pretty much everything, and usually Kubernetes is doing that. Theoretically, protocols like Raft are formally verifiable, but their implementations in orchestration tools like etcd have not been, and I would go so far as to say that verifying them is an impossible task. Therefore, the entire exercise is kind of silly.
Thanks, interesting. However, that just seems like a protocol like any other, with no real reason why you "can't" formally verify it. Is there something special about a consensus algorithm / protocol that makes it any more difficult to verify than any other algorithm which doesn't yet have a formally verified implementation?
> You can't formally verify anything that uses consensus
What did you mean by this then? There certainly seems to be nothing special about consensus that makes it any harder to verify than anything else. It's not fundamentally impossible to verify the software that CERN uses, it just takes some work.
A bridge failing carries a high likelihood of death or serious injury. How many people died or were seriously injured in the latest Cloudflare outage?
For life or death systems, I agree that we should be looking to implement analogous processes/systems to a structural engineer or doctor, etc. Cloudflare is not a life or death system. If you operate a life or death system and you have Cloudflare as a single point of failure, for some reason, that should not be Cloudflare's problem.
> How many people died or were seriously injured in the latest Cloudflare outage?
I would not be surprised if the answer is "several". The average impact per human is obviously pretty small, but across billions of humans, there will be outliers.
Maybe a fire department uses a coordination system that relies on cloudflare, and with cloudflare down they have to resort to their backup system, and their backup system works but is slightly worse and causes one engine to be delayed in their response, and because they're 3 minutes late, they just miss being able to save someone from the fire.
Maybe someone's running a scientific study on nutrition, and the cloudflare outage means their data collection system goes down for a bit, so their data is flawed, and they end up just barely not passing some necessary threshold, and they have to rerun their study, and that takes an extra week, and then they miss that quarter's deadline, and then the resulting adjustment to a product/procedure is delayed, and that 3 month delay causes 100,000 people to be slightly more malnourished than they would be otherwise, and one of those people ends up just barely too unhealthy to survive an unrelated deadly illness.
Sure, these scenarios are far-fetched. The chance of it happening is one-in-a-million.
But with roughly 8 billion people on Earth, there are about 8,000 one-in-a-million people.
Sure, but this sentiment is why software "engineering" isn't really engineering. You can justify it by saying it's not important enough for actual engineering practices, I guess, but to me it's a lack of pride in and care for your product.
more like "I laughed out loud when he said they should FEM the whole structure, down to the last bolt and strand of cable".
(More seriously, 'formal verification' is not a single thing, more a class of techniques which allow you to statically guarantee some properties of the system you are working with. When you propose it, you should have a clear idea of what properties you care about and how you intend to prove them, as well as a strong concern about whether those properties are actually going to capture enough of what you care about for it to be worthwhile)
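To make that concrete, here is a minimal sketch of the "state a property, then check it" workflow, using the quorum-intersection property that consensus protocols like Raft lean on for safety. The brute-force enumeration and the cluster sizes are my own illustration; a real verification effort would have to establish the property for all cluster sizes, and for the actual implementation, which is exactly where the hard work is:

    from itertools import combinations

    def majority_quorums(nodes):
        """Every subset of `nodes` that forms a strict majority."""
        n = len(nodes)
        smallest = n // 2 + 1
        for size in range(smallest, n + 1):
            yield from combinations(nodes, size)

    def quorums_intersect(nodes):
        """Safety property: any two majority quorums share at least one node."""
        return all(set(a) & set(b)
                   for a in majority_quorums(nodes)
                   for b in majority_quorums(nodes))

    # Exhaustive check for small clusters only; a proof (TLA+, Coq, ...)
    # would have to cover every cluster size and the real message flow.
    for n in range(1, 8):
        assert quorums_intersect(list(range(n))), f"fails for n={n}"
    print("quorum intersection holds for clusters of size 1..7")

The point is less the code than the discipline: you pick a precise property (two quorums always overlap, so two leaders can never both commit), and then you decide how far your checking actually reaches, which for a brute-force script like this is not very far at all.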
The agent already bypassed the file-reading filter with cat, couldn't it just bypass the URL filter by running wget or a python script or hundreds of other things it has access to through the terminal? You'd have to run it in a VM behind a firewall.
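Right, once the agent has a general-purpose shell, a deny-list on particular commands or URLs is basically decorative. A rough sketch of why (the blocked strings here are invented for the example, not taken from any real agent harness):

    # A naive deny-list guard of the kind an agent harness might bolt on.
    BLOCKED = ("curl", "wget", "cat .env")

    def allowed(command: str) -> bool:
        return not any(bad in command for bad in BLOCKED)

    # Each of these reaches the same file or the same network anyway,
    # but none matches the deny-list, so the guard waves them through.
    bypasses = [
        "tac .env",                                   # same file, different tool
        "python3 -c 'print(open(\".env\").read())'",  # read it via an interpreter
        "python3 -c 'import urllib.request as u; u.urlopen(\"http://example.com\")'",
    ]

    for cmd in bypasses:
        print(f"allowed={allowed(cmd)}  {cmd}")

Which is why the meaningful boundary is the sandbox around the process (VM, network egress rules), not string-matching on whatever the model happens to type.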
Well no, breaking that rule would still be the wrong action, even if you consider it morally better. By analogy, a nuke would be malfunctioning if it failed to explode, even if that is morally better.
> a nuke would be malfunctioning if it failed to explode, even if that is morally better.
Something failing can be good. When we talk about "bad" or "wrong", generally we are not talking about operational mechanics but rather morals. There is nothing good or bad about any mechanical operation per se.
Bad: 1) of poor quality or a low standard, 2) not such as to be hoped for or desired, 3) failing to conform to standards of moral virtue or acceptable conduct.
(Oxford Dictionary of English.)
A broken tool is of poor quality and therefore can be called bad. If a broken tool accidentally causes an ethically good thing to happen by not functioning as designed, that does not make such a tool a good tool.
A mere tool like an LLM does not decide the ethics of good or bad and cannot be “taught” basic ethical behavior.
Examples of bad as in “morally dubious”:
— Using some tool for morally bad purposes (or profiting from others using the tool for bad purposes).
— Knowingly creating/installing/deploying a broken or harmful tool for use in an important situation for personal benefit, for example making your company use some tool because you are invested in that tool, ignoring that the tool is problematic.
— Creating/installing/deploying a tool knowing it causes harm to others (or refusing to even consider the harm to others), for example using other people's work to create a tool that makes those same people lose jobs.
Examples of bad as in “low quality”:
— A malfunctioning tool, for example a tool that is not supposed to access some data and yet accesses it anyway.
Examples of a combination of both versions of bad:
— A low quality tool that accesses data it isn’t supposed to access, which was built using other people’s work with the foreseeable end result of those people losing their jobs (so that their former employers pay the company that built that tool instead).
That’s why everybody uses context to understand the exact meaning.
The context was “when would an AI agent doing something it’s not permitted to do ever not be bad”. Since we are talking about a tool and not a being capable of ethical evaluation, reasoning, and therefore morally good or bad actions, the only useful meaning of “bad” or “wrong” here is as in “broken” or “malfunctioning”, not as in “unethical”. After all, you wouldn’t talk about a gun’s trigger failing as being “morally good”.
when the instructions to not do something are the problem or "wrong"
i.e. when the AI company puts guards in to prevent their LLM from talking about elections, there is nothing inherently wrong in talking about elections, but the companies are doing it because of the PR risk in today's media / social environment
Unfortunately yes, teaching AI the entirety of human ethics is the only foolproof solution. That's not easy though. For example, what about the case where a script is not executable, would it then be unethical for the AI to suggest running chmod +x? It's probably pretty difficult to "teach" a language model the ethical difference between that and running cat .env
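For what it's worth, the workable middle ground today isn't teaching the model ethics so much as writing the distinction down as an explicit policy enforced outside the model. A toy sketch (the command categories and file list are made up for illustration, and it obviously doesn't generalize, which is rather the point):

    import shlex

    # Hypothetical policy, enforced by the harness rather than the model.
    SENSITIVE_FILES = {".env", "id_rsa", "credentials.json"}
    READERS = {"cat", "tac", "less", "head", "tail"}

    def classify(command: str) -> str:
        argv = shlex.split(command)
        if not argv:
            return "allow"
        prog, args = argv[0], argv[1:]
        if prog in READERS and any(a.rsplit("/", 1)[-1] in SENSITIVE_FILES for a in args):
            return "ask-user"   # reading secrets: escalate instead of guessing
        if prog == "chmod" and "+x" in args:
            return "allow"      # making a script executable is routine
        return "allow"

    for cmd in ("chmod +x build.sh", "cat .env", "cat README.md"):
        print(cmd, "->", classify(cmd))

The moment the command is a python one-liner instead of cat, the policy is blind again, which is the earlier commenter's point about needing a real sandbox rather than pattern rules.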
If you tell them to pay too much attention to human ethics you may find that they'll email the FBI if they spot evidence of unethical behavior anywhere in the content you expose them to: https://www.snitchbench.com/methodology
Well, the question of what is "too much" of a snitch is also a question of ethics. Clearly we just have to teach the AI to find the sweet spot between snitching on somebody planning a surprise party and somebody planning a mass murder. Where does tax fraud fit in? Smoking weed?