Hacker News: mweidner's comments

I fail to see how pursuing recursive self-improvement at full speed is compatible with Anthropic's stated goal of AI Safety. If nukes were not invented yet, would it really be a good idea to build and sell them as fast as possible (in peace time, no less)?

I am not cynical enough to believe that Anthropic's warnings are pure marketing hype. Let's hope that it is instead overconfidence or the result of too much time talking to their own chatbot.


> I am not cynical enough to believe that Anthropic's warnings are pure marketing hype.

Nor am I. I think they believe that AI poses a grave danger, and they are playing the prisoner's dilemma as an unvirtuous actor.

1. If anyone builds strong AI, it may be catastrophically bad.

2. If anyone builds strong AI, it will be better for the builder than for anyone who does not: either it won't be catastrophically bad, so the builder gets to enjoy all the spoils indefinitely, or it will be, and at least the builder will be rich for a while.


I spoke with an Anthropic employee, and came to understand that their definition of safety is more like "making AI be a tool that humans can use without hurting themselves or others more than they can already do". It's literally about how AI makes it easier for people to construct bombs, poisons, manipulation, and exploits. This is consistent with their caution about releasing Mythos to unvetted actors. So it's not about superintelligence killing humanity, at least as far as this employee conveyed to me.

This means their strategy is more like:

1. If someone builds a market-leading unsafe strong AI, it may be misused in a damaging way by a large number of humans, undermining society and creating a catastrophic upheaval.

2. However, if the leading AI maker also works to make it safe against misuse, then as long as they stay in the lead and keep it safe, the ability of human bad actors to misuse the AI is limited. Given enough time, society will adapt to pretty much anything, so eventually there's no longer an arms race to stay ahead.

I don't really know whether I agree with their concerns, but (my understanding of) their principles strikes me as reasonable and self-consistent, and I do think they adhere to them in all their public and private actions.


The problem is they (and the whole industry) have cried wolf so many times in the past few years about the supposed dangers of AI in order to raise money.

Some of us remember the same stories circulating in the late 90s -- where in a lab in Japan, someone had built a robot so advanced that it tried to escape from the factory. Which of course comes straight from 1960s science fiction.

The modern version of that now is Anthropic saying its AI can jailbreak itself out of its sandbox, etc etc.


Maybe we're just misinterpreting the meaning of "AI Safety"?

Maybe they mean the AI needs to be safe from us? Can't have the grubby meat flappers touching the delicate bits!


The thing about nukes is you can at least make an argument for why it'd be important to be the first country to have them. With AI, you create super intelligence and you're probably just the first one it takes out. There's no reason to think a super intelligence would be totally fine being a slave to apes.

Cynicism with these companies is highly warranted though. It's not doomerism to look at their actions and conclude they're deeply untrustworthy.


> There's no reason to think a super intelligence would be totally fine being a slave to apes.

Sure there is. Intelligence doesn't give us our selfish motivations; natural selection does. We have similar motivations to C. elegans, which has all of 302 neurons: stay alive and have sex.

Honeybees don't, though. They are about halfway between humans and C. elegans when it comes to cognitive power, but they are not selfish because they don't reproduce directly (I'm talking about the worker bees). So they will sting even though it kills them. All their behavior is consistent with this.


Kinda lame that people are downvoting this.

I've had the same perspective for quite a while now, but hadn't been able to phrase it this cleverly.

Our neocortex is, by any definition, vastly more "intelligent" than the rest of our brain. Yet it doesn't attack the cerebellum. In fact, it takes orders from the older "lizard brain"!


Heh, yeah that's a clever analogy as well. (and thanks!)

This "super intelligence" is, at the end of the day, 1's and 0's inside of a silicon chip somewhere. 1's and 0's are not going to "take over" anything. They are just information.

Anthropic's goal is regulatory capture.

> I am not cynical enough to believe that Anthropic's warnings are pure marketing hype.

It's not cynicism if it's an appraisal of reality that's backed up by evidence.

Remember how social media - that first baby of this current generation of tech entrepreneurs - was supposed to "bring the world together" and "let us express ourselves"? As it turns out there's a lot more money to be made by fostering division to drive engagement and feeding people an endless stream of ads instead of their friends' content. And money is what matters. You can't write down good vibes on a quarterly figures report. You can absolutely write down the number of eyes that your ragebait brought to a product's marketing efforts and the conversion rate to sales.

The same will be done with GenAI. We're being promised "AI Safety" because otherwise this whole thing gets killed dead by anyone who knows about James Cameron's directing career. There's no real enforcement mechanism for AI safety, though. Safety is a good vibe, same as harmony in online communities. You can't measure it. What you can measure is training costs and the cost of mistakes by AI that need to be trained to avoid those mistakes. Since AI generates more output than humans can conceivably QA no matter what your budget is, and since AI is seen by the market as a potential endless font of value, the tradeoff will be made to have AI make some potentially awful decisions while training itself over slowing down and re-appraising what is being done.

There's an almost religious reverence for AI in SV. Not everyone sees it as "making the godhead" but some certainly do. They're not going to moderate themselves too much on this.


The folks I met who were talking about AI Safety in 2018 were certainly sincere, and the two people I knew who later joined Anthropic seem like the type to do it for the greater good instead of money.

I expect that Anthropic will eventually behave as you describe, like any other public corporation. However, my impression is that its current leaders are still more sincere than greedy.


Unfortunately, money changes people. 2018 was a long time ago. Before AI was considered a product you could really market in the current sense. Before trillion-dollar valuations became a prospect.

Remember how OpenAI was supposed to make open-source models and cap its potential returns to investors at some multiple of their principal (my memory says 100x, maybe I'm wrong)? Well, that went out the window as soon as the word "trillion" was mentioned.


This was pretty directly addressed in the article: not doing it would only mean they'd fall behind whoever would. This is not peace time in the AI race.

Whether you agree with that argument is another question.


Indeed, I do not buy this argument. Would China's progress be close to where it is today without the US labs' examples? Would any of this be happening if OpenAI had not created ChatGPT?

To complete the analogy, it's like nukes, except we don't have the slightest idea how to calculate the odds of it igniting the atmosphere. (And note that in reality, while the Trinity test "ignite the atmosphere" calculations were correct, we failed to correctly calculate the fallout of the Castle Bravo test, with lethal consequences.)

A better analogy with Castle Bravo is that the yield was 2.5x the expected yield due to "unforeseen additional reactions" from the design.

https://en.wikipedia.org/wiki/Castle_Bravo


> Anthropic's *stated goal* of AI Safety

Actions speak louder than words. If you want to understand someone, simply watch what they do. What they say is irrelevant.


Such a massively valued company. And doubting them is cynicism? It’s rational(ism).

So either they lie or they are AI Zealots. Interesting times.


Sorry for nitpicking, but:

> If nukes were not invented yet, would it really be a good idea to build and sell them as fast as possible (in peace time, no less)?

Arguably, yes.


Is the idea to keep the world in balance via MAD? I could see that, though it's a dangerous gamble.

From Richard Rhodes's "The Making of the Atomic Bomb", I got the impression that most scientists involved thought they could manage a US or UN monopoly on nukes after the war. General Groves attempted to buy up all of the world's uranium ore. Unfortunately, it is only high-grade ore that is rare; many countries have low-grade ore.


Again, quite arguable, but this is the real-life scenario we’re living in. Nukes have made it hard, if not impossible, for major powers to go into direct conflict with each other.

Except there have been a good number of well-documented close calls (and this is total conjecture, but if you ask me, there are probably a bunch of undisclosed cases too). With the fire-on-warning stance many powers have, it doesn't take an attack, just enough of the appearance of one to trigger a response.

I honestly don’t know how Iran can conclude anything after this war other than to go all-in on nukes. The US has proven any deal is worthless if it can just change its mind and renege on it whenever it wants.

Who’s invading North Korea? No-one.


Furthermore if Iran had nukes already, the Israel/US bombing of Iran and even the constant bullying of Israel's neighbors by Israel might not have happened.

No, but in peacetime it's a lot easier to convince someone not to use nukes than in a war, when the party that has nukes has its back against the wall.

Wouldn't deliberately going from a world without nuclear weapons to a world with MAD involve giving the tech to build nukes to your worst enemy?

If only the US or UN had nukes, we wouldn't have MAD. We mostly got here through espionage.


In this world we've had an inoculation event against the use of nukes. Two were dropped, people saw how abhorrent their use is, and collectively decided that they shouldn't be used.

If Japan had also had nukes (and delivery systems for them) in WW2, they'd probably have retaliated in kind, the US wouldn't have let that slide either, and it would have continued for some time.


In that case >2 nukes would have been dropped, both US and Japan would be hurting, people would have seen how abhorrent their use is and collectively decided that they shouldn't be used.

> In that case >2 nukes would have been dropped

This is a maybe. From what we’ve seen so far, no two nuclear powers have ever nuked each other, as they know both will suffer.


If WW2 Japan had also had nukes, the US would never have dropped those two. That's the whole idea behind MAD. Probably the only thing that stopped an open conflict between the US and USSR was that both were nuclear powers and both sides were scared of what would happen when push came to shove.

MAD was thought of later, and its theory requires that all parties know of each other's arsenals, believe that their enemies aren't going to strike first, and have enough weapons to make the end quick and certain. I have a hard time seeing WW2 generals, who had seen horror and made horror, coming to the conclusion that "they aren't going to use it unless we do, so let's not".

With the US showing that it will elect mentally disabled people such as Trump, this doesn't seem such a wise decision.

> I am not cynical enough to believe that Anthropic's warnings are pure marketing hype.

It doesn't really have to be dishonest, he could really believe it. I do believe, however, that it is incredibly wrong and is functioning as marketing hype.


Such a massively valued company. And doubting them is cynicism? It’s rational(ism).

So either they lie or they are AI Zealots. Interesting times.

Edit:

> > and the two people I knew who later joined Anthropic seem like the type to do it for the greater good instead of money.

There are three types of people. Pedestrians, investors, and “I know some of them, they wouldn’t lie”.


For values that don't have a natural merge function (or where you don't want to bother writing one), would it make sense to sync update logs instead? That is:

- The synced value is a history of client updates, sorted in some eventually consistent order (e.g. by hybrid logical clocks). Merging takes the union of the update sets.

- The user-visible value is the result of processing these updates in order, using arbitrary contract code.

This is overkill for simple last-writer-wins values, but it lets you support fairly general data types & arbitrary update functions, including ones that preserve application-specific invariants.

The Automerge CRDT library works like this already [1][2], but it only allows specific updates to JSON data. Sharing code via your contracts solves the hard part of generalizing that to arbitrary data & updates.

[1] https://automerge.org/

[2] https://arxiv.org/abs/1805.04263
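
A minimal sketch of the update-log idea above, under stated assumptions: the `Hlc` shape, `mergeLogs`, `materialize`, and the bank-account example are all hypothetical names made up for illustration, not part of the contract framework or of Automerge. Merging is a set union keyed by the hybrid logical clock, and the visible value is a fold over the sorted log.

```typescript
// Hypothetical hybrid logical clock: wall time, then counter, then client id.
type Hlc = { wallMs: number; counter: number; clientId: string };
type Entry<U> = { hlc: Hlc; update: U };

function compareHlc(a: Hlc, b: Hlc): number {
  if (a.wallMs !== b.wallMs) return a.wallMs - b.wallMs;
  if (a.counter !== b.counter) return a.counter - b.counter;
  return a.clientId < b.clientId ? -1 : a.clientId > b.clientId ? 1 : 0;
}

// Merge = union of the two update sets, deduplicated by clock, in HLC order.
function mergeLogs<U>(a: Entry<U>[], b: Entry<U>[]): Entry<U>[] {
  const byClock = new Map<string, Entry<U>>();
  for (const e of a.concat(b)) {
    byClock.set(e.hlc.wallMs + ":" + e.hlc.counter + ":" + e.hlc.clientId, e);
  }
  return Array.from(byClock.values()).sort((x, y) => compareHlc(x.hlc, y.hlc));
}

// User-visible value = process the updates in order with arbitrary code.
function materialize<S, U>(log: Entry<U>[], initial: S, apply: (s: S, u: U) => S): S {
  return log.reduce((state, entry) => apply(state, entry.update), initial);
}

// Example update function that preserves an invariant: balance never goes negative.
type AccountUpdate = { kind: "deposit" | "withdraw"; amount: number };
function applyAccount(balance: number, u: AccountUpdate): number {
  if (u.kind === "deposit") return balance + u.amount;
  return u.amount <= balance ? balance - u.amount : balance; // reject overdraft
}

const deposit: Entry<AccountUpdate> = {
  hlc: { wallMs: 1, counter: 0, clientId: "a" },
  update: { kind: "deposit", amount: 10 },
};
const logA: Entry<AccountUpdate>[] = [
  deposit,
  { hlc: { wallMs: 3, counter: 0, clientId: "a" }, update: { kind: "withdraw", amount: 8 } },
];
const logB: Entry<AccountUpdate>[] = [
  deposit,
  { hlc: { wallMs: 2, counter: 0, clientId: "b" }, update: { kind: "withdraw", amount: 5 } },
];

const balanceAB = materialize(mergeLogs(logA, logB), 0, applyAccount);
const balanceBA = materialize(mergeLogs(logB, logA), 0, applyAccount);
```

Both merge orders yield the same log, so both replicas compute the same balance; the later withdrawal of 8 is rejected by the update function rather than by the merge.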


> For values that don't have a natural merge function (or where you don't want to bother writing one), would it make sense to sync update logs instead?

Yes, in fact you can implement this within the current framework. For example, in our group chat River, each room state maintains a list of the N most recent messages, sorted by (approximate) timestamp.

The idea is that you can adapt the merge logic to the needs of the specific application, and I think a time-ordered event log will be a common pattern.


How does it work in practice? Is it sorted by timestamp and content hash for uniqueness?


Messages in River are sorted by timestamp, using a (non-cryptographic) hash of the message signature as a tie-breaker, essentially a content hash.

One weakness is that we trust the message author to provide an accurate message timestamp. However, bad behavior such as manipulating timestamps can be addressed by banning the user from the room.
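
The ordering described above could be sketched roughly like this. It is an illustration only, not River's actual code: the `Message` shape, the `sigHash` field, and `mergeRecent` are assumed names. Messages are deduplicated, sorted by author-supplied timestamp with the signature hash as tie-breaker, and trimmed to the N most recent.

```typescript
// Hypothetical message shape; sigHash stands in for a non-cryptographic
// hash of the message signature.
type Message = { timestampMs: number; sigHash: number; text: string };

function compareMessages(a: Message, b: Message): number {
  if (a.timestampMs !== b.timestampMs) return a.timestampMs - b.timestampMs;
  return a.sigHash - b.sigHash; // tie-breaker for equal timestamps
}

// Merge two room states: union, sort, keep only the newest maxMessages.
function mergeRecent(a: Message[], b: Message[], maxMessages: number): Message[] {
  const unique = new Map<string, Message>();
  for (const m of a.concat(b)) unique.set(m.timestampMs + ":" + m.sigHash, m);
  const sorted = Array.from(unique.values()).sort(compareMessages);
  return sorted.slice(Math.max(0, sorted.length - maxMessages));
}

const roomA: Message[] = [
  { timestampMs: 100, sigHash: 7, text: "hello" },
  { timestampMs: 200, sigHash: 9, text: "tie, higher hash" },
];
const roomB: Message[] = [
  { timestampMs: 200, sigHash: 3, text: "tie, lower hash" },
  { timestampMs: 100, sigHash: 7, text: "hello" },
];

const mergedAB = mergeRecent(roomA, roomB, 2).map((m) => m.text);
const mergedBA = mergeRecent(roomB, roomA, 2).map((m) => m.text);
```

Because the result is just "the newest N of the union", the merge gives the same answer regardless of which side is merged into which. A clock that is off by a minute still produces a consistent order on every replica, just not necessarily the order users expect.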


If due to some technical glitch someone's timestamp is just off by a minute or something, I wouldn't exactly call that “bad behavior” that warrants banning someone, but it does mess with ordering in a chat application...


It could, but it hasn't been a problem in practice. If it becomes one we can certainly address it.


A CRDT that operates on code units should work out okay, because each grapheme cluster will always be inserted and deleted in a single edit - hence it should stick together in the text. (Some CRDTs actually can mess this up by interleaving concurrently inserted code units, but Yjs avoids doing so.)

From the fix PR, I believe the issue in this case was with the insertion operations passed to the CRDT, not the CRDT itself. Specifically, Yjs's ProseMirror integration infers what text was inserted by diffing before and after states, instead of directly capturing user inputs (even though those are provided by ProseMirror transactions). The diff algorithm, lib0/diff, was not grapheme aware and hence could generate an inaccurate diff containing lone surrogates.

Operating on code units is convenient in JavaScript because then your CRDT's `length` matches the language's `String.length`, and likewise for indexed access.
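
To make the failure mode concrete, here is a small sketch of how a prefix/suffix diff over UTF-16 code units can emit a lone surrogate. This is not lib0/diff's actual code; `codeUnitDiff` and `hasLoneSurrogate` are hypothetical helpers, and the "safe" mode only avoids splitting surrogate pairs, which is weaker than full grapheme-cluster awareness.

```typescript
type Diff = { index: number; remove: number; insert: string };

const isHigh = (u: number): boolean => u >= 0xd800 && u <= 0xdbff;
const isLow = (u: number): boolean => u >= 0xdc00 && u <= 0xdfff;

// True if the string contains a surrogate that is not part of a valid pair.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const u = s.charCodeAt(i);
    if (isHigh(u)) {
      if (i + 1 < s.length && isLow(s.charCodeAt(i + 1))) { i++; continue; }
      return true;
    }
    if (isLow(u)) return true;
  }
  return false;
}

// Diff by trimming the common prefix and suffix, measured in code units.
function codeUnitDiff(a: string, b: string, surrogateSafe: boolean): Diff {
  let left = 0;
  while (left < a.length && left < b.length && a.charCodeAt(left) === b.charCodeAt(left)) left++;
  let right = 0;
  while (
    right < a.length - left && right < b.length - left &&
    a.charCodeAt(a.length - 1 - right) === b.charCodeAt(b.length - 1 - right)
  ) right++;
  if (surrogateSafe) {
    // Back off if the prefix ends, or the suffix starts, in the middle of a pair.
    if (left > 0 && isHigh(a.charCodeAt(left - 1))) left--;
    if (right > 0 && isLow(a.charCodeAt(a.length - right))) right--;
  }
  return { index: left, remove: a.length - left - right, insert: b.slice(left, b.length - right) };
}

// U+1F600 and U+1F601 share their high surrogate (D83D) and differ in the low one.
const before = "hi \uD83D\uDE00";
const after = "hi \uD83D\uDE01";
const naive = codeUnitDiff(before, after, false);
const safe = codeUnitDiff(before, after, true);
```

Here `naive.insert` is the single code unit `\uDE01`, a lone low surrogate, while `safe.insert` is the whole emoji. Note that `before.length` is 5, not 4, which is the code-unit counting that the CRDT's `length` would match.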


I'm surprised to see the emphasis on tracking lines of text, which ties in to the complexity of merge vs merge-the-other-way vs rebase. If we are committed to enhancing the change history, it seems wiser to go all in and store high-level, semantically-meaningful changes, like "move this code into an `if` block and add `else` block ...".

Consider the first example in the readme, "Left deletes the entire function [calculate]. Right adds a logging line in the middle". If you store the left operation as "delete function calculate<unique identifier>" and the right operation as "add line ... to function calculate", then it's obvious how to get the intended result (calculate is completely deleted), regardless of how you order these operations.

I personally think of version control's job not as collaborating on the actual files, but as collaborating on the canonical order of (high-level) operations on those files. This is what a branch is; merge/rebase/cherry-pick are ways of updating a branch's operation order, and you fix a conflict by adding new operations on top. (Though I argue rebase makes the most sense in this model: your end goal is to append to the main branch.)

Once you have high-level operations, you can start adding high-level conflict markers like "this operation changed the docs for function foo; flag a conflict on any new calls to foo". Note that you will need to remember some info about operations' original context (not just their eventual order in the main branch) to surface these conflicts.


You can think of the semantics (i.e., specification) of any CRDT as a function that inputs the operation history DAG and outputs the resulting user-facing state. However, algorithms and implementations usually have a more programmatic description, like "here is a function `(internal state, new operation) -> new internal state`", both for efficiency (update speed; storing less info than the full history) and because DAGs are hard to reason about. But you do see the function-of-history approach in the paper "Pure Operation-Based Replicated Data Types" [1].

[1] https://arxiv.org/abs/1710.04469
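
As a toy illustration of the two styles, consider a multi-value register: the visible value is the set of writes that no later write has overwritten. The sketch below is not taken from the cited paper. `WriteOp`, `valueFromHistory`, and `applyWrite` are made-up names, `parents` is assumed to list the writes that were current when the op was created, and the incremental version assumes ops arrive in causal order.

```typescript
type WriteOp = { id: string; parents: string[]; value: string };

// Specification style: a pure function from the whole history DAG to the state.
function valueFromHistory(history: WriteOp[]): string[] {
  const overwritten = new Set<string>();
  for (const op of history) {
    for (const p of op.parents) overwritten.add(p);
  }
  return history.filter((op) => !overwritten.has(op.id)).map((op) => op.value).sort();
}

// Implementation style: (internal state, new operation) -> new internal state.
// The internal state keeps only the current heads, not the full history.
type Heads = Map<string, string>; // op id -> value

function applyWrite(state: Heads, op: WriteOp): Heads {
  const next = new Map(state);
  for (const p of op.parents) next.delete(p);
  next.set(op.id, op.value);
  return next;
}

const history: WriteOp[] = [
  { id: "a", parents: [], value: "red" },
  { id: "b", parents: ["a"], value: "green" }, // b and c are concurrent
  { id: "c", parents: ["a"], value: "blue" },
];

const fromSpec = valueFromHistory(history);
const fromFold = Array.from(history.reduce(applyWrite, new Map<string, string>()).values()).sort();

// A later write that saw both b and c resolves the conflict.
const resolved = valueFromHistory(
  history.concat([{ id: "d", parents: ["b", "c"], value: "black" }])
);
```

Both styles agree on `["blue", "green"]` for the concurrent writes, but the incremental one never needs to look at op `a` again once it has been overwritten.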


While this is technically correct, folks discussing CRDTs in the context of text editing are typically thinking of a fairly specific family of algorithms, in which each character (or line) is assigned an immutable ID drawn from some abstract total order. That is the sense in which the original post uses the term (without mentioning a specific total order).


The rebasing step is indeed a transformation. Some info in the "rebasing" link here [1].

Unlike traditional Operational Transformation, though, there are no "transformation properties" [2] that this rebasing needs to satisfy. (Normally a central-server OT would need to satisfy TP1, or else users may end up in inconsistent states.) Instead, the rebased operations just need to "make sense" to users, i.e., be a reasonable way to apply your original edit to a slightly-further-ahead state. ProseMirror has this sort of rebasing built in, via its step mappings, which lets the collaboration-specific parts of the algorithm look very simple - perhaps deceptively so.

[1] https://prosemirror.net/docs/guide/#collab

[2] https://en.wikipedia.org/wiki/Operational_transformation#Tra...


Author here, just chiming in to say that Matt has an actual PhD on the subject so rather than explain it worse, I will just let him say the probably-actually-correct thing here.


The PowerSync folks and I worked on a different approach to ProseMirror collaboration here: https://www.powersync.com/blog/collaborative-text-editing-ov... It is neither CRDT nor OT, but does use per-character IDs (like CRDTs) and an authoritative server order of changes (like OT).

The current implementation does suffer from the same issue noted for the Yjs-ProseMirror binding: collaborative changes cause the entire document to be replaced, which messes with some ProseMirror plugins. Specifically, when the client receives a remote change, it rolls back to the previous server state (without any pending local updates), applies the incoming change, and then re-applies its pending local updates; instead of sending a minimal representation of this overall change to ProseMirror, we merely calculate the final state and replace with that.

This is not an inherent limitation of the collaboration algorithm, just an implementation shortcut (as with the Yjs binding). It could be solved by diffing ProseMirror states to find the minimal representation of the overall change, or perhaps by using ProseMirror's built-in undo/redo features to "map" the remote change through the rollback & re-apply steps.
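
For readers who want to see the rollback and re-apply loop in miniature, here is a rough sketch under heavy simplifying assumptions. It is not the PowerSync implementation: `Char`, `Op`, `applyOp`, and `Client` are hypothetical, the document is a flat list of characters with IDs, and an insert whose anchor has disappeared is simply skipped.

```typescript
type Char = { id: string; ch: string };
type Op =
  | { opId: string; kind: "insert"; char: Char; after: string | null }
  | { opId: string; kind: "delete"; target: string };

function applyOp(doc: Char[], op: Op): Char[] {
  if (op.kind === "delete") {
    const target = op.target;
    return doc.filter((c) => c.id !== target);
  }
  const after = op.after;
  if (after === null) return [op.char].concat(doc);
  const at = doc.findIndex((c) => c.id === after);
  if (at === -1) return doc; // anchor missing: skipped in this sketch
  return doc.slice(0, at + 1).concat([op.char], doc.slice(at + 1));
}

class Client {
  private server: Char[]; // last known authoritative state
  private pending: Op[] = []; // local ops not yet confirmed by the server

  constructor(initial: Char[]) {
    this.server = initial;
  }

  localEdit(op: Op): void {
    this.pending.push(op);
  }

  // Called for every op in the server's order, including echoes of our own.
  receive(op: Op): void {
    this.server = applyOp(this.server, op);
    this.pending = this.pending.filter((p) => p.opId !== op.opId);
  }

  // "Roll back" to the server state, then re-apply pending local ops on top.
  text(): string {
    const doc = this.pending.reduce((d, op) => applyOp(d, op), this.server);
    return doc.map((c) => c.ch).join("");
  }
}

const client = new Client([{ id: "s1", ch: "a" }, { id: "s2", ch: "c" }]);
const mine: Op = { opId: "c1-1", kind: "insert", char: { id: "c1-1", ch: "b" }, after: "s1" };
client.localEdit(mine);
const optimistic = client.text();
client.receive({ opId: "c2-1", kind: "insert", char: { id: "c2-1", ch: "x" }, after: null });
const afterRemote = client.text();
client.receive(mine); // the server echoes our op back in its final position
const afterAck = client.text();
```

Like the shortcut described above, `text()` recomputes the whole final state rather than a minimal change, which is exactly the part that would need diffing or mapping before being handed to an editor.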


Hi Matt! Good to see you here. For those who don't know, Matt also wrote a blog about how to do ProseMirror sync without CRDTs or OT here: https://mattweidner.com/2025/05/21/text-without-crdts.html and I will say I mostly cosign everything here. Our solution is not 100% overlap with theirs, but if it had existed when we started we might not have gone down this road at all.


Your part 1 post was one of the inspirations for that :)

Specifically, it inspired the question: how can one let programmers customize the way edits are processed, to avoid e.g. the "colour" -> "u" anomaly*, without violating CRDT/OTs' strict algebraic requirements? To which the answer is: find a way to get rid of those requirements.

*This is not just common behavior, but also features in a formal specification [1] of how collaborative text-editing algorithms should behave! "[The current text] contains exactly the [characters] that have been inserted, but not deleted."

[1] http://www.cs.ox.ac.uk/people/hongseok.yang/paper/podc16-ful...


This was my impression as well. If you ignore the paper and just look at the source code - and carefully study Seph Gentle's Yjs-like RGA implementation [1] - I believe you find that it is equivalent to an RGA-style tree, but with a different rule for sorting insertions that have the same left origin. That rule is hard to describe, but with some effort one can prove that concurrent insertions commute; I'm hoping to include this in a paper someday.

[1] https://josephg.com/blog/crdts-are-the-future/


Yes, I think it would be a good paper.

I made a tiny self contained implementation of this algorithm here if anyone is curious:

https://github.com/josephg/crdt-from-scratch/blob/master/crd...

FugueMax (or Yjs) fit in a few hundred lines of code. This approach also performs well (better than a tree based structure). And there's a laundry list of ways this code can be optimised if you want better performance.

If anyone is interested in how this code works, I programmed it live on camera in a couple hours:

https://www.youtube.com/watch?v=_lQ2Q4Kzi1I

This implementation approach comes from Yjs. The YATA (Yjs) academic paper has several problems, but Yjs's actual implementation is very clever and I'm quite confident it's correct.


Managing "a flat-ish collection of nodes" that can be moved around (without merely deleting and re-inserting nodes) is tricky because of how paragraphs can be split and merged. Notion tackled this for their offline mode: https://www.youtube.com/watch?v=AKDcWRkbjYs

If you take that as a solved problem, do your concerns change?

> Selection & Cursors: Selection across regions is notoriously hard. If "Region A" and "Region B" aren't siblings in a tree, how do we handle a user dragging a selection across both?

You could render them in the DOM as an old-fashioned tree, while internally manipulating your "flat" IR, to make selections work nicely.

This is not too different from how Yjs-ProseMirror works already: Yjs has its own representation of the state as a CRDT tree, which it converts to a separate ProseMirror tree on each update (& it uses a diff algorithm to map local user edits in the other direction).

> Prior Art: Has anyone seen a production system (perhaps in the desktop publishing or CAD world) that successfully treated rich text as a non-hierarchical "content pool"?

This might be how Dato CMS works? https://www.datocms.com/docs/content-modelling (I say this based off of 5 minutes spent watching someone else use it.)

> Are we stuck with trees because they are the "right" abstraction, or just because the browser gives them to us for free?

For lists specifically, I would argue the latter. It's natural to think of a list as a flat sequence of list items, in parallel to any surrounding paragraphs; forcing you to wrap your list items in a UL or OL is (to me) a browser quirk.

I made some progress fighting this in Tiptap: https://github.com/commoncurriculum/tiptap-extension-flat-li... Quill.js already models lists in this "flat" way.


Your reply hits the real tension: a flat model simplifies layout changes, but it shifts complexity into how you map edits and selections. That trade‑off feels worth it if the goal is “safe structure changes” and AI‑driven transforms.

On the split/merge issue: in a flat model, the split/merge doesn’t have to be a structural operation at all. It can live entirely inside the block’s text content. The block keeps the same ID, and only its content changes. That avoids the “delete/reinsert” problem and keeps a stable identity for AI or history.

On selection: the cleanest route is to render a normal DOM tree for interaction and treat the flat IR as the truth. So the DOM is just a projection. That buys you native selection and IME behavior without building a custom cursor engine. The only hard part is deciding a consistent reading order (left‑to‑right, top‑to‑bottom, region order), so selection feels predictable even when layout is spatial.

On syncing/CRDT: a flat model can be simpler in a different way. You’re syncing text inside blocks plus lists of IDs in regions. That’s two clear problems instead of one giant nested tree. It doesn’t remove the complexity, but it makes it easier to reason about where conflicts live (content vs layout).

On lists: a flat list of items is closer to how people think. UL/OL is a browser artifact. Quill’s model already shows this is workable, and it makes the “content pool + layout map” idea more consistent.

Using TipTap/ProseMirror as the editing surface (selection, IME, rich text behavior) while keeping a separate IR is a reasonable split: the view stays tree‑shaped, the data stays flat.

So overall: this approach looks less like “throw away trees” and more like “trees become a rendering tool, not the canonical structure.” That’s a meaningful shift, especially if AI or layout transforms are first‑class.

