Hacker News | upghost's comments

ThePrimeagen called this[1] like three days ago. That was fast.

[1] https://www.youtube.com/watch?v=m-bT5v5Tm7w&t=164s


No, he didn't? He predicted that third parties would donate tokens to FOSS projects, not that the labs would. One is PR that started ages ago, the other is a reasonable prediction of where the world is going.

Not quite donate tokens directly (technically and practically weird), but donation -> compute has been out for a couple months on opub.dev (disclaimer, built it). So his prediction was somewhat correct if not late!

They’ve been doing this since at least March

If this wasn't CERN tech I would think I was being taken for a ride. Conventional wisdom is that distributed consensus is not possible at this kind of performance, does anyone have a sense for how this is different and how my mental model is wrong?


> Conventional wisdom is that distributed consensus is not possible at this kind of performance

I'm not sure why you would think that? If you can assume the fiber is the same in both directions you know the round trip time is exactly double the latency of the connection. Then you know to phase shift your start time by that much when you get a start signal and you're in sync.

Obviously it's not trivial in practice, but it's not a fundamentally insurmountable problem.


Twice the path delay + the time it takes to send the return packet. I assume WR does this in hardware to get a predictable time.
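To make the arithmetic in this subthread concrete, here is a minimal sketch of the generic two-way timestamp exchange used by PTP-style protocols (White Rabbit builds on PTP). It assumes a perfectly symmetric link, and the function name and timestamps are made up for illustration; it is not how WR implements it in hardware.

```python
def two_way_offset(t1, t2, t3, t4):
    """Estimate one-way delay and clock offset from four timestamps.

    t1: master sends request        (master clock)
    t2: slave receives request      (slave clock)
    t3: slave sends reply           (slave clock)
    t4: master receives reply       (master clock)
    Assumes the path delay is identical in both directions.
    """
    turnaround = t3 - t2                 # time the slave spent before replying
    round_trip = (t4 - t1) - turnaround  # twice the path delay
    one_way_delay = round_trip / 2
    offset = (t2 - t1) - one_way_delay   # how far the slave clock is ahead
    return one_way_delay, offset


# Hypothetical numbers in nanoseconds: 500 ns path delay,
# slave clock 120 ns ahead, 40 ns turnaround at the slave.
delay, offset = two_way_offset(t1=1000, t2=1620, t3=1660, t4=2040)
```

Subtracting the turnaround is the "time it takes to send the return packet" part: the slave reports t2 and t3, so the master can remove that interval from the measured round trip.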


Thanks. I thought it was interesting choosing arithmetic instead of some other relation because multimodal arithmetic (via CLP) is more of a PhD thesis than a blog post. Other relations might've been easier to demonstrate a general query.

What I couldn't tell from the article was if the author somehow achieved a multimodal arithmetic relation without needing CLP using a stack machine. That would be a neat technique.


Man that is such a bummer. The Naval Support Activity (NSA) "base" is not a hardened military facility. I've never been to the one in Bahrain, but it's usually where you go to play ultimate frisbee, maybe some paintball if you are lucky, and other types of R&R. Usually have a Naval Exchange (NEX) which is like a really discounted 7-11 / gift shop / walmart (depending on where you are).


What do you mean by "that is such a bummer"?

That military base of the aggressors targeting civilians is being hit in return, and the sailors get to save their lives?

Yeah, such a tragedy that a military facility is damaged in response to intentional killing of civilians.

Sorry but that's some....... surprising concern.


Schools getting blown up is also a bummer. Everything about this situation and maybe the world is a bummer.

As soon as we stop treating these as bummers, there is literally nothing stopping a cycle of destruction. There may not be anyways, I don't know but giving up on empathy entirely seems even more dangerous than being bad at it.


I have plenty of sympathy for the victims but none for the aggressors in this illegal war.

You seem to be suggesting that not feeling sorry for the soldiers who got to evacuate without all their belongings somehow means I'm losing my humanity. That's a dangerous thing: the lives of the innocent civilians who didn't choose to be bombed are more important. Aggressors could simply... leave and stop being in danger.

Similarly I have little pity for Russian soldiers losing lives in another illegal war of aggression, knowing how many war crimes they committed in their wake.

Better?


Great writeup. Only thing I didn't see in here was an analysis of the impact of players like Taalas[1] and their stupidly fast hardware LLMs.

I feel like it could be majorly disruptive, but idk if it's going to prolong the apocalypse or bring it about sooner -- or if it's a big nothing burger.

But the demo[2] is super cool.

[1]: https://taalas.com

[2]: https://chatjimmy.ai/


I'm bullish on something like Taalas getting smaller and easy to put in a desktop. Imagine an RPG where NPCs... are way more complex and the entire game is very non-deterministic.


I think I would like that as well. The problem is that if we bake an LLM into HW and make it cheaper and very efficient to run, then all games will have the same AI slop content, which could get boring pretty fast. The alternative is that these cards should load a different / fine-tuned LLM per game, but then we already have GPUs for that and today's LLMs are nowhere near good enough at the size which a GPU can run.


They claim to have qwen 3.5 27B on a card at end of year on the market. If they do, I’ll be buying one immediately.


> familiarity vs simplicity

Love this, I've never heard it put that way before.


Rich Hickey did, in his "Simple Made Easy" talk.


> Pre-training allows organizations to build domain-aware models by learning from large internal datasets.

> Post-training methods allow teams to refine model behavior for specific tasks and environments.

How do you suppose this works? They say "pretraining" but I'm certain that the amount of clean data available in proper dataset format is not nearly enough to make a "foundation model". Do you suppose what they are calling "pretraining" is actually SFT and then "post-training" is ... more SFT?

There's no way they mean "start from scratch". Maybe they do something like generate a heckin bunch of synthetic data seeded from company data using one of their SOTA models -- which is basically equivalent to low resolution distillation, I would imagine. Hmm.


Pre-training means exposing an already-trained model to more raw text like PDF extracts etc (aka continued pre-training). You wouldn't be starting from scratch, but it's still pre-training because the objective is just next token prediction of the text you expose it to.

Post-training means everything else: SFT, DPO, RL, etc. Anything that involves things like prompt/response pairs, reward models, or benefits from human feedback of any kind.
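A small sketch of how that distinction often shows up in the training labels, assuming the common convention of masking prompt tokens during SFT (some setups train on the prompt too). The function names and token ids are hypothetical; -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
IGNORE = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def continued_pretraining_labels(token_ids):
    """Raw text: every token is a next-token-prediction target."""
    return list(token_ids)


def sft_labels(prompt_ids, response_ids):
    """Prompt/response pair: only the response tokens contribute to the loss."""
    return [IGNORE] * len(prompt_ids) + list(response_ids)


# Hypothetical token ids for illustration only.
raw_text = [5, 6, 7, 8]
prompt, response = [1, 2], [3, 4, 5]

pretrain_targets = continued_pretraining_labels(raw_text)
sft_targets = sft_labels(prompt, response)
```

In both cases the loss is still cross-entropy on the next token; what changes is which positions count and where the data comes from.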


Er, then what is the "already trained" model? I thought pre-training was the gradient descent through the internet part of building foundational models.


Yeah, this checks out. I wonder what they are doing to prevent semantic collapse. Also, I wonder if the base model would already be instruct and RLHF tuned or only pre-trained. Trying to do additional training without semantic collapse in a way that is meaningful would be interesting to understand. Presumably they are using adapters but I've never had much luck in stacking adapters.

i.e.:

1. Do I start with an RLHF tuned model, "pretrain" on top of that (with adapter or by freezing weights?), then SFT on top of that (stack another adapter, or add layer(s) and freeze weights?) (and where did I get the dataset? synthetic extraction from corpus?), then RL (adapter, add layer(s) and freeze?)

2. or do I start at an SFT model, ...

3. or do I start at raw pre-trained model, ...

Would love to know what the matrix used was.


Probably marketing speak for full fine-tuning vs PEFT/LoRA.


I think they are referring to “continued pretraining”.


I would guess:

Pre-training: refining the weights in an existing model using more training data.

Post-training: Adding some training data to the prompt (RAG, basically).


I can imagine that, as usual, you start with a few examples and then instruct an LLM to synthesize more examples out of that, and train using that. Sounds horrible, but actually works fairly well in practice.


Probably just means SFT fine-tuning a base model, vs behavioural DPO and/or SFT fine-tuning an instruction model.


> We are doing this to self-fund further investment in AI and enterprise sales while strengthening our financial profile.

Some quotes from the video:

> ...at the same time, we're a people company.

> Your work will live on in our products.

> Doing the right thing for Atlassian while acting with humanity and doing the right thing for all those on all sides of this set of decisions.

Wow. There's a lot to unpack here.


> Mr Cannon-Brookes told investors he “couldn’t be more bullish” about the opportunities ahead, despite relentlessly selling his own shares in the company daily. The Nightly reports he kept selling 7665 shares on a daily basis even in the month prior to the results at prices ranging from $US161.11 (AU$227) a share on January 8 to $US105.14 on February 4.

> While ordinary Aussies are asked to make big changes, the 46-year-old decided to treat himself to a ritzy new private jet late last year, admitting to a “deep internal conflict” over the carbon-heavy method of travel.

> The Atlassian co-founder and CEO bought a Bombardier 7500 and will use it to travel across his vast business operations, which include a minority stake in the Utah Jazz NBA team and a sponsorship deal with Formula 1.

https://www.msn.com/en-au/money/other/aussie-sacks-1600-afte...


Very interesting stuff. Apparently this is the implementation: https://github.com/dicpeynado/prolog-in-forth

Thinking about the amount of thought and energy that went into this, back in 1987 -- mostly pre-internet, pre-AI. Damn.

I feel really lucky that we get to build on things like this.


There's a great 1986 book "Designing and Programming Personal Expert Systems" by Feucht and Townsend that implements expert systems in Forth (and in the process, much of the capability of Prolog and Lisp).


Ha, you beat me to it! That book was my first thought when I saw this post. I have a copy sitting here on my bookshelf.

Just to expand on how bonkers this book is... they assume that everyone has easy access to a Forth implementation. So they teach you how to build a Lisp on top of it. Then they use the Lisp you just built to build a Prolog. Then, finally, they do what the topic of the book actually is: build a simple expert system on top of that Prolog.

I love it!


To be fair, in the 1980s thanks to the Forth Interest Group (FIG), free implementations of Forth existed for most platforms at a time when most programming languages were commercial products selling for $100 or more (in 1980s dollars). It's still pretty weird, but more understandable with that in mind.


I'm surprised how hard I had to dig for an actual example of syntax[1], so here you go.

[1]: https://www.lix.polytechnique.fr/~dale/lProlog/proghol/extra...


There is also an implementation of 99 Bottles of Beer on Rosetta Code: https://rosettacode.org/wiki/99_bottles_of_beer#Lambda_Prolo...


Constantly amused by the split in comments of any moderately innovative language post between ‘I don't care about all this explanation, just show me the syntax!’ and ‘I don't understand any of this syntax, what a useless language!’

If the language is ‘JavaScript but with square brackets instead of braces’ maybe the syntax is relevant. But in general concrete syntax is the least interesting (not least important, but easiest to change) thing in a programming language, and its similarity to other languages a particular reader knows less interesting still. JavaScript is not the ultimate in programming language syntax (I hope!) so it's still worth experimenting, even if the results aren't immediately comprehensible without learning.


In Prolog the syntax is incredibly important. It is designed to be metainterpreted with the same ease with which a for-loop might be written in another language.

https://www.metalevel.at/acomip/

  mi1(true).
  mi1((A,B)) :-
        mi1(A),
        mi1(B).
  mi1(Goal) :-
        Goal \= true,
        Goal \= (_,_),
        clause(Goal, Body),
        mi1(Body).

This can be arbitrarily extended in very interesting, beautiful, and powerful ways. This is extraordinarily hard to achieve and did not happen by accident.

As a challenge, see how easy it is to write a metainterpreter in another language of your choice. Alternately, see if you can think of any way the metainterpretation system in Prolog could be improved.

Finally, think of what would happen to this if we changed the syntax and introduced something like object.field notation.

So while logical programming can be achieved with other syntaxes, the metainterpretive aspect will be lost. I have yet to see a language that does this better.
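For comparison, one possible take on the challenge in Python. This is only a propositional sketch: atoms have no arguments, so there is no unification and there are no variable bindings, and the goal encoding (`True`, `("and", A, B)`, atom strings) and the `program` dict are invented for this example.

```python
def solve(goal, program):
    """Propositional analogue of mi1/1.

    goal:    True, ("and", a, b), or an atom name (str)
    program: dict mapping an atom to a list of alternative bodies
    """
    if goal is True:                       # mi1(true).
        return True
    if isinstance(goal, tuple) and goal[0] == "and":
        # mi1((A,B)) :- mi1(A), mi1(B).
        return solve(goal[1], program) and solve(goal[2], program)
    # mi1(Goal) :- clause(Goal, Body), mi1(Body).
    return any(solve(body, program) for body in program.get(goal, []))


# Hypothetical program: wet :- rain, outside.  rain.  outside.
program = {
    "wet": [("and", "rain", "outside")],
    "rain": [True],
    "outside": [True],
}
result = solve("wet", program)
```

Note that the sketch has to pick its own representation for goals and clauses, whereas mi1 reuses ordinary Prolog terms and clause/2. Like the Prolog version, it will loop on cyclic programs.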


Nice link, thank you! I'm not sure it's super related to my comment but it is closely related to some other things I'm thinking about. I'll give it a read :)


There are some examples in this tutorial PDF:

https://www.lix.polytechnique.fr/Labo/Dale.Miller/lProlog/fe...


I have written stuff in Prolog, but I find this lambda Prolog syntax very difficult to grok.


So brainfuck x lisp


Christ... it's incomprehensible... I guess that one's staying in academia :P

