fennecfoxy · 2026-03-20T17:10:19 1774026619

>Students can be trusted to obey a simple "no phones in class" rule.

And what if they don't? En masse?

rtkwe · 2026-03-20T19:07:09 1774033629

At first a lot of parents get inconvenienced coming to get confiscated phones and if that doesn't inspire them to discipline their kid at home the school can move to the more draconian pouch systems.

fennecfoxy · 2026-03-12T15:40:51 1773330051

>forces sparsity

That's branching and then coalescing, right? It selects a path that is weighted as being most beneficial to the input?

Given you pointed out how even the vertical part of the architecture allows for skipping layers anyway, isn't that essentially the same thing?
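For anyone wanting to see the "branch, then coalesce" idea concretely, here is a minimal sketch of generic top-k gating as commonly described for MoE layers. It is not taken from the post being discussed: `top_k_route`, the toy `experts` list and the gate logits are all made up for illustration, and real routers operate on vectors with learned gate weights rather than scalars.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a short list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_logits, experts, x, k=2):
    # Branch: keep only the k experts with the highest gate logits (hypothetical helper).
    chosen = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Renormalise the gate over the chosen experts only; the rest get zero weight (the sparsity).
    weights = softmax([gate_logits[i] for i in chosen])
    # Coalesce: weighted sum of the chosen experts' outputs.
    y = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return y, chosen

# Toy scalar "experts", purely illustrative.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
y, chosen = top_k_route([0.1, 2.0, -1.0, 2.0], experts, x=3.0, k=2)
```

The difference from layer skipping is that the choice here is made per input by the gate, among parallel alternatives at the same depth, rather than along the depth of the stack.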

fennecfoxy · 2026-03-12T15:37:19 1773329839

Since you found that repeating layers x-y was useful for locating the block of 7 layers in the first place, I'd be incredibly curious what happens if, knowing that block of 7, you then iterate over repeating x-y within that block z times.

Like for those 7 layers 1,2,3,4,5,6,7 does efficiency increase if you run 1,2,3,3,4,4,4,5,6,7 or perhaps 1,2,3,3,4,5,6,6,7 etc. If only GPUs grew on trees
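A rough sketch of what enumerating those schedules could look like, assuming each layer in the block may be repeated in place and that total depth is capped by some compute budget. `expand_schedule`, `candidate_schedules`, `max_repeat` and `budget` are hypothetical names, and nothing here evaluates a model; it only generates the layer orders one would have to test.

```python
from itertools import product

def expand_schedule(layers, repeats):
    # Execution order with each layer id repeated in place; unlisted layers run once.
    order = []
    for layer in layers:
        order.extend([layer] * repeats.get(layer, 1))
    return order

def candidate_schedules(layers, max_repeat, budget):
    # Yield every in-place repeat schedule whose total length fits the budget.
    for counts in product(range(1, max_repeat + 1), repeat=len(layers)):
        if sum(counts) <= budget:
            yield [layer for layer, c in zip(layers, counts) for _ in range(c)]

block = [1, 2, 3, 4, 5, 6, 7]
first = expand_schedule(block, {3: 2, 4: 3})   # 1,2,3,3,4,4,4,5,6,7
second = expand_schedule(block, {3: 2, 6: 2})  # 1,2,3,3,4,5,6,6,7
n_candidates = sum(1 for _ in candidate_schedules(block, max_repeat=3, budget=10))
```

Even with only three extra layer executions allowed, the number of candidates is already in the hundreds, and each one needs a full evaluation run, which is where the GPUs come in.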

dnhkng · 2026-03-13T15:00:53 1773414053

Yes, I have done these types of experiments; that's for the next post.

fennecfoxy · 2026-03-12T15:33:21 1773329601

I wouldn't be surprised if even in the same model, the organ block size varied wildly depending on what you're looking for (i.e. his probes).

But if there are sizes that are common, then that could also point to an architectural flaw, because whilst it could be universal constant-ness it could also be bounded by some inner working - and perhaps this is something that could be improved upon.

fennecfoxy · 2026-03-12T15:13:46 1773328426

I found this super interesting! Excellent writing! And I loved the cowboy quote, that was the best part; poor thing.

Now it's making me wonder - instead of smashing things together more violently for MoE type stuff, perhaps it's more effective to create better toolsets to allow us to analyse smaller models.

Then small models can be trained (faster & cheaper) to be excellent at very specific tasks or domains, the toolset used to identify the organ and organ-selection layers, and a larger Frankenstein's monster model stitched together from these organs, with perhaps a little extra training/fine-tuning to improve its organ selection abilities.

That makes me imagine some sort of future of layer standardisation, in which for a standard and optimal architecture sets of layers can be dynamically downloaded, added, swapped out etc to maintain fastest inference speed whilst allowing for flexible skills. Almost like the concept of subagents but within the architecture of the model itself. Hmmm.

I'm only versed in transformer architecture at a high level; does anybody know of any architectures where the layers branch & then coalesce like that? Or is it mostly linear, layer by layer?

fennecfoxy · 2026-03-10T10:21:55 1773138115

Tibetans, Uyghurs, etc? Factories full of North Korean workers under the watchful eye of their overlords, modern slavery even of your own people.

As a Kiwi I look at the US, Russia, China, etc as the same. Even the UK (where I now live) is a scarier place than back home.

raven12345 · 2026-03-12T06:16:32 1773296192

That's fair, because you're in a Western media environment, and the fact that you can see them as the same already proves my point.

fennecfoxy · 2026-03-10T10:17:29 1773137849

Why world model? To emulate how we became sentient?

A "world" is just senses. In a way the context is one sense. A digital only world is still a world.

I think more success is in a model having high level needs and aspirations that are borne from lower level needs. Model architecture also needs to shift to multiple autonomous systems that interact, in the same ways our brains work - there's a lot under the surface inside our heads, it's not just "us" in there.

We only interact with our environment because of our low level needs, which are primarily: food, water. Secondary: mating. Tertiary: social/tribal credit (which can enable food, water and mating).

omegastick · 2026-03-10T16:21:20 1773159680

Because if you have an explicit world model you can optimize against it.

It sounds like you are imagining tacking a world model onto an LLM. That's one approach but not what LeCun advocates for.

fennecfoxy · 2026-03-10T10:09:23 1773137363

Lmao why. Stop driving through red lights, stop speeding. Ya fuckers.

In the UK it's ridiculous, barely any speed cameras and those that are there are clearly marked (legally have to be). Everyone just slows down for the speed cameras and then starts speeding again after.

I've actually heard people say that the above is effective because it makes people slow down where it's important. Or, you know, how about people just don't fucken speed in general?

If it were up to me they'd be everywhere, totally unmarked and all revenue from fines would go to charitable causes to rule out the "but they just do it for da money!11" bs - no, they're doing it to stop people speeding and killing someone for fuck's sake.

Stop speeding.

Orygin · 2026-03-10T10:41:23 1773139283

Except cameras don't increase safety. You say it yourself that everyone just speeds up after the camera.

Getting a ticket also does nothing to prevent you from speeding in the first place (the ticket does not reach you instantly, you're still speeding on the road).

Road safety is an infrastructure problem, but it is always easier and cheaper to just put up a camera and collect money, while designing roads on which you cannot go too fast, and actually building them, costs money.

They just want the cheapest option to say "we did something". Not the safest.

presentation · 2026-03-10T11:22:14 1773141734

One time when I was living in Shanghai, I accidentally took the train to the wrong airport and had to take a cab to the other one. The cabbie was driving on the highway right at the speed limit, and I was worried I wouldn’t make my flight. I asked him if he could rush a bit, but he replied that he would not speed because 100% he would get a ticket.

It only doesn’t work if the system is half-assed. I agree that in low-speed pedestrian areas the built form is a better solution, but knowing you will get caught is also effective (if you accept the privacy tradeoffs).

Orygin · 2026-03-10T12:07:52 1773144472

It is effective if there are cameras everywhere, meaning you are tracked and spied on everywhere you go.

I'd prefer we spend a bit more on the road infrastructure than live in a surveillance hell.

fennecfoxy · 2026-03-12T15:46:05 1773330365

>You say it yourself that everyone just speeds up after the camera

...because they know where it is and there are so few of them...

Humans are very good at trying to get away with things. The only way to solve speeding is for people to receive the fines, receive the points on their license. They only speed because they're selfish fucks and because it's so easy to get away with it.

Orygin · 2026-03-13T09:28:17 1773394097

So put cameras everywhere? What a privacy nightmare.

fennecfoxy · 2026-03-10T10:05:34 1773137134

I think you mean LoRAs more than a fine-tune. Yes, plus there are plenty of online resources to train a LoRA as well; on CivitAI you can just give it a bunch of images + labels and it just does it for you, so the bar is pretty low.
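For context on why the bar is lower than a full fine-tune, here is a minimal sketch of the standard LoRA update, W' = W + (alpha / r) * B A, in plain Python. The matrices and `alpha` are toy values and `lora_merge` is a hypothetical helper; this illustrates the arithmetic only and is not how CivitAI or any particular trainer implements it.

```python
def matmul(a, b):
    # Plain nested-list matrix product: (n x r) times (r x m).
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(w, b, a, alpha):
    # Merge a low-rank adapter into frozen weights: W + (alpha / r) * (B @ A).
    r = len(a)  # rank = number of rows of A = number of columns of B
    delta = matmul(b, a)
    scale = alpha / r
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen 3x3 base weights
b = [[1.0], [0.0], [2.0]]                                # 3x1, trained
a = [[0.5, 0.0, 1.0]]                                    # 1x3, trained
merged = lora_merge(w, b, a, alpha=2.0)

full_params = len(w) * len(w[0])                          # what a full fine-tune updates
lora_params = len(b) * len(b[0]) + len(a) * len(a[0])     # what the adapter trains
```

At this toy size the saving is small, but the adapter's parameter count grows linearly with the layer width while the full matrix grows quadratically, which is what makes training on modest hardware practical.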

fennecfoxy · 2026-03-10T10:02:54 1773136974

I mean there's no point; everyone still gets super mad even in the cases where models were trained only on content that a company owns or has paid for.

I wish artists would stop with the "it stole our work" bullshit and just be more honest about the "it can do what we do and we're terrified and scared for our future" part.

Because that I can 100% understand, and contrary to previous jobs just disappearing, we do live in "the future" and things like UBI or free cross-training should be available for this sort of thing.