Hacker News | bilsbie's comments

I can’t see how this is possible. You’re losing so much information.

It's because they're natively trained at 1 bit, so nothing is lost at inference time. Now, the question might be how they manage to get decent predictive performance with so little precision. That I don't know.

Not training. They transpose rows/columns of the matrices so that groups of 128 parameters share a (similar) scale factor. This is a Qwen-3 model.
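If I'm reading that right, it's the usual grouped-scale trick. A minimal sketch of the idea (my own illustration with ternary codes, not the actual Qwen-3 pipeline): split the weights into blocks of 128, give each block one shared float scale, and store each weight as a low-bit code times that scale.

```python
import numpy as np

def quantize_groups(w, group_size=128):
    """Quantize a 1-D weight vector to ternary {-1, 0, +1} codes,
    with one shared float scale per group of `group_size` weights.
    A rough sketch of the grouped-scale idea, not any model's exact code."""
    w = w.reshape(-1, group_size)                      # one row per group
    scale = np.abs(w).mean(axis=1, keepdims=True)      # shared scale per group
    q = np.clip(np.round(w / (scale + 1e-8)), -1, 1)   # ternary codes
    return q.astype(np.int8), scale

def dequantize_groups(q, scale):
    """Reconstruct approximate weights: code times shared scale."""
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
q, s = quantize_groups(w)
w_hat = dequantize_groups(q, s)
```

Storage drops to ~1.6 bits per weight plus one float per 128 weights, and the reconstruction stays strongly correlated with the original, which is roughly why this works at all.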

I'm not sure what you mean. Could you please elaborate?

I always remind myself and everyone else that human DNA is "only" 1.6 GB of data, and yet it encodes all of the complex systems of the human body including the brain, and can replicate itself. Our intuitive feel for how much stuff can be packed into how many bits is probably way off from the true limits of physics.

It encodes the data on top of locally optimal trajectories in the physical world that were learned in millions of years of evolution. Treat this as context, not weights.

That's not strictly true: DNA doesn't replicate itself; a cell with DNA replicates itself.

You need to count the information contained in the non-DNA part of the cell too.

Just in case it's not obvious, you can't take human DNA and put it in a cat cell, it won't work, that cell won't replicate.


True.

For now, DNA replication and the synthesis of RNA and proteins from the information stored in DNA are the best-understood parts of how a cell grows and divides, but how other complex cellular structures, e.g. membranes or non-ribosomal peptides, are assembled and replicated is much less well understood.

We need more years of research, perhaps a decade or two, before we will know the total amount of information needed to describe a simple bacterial cell, and perhaps even longer for a much more complex eukaryotic cell.


Human DNA has 3.2 billion base pairs, and at 2 bits per base (4 letters as opposed to 2), that's roughly 800 MB of data.

Second, what's even more crazy is that roughly 98% of that DNA is actually non-coding.. just junk.

So, we are talking about encoding the entirety of the logic needed to construct a human body in just around 16 MB of data!!!

That's some crazy level of recursive compression... maybe it's embedding "varying" parsing logic, mixed with data, along the chain.
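The back-of-envelope numbers above work out like this (using the figures quoted in the comment, which are approximations themselves):

```python
base_pairs = 3.2e9            # human genome, base pairs
bits = base_pairs * 2         # 4 letters (A/C/G/T) = 2 bits per base
total_mb = bits / 8 / 1e6     # bits -> bytes -> megabytes: 800 MB
coding_mb = total_mb * 0.02   # ~2% protein-coding: about 16 MB
```

(Using decimal megabytes; with the 98%-non-coding assumption the coding fraction indeed lands at roughly 16 MB.)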


>Second, what's even more crazy is that roughly 98% of that DNA is actually non-coding.. just junk.

I think it's a myth that non-coding DNA is junk. See:

https://www.nature.com/articles/444130a

>'Non-coding' DNA may organize brain cell connections.


As another poster has said, much of the "junk" is not junk.

The parts of the DNA with known functions encode either proteins or RNA molecules, being templates for their synthesis.

The parts with unknown functions include some amount of true junk caused by various historical accidents that have been replicated continuously until now, but they also include a lot of DNA that seems to have a role in controlling how the protein or RNA genes are expressed (i.e. turning off or on the synthesis of specific proteins or RNAs), by mechanisms not well understood yet.


And anybody who’s ever met a baby can tell you, they score very poorly on most llm benchmarks.

Would you use a js game engine or just vanilla js?

Just vanilla JS unless you've got prior experience because any engine you use is going to have a setup process and bootstrapping code and a learning curve for you that will eat into your time. Across the weekend you might only really have a few hours to dedicate to this project and to hold their attention.

Using the "memory" game as an example: do you want the problem you solve to be how to shuffle the cards into a random order? Or do you want to be solving why the cards are all positioned weirdly, because PhaserJS defines an anchor "origin" point on objects that defaults to x 0.5 / y 0.5, meaning 50% width / 50% height, aka the center of the object, so you need to either set their origin to x 0 / y 0 or factor that into their position by subtracting half their width and height; and their width and height have scaled and unscaled values too (width vs displayWidth); and of course if you're using a group for the cards' display objects, that class does not support setting the origin.
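For what it's worth, the shuffle problem from the vanilla route really is tiny. A sketch in Python (names are mine; the vanilla-JS version is an equally short Fisher-Yates loop):

```python
import random

def shuffled_deck(symbols):
    """Build a memory-game deck: each symbol appears twice,
    then the order is randomized (random.shuffle is Fisher-Yates)."""
    deck = [sym for sym in symbols for _ in range(2)]
    random.shuffle(deck)
    return deck

deck = shuffled_deck(["A", "B", "C", "D"])
```

That's the whole "hard part" of the vanilla approach, versus debugging an engine's coordinate conventions.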


Has anyone found a good prompt to fix this? It seems like a subtle problem because it’s 90% too agreeable but will sometimes get really stubborn.

There is no sufficient prompt, because this is trained into them during the mid-to-late training phases. It's ingrained in the weights.

State that the idea comes from a third party, then ask for pros/cons. You just have to find a way to counter its nature.

It seems like most breakthroughs I see are for efficiency? What are the most important breakthroughs from the past two or three years for intelligence?

If you think of it from the point of view of the universal approximation theorem, it's all efficiency optimisation. We know that it works if we do it incredibly inefficiently.

Every architecture improvement is essentially a way to achieve the capability of a single fully-connected hidden-layer network of width n, with fewer parameters.

Given these architectures usually still contain fully connected layers, unless they've done something really wrong, they should still be able to do anything if you make the entire thing large enough.

That means a large enough [insert model architecture] will be able to approximate any function to arbitrary precision. As long as the efficiency gains with the architecture are retained as the scale increases they should be able to get there quicker.
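A toy illustration of that point (my own sketch, not from the thread): even a single hidden layer of random, untrained ReLU features, with only the linear readout fitted, can drive the error on a smooth 1-D target to near zero once the layer is wide enough.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)[:, None]   # inputs
y = np.sin(x).ravel()                          # target function

# One hidden layer of width n: random fixed features, trained linear readout.
n = 500
W = rng.normal(size=(1, n))
b = rng.normal(size=n)
H = np.maximum(x @ W + b, 0.0)                 # ReLU hidden activations
coef, *_ = np.linalg.lstsq(H, y, rcond=None)   # fit only the output layer
err = np.max(np.abs(H @ coef - y))             # worst-case error on the grid
```

With n large enough the fit is essentially exact on the sample points, which is the brute-force end of the trade-off every architecture improvement tries to beat on parameter count.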


Most breakthroughs that are published are for efficiency because most breakthroughs that are published are for open source.

All the foundation model breakthroughs are hoarded by the labs doing the pretraining. That being said, RL reasoning training is the obvious and largest breakthrough for intelligence in recent years.


With all the floating around of AI researchers though, I kind of wonder how "secret" all these secrets are. I'm sure they have internal siloing, but even still, big players seem to regularly defect to other labs. On top of this, all the labs seem to be pretty neck and neck, with no one clearly pulling ahead across the board.

> What are the most important breakthroughs from the past two or three years for intelligence?

The most important one in that timeframe was clearly reasoning/RLVR (reinforcement learning with verifiable rewards), which was pioneered by OpenAI's Q* aka Strawberry aka o1.


Efficiency gains can be used to make existing models more profitable, or to make new larger and more intelligent models.

Some yes, others no. Distillation and quantization can't be used to make new base models since they require a preexisting one.
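For concreteness, a minimal sketch of the standard distillation objective (temperature-scaled KL divergence; function names are mine): the teacher's logits are an input to the loss, which is exactly why distillation presupposes an existing larger model.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) at temperature T, scaled by T^2 --
    the usual knowledge-distillation objective. Note the teacher's
    logits are required inputs: no preexisting model, no distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

The loss is zero when student and teacher agree and positive otherwise; the student can only chase a distribution that something bigger has already produced.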

It enables models larger than were previously possible.

No because the base model from which the distilled or quantized models are derived is larger.

This is an intelligence breakthrough.

I’m confused why the hype and the investment got so high, and why everyone treats it like a race. Why can’t we gradually develop it like DNA sequencing?

To be fair, DNA sequencing was very hyped up (although not nearly as much as AI). The HGP finished two years ahead of schedule, which is sort of unheard of for something in its domain, and was mainly a result of massive public interest in personalized medicine and the like. I will admit that a ton of foundational DNA-sequencing work evolved over decades, but the massive leap forward in the early 2000s is comparable to the LLM hype now.

I assumed it was obvious: being first is all that matters. Investors don't want to invest in second place. Obviously, "first" means achieving AGI, not some GPT bot. That's why so many people keep saying AGI is _____ weeks away, with some even preposterously stating that AGI might have already happened. They need to keep attracting investors. Same as Musk constantly saying FSD is ____ weeks away.

If I’m understanding this correctly, it’s a one-stop shop for an entire out-of-the-box IT department.

No, there's no mention of MDM.

Ctrl-F MDM

"Apple Business offers built-in mobile device management (MDM) [...]"


That’s the wrong lesson. Rather, we should control the things we own, not let them control us.

Did you make the graphics yourself or find them?

Yep, I made the sprite sheets etc by myself in Figma.

Someone should make this for pickleball.

I’d love to do it if I knew anything about pickleball tactics, but I don’t.

I’m mulling over a similar idea for pickleball. Not a puzzle though. Feel free to get in touch (email in profile).

Ask your agents, come on!

Haha, maybe I should :D

I've always wondered why there isn't a whole bunch of minerals from the asteroid inside the crater. Shouldn't it be loaded with gold and such?


"The object that excavated the crater was a nickel-iron meteorite about 160 ft (50 m) across."[0]

Not a lot of gold and such. It's not like the impact was going to fuse atoms of nickel-iron into gold.

[0] https://en.wikipedia.org/wiki/Meteor_Crater


1. The impact object was a nickel-iron meteorite. There's not much of anything else in those kinds of objects.

2. Most of the impact object vaporized in the estimated 10MT release of energy.


Maybe there are. Buried deep under sediments.

