Hacker News | rnxrx's comments

Maybe there's an analogy to our long- and short-term memory: immediate stimuli are processed in the context of deep patterns that have accreted over a lifetime. New information can absolutely challenge a lot of those patterns, but to have that information reshape how we fundamentally think takes much longer: more processing, more practice, etc.

In the case of an LLM, that longer-term learning / fundamental structure corresponds to the static weights produced by a finite training process, while the ability to use tools and store new insights and facts is analogous to shorter-term memory and "shallow" learning.

Perhaps periodic fine-tuning has an analogy in sleep, or even in the time we spend in contemplation or practice (or even repetition) to truly "master" a new idea and incorporate it into our broader cognitive processing. We do this kind of thing on a continuous basis, while the machines (at least at this point) perform the process in discrete steps.

If our own learning process is a curve then the LLM's is a step function trying to model it. Digital vs analog.


How about we just start with SCOTUS having transparent (and enforced) ethics and corruption policies?

The issue lies in who enforces it. In theory, that's Congress, with the ability to impeach and convict members of SCOTUS.

I've also thrown around the idea of state supreme courts' chief justices having a channel to court-martial a SCOTUS justice and eject them with a supermajority ruling. Or a panel of federal judges. But there's so much more involved there that I haven't begun to consider.


I'm also increasingly aware that my own writing style and punctuation seem to line up with what might be associated with an AI, but some of the tells (em-dashes, spaces after periods, etc.) seem like artifacts of when in history we learned to write.

I wonder how much crossover there would be between a trained text analysis model looking for Gen-X authors and another looking for LLMs.


I worked on something like this in 2000-2001. We were attempting to identify the native language and origin region of authors based on aberrant constructions in their second language (as a simple case, a French person writing English might say "we are tuesday"). It was accurate and fast given the state of the art back then; I think you could one-shot it with a general-purpose LLM today.
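A minimal sketch of the idea: flag English phrasings that are literal translations (calques) from another language and score candidate native languages by match count. The pattern table here is purely illustrative, not a real linguistic resource, and a production system would use learned features rather than a handful of regexes.

```python
import re

# Hypothetical map of calque patterns -> likely native language.
# Each pattern is a literal translation that a native speaker of that
# language might produce when writing English.
CALQUE_PATTERNS = {
    "fr": [
        # "nous sommes mardi" -> "we are tuesday"
        r"\bwe are (monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
        # "j'ai 20 ans" -> "i have 20 years"
        r"\bi have \d+ years\b",
    ],
    "es": [
        # "tengo 20 años" -> "i have 20 years"
        r"\bi have \d+ years\b",
    ],
}

def guess_native_language(text: str) -> dict:
    """Count calque matches per candidate L1; more matches = stronger signal."""
    text = text.lower()
    scores = {}
    for lang, patterns in CALQUE_PATTERNS.items():
        hits = sum(1 for p in patterns if re.search(p, text))
        if hits:
            scores[lang] = hits
    return scores

print(guess_native_language("Sorry I am late, we are tuesday and I forgot."))
# → {'fr': 1}
```

A modern alternative would replace the pattern table with a single prompt to a general-purpose LLM asking it to identify likely L1 interference, which is roughly the "one-shot" claim above.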

People don't put spaces after periods? Do people really write.like.this?

On the Gboard keyboard. Without fail.

But that's a different issue.


Happens to me all the time when trying to type a search phrase in Safari on iPhone, for some reason.

When you’re trying to type a URL, there’s a period next to the space bar right where your right thumb usually hits space, but if you’re just texting, iOS won’t show it. That’s my theory: just muscle memory.

The development of steam technology is a great metaphor. The basic understanding of steam as a thing that could yield some kind of mechanical force almost certainly predated even the Romans. That said, it was the synthesis of other technologies with these basic concepts that started yielding really interesting results.

Put another way, the advent of industrialized steam power wasn't so much about steam per se, but rather the intersection of a number of factors (steam itself obviously being an important one). This intersection became a lot more likely as the pace of innovation in general began accelerating with the Enlightenment and the ease with which this information could be collected and synthesized.

I suspect that the LLM itself may also prove to be less significant than the density of innovation and information of the world it's developed in. It's not a certainty that there's a killer app on the scale of mechanized steam, but the odds of such significant inventions arguably increase as the basics of modern AI become basic knowledge for more and more people.


It's mostly metallurgy. The fact that we became so much better and more precise at metallurgy is what enabled us to make use of steam machines. Of course a lot of other things helped (glassmaking and whale oil immediately come to mind), but mostly, metallurgy.


I remember reading an article that argued it was basically a matter of path dependence. The earliest steam engines that could do useful work were notoriously large and fuel-inefficient, which is why their first application was pumping water out of coal mines: that effectively made the fuel problem moot, and their other limitations didn't matter much in that context, while at the same time rising wages in the UK made even those inefficient engines more affordable than manual labor. Their use in that very narrow niche then allowed them to be gradually improved to the point where they became suitable for other contexts as well.

But if that analogy holds, then LLM use in software development is the "new coal mines" where it will be perfected until it spills over into other areas. We're definitely not at the "Roman stage" anymore.


If we go by that analogy, I think LLMs (and all of our current programming automation, like compilers) are just different mechanical parts. They will improve in quality and precision, and the surrounding products will make them even more effective (MCP is the vulcanized rubber here? :D), but they aren't coal or even the steam engine.


There's a point in the article that mentions allowing the model to ask questions. I've found this to be especially helpful in avoiding the bad or incomplete assumptions that so often lead to lousy code and debugging.

The (occasionally) surprising part is that the generated clarifying questions sometimes spawn questions of my own. Making the process more interactive is sort of a pseudo rubber-duck exercise: forcing yourself to specifically articulate ideas serves to solidify and improve them.


I think the progression of sentiment is basically the same. There were lots of folks pushing the idea that connecting us all would somehow advance the human race by putting information at our fingertips, which was eventually followed by concern about kids getting obsessed/porn-saturated.

The same cycle happened (is happening) with crypto and AI, just on more compressed timeframes. In both cases an initial period of optimism transitioned into growing concern about the negative effects on our societies.

The optimistic view would be that the cycle shortens so much that the negatives of a new technology are widely understood before that tech becomes widespread. Realistically, we'll just see the amorality and cynicism on display and still sweep it under the rug.


It's only a matter of time until private enterprises figure out they can monetize a lot of otherwise useless datasets by tagging them and selling them (likely via a broker) to organizations building models.

The implications for valuation of 'legacy' businesses are potentially significant.


Already happening.


The other side of this argument is that we're constantly fed lots of extraneous information along with the actual interesting content. The point about listening to the storyteller is completely valid, but that story teller wasn't full of advertisements, links to other stories or entreaties to smash a like button.

To an extent we're becoming wired to skim content because that content has been so deeply interleaved with items that aren't just extraneous, they're not even from the storyteller. I'd suggest this capability is even a kind of survival skill, akin to not only being able to spot motion in a dense jungle but to also instinctively focus on certain kinds of motion.


I'm curious why the performance gains mentioned were so substantial for Qwen vs. Llama.


It looks like llama.cpp has some performance issues with bf16.


I had the same experience with Digital Ocean. Thankfully there were several other providers happy to take my money immediately.


DO will sign off in minutes (or did for me).

