I remember a scene in this show which felt like many real meetings I've had in my life. The big hot shot CEO guy pulls everyone into a meeting to share his big idea. The idea? Let's sell a computer that's "twice the speed, half the price!"
...The engineer then rolls his eyes like "yeah no duh". If we could just magically do stuff like that, we would have done it already. Classic management thinking they have an original idea with no understanding of the engineering beneath it all. I thought they would just tell him off and that would be it. I really felt seen in that moment.
The frustrating thing is, they then take pointy haired boss's idea seriously. The rest of the season is spent actually pursuing that dumb, dumb idea... This felt disrespectful, and I stopped watching.
There's a case cited in that paper which does suggest something similar:
> A report in the lay literature describes the case of Claire Sylvia who reported changes in her personality, preferences, and behaviors following a heart and lung transplant at Yale-New Haven hospital in 1988. Following surgery, Sylvia developed a new taste for green peppers and chicken nuggets, foods she previously disliked. As soon as she was released from the hospital, she promptly headed to a Kentucky Fried Chicken to order chicken nuggets. She later met her donor’s family and inquired about his affinity for green peppers. Their response was, “Are you kidding? He loved them… But what he really loved was chicken nuggets” (p. 184, [9]). Sylvia later discovered that at the time of her donor’s death in a motorcycle accident, a container of chicken nuggets was found under his jacket [9].
I haven't read the whole thing; maybe there's something more relevant as well. That report isn't really about accessing the previous person's "memories," but it at least claims she adopted part of their personality. I'd be skeptical about its accuracy without more such reports, however.
A safer assumption would be that our body influences our behavior and tastes, and in turn they are directly affected by changes in our body, like an organ transplant.
A more interesting question regarding the case above would be "what's in our heart and lungs that affects our perception of capsaicin?"
So if this were true, you'd expect people with spine injuries to forget large parts of their lives? And what mechanism would transfer these memories from organs to the brain?
Could you maybe have your harness limit the memory of Claude and then occasionally, when Claude specifically asks for it ("i need to remember something"), you can give Claude the full game history? Most turns, I'll bet it's okay to have a short context and maybe some notes. And then maybe once in a while it's nice to see the full chat history. Wdyt?
Not exactly the same, but kinda: my gen 1 Google Home just got Gemini and it finally delivers on the promise of like 10 years ago! Brought new life to the thing beyond playing music, setting timers, and occasionally asking really basic questions
I think of it more from an information retrieval (i.e. search) perspective.
Imagine the input text as though it were the whole internet and each page is just 1 token. Your job is to build a neural-network Google results page for that mini internet of tokens.
In traditional search, we are given a search query, and we want to find web pages via an intermediate search results page with 10 blue links. Basically, when we're Googling something, we want to know "What web pages are relevant to this given search query?", and then given those links we ask "what do those web pages actually say?" and click on the links to answer our question. In this case, the "Query" is obviously the user search query, the "Key" is one of the ten blue links (usually the title of the page), and the "Value" is the content of the web page that link goes to.
In the attention mechanism, we are given a token and we want to find its meaning when contextualized with other tokens. Basically, we are first trying to answer the question "which other tokens are relevant to this token?", and then given the answer to that we ask "what is the meaning of the original token given these other relevant tokens?" The "Query" is a given token in the input text, the "Key" is another token in the input text, and the "Value" is the final meaning of the original token with that other token in context (in the form of an embedding). For a given token, you can imagine it is as though the attention mechanism "clicked the 10 blue links" of the other most relevant tokens in the input and combined them in some way to figure out the meaning of the original query token (and also you might imagine we ran such a query in parallel for every token in the input text at the same time).
So the self-attention mechanism is basically Google search, but instead of a user query, it's a token in the input; instead of a blue link, it's another token; and instead of a web page, it's meaning.
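The search analogy above maps directly onto the standard scaled dot-product attention formula. Here's a minimal NumPy sketch (single head, toy dimensions, random made-up weights; `self_attention` and the weight names are my own illustration, not anyone's actual implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) -- one embedding per token ("page" in the analogy)
    Q = X @ Wq  # each token's "search query"
    K = X @ Wk  # each token's "blue-link title"
    V = X @ Wv  # each token's "page content"
    d_k = K.shape[-1]
    # relevance of every key to every query -- the "ranked results page";
    # each row is a probability distribution over the other tokens
    scores = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
    # blend the "page contents", weighted by relevance
    return scores @ V

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one contextualized embedding per input token
```

Note that every row of `scores` runs in parallel, which is the "we ran such a query for every token at the same time" part of the analogy.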
Read through my comments and those of others in this thread: the way you are thinking here is metaphorical and so disconnected from the actual math as to be unhelpful. It is not the case that you can gain a meaningful understanding of deep networks by metaphor. You actually need to learn some very basic linear algebra.
Heck, attention layers never even see tokens. Even the first self-attention layer only sees token embeddings plus positional embeddings, and all subsequent attention layers just see complicated embeddings that are a mish-mash of the previous layers' embeddings.
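To make that concrete, here's a toy sketch (made-up dimensions and random tables, with a stand-in "layer" in place of real attention): the token IDs are used exactly once, to look up rows of an embedding table, and everything after that point is just vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d_model, seq_len = 100, 8, 5

embed_table = rng.normal(size=(vocab, d_model))  # learned token embeddings
pos_table = rng.normal(size=(seq_len, d_model))  # learned positional embeddings

token_ids = np.array([3, 17, 42, 17, 9])  # the only place token identity appears

# input to the first layer: embeddings, not tokens
h0 = embed_table[token_ids] + pos_table

def toy_layer(h):
    # stand-in for one transformer layer: uniform mixing across positions,
    # so the output at each position is a blend of every input position
    mix = np.ones((seq_len, seq_len)) / seq_len
    return mix @ h

h1 = toy_layer(h0)  # what the second layer would see: no token IDs anywhere
```

After even one mixing step, each row of `h1` is a combination of all positions, which is why later layers can't be said to operate on "tokens" in any direct sense.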
DOGE cut muscle, sure. They cut the bones too. They sold one of our kidneys on the black market. And then they jabbed us in the eyes, Three Stooges style, for good measure, so we couldn't even see how bad it really was.
We went in for liposuction and buccal fat removal surgery and came out the other side severely disfigured, with Mar-a-Lago face and a hunchback.
That's so generous it's past inaccurate. If they genuinely wanted a liposuction they wouldn't have hired their local pedophile rapist grifter to do it.
What I'm getting from this thread is that people have their own private benchmarks. It's almost a cottage industry. Maybe someone should crowdsource those benchmarks, keep them completely secret, and create a new public benchmark of people's private AGI tests. All they should release for a given model is the final average score.