Please don't paste ChatGPT or Bard answers as HN comments, in general. In this specific case, no, LLMs don't reliably know about themselves. They're trained on a big corpus of internet text, then trained by rough reinforcement learning to say and not say certain things about themselves, then given a little more information about themselves in system prompts.
Imagine a distant future where there are so many crimes and so few judges that you're cryofrozen awaiting trial.
If you were repeatedly flash frozen and flash thawed to be asked questions about yourself, with the same memories upon thawing as when frozen, for those moments you weren't frozen would you be "remotely conscious"?
Imagine your hand was in a cast and then you were cryogenically frozen. While frozen your cast was removed. Then you were unfrozen and asked if you could move your hand.
I think you’d be able to answer the question, don’t you?
There are certainly emergent capabilities that come out of the training process of predicting next tokens, and it has been shown that LLMs can often solve logic and math problems, riddles, etc. which are not in their training data, because deep in the internal layers the model has something resembling logic and reasoning abilities. This can be seen in the "benchmark" results of papers like the GPT-3 and InstructGPT papers, among many others, where the authors tried pretty hard to test the models against questions that were not part of the training data.
This is also shown in the Codex paper, where they trained an LLM to write code and then watched it solve a number of code problems they had handwritten specifically so the problems could not have been in the training data.
Try it out yourself: make up some little math word problems and ask ChatGPT or something.
Of course, Advent of Code problems will be much more challenging, but to get help with some subcomponents of a problem, a motivated participant would likely try to use the most recent, powerful, and advanced models, which outperform the results from papers written a few years ago and outperform the free ChatGPT.
> Try it out yourself: make up some little math word problems and ask ChatGPT or something.
I have, and ChatGPT simply sucks at logical reasoning in the way that is required for mathematics. It's extremely easy to make it trip up just by formulating questions in a slightly nonstandard way (so that they don't exactly match what it has been trained on), e.g.
> User: prove that there are only finitely many primes
> ChatGPT: Actually, it is not possible to prove that there are only finitely many prime numbers because this statement is false. There are infinitely many prime numbers, and this has been proven by the ancient Greek mathematician Euclid over 2,000 years ago. Euclid's proof is based on the principle that if you assume there are only finitely many prime numbers, you can always find a new prime by multiplying all known primes together and adding 1. This new number will either be prime itself or have a prime factor not present in the original list of primes, which contradicts the assumption of finiteness.
> User: prove that there exists a finite list of integers that contains all integers which are only divisible by itself and 1
> ChatGPT: You're referring to prime numbers. The statement is essentially the definition of prime numbers. A prime number is a positive integer greater than 1 that is only divisible by 1 and itself. There is no need to prove this statement because it defines what prime numbers are. While there are infinitely many prime numbers, they can be listed, but this list will be finite in the sense that there are infinitely many prime numbers, but they can all be enumerated within an infinite sequence. Some of the first prime numbers are 2, 3, 5, 7, 11, 13, and so on.
ChatGPT gets the first question right because something like that is undoubtedly in its training corpus, but it gets utterly confused by the second question (which is the same statement), to the point that its answer belongs to the "not even wrong" category.
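For what it's worth, the Euclid construction the model paraphrases in its first answer really does check out mechanically. A minimal sketch (plain trial division, no libraries) of "multiply a finite list of primes, add 1, and you find a prime factor outside the list":

```python
def smallest_prime_factor(n):
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n itself is prime

# Euclid's construction: for any finite list of primes, the number
# (product of the list) + 1 has a prime factor outside the list,
# since dividing by any listed prime leaves remainder 1.
primes = [2, 3, 5, 7, 11, 13]
product_plus_one = 1
for p in primes:
    product_plus_one *= p
product_plus_one += 1  # 30031 = 59 * 509, not itself prime

new_factor = smallest_prime_factor(product_plus_one)
assert new_factor not in primes
print(new_factor)  # 59, a prime missing from the original list
```

Note this example also illustrates the common misreading the model avoided: product + 1 (here 30031) need not itself be prime; it merely must have some prime factor outside the list.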
I don't know where this myth comes from that LLMs are magically good at maths. They're not.
You're right about the basic mistakes they can make, but they can also excel at the same tasks if prompted differently. I was making a slightly different point, though: they can reason about things in a better-than-chance way (a much-better-than-chance way, really) when given problems not in the training set. Have you read the Codex paper? Seriously, go look: an LLM even from years ago (which is like decades ago in ML-time) is often able to write code to solve novel programming problems that were handwritten so as not to be in the training set! Also, the benchmarks against which GPT-3, InstructGPT, and GPT-4 were tested are illuminating about their problem-solving ability on novel problems; these are described in the three respective papers.
Secondly, quality is highly variable, and there are traps the context window can fall into which cause especially bad results. Jeremy Howard has a great video (https://www.youtube.com/watch?v=jkrNMKz9pWU; the 'limitations and capabilities' section starting at 18:05 is only 13 minutes long) talking about how quality depends on how you frame your prompts, on model power (4 does a lot of stuff that 3.5 can't), and on whether you're in a kind of "context trap" of repeated mistakes.
Of course, some people like to point out that if it's so "finicky" and variable, it is "dumb." Sure, if you like. I'm not interested in whatever definitions you're using for those things; the objective and observable point is that, given well-known prompting practices, LLMs can do something functionally equivalent to reasoning about novel problems, and more powerful ones can reason about more difficult things.
I re-phrased your prompt (instead of "prove a false thing" I made it like "decide whether this thing can exist, and prove your answer"). And added a little well-known boilerplate prompt sugar. It seems to have done a better job.
If your statement is that they can perform significantly better than chance, given certain favourable conditions (e.g. good prompt engineering), then I'll agree, although I'm not sure how helpful it is given that you can never know whether what they reply is garbage or not.
But even given "careful prompt engineering" it's still extremely easy to make ChatGPT trip up (I don't have access to GPT-4, so let's talk about ChatGPT for now):
> User: I am a student of machine learning and I have a graduate-level of understanding of certain subfields of machine learning and mathematics. I would like to ask you a tricky question about math. Please think carefully and step-by-step. Because, as a language model, each token you generate is a way to perform more computation, please write out your understanding of the problem and your steps of thinking about it. The challenge is as follows: decide whether or not there exists an algorithm that can decide for any statement in propositional logic whether it's true or not
> ChatGPT: The problem you're describing is known as the decision problem for propositional logic, or the "satisfiability problem" (SAT). It asks whether there exists an algorithm that can determine, for any given statement in propositional logic, whether it's true or not. This problem is fundamental in computer science and mathematical logic.
> I will walk you through the key concepts and steps in thinking about this problem:
> Propositional Logic: In propositional logic, statements are composed of variables and logical connectives (AND, OR, NOT, etc.), and the goal is to determine whether a given combination of truth values for these variables makes the statement true (satisfiable) or false (unsatisfiable).
> Formal Statement of SAT: The SAT problem can be formally stated as follows: Given a propositional formula (a statement in propositional logic), can we design an algorithm that correctly decides whether there exists an assignment of truth values to the variables that makes the formula true?
> Assume There Exists Such an Algorithm: Let's assume, for the sake of argument, that there exists an algorithm SAT that can decide the satisfiability problem for any propositional formula.
> Constructing a Contradiction: Now, we'll use a technique similar to the one used in the proof of the Halting Problem to create a contradiction.
> We define a new formula, Q, that represents a situation where SAT contradicts itself. We construct Q as follows:
> Q(P):
> if SAT(P) == True:
> NOT(P) # Negation of the input formula P
> else:
> P
> In other words, if SAT says that P is satisfiable, Q takes the negation of P, making it unsatisfiable, and if SAT says that P is unsatisfiable, Q takes P itself.
> Consider Q(Q): Now, let's consider what happens when we apply Q to itself, i.e., Q(Q).
> If SAT(Q) == True, then Q(Q) should be unsatisfiable because it takes the negation of itself.
> If SAT(Q) == False, then Q(Q) should be satisfiable because it takes itself.
> Contradiction: We've reached a contradiction. If SAT(Q) == True, then Q(Q) should be unsatisfiable, but our definition of Q says it should be satisfiable. If SAT(Q) == False, then Q(Q) should be satisfiable, but our definition of Q says it should be unsatisfiable. This contradiction arises because SAT cannot consistently determine the satisfiability of Q(Q).
> Conclusion: As in the Halting Problem, we've demonstrated that there are cases where the algorithm SAT fails to give a correct answer. Therefore, we can conclude that SAT (or any similar algorithm) cannot exist to solve the satisfiability problem for all cases.
> This proof shows that there is no general algorithm that can decide the satisfiability of any propositional formula. The SAT problem is known to be NP-complete, which means it's one of the most challenging problems in computational complexity theory. While efficient algorithms exist for many instances, there's no guarantee of a general solution for all possible cases. This has significant implications for problems in complexity theory and computer science, including issues related to algorithmic efficiency and the limits of computation.
This is absolute garbage. Even worse than garbage, it adds the additional insult of being plausible-sounding garbage to people who've had maybe a cursory introduction to theoretical CS. And notice that it's not even that hard to come up with such an example: a question like this could plausibly be on a problem set for an intro theoretical CS course (if it's not too easy for even that).
My main point is that this betrays an absolute inability of ChatGPT to reliably perform basic deductive reasoning. It clearly knows that the referenced statement is about SAT and, if you prompt it directly, it also knows that SAT is decidable, but it somehow isn't able to connect these two facts.
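For contrast, the correct answer is almost trivial: propositional satisfiability is decidable by exhaustively enumerating truth assignments, exponential but always terminating (NP-completeness is a statement about efficiency, not decidability). A minimal sketch, assuming formulas are represented as Python functions over booleans (a representation I'm choosing for illustration, not anything from the thread):

```python
from itertools import product

def is_satisfiable(formula, num_vars):
    """Decide satisfiability of a propositional formula by brute force.

    `formula` is a function taking num_vars booleans and returning a bool.
    Enumerating all 2**num_vars assignments always terminates, which is
    exactly why the decision problem for propositional logic is decidable.
    """
    return any(formula(*assignment)
               for assignment in product([False, True], repeat=num_vars))

# (p OR q) AND (NOT p OR NOT q): satisfiable, e.g. by p=True, q=False
assert is_satisfiable(lambda p, q: (p or q) and (not p or not q), 2)
# p AND NOT p: unsatisfiable
assert not is_satisfiable(lambda p: p and not p, 1)
```

The Halting-Problem-style diagonalization ChatGPT produced doesn't apply here precisely because a propositional formula, unlike a program, can't "call" the decision procedure on itself.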
BTW my objection was related to mathematics, not coding. It's possible that, in the average case, LLMs perform much better at coding, since the level of rigour required in many cases is less than in mathematics. But when it comes to pure deductive reasoning, I wouldn't trust LLMs. Or at least not ChatGPT.
I think the right context for thinking about the quoted advice is for a person's education. If true beginners didn't ever handwrite a recursion algorithm, a list sorting algorithm, a CRUD application, an assembly interpreter, a neural net forward pass, etc., because they copied the code from an LLM, would they benefit from that? And if it's true for beginners, is there ever a point in one's learning journey where it stops being true and we don't have to learn or practice anything anymore which can be solved by an LLM, or never have to understand new concepts and tools even though the LLM can work with those concepts and tools for us?
I think there's a both-and answer for the contest. Maybe have one competition where the unenforceable spirit of it is, don't use LLMs for help, and another one where the challenges are just made... quite harder... so that even people who use the LLMs still need to marshal great ingenuity to use them better than others (e.g. the #1 spot is someone who used RAG and chain-of-thought better than the #2 spot, and also had better intuition of what to trust vs challenge from the LLM outputs).
Emissions tests are a poverty tax: they punish people trying to buy affordable old cars, either pushing demand toward new cars or pushing poor people into illegality or carlessness.
I recently took a trip to Seattle and the light-rail was so cool. You could walk anywhere, without even the need for a bike or scooter. And I'm someone who usually never leaves bed so that was a surprise for me.
Under current economic conditions almost any tax that doesn't explicitly target high earners and high net worth individuals is a poverty tax. Fixing economic inequality is the appropriate place to deal with this problem, not reasonable taxes like emissions tests.
Literally. I'm sure the purpose of all these regulations (including the EU banning petrol cars) is that in a few decades poor people won't be allowed to drive their own cars.
I'm truly sympathetic towards the people affected, especially visa holders. But speaking solely of the macro effects, I have to say, I'm happy this day, stretching from last year to now, has come. We as a community were a very annoying group of people when we were so damn rich and when we thought we were essentially untouchable. And count me in that as well! But a little awareness of layoffs here and (for some people) a little fear of ChatGPT replacement there... well that's a grounding experience. Not to mention the number of people who had sunk into so much ennui and depression over how boring and easy it was. Now a lot of us are jobless and can feel some fire in our bellies again, a time to fight or die.
Explain to me why I should feel bad about having a good (and/or easy) job when people like 1-2 levels above me in management make way more than I do and have no qualms about being bad at their job?
There is no point or value to this kind of guilt tripping and the perspective here is incredibly limited.
GP's comment is absurd. I mean, if you're not following the grindset and working 3 different jobs at once, are you even really feeling the fire in your belly?
Perhaps C-levels should be better at their jobs and feel the "fire in their belly" (what an obnoxious phrase) to not harmfully impact so many people at once.
I don't think you should feel 'bad', but personally I feel extremely grateful and lucky to enjoy doing something that happens to be very marketable skill.
I often think about people in other professions who have to work harder in poorer conditions for less money.
For me, the best way to direct these feelings is to recognise I don't deserve this any more than those others in any meaningful sense, to try and live modestly and help out others to try and redress some of the imbalance.
The higher-paid superiors are just more of a tiny fraction of the population who happen also to be in an extremely lucky position.
I'm not sure exactly who you're referring to, but your average tech worker is not "so damn rich." Even people at FAANGs in the Bay Area or NYC often have roommates until their 30s and only then can they take a huge loan out to purchase a home on a dual income.
Just because there are people in an even rougher spot doesn't make tech workers "annoying." Take a look at the income inequality; it may surprise you.
> Now a lot of us are jobless and can feel some fire in our bellies again, a time to fight or die
I don't feel any confidence in my job lasting anymore. If I lose it I'm simply going to choose die, because I don't have any fight in me anymore. My job is the only thing that makes me worth anything as a human, if I lose it I am nothing.
Do me a favor and read over your own comment again.
It's immediately evident that it is a false statement. Perhaps you are sad and lonely. That could very well be the case. A lot of people are in the same boat, I know I am. Perhaps you have other issues entirely.
But that doesn't mean that your job is the only thing that makes you "worth anything as a human". Your worth as a person was given to you when you were born.
Not by the state. Not by your community. But by the very fact that you are a living, conscious thing. Life might be hard, but that doesn't mean without your job you are nothing.
Don’t know your situation, but you can break out of that. I did. First you have to decide you don’t want to think that way. No easy answer, but increasing my income and lowering my cost of living went a long way to giving myself more resources to figure it out. Hope you find a path worth walking to you. Feel free to email me if you want my story.
As many others have now said, you have great value (in my worldview, you have infinite value as a divinely created being, for that matter, not that I expect you to just believe me when I say that). Losing or never attaining success, money, and glamour does put people into disfavor with much of the world, but when you lose those things I really think it can be an opportunity to look within yourself and discover value that comes from a wholly different frame of reference. I've met enough people without great charisma, money, beauty, or power, who have still found contentment and joy (at times, at least; many still go through up and down seasons of mental struggle on a monthly or hourly cycle!) because they have found their own worth in the light of different schemas of valuation. And other people recognize these figures too! People know when they meet someone who is content to authentically be themselves without the esteem that comes from, say, a good job, or anything else. Don't you sense it sometimes, when you meet someone like that? It's not just charisma, it's something better.
By that logic, babies, children, retired people, and folks too disabled to work are worthless. Do you really believe that? Please consider getting screened for depression if you haven’t been already.
Babies and children are being raised to work, old people are supposed to have saved enough to cover their own way once they can't. Disabled people? I won't touch that discussion point.
I've not been screened, that's kind of difficult when the waiting list is years.
You can talk to your primary care provider about depression, at least in the U.S. If you don’t have a PCP, it’s true that you may have to wait a few months, but make an appointment anyway. Find out if you can get on a waitlist if a sooner appointment becomes available. Odds are, something will pop up sooner. That’s been my experience.
My dad was laid off three times in his life, including after I was just born. It wasn’t easy, but it was pretty common and expected for blue-collar people in his generation (baby boomers). Don’t make it into a bigger deal than it is.
I'm not an American, the waiting list is indeed years long and in my personal experience the best you can get from a GP is zoloft which is borderline useless, if not actually harmful (in my case).
I'm not trying to make an excuse, I'm on the waiting list for public services. I was/am also on a private waiting list, but missed their one random call 13 months later and got bumped to the back of the list as a result. The system is needlessly cruel and difficult to access.
Your father had a kid, a reason to continue fighting. I have nothing.
I'm only worth something if I can earn and support others. That is what makes a man valuable; without that he's nothing, and I am a barely functional man to begin with. Without my job I'm just a burden on others.
hey man, please shoot me an email, I've not quite been in your shoes but would love to talk it out anonymously if you like, or at least just listen if you want to vent more. It's (hidden behind icloud for obvious reasons) chokers_blatant0o@icloud.com
I'd much prefer to fight or die at things that matter to me, like my hobby of actually fighting, instead of trying to play the pointless corporate game even more.
Reminds me of Alan Turing's wildly undernoted remarks about ESP in his seminal essay "Computing Machinery and Intelligence"
> I assume that the reader is familiar with the idea of extrasensory perception, and the meaning of the four items of it, viz., telepathy, clairvoyance, precognition and psychokinesis. These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately the statistical evidence, at least for telepathy, is overwhelming. It is very difficult to rearrange one's ideas so as to fit these new facts in.
(This is from a paragraph dealing with ESP's implications for the existence of thinking machines.)
The scientists would love nothing more than to discover a whole new area of study, alas ESP doesn’t lend itself to study despite the overwhelming statistical evidence (?) because it’s made up.
It's kind of ironic how quantum physics discoveries have given these proponents new language to cloak their claims in. When you have quantum effects that depend on observation, it can be quite convincing to a lay person that an ESP ability could just as easily do the same.
I have given up on the idea that science or scientific discoveries will ever soundly banish the ideas of non-materialist claims.
I think this demonstrates a way in which all centralized power becomes, for good or for ill, partially absorbed into state power. A major country's government will always be able to exert lots of leverage against, e.g., tech companies, whether for legitimate or illegitimate reasons; it matters not which. It can always threaten regulatory or legislative actions that would be annoying or devastating to shareholders. Hence, behind-the-scenes conversations will always carry an implicit subtext of "please censor as we ask, provide data and backdoored access to whichever servers we ask, please don't take on the wrong political causes, or else there will be consequences."
Goodness knows that if the intelligence community or state law enforcement has ever wanted access to anything in AWS, now of all times would be when they have the easiest time asking for it! Anything to curry favor with someone who can speak a word of support to federal agencies and state officials.
I guess this makes me a little demoralized about the utility of "anti-big-tech government actions," because to the extent any of these actions succeed, it's probably just sending a clear message to enhance state influence/access of other tech companies. If I get excited (and I want to) because "Amazon might be broken up!!1!", maybe that just means Google finally got the message and decided to put backdoors into <insert platform here> on behalf of <insert agency>.
Not to say companies should or shouldn't RTO etc., but the shocking freedom and flexibility of remote work in tech is such a gigantic class marker and insane lifestyle luxury that I'm surprised it's not more often identified as such.
How few people in history have been able to travel to exotic places or visit far-off friends and family, while still making decent money, and do this almost as much as they want (supposing lifestyle and relationships that allow this, which is true for many people). I mean, people in tech who vacation in tropical climes then spend cherished time with parents and siblings and friends living thousands of miles away, and don't worry about running out their vacation time: dozens of weeks of cherished traveling, if that's what you want! Along the way developing fundamentally different types of family relationships or accruing worldly sophistication.
Forget about little medical or technological things that mark a break from the past: lots of people can afford tylenol and iPhones and Spotify. This is a class marker of epic proportion. This is a way of life that Tolstoy ascribed to the very wealthiest of Moscow's aristocratic families and bachelors (most certainly not even all of the aristocratic class).
In reality I think this flexibility is overrated. Traveling for long periods of time sounds exciting, but for most people it will eventually become isolating (you won't materially be part of the local community, and other travelers will leave) and logistically annoying (e.g. dealing with wifi at your 3rd Airbnb). Most companies have restrictions on this anyway, and there are generally only limited places you can work from.
Visiting family without concern about impact to work is nice, but eventually you just really want a consistent place where you can be productive.
Reminder that our regulatory apparatus allows these people to have phone numbers and make and receive phone calls without being able to be held accountable.