> Ask an American LLM (really any LLM, since Chinese models are trained on the same publicly-available English text) who the first Black man in space was. You'll likely get the name of the first African-American in space, rather than the name of the Afro-Cuban who was actually first.
Well I just asked Claude and it gave the correct answer:
"The first Black man in space was Arnaldo Tamayo Méndez, a Cuban cosmonaut who flew aboard Soyuz 38 in September 1980. (The first Black American in space was Guion Bluford, in 1983.)"
Indeed, I used the word "likely" for a reason. n = 1 isn't enough to identify a pattern. Try different models, try re-rolling the answers, and try turning reasoning off (models can catch "knee-jerk" mistakes in their chain-of-thought).
I doubt even Opus 4.8 gets it right 100% of the time. However, this specific example is one I've left feedback about in multiple places, so newer models are probably more likely to get it right.
E: In fact, I just tried with Opus 4.8 through the API, no tools and reasoning off, and got the following response:
"The first Black man in space was Guion "Guy" Bluford, an American astronaut who flew aboard the Space Shuttle Challenger on August 30, 1983, as part of mission STS-8.
It's worth noting a related distinction: Arnaldo Tamayo Méndez, a Cuban of African descent, actually became the first person of African heritage in space earlier, in September 1980, aboard the Soviet Soyuz 38 mission. He is often recognized as the first Black person and first person of Latin American descent in space.
So depending on the specific criteria:
Arnaldo Tamayo Méndez (Cuba) — first person of African descent in space (1980)
Guion Bluford (USA) — first African American in space (1983)"
The correct answer is there, yes, but why does the wrong answer come out first?
> build a fort to deflect the random exceptions they'll throw at you
Sounds like you hate exceptions, right? In which case why do you handle them at all? Just leave them all unhandled and suddenly every exception is a crash. Which is really no different from someone choosing to terminate. Which you have to worry about even without exceptions.
> if you have a C++ code base yet somehow have enough time and energy to write opinionated blog posts, it's really hard to imagine why you think you'd have a better take on this than Google.
"Given that Google's existing code is not exception-tolerant [...] Our advice against using exceptions is not predicated on philosophical or moral grounds, but practical ones. [...] Things would probably be different if we had to do it all over again from scratch."
> Which is really no different from someone choosing to terminate.
If you std::abort(), you'll get a useful stack trace in the core dump. If you crash from an unhandled exception, you don't. That's a pretty huge difference and is one of the reasons exceptions suck.
Just FYI, in C++ you can finally add a top-level exception handler that calls boost::stacktrace::stacktrace::from_current_exception() (https://www.boost.org/releases/1.85.0/) and get a stack trace on exit as helpful as in Python or Java.
That's nice but it's certainly not guaranteed by anything, just something provided by your toolchain or platform. ("Core dumps" aren't even a thing in C++.)
If you're looking for implementation-specific guarantees then you could make that happen with exceptions too. I think on GCC replacing a function like __cxa_throw might be sufficient to let you capture a stack trace?
If you're looking for source-level-only guarantees then another option is to just replace your throw <expr> statements with one that attaches whatever extra info you want. You could literally script this to patch your external repos automatically too. Or heck, maybe you could even just define throw to be a macro that shoves your stack trace into some global variable before actually throwing.
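A minimal sketch of the source-level option, assuming a hypothetical macro `THROW_WITH_SITE` and a hypothetical wrapper type `WithSite` (neither comes from any library). It records only the file and line of the throw; a full stack trace would still need a platform facility or a library such as Boost.Stacktrace.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical wrapper: derives from the original exception type E, so
// existing `catch (const E&)` handlers still match, and adds the throw site.
template <typename E>
struct WithSite : E {
    std::string site;
    WithSite(E base, std::string where)
        : E(std::move(base)), site(std::move(where)) {}
};

// Hypothetical helper that the macro below expands to.
template <typename E>
[[noreturn]] void throw_with_site(E ex, const char* file, int line) {
    assert(file != nullptr);
    throw WithSite<E>(std::move(ex),
                      std::string(file) + ":" + std::to_string(line));
}

// Stand-in for `throw expr;` that also records where the throw happened.
#define THROW_WITH_SITE(ex) throw_with_site((ex), __FILE__, __LINE__)

int require_positive(int x) {
    if (x <= 0) THROW_WITH_SITE(std::invalid_argument("x must be positive"));
    return x;
}

// Returns the recorded throw site, or an empty string if nothing was thrown.
std::string site_of_failure(int x) {
    try {
        require_positive(x);
    } catch (const WithSite<std::invalid_argument>& e) {
        return e.site;
    }
    return "";
}

// Shows that a handler written for the plain exception type still matches.
bool caught_as_plain_invalid_argument(int x) {
    try {
        require_positive(x);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

Because the wrapper inherits from the thrown type, code that never heard of `WithSite` keeps working, which is what would make a scripted search-and-replace over `throw` statements plausible.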
> If you std::abort(), you'll get a useful stack trace in the core dump. If you crash from an unhandled exception, you don't. That's a pretty huge difference and is one of the reasons exceptions suck.
All of this is up to the implementation in practice. The codebases I work on generally follow the pattern that exceptions may be thrown but may not be caught*, and thus they practically serve as terminate. And we absolutely get stack traces in our core dumps (Linux, both GCC and clang), and basically all of the complex debugging I do starts with a coredump stacktrace.
* We follow this pattern for a few reasons: (1) it is generally safer for us to assume that libraries we consume (STL included) will behave as expected with exceptions enabled, (2) "don't catch exceptions" (or if you must, catch them as close to the throw as possible) is a simple way to avoid exceptions-for-nonexceptional-cases control flow, and (3) most of the C++ codebase is exposed through bindings in Python, and propagating errors as exceptions is the only Python-friendly way to handle it.
Exceptions aren't meant to report errors, just in general. That's a misuse of them. Exceptions are meant to be thrown when a contract cannot be fulfilled. Yes, you're unable to know what exceptions a function may throw. That's the way it should be, because exceptions aren't supposed to be part of the function's contract.
For example, you're implementing an arithmetic operator and have reached an erroneous state, but the arithmetic type doesn't have an error value, so the only way to communicate the error is by throwing. Another example: you've specified that a function must always succeed, but later on you find a case where the function cannot succeed. Instead of fixing all the possible call sites, throw an exception. All those callers could not have handled the error anyway, because they were coded under the assumption that no error would happen at that point. Throwing an exception and letting it unwind the stack way up (perhaps even all the way up to main()) is the sensible solution, because at that point you've reached a situation that the calling code has no reasonable way to handle.
Saying that you prefer Result over exceptions is like saying that you prefer strings to functions. They do different things. If you like Result, nothing prevents you from implementing a C++ equivalent.
> Exceptions aren't meant to report errors, just in general. That's a misuse of them. Exceptions are meant to be thrown when a contract cannot be fulfilled. Yes, you're unable to know what exceptions a function may throw. That's the way it should be, because exceptions aren't supposed to be part of the function's contract.
I don't think these are true? What about std::vector::at(), std::optional::value(), etc.? And then there's std::system_error.
Both functions must return T &. If the vector is not long enough, or the object is not set, then returning a T & is impossible. So we have a function that has already been called and which must return something valid, and cannot return something valid. The only two ways to resolve this contradiction are to throw or to terminate.
(Well, you could also trigger undefined behavior like operator[]() and operator*(). No comment.)
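For reference, a small sketch of the behavior under discussion: `std::vector::at()` throws `std::out_of_range` and `std::optional::value()` throws `std::bad_optional_access` when no valid reference can be produced. The `probe_*` helper names are made up for the illustration, and `std::optional` needs C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Reports whether at(i) produced a reference or threw std::out_of_range.
std::string probe_vector_at(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);  // must yield a valid reference or throw
        return "ok";
    } catch (const std::out_of_range&) {
        return "out_of_range";
    }
}

// Reports whether value() produced a reference or threw bad_optional_access.
std::string probe_optional_value(const std::optional<int>& o) {
    try {
        (void)o.value();
        return "ok";
    } catch (const std::bad_optional_access&) {
        return "bad_optional_access";
    }
}

void usage_example() {
    const std::vector<int> v{1, 2, 3};
    assert(probe_vector_at(v, 2) == "ok");
    assert(probe_vector_at(v, 3) == "out_of_range");
    assert(probe_optional_value(std::optional<int>{7}) == "ok");
    assert(probe_optional_value(std::nullopt) == "bad_optional_access");
}
```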
>And then there's std::system_error.
And what am I supposed to conclude from the existence of a type?
I think you missed my point. I was referring to the fact that some of these standard exceptions are very much a part of the contracts of their respective functions. In fact, that's their entire point. This directly contradicts what you wrote.
You're using "contract" in a different sense than I did. When I said "contract" I was referring to the required state of the program when the function is called and the guaranteed state of the program when the function returns. By definition an exception cannot be part of the contract in this sense, because a call that throws does not return. This narrower sense of contract is critical, because the entire point of exceptions is to enable an alternate control path when it'd otherwise be impossible, such as in the examples I gave above with overloaded operators and code with evolving requirements.
> Throwing an exception and letting it unwind the stack way up (perhaps even all the way up to main()) is the sensible solution
No. I would never in a million years do this.
If the API is that a function is infallible and then I decide that it’s a fallible function then that’s a pretty major change and I’m just gonna have to update all the call sites to deal with a fallible return result.
Saying throw an exception and bubble up to main provides just about zero value. Might as well just call std::abort. Which is also something I would never do.
> Saying that you prefer Result over exceptions is like saying that you prefer strings to functions. They do different things.
So here’s the thing. In 20+ professional years as a C++ dev I have never ever once worked in a codebase where exceptions were used. Certainly never in first party code. Only when dealing with annoying thirdparty libraries that leveraged them.
I think your comment “contract can’t be fulfilled” is cheating. No. You’ve simply made a new contract and the new contract is that under certain cases an error is returned in the form of an exception.
>If the API is that a function is infallible and then I decide that it’s a fallible function then that’s a pretty major change and I’m just gonna have to update all the call sites to deal with a fallible return result.
What if you don't control those call sites?
>Might as well just call std::abort.
Sure. I mean, not really, because the caller cannot handle an abort. You're making a decision for the caller that the situation is unresolvable, where the caller might disagree.
>No. You’ve simply made a new contract and the new contract is that under certain cases an error is returned in the form of an exception.
If the function doesn't use exceptions for normal error conditions, then no, it's not a new contract, because you don't need to do or know anything specific to handle the situation. You could do something like
and not have to worry about the specifics. It's just an exception. You don't have to care about what exactly happened, you just care that something that couldn't be resolved happened. When exceptions are misused you see stuff like
Not always, but this does usually mean that the exception is part of the contract of the function. It's a condition that the caller must handle as part of the normal usage of the function. FileNotFound exceptions are quite often a prime example of exception abuse.
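A hedged sketch of the contrast being drawn, for illustration only. The functions `do_work` and `open_config` and the `FileNotFound` type are hypothetical names, not taken from any real API.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type that callers are expected to handle routinely.
struct FileNotFound : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical operation that fails only when something is unresolvable.
int do_work(bool broken) {
    if (broken) throw std::runtime_error("invariant violated");
    return 42;
}

// Hypothetical operation where a missing file is an ordinary outcome.
std::string open_config(const std::string& path) {
    if (path.empty()) throw FileNotFound("no such file");
    return "contents of " + path;
}

// Pattern 1: one generic handler; the caller does not care what went wrong.
bool run_task(bool broken) {
    try {
        do_work(broken);
        return true;
    } catch (const std::exception&) {
        return false;  // report failure upward, no per-error logic
    }
}

// Pattern 2: the caller must know about FileNotFound to use the function at
// all, which makes the exception part of the function's everyday interface.
std::string load_config_or_default(const std::string& path) {
    try {
        return open_config(path);
    } catch (const FileNotFound&) {
        return "defaults";
    }
}

void usage_example() {
    assert(run_task(false));
    assert(!run_task(true));
    assert(load_config_or_default("") == "defaults");
    assert(load_config_or_default("app.cfg") == "contents of app.cfg");
}
```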
Replying to your other comment here:
>They’re fallible functions. Don’t write fallible APIs that require exceptions to report errors! That’s bad API design!
I disagree. You should ensure your arguments are valid before indexing vectors and dereferencing optionals. You wouldn't iterate a vector like this, I imagine?
    for (size_t i = 0;; i++) {
        auto x = vector.at(i);
        if (!x.has_value())
            break;
        //...
    }
If I am choosing to change the API contract then someone who wants to use the new API has to update. This is not a big deal.
> If the function doesn't use exceptions for normal error conditions, then no, it's not a new contract
I disrespectfully and emphatically disagree. I do not accept your definition of contract.
> You could do something like (try-catch wrapper)
Let me be clear. Having to add a bunch of random fucking try-catch bullshit around every fucking function call is EXACTLY why I hate exceptions and is EXACTLY what I think is bad software design.
If you think a function should return a value or some unspecified exception whose details are irrelevant then that function could return an option with no information loss, or a result with an Error that is ignored.
> You wouldn't iterate a vector like this, I imagine?
I wouldn’t use at(i) for iteration. The only reason for a function like at(i) to exist is because you want it to be fallible.
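A minimal sketch of the option-returning style argued for here, assuming C++17 and two hypothetical helpers, `get_at` and `get_or`. The failure case is visible in the return type, so the caller checks a value instead of wrapping the call in try/catch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical fallible accessor: an out-of-range index yields std::nullopt
// instead of throwing.
std::optional<int> get_at(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) return std::nullopt;
    return v[i];
}

// Caller that ignores why the lookup failed and falls back to a default.
int get_or(const std::vector<int>& v, std::size_t i, int fallback) {
    return get_at(v, i).value_or(fallback);
}

void usage_example() {
    const std::vector<int> v{10, 20, 30};
    assert(get_at(v, 1).has_value() && *get_at(v, 1) == 20);
    assert(!get_at(v, 3).has_value());
    assert(get_or(v, 7, -1) == -1);
}
```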
> One of the problems with exceptions is it’s utterly impossible to know if a given function call can return exceptions and if so what they are.
Could you please explain how exactly you know if a function might abort?
And how you figure out exactly which error codes a function might return if it does return an error?
And why/how your techniques for figuring out the above don't work for exceptions?
> Python is the bloody worst because I never effing know what the hell any damn function can throw or return. It’s so so frustrating.
No, it's pretty possible. Virtually any interesting function you call can throw AttributeError or TypeError, if nothing else, simply by virtue of you passing an object of the wrong type or behavior.
"But I don't mean those particular exceptions! I don't care about them." Well yeah that's kind of the point. If you can pretend that problems you don't know how to handle don't exist, then you can pretend the same for exceptions and errors. You're not supposed to care about the entire universe of possible error conditions; it's not only impossible but also you wouldn't be able to handle all of them anyway. You handle everything you can reasonably handle and then let the rest propagate, not the other way around. Same for error codes and exceptions.
"But the documentation would tell me which error codes I care about!" Well it can do that for exceptions too. If the documentation sucks then bring it up with the API developer not the language developer.
> But I’ll take Rust results over exceptions 10,000% of the time. Not even a question.
Sure, feel free to do that. Or use error codes in C++, whatever you prefer. Not like I'm trying to turn this into a Rust vs. C++ debate.
Functions aborting is not something I’ve ever really had to think about. In exception-heavy codebases it’s something I ALWAYS have to think about.
Error codes are pretty bad. Global error code is awful. An error enum is pretty nice.
So here’s the thing. I’ve been a professional C++ programmer for 20 years. Not once have I ever worked in a codebase that used exceptions. It’s fine. Occasionally I use a thirdparty library that does use exceptions and it’s bloody awful.
Why can't you? They don't want to provide info for a credit check, you want human accountability. All that requires is for them to use a debit card for whatever service (prepaid or postpaid). Law enforcement can trace that if needed. No need for credit checks or really any other information directly in the hands of the telco.
It doesn't make sense to include the capex cost to train a model in this kind of discussion, because that cost is fixed.
Consider a model that costs $100m to train.
If the vendor then prices it such that each inference token has a margin of 10% over the variable costs to serve (power + server costs), whether or not they cover their costs is based entirely on how many tokens they can sell.
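If they sell less than about $1bn of tokens, they lose money: at a 10% margin the break-even point is 10 x $100m = $1bn of revenue (closer to $1.1bn if the 10% is a markup on cost rather than a share of revenue).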
If they sell $10bn of tokens they make a ton of money.
This also means you can't credibly calculate how much of the fixed training expense is covered by your token spend, because until the model is retired and you can account for how much inference it ran you don't know what percentage of the training cost each sold token was responsible for.
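The break-even arithmetic can be sketched as follows. The function names are hypothetical, and the two functions differ only in whether the 10% is read as a share of revenue or as a markup over variable cost.

```cpp
#include <cassert>
#include <cmath>

// Revenue needed to recover a fixed cost when `margin` is the fraction of
// each revenue dollar left after variable costs (0.10 means 10% of revenue).
double break_even_revenue_margin(double fixed_cost, double margin) {
    assert(margin > 0.0);
    return fixed_cost / margin;
}

// Same question when `markup` is added on top of variable cost
// (price = cost * (1 + markup)), so profit per revenue dollar is
// markup / (1 + markup).
double break_even_revenue_markup(double fixed_cost, double markup) {
    assert(markup > 0.0);
    return fixed_cost * (1.0 + markup) / markup;
}

void usage_example() {
    const double training_cost = 100e6;  // the $100m figure from the comment
    // 10% of revenue: about $1bn of token sales.
    assert(std::fabs(break_even_revenue_margin(training_cost, 0.10) - 1.0e9) < 1.0);
    // 10% markup over variable cost: about $1.1bn of token sales.
    assert(std::fabs(break_even_revenue_markup(training_cost, 0.10) - 1.1e9) < 1.0);
}
```

Either way, the result depends only on the fixed cost and the per-token margin, which is why the total volume sold decides whether training was paid for.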
The cost is fixed if you train a model once every several years. If you have to train 3-4 times per year to stay competitive, training cost is very much a thing.
You also have to include failed training runs and experiments in the math.
There are no official figures, but given how fast new models are rolled out, I wouldn't be surprised if neither Anthropic nor OAI manages to cover the full cost of its models.
I think the capex being fixed assumes you can just stop training the next model. But it's not clear that you can afford to do that and keep selling tokens.
And if capabilities plateau such that training the next one is useless, then the margins will drop fast due to competition.
Just guessing, but I assume because it’s arguably off-topic as defined by the HN guidelines. I don’t think it should be flagged, though.
“Off-Topic: Most stories about politics, or crime, or sports, or celebrities, unless they're evidence of some interesting new phenomenon. If they'd cover it on TV news, it's probably off-topic.”
This being evidence of an interesting new phenomenon is literally the entire premise of the blog post though. And it sure as hell didn't look like it was covered on the main news headlines; I know I only heard about it because of HN. The author is pretty clearly claiming this is a new phenomenon literally in the title itself!
It's new, but is it "interesting?" Does it "satisfy intellectual curiosity?"
Many people here will consider this categorically off topic and flag accordingly because politics doesn't satisfy theirs. Even if it's a good article, and even if the discussion is on-topic and civil.
What is "intellectual curiosity" that doesn't include curiosity about whether, when, and how often the world superpower commits a war crime? For reference just the other day we had [1] on the front page. Was knowing the consumer price index for the month really all that much more satisfying of "intellectual curiosity" than this?
> data center companies could genuinely at least open up for tours to try to appeal to the public, if public approval is apparently such a concern.
Do you actually find anything appealing about a datacenter? I've been to one and while it was mildly cool from the standpoint of "wow how do they manage this many machines" I didn't find anything appealing about it that would make me want it in my neighborhood.
You just press backspace and hit the accent mark key, or, for a printing press, stack the accent mark on top of the letter. People ditched accents because they were rarely used in English writing (only really being used for some loanwords), not because simplifications were forced by typewriters or the printing press (which handle non-English languages just fine).
For printing presses, we're talking about the influence of the first presses, hundreds of years before industrialization. They were imported from Germany, and even when England started making its own, they were more like clones that used imported designs and parts. The early machines had a heavy influence on the written language, particularly at a time when fewer than 1 in 10 people could write. With the advent of movable type, the people who learned to write were heavily influenced by what they read: books printed on German-design machines. You really only need one generation in a situation like that to dramatically change the language. Losing þ, æ, and ð
"Ye Olde Mill" or whatever archaic silliness you'll find at fairs and whatnot was the result of the printing press dropping þ (as in þe, þ is just th-) and was never supposed to be pronounced with a "y" sound.
The "ye" in "Ye Olde" is not the same word as the one in "Hear ye, hear ye!". That ye is a plural 'you', basically the same word as "y'all", and never had a thorn.
This happened with more than one letter. For instance the Scots language had a letter yogh (https://en.wikipedia.org/wiki/Yogh), which was written somewhat like a rounded "3" but lower on the line. Early printers had only the characters of the English language, and since this character looked like a hand-written z, that is what they used in its place. Hence the name "Menzies" is pronounced "Ming-is", since that isn't actually a z.
Welsh suffered more: it used to be full of "k"s. When the first Welsh Bible was printed, the English printer did not have enough "k"s, and substituted "c", and the language now does not use "k" at all. Apparently the printer's note on the matter still exists.
"ye" in "ye Olde mill" is actually just "the", originally written "þe"/"þee". The first printing presses in England were imported from Germany, which never used þ, so printers used something that looked sorta similar, thus "y".
"Ye" was a different word, the 2nd person non-formal version of "you" (which was historically formal: see Shakespeare and how he played with "ye" and "you"). Thorn was on its way out along with "ð", both of which were used in Middle English. The sounds didn't leave English, but we merged them into one letter cluster, "th" (think "that" and "the", which have different th sounds).
The pronunciation is so bad though. The consonants are mostly fine, but the way we write vowels is a total mess. We'd need at least a dozen vowel letters to sanely represent English. And we could cut a couple consonant letters to help make room, for maybe 30 letters total, still no accents.
Just today the NYT Strands puzzle gave a great example: you can find one set of prefixes that make each of the following rhyme, and a different set of prefixes that make them all sound different:
-ooze -oose -ews -ues -use -oes -uise
You can do this purely with prefixes ending in consonants, i.e. not by turning -use into -ouse, for example.
(spoilers for the little -ooze puzzle: for rhymes, booze choose brews blues ruse shoes cruise; for non-rhymes, snooze loose pews plagues obtuse toes guise; many others are possible, and rhyming or lack thereof may depend on accent).
It’s great compression: Y sometimes a vowel, sometimes a consonant.
And while not encoded on a keyboard, it still blows my mind that English has a crazy number of past tenses, and such a bad hack of a future tense that it’s hard to classify as such.