This administration has taken grift and corruption to a level only seen in banana republics. I seriously don't know how you come back from this. GOP voters seem to be openly cheering it.
You mean the folks with the highest purchasing power (2-3x the median wage in the city) moving in and out have negligible impact on average rents across the entire city? I guess the 20% rent increase in Austin in 2021 was just vibes.
Should be obvious that it's tools like Claude Code. If you're a junior dev who isn't experienced in delivering entire products but has good ideas, you have incredible leverage now...
Is there a leading American AI research organization - big tech or academia - that isn't "full of Chinese nationals"? If the DoD wants an all-American SoTA model, it may have to wait a while.
What Anthropic was complaining about is training on mass-elicited chat logs. That is very much a ToS violation (you aren't allowed to exploit the service to build a competitor), so the complaint is well-founded, but (1) it's not "distillation" properly understood: with no access to the actual weights, it can only feasibly extract the kind of narrow knowledge you'd read out of chat logs, perhaps including primitive "let's think step by step" output (which is not true fine-tuned reasoning tokens); and (2) it's something Western AI firms are widely believed to do to one another and to Chinese models all the time anyway - hence the brouhaha about Western models claiming to be DeepSeek when they answer in Chinese.
So they're paying expensive input tokens to extract at best a tiny amount of information ("judgment") per request? That's even less like "distillation" than the other claim of them trying to figure out reasoning by asking the model to think step by step.
LLM-as-a-judge is quite an effective method for RLing a model, similar to RLHF but more objective and scalable. But yes, Anthropic is making it out to be more serious than it is. Plus, DeepSeek only did it for 125k requests, significantly fewer than the other labs, but Anthropic still listed them first to create FUD.
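The LLM-as-a-judge idea above can be sketched in a few lines. This is a toy illustration, not anyone's actual pipeline: `judge_score` is a hypothetical stand-in (a keyword heuristic) for what would really be a strong model prompted with a grading rubric, and the selected answers would then feed an RL or fine-tuning step.

```python
def judge_score(prompt, answer):
    """Stand-in for an LLM judge: score an answer 0-1 against a rubric.
    In practice this would be a call to a strong model with a rubric
    prompt; here a keyword heuristic substitutes for the judge."""
    rubric_terms = {"because", "therefore", "step"}
    hits = sum(term in answer.lower() for term in rubric_terms)
    return hits / len(rubric_terms)

def pick_best(prompt, candidates):
    """Best-of-n sampling: keep the candidate the judge scores highest.
    The winning (prompt, answer, score) pairs become training signal."""
    return max(candidates, key=lambda a: judge_score(prompt, a))

candidates = [
    "The answer is 42.",
    "Step 1: consider the premise. Therefore, because of X, the answer is 42.",
]
best = pick_best("Why 42?", candidates)
print(best.startswith("Step 1"))  # True
```

The appeal over RLHF is that the judge is cheap, consistent, and infinitely patient, so you can score far more rollouts than human raters ever could.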
Task time horizons are improving exponentially, with doubling times around 4 months per METR. At what timescale would you accept that they "can be strategic"? There's little reason to think they won't be at multi-week or multi-month time horizons very soon. Do you need to be strategic to complete multi-month tasks?
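The doubling claim is easy to extrapolate as a back-of-the-envelope calculation. The starting horizon of ~2 hours is a hypothetical assumption for illustration; only the 4-month doubling time comes from the comment above.

```python
import math

def horizon_hours(months_from_now, current_hours=2.0, doubling_months=4.0):
    """Projected task time horizon, assuming (hypothetically) a ~2-hour
    horizon today and a doubling every 4 months."""
    return current_hours * 2 ** (months_from_now / doubling_months)

# Months until the horizon reaches a one-month task (~160 working hours):
target = 160.0
months_needed = 4.0 * math.log2(target / 2.0)
print(round(months_needed, 1))  # 25.3
```

Under those assumptions, multi-month horizons are only a couple of years out, which is the point of the question.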
>Can an LLM give you an upfront estimate that a task will take multiple months?
>Can it decide intelligently what it would have to change if you said "do what you can to have it ready in half the time?"
Do you think ChatGPT 5.2 Pro can't estimate how long a task might take? Do you think that estimate would necessarily be worse than the notoriously poor estimates that come from human engineers?
But you can still answer my question. When an LLM can complete a task that takes a person N months or years, is it capable of being strategic?
Well, to be fair, NASA isn't nearly as good as it once was. The caliber of engineers during the Apollo era was far higher, more like what can be found at SpaceX.
What is that based on? NASA's recent accomplishments are far beyond anyone else's. Off the top of my head: the many Mars missions, JWST, Europa Clipper (still in progress), etc. SpaceX hasn't left Earth orbit, afaik.
That is only looking at (mostly orbital) launch systems, such a minor part of NASA's R&D and missions that the Obama administration decided to contract it out.
NASA no longer develops or builds technology once it has become mature enough for private industry to take it on, so NASA can focus on the past-the-bleeding-edge stuff.
Reality? Outside of JPL, the talent level at NASA is frankly very poor. Do you really want to claim that the current version of NASA could pull off the Apollo program today?
All of which are peanuts compared to Apollo. Not to mention the insane cost overruns and timeline failures on projects like JWST or SLS. Many of those successful projects came out of JPL, where, as I mentioned, the talent level is much higher.