LLMs can still only pass limited Turing tests. The longer the interaction, the worse they do. Which of course means you can easily create an experiment they pass, but just as easily you can create one where they fail.
CAPTCHAs are nearly useless because of how little you need to pay humans to solve them.
A more interesting question is whether there is a Turing test that is easy for ALL humans to pass, while still being hard for LLMs.
In practice, most of the major CAPTCHA vendors already rely on non-privacy-preserving tests for those needing more accessible solutions than a visual puzzle.
Google's audio captcha (only available in a few languages and unusable for those who also have hearing issues) only works for a narrow band of users: those not trusted enough to bypass the captcha entirely, but also not distrusted enough to be flagged. If you fall outside of that band, you get a nice "your device has been classified as a fraud risk, please use the visual captcha" message.
hCaptcha goes even further and straight-up requires you to have an "accessibility cookie", which requires verifying your email address (and apparently your phone number in some cases) to obtain, as well as disabling some anti-tracking settings in your browser.
I've seen one recently where it's basically a series of animated objects and you're asked to click on the slowest one. It's surprisingly easy as a human, but anything that depends on a single screenshot of the page isn't able to solve it.
Obviously, that's only solvable by sighted humans, not ones who are blind or otherwise have low vision.
Flowery language is a powerful tool, but it demands more from both the reader and writer.
That’s the fundamental flaw in using simple heuristics to evaluate language: the exact same text can be useful or deeply flawed just based on the context. The wider the intended audience, the more sacrifices you need to make.
The issue is that using a single factor to push change does not mean the change is a net good. Nobody talks about windmills killing birds because birds are what they actually care about; rather, windmills have so few downsides that opponents needed to find something, no matter how meaningless in context.
As such, single issues are often a fake justification for something people want to happen for other reasons.
Not bankrupt doesn’t mean the company is actually worth anything. A large number of tiny businesses are still in operation because the owner is willing to work at below-market rates to keep them operating.
No. The law allows passengers in a self-driving taxi not to be responsible, including taxis operated by Tesla.
Here Tesla makes it clear to people who turn on “Full self driving” that the driver must maintain supervision and thus responsibility. As such, it’s Tesla’s choice that they aren’t selling self-driving cars.
It wouldn’t be such a big deal if some random engineer said they’d eventually do X, but when it’s the CEO repeatedly saying the same thing across many public appearances, that’s as binding as a Super Bowl advertisement.
No, climate is based on consistent weather data over a long period. Across long enough periods the underlying assumptions that make climate a meaningful thing to talk about fail due to orbital mechanics etc.
Plate tectonics, for example, shows you can’t even assume an area’s latitude is consistent; just look at the fossil history of Antarctica. Humans have dumped so much carbon and methane into the atmosphere that even 100 years ago was quite different.
The force of gravity between spherical objects of constant density is calculated from 2 masses + 1 distance + a constant.
Before the experiment you can measure the mass of both objects. In the experiment you measure the force and distance to calculate the constant.
The weight of either object gives you the force between that object and Earth (adjusting for atmospheric buoyancy). Altitude at your location plus the size and shape of Earth gives the distance between the object and the center of Earth, and you just learned the constant. So you know four out of five variables in an equation and can thus calculate the mass of Earth.
Technically that excludes the weight of the atmosphere above your altitude, but you can get that from the air pressure. Similarly the density of the earth isn’t constant but it is very close to symmetrical so you can get a reasonable estimate.
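To make the two steps concrete, here is a minimal Python sketch. The function names and every number in it are illustrative assumptions, not real measurements: the lab masses, separation, and force are made up to resemble a Cavendish-style setup, and the weight and Earth radius are round textbook figures. It also ignores buoyancy, Earth's rotation, and the atmosphere's mass.

```python
# Sketch of the two-step calculation: first solve for the constant G in the lab,
# then reuse it to solve for Earth's mass. All inputs are assumed values.

def gravitational_constant(force_n, m1_kg, m2_kg, distance_m):
    """Solve F = G * m1 * m2 / r^2 for G."""
    return force_n * distance_m ** 2 / (m1_kg * m2_kg)

def earth_mass(weight_n, object_mass_kg, distance_to_center_m, g_const):
    """Solve W = G * M * m / R^2 for M, the mass of Earth."""
    return weight_n * distance_to_center_m ** 2 / (g_const * object_mass_kg)

# Step 1: hypothetical lab measurement between two spheres.
m_large = 158.0      # kg, assumed
m_small = 0.73       # kg, assumed
separation = 0.225   # m between sphere centers, assumed
force = 1.52e-7      # N, assumed measured attraction
G = gravitational_constant(force, m_large, m_small, separation)

# Step 2: weight of a 1 kg object and its distance from Earth's center.
weight = 9.81        # N, assumed
radius = 6.371e6     # m, mean Earth radius plus negligible altitude
M_earth = earth_mass(weight, 1.0, radius, G)

print(f"G is about {G:.3e} N m^2 / kg^2")
print(f"Earth mass is about {M_earth:.3e} kg")
```

With these assumed inputs the result lands on the order of 6e24 kg, which is the right ballpark; the point is only that four known quantities pin down the fifth.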
It’s not a bias on the educational side, it’s the inherent requirement for knowledge before you can learn skills. Memorization creates a Rosetta Stone that enables people to start reading. You need to know what happened historically before you can have meaningful opinions about it. You need to memorize the meanings of mathematical symbols before you can use them, etc.
The only bias here is people disliking memorization. It takes effort and has concrete right and wrong answers so you can fail in a way that doesn’t happen with skills. But disliking something doesn’t mean it’s actually wrong.
I agree that memorization is a very useful skill, but I believe it’s over-used in some education systems.
There have been classes I’ve taken where ~half of the evaluation is brute memorizing dates/event names. I’ve also taken classes (machine design) where the majority of the evaluation is open book and about solving problems. Most classes land somewhere in the middle.
I think there is a bias towards memorization-based testing because it’s easy. Coming up with trivia questions is easy. Grading those answers is easy. Students can’t complain about marks when they get a date wrong.
Coming up with problem solving questions is hard. Grading them is ambiguous. Students will complain that their mark should be higher. Everything is harder.
If the testing is memorization based, students will get good at memorizing facts and spitting them out on the test. If the test is problem solving, students will optimize for that.