If someone seems to have productivity gains when using an AI, it is hard to come up with an alternate explanation for why they did.
If someone sees no productivity gains when using an AI (or a productivity decrease), it is easy to come up with ways it might have happened that weren't related to the AI.
This is an inherent imbalance between the claims, even if both people have brought 100% proof of their specific claims.
A single instance of something doing X is proof of the claim that something can do X, but no amount of instances of something not doing X is proof of the claim that something cannot do X. (Note, this is different from people claiming that something always does X, as one counterexample is enough to disprove that.)
Same issue in math with the difference between proving a conjecture is sometimes true and proving it is never true. Only the first can be proven by examples (and only a single example is needed). The other can't be proven even by millions of examples.
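To put the asymmetry in quantifier terms (a minimal formalization, with P(x) read as "attempt x succeeded"):

    \exists x\, P(x)        % "it can do X" -- proven by a single witness
    \forall x\, \neg P(x)   % "it cannot do X" -- never proven by finitely many failed attempts
    \forall x\, P(x)        % "it always does X" -- disproven by a single counterexample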
>They still "can't" asses the quality of the output, after all. They can't just ask the model, as they can't know if the answer is not misleading.
Wasn't this a problem before AI? If I took a book or online tutorial and followed it, could I be sure it was teaching me the right thing? I would need to make sure I understood it, that it made sense, that it worked when I changed things around, and I would need to combine multiple sources. That still needs to be done. You can ask the model, and you'll have to judge the answer, same as if you asked another human. You have to make sure you are in a realm where you are learning, but aren't so far out that you can easily be misled. You do need to test out explanations and seek multiple sources, of which AI is only one.
An AI can hallucinate and just make things up, but the chance that different sessions with different AIs lead to the same hallucinations, consistently building upon each other, is small enough not to be worth worrying about.
I had some .csproj files that only worked with msbuild/vsbuild that I wanted to make compatible with dotnet. Copilot does a pretty good job of updating these and identifying the ones more likely to break (say web projects compared to plain dlls). It isn't a simple fire-and-forget, but it did make the conversion possible without me needing to do as much research into what was changing.
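For anyone unfamiliar with what that conversion involves: it is essentially rewriting the verbose legacy project format into an SDK-style one. A minimal before-and-after sketch (simplified and from memory, not my actual files):

    <?xml version="1.0" encoding="utf-8"?>
    <!-- Legacy msbuild-only style: explicit imports, every source file listed -->
    <Project ToolsVersion="15.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
      <PropertyGroup>
        <TargetFrameworkVersion>v4.7.2</TargetFrameworkVersion>
      </PropertyGroup>
      <ItemGroup>
        <Compile Include="Program.cs" />
      </ItemGroup>
      <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
    </Project>

    <!-- SDK-style equivalent: dotnet-compatible, source files globbed automatically -->
    <Project Sdk="Microsoft.NET.Sdk">
      <PropertyGroup>
        <TargetFramework>net472</TargetFramework>
      </PropertyGroup>
    </Project>

The simple cases really are that mechanical; it's the projects with custom targets and imports (like the web projects I mentioned) where things break.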
Is that a net benefit? Without AI, if I really wanted to do that conversion, I would have had to become much more familiar with the inner workings of csproj files. That is a benefit I've lost, but it also would've taken so much longer that I might not have decided to do the conversion at all. My job doesn't really have a need for someone deeply specialized in csproj files, and it isn't a particular interest of mine, so letting AI handle it while being able to answer a few questions to sate my curiosity seemed a great compromise.
A second example: it works great as a step up from a rubber duck. I noticed some messy programming where, basically, OOP had been abandoned in favor of one massive class doing far too much work. I needed to break it down, and talking with AI about it helped me come up with some design patterns that worked well. AI wasn't good enough to do the refactoring in one go, but it helped talk through the pros and cons of a few design patterns and was able to create test examples so I could get a feel for what the result would look like. Also, when I finished, I had AI review the code, and it caught a few typos that weren't compile errors before I even got to the point of testing.
None of these were things AI could do on its own, and they definitely aren't areas where I would have blindly trusted some vibe-coded output, but overall it was a productivity increase well worth the $20 or so cost.
(Now, one may argue that is the subsidized cost, and the unsubsidized cost would not have been worthwhile. To that, I can only say I'm not versed enough on the costs to be sure, but the argument does seem like a possibility.)
>If you had a choice between a phone you knew was made by slaves and a phone that wasn't I assume you'd pick the slave free version every time.
At the same cost? Sure.
At different costs? We see that it is not the case.
People don't. A few do, but most don't. Many would still prefer the more popular phone; the ethical cost is something they mention only when asked and give only minor weight in actual decision making. Some might try to justify that by saying you can't be sure a phone claiming to be ethically made actually is, but how many even considered that much when making the decision?
>While it's fine to feel guilty for your involvement in the scheme don't let that get in the way of placing the blame for it squarely on the people who set things up this way and put you in this position.
Who is really at fault on a systemic level if the population decides that lower costs are what it really wants, regardless of what sacrifices have to be made? Look at a less morally challenging area, say air travel: plenty of people claim to want a nicer experience, yet airlines are always focused on cutting costs. Is that the fault of the airlines? Or is it the fault of the consumers who, despite what they say, show an extreme preference for lower-cost tickets? We can blame any seller in the moment, but we can't ignore the market pressures that decided which sellers stayed and which went out of business.
> Who is really at fault on a systemic level if the population decides that lower costs are what it really wants, regardless of what sacrifices have to be made?
It's always the people who are actually forcing slaves to work for them. Always. Consumers will always want lower prices, but that doesn't justify slavery. It's not as if a company like Apple is being forced to abuse workers because it would go bankrupt otherwise. These companies are pulling in massive profits year after year. It's not "market pressures" that force them to abuse their workers; it's just greed.
> plenty of people claim to want a nicer experience, yet airlines are always focused on cutting costs. Is that the fault of the airlines? Or is it the fault of the consumers who, despite what they say, show an extreme preference for lower-cost tickets?
Every customer wants low-cost tickets. Of course they do. There's a lot that goes into that, though. Almost nobody wants to fly in the first place. It's annoying, expensive, stressful, and uncomfortable. What people actually want is to get to their destination. Consumers are basically forced to deal with airlines since flying is the fastest, and often the only, way they can get to where they want to go when they need to. It's just a necessary evil that must be endured.
That's not the airlines' fault, but it does put airlines in a position where they know they can take advantage of travelers at every opportunity, and so they do. They overbook their flights, they charge endless bullshit fees, they cram as many people into the plane as they can, their ticket prices change by the minute, and they aggressively charge people as much as they think they can get away with.
Mergers and the high cost of entry into the airline industry have greatly hurt competition, and many people have only one choice of airline when flying to certain destinations. Airlines have consumers bent over a barrel and they pound away at them relentlessly. That's all on the airlines, not the consumers.
The only real thing consumers have any control over is the price of their ticket, and because airlines play so many games with ticket pricing, they enable a certain amount of gaming the system to "get a better deal," so many flyers do work hard to limit what they pay for what will inevitably be a shitty service.
There's also the question of how much consumers can even afford. Many consumers would love to pay more for a less shitty air travel experience, but they can't if it means they'd no longer be able to afford their trip. ULCCs (ultra-low-cost carriers) are often the only viable options travelers have, and even then many people go into debt to travel. Others may figure that going with a cheap airline, or putting in the effort to get a cheap ticket, will be worth it: the flights will be a miserable 6-8 hours, but it means they'll be able to afford a nice dinner or have a little more spending money when they reach their destination. Those kinds of choices can be put squarely on the consumer.
Cheat sheets have an extra bonus: they are a great way to trick students into studying without realizing it is studying. By limiting the sheet's size, the student has to survey everything they know and decide which areas they are weakest on and need to include, which they then have to organize into a compact, quick-to-reference chart. It doesn't replace the more boring phases of studying, but it does create a one-off exercise that gets better engagement and is more personalized than a fillable study guide or example test.
>I've been the ultra-cynic before, and agree that doesn't work either. People don't like working with you, and don't trust you.
Is the issue that one isn't being cynical enough? If you are very cynical about how things will turn out, and you share that with others who don't appreciate it (even if you are right), then you are being optimistic in thinking the sharing will change things. Controlling one's displays to others, to appear as whatever gets one the best outcome, is being even more cynical, to the point of abandoning any attempt at open, honest relationships, but it likely works best if one can pull it off.
Though that might be a very big if, and getting caught faking it is likely worse. Then again, is forcing oneself to adopt optimism just an attempt to do this indirectly, a sort of "fool yourself so you can better fool others" approach for when more direct manipulation doesn't work, given that the drive for the optimism is to get better outcomes?
>Also, just like how calculators are allowed in the exam halls, why not allow AI usage in exams?
Dig deeper into this. When are calculators allowed, and when are they not? If it is kids learning to do basic operations, do we really allow them to use calculators? I doubt it, and I suspect that places that do end up with students who struggle with more advanced math because they offloaded the thinking early on.
On the other hand, giving a calculus student a four-function calculator is pretty standard, because the kind of math it can do isn't what is being tested, and having a student plug 12 into x^3 - 4x^2 + 12 very quickly instead of working it out by hand doesn't impact their learning. More advanced calculators, though, are often not allowed when they trivialize the content.
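For concreteness, the evaluation being offloaded to the calculator there:

    12^3 - 4 \cdot 12^2 + 12 = 1728 - 576 + 12 = 1164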
LLMs are much more powerful than a calculator, so finding a place in education where they don't trivialize the learning process is pretty difficult. Maybe at the grad or research level, but for anything in grade school it is as bad as letting a kid who is learning their times tables use a calculator.
Now, if we could create custom LLMs that are targeted at certain learning levels? That would be pretty nice, though a lot more work. Imagine a chemistry LLM that can answer questions but knows the homework well enough to avoid solving problems for students. Instead, it can tell them what chapter of their textbook to go read, or it can help them when they are taking a deep dive beyond the level of the material, giving them answers to the sorts of problems they aren't expected to solve. The difficulty is that current LLMs aren't this selective and are instead too helpful, immediately answering all problems (even the ones they can't actually solve).
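A minimal sketch of the direction I mean, using the OpenAI Python client as one concrete option. The course scoping, prompt, and model name are all my assumptions, and as noted above, a system prompt alone won't reliably produce this selectivity; that's the "lot more work" part:

    # Sketch of a level-targeted tutor. Everything below is illustrative,
    # not a proven recipe -- current models don't reliably obey this prompt.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    TUTOR_PROMPT = """You are a tutor for a first-year chemistry course.
    Do not provide worked solutions to assigned homework problems.
    Instead, name the relevant textbook chapter and ask one guiding question.
    You may fully answer questions clearly beyond the course's scope,
    since those are not being assessed."""

    def ask_tutor(question: str) -> str:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model choice
            messages=[
                {"role": "system", "content": TUTOR_PROMPT},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask_tutor("How do I balance the equation in problem set 3?"))

The real work would be getting the model to recognize the assigned homework at all, which is why I'd expect this to need fine-tuning against the actual course material rather than just prompting.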
>Some countries enforce regulations on what tyres are deemed road-legal, due to requirements on safety and minimum grip. It's also why it's illegal to drive around with bald tyres.
Yes, this is a good thing. Where it becomes bad is when someone says "Oh, we should stop that from happening, let's ban the sale of such tires," with no exceptions.
This isn't a problem unique to regulations and laws. In software development, it is very common for the user not to think about exceptions. The rarer the exception, the more likely it is to be missed in the requirements. It is the same fundamental problem of not thinking about all the exception cases, just in different contexts. You also see this commonly in children learning math: they'll learn a rule and blindly apply it, not remembering the exceptions they were told they need to handle (can't divide by zero being a very common one).
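The same shape in code, as a toy Python example (the function names are mine, purely illustrative): the requirement "compute the average cost per item" reads as complete right up until the rare case shows up.

    # Toy example: a "complete" requirement that never mentioned the rare case.
    def average_cost(total_cost: float, item_count: int) -> float:
        return total_cost / item_count  # ZeroDivisionError when item_count == 0

    # Once someone remembers the exception, it has to be handled explicitly,
    # and the requirements have to say what "handled" even means here:
    def average_cost_safe(total_cost: float, item_count: int) -> float:
        if item_count == 0:
            return 0.0  # or raise a domain-specific error; the spec must decide
        return total_cost / item_count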
I've asked for non-1:1 versions and have been refused. For example, I would ask it to give me one line of a song in another language, broken down into sections, explaining the vocabulary and grammar used in the song, with callouts to anything that is non-standard outside of a lyrical or poetic setting. Some LLMs will refuse; others see this as fair use of the song for educational purposes.
So far, all I've tried are willing to return a random phrase or grammar point used in a song, so it is only when asking for a full line of lyrics or more that it becomes troublesome.
(There is also the problem that the LLMs that do comply will often make up the song unless they have some form of web search and you explicitly tell them to verify the song using it.)
> I would ask it to give me one line of a song in another language, broken down into sections, explaining the vocabulary and grammar used in the song, with callouts to anything that is non-standard outside of a lyrical or poetic setting.
I know no one wants to hear this from the cursed IP attorney, but this would be enough to show in court that the song lyrics were used in the training set. So depending on the jurisdiction you're being sued in, there's some liability there. This is usually solved by the model labs getting some kind of licensing agreement in place first and then throwing all of that into the training set. Alternatively, they could set up some kind of RAG workflow where the search goes out and finds the lyrics. But they would have to both know that the found lyrics were genuine and ensure that they don't save any of that chat for training. At scale, neither of those is a trivial problem to solve.
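To make the shape of that second option concrete, here is a sketch in Python. Every function in it is a hypothetical placeholder (stubbed so it runs), not any real lab's API, and the two comments mark the two non-trivial problems:

    # Hypothetical RAG flow for lyrics. All helpers are illustrative stubs.
    def web_search(query: str) -> list[str]:
        return []  # placeholder: a real search backend would go here

    def verify_provenance(lyrics: str, title: str, artist: str) -> bool:
        # Hard part 1: confirming the found lyrics are genuine, not a
        # scraper's mislabeled or partial copy. Stubbed as "unverified".
        return False

    def model_generate(context: str, prompt: str) -> str:
        return ""  # placeholder for the actual LLM call

    def answer_lyrics_question(title: str, artist: str, question: str) -> str:
        results = web_search(f'"{title}" "{artist}" lyrics')
        lyrics = "\n".join(results)
        if not verify_provenance(lyrics, title, artist):
            return "Could not verify the lyrics; declining to answer."
        answer = model_generate(context=lyrics, prompt=question)
        # Hard part 2: this exchange must be excluded from future training
        # data, which is a pipeline-wide guarantee, not a per-call flag.
        return answer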
Now, how many labs have those agreements in place? I'm not really sure. But issues such as these are probably why you get silliness like DeepMind models not being licensed for use in the EU, for instance.
I didn't really say this in my previous post, as it was going to get a bit too detailed about something not quite related to what I was describing, but when models do give me lyrics without using a web search, they have hallucinated every time.
As for searching for the lyrics, I often have to give it the title and the artist to find the song, and sometimes I even have to give context about where the song is from; otherwise, it'll either find a more popular English song with a similar title or still hallucinate. Luckily, I know enough of the language to identify when the song is fully wrong.
No clue how well it would work with popular English songs as I've never tried those.
We have many expectations in society which often aren't formalized into a stated commitment. Is it really unreasonable to have some commitment towards society to honor these less formally stated expectations? And is expecting communication presented as human-to-human to actually come from a human an unreasonable expectation of that kind? I think not.
If you were to find out that the people replying to you were actually bots designed to keep you busy and engaged, feeling a bit betrayed by that seems entirely expected, even though at no point did those people commit to you that they weren't bots.
Letting someone know they are engaging with a bot seems like basic respect, and I think society benefits from having such a level of basic respect for each other.
It is a bit like the spouse who says "well I never made a specific commitment that I would be the one picking the gift". I wouldn't like a society where the only commitments are those we formally agree to.
I do appreciate this side of the argument, but... do you think that the level/strength of a marriage commitment is worthy of comparison to walking by someone in public / riding the same subway as them randomly / visiting their blog?
I find them comparable, but not equal, for that reason.
Especially if we consider the summation of these commitments. One is obviously much larger, but it defines just one of our relationships within society. The other defines the majority of our interactions within society at large, so a change to it, while much less impactful to any one single interaction or relationship (I use the terms interchangeably here, as often the relationship is just that one single interaction), is magnified by how much more often it occurs. That means the cost of losing some trust in such a small interaction is much larger than it first appears, which I think makes the two even more comparable.
(More generally, I also like comparing things even when the scale doesn't match, as long as the comparison really applies. Like apples and oranges, both are fruits you can make juice or jam with.)
That is how illustrations work. If someone doesn't see something, you amplify it until it clubs them over the head and even an idiot can see it.
And sometimes of course even that doesn't work, but there always has been and always will be the clued, the clue-resistant, and the clue-proof. Can't do anything about the clue-proof, but at least presenting the arguments allows everyone else to consider them.
This fixation on the reverence due a spouse is completely stupid and beside the point of the concept being expressed. As though you think there is some arbitrary rule about spouses that is the essence of the problem? The gift-for-spouse is an intentionally hyperbolic example of a concept that also exists and applies the same at non-hyperbolic levels.
The point of a clearer example is that you recognize "oh yeah, that would be wrong," and so the next step is to ask: what makes it wrong? And why doesn't that apply the same back in the original context?
You apparently would say "because it's not my wife", but there is nothing magically different about needing to respect your spouse's time vs anyone else's. It's not like there is some arbitrary rule that says you can't lie to a spouse simply because they are a spouse and those are the rules about spouses. You don't lie to a spouse because it's intrinsically wrong to lie at all, to anyone. It's merely extra wrong to do anything wrong to someone you supposedly claim to extra-care about. Lying was already wrong all by itself, for reasons that don't have anything special to do with spouses.
This idea that it's fine to lie to and waste the time of everyone who isn't your spouse, to commandeer and harness their attention for an interaction with you while you let a robot do your part and go off doing something more interesting with your own time and attention, simply because you don't know them personally and have no reason to care about them, is really pretty damning. The more you try to make this argument that you seem to think is so rational, the more empty inside you declare yourself to be.
I really cannot understand how anyone can try to float the argument "What's so bad about being tricked if you can't tell you were tricked?" There are several words for the different facets of what's so wrong, such as "manipulation". All I can say is, I guess you'll just have to take it on faith that humans overwhelmingly consider manipulation to be a bad thing. Read up on it. It's not just some strange idea I have.
I think we are having a fundamental disagreement about "being tricked" happening at all. I'm intelligent enough to follow the argument.
I see that, in the hyperbolic case, you are actively tricking your wife. I just don't agree that you are actively tricking random public visitors of a blog in any real way. There is no agreement in place such that you can "trick" them. Presumably you made commitments in your marriage. No commitments were made to the public when a blog got posted.
It's equally baffling to me that you would use one case to make the point of the other. It doesn't make any fucking sense.
Why was it wrong in the wife case? What specifically was wrong about it? Assume she never finds out and totally loves the gift. Is purely happy. (I guess part of this also depends on the answer to another question: What is she so happy about exactly?)