> If your metric is an LLM that can copy/paste without alterations, and never hallucinate APIs, then yeah, you'll always be disappointed with them.
I struggle to take comments like this seriously - yes, it is very reasonable to expect these magical tools to copy and paste something without alterations. How on earth is that an unreasonable ask?
The whole discourse around LLMs is so utterly exhausting. If I say I don't like them for almost any reason, I'm a luddite. If I complain about their shortcomings, I'm just using it wrong. If I try and use it the "right" way and it still gets extremely basic things wrong, then my expectations are too high.
I think what they're best at right now is the initial scaffolding work of projects. A lot of the annoying bootstrap shit that I hate doing is actually generally handled really well by Codex.
I agree that there's definitely some overhype to them right now. At least for the stuff I've done they have gotten considerably better though, to a point where the code it generates is often usable, if sub-optimal.
For example, about three years ago, I was trying to get ChatGPT to write me a fairly basic ZeroMQ program in C. It generated something that looked correct, but it would crash pretty much immediately, because it kept trying to use a pointer after free.
I tried the same thing again with Codex about a week ago, and it worked out of the box, and I was even able to get it to do more stuff.
I think it USED to be true that you couldn't really use an LLM on a large, existing codebase. Our codebase is about 2 million LOC, and a year ago you couldn't use an LLM on it for anything but occasional small tasks. Now, probably 90% of the code I commit each week was written by Claude (and reviewed by me and other humans - and also by Copilot and ZeroPath).
It seems like just such a weird and rigid way to evaluate it? I am a somewhat reasonable human coder, but I can't copy and paste a bunch of code without alterations from memory either. Can someone still find a use for me?
For a long time, I've wanted to write a blog post on why programmers don't understand the utility of LLMs[1], whereas non-programmers easily see it. But I struggle to articulate it well.
The gist is this: Programmers view computers as deterministic. They can't tolerate a tool that behaves differently from run to run. They have a very binary view of the world: If it can't satisfy this "basic" requirement, it's crap.
Programmers have made their career (and possibly life) being experts at solving problems that greatly benefit from determinism. A problem that doesn't - well, that either needs to be solved by sophisticated machine learning, or by a human. They're trained to essentially ignore those problems - it's not their expertise.
And so they get really thrown off when people use computers in a nondeterministic way to solve a deterministic problem.
For everyone else, the world, and its solutions, are mostly non-deterministic. When they solve a problem, or when they pay people to solve a problem, the guarantees are much lower. They don't expect perfection every time.
When a normal human asks a programmer to make a change, they understand that communication is lossy, and even if it isn't, programmers make mistakes.
Using a tool like an LLM is like any other tool. Or like asking any other human to do something.
For programmers, it's a cardinal sin if the tool is unpredictable. So they dismiss it. For everyone else, it's just another tool. They embrace it.
[1] This, of course, is changing as they become better at coding.
My problem isn't lack of determinism, it's that its solution frequently has basic errors that prevent it from working. I asked ChatGPT for a program to remove the background of an image. The resulting image was blue. When I pointed this out to ChatGPT it identified this as a common error in RGB ordering in OpenCV and told me the code to change. The whole process did not take very long, but this is not a cycle I want to be part of. (That, and it does not help me much to be given a basic usage of OpenCV that does not work for the complex background I wanted to remove.)
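For anyone who hasn't hit the blue-image problem: OpenCV keeps pixels in BGR order, while most other libraries and viewers assume RGB, so red and blue trade places unless you convert. Here's a rough sketch of the mix-up in plain Python. The `swap_red_blue` helper and the one-pixel "image" are made up for illustration; in real OpenCV code the usual fix is `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`. I don't know what the code ChatGPT actually produced looked like.

```python
def swap_red_blue(pixels):
    """Reverse the channel order of every pixel (BGR <-> RGB)."""
    return [[(c, b, a) for (a, b, c) in row] for row in pixels]


# One pure-red pixel, stored the way OpenCV stores it: (blue, green, red).
red_as_bgr = [[(0, 0, 255)]]

# A viewer that assumes RGB reads the first slot as red and the last as
# blue, so the unconverted pixel comes out blue.
r, g, b = red_as_bgr[0][0]
print("unconverted, read as RGB:", {"r": r, "g": g, "b": b})

# Swapping the channels before handing the data over fixes the colour.
red_as_rgb = swap_red_blue(red_as_bgr)
print("converted:", red_as_rgb[0][0])
```

The swap is its own inverse, so the same helper goes from RGB back to BGR.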
Then there are the cases where I just cannot get it to do what I ask. Ask Gemini to remove the background of an image and you get a JPEG with a baked-in checkerboard background, even when you tell it to produce an RGBA PNG. Again, I don't have any use for that.
But it does know a lot of things, and sometimes it informs me of solutions I was not aware of. The code isn't great, but if I were non-technical (or not very good), this would be fantastic and better than I could do.
I’m perfectly happy for my tooling to not be deterministic. I’m not happy for it to make up solutions that don’t exist, and get stuck in loops because of that.
I use LLMs, I code with a mix of Antigravity and Claude Code depending on the task, but I feel like I’m living in a different reality when the code I get out of these tools _regularly just doesn’t work, at all_. And to the parent’s point, I’m doing something wrong for noticing that?
If it were terrible, you wouldn't use them, right? Isn't the fact that you continue to use AI coding tools a sign that you find them a net positive? Or is it being imposed on you?
> And to the parent’s point, I’m doing something wrong for noticing that?
There's nothing wrong pointing out your experience. What the OP was implying was he expects them to be able to copy/paste reliably almost 100% of the time, and not hallucinate. I was merely pointing out that he'll never get that with LLMs, and that their inability to do so isn't a barrier to getting productive use out of them.
I was the person who said it can't copy from examples without making up APIs, but:
> he'll never get that with LLMs, and that their inability to do so isn't a barrier to getting productive use out of them.
This is _exactly_ what the comment thread we're in said - and I agree with him.
> The whole discourse around LLMs is so utterly exhausting. If I say I don't like them for almost any reason, I'm a luddite. If I complain about their shortcomings, I'm just using it wrong. If I try and use it the "right" way and it still gets extremely basic things wrong, then my expectations are too high.
> If it were terrible, you wouldn't use them, right? Isn't the fact that you continue to use AI coding tools a sign that you find them a net positive? Or is it being imposed on you?
You're putting words in my mouth here - I'm not saying that they're terrible, I'm saying they're way, way, way overhyped and their abilities are overblown (look at this post and the replies of people saying they're writing 90% of code with Claude and using AI tools to review it), but when we challenge that, we're wrong.
> And so they get really thrown off when people use computers in a nondeterministic way to solve a deterministic problem
Ah, no. This is wildly off the mark, but I think a lot of people don't understand what SWEs actually do.
We don't get paid to write code. We get paid to solve problems. We're knowledge workers like lawyers or doctors or other engineers, meaning we're the ones making the judgement calls and making the technical decisions.
In my current job, I tell my boss what I'm going to be working on, not the other way around. That's not always true, but it's mostly true for most SWEs.
The flip side of that is I'm also held responsible. If I write ass code and deploy it to prod, it's my ass that's gonna get paged for it. If I take prod down and cause a major incident, the blame comes to me. It's not hard to come up with scenarios where your bad choices end up costing the company enormous sums of money. Millions of dollars for large companies. Fines.
So no, it has nothing to do with non-determinism lol. We deal with that all the time. (Machine learning is decades old, after all.)
It's evaluating things, weighing the benefits against the risks and failure modes, and making a judgement call that it's ass.
Also good for manufacturing consent on Reddit and other places. Intelligence services are busy with a certain country now, with bots using LLMs to pump out insane amounts of content to mold the information atmosphere.
It's strong enough to replace humans at their jobs and weak enough that it can't do basic things. It's a paradox. Just learn to be productive with them. Pay $200/month and work around its little quirks. /s
> The more surprising part is the unusual reactions of the other people getting a better picture and context of what I’m explaining without the usual back and forth - which has landed me my fair share of complaints of having to hear mini lectures, but not more than people appreciative of the fuller picture.
It’s not surprising to me at all. People don’t tend to appreciate being lectured at - especially in a conversational context. Moreover, people really don’t like being spoken to as if they’re robots (which is something I’ve started to notice happening more and more in my professional life).
The fact that the author considers these reactions surprising and “unusual” betrays a misunderstanding of (some of) the purposes of communication. Notably, the more “human” purposes.
> The fact that the author considers these reactions surprising and “unusual” betrays a misunderstanding of (some of) the purposes of communication. Notably, the more “human” purposes.
Guess that's what early access to the internet and a pandemic during the final school years does to a person, ah well haha
You keep saying “nullification”. Can you explain precisely what you mean by that?
Because as far as I’m aware, immigration law is not a concern of the state, and what folks typically mean when they say “nullification” in this context is “the state isn’t doing the fed’s job for them.”
You also brought up warrants to enter private property. What do you make of the incident a few days ago where an agent hopped a fence to arrest someone, without a warrant? Should we just ignore those violations of our rights?
>Because as far as I’m aware, immigration law is not a concern of the state, and what folks typically mean when they say “nullification” in this context is “the state isn’t doing the fed’s job for them.”
It's not just immigration law, it's any federal law. States have the right to ignore federal law if they like. This is called nullification. However, it very, very rarely happens because it's inherently undemocratic. It especially rarely happens to the extent that cities and states pass explicit laws that order state law enforcement to ignore federal laws, and even work against the federal government's interests.
It's happened recently with marijuana legalization, with success. The federal government did some raids, but marijuana legalization is politically popular, so they backed off... and there has even been talk in some years of ending the illegality of marijuana federally.
State nullification has been somewhat unsuccessful with illegal immigration. These raids are the result of the federal government going its own way to enforce the law without cooperation of the states. The last time we saw this level of federal enforcement against state objection is after Brown v Board of Education: https://en.wikipedia.org/wiki/Little_Rock_Nine
A good comparison for the seriousness of nullification as an act that is inherently an escalation is gun control laws. Suppose some red states wanted to just nullify the National Firearms Act -- https://en.wikipedia.org/wiki/National_Firearms_Act -- They are perfectly within their rights to ignore federal laws and allow firearms dealers to sell unregistered, suppressed, machine guns to felons. The only way neighboring blue states -- obviously outraged that this is happening -- can do anything about this is by seeking federal enforcement, again, which would include raids, arrests, etc.
>You also brought up warrants to enter private property. What do you make of the incident a few days ago where an agent hopped a fence to arrest someone, without a warrant? Should we just ignore those violations of our rights?
I'm very much not saying ICE is always acting within the law. Like any other policing force, they're going to make mistakes (intentional or otherwise). We should be very angry about those things, especially if they're happening in bad faith. The problem I see is that when we're yelling about actually -- and unfortunately -- legal things then those serious issues are just going to look like background noise. The other serious problem is that all this crying wolf literally makes the left look undemocratic. You don't like the law? Fight to change it. Don't just take the ball and go home, and then cry when the neighbors come to your house to get the ball back.
There is a world of difference between “passing a state law that directly contradicts federal law” and “declining to proactively enforce federal laws in ways that are not required by those laws.”
To drive the point home: federal immigration laws are already enforced by federal agencies. Here in IL, state and local officials cooperate to the extent required by law. There are no federal laws on the books requiring them to do the job of the federal government for them (they could pass one, but they haven’t).
Calling that “nullification” is intellectually dishonest. As you said - “if you don’t like the law, fight to change it.” Don’t pretend it’s something it’s not.
>Here in IL, state and local officials cooperate to the extent required by law.
This is clearly false with regard to most federal laws. To illustrate this, I'll take an exceptional example. If there were a serial killer who was living in IL, but had only killed people in other states, I suspect that the IL government would likely go out of its way to assist the Feds in apprehending this killer, even though this is not required by state law.
IL would likely do the same for many, if not most, federal laws. Nullification is precisely when the state does not help when asked. There can still be practical resource reasons for that, but it becomes very obvious nullification when the state passes laws preventing individuals who would LIKE to help, like local police departments, from helping even if they wanted to. And this is exactly what has happened in many blue states.
Pretending that's not overt nullification is unserious.
Not assisting with enforcement acts you don't feel are worthwhile is not nullification. I'm not engaging in "nullification" when I don't call the police on a jaywalker. Or I mean maybe you think this is, but then police engage in wildcat strikes all the time, or change enforcement priorities, or whatever you want to frame it as. Calling a difference in prioritization "nullification" is wrong, especially if local police in immigrant communities want to maintain good relationships with those communities. I think it's laudable that some police forces show an interest in serving their communities' interests, as opposed to yearning to be fashy.
> but it becomes very obvious nullification when the state passes laws preventing individuals who would LIKE to help, like local police departments, from helping even if they wanted to. And this is exactly what has happened in many blue states.
Can you give examples?
Keep in mind, "sanctuary city" policies are usually actually supported by local police forces, because while they may look not tough on crime (and for this reason sometimes police forces halfheartedly lobby against them), they actually make on-the-ground local policing easier, because they engender trust between the local police force and immigrant communities who otherwise might not report crimes at all.
>I’m not going to engage with you if you’re going to get in multiple threads and refer to things as “fashy.”
>It’s difficult enough to engage in a heterodox view in good faith. I don’t need to deal with slapdash bullshit.
I see we've reached the point in the discussion where you 'abruptly fall silent, loftily indicating...that the time for argument is over.'
Good fascist! Nice fascist! Late for a Bund meeting, are we?
Source:
“Never believe that anti-Semites [or in this case, fascist apologists] are completely unaware of the absurdity of their replies. They know that their remarks are frivolous, open to challenge. But they are amusing themselves, for it is their adversary who is obliged to use words responsibly, since he believes in words. The anti-Semites have the right to play. They even like to play with discourse for, by giving ridiculous reasons, they discredit the seriousness of their interlocutors. They delight in acting in bad faith, since they seek not to persuade by sound argument but to intimidate and disconcert. If you press them too closely, they will abruptly fall silent, loftily indicating by some phrase that the time for argument is past.” ― Jean-Paul Sartre[0]
> But this notion that roving bands of assassins are driving down the street looking for browns is likely an exaggeration (made worse by misinformation on social media).
Assassins? Nobody said that.
But my friend I can assure you they are, in fact, driving down the street and taking people who “look suspicious.”
(They also are doing more targeted things - both are true.)
In a way, the article understates how bad it is. I live in Chicago, and in my neighborhood every lamp post (and mailbox, and other surface) has a poster detailing your rights. “Fuck ICE” (and related) signs all over. Most businesses and a lot of houses in my neighborhood have signs explicitly stating that ICE is not welcome inside without a warrant. My coffee shop regularly has free whistles to take, so you can help alert others.
Just a few days ago I was working at a coffee shop and got a rapid response notice that ICE was about a block from me. I got a few more that day, all within a few blocks of my house.
It is incredibly stressful. I married into a family that is not white, and my kids are not white - they are a target. I pray every day that the next daycare raid isn’t my son’s daycare, that ICE doesn’t stop my husband as he goes to work, that my mother-in-law doesn’t get snatched off the street when she walks to Target.
> Because you didn't tell it to make a "professional analytics application" for a while and then switch to nonsensical "unicorns and rainbows" at the end. You forgot to trick it into the "gotcha!" situation that OP intentionally created to make fun of the stupid AI.
Even if the OP initially asked for a “professional” application, this is hardly a “gotcha” situation - our tools should do what we ask!
I’m sure we could come up with some realistic exceptions, but let’s not waste our words on them: this is a pretty benign thing and I cannot believe we are normalizing the use of tools which do not obey our whims.
Our tools should not do what we ask if we ask them to do things they should not do.
If it were possible for a gun to refuse to shoot an innocent person then it should do that.
It just so happens that LLMs aren't great at making perfectly good decisions right now, but that doesn't mean that if a tool were capable of making good decisions it shouldn't be allowed to.
If you define the behavior of the system in an immutable fashion, it ought to serve as a guardrail to prevent anyone (yourself included) from fucking it up.
I want Claude to tell me to fly a kite if I ask it to do something antithetical to the initially stated mission. Mixing concerns is how you end up spending time and effort trying to figure out why 2+2 seems to also equal 2 + "" + true + 1
> The chances of a citizen being targeted by ICE is low.
You can’t start with this premise, though. Recent rulings allow stops based on “probable cause” such as a combination of “speaks Spanish”, “is brown”, and “is in a place where we think illegal immigrants might be”.
So like: any Latino US citizen, who happens to be working someplace like a landscaping company. Or a kitchen.
The idea that citizens aren’t likely to be targets is now laughable. And we have ample reporting indicating that in fact, citizens are being detained, for hours and hours (if not longer).
Low doesn't mean zero, it means low. You might notice I used different terms for the different groupings, with the chance of a citizen being targeted by ICE as the highest overall at "low". ICE has so far deported more than 400,000 illegal aliens. [1] If they were "only" 99% accurate, you'd be able to find thousands of instances where things went wrong. Instead, you're looking more at tens to low hundreds of instances, so it's likely that their overall accuracy is somewhere in the 99.9% to 99.99% range.
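To spell out the arithmetic in that comment (and only the arithmetic), here's a quick sketch. The 400,000 figure is taken as quoted above and not checked against anything. The `implied_errors` helper is hypothetical. It also assumes every error would be counted, which is an assumption of the comment rather than something the numbers show.

```python
# Figure quoted in the comment above; not independently verified here.
DEPORTATIONS = 400_000


def implied_errors(total, one_error_in):
    """Errors implied if one case in every `one_error_in` goes wrong."""
    return total // one_error_in


# Accuracy levels expressed as "one error in N" to keep the math exact.
for accuracy, one_error_in in (("99%", 100), ("99.9%", 1_000), ("99.99%", 10_000)):
    print(accuracy, "->", implied_errors(DEPORTATIONS, one_error_in), "errors")
```

So 99% would mean thousands of errors, while 99.9% to 99.99% means hundreds down to tens. How many errors have really occurred is a separate question that this does not answer.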
And as I was demonstrating above, the conditional probabilities required for a false positive from this app mean that its practical effective accuracy rate will likely be 100%.
> Instead, you're looking more at tens to low hundreds of instances
…based on what, the independent research ProPublica did? DHS doesn’t even keep statistics on how many citizens they detain so I’m not sure we should be assuming the numbers here are that low.
What I mentioned already. Being incorrectly detained isn't consequence free. People can and have successfully sued, winning substantial sums of money in the process. And we also live in the social media age where nothing gets more of those sacred likes and other such things (including that sweet sweet GoFundMe money) than framing oneself as a victim. And on top of all this, immigration enforcement runs contrary to the corporate media's biases. They are actively trying to make a mountain out of every single molehill, yet they are clearly finding themselves annoyingly short of molehills.
In other words, ICE's errors are highly visible. That my approximation aligns with ProPublica's (which is probably a higher end ballpark since I doubt they were especially critical of any claims they discovered) is unsurprising.
> People can and have successfully sued, winning substantial sums of money in the process.
I would be quite interested to know if you can cite sources on that.
> And we also live in the social media age where nothing gets more of those sacred likes and other such things (including that sweet sweet GoFundMe money) than framing oneself as a victim.
What, exactly, is your argument here? That it’s all fine because you think people will play victim and strike it rich on GoFundMe? I’m struggling to see what point you’re actually trying to make.
> They are actively trying to make a mountain out of every single molehill, yet they are clearly finding themselves annoyingly short of molehills.
Sources? Please list out what molehills were made into mountains. What evidence do you have that they are actively trying to do it because of their “bias”? You’re regurgitating tired, right-wing talking points. Back it up with evidence if you’re so sure about it.
> which is probably a higher end ballpark since I doubt they were especially critical of any claims they discovered
Well, sure. Let’s see some evidence that they weren’t critical enough with their reporting. My understanding is that they are a highly respected journalistic outfit. What makes you so sure they were playing fast and loose with the facts?
Here [1] is a case where somebody won $150,000 for being detained for 12 hours. The cases aren't especially difficult to search for yourself, so I'm not sure why you're asking me. I can actually respond to everything else you said with an example from ProPublica's story you cited [2], to emphasize that I'm not cherry picking cases to make my point! Scroll down about 25% of the way and you'll get their first inline video example of "Rafie Ollah Shouhed".
Now go frame by frame at about the 5.5s mark. You can see the individual in question charge and then thrust his body in front of the responding ICE officer (watch how he leans left into the officer before they are in physical contact) to create a physical altercation. He then attempts to grab the legs of the officer as he jogs away. The same guy then comes out for more, and pushes one ICE officer dealing with somebody else, and then starts grappling with another ICE officer before he's finally tackled and arrested.
Media Framing:
- Surveillance footage shows Ice agents pushing 79-year-old man to the ground (Guardian)
- Car wash owner files $50M claim over injuries sustained during immigration raid (ABC)
- 79-year-old US citizen pinned by ICE agents (Fox)
- U.S. citizen files civil rights claim after ICE raid at his car wash (NBC)
As this is the first video ProPublica featured, presumably they think that's the most compelling case. In any case it's certainly one of their cases which are supposed to be unjust, yet there wasn't even the slightest injustice there whatsoever. And now he wants $50 million lol. I'd also add that ProPublica implies that the government dropping charges in cases is because of lack of merit. In reality it's going to be a balance of gain against loss from pursuing them. This is one of those cases where the charges were dropped, but obviously that was not done for lack of merit.
What you see as “thrusting” sure looks to me like he was trying to stop himself from a full-on run - why did he grab a door handle on the wall? Why would you grab and pull like that if you were trying to tackle?
And “grabbing his legs”… come on man. That looks a hell of a lot like an old man flailing after getting tackled.
And you think he grappled with the officer before getting arrested outside? It looks like precisely the opposite.
I didn't say he was trying to tackle, I said he was trying to create a physical altercation, probably with the premeditated goal of trying to sue and/or buy time for the likely illegal aliens working for him - not only for their sake, but because hiring illegal aliens is a felony. I don't believe you believe that he just 'accidentally fell' exactly on the officer exactly as he came into range.
The reason he grabbed the door handle is because in his mind he thought he was going to be the one knocking the officer down. He's a big and very aggressive guy that's this spry at 79 - I'm positive this wasn't even remotely close to his first rodeo. He grabbed on to help maintain balance.
As another issue, he wasn't running anywhere in particular, except towards the officer. As soon as he gets up from crashing into him, he turns around and starts racing back towards him again. He then pushes the other officer at 28 seconds and begins grappling with yet a third officer at 30 seconds. He's then tackled at 35 seconds.
Given he was not arrested for intentionally crashing into the first officer I think ICE was generally trying their hardest to ignore him, but that probably became impossible about the point he actively decided to start grappling with them.
> probably with the premeditated goal of trying to sue and/or buy time for the likely illegal aliens working for him
My bad. Didn’t realize you already knew his heart, and that this was premeditated. And that you clearly know who his employees were and that they were here illegally.
Whoops. Since we know they’re guilty, I guess all that’s left to do is find the evidence!
Enjoy yourself. I won’t engage in this uncharitable, ugly discussion with you anymore. I hope you find peace in your heart, and I hope others treat you with the charity and dignity you’re clearly unwilling to give others.
Feel free to try to create a plausible explanation for his aggressive behavior otherwise. Why would you say he was running towards the officer only to 'accidentally' land on him right as he passed him? And then why would he go outside, push one officer, and begin grappling with another? This is not how normal people behave.
He actually gave an explanation for this which we clearly know is a lie - he claimed that "when he tried to speak with the agents and show them the legal paperwork for his employees, they shoved him to the ground, and at least one agent put his knee on Shouhed’s neck." [1] He probably wasn't aware the outside altercation had been recorded. Where's the paperwork? And in this case 5 illegal aliens were arrested, including one who had already been arrested and deported twice previously.
There's a balance to all things in life. Obviously we should not be blindly prejudiced against individuals on one extreme, yet on the equal but opposite extreme one can be so open minded that your brain falls out.
Available reporting indicates that the judge ruled on Thursday, and that DHS carried out the deportation on Friday. Moreover, available reporting also indicates:
> DHS and ICE did not respond to questions from The Associated Press seeking additional details on the timeline and how officials receive federal court orders.
So they aren’t clarifying anything. Odd.
And don’t forget back in March, when the administration publicly asserted that oral orders from a judge carried no authority and that they would only heed written orders.
When you put those two together, one wonders: perhaps DHS is playing fast and loose with timelines again.
Why on earth would you treat anything they say as if it were truthful or reliable? They have lost the right to be treated as trustworthy by default.
Are you suggesting a government agency is just making things up in official communications?
If that’s the case you must also assume the deportee is lying as well? Between the two it’s the deportee who has the bigger incentive to make things up.
If we’re going to go with those assumptions there is no point in even discussing it because neither of us has any facts to base an argument on.
So why should I believe anything they say these days? They are blatantly lying, in ways that are manifestly obvious to anyone that is willing to look. We don’t owe the presumption of good faith to people who time and again have been publicly caught lying - and worse, who haven’t even tried to correct the record.
Half of your sources are other government officials. That kind of runs counter to your argument that you can't rely on government official statements to be true, no?
And let's look at the Reason article. "Martinez also was taken to the hospital by ambulance, and the criminal complaint against her only mentions two cars, not 10."
Ok, so the DHS "lied" about being boxed in by 10 cars when it was 2. That seems to miss the forest for the trees, no? The DHS agents were still boxed in - normally threatening federal law enforcement officers is illegal, no?
If someone is taking the time to refute ChatGPT’s output and telling you why the answer isn’t applicable in a given situation, it certainly implies that ChatGPT wasn’t “correct enough” at all.
What situations do you think it’s fine to be “correct enough?”
But this person says they want to refute it in every situation.
Some people seem to make a hobby of refuting the output of others. So no, I don’t trust the implication that if somebody spends time refuting it that it must be worth refuting.
In my experience (with both people-output and ChatGPT-output) my goal is to not refute anything unless it absolutely positively must be refuted. If it’s a low-stakes situation where another person has an idea that seems like it might/will probably work, let them go nuts and give it a shot. I’ll offer quick feedback or guiding questions but I have 0 interest in refuting it even if I think there’s a chance it’ll go wrong. They can learn by doing.
I struggle to take comments like this seriously - yes, it is very reasonable to expect these magical tools to copy and paste something without alterations. How on earth is that an unreasonable ask?
The whole discourse around LLMs is so utterly exhausting. If I say I don't like them for almost any reason, I'm a luddite. If I complain about their shortcomings, I'm just using it wrong. If I try and use it the "right" way and it still gets extremely basic things wrong, then my expectations are too high.
What, precisely, are they good for?