The leaderboards are dumb, but I understand the point of telling people not to worry about tokens and just use it. They are trying to get people to try it, to discover new uses without asking “is this worth testing”. It’s basically early R&D budget. Eventually these companies will decide it’s time to transition into efficient usage.
Mailing patches is the same as squashing commits. The Linux kernel would be much harder to maintain without messy history being carefully distilled down to well-crafted patches.
But mailing patches is a pain in the ass. VCSes should support squashing and rebasing.
The data entry is a pain in the ass with those apps when cooking food from scratch. It’s much much easier with LLMs and natural language and voice mode and pictures of a food scale and things like that.
I’ve found that the best way to deal with this is to add an entry to /etc/hosts for my local machine that fits the hostname pattern of the QA environment. Then I run a local reverse proxy with a self-signed certificate.
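For anyone who wants to try this, here is a rough sketch of the idea in Python, not the exact setup described above. The hostname pattern `*.qa.example.com`, the hostname `mybox.qa.example.com`, the upstream port 3000, and the helper names are all made up for illustration. It also assumes you have already generated a self-signed certificate and key for that hostname by some other means.

```python
import fnmatch
import http.server
import ssl
import urllib.request

# Hypothetical values: substitute your own QA hostname pattern and local app.
QA_PATTERN = "*.qa.example.com"
LOCAL_HOSTNAME = "mybox.qa.example.com"
UPSTREAM = "http://127.0.0.1:3000"


def hosts_entry(hostname: str, pattern: str = QA_PATTERN) -> str:
    """Return the /etc/hosts line that points hostname at loopback.

    Raises ValueError if the hostname does not fit the QA pattern.
    """
    if not fnmatch.fnmatch(hostname, pattern):
        raise ValueError(f"{hostname} does not match {pattern}")
    return f"127.0.0.1 {hostname}"


class ProxyHandler(http.server.BaseHTTPRequestHandler):
    """Forward GET requests to the local app (GET only, to keep this short)."""

    def do_GET(self):
        # urlopen raises HTTPError on 4xx/5xx; a fuller proxy would relay those.
        with urllib.request.urlopen(UPSTREAM + self.path) as resp:
            body = resp.read()
            self.send_response(resp.status)
            self.send_header(
                "Content-Type", resp.headers.get("Content-Type", "text/plain")
            )
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)


def serve(certfile: str, keyfile: str, port: int = 8443) -> None:
    """Terminate TLS with the self-signed cert and proxy to UPSTREAM."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    httpd = http.server.HTTPServer(("127.0.0.1", port), ProxyHandler)
    httpd.socket = ctx.wrap_socket(httpd.socket, server_side=True)
    httpd.serve_forever()
```

You would append the output of `hosts_entry(LOCAL_HOSTNAME)` to /etc/hosts (that needs root), then call `serve("cert.pem", "key.pem")` with your own file paths and browse to `https://mybox.qa.example.com:8443/`. The browser will warn about the certificate until you trust it locally.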
For what it's worth, Codex doesn't yet seem to be aggressively terminating accounts or invalidating auth tokens when it detects usage in a non-first-party tool. Whether that will continue to be the case is a gamble, though.
Of course, you need to be careful about what access you give to your agent. I gave my agent its own email, and I can forward it emails if I need it to read anything in my inbox.
Everyone will have their own threshold for what type of access they want to give their agent. Some people will give it access to their personal email, bank account, etc., but I wouldn't recommend it yet! But I bet in a couple of years this will be standard practice.
There’s a lot of humans I wouldn’t trust to be an assistant with access to my bank account. It’s bold to assume that within 2 years these things are going to be scam resistant.
It’s going to be bleak when there’s articles about how “my agent fell for a scam and now my life savings are gone”.
Yeah if this can truly just autonomously make great software, then where is all the new SaaS that is able to undercut incumbents by charging 10-20% of what they are charging?