lancekey · 2026-05-23T16:37:47 1779554267

Also check out his nanochat repo. I used the repo, Claude, and Shadeform to train my own mini model for about $300. It would have cost less, but I screwed up and let the cloud GPU rental run for a few hours after the training run had errored out.

Of course the model was dumber than GPT-2, but it was still a great learning experience.

lancekey · 2026-05-21T00:47:43 1779324463

https://hochuzhit.com/en/

I think I saw videos of drones with loudspeakers or dropping flyers with a Telegram QR code, but I don’t have sources.

lancekey · 2026-05-20T22:14:44 1779315284

25/75%. Plenty of stores are owned directly by McDonald's Corp.

lancekey · 2026-03-27T13:44:55 1774619095

Do you see any benefit in doing this locally versus having Codex review the PR Claude generates?

axldelafosse · 2026-03-27T17:41:28 1774633288

The feedback loop is faster. But PR reviews are still useful as they are multiplayer (meaning that you and another human reviewer can talk about a specific agent's comment directly on the diff, which is very useful sometimes).

lancekey · 2026-03-27T12:03:33 1774613013

Ha, I just spec'd out a version of this. I have a simple static website that I want a few people to be able to update.

So, we will give these 3 or 4 trusted users access to an on-site chat interface to request updates.

Next, a dev environment is spun up, the agent makes the changes, creates a PR, and sends a branch preview link back to the user.

Sort of an agent-driven CMS for non-technical stakeholders.
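
A rough sketch of that request flow, just to make the steps concrete. Everything here is a placeholder: the user names, the `handle_request` and `slugify` helpers, and the `preview.example.com` URL are assumptions for illustration, and the dev environment, agent run, and PR creation are stubbed out rather than wired to any real service.

```python
from dataclasses import dataclass

# Hypothetical allow-list standing in for the 3 or 4 trusted users.
ALLOWED_USERS = {"alice", "bob", "carol"}


@dataclass
class UpdateResult:
    branch: str
    pr_title: str
    preview_url: str


def slugify(text: str) -> str:
    """Turn a free-text request into a short, branch-safe slug."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return "-".join(cleaned.split()[:5])


def handle_request(user: str, request: str) -> UpdateResult:
    """Gate on the allow-list, then describe the branch, PR, and preview link."""
    if user not in ALLOWED_USERS:
        raise PermissionError(f"{user} is not allowed to request updates")
    branch = f"content/{slugify(request)}"
    # A real version would spin up a dev environment, run the agent,
    # and open a PR here. This sketch only builds the values it would return.
    preview_url = f"https://preview.example.com/{branch.replace('/', '-')}"
    return UpdateResult(
        branch=branch,
        pr_title=f"Content update: {request}",
        preview_url=preview_url,
    )


if __name__ == "__main__":
    print(handle_request("alice", "Update the opening hours!"))
```

The allow-list check comes first so an untrusted user never triggers any compute.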

Let’s see if it works.

lancekey · 2026-03-25T14:49:27 1774450167

I don’t think so. IIRC the desktop app is called Claude and it has a code option in the UI.

anthuswilliams · 2026-03-25T23:17:37 1774480657

Claude Cowork (part of the Desktop app) is Claude Code, running inside a VM.

Helpful writeup here: https://pvieito.com/2026/01/inside-claude-cowork (I am not the author)

Mashimo · 2026-03-25T15:53:34 1774454014

If you go to the product website: https://claude.com/product/claude-code

> Use Claude Code where you work

> Desktop, Terminal, IDE, Web and iOS, Slack

Not that it is important any way ¯\_(ツ)_/¯

lancekey · 2026-03-09T11:58:47 1773057527

For over a year now, I’ve been working on Compute Prices (https://computeprices.com).

It’s been a great way for me to better understand the cloud GPU industry, learn about data collection and normalization, and use agentic coding to build a side project.

One thing I’m working on is distinguishing spot vs. on-demand prices and listing those separately. I also want to include inference pricing for non-text AI models.
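
To illustrate the spot vs. on-demand split, here is a minimal sketch of one way to normalize listings to a per-GPU hourly price and keep the two pricing types apart. This is not how the site actually works; the `Listing` fields, the function names, and the sample numbers in the usage block are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    provider: str
    gpu_model: str
    gpu_count: int
    price_per_hour: float  # USD per hour for the whole instance
    pricing_type: str      # "spot" or "on_demand"


def per_gpu_hour(listing: Listing) -> float:
    """Normalize an instance price to USD per GPU per hour."""
    return listing.price_per_hour / listing.gpu_count


def cheapest_by_type(listings):
    """Map (gpu_model, pricing_type) to the cheapest (provider, per-GPU price)."""
    best = {}
    for listing in listings:
        key = (listing.gpu_model, listing.pricing_type)
        price = per_gpu_hour(listing)
        if key not in best or price < best[key][1]:
            best[key] = (listing.provider, price)
    return best


if __name__ == "__main__":
    # Invented sample data, not real quotes.
    sample = [
        Listing("provider_a", "H100", 8, 24.0, "on_demand"),
        Listing("provider_b", "H100", 1, 2.5, "on_demand"),
        Listing("provider_a", "H100", 8, 12.0, "spot"),
    ]
    for key, value in cheapest_by_type(sample).items():
        print(key, value)
```

Keying on the pricing type means a cheap spot instance never gets presented as the best on-demand price.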

What features or data would you like to see me add next?

lancekey · 2026-02-05T00:11:52 1770250312

Can you say a bit more about evals and your approach?

alexhans · 2026-02-06T02:38:30 1770345510

High level, the approach is:

- I'm pain point driven:

  - I can't compare what I can't measure. 

  - I can't trust this "AI" tool to run on its own

- That's automation, which is about intentionality (can I describe what I want?) and risk profile understanding (What's the blast radius/worst that could happen)

Then I treat it as if it was an Integration Test/Test Driven Development exercise of sorts.

- I don't start designing an entire cloud infrastructure.

- I make sure the "agent" is living in the location where the users actually live so that it can be the equivalent of an extra paid set of hands.

- I ask questions or replicate user stories and use deterministic tests wherever I can. Don't just go for LLMaaJ. What's the simplest thing you can think of?

- The important thing is rapid iteration and control. Just like in a unit testing scenario, it's not about writing 100 tests but about writing the ones that qualitatively allow you to move as fast as possible.

- At this stage where the space is moving so fast and we're learning so much, don't assume or try to over-optimize places that don't hurt and instead think about minimalism, ease of change, parameterization and ease of comparison with other components that form "the black box" and with itself.

- Once you have the benchmarks that you want, you can make decisions such as picking the cheapest model/agent configuration that does the job within the acceptable timeframe.
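
As a rough illustration of the last few points, here is a minimal sketch of a deterministic check plus a "cheapest configuration that passes" selection. The `BenchmarkResult` fields, the function names, the thresholds, and the numbers in the usage block are assumptions for the example, not anything from a real benchmark.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    config: str       # hypothetical model/agent configuration label
    pass_rate: float  # fraction of deterministic checks passed, 0.0 to 1.0
    cost_usd: float   # cost of one benchmark run
    seconds: float    # wall-clock time of one benchmark run


def check_contains_all(output: str, required: list[str]) -> bool:
    """Deterministic check: every required term appears in the output."""
    lowered = output.lower()
    return all(term.lower() in lowered for term in required)


def pick_cheapest(results, min_pass_rate: float, max_seconds: float):
    """Return the cheapest result meeting both thresholds, or None."""
    acceptable = [
        r for r in results
        if r.pass_rate >= min_pass_rate and r.seconds <= max_seconds
    ]
    return min(acceptable, key=lambda r: r.cost_usd) if acceptable else None


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    results = [
        BenchmarkResult("small", 0.70, 0.01, 5.0),
        BenchmarkResult("medium", 0.95, 0.05, 12.0),
        BenchmarkResult("large", 0.99, 0.40, 30.0),
    ]
    print(pick_cheapest(results, min_pass_rate=0.9, max_seconds=20.0))
```

A plain string check like this is about the simplest thing that can work; an LLM judge would only be needed where no deterministic check can express the requirement.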

Happy to go deeper on these. I have some practical/runnable samples/text I can share on the topic after the weekend. I'll drop a link here when it's ready.

lancekey · 2026-02-10T12:34:19 1770726859

This is really insightful. Thank you.

Your first two points jibe with my intuition that an agent's primitives should be a code execution sandbox, mounted files and git.

If you have any practical examples to share I’m sure a ton of people would appreciate it.

alexhans · 2026-02-15T19:00:55 1771182055

I just shared this on HN https://news.ycombinator.com/item?id=47026263 to see if it's possible to scale the knowledge sharing and the simple, good practices that keep people in control.

It may or may not address the practical examples you need, but I'd be keen to hear your thoughts, and maybe it's possible to come up with a more illustrative one.

I didn't go for bubblewrap or similar containers yet because I didn't want to lose a specific type of baseline newcomer (economists who do some coding), but I will be adding to it with the most elegant approaches I can find that don't leak too much complexity, for things like sandboxing, system testing, integration mocking (reverse proxying), observing with OpenTelemetry or otherwise, presenting benchmarks, etc.

lancekey · 2026-01-29T20:41:03 1769719263

Human Emulator offers agent-based (near-instant) quotes for any computer task.

lancekey · 2026-01-29T14:24:28 1769696668

I'm starting to add inference providers to computeprices.com, but if you even just look at GPU/hr rentals, there are some reasonable options out there.

I personally have been enjoying Shadeform to build the GPU setup I like.