Hacker News | bodegajed's comments

One reason: maximizing investor value. CEOs and executives usually get bonuses after layoffs.


1.5B models can run on CPU inference at around 12 tokens per second if I remember correctly.


Ingesting multiple code files will take forever in prompt processing without a GPU though; token generation (tg) will be the least of your worries. Especially when you don't append to the prompt but change it in random places, so caching doesn't work.


A FIM or completion model like this won't have a large prompt and caching doesn't work anyways (per their notes). It'll get maybe a few thousand tokens in a prompt, maximum. For a 1.5B model, you should expect usable CPU-only inference on a modern CPU, like at least hundreds of tokens per second of prefill and tens of tokens per second of generation, which is decently usable in terms of responsiveness.


A thousand tokens (which would be on the low side) at 10-100 t/s in ingestion speed is 10-100 seconds. I don't seriously expect anyone to wait a solid minute after pressing tab for autocomplete; regular autocomplete gets unusably annoying if it takes more than a split second, tbh.
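The disagreement above comes down to the assumed prefill rate. As a rough sketch (not a benchmark), latency can be estimated as prompt tokens divided by prefill speed plus output tokens divided by generation speed. The function name, the 20-token completion length, and the specific rates below are illustrative assumptions, picked from the ranges quoted in this thread.

```python
def completion_latency_seconds(prompt_tokens, prefill_tps, output_tokens, generation_tps):
    """Estimate wall-clock latency as prefill time plus generation time."""
    return prompt_tokens / prefill_tps + output_tokens / generation_tps


if __name__ == "__main__":
    # Hypothetical scenario: 1,000-token prompt, 20-token completion,
    # generation at 12 t/s, and three different prefill rates.
    for prefill_tps in (10, 100, 500):
        seconds = completion_latency_seconds(1000, prefill_tps, 20, 12)
        print(f"prefill {prefill_tps:>3} t/s -> about {seconds:.1f} s")
```

Under these assumptions, whether the result is usable for tab completion depends almost entirely on which prefill figure holds on the hardware in question.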


Unfortunately, the main optimization (3x speedup) is using n-gram speculative decoding (spec dec), which doesn't run on CPUs. But I believe it works on Metal at least.
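For anyone unfamiliar with the term, here is a minimal sketch of the drafting step in n-gram speculative decoding, assuming the common "look up the trailing n-gram earlier in the context" variant. It is not the implementation discussed here; `propose_draft`, `ngram_size`, and `max_draft` are hypothetical names. The model would still have to verify the drafted tokens in one batched forward pass, which is where the speedup comes from.

```python
def propose_draft(tokens, ngram_size=3, max_draft=5):
    """Return up to max_draft tokens that followed the most recent earlier
    occurrence of the trailing n-gram, or [] if there is no earlier match."""
    if len(tokens) < ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Search backwards, starting just before the trailing n-gram itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            end = start + ngram_size
            return tokens[end:end + max_draft]
    return []


if __name__ == "__main__":
    context = [1, 2, 3, 4, 5, 1, 2, 3]
    print(propose_draft(context, ngram_size=3, max_draft=2))  # -> [4, 5]
```

This works well for code edits because much of the output repeats text already in the prompt.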


Brevity is the soul of wit, you did well sir.


Well said. This dream is probably for someone who experienced the hardship, felt frustrated, and gave up, then saw others do it effortlessly and even have fun with it. The manifestation of the dream feels like revenge to them.


This framing neatly explains the hubris of the influencer-wannabes on social media who have time to post endlessly about how AI is changing software dev forever while also having never shipped anything themselves.

They want to be seen as competent without the pound of flesh that mastery entails. But AI doesn’t level one’s internal playing field.


When executives fail, unfortunately, they don't blame each other. They do postmortems, then hire consultants to lay off senior engineers.


> When executives fail, unfortunately, they don't blame each other. They do postmortems, then hire consultants to lay off senior engineers.

Forced executive churn has been higher than for individual engineers at a lot of my past jobs. Especially for disciplines like marketing/advertising/sales.


Yes, most C-level executives (who often have to report to a board) tend to predict the future after using Claude Code. It didn't happen in 2025, yet they still insist, while their senior engineers are still working on the production code.


Investors are getting impatient


Code has no use-value. It is like being a baker on an island. The value comes from its user base.


User base comes from the value you provide. Value comes from the product features. Features come from code. If code is easy, anyone with 10K bucks in their pocket can provide those features and product. The only thing missing is, is the product battle-tested? That fortunately remains out of reach for AI.


I would say unfortunately out of reach, since so far it seems AI will mostly fill our world with bad code that is not battle-tested.


Yes, with few-shot prompting. You need to provide at least 2 examples of similar instructions and their corresponding solutions. But when you have to build few-shot examples every time you prompt, it feels like you're doing the work already.

Edit: grammar
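To illustrate the few-shot approach described above, here is a minimal sketch of assembling such a prompt. The `Instruction:` / `Solution:` labels and the function name are assumptions for illustration, not a format any particular model requires.

```python
def build_few_shot_prompt(examples, instruction):
    """Format (instruction, solution) pairs, then append the new instruction
    with an empty solution slot for the model to fill in."""
    if len(examples) < 2:
        raise ValueError("provide at least 2 examples")
    parts = [
        f"Instruction: {example_instruction}\nSolution: {solution}"
        for example_instruction, solution in examples
    ]
    parts.append(f"Instruction: {instruction}\nSolution:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Hypothetical examples; real ones would mirror the task at hand.
    shots = [
        ("Reverse a string s", "s[::-1]"),
        ("Uppercase a string s", "s.upper()"),
    ]
    print(build_few_shot_prompt(shots, "Strip whitespace from a string s"))
```

The cost mentioned in the comment is visible here: someone still has to write the solved examples by hand for each kind of task.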


It's an outlier because, unlike other service industries, software has a global presence. An app written by software engineers laid off 3 years ago will still run by itself, with exceptions like servers running out of disk space. Unlike workers in other service industries, tech workers get laid off:

1. When the product has failed to find market fit and the company is not raising any more capital.

2. When the backlog has been cleared.

Tech workers get more pay by job hopping.

