fancy_pantser's comments

I use smaller fonts as well and when I first got an older OLED display with a pixel layout not supported by Windows ClearType, I used BetterClearTypeTuner and later MacType to adjust it. It was leagues better after tweaking a few settings and I'm very happy with text now, even on my AW3425DW, which has an older layout they moved on from in recent generations.



It's noticeable that since covid, SV takes this place dramatically less seriously. It used to be the case that getting to the top here got you a lot of attention in SV, but that hasn't been true for years.


I've never worked anywhere even close to SV, so I don't know how true any of this is.

But the idea that things happen in SV because someone basically got fancy reddit upvotes is sort of concerning.


Wow, my initial guess would have been it's overwhelmingly full of Americans.


my relatives' is always on a sticker under the AP


When OpenAI announced the Triton language, I was worried I'd be confused one day while reading something because of Nvidia's open-source Triton inference server. I made it quite a long time, but it finally happened today! I was so intrigued for the first few pages and then deeply confused.


Mods usually apply [Dupe] to later submissions if a recent (last year or so) one had a fair amount of discussion.


So if mine got no discussion they just allow a new one to be posted?


Sometimes they'll merge the two. What shows up on the FP is hit or miss. One might even say it's stochastic.


I wonder if someone's looked into the optimal time of day and day of the week to post for maximum traction.

If I had to guess, it would be Monday morning Pacific time, when people would rather be doing anything but working.


Surely there are already stats on this, or even a whole paper :P You could pull all the dupe posts over time and see which ones got more traction, etc.
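
For what it's worth, the public Algolia HN Search API makes this pretty easy to poke at. A rough sketch (the URL below is just a placeholder, and this only covers resubmissions of a single link):

    # Pull all submissions of one URL via the Algolia HN API and compare
    # points by posting time (UTC here; convert to Pacific to test the
    # "Monday morning" hypothesis).
    import requests
    from datetime import datetime, timezone

    def submissions_for(url):
        r = requests.get(
            "https://hn.algolia.com/api/v1/search",
            params={"query": url, "restrictSearchableAttributes": "url",
                    "tags": "story", "hitsPerPage": 100},
            timeout=10,
        )
        r.raise_for_status()
        return r.json()["hits"]

    for hit in submissions_for("example.com/some-article"):
        posted = datetime.fromtimestamp(hit["created_at_i"], tz=timezone.utc)
        print(posted.strftime("%a %H:%M"), hit["points"], hit["title"])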


Student: Look, there's a hundred dollar bill on the ground! Economist: No there isn't. If there were, someone would have picked it up already.

Which is to say, it's dangerous to assume this idea has no value just because there are no public implementations.


If the hundred dollar bill was in an accessible place and the fact of its existence had been transmitted to interested parties worldwide, then yeah, the economist would probably be right.


That day the student was the 100th person to pick it up, realize it's fake, and drop it


In my opinion, a refined analogy would be:

Student: Look, a well-known financial expert placed what could potentially be a hundred dollar bill on the ground, and other well-known financial experts just leave it there!


> The team then created VeriGen, the first specialized AI model trained solely to generate Verilog code.

Perhaps it's the first open one. I was an eng manager at a hyperscaler helping one of our clients, a large semiconductor design company, build models for internal use. They were trained on the client's extensive Verilog repos, tooling, and strict style guides. I see this being repeated across industries: since at least 2023, quite a few deep-pocketed S&P 500 orgs have been creating models from scratch or extensively finetuning them to gain the unique advantages they require. These efforts are rarely announced specifically, but you can often infer them from the initial investment or partnership announcements.


> Prices are entirely hidden

Recent legal changes have made pricing more transparent. In 2020, the federal government issued the "Transparency in Coverage" final rule, and the No Surprises Act limited expenses for out-of-network emergency care, among other things. Even more exciting: hospitals and insurers are now required to publish comprehensive machine-readable files covering ALL items and services. They have to provide every negotiated rate and cash price and display "shoppable" services in a consumer-friendly format. The machine-readable files are impractical to process yourself for comparison shopping (picture different formats and horribly de-normalized DB dumps), but many sites and APIs have emerged to scrape them and expose searchable interfaces.
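
If you do want to poke at one of those files yourself, streaming is the only sane approach given the size. A rough sketch (field names follow my reading of the CMS Transparency in Coverage schema and the file path is made up, so real payer files will need adjusting):

    # Stream an insurer's in-network-rates JSON without loading the whole
    # multi-GB file into memory, printing negotiated rates for one code.
    import ijson

    def rates_for_code(path, billing_code):
        with open(path, "rb") as f:
            for item in ijson.items(f, "in_network.item"):
                if item.get("billing_code") == billing_code:
                    for group in item.get("negotiated_rates", []):
                        for price in group.get("negotiated_prices", []):
                            yield price.get("negotiated_rate")

    # e.g. CPT 70450, a head CT without contrast
    for rate in rates_for_code("payer_in_network.json", "70450"):
        print(rate)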


> Attention is quadratic

Exactly. Standard multi-head attention has a score matrix that grows to roughly 4 billion entries for a 64K sequence, just as a starting point. FlashAttention v2 avoids materializing it, but the compute is still quadratic, and as you grow to 128K context length you still need over 1TB/s of memory bandwidth to stay compute-bound in practice, even with that optimization.
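
The back-of-the-envelope math, per head and per layer, if you materialized the score matrix naively:

    seq_len = 64 * 1024                  # 64K tokens
    entries = seq_len ** 2               # one score per (query, key) pair
    print(f"{entries:,} scores")         # 4,294,967,296 (~4.3B)
    print(f"{entries * 2 / 2**30:.0f} GiB at fp16")  # ~8 GiB per head per layer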

So there has been a lot of research in this area, and model architectures released this year are showing some promising improvements. Sliding windows lose context fidelity, and if you go fully linear you sacrifice math, logic, and long multi-turn (agentic) capabilities, so everyone is searching for a good compromise.

MiniMax-M1 uses lightning attention to scale up to 1M context lengths. It's "I/O aware" via tiling and computes attention two ways block-wise (traditional attention within a block, linear attention across blocks), thereby avoiding the speed-inhibiting cumulative summation.
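
Stripped of the decay factors and normalization the real kernel uses, the block-wise idea looks roughly like this (my own toy sketch, not code from the paper):

    import torch

    def blockwise_linear_attention(q, k, v, block=256):
        # q, k, v: (seq_len, d); assumes seq_len is a multiple of `block`
        seq_len, d = q.shape
        out = torch.zeros_like(v)
        kv_state = torch.zeros(d, d, dtype=q.dtype, device=q.device)  # running sum of k^T v
        causal = torch.tril(torch.ones(block, block, dtype=torch.bool, device=q.device))
        for s in range(0, seq_len, block):
            qb, kb, vb = q[s:s+block], k[s:s+block], v[s:s+block]
            intra = (qb @ kb.T).masked_fill(~causal, 0.0) @ vb  # quadratic, but only block x block
            inter = qb @ kv_state                               # contribution from earlier blocks
            out[s:s+block] = intra + inter
            kv_state = kv_state + kb.T @ vb                     # fold this block into the state
        return out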

DeepSeek V3.2 uses DeepSeek Sparse Attention (DSA), which is sub-quadratic because it only computes the "interesting" pairs. For example, at 128K context length only 10-20% of the attention pairs need to be materialized.
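
A toy, non-causal illustration of the selection idea (DeepSeek's indexer is a separately learned, much cheaper scorer; the plain dot product below is itself still quadratic, so this only shows the shape of the trick):

    import torch
    import torch.nn.functional as F

    def topk_sparse_attention(q, k, v, top_k=64):
        # q, k, v: (seq_len, d)
        top_k = min(top_k, k.shape[0])
        idx = (q @ k.T).topk(top_k, dim=-1).indices        # keep only "interesting" keys
        sel_k, sel_v = k[idx], v[idx]                      # (seq_len, top_k, d)
        att = torch.einsum("td,tkd->tk", q, sel_k) / q.shape[-1] ** 0.5
        return torch.einsum("tk,tkd->td", F.softmax(att, dim=-1), sel_v)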

Both Qwen3-Next and Kimi Linear adopt Gated DeltaNet, which combines Mamba2-style gating with the delta rule from DeltaNet. In Qwen3-Next it alternates three Gated DeltaNet (linear attention) layers for every one gated [full] attention layer. The speedup comes from the delta rule, which, hand-wavily, amounts to maintaining a fixed-size memory that gets corrected each step instead of re-attending over the whole history.
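
A minimal recurrent sketch of that delta rule, with the gate and write strength as fixed scalars rather than the learned per-token values the real Gated DeltaNet uses:

    import torch

    def gated_delta_rule(q, k, v, alpha=0.99, beta=0.5):
        # q, k: (seq_len, d_k); v: (seq_len, d_v)
        S = torch.zeros(v.shape[-1], k.shape[-1])   # fixed-size state, independent of seq_len
        outs = []
        for t in range(q.shape[0]):
            kt, vt = k[t], v[t]
            # erase what S currently predicts for kt, then write in the new vt
            S = alpha * (S - beta * torch.outer(S @ kt, kt)) + beta * torch.outer(vt, kt)
            outs.append(S @ q[t])
        return torch.stack(outs)

That fixed-size S is the "cache": memory and per-token cost stay constant no matter how long the context gets.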

There's no universally-adopted solution yet, as these are all pretty heavy-duty compromises, but the search is going strong right now for linear or better attention mechanisms that still perform well.


Software like i2 Analyst's Notebook.

