I use smaller fonts as well, and when I first got an older OLED display with a subpixel layout not supported by Windows ClearType, I used BetterClearTypeTuner and later MacType to adjust it. Text was leagues better after tweaking a few settings, and I'm very happy with it now, even on my AW3425DW, which has an older layout the manufacturer moved on from in recent generations.
It's noticeable how much less seriously SV has taken this place since COVID. It used to be that getting to the top here got you a lot of attention in SV, but that hasn't been the case in years.
When OpenAI announced the Triton language, I was worried I'd be confused one day while reading something because of Nvidia's open-source Triton inference server. I made it quite a long time, but it finally happened today! I was so intrigued for the first few pages and then deeply confused.
If the hundred dollar bill was in an accessible place and the fact of its existence had been transmitted to interested parties worldwide, then yeah, the economist would probably be right.
Student: Look, a well-known financial expert placed what could potentially be a hundred-dollar bill on the ground, and other well-known financial experts are just leaving it there!
> The team then created VeriGen, the first specialized AI model trained solely to generate Verilog code.
Perhaps it's the first open one. I was an eng manager at a hyperscaler helping one of our clients, a large semiconductor design company, build models for internal use. Theirs was trained on their extensive Verilog repos, tooling, and strict style guides. I see this being repeated across industries: since at least 2023, quite a few deep-pocketed S&P 500 orgs have been building models from scratch or finetuning them extensively for the unique advantages they need. These efforts are rarely announced outright, but you can often infer from the initial investment or partnership announcements that they're working on one.
Recent legal changes have made pricing more transparent. In 2020, the federal government issued the "Transparency in Coverage" final rule, followed by the No Surprises Act. Together these limit out-of-network expenses for emergency care, among other things, but even more exciting is that hospitals and insurers are now required to publish comprehensive machine-readable files covering ALL items and services. They have to provide every negotiated rate and cash price, and include a display of "shoppable" services in a consumer-friendly format. The machine-readable files are impractical to process yourself for comparison shopping (picture different formats and horribly de-normalized DB dumps), but many sites and APIs have emerged to scrape them and expose searchable interfaces.
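For a sense of what consuming one of those files looks like, here's a minimal Python sketch. The JSON layout is loosely modeled on the CMS in-network schema, but the field names here are illustrative, and real payer files run to tens of GB, so in practice you'd stream-parse (e.g. with ijson) instead of loading them whole:

    import json

    # Pull negotiated rates for one billing code out of a payer
    # machine-readable file. Field names loosely follow the CMS
    # Transparency in Coverage in-network schema; real files vary by payer.
    def rates_for_code(path, billing_code):
        with open(path) as f:
            mrf = json.load(f)  # toy-sized files only; stream-parse real dumps
        for item in mrf.get("in_network", []):
            if item.get("billing_code") != billing_code:
                continue
            for group in item.get("negotiated_rates", []):
                for price in group.get("negotiated_prices", []):
                    yield {
                        "description": item.get("description"),
                        "rate": price.get("negotiated_rate"),
                        "type": price.get("negotiated_type"),
                    }

    # Example: CPT 70553 is a brain MRI with and without contrast.
    for row in rates_for_code("payer_in_network.json", "70553"):
        print(row)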
Exactly. Standard Multi-Head Attention materializes an attention matrix that grows to roughly 4 billion entries for a 64K sequence (one score per query-key pair), and that's just the starting point. FlashAttention v2 helps, but as you grow to a 128K context length you still need over 1 TB/s of memory bandwidth to stay compute-bound in practice, even with that optimization.
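The arithmetic behind that, as a quick sanity check:

    # Back-of-envelope cost of materializing the full attention matrix.
    seq_len = 64 * 1024                # 64K tokens
    scores = seq_len ** 2              # one score per (query, key) pair
    print(f"{scores / 1e9:.1f}B attention scores")             # ~4.3B

    # At fp16 (2 bytes each) that's ~8 GiB per head per layer if stored
    # naively, which is why FlashAttention never materializes the matrix
    # and instead recomputes tiles in on-chip SRAM.
    print(f"{scores * 2 / 2**30:.1f} GiB per head per layer")  # 8.0 GiB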
So there has been a lot of research in this area, and model architectures released this year are showing some promising improvements. Sliding windows lose context fidelity, and if you go fully linear you sacrifice math, logic, and long multi-turn (agentic) capabilities, so everyone is searching for a good compromise in between.
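To make the sliding-window trade-off concrete, here's a toy causal mask in PyTorch (my own sketch, not any particular model's implementation). Each query can only see the previous `window` keys, so cost grows linearly with sequence length, but anything older than the window is simply unreachable:

    import torch

    def sliding_window_mask(seq_len, window):
        i = torch.arange(seq_len).unsqueeze(1)  # query positions
        j = torch.arange(seq_len).unsqueeze(0)  # key positions
        return (j <= i) & (j > i - window)      # causal AND within the window

    print(sliding_window_mask(seq_len=8, window=3).int())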
MiniMax-M1 uses lightning attention to scale up to 1M context lengths. It's "I/O-aware" via tiling and computes attention two ways block-wise (traditional attention within a block, linear attention across blocks), thereby avoiding the speed-inhibiting cumulative summation of naive linear attention.
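A toy version of the block-wise idea (my sketch of the general scheme; the real kernel adds a per-step decay term and runs fused on-chip):

    import torch

    # Within a block: masked quadratic attention over the block's own pairs.
    # Across blocks: a running k^T v state stands in for all earlier tokens,
    # so no full-sequence cumulative sum is ever needed. No softmax here;
    # this is linear attention.
    def blockwise_linear_attention(q, k, v, block=64):
        seq_len, d = q.shape
        state = torch.zeros(d, d)                      # accumulated k^T v of past blocks
        causal = torch.tril(torch.ones(block, block))  # intra-block causal mask
        out = torch.empty_like(v)
        for s in range(0, seq_len, block):
            qb, kb, vb = q[s:s+block], k[s:s+block], v[s:s+block]
            n = len(qb)
            intra = (qb @ kb.T * causal[:n, :n]) @ vb  # pairs inside this block
            inter = qb @ state                         # contribution of earlier blocks
            out[s:s+block] = intra + inter
            state += kb.T @ vb                         # fold this block into the state
        return out

    q = k = v = torch.randn(256, 32)
    print(blockwise_linear_attention(q, k, v).shape)   # torch.Size([256, 32])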
DeepSeek V3.2 uses DeepSeek Sparse Attention (DSA), which goes sub-quadratic by only computing "interesting" pairs. At a 128K context length, for example, only 10-20% of attention pairs need to be materialized.
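The top-k flavor of that idea, as a toy sketch. DSA's actual indexer is a cheap side network that picks the pairs; here I score with the full q·k product, which defeats the purpose but shows the masking:

    import torch
    import torch.nn.functional as F

    # Keep only the `keep` highest-scoring keys per query and mask the rest
    # before the softmax.
    def topk_sparse_attention(q, k, v, keep=16):
        scores = q @ k.T / k.shape[-1] ** 0.5
        future = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(future, float("-inf"))  # causal mask
        kth = scores.topk(keep, dim=-1).values[:, -1:]      # per-row k-th largest score
        scores = scores.masked_fill(scores < kth, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(128, 32)
    print(topk_sparse_attention(q, k, v).shape)  # torch.Size([128, 32])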
Both Qwen3-Next and Kimi Linear adopt Gated DeltaNet, which combines DeltaNet's delta rule with Mamba2-style gating. Qwen3-Next alternates three Gated DeltaNet (linear attention) layers for every one gated [full] attention layer. The speedup comes from the delta rule, which basically amounts to maintaining a fixed-size cache that each token error-corrects, in a hand-wavy way.
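A toy, token-by-token version of the (gated) delta-rule recurrence; the real layer makes both gates input-dependent and computes this as a chunked parallel scan, but the state update is the heart of it:

    import torch

    # The layer keeps a fixed-size state S instead of a growing KV cache.
    # Each step decays S (Mamba2-style gate alpha) and corrects it toward
    # the new (k, v) pair by its prediction error (the delta rule, with
    # write strength beta); hence "caching, hand-wavily".
    def gated_delta_step(S, k, v, alpha=0.99, beta=0.5):
        v_pred = S.T @ k  # what the state currently recalls for key k
        return alpha * S + beta * torch.outer(k, v - v_pred)

    d_k, d_v = 32, 32
    S = torch.zeros(d_k, d_v)
    for _ in range(16):                   # toy sequence, one token at a time
        k, v = torch.randn(d_k), torch.randn(d_v)
        S = gated_delta_step(S, k, v)
    q = torch.randn(d_k)
    print((S.T @ q).shape)                # read-out for a query: torch.Size([32])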
There's no universally adopted solution yet, as these are all pretty heavy-duty compromises, but the search is going strong right now for linear (or better) attention mechanisms that still perform well.