Hacker News | mikrl's comments

Bet your shirt AND the farm from the comfort of your phone

I started dressing nice at work, reasoning that looking sharp would buy me a few seconds or minutes of grace to allow my social deficiencies to catch up - just in case an executive decided to ask me a question.

Of course, that never happened for months, years until the one day I went in wearing cargo pants and a gothy synth band shirt and was greeted by a delegation of executives from out of town engaging everyone in small talk…


I worked for a downtown firm for a while which loosened up dress code a little bit so I didn’t always wear my jacket in—though cargo pants and rock T would definitely have led to an HR meeting. One day I had to borrow a jacket from someone when I had to go to a nearby studio for a TV interview:-)

And I don’t wanna talk to a scientist

Y’all MFs unable to address the replication crisis, and getting me pissed


Great article. Personally I have been learning more about the mathematics of beyond-CLT scenarios (fat tails, infinite variance, etc.)

The great philosophical question is why CLT applies so universally. The article explains it well as a consequence of the averaging process.

Alternatively, I’ve read that natural processes tend to exhibit Gaussian behaviour because there is a tendency towards equilibrium: forces, homeostasis, central potentials and so on. This equilibrium drives the measurable quantity into the central region.

For processes such as prices in financial markets, with complicated feedback loops and reflexivity (in the Soros sense), the probability mass tends to end up in the non-central region, where the CLT does not apply.
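A quick numpy sketch of the infinite-variance point (my own toy example, not from the article): average many Cauchy draws and the sample mean never settles down, while the finite-variance case concentrates at the 1/√n rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite variance: 1000 sample means of 10,000 N(0,1) draws each.
# They concentrate around 0 with spread ~ 1/sqrt(10_000) = 0.01.
normal_means = rng.normal(0, 1, size=(1000, 10_000)).mean(axis=1)

# Infinite variance: the mean of n iid Cauchy draws is itself standard
# Cauchy, so averaging buys nothing and the CLT never kicks in.
cauchy_means = rng.standard_cauchy(size=(1000, 10_000)).mean(axis=1)

print(np.std(normal_means))                     # ~0.01
print(np.percentile(np.abs(cauchy_means), 90))  # still order 1-10
```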


The key principle is that you get CLT when a bunch of random factors add. Which happens in lots of places.

In finance, the effects of random factors tend to multiply. So you get a log-normal curve.

As Taleb points out, though, the underlying assumptions behind log-normal break in large market movements. Because in large movements, things that were uncorrelated, become correlated. Resulting in fat tails, where extreme combinations of events (aka "black swans") become far more likely than naively expected.
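A toy numpy simulation of that additive-vs-multiplicative split (my own illustration; the factor sizes are arbitrary): the same small random factors give a symmetric, Gaussian-looking sum but a right-skewed, log-normal-looking product.

```python
import numpy as np

rng = np.random.default_rng(42)
# 50,000 trials of 100 small random factors around 1.
factors = rng.uniform(0.9, 1.1, size=(50_000, 100))

sums = factors.sum(axis=1)       # additive: approximately normal
products = factors.prod(axis=1)  # multiplicative: approximately log-normal

def skew(x):
    z = (x - x.mean()) / x.std()
    return float((z**3).mean())

print(skew(sums))              # ~0: symmetric
print(skew(products))          # clearly positive: the log-normal right tail
print(skew(np.log(products)))  # ~0 again: log of a product is a sum of logs
```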


Some correlations are fine though; there are versions of the CLT that apply even when there are benign correlations.

https://en.wikipedia.org/wiki/Central_limit_theorem#Dependen...

I know you know that and were just simplifying. Just wanted this fact to be better known for practitioners. Your comment on multiplicative processes is spot on.

I say more here

https://news.ycombinator.com/item?id=47437152

It's a bit of a shame that these other limiting distributions are not as tractable as the Gaussian.
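As a concrete sketch of the dependent case (my own numpy toy; the phi and the sizes are arbitrary): an AR(1) process has geometrically decaying correlations, comfortably inside the mixing-CLT regime, so sample means are still asymptotically normal, just with a "long-run" variance instead of the iid one.

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1_mean(n, phi=0.5):
    """Sample mean of an AR(1) path x_t = phi * x_{t-1} + eps_t."""
    eps = rng.normal(size=n)
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x.mean()

means = np.array([ar1_mean(1_000) for _ in range(1_000)])

# Correlations decay like phi**k, so a mixing CLT applies: sqrt(n) * mean
# is asymptotically normal with long-run variance 1 / (1 - phi)^2 = 4,
# i.e. std of the mean ~ 2 / sqrt(1000) ~ 0.063, not the iid 0.036.
print(means.mean())  # ~0
print(means.std())   # ~0.063
```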


Absolutely. The effect of straightforward correlations is a change in the variance, which can be measured in finance.

The effect of the nonlinear changing correlations is that future global behavior can't be predicted from local observations without a very sophisticated model.


As to ye philosophy of “why” the CLT gives you normals, my hunch is that it’s because there’s some connection between:

a) the CLT requires samples drawn from a distribution with finite mean and variance

and b) the Gaussian is the maximum entropy distribution for a particular mean and variance

I’d be curious what happens if you start making assumptions about higher-order moments in the distro


The standard framing defines the Gaussian as this special object with a nice PDF, then presents the CLT as a surprising property it happens to have. But convolution of densities is the fundamental operation. If you keep convolving any finite-variance distribution with itself, the shape converges, and we called the limit "normal." The Gaussian is a fixed point of iterated convolution under √n rescaling. It earned its name by being the thing you inevitably get, not by having elegant closed-form properties.

The most interesting assumptions to relax are the independence assumptions. They're way more permissive than the textbook version suggests. You need dependence to decay fast enough, and mixing conditions (α-mixing, strong mixing) give you exactly that: correlations that die off let the CLT go through essentially unchanged. Where it genuinely breaks is long-range dependence: fractionally integrated processes, Hurst parameter above 0.5, where autocorrelations decay hyperbolically instead of exponentially. There the √n normalization is wrong, you get different scaling exponents, and sometimes non-Gaussian limits.

There are also interesting higher order terms. The √n is specifically the rate that zeroes out the higher-order cumulants. Skewness (third cumulant) decays at 1/√n, excess kurtosis at 1/n, and so on up. Edgeworth expansions formalize this as an asymptotic series in powers of 1/√n with cumulant-dependent coefficients. So the Gaussian is the leading term of that expansion, and Edgeworth tells you the rate and structure of convergence to it.
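The decay rates are easy to see in a simulation (my own sketch; I picked the exponential because its skewness is exactly 2): the skewness of a standardized mean of n draws tracks the 2/√n Edgeworth prediction.

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_skew(n, reps=100_000):
    """Skewness of the standardized mean of n iid Exponential(1) draws."""
    m = rng.exponential(size=(reps, n)).mean(axis=1)
    z = (m - m.mean()) / m.std()
    return float((z**3).mean())

# Exponential(1) has skewness 2, so the mean of n should show 2 / sqrt(n).
results = {n: mean_skew(n) for n in (4, 16, 64)}
for n, s in results.items():
    print(n, round(s, 3), "vs", round(2 / np.sqrt(n), 3))
```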


It is the not knowing, the unknown unknowns and known unknowns which result in the max entropy distribution's appearance. When we know more, it is not Gaussian. That is known.

Exactly this. From this perspective, the CLT then can be restated as: "it's interesting that when you add up a sufficiently large number of independent random variables, then even if you have a lot of specific detailed knowledge about each of those variables, in the end all you know about their sum is its mean and variation. But at least you do reliably know that much."

Came here basically looking to see this explanation. Normal dist is [approximately] common when summing lots of things we don't understand, otherwise, it isn't really.

IIRC there's a video by 3b1b that talks about that, and it is important that Gaussians are closed under convolution.

That makes it an equilibrium point in function space, but the other half is why it's a global attractor.

There must be a contractive nature in "passing to the limit". And then Brouwer's fixed point theorem.

(I know it is very easy to do "maths" this way).


IIRC the third moment defines a maxent distribution under certain conditions, and with a fourth moment it becomes undefined? It's been a while though.

If I'm remembering it correctly it's interesting to think about the ramifications of that for the moments.


You (and others) may enjoy going down the rabbit hole of universality. Terence Tao has a nice survey article on this which might be a good place to start: https://direct.mit.edu/daed/article/141/3/23/27037/E-pluribu...

>natural processes tend to exhibit Gaussian behaviour

To me it results from two factors: 1. the Gaussian is the max-entropy distribution for a given variance, and 2. variance is the model of energy-limited behaviour, whereas physical processes are always under some energy limit. Basically it is the 2nd law.
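That max-entropy claim can be spot-checked numerically (my own sketch; histogram entropy is a crude estimator but fine for the ordering): among distributions with the same variance, the Gaussian's differential entropy comes out highest.

```python
import numpy as np

rng = np.random.default_rng(3)

def entropy(samples, bins=200):
    """Histogram estimate of differential entropy, -sum(p * log p) * dx."""
    p, edges = np.histogram(samples, bins=bins, density=True)
    dx = edges[1] - edges[0]
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)) * dx)

n = 400_000
ent = {
    # All three scaled to variance 1.
    "gaussian": entropy(rng.normal(0, 1, n)),
    "laplace":  entropy(rng.laplace(0, 1 / np.sqrt(2), n)),
    "uniform":  entropy(rng.uniform(-np.sqrt(3), np.sqrt(3), n)),
}
print(ent)  # gaussian ~1.42 > laplace ~1.35 > uniform ~1.24
```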


That’s correct, but a better explanation is this: https://youtu.be/AwEaHCjgeXk?si=tV72uauquCHvzkNE

>are not competitive in the consumer space

AFAIK they still dominate on clock rate, which I was surprised to see when doing some back of the envelope calculations regarding core counts.

I felt my 8 core i9 9900K was inadequate, so shopped around for something AMD, and IIRC the gain in core count of the chip I found was outweighed by its lower clock rate, so it’s possible that at full utilization my i9 is still towards the best I can get at the price.

Not sure if I’m the typical consumer in this case however.


Your 9900K at 5 GHz does work more slowly than a Ryzen 9800X3D at 5 GHz. A lot slower (1700 single-core Geekbench vs 3300, and just about any benchmark will tell the same story). Clock speed alone doesn't mean anything.

From the newegg listing:

>8 Cores and 16 processing threads, based on AMD "Zen 5" architecture

which is the same thread geometry as my 9900K.

My main concerns at the time were:

1. More cores for running large workloads on k8s since I had just upgraded to 128G RAM

2. More thread level parallelism for my C++ code

Naively I thought that, ceteris paribus and assuming good L1 cache utilization, having more physical cores with a higher clock rate would be the ticket for 2.

Does the 9800X3D have a wider pipeline or is it some other microarchitectural feature that makes it faster?


Comparing CPUs by clock speed doesn’t work. New CPUs do more work per clock cycle.

A 9800X3D is twice as fast as your 9900K in benchmarks like GeekBench, despite having similar clock speed and the same core count.

If you could downclock the AMD part to 2.5GHz as an experiment it would still beat your 5GHz 9900K.


You don't even need to go into the pipeline details. The 9800X3D has 8x more L2 cache, 6x more L3 cache, and 2x the memory bandwidth of the now 8-year-old i9 9900K. 3D V-Cache is pretty cool.

I purposely picked a CPU with the same thread geometry as your 9900K to avoid calls of "apples & oranges" or whatever. If you want more threads, the 9950X is right there in the same socket. Or Core Ultra 9 285k. Either of which will run circles around a 9900K in code compilation.

You can research microarchitecture differences if you want, it's a fascinating world, or you can just skip to looking at benchmarks/reviews. Little hard to compare against quite that large of a generation gap, but eg https://gamersnexus.net/cpus/rip-intel-amd-ryzen-7-9800x3d-c... or https://www.phoronix.com/review/amd-ryzen-7-9800x3d-linux/2


The 9800X3D has wider everything. Decoder, execution ports, vectors, cache, memory bandwidth...

I think my i9 was released right after the Spectre and Meltdown mitigations in 2019, but I seem to remember even more recent vulns in that family… so that could also be a factor.

A 9700X is twice the performance of a 9900K and M5 Max is almost 3X the performance. The megahertz myth is a myth.

I replied to the sibling comment: I was making simplifying assumptions for two specific use cases and naively treated physical cores and clock rate as my variables.

Yes, but core count and clock speed of a nearly 10 year old CPU are meaningless when comparing to current processors.

But why? That's like trying to determine which car is faster by looking only at the RPM.

I was doing similar by capturing XHR requests while clicking through manually, then asking codex to reverse engineer the API from the export.

Never tried that level of autonomy though. How long is your iteration cycle?

If I had to guess, mine was maybe 10-20 minutes over a few prompts.
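For anyone reproducing this workflow: the browser's network tab can export the captured XHR traffic as a HAR file (plain JSON, standard HAR layout), and a few lines of Python turn it into an endpoint inventory to hand the agent. The sample data below is made up for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlparse

def endpoint_counts(har: dict) -> Counter:
    """Group a HAR capture into METHOD + path counts."""
    counts = Counter()
    for entry in har["log"]["entries"]:
        req = entry["request"]
        counts[f'{req["method"]} {urlparse(req["url"]).path}'] += 1
    return counts

# In practice: har = json.load(open("capture.har")) from the devtools export.
# Hypothetical capture for illustration:
sample = {"log": {"entries": [
    {"request": {"method": "GET",  "url": "https://example.com/api/items?page=1"}},
    {"request": {"method": "GET",  "url": "https://example.com/api/items?page=2"}},
    {"request": {"method": "POST", "url": "https://example.com/api/login"}},
]}}

for endpoint, count in endpoint_counts(sample).most_common():
    print(count, endpoint)
```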


>The movie did have an unfortunate eugenic implication

You’re thinking of dysgenic, not eugenic.

Gattaca is a movie about eugenics.


Making funny memes of my friends mainly. ChatGPT won’t touch that, I haven’t tried with Claude yet, but grok keeps the group chat flush with laughing emojis.

That’s all I use it for really- things out of alignment with the other platforms- which IMO are better on every other metric (except having a sense of humour of course)


I love my friends enough that the memes I make for them are hand-crafted.


Hey I’m all grown up now, just don’t have the time to meticulously touch pixels in MS Paint like back in the day


Perhaps the lesson here is upgrade your use case for AI's! All that power and that's your stumbling block? LOL, no disrespect.

Sure, I have no problem with what you're doing, and as things evolve I'm sure there'll be no problem, but there's countless other apps designed to do exactly what you've said.


As a Canadian I strongly felt it was GG to the Democrats when they didn’t run a second, competitive, knives-out primary for VP Harris.

For the second time, the party apparatus coalesced around a candidate who was ultimately trounced by someone wrongly considered unelectable.

Even if it was just theatre in the end, having a dramatic primary where the VP won would have made her look stronger and given her a chance to claw back some of the swing voters.


Or could have made her look worse because of the mud slinging between the candidates in the primary debates. You know that any criticism of a candidate by her competitors would have been trumpeted and distorted by Trump.


It feels like my era of education, 2012-2020 (a couple of degrees over that time), really deemphasized perf tuning; I even heard a few times that it was practically useless these days.

I had a computer organization course that came close but mostly just described microarchitecture and its historical development, not so much the practical ways to exploit it.

Actually taking the time to sit down and poke around with techniques was mind blowing. I grew up during the golden age of CPU and OS advancements 90s-00s and the rush from seeing ‘instructions per cycle’ > 1 captured a bit of that magic that CRUD app dev and wrangling k8s just doesn’t have.


It's not worth optimising if you don't have a problem. Focus your effort.

If this code runs once per week at midnight, needs to finish by 5am, and currently it takes 18 minutes, the fact it could take 40 seconds isn't actually important and so spending meaningful engineering effort to go from 18 minutes to 40 seconds is a waste.

On the other hand, if the code runs on every toaster when it's started and ideally would finish before the toast pops up, but currently takes 4 minutes, then even getting it down to 2.5 minutes will make more customers happy [also, why the fuck are we running software in the toaster? But that's beside the point] and might well be worth doing.

The classic UX examples given are much closer to the latter category. When I type fast, the symbols ought to appear immediately, for example; if you can't do that then you have a performance problem and optimisation is appropriate. But so much of what software engineers do all day isn't in that space and doesn't need to prioritise performance, so optimisation shouldn't be a priority.

In particular Fast but Wrong is just Wrong. https://x.com/magdraws/status/1551612747569299458


>so much of what software engineers do all day isn't in that space

Seems to me that critical infra that supports a lot of modern computing is in that space though.

If you want to develop that depth of knowledge you need to go into HPC/scientific, trading or accelerator hardware. I didn’t get into this sometimes crazy industry to NOT learn stuff and push the limits of my computer.

I’m glad I know about those applications now, but I wonder how much of a disservice we did to the industry by just focusing on frameworks and abstraction especially now that you can just sling a lot of that out with a prompt…


In the era of kubernetes and edge servers and everything running on battery power, that distinction between need and want becomes much fuzzier because of course we can bin pack the more efficient one better or preserve another five minutes of standby time even if the wall clock behavior is moot.

And I’d also argue that if you wait to use a skill only until the need is dire then you will be both 1) shit at doing it and fail to achieve your goal well and 2) won’t have spent enough time on the cost/benefit analysis to know when things have changed over from want to need. Like the blind people I allude to in my top level.


I went to a top ten school. I had one semester of circuit design, one of EE, and a couple of computer architecture that went over MIPS and writing assembly.

I think there was some sort of transition of curriculum going on with the introductory classes though, because the difficulty from one homework assignment to the next that first year of CS 1XX classes was pretty choppy. A friend and I made a game of one-upmanship, adding our own constraints to the easier assignments to make them more interesting. Like taking larger inputs than the requirements and counting execution time.

When I left school, the application at my first job was glacially slow, and I learned half of what I know about optimization in a short stint there through trial and error. It was a couple jobs in before I ever got pushback and had to learn the human factors element. But it (the optimization balanced against readability, robustness, extensibility) was a way I have always made pedestrian work more interesting. There are whole classes of code smells that also contain performance penalties, and at the peak of my restlessness I needed those to keep my sanity without irritating coworkers. I’m just cleaning up this messy code, nothing to see here.

Reading release notes for other tools bragging on their improvements. Dev tools and frameworks are more forthcoming about how and what than consumer apps, but there are standouts from time to time. I read a ton of SIGPLAN proceedings during that era. Fortune favors the prepared mind and you look a lot smarter when you’re confronting a problem or opportunity with a primed pump rather than coming in cold (being friendly with other disciplines in your company also helps there).

