In the general case? Because the exponential distribution follows directly from a Poisson process, and Poisson processes are ubiquitous in nature. A Poisson process produces Poisson-distributed counts of events over fixed periods of time and exponentially distributed times between events. They are used to model everything from rainfall to particle physics, and those concepts carry over fairly neatly to things that are obviously more complicated, like human dynamics.
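To make the link between the two distributions concrete, here is a minimal sketch in Python (standard library only) that draws exponentially distributed gaps between events and counts how many events land in each fixed window. The rate, window length, and seed are arbitrary values chosen for illustration. If the counts are Poisson distributed, their mean and variance should both come out close to rate × window.

```python
import random
import statistics

def simulate_counts(rate, window, n_windows, seed=0):
    """Count events per window when gaps between events are exponential."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_windows):
        # Memorylessness lets us restart the clock at the start of each window.
        t = rng.expovariate(rate)  # time of the first event in this window
        n = 0
        while t < window:
            n += 1
            t += rng.expovariate(rate)  # exponential gap to the next event
        counts.append(n)
    return counts

# Illustrative parameters: 3 events per unit time, unit-length windows.
counts = simulate_counts(rate=3.0, window=1.0, n_windows=20000)
mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)

# For a Poisson distribution the mean and variance are equal.
print(f"mean count per window: {mean_count:.3f}")
print(f"variance of counts:    {var_count:.3f}")
```

Equal mean and variance is only a quick sanity check, not a full goodness-of-fit test.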
In the case of memory-bound code benchmarking? Because memory access latencies in NUMA architectures with tiered caches are approximately exponential: L1 vs L2 vs L3 vs main memory vs swap each bring a roughly exponential increase in latency. It's obviously more of a step function than a continuous function, but the relationship is definitely exponential.
The Wikipedia pages for the exponential distribution and Poisson distribution are also informative and interesting.
https://en.wikipedia.org/wiki/Exponential_distribution#Occur...
https://en.wikipedia.org/wiki/Poisson_distribution#Occurrenc...
When I go through the process of finding an appropriate distribution, I usually first look through Wikipedia to see if I can find conceptual parallels, just like this. That doesn't mean I settle on it. I'll usually test the distribution fit with functions like those found in fitdistrplus, and also check whether the residuals are roughly normally distributed.
One hard and fast rule that I use, though, is that I never use an unbounded distribution to model a bounded process. For compute times, you can't have negative latencies, and you can't have zero latencies, but you can have arbitrarily long (effectively infinite) latencies. Therefore, I would only consider distributions supported on (0, infinity). The log-normal distribution fits this and is probably adequate here, but the gamma distribution might be a smidge better.
https://cran.r-project.org/web/packages/fitdistrplus/vignett...
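As a rough illustration of comparing candidate distributions on (0, infinity), here is a sketch in Python using only the standard library. It is not the fitdistrplus workflow (that package is R); it is a hand-rolled stand-in. The "latencies" are synthetic values drawn from a gamma distribution purely so the example runs, and the gamma fit uses a common closed-form approximation to the maximum-likelihood shape rather than an exact solver. All function names here are made up for the example.

```python
import math
import random

def fit_lognormal(xs):
    """MLE for the log-normal: mean and standard deviation of log(x)."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def loglik_lognormal(xs, mu, sigma):
    """Log-likelihood of xs under a log-normal(mu, sigma)."""
    return sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )

def fit_gamma(xs):
    """Approximate MLE for the gamma shape k and scale theta."""
    n = len(xs)
    mean = sum(xs) / n
    # s > 0 whenever the data are not all identical (Jensen's inequality).
    s = math.log(mean) - sum(math.log(x) for x in xs) / n
    k = (3 - s + math.sqrt((s - 3) ** 2 + 24 * s)) / (12 * s)
    return k, mean / k

def loglik_gamma(xs, k, theta):
    """Log-likelihood of xs under a gamma(shape=k, scale=theta)."""
    return sum(
        (k - 1) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)
        for x in xs
    )

# Synthetic stand-in for measured latencies (assumption: NOT real benchmark data).
rng = random.Random(42)
latencies = [rng.gammavariate(2.0, 5.0) for _ in range(5000)]

mu, sigma = fit_lognormal(latencies)
k, theta = fit_gamma(latencies)

# Both models have two parameters, so AIC = 2*2 - 2*loglik; lower is better.
aic_lognormal = 4 - 2 * loglik_lognormal(latencies, mu, sigma)
aic_gamma = 4 - 2 * loglik_gamma(latencies, k, theta)

print(f"log-normal: mu={mu:.3f} sigma={sigma:.3f} AIC={aic_lognormal:.1f}")
print(f"gamma:      k={k:.3f} theta={theta:.3f} AIC={aic_gamma:.1f}")
```

Because the synthetic data were generated from a gamma distribution, the gamma fit should win here. On real timings the comparison could go either way, which is the point of checking rather than assuming.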