I always thought the Amiga APIs with the tag lists were cool. You could easily extend the API/ABI w/o breaking anything at the binary level (assuming you made the calls accept tag lists as parameters to begin with, of course).
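For anyone who hasn't seen the pattern, roughly this (a minimal sketch; TagItem/TAG_DONE follow the AmigaOS convention, but the function name and attribute values here are made up for illustration): the callee walks a terminated array of tag/value pairs and skips tags it doesn't know, so new attributes can be added later without breaking old binaries.

    /* Minimal sketch of the tag-list pattern. TagItem/TAG_DONE follow the
       AmigaOS convention, but the function and attribute ids here are
       made up for illustration. */
    #include <stdio.h>

    typedef unsigned long Tag;

    struct TagItem {
        Tag           ti_Tag;   /* which attribute this entry sets */
        unsigned long ti_Data;  /* its value */
    };

    #define TAG_DONE  0UL       /* list terminator */
    #define WA_Width  1UL       /* hypothetical attribute ids */
    #define WA_Height 2UL
    #define WA_Title  3UL       /* added later -- old callers never pass it */

    /* The callee walks the list and ignores unknown tags, so new attributes
       can be added without changing the function signature or the ABI. */
    static void OpenThingTagList(const struct TagItem *tags)
    {
        for (; tags->ti_Tag != TAG_DONE; tags++) {
            switch (tags->ti_Tag) {
            case WA_Width:  printf("width  = %lu\n", tags->ti_Data); break;
            case WA_Height: printf("height = %lu\n", tags->ti_Data); break;
            case WA_Title:  printf("title  = %s\n",
                                   (const char *)tags->ti_Data);    break;
            default:        break;  /* unknown tag: ignore */
            }
        }
    }

    int main(void)
    {
        /* A caller built before WA_Title existed simply never passes it. */
        struct TagItem tags[] = {
            { WA_Width,  640 },
            { WA_Height, 480 },
            { TAG_DONE,  0   },
        };
        OpenThingTagList(tags);
        return 0;
    }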
I did study physics, and our statistical physics lecture only derived thermodynamic laws.
We also had a somewhat shoddy derivation of Newton's Laws from the Schrödinger equation, but it wasn't really satisfactory either, because it doesn't really answer the question of when I can treat things classically.
What I'd really like (and haven't seen so far, but also haven't searched too hard) is the derivation of an error function that tells me how wrong I am to treat things classically, depending on some parameters (like number of particles, total mass, interaction strength, temperature, whatever is relevant).
(Another thing that drove me nuts in our QM classes was that "observations" were introduced as: a classical system couples to a quantum system. Which presupposes the existence of classical systems, without properly defining or delineating them. And this even though QM was supposed to be the more fundamental theory.)
>What I'd really like (and haven't seen so far, but also haven't searched too hard) is the derivation of an error function that tells me how wrong I am to treat things classically, depending on some parameters (like number of particles, total mass, interaction strength, temperature, whatever is relevant).
There are plenty of ways to do this and things like Wigner functions literally calculate quantum corrections to classical systems.
But generally, if you can't even measure a system before its quantum state decoheres, then its quantum status is pretty irrelevant.
I.e. the time it takes for a 1 micrometer wide piece of dust to decohere is ~10^-31 s, while it takes a photon ~3×10^-15 s just to cross its diameter. So it decoheres more than 10^16 times faster than a photon could even cross it.
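(To make the Wigner-function remark above concrete, here is the standard textbook construction, not something from this thread: the wavefunction is mapped to a phase-space quasi-probability whose time evolution is the classical Liouville flow plus corrections entering at order ħ², which is one way to quantify how wrong a classical treatment is.)

    % Wigner quasi-probability distribution built from the wavefunction
    W(x,p) = \frac{1}{\pi\hbar} \int_{-\infty}^{\infty}
             \psi^{*}(x+y)\,\psi(x-y)\, e^{2ipy/\hbar}\, dy

    % For H = p^2/2m + V(x) its evolution is the classical Poisson-bracket
    % (Liouville) flow plus quantum corrections involving hbar^2 and third
    % and higher derivatives of the potential:
    \frac{\partial W}{\partial t} = \{H, W\}_{\mathrm{PB}}
        + \mathcal{O}\!\left(\hbar^{2}\,\partial_x^{3}V\,\partial_p^{3}W\right)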
The error is usually taken as the ratio of the wavelength to your desired precision, but in general it depends on your use case: sometimes you have full precision all the way down, sometimes you have insufficient precision even at astronomical scales. Quantum physics doesn't have an absolute scale cutoff.
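(Making the "ratio of wavelength to precision" concrete with standard formulas, my addition: for a massive object the relevant wavelength is the thermal de Broglie wavelength, which shrinks with mass and temperature, so a rough classicality parameter is)

    % de Broglie wavelength for momentum p, and its thermal version
    \lambda = \frac{h}{p}, \qquad
    \lambda_{\mathrm{th}} = \frac{h}{\sqrt{2\pi m k_{B} T}}

    % rough error parameter: the classical treatment is safe when
    \epsilon \sim \frac{\lambda_{\mathrm{th}}}{d} \ll 1
    % with d the length scale you care about (your desired precision, or
    % the interparticle spacing n^{-1/3} for a gas of number density n)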
OP is referring to "Credit based flow control", which is a way to ensure a sender does not overwhelm a receiver with more data than it can handle.
Usually the sender can transmit at line rate, but if the other side is slow for whatever reason (say the consumer is not draining data), you wouldn't want the sender to continue sending data.
If you also have N hosts sending data to 1 host, you would need some way of distributing the bandwidth among the N hosts. That's another scenario where the credit system comes in. Think of it as admission control for packets, so as to guarantee that no packets are lost. Congestion control is a looser form of admission control that tolerates lossy networks by retransmitting packets should they be lost.
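A toy sketch of the receiver-granted-credit idea (illustrative only, not the protocol from the article): the receiver hands out credits matching the buffer space it actually has, and the sender may only transmit while it holds credits, so the receiver can never be overrun.

    /* Toy credit-based flow control (illustrative, not the article's wire
       protocol). One credit == permission to send one packet into buffer
       space the receiver has already reserved for us. */
    #include <stdio.h>

    #define RX_BUFFER_PKTS 8              /* receiver can buffer 8 packets */

    struct receiver { int free_slots; };  /* unpromised buffer slots       */
    struct sender   { int credits;    };  /* packets we may still send     */

    /* Receiver grants credits only for space it really has. */
    static int grant_credits(struct receiver *rx)
    {
        int granted = rx->free_slots;
        rx->free_slots = 0;               /* that space is now promised    */
        return granted;
    }

    /* Sender transmits only while it holds credits; excess is held back. */
    static void send_burst(struct sender *tx, int want)
    {
        while (want > 0 && tx->credits > 0) {
            tx->credits--;
            want--;
            printf("sent packet (credits left: %d)\n", tx->credits);
        }
        if (want > 0)
            printf("%d packets held back, waiting for credits\n", want);
    }

    int main(void)
    {
        struct receiver rx = { .free_slots = RX_BUFFER_PKTS };
        struct sender   tx = { .credits = 0 };

        tx.credits += grant_credits(&rx);  /* initial credit grant         */
        send_burst(&tx, 12);               /* only 8 of 12 go out          */

        rx.free_slots += 4;                /* consumer drained 4 packets   */
        tx.credits += grant_credits(&rx);  /* receiver returns 4 credits   */
        send_burst(&tx, 4);
        return 0;
    }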
They'd respond to your kind words, but there are two faulty cables in their token ring network, and as such, no redundant paths for the beacon frame to get through.
It's also a bit odd that they do not implement congestion control. Congestion control is fundamental unless you only have point-to-point data transfers, which is rarely the case. An all-reduce operation during training requires N-to-1 data transfers. In these scenarios the sender needs to control its data transfer rate so that it doesn't overwhelm not just the receiver but also the network... if this is not done, it will cause congestion collapse (https://en.wikipedia.org/wiki/Network_congestion#:~:text=ser...).
Current public SOTA seems to be “no congestion control”
> We proceeded without DCQCN for our 400G deployments. At this time, we have had over a year of experience with just PFC for flow control, without any other transport-level congestion control. We have observed stable performance and lack of persistent congestion for training collectives.
I probably shouldn't be commenting because I don't have any experience at this level, but given it's a closed system where they control supply and demand, it seems they could manage away most congestion issues with scheduling/orchestration. They still have primitive flow control in the protocol, and it seems like you could create something akin to a virtual sliding window just by instrumenting the retransmits.
But now I am curious what the distribution of observed window sizes is in the wild.
Edit: I'd bet the simpler protocol is more vulnerable to various spoofing attacks though.
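For what that "virtual sliding window from retransmits" might look like, a purely hedged sketch with made-up names (basically AIMD driven by retransmit counts, not anything in the actual protocol):

    /* Hedged sketch: treat observed retransmits as the congestion signal
       and apply additive-increase/multiplicative-decrease to an in-flight
       limit. Constants and names are made up for illustration. */
    #include <stdio.h>

    struct vwindow {
        double limit;                 /* max packets allowed in flight */
        double min_limit;
        double max_limit;
    };

    /* Called once per feedback interval with the retransmit count seen. */
    static void vwindow_update(struct vwindow *w, int retransmits)
    {
        if (retransmits > 0)
            w->limit /= 2.0;          /* multiplicative decrease */
        else
            w->limit += 1.0;          /* additive increase       */

        if (w->limit < w->min_limit) w->limit = w->min_limit;
        if (w->limit > w->max_limit) w->limit = w->max_limit;
    }

    int main(void)
    {
        struct vwindow w = { .limit = 16, .min_limit = 1, .max_limit = 256 };
        int retx[] = { 0, 0, 0, 3, 0, 1, 0, 0 };   /* sample feedback */
        for (int i = 0; i < 8; i++) {
            vwindow_update(&w, retx[i]);
            printf("interval %d: retx=%d -> window=%.1f\n", i, retx[i], w.limit);
        }
        return 0;
    }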
In principle, with perfect knowledge of flows at any given instant, you can assign credits/rate-of-transmission for each flow to prevent congestion. But in practice this is somewhat nuanced to build, and there are various tradeoffs to consider: what happens if the flows are so short that coordinating with a centralised scheduler incurs a latency overhead comparable to the flow duration? There's been research demonstrating that one can strike a sweet spot, but I don't think it's practical, nor has it really been deployed in the wild. And of course, this scheduler has to be made reliable, as it's a single point of failure.
Such ideas are, however, worth revisiting when the workload is unique enough (in this case, it is) and the performance gains are big enough...
Maybe the protocol could have arbitration built in? If you were clever, you could actually have the front of the packet carry a priority header, and build the collision detection/avoidance right into the header.
Multiple parties communicating at the same time? The lower priority number could electrically pull the voltage low, dominating the transmission.
That way, priority messages always get through with no overhead or central communication required.
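That's essentially how CAN bus arbitration works: a 0 ("dominant") bit electrically overrides a 1 ("recessive") one, and a transmitter that reads back a bit it didn't send drops out, so the lowest ID wins with no central arbiter. A toy simulation of the wired-AND behavior (names and sizes made up):

    /* Toy simulation of dominant-low bitwise arbitration (CAN-bus style).
       Each sender drives its priority ID MSB-first onto a wired-AND bus,
       so a 0 overrides a 1; a sender that reads back a bit it didn't send
       has lost and backs off. Lowest ID wins without a central arbiter. */
    #include <stdio.h>

    #define ID_BITS 8

    static int arbitrate(const unsigned ids[], int n)
    {
        int active[16];
        for (int i = 0; i < n; i++) active[i] = 1;

        for (int bit = ID_BITS - 1; bit >= 0; bit--) {
            /* Bus level = AND of all bits driven by still-active senders. */
            int bus = 1;
            for (int i = 0; i < n; i++)
                if (active[i] && ((ids[i] >> bit) & 1) == 0)
                    bus = 0;                      /* dominant 0 wins */

            /* Anyone who sent recessive 1 but sees dominant 0 drops out. */
            for (int i = 0; i < n; i++)
                if (active[i] && ((ids[i] >> bit) & 1) == 1 && bus == 0)
                    active[i] = 0;
        }

        for (int i = 0; i < n; i++)
            if (active[i]) return i;              /* surviving sender */
        return -1;
    }

    int main(void)
    {
        unsigned ids[] = { 0x5A, 0x3C, 0x3D };    /* three senders collide */
        int winner = arbitrate(ids, 3);
        printf("sender %d (id 0x%02X) wins the bus\n", winner, ids[winner]);
        return 0;
    }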
Yep, such ideas have been around. But congestion is a fundamental problem. Admission control is the only way to ensure there is no congestion collapse.
The technical issue is that you would need global arbitration to ensure that the _goodput_ (useful bytes delivered) is optimal. With training across 32k GPUs and more these days, global arbitration to ensure the correct packets are prioritised is going to be very difficult. If you are sending more traffic than the receiver's link capacity, packets _will_ get dropped, and it's suboptimal to transmit packets that will only be dropped, as they waste link capacity elsewhere (upstream) within the network.
The "good" ones I know of tend to be smaller deals where the PE firm is pursuing a buy-and-hold approach in a sector where it has some knowledge and advantage. For example, Summa Equity bought Norsk Gjenvinning in Norway (a recycling company), refocused it on circular economy practices, and by most metrics it seems to be doing well for everyone. The Keiretsu companies an Japan are another example of the process working fairly well.
On the flip side, the high-profile "pump and dump" PE deals almost always seem to be bad for everyone but the PE firm and shareholders.
Usually my rule of thumb at work is that if a key SaaS vendor gets bought by private equity, we start planning a migration project. We just about always wind up needing one, as the private equity firm tries to milk customers for all they're worth while degrading service.
I worked at B&N for 4 years (2 before and 2 after the sale), and it actually has been pretty good so far for the business. The stores have been given much more creative control over what they stock, so each one takes advantage of what's popular at that location (my location was known for our manga and young adult variety).
There have been some bad moves taken at the corporate level (cutting the book procurement team, massive reductions in corporate headcount, etc.). This has forced the stores to be a lot more self-reliant and has increased workloads. Pay kinda increased, but is still way below the average for retail.
Ultimately, it's good for the business, but it's not as great of a place to work at anymore. When I started, the average non-managerial employee tenure was 6 years, now it's only 2 years there.
Quite possibly, though the last time I was in B&N, earlier this year, I found it a fairly disappointing experience (though given how much they've shifted shelf space to current hot genres, it's likely they are succeeding with a much different audience than in the past).
On Semiconductor was spun out of Motorola at the height of the dot com bubble. Their primary business at the time was commodity parts, a capital intensive and cyclical industry.
At their lowest, TPG became a majority shareholder and strongly influenced operations. The initial plan was a quick fix, but TPG saw opportunity and helped OnSemi make the jump up the component food chain.
A major part of this was the low cost structure that TPG drove. This then allowed for cheaper credit options which freed up money for M&A.
Hilton was in trouble after the 2008 crisis. Blackstone purchased them as a lifeline. Under Blackstone, Hilton reorganized itself, and then Blackstone sold them off.
> to have a tool that is external to the VM (runs on the hypervisor host) that essentially has "read only" access to the kernel running on the VM to provide visibility into what's running on the machine without an agent running within the VM itself
They also disallow updates to MV2 extensions in January, which effectively kills them off: "Chrome Web Store stops accepting updates to existing Manifest V2 extensions"
True, but you can issue updates to existing extensions with MV2. As someone who has two MV2 extensions, this transition is a much bigger deal for existing extensions than new ones.