What you suggested has been done before - you might find a review of the literature fun if this sort of thing interests you, even if academic papers are pretty dry reading normally.
It wasn't labor costs that made the Roomba twice the price of a Roborock, but manufacturing costs. Roomba teardowns basically show that iRobot was just really bad at cost-optimizing their vacuums for mass production.
There's no way to get through the harder courses in the program on 1 hour a day. And you're not getting value from the degree if you aren't pushing yourself to take those hard courses, unless you just need the diploma.
This is a standard which few kernels will ever meet. I'd say requiring a numerical proof is the same as requiring no proof at all - because it won't ever happen unless you're validating silicon or something equally expensive.
I guess it depends on your definition of proof, but I’d say the reasoning and justification sections of a TOMS article qualify, and that’s a standard nearly every popular library meets.
C++: "look at what others must do to mimic a fraction of my power"
This is cute, but I'm also baffled as to why you would want to use macros to emulate C++. Nothing is stopping you from writing C-like C++ if that's what you like, style-wise.
It's interesting to see how easily a toy project can reach a much safer C without pulling in _everything_ from C++. I really enjoyed the read!
Though yes, you should probably just write C-like C++ at that point, and the Result sum types used made me chuckle in that regard, since sum types were only added in C++17. This person REALLY wants modern C++ features...
> I'm baffled as to why you would want to use macros to emulate c++.
I like the power of destructors (auto cleanup) and templates (generic containers). But I also want a language that I can parse. Like, at all.
C is pretty easy to parse. Quite a few annoying corner cases, some context sensitive stuff, but still pretty workable. C++ on the other hand? It’s mostly pick a frontend or the highway.
No name mangling by default, far simpler toolchain, no dependence on libstdc++, compiles faster, usable with TCC/chibicc (i.e. much more amenable to custom tooling, be it at the level of a lexer, parser, or full compiler).
C’s simplicity can be frustrating, but it’s an extremely hackable language thanks to that simplicity. Once you opt in to C++, even nominally, you lose that.
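As a concrete illustration of the "generic containers via macros" idea mentioned above: this is not the post's actual macro set, just a minimal sketch with hypothetical names (DEFINE_VEC, vec_int), assuming a C99 compiler.

    #include <stdio.h>
    #include <stdlib.h>

    /* "Template" instantiation by token pasting. Error handling on
       realloc is omitted to keep the sketch short. */
    #define DEFINE_VEC(T)                                            \
        typedef struct { T *data; size_t len, cap; } vec_##T;        \
        static void vec_##T##_push(vec_##T *v, T x) {                \
            if (v->len == v->cap) {                                  \
                v->cap = v->cap ? v->cap * 2 : 8;                    \
                v->data = realloc(v->data, v->cap * sizeof(T));      \
            }                                                        \
            v->data[v->len++] = x;                                   \
        }

    DEFINE_VEC(int) /* expands to vec_int and vec_int_push */

    int main(void) {
        vec_int v = {0};
        vec_int_push(&v, 42);
        printf("%d\n", v.data[0]);
        free(v.data);
        return 0;
    }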
I highly doubt that any of the tiny C compiler implementations support the cleanup extension that most of this post's magic hinges on (and some quick checks seem to verify that).
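For anyone unfamiliar, the extension in question is presumably GCC/Clang's __attribute__((cleanup(fn))), which calls fn with a pointer to the variable when it goes out of scope. A minimal sketch (AUTOFREE and free_ptr are made-up names, but this is the common pattern, e.g. in systemd):

    #include <stdio.h>
    #include <stdlib.h>

    /* Cleanup handlers receive a pointer to the variable itself,
       so for a char* the handler gets a char** (passed as void*). */
    static void free_ptr(void *p) {
        free(*(void **)p);
    }

    #define AUTOFREE __attribute__((cleanup(free_ptr)))

    int main(void) {
        AUTOFREE char *buf = malloc(64);
        snprintf(buf, 64, "freed on scope exit, like a destructor");
        puts(buf);
        return 0; /* buf is freed here, no explicit free() needed */
    }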
Yup. And I like the implication that Rust is 'cross platform' when its 'tier 1' support consists of two architectures (x86 & arm64). I guess we're converging on a world where those two plus riscv are all that matter to most people, but it's not yet a world where they are all that matter to all people.
You are misunderstanding what the tiers are. Embedded architectures cannot be Tier 1, which requires building the Rust compiler on the architecture itself and running the test suite with it. Only full desktop systems will be Tier 1, because those are the systems that can run rustc and all the nice desktop environments.
However, most of the embedded world uses ARM chips, and those are Tier 2 targets like thumbv6m and thumbv7em (there are still odd ones like 8051, AVR, or m68k, many of which already lack a good C++ compiler). Tier 2 targets are guaranteed to build, and at release time the tests are still run for them.
I understand your tiers just fine. You are misunderstanding what "cross platform" means. Or rather, you're trying to redefine it to mean "what Rust supports, in the way we want to support it, on the few architectures we care about, because in our view nothing else of value exists".
In my experience most chips released in the past 10+ years ship with C++ compilers.
Quite frankly, I'm not sure why you wouldn't, given that most are using GCC on common architectures. The chip vendor doesn't have to do any work unless they are working on an obscure architecture.
I just revisited Matlab to do some work involving a simulation and a Kalman filter, and after years of using Python I found the experience so annoying that I really welcome this library.
> The benefit of keyboard-driven programs like Vim is that you're trading an initial learning curve for a vastly more efficient experience once the learning is done.
I have never been rate-limited by my keyboard input speed. I have lost many minutes a day looking up cheatsheets for terminal tools that I use occasionally.
Ironically, the biggest impact AI has had on my programming has been saving me time crafting command-line invocations, instead of browsing <tool> --help and man <tool>.
The speed change you see is not due to raw input speed, but due to eliminating a context switch in the brain. It's the difference between thinking "I want to see X" and already seeing it on the screen.
Funny, I feel the same way about Triton. Performant Triton looks like CUDA (but with tiles!) except it's ten times harder to debug since it doesn't have the tooling NVIDIA provides.
If I had to run on AMD I'd rather deal with their hipify tooling.
Performant Triton programs are usually simpler and shorter than their CUDA equivalents. This alone makes it easier to write, and I would argue that it helps with debugging too because the model provides a lot more guarantees on how your code executes. That said, some of the tooling is notably poor (such as cuda-gdb support).
Agree on shorter, disagree on simpler. The hard part of understanding GPU code is knowing the reasons why algorithms are the way they are. For example, why we do a split-k decomposition when doing a matrix multiplication, or why we load this particular data into shared memory at this particular time, with some overlapping subset going into registers.
Getting rid of the for loop over an array index doesn't make it easier to understand the hard parts. Losing the developer perf and debug tooling is absolutely not worth the tradeoff.
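(For readers who haven't run into split-k: the reduction dimension K of the matmul is partitioned across workers, each computing a partial product, with a reduction over the partials at the end. It pays off when M and N are too small to fill the machine on their own. Below is a toy serial sketch in plain C, with made-up sizes, just to show the decomposition; on a GPU each chunk would go to a different block, with the reduction as a second step.)

    #include <stdio.h>
    #include <string.h>

    #define M 2
    #define N 2
    #define K 8
    #define SPLITS 4 /* K is split into SPLITS chunks of K/SPLITS */

    int main(void) {
        float A[M][K], B[K][N], partial[SPLITS][M][N], C[M][N];
        for (int i = 0; i < M; i++) for (int k = 0; k < K; k++) A[i][k] = 1.0f;
        for (int k = 0; k < K; k++) for (int j = 0; j < N; j++) B[k][j] = 2.0f;
        memset(partial, 0, sizeof partial);

        /* Each split reduces over its own slice of K. */
        for (int s = 0; s < SPLITS; s++)
            for (int i = 0; i < M; i++)
                for (int j = 0; j < N; j++)
                    for (int k = s * (K / SPLITS); k < (s + 1) * (K / SPLITS); k++)
                        partial[s][i][j] += A[i][k] * B[k][j];

        /* Reduction across the partial results. */
        memset(C, 0, sizeof C);
        for (int s = 0; s < SPLITS; s++)
            for (int i = 0; i < M; i++)
                for (int j = 0; j < N; j++)
                    C[i][j] += partial[s][i][j];

        printf("C[0][0] = %.1f (expected %d)\n", C[0][0], 2 * K);
        return 0;
    }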
For me I'd rather deal with Jax or Numba, and if that still wasn't enough, I would jump straight to CUDA.
It's possible I'm an old fogey with bias, though. It's true that I've spent a lot more time with CUDA than with the new DSLs on the block.
I don’t think it is possible to write high-performance code without understanding how the hardware works. I just think staring at code that coalesces your loads or swizzles your layouts for the hundredth time is a waste of screen space. Just let the compiler do it, and when it gets it wrong you can bust out the explicit code you were going to write in CUDA anyway.
This is cool. Not personally going to switch because I want to stay with the official implementation, but I appreciate the effort involved in porting libraries.