Modern CPU design adds an interesting twist to the bounds checking question that (IMO) renders it moot. At least on Intel CPUs, the array bounds check gets compiled down to a compare-and-jump instruction pair, which typically ends up fused into a single micro-op on the CPU. Unless you're in an extremely tight loop, the cost of that instruction just isn't significant. The only time it costs much at all is on a branch misprediction, in which case the very next thing that's going to happen is an exception, so it's still insignificant in that context. And if you are in an extremely tight loop, it's probably structured in a way that makes it easy for the compiler to optimize away the bounds check.
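To make that concrete, here's a minimal sketch in C (checked_get and sum are illustrative names, not from any real codebase): the bounds test is one compare plus one conditional branch, and in a loop like sum a compiler can often prove the index is always in range and drop the check after inlining.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical checked accessor: the bounds test below typically
 * compiles to a single cmp + conditional-branch pair, and the branch
 * is essentially always predicted correctly, since correct code
 * never takes it. */
static int checked_get(const int *a, size_t len, size_t i)
{
    if (i >= len) {
        fprintf(stderr, "index %zu out of range (len %zu)\n", i, len);
        abort();
    }
    return a[i];
}

/* In a tight loop like this, the compiler can often prove i < len on
 * every iteration and eliminate the check entirely after inlining. */
static long sum(const int *a, size_t len)
{
    long s = 0;
    for (size_t i = 0; i < len; i++)
        s += checked_get(a, len, i);
    return s;
}
```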
I am willing to be convinced, but it will take a well-executed BLAS benchmark (in particular, a sparse BLAS) with and without range checks. If the range-check-enabled version is within 5~8% of the current standard, say ATLAS/MKL/Eigen (note these aren't quite the fastest), consider me sold.
In my experience range checks did affect the speed, but the range check alone may not be to blame. It might well be that code pieces became a tad larger than what a compiler would automatically inline, and those effects snowballed.
In case you have such a benchmark handy, I'd appreciate a link, 'cause fast is good, but fast and correct is way better.
If you're really in a position where you need to worry about performance down to the level of counting individual instructions then you probably shouldn't be relying on someone else to do your benchmarks for you, anyway.
It's funny that the author uses an example of an array indexed by 199x values, as the 90s were in some ways "the lost decade" of C/C++ development, with an utter lack of concern for this sort of checking.
The realities of internet exposure put an end to this recklessness for casual^H^H^H^H^H "enterprise" software development.
Interesting how the Go language now includes array bounds checking. While two of the main designers are ex-Bell Labs, one of them is from ETH Zurich. (I'm assuming he would have had some Pascal/Modula exposure there.)
The assert from assert.h is a macro that completely elides the check when built with NDEBUG defined. Assuming you're not using some other assert(), and are building your production build with NDEBUG (which is typically part of what's meant by "a production build", though you can certainly disagree with the practice), production code cannot capture it, as no message will ever be generated.
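A simplified model of the mechanism (my_assert is an illustrative stand-in; the real assert.h is more elaborate, but the NDEBUG behavior is the same): with NDEBUG defined, the macro expands to ((void)0), so the check, the message, and the abort all vanish from the build.

```c
#include <stdio.h>
#include <stdlib.h>

/* A simplified model of what <assert.h> does.  my_assert is an
 * illustrative name, not a standard macro. */
#ifdef NDEBUG
#define my_assert(e) ((void)0)   /* production build: no check, no message */
#else
#define my_assert(e) \
    ((e) ? (void)0 \
         : (fprintf(stderr, "%s:%d: assertion `%s' failed\n", \
                    __FILE__, __LINE__, #e), \
            abort()))
#endif

/* Example use: the check is active here because NDEBUG is not defined. */
static int check_positive(int v)
{
    my_assert(v > 0);
    return v * 2;
}
```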
You are correct. If I build with -DNDEBUG=1, then the assert is completely ignored: no error message, and no abort. That is not what I want. Therefore I will not build with -DNDEBUG=1.
I will certainly not introduce "logging to a file" as a concept inside buf.c. There is no need for the pure data manipulation code in buf.c to know anything about files or stdio, or a specific name of a log file.
If the assert from assert.h does not do what I want, then I'll make a version that does. However, that is a moot point, since right now it does what I want.
No it's a good point actually, and it pays to know your context. Frankly, looking at the 118 lines in assert.h, it does considerably more than I really need.
Instead of this:
assert(buf->pos < buf->str->len);
I could just do this instead:
if (buf->pos >= buf->str->len) die("bad pos");
If the die message is unique, I don't really need to include __FILE__, __LINE__, and the expression itself in it.
I could even do this, though it's probably overkill for something that should never happen anyway:
if (buf->pos >= buf->str->len)
die("bad pos %d %d", buf->pos, buf->str->len);
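For reference, one plausible shape for such a die() helper (the author's actual implementation isn't shown, so this is a guess at its form): print a formatted message to stderr, then abort.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* A guess at die(): print a formatted message to stderr, then abort().
 * abort() rather than exit(1) keeps a core dump around for debugging. */
static void die(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    abort();
}

/* The check from the comment above, wrapped as a function for illustration. */
static int checked_pos(int pos, int len)
{
    if (pos >= len)
        die("bad pos %d %d", pos, len);
    return pos;
}
```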
Absolutely, though I tend to throw __FILE__ and __LINE__ in everything, so I can step through the sources of messages by just pulling messages into my vim quickfix buffer. Again, totally depends on context, though.
For in-house code at a manufacturing company, we usually left all the debug symbols and assertions in our C code. For a product that shipped to others at a dev-tools company, we usually turned off the debugging, turned on NDEBUG, and turned on full optimization. (Although the product had checking for things built into its logic, and a lot of testing -- C being used as an assembler surrogate -- since we were writing developer tools, rather than "enterprise" whack-it-together-on-no-time-budget pile-age.)
"Production build" varies by environment, TMTOWTDI.
Range checking is great; the CPU cycles consumed are well worth the bugs they catch.
Thing is, bounds checks are a list thing, and there's more than just "is this index valid?" that can go wrong. One of the things I appreciate about C++ is that iterator invalidation is very precisely specified. Sadly, that's where it stops: it's just specified, and the onus is still on you to catch the errors. It'd be great to have the same thing as an immediate error when you use an invalidated iterator. (In a vector, a stale iterator might silently skip an element (after some deletes) or see the same element twice (after some inserts); Python warns about this in some cases with dicts:
RuntimeError: dictionary changed size during iteration
which is nice, but I think you can still slip by it.) I don't believe Java or Python specify what happens to iterators when the collection changes, which to me is a bit sad.
Thing is, to implement this, the collection would probably need to know about all outstanding iterators, so as to figure out where they are and whether they should be invalidated at all. Most operations then become O(number of active iterators + whatever the normal cost of the op is); I'd argue this is still close to O(1 + ...), since the number of active iterators on a collection is usually 0 or 1. But there's still a memory cost and a CPU cost to tracking them.
Java can do it without too much trouble: you just keep a version number on the collection that gets incremented on every mutation, and the iterator checks whether that number has changed. C++ is much more complicated, since it allows patterns like map.erase(i++) to be legal while superficially similar usages are not.
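A sketch of that version-number scheme, transplanted into C for illustration (struct vec, vec_push, and friends are invented names): the collection bumps a counter on every mutation, and the iterator fails fast when its snapshot no longer matches.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* A fail-fast iterator in the Java style, sketched in C.  The vector
 * carries a version number bumped on every mutation; the iterator
 * snapshots it at creation, so any use after a mutation is caught
 * immediately instead of silently skipping or repeating elements. */
struct vec {
    int *data;
    size_t len, cap;
    unsigned version;       /* incremented by every mutating operation */
};

struct vec_iter {
    const struct vec *v;
    size_t pos;
    unsigned version;       /* snapshot taken when the iterator was made */
};

static void vec_push(struct vec *v, int x)
{
    if (v->len == v->cap) {
        v->cap = v->cap ? v->cap * 2 : 4;
        v->data = realloc(v->data, v->cap * sizeof *v->data);
    }
    v->data[v->len++] = x;
    v->version++;           /* invalidates all outstanding iterators */
}

static struct vec_iter vec_begin(const struct vec *v)
{
    return (struct vec_iter){ v, 0, v->version };
}

/* Returns 1 and stores the next element in *out, or 0 at the end.
 * Using a stale iterator is a loud, immediate error. */
static int vec_next(struct vec_iter *it, int *out)
{
    if (it->version != it->v->version) {
        fprintf(stderr, "vector changed during iteration\n");
        abort();
    }
    if (it->pos >= it->v->len)
        return 0;
    *out = it->v->data[it->pos++];
    return 1;
}
```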
Leave in all range checks. If the compiler can determine that the range check will never be hit, or can partially optimize it (hoisting it out of a loop, etc) great. Otherwise? It gets checked at runtime, which is typically not that expensive an operation anyways.
I wonder if the "bound" instruction on x86 has the same issue the VAX(?) opcode did - that it's faster to write out the two checks by hand than to use the instruction...