Const Generics in Rust (nora.codes)
98 points by todsacerdoti on Nov 6, 2021 | 39 comments


It's a little too early to be hyped about const generics in Rust. You still can't do basic compile-time arithmetic on them, such as (BITS+7)/8 to compute how many bytes you need.
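
For illustration, a minimal sketch of the kind of thing that fails on stable today (BitArray is a made-up name):

    struct BitArray<const BITS: usize> {
        // rejected on stable: "generic parameters may not be used
        // in const operations"
        bytes: [u8; (BITS + 7) / 8],
    }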

I recently had to use "FixedBitSet" in Rust. This is simply a set of N bits. It's the clean way in Rust to do bit-banging. Unfortunately, the underlying implementation is a dynamically allocated Vec. So, when you do bitset.set(4,true), far more code is generated than just one OR instruction.

So, it's not quite here yet.


I’ve been using the `bitflags` crate to do my bit-banging, and it seems to compile down to a single OR instruction quite nicely :)
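
For reference, a minimal sketch using the bitflags 1.x syntax current at the time of this thread (the names are made up):

    use bitflags::bitflags;

    bitflags! {
        struct Permissions: u32 {
            const READ = 0b01;
            const WRITE = 0b10;
        }
    }

    fn grant_write(p: &mut Permissions) {
        // insert() is a bitwise OR on the underlying u32, which the
        // optimizer can reduce to a single OR instruction.
        p.insert(Permissions::WRITE);
    }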


I wonder if it'd make sense to have the const generic parameter be the number of bytes (or words) that the bit set needs, instead of the number of bits. It's ugly, but it could allow the optimizer to eliminate many runtime bounds checks as the size of the vector would become a compile-time constant.
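
Something along these lines, perhaps (a rough sketch; all names are made up):

    struct WordBitSet<const WORDS: usize> {
        words: [u64; WORDS],
    }

    impl<const WORDS: usize> WordBitSet<WORDS> {
        fn set(&mut self, bit: usize, value: bool) {
            let (word, mask) = (bit / 64, 1u64 << (bit % 64));
            // WORDS is a compile-time constant, so the optimizer can
            // often elide the bounds check on this indexing.
            if value {
                self.words[word] |= mask;
            } else {
                self.words[word] &= !mask;
            }
        }
    }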


> far more code is generated than just one OR instruction.

How many?

Because looking at ARM and C with an unpacked bitfield structure, you'll easily have 3 or 4: a load, a shift, an OR, and a store. Any time I see a bitfield in C I assume 4 instructions. So is this close?


C++ has const generics, and they seem to be a lot more powerful, as you can use expressions in template parameters, as long as they can be evaluated at compile time of course:

https://godbolt.org/z/7Tbzhx5Kc

Rust is a very nice language, but I feel they forgot to copy some useful features from other languages.


They didn't "forget" to do it, it's just not finished yet.
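
On nightly you can already experiment with it behind the in-progress generic_const_exprs feature, with plenty of caveats; a sketch:

    #![feature(generic_const_exprs)] // incomplete feature, nightly only

    struct Bytes<const BITS: usize>
    where
        [(); (BITS + 7) / 8]: Sized, // makes the expression well-formed
    {
        data: [u8; (BITS + 7) / 8],
    }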


The reason: Rust has a different view of how to design type systems compared to C++. They want their type system to be logically better grounded (in the tradition of ML-family languages and Haskell) rather than taking the “screw it” template approach. So they're trying to make const generics logically sound (basically a very limited form of dependent types) and integrate well with the rest of the type system, which is why it's taking so long.


A language is not just the sum of its features, it's a composite whole. Features interact, and if a language is to be good, those interactions need to be designed rather than incidental. Adding a new feature can very easily deliver net negative value!


Rust's generics feel a lot more rigid than C++ templates. In C++, template programming works more like a duck-typed dynamic language (with very esoteric “runtime-like” compile-time error messages), while Rust's type system feels more rigid, probably due to its influences from ML-like functional languages. There are tradeoffs with each approach: C++ has gained incredible notoriety for all sorts of wacky template tricks (SFINAE, template metaprogramming, etc.). Thankfully the committee developed features like constexpr and concepts to remedy some of those issues (the one part of recent committee decisions that I like, though I generally don't like where the language is going overall).


Yeah. Take for example the thinking on const generics for user-defined types.

Right now in the const generics MVP you can only use integral types, bool, and char. So you can write a generic argument with a u32 N in it, or a char C (a Unicode scalar value, so basically a "character"), or something like that. There are a bunch of practical uses for this, but obviously there's no reason in principle it must be so limited.
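
Concretely, the MVP (stable since Rust 1.51) allows things like this sketch:

    struct Grid<const W: usize, const H: usize> {
        cells: [[u8; W]; H],
    }

    struct Tagged<const C: char>; // bool and char are allowed too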

In C++ there's always a temptation to just rip the knob off, sure it might be a terrible idea to make a new type generic over the 32-bit floating point numbers, but that's your problem as the developer right? Just don't do that if you're uncomfortable cleaning up the mess. Rust does not like to provide so many foot guns.

So the last time I looked the proposal was that user-defined types could be eligible for const generic treatment if they derive Eq.

At first glance that sounds correct: if the type is Eq then it can make sense to be generic over it, since apparently we can tell when A == B and thus whether Foo<A> is actually the same as Foo<B> or different.

But then an extra thought brings you up cold. I can claim my type is Eq with one line if it's PartialEq. And I can claim my type is PartialEq just by providing a bogus eq() function that always returns false. So isn't this a useless requirement?

And then you study the language more carefully. They're going to require that you get Eq by deriving it. Anybody can implement PartialEq as I described above, but that's not derived. To get a derived PartialEq (and thus Eq) the derive macro has to allow that, and that's only going to work for the kind of composite types a non-crazy person would use for const generics.

Your trivial enumerated type with a list of fifteen kinds of chocolate bar can easily derive Eq, so that in some future Rust somebody can declare a variable of type AdventCalendar<Chocolate::Mars>. But my custom type that contains two pointers to C strings can't successfully derive Eq, and so isn't eligible for const generics, even if it really does implement Eq correctly.
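
A sketch of how that might look on nightly (feature names and details have shifted over time, so treat this as illustrative rather than definitive):

    #![feature(adt_const_params)] // nightly only

    #[derive(PartialEq, Eq)]
    enum Chocolate {
        Mars,
        Twix,
        // ...and thirteen more
    }

    struct AdventCalendar<const C: Chocolate>;

    fn main() {
        let _cal: AdventCalendar<{ Chocolate::Mars }> = AdventCalendar;
    }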


Actually, C++ has historically limited non-type template parameters[0] to integral types, enums, pointers, and a few other things. But some people have found that too restrictive and would like to use other constants as well, like string literals.

[0]: https://en.cppreference.com/w/cpp/language/template_paramete...


That document, which I've read, lists "a floating-point type" as allowed since C++ 20.


Rust uses proc macros for compile-time programming, not generics. These are quite distinct features, with distinct purposes.


Try D if you want to see real power in this regard. Want to take a struct value as a template parameter? It just works.


I would be cautious of claiming performance improvements without benchmarks to back them up. The compiler is often very good at figuring out that a length is constant and optimizing appropriately; I suspect it'd manage it in the simple examples given. But maybe I'm wrong!

Admittedly it can be nice not to be at the mercy of the optimizer. And getting compiler errors instead of runtime panics is a really great benefit. But I've just become slightly wary of big performance claims these days. Perhaps I'm just getting too cynical.


Absolutely, I should have included a godbolt link or something, and it will vary on a case by case basis - but still, and especially across function boundaries, using const generics makes things more predictable.


As a bit of a type amateur, I wondered about this: why do we define primitive types and leave it at that? Why aren't things like min and max first-class constraints one can place on a type?

I imagine this basically gives you that. But do other languages do this? I’m guessing maybe functional programming languages?


This was common in Wirth languages, e.g. Pascal, where they are called "subrange types": https://wiki.lazarus.freepascal.org/subrange_types

It’s something I actually miss in modern mainstream languages. But as it’s something few copied and few lament, we can extrapolate that I’m a minority :)

C has something similar but different in that integer fields in a struct can have a bit-width specifier. This is primarily used for bit fields in protocol frames and the like, and conventional wisdom is that compilers generate poor code for them even today, so doing your own bit masking instead is still popular.


I wanted that in Ada, which has subranges. I wanted "ranges" instead - no mention of the underlying hardware at all. Intermediate values would be sized based on the rule that you can't overflow an intermediate value unless it would lead to overflow of the final sized value. Sometimes this requires longer intermediate values.

What happened instead was standardized hardware. The 60 bit CPUs, the 48 bit CPUs, the 36 bit CPUs, the 24 bit CPUs, the 18 bit CPUs, and the 12 bit CPUs all died off. With everybody using powers of 2 of bytes, range calculation became less important for portability.


Nim's type system was inspired by Wirthian languages, and it does have subranges.


Dependent types are what you're searching for. They're mostly found in (functional) proof languages like Coq, Agda, and Idris. For a language to have dependent types, the values that types depend on must not only be compile-time constants (as in Rust) but can also be runtime values.


Huh, surely the types in WUFFS are offering what the parent wanted (WUFFS won't let you add two 32-bit signed integers x and y together unless you've explicitly constrained them so that this operation won't overflow or underflow) but purely enforced at compile time since WUFFS wants to go real fast?

WUFFS is in a related place to something like Coq. The idea in WUFFS is: if I write my decoder for the FOO image format in WUFFS, then no matter how lazy and incompetent I am, your image-displaying program can confidently use my FOO decoder, and if I did a bad enough job it might be slow or produce the wrong picture, but it can't crash or run an SSH server or delete the password database or whatever.


Ada does. It automatically selects the backing type depending on what range you specify.


Since I encountered Ada in the late 80s, I've often wondered how this would work out over a long-lived project, even back when I was young and full of confidence and the expectation that "long-lived" was going to be a thing of the past.

C has been a headache in many ways, but one thing I've noticed is that there are many examples of very, very long-lived C projects, and not very many examples of very, very long-lived projects in most other languages (IBM, and COBOL shops generally, will obviously disagree here). Some of that is because C was a good fit for generating economic value at the beginning of the era that would now qualify as very long-lived (20+ years, 30+ years), I'm sure, but it is interesting to consider that parts of some languages will not age well.


> there are many examples of very, very long-lived C projects

Of course there are, because that's the 'native' language for all Unixes and Unix lookalikes, and for Windows, because their kernels are written in C. So that covers every OS relevant for 'real' computers of the last 30 years. Had the Lisp machines won the battle, nobody would talk about C any more (or only to scare children, as with COBOL). And don't forget Fortran: there is some _really_ ancient Fortran code in use at the same places that still run their COBOL.


I've worked for 13 years on an Ada codebase initiated in the 90s. It was ported from PPC to x86, then x86_64, then ARM, with mostly just recompilation and testing. Mind you, endianness was handled in a very ad-hoc (ugly but portable) manner.


Because it's hard and ultimately imperfect. Even a constraint as simple as a max value (which all integer values inherently have) requires the compiler to understand the code well enough to recognize explicit bounds checks on the value. If there are no explicit bounds checks, it would have to prove the bound through inference, which is often (usually?) impossible even if the compiler were smart enough to try.


> [...] max value (which inherently all integer values should have)

No, they shouldn't. It would prevent many subtle bugs if the default integer type of every language were arbitrary-precision.


Python is an example of a language with "big integers".

Things quickly get interesting when interoperating with other languages, e.g. via JSON.


Can this be used somehow for a bounded range? Something like:

    struct MyIndex<const MIN: usize, const MAX: usize> {
        index: usize, // somehow constrained to MIN..=MAX?
    }
That would get rid of lots of runtime bounds checks.


Not as things stand now. For this functionality you would want something closer to `I32<0..500>` and `struct I32<const RANGE: Range>(i32)`, so then operations between two `I32<RANGE>` can check what each corresponding const param is. I don't know if this will be possible in the short term, but I know that quite a few people want something along these lines, so I think it might happen at some point.


Yes, that would be ideal. Was just thinking whether the const generics can be hacked to do the same thing.


You can't use field access today, so having a const param `struct Range<T> { start: T, end: T }` isn't yet useful, but you can still get something working, although the error message is terrible:

https://play.rust-lang.org/?version=nightly&mode=debug&editi...

Edit: I'm honestly surprised at how usable this is already in the current state of nightly:

https://play.rust-lang.org/?version=nightly&mode=debug&editi...
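
For anyone who can't follow the (truncated) playground links, the shape of the hack is roughly this (my reconstruction, not necessarily the linked code):

    struct BoundedUsize<const MIN: usize, const MAX: usize>(usize);

    impl<const MIN: usize, const MAX: usize> BoundedUsize<MIN, MAX> {
        const fn new(value: usize) -> Option<Self> {
            // The bounds live in the type, so this is the only place
            // a range check is needed.
            if value >= MIN && value <= MAX {
                Some(Self(value))
            } else {
                None
            }
        }

        const fn get(&self) -> usize {
            self.0
        }
    }

    fn main() {
        let idx: BoundedUsize<0, 500> = BoundedUsize::new(42).unwrap();
        assert_eq!(idx.get(), 42);
    }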


Excellent hack! This demonstrates it actually works. Just need to make it ergonomic for everyday work. That's awesome.


Will this make Rust's type system dependently typed like Agda or Idris?


Const generics are a form of dependent typing, but a heavily restricted one. This will loosen some restrictions, increasing what can be done with them, but it's still a far cry from languages with "full" dependent types.


No. To have dependent types, the values that types depend on must be able to be runtime values, not just compile-time constants.


Strictly speaking, dependently-typed languages don't even have a clear-cut phase separation between "compile time" and "run time" (other than via "code extraction" to a non-dependently typed language as an add-on feature). It's clear however that the Rust const-generics 'MVP' is quite far from the flexibility of full dependent types.


No, const generics only work with compile-time constants.
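
To make the distinction concrete (a sketch; the function is made up):

    // The type [u8; N] depends on the value N, but N must be known
    // at compile time.
    fn zeroes<const N: usize>() -> [u8; N] {
        [0; N]
    }

    fn main() {
        let _a = zeroes::<4>(); // fine: 4 is a compile-time constant
        let n = std::env::args().count();
        // let _b = zeroes::<n>(); // error: `n` is a runtime value;
        // full dependent types would allow something like this
        let _ = n;
    }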



