It's more useful in C++ with things like generic and voldemort types, but I still would not discount it in C as the lack of namespacing impacts type names, and it could limit convenience typedefs (those which exist only to avoid the `struct` and `enum` prefixes). It might also help transitioning to fixed integer types, which are rather verbose.
I wonder how much this complicates parsing C++. Because of this, you can't discard/free struct and class definitions as soon as you leave the scope, like you can in C, because the definition can still escape the scope by being returned from a function with the "auto" keyword.
For many compilers, as soon as the parser sees a left curly brace, it pushes a symbol table onto a stack, and when it sees the corresponding right curly brace, it pops the symbol table off the stack, and "forgets" any declarations that were made inside that scope. That is so things like this work as expected.
#include <stdio.h>
int main(void) {
    int x = 0;
    {
        int x = 1;
        printf("%d\n", x); // prints 1
    }
    printf("%d\n", x); // prints 0
}
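A rough sketch of that push/pop behaviour, in C++. This is not taken from any real compiler; `ScopeStack` and its member names are made up for illustration, and it only stores an `int` per name instead of real declaration info.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical minimal symbol table: one map per open scope.
class ScopeStack {
public:
    void enterScope() { scopes_.emplace_back(); }  // on '{'
    void leaveScope() { scopes_.pop_back(); }      // on '}': inner declarations are forgotten

    void declare(const std::string& name, int value) {
        scopes_.back()[name] = value;
    }

    // Search from the innermost scope outward, so inner names shadow outer ones.
    // Returns nullptr if the name is not visible.
    const int* lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<std::unordered_map<std::string, int>> scopes_;
};

// Replays the nested-block example: returns {x seen inside, x seen after the inner '}'}.
std::pair<int, int> replayShadowingExample() {
    ScopeStack table;
    table.enterScope();
    table.declare("x", 0);
    table.enterScope();
    table.declare("x", 1);
    const int* inner = table.lookup("x");
    assert(inner != nullptr);
    int innerValue = *inner;
    table.leaveScope();
    const int* outer = table.lookup("x");
    assert(outer != nullptr);
    int outerValue = *outer;
    table.leaveScope();
    return {innerValue, outerValue};
}
```

`replayShadowingExample()` yields 1 for the inner lookup and 0 for the outer one, matching the two `printf` calls.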
But in C++, a declaration can escape its scope through a function's `auto` return type. I'll change the C++ code that OP wrote. The C++ compiler has to resolve cases like this correctly, which means it can't just forget all the declarations within the scope of the function once the definition is done.
#include <iostream>
#include <string>
auto createVoldemortType(int value) {
    struct Voldemort {
        int value;
    };
    return Voldemort{value};
}
struct Voldemort {
    std::string value;
};
int main() {
    auto voldemort = createVoldemortType(7);
    std::cout << voldemort.value << std::endl; // output: 7
}
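To make the "escape" a bit more concrete: once the local type has been returned, `decltype` can give it a usable alias at namespace scope, so the compiler clearly has to keep the full definition around. A small sketch, assuming C++14 or later; `makeCounter`, `Counter` and `useCounter` are names invented here, not from OP's code.

```cpp
#include <cassert>

// The struct is declared inside the function, so its name is not visible outside.
auto makeCounter(int start) {
    struct Counter {
        int value;
        int next() { return value++; }  // returns the current value, then increments
    };
    return Counter{start};
}

// Alias for the otherwise unnameable local type.
using Counter = decltype(makeCounter(0));

int useCounter() {
    Counter c = makeCounter(7);
    int first = c.next();   // 7
    assert(first == 7);
    return c.next();        // 8
}
```

The member function `next()` is still callable from outside `makeCounter`, which would not be possible if the definition were thrown away at the function's closing brace.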
During semantic analysis a parser usually attaches symbol info (of some kind) to the already existing abstract syntax tree, or creates a new tree entirely. Whenever it needs to know about a type, it just walks the tree to the node with the type definition. That way there's really never any data that's forgotten.
At least that's how I think the parsers I'm familiar with work.
I've read a book about a BLISS compiler [1] that does this, but still uses a stack like I described [2]. It implements a hash table that uses linked-list nodes to handle collisions. A new declaration adds a new name to the table, and the compiler attaches that node to every use of the name in expressions of the syntax tree.
When a scope is exited, the declarations from that scope are removed from the symbol table, but because they're still attached to the syntax tree, they can't just be freed. They're added to a linked list of "purged" nodes, so that the information they contain can be used later during code generation, and then freed.
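Here is a loose sketch of that "purged" idea in C++. It is not the book's data structure (that one is a single hash table with linked lists); this version uses vectors, and `Decl`, `PurgingTable` and the other names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical declaration node; a real compiler would store type info here.
struct Decl {
    std::string name;
    int depth;  // nesting depth of the scope that declared it
};

class PurgingTable {
public:
    void enterScope() { ++depth_; }

    // Creates a declaration in the current scope and returns the node that
    // uses of the name in the syntax tree would point at.
    Decl* declare(const std::string& name) {
        nodes_.push_back(std::make_unique<Decl>(Decl{name, depth_}));
        Decl* decl = nodes_.back().get();
        visible_[name].push_back(decl);
        return decl;
    }

    // Innermost visible declaration of the name, or nullptr.
    Decl* lookup(const std::string& name) const {
        auto it = visible_.find(name);
        if (it == visible_.end() || it->second.empty()) return nullptr;
        return it->second.back();
    }

    // Hides the current scope's declarations but keeps the nodes alive.
    void leaveScope() {
        for (auto& entry : visible_) {
            auto& stack = entry.second;
            while (!stack.empty() && stack.back()->depth == depth_) {
                purged_.push_back(stack.back());
                stack.pop_back();
            }
        }
        --depth_;
    }

    std::size_t purgedCount() const { return purged_.size(); }

private:
    int depth_ = 0;
    std::vector<std::unique_ptr<Decl>> nodes_;  // owns every node until the table is destroyed
    std::unordered_map<std::string, std::vector<Decl*>> visible_;
    std::vector<Decl*> purged_;                 // out of scope, but still referenced
};

// True if a node recorded inside a scope is still usable after the scope closes.
bool useSurvivesScopeExit() {
    PurgingTable table;
    table.enterScope();
    Decl* use = table.declare("tmp");  // the syntax tree would hold this pointer
    table.leaveScope();
    assert(use != nullptr);
    return table.lookup("tmp") == nullptr   // no longer visible by name
        && use->name == "tmp"               // but the node is still intact
        && table.purgedCount() == 1;
}
```

In this sketch the nodes are only freed when the table itself is destroyed, which stands in for "after code generation".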
One-pass compilers don't have this problem; they really can just free the memory for reuse, because after they exit a scope, they've already generated the assembly or machine code from the high-level language.
However, I don't know what LLVM or GCC, or any other remotely modern compiler, does. I haven't read the code much.
[2]: Actually, it intertwines the stack and the symbol table in a complicated way, so there's only one hash table, and multiple stacks within it. It's explained by a diagram they include on page 13. You can find a PDF of it here: https://kilthub.cmu.edu/articles/journal_contribution/The_de...
Also because they might be a) an undocumented implementation detail (the result of std::bind for example); b) utterly unutterable like the type of a lambda expression.
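Both cases can be shown in a few lines. A sketch, where `add` and `callUnnameableTypes` are names made up for the example:

```cpp
#include <functional>
#include <type_traits>

int add(int a, int b) { return a + b; }

int callUnnameableTypes() {
    // a) The standard leaves the return type of std::bind unspecified,
    //    so `auto` is the practical way to hold the result.
    auto addFive = std::bind(add, 5, std::placeholders::_1);

    // b) Every lambda expression has its own unnamed closure type,
    //    even when two lambdas are written identically.
    auto twice = [](int n) { return 2 * n; };
    auto alsoTwice = [](int n) { return 2 * n; };
    static_assert(!std::is_same<decltype(twice), decltype(alsoTwice)>::value,
                  "each lambda expression has a distinct closure type");

    return twice(addFive(2)) + alsoTwice(1);  // 2 * (5 + 2) + 2 * 1 = 16
}
```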
>Hah. Voldemort types. The ones that can't be named
Ha ha, that reminds me of that phrase of yore, "the quality without a name (qwan)" (google it), which was heavily bandied about years ago, during the heyday of C++ and the software patterns movement (which continued a lot in the days of Java, of course). James Coplien (IIRC) and others of that time come to mind.
Though I read a fair amount about that stuff, a lot of it went over my head, but later, I did understand some of the patterns, after reading the design patterns book, and trying out some of them.
The template method pattern is my favourite pattern, because I understand it better than many of the others :), and also because it is the basis of software frameworks (inversion of control, aka the Hollywood principle - "don't call me, I'll call you"). Other patterns that I like and understand are the command pattern, the interpreter pattern, the chain of responsibility pattern, and the flyweight pattern, to name a few. Builder and Factory, not so much. Singleton is straightforward, or is it really? impls matter :)
And I have written a few toy frameworks, which is fun to do and use.
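For anyone who hasn't seen it, a tiny sketch of the template method pattern mentioned above, with invented names (`Report`, `SalesReport`, `renderSalesReport`). The base class owns the control flow and calls down into the subclass, which is the inversion of control part.

```cpp
#include <string>

class Report {
public:
    virtual ~Report() = default;

    // The template method: a fixed skeleton that calls overridable steps.
    std::string render() const { return header() + body() + footer(); }

protected:
    virtual std::string header() const { return "== report ==\n"; }
    virtual std::string body() const = 0;  // the step each subclass must supply
    virtual std::string footer() const { return "== end ==\n"; }
};

class SalesReport : public Report {
protected:
    std::string body() const override { return "sales: 42\n"; }
};

std::string renderSalesReport() {
    SalesReport report;
    return report.render();  // the base class decides when body() gets called
}
```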