Weird that this is about building a *C* compiler[0] in *OCaml*. I expected the i...

kragen · on Aug 15, 2024

ocaml makes writing a compiler enormously more accessible, and learning to read ocaml, while it can be somewhat intimidating at first, is much easier than learning to write a compiler

(imagine a medieval accountant trying to learn to do long division in roman numerals. he'll be much better off learning the western arabic numerals fibonacci is so excited about)

shortrounddev2 · on Aug 15, 2024

I really really really want to get more into Ocaml but as far as I know there is no good support for a debugger to use in an IDE like VSCode or even vim. Everyone I talk to says they just do print debugging. It's easier to 'reason about your code' in FP, but I do NOT want to go back to a time where I coded without breakpoints. I use F# because of its first party tooling support

kragen · on Aug 15, 2024

it does have a debugger with breakpoints, which even supports time-travel debugging (except on windows obviously), but i've never used it. it even has first-party ide integration: https://ocaml.org/manual/5.2/debugger.html#s:inf-debugger

i use debuggers a lot when i'm programming in assembly, and from time to time when i'm programming in c or c++, but in ocaml i've never needed one. it's not that i've never had bugs in ocaml, but they tend to be of a different flavor, a flavor for which breakpoints are of little value

it sounds like your f# experience is different; what kinds of bugs have you recently found the debugger valuable for in f#?

debug logging is usually a superior option to breakpoint debugging for problems to which both are applicable, because debug logging shows you the whole history of your program's execution rather than just a single point in time. breakpoint debugging requires a lot of manual labor, painstakingly operating the machine to navigate the execution to the state that has the problem. it's like grinding in a video game. i'd rather program the computer to do that labor for me, at which point i no longer need the debugger

(except in c and assembly)

shortrounddev2 · on Aug 15, 2024

> it does have a debugger with breakpoints, which even supports time-travel debugging (except on windows obviously), but i've never used it. it even has first-party ide integration: https://ocaml.org/manual/5.2/debugger.html#s:inf-debugger

1. I am developing on windows so that's an issue for me and

2. I don't use emacs, I use VScode and I've not been able to get the experimental debugger working for the VScode plugin.

It's not that the debugger is purely for fixing bugs; I use it as an active part of development. I want to be able to freeze the program and inspect values as they are running. I may know what the types or state of the program are just by viewing the code, but I want to be able to inspect the actual data for myself as the program is running.

> debug logging shows you the whole history of your program's execution rather than just a single point in time

breakpoints also provide stack traces, so they provide a kind of history as well. I'd rather inspect and interact with a program than dig through thousands of lines of logs

> breakpoint debugging requires a lot of manual labor, painstakingly operating the machine to navigate the execution to the state that has the problem

I see it the opposite way: print debugging is tedious and requires restarting the program from the beginning whenever you make a change to the source, unless you are constructing your program piecemeal with the repl, which I consider to be extremely tedious as well. To me, a debugger with breakpoints is a far more efficient way to code than print debugging.

I think there is a cultural difference in software engineering between people who use debuggers and people who don't. John Carmack once pointed out that people who come from the game dev and Windows/PC world use debuggers while people from the linux and web dev world tend not to. It seems to be a matter of preference/taste, and I think FP programmers seem to have a distaste for debuggers and graphical debugging/development environments

nrr · on Aug 15, 2024

"... I think FP programmers seem to have a distaste for debuggers ..." I'm unsure that I'd make this assertion so broadly. For Haskell, I'm usually left using Debug.Trace because I've found the traditional symbolic-step debugger is less than ergonomic in the face of lazy evaluation. In Common Lisp, the debugger is my best friend.

"... and graphical debugging/development environments" The loud ones might be saying they use Arch btw (on ThinkPads no less) and lean heavily on using neovim inside the current popular Rust rewrite of tmux, but I personally don't care for it.

VS Code is neat. I mostly still use Emacs because that's where the tooling investment has historically been, but my ideal is definitely much closer to Smalltalk-meets-Oberon.

kragen · on Aug 16, 2024

mine too! except, not so much of a closed-world system? a lot of what i like about emacs is that it has some of that same live-malleability that smalltalk and oberon have

if you've tried godot i'm interested to hear what you think about it

nrr · on Aug 16, 2024

Unfortunately, I have no idea what godot is. I would like to know more though if you're so inclined.

(Surely not the game engine? That's the only thing my disambiguation machinery can come up with.)

kragen · on Aug 16, 2024

yup, the game engine!

kragen · on Aug 15, 2024

> John Carmack once pointed out that people who come from the game dev and Windows/PC world use debuggers while people from the linux and web dev world tend not to. It seems to be a matter of preference/taste, and I think FP programmers seem to have a distaste for debuggers and graphical debugging/development environments

i think that's true! but i don't think it's purely a matter of preference; it's also a matter of what the ecosystems support well and what you're trying to achieve. lisp is a huge exception to the rule about fp programmers; lisp systems, even including emacs, have extremely strong debugging support. but non-lisp fp languages tend to heavily emphasize static typing, thinking things through ahead of time, and correctness by construction, which reduce the need for debugging. but those are more valuable for writing compilers than for writing games or uis in general, where it's hard to define what counts as 'correct' ahead of time but easy to recognize it when you test it interactively

webdev of course has the problem that you can't stop your http response handler while the browser is waiting for a response. and, often, it's sadly not very concerned with ux. and it's often concerned with operating systems in a way where you have to debug problems after they occur, in part because of the scale of the problems

automated testing is another ecosystem thing that reduces the need for debuggers; to a significant extent it competes with static typing

one of the things i really appreciate about godot is being able to continuously adjust parameters and observe variables in my games as they're running. godot is motherfucking awesome, man. i definitely don't have a distaste for graphical debugging and development environments!

> breakpoints also provide stack traces, so they provide a kind of history as well. I'd rather inspect and interact with a program than dig through thousands of lines of logs (...) print debugging is tedious and requires restarting the program from the beginning whenever you make a change to the source

oh, see, i have programs like grep and emacs that dig through thousands of lines of logs for me. often when i'm debugging from logs i don't run the program at all; i just look at the logs and the source code. sometimes the program is running someplace i can't interact with it—memorably, on some occasions, on a satellite out of range of the groundstation. and usually exceptions on python or java give me a pretty decent stack trace, though other log messages unfortunately don't

there's another ecosystem support issue here—although i've sometimes developed on systems that supported fix-and-continue (cmucl, gforth, squeak, basic-80, godot), python and gcc support it very poorly or not at all. so for me i have to restart the program from the beginning whenever i make a change to the source in any case, whether i'm doing printf debugging or not. godot, again, is a very nice exception to this rule, and incidentally lets me add print debugging to the game while it's running

one of the great things about a breakpoint debugger, from my point of view, is that it makes it possible to add logging after the fact to a running program without editing the source or restarting it

i really appreciate you sharing your experience!

kazinator · on Aug 16, 2024

If you're literally "from the Linux world", you're probably younger and less experienced, and likely from a hobby background rather than CS or engineering.

Developers in Unix shops before the Linux era used debuggers.

Obviously, the GNU project developed a debugger for an audience. GNU wouldn't be a complete replacement for proprietary systems like Unix with only compilers, but no debugger.

norir · on Aug 15, 2024

There are many disadvantages to writing a compiler in c (and I have done it). For me, the biggest is simply that it is very verbose for the kinds of things you commonly do in a compiler. I have written a self-hosting compiler that transpiles to c and the generated code is about 5x as long as the original code. I could go through and clean it up and probably get it down to 2-3x, but there are certain concepts that cannot easily be expressed compactly in c (except possibly with preprocessor abuse that I don't have patience for). A simple example is that if you want to represent a union type you have to manually define 2 structs and an enum. Matching with switch is also verbose, as you have to both match and manually assign the value in the match case. Again, you can use macros to do this, though at that point you arguably aren't even using c but rather a hybrid language. A language like ocaml makes it much more straightforward to define and match on unions, and gives you other nice high level things like gc, so you can focus on the compiler itself and not low level details.
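To make the union-type point concrete, here is a minimal OCaml sketch. The `expr` type and the `eval` and `simplify` functions are hypothetical names made up for illustration, not taken from the book or from any compiler mentioned in this thread. One type declaration covers what would take an enum plus structs in c, and each match case both tests the tag and binds the fields.

```ocaml
(* Hypothetical toy expression type, for illustration only.
   This one declaration plays the role of the enum tag plus the
   per-case structs you would write by hand in C. *)
type expr =
  | Const of int
  | Neg of expr
  | Add of expr * expr
  | Mul of expr * expr

(* Evaluate an expression. Each case checks the constructor and
   binds its fields in one step. *)
let rec eval = function
  | Const n -> n
  | Neg e -> -(eval e)
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* A single top-down simplification pass using nested patterns.
   It is not a fixpoint: some expressions would need a second pass
   to be fully simplified. *)
let rec simplify = function
  | Add (Const 0, e) | Add (e, Const 0) -> simplify e
  | Mul (Const 1, e) | Mul (e, Const 1) -> simplify e
  | Mul (Const 0, _) | Mul (_, Const 0) -> Const 0
  | Add (a, b) -> Add (simplify a, simplify b)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg e -> Neg (simplify e)
  | Const _ as c -> c
```

If a constructor is later added to `expr`, the compiler warns about every match that does not handle it, which a c switch over an enum only does with extra warning flags.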

anta40 · on Aug 15, 2024

Written in OCaml? Ahh interesting. Suddenly I remember a similar book:

Modern Compiler Implementation in ML: https://www.cs.princeton.edu/~appel/modern/ml/

As an undergrad student, I think the C version is kinda easier to understand, though.

userbinator · on Aug 15, 2024

The bonus of writing a C compiler in C is that you get to experiment with self-compilation.

Croftengea · on Aug 15, 2024

OCaml? Thanks for saving me a click!

hdbxbxndj · on Aug 15, 2024

OCaml is one of the most used languages for compiler design

A good engineer should be able to use the right tool for the job

materielle · on Aug 15, 2024

For hobbyist compiler implementations, right? Compilers for the most popular languages are either written in C/C++, or self-hosted.

You can write compilers in almost any language. I fail to see how C, C++, or even Java or Python aren’t the right tool for the job here. I like pattern matching too, but given that hundreds of successful production compilers have been written without pattern matching, it’s surely just a personal preference.

porcoda · on Aug 15, 2024

I’ve worked on multiple compilers in industry that are written in Ocaml. A number of industrial static analyzers are written in Ocaml too (eg, Infer from Facebook/Meta). Yes, LLVM and GCC are the big ones written in the C/C++ family but they don’t represent everything.

materielle · on Aug 16, 2024

And the Go, Java, Ruby, JavaScript, C#, Typescript, PHP, Kotlin, R compilers, and so on.

But even for hobby projects, it’s just a matter of personal preference. OCaml is great for implementing compilers. So are Go, C++, and Java.

jsnnsjxj · on Aug 15, 2024

I mean, we _are_ talking about a book which invites you to build your own toy C compiler ^^

Nevertheless, OCaml is very strong in compiler design. For example Rust and Hack were written in OCaml initially.

Nevertheless you are not wrong that compilers needing the very last bit of performance like the JVM and LLVM tend to be written in C++

But the dividing line is more high performance vs. very high performance than toy vs. production

Java and Python are suitable for implementing a toy compiler, and the author invites you to use any language you like. Only the reference implementation uses OCaml

I would however argue that using C++ is quite advanced since it does not have pattern matching, and using C is just masochism. You will be fighting against the language to do even trivial things instead of fighting the actual problem at hand

materielle · on Aug 16, 2024

I totally agree that OCaml is a great language to write a compiler. I’ve used Rust and Haskell, and loved them both.

I was more so pushing back on the implication that if it’s not OCaml, it’s not the right tool for the job.

Like, I honestly can’t think of a mainstream language in which it would be hard to implement a C compiler.

wang_li · on Aug 15, 2024

A Retargetable C Compiler is another book that implements a C compiler in C.

https://www.amazon.com/Retargetable-Compiler-Design-Implemen...