Mold 2.0 (github.com/rui314)
430 points by fomine3 on July 26, 2023 | hide | past | favorite | 108 comments


Mold is absolutely excellent work for modern systems.

I've recently been trying to speed up our project builds, and found that linking is absolutely a huge bottleneck. I've got 24 cores * 2 threads to build on, and maybe 30% of that power goes unused because of the linker.

I've made a previous attempt to build with mold but it didn't quite work at the time. I'll be giving it another try.


Mold is amazing. I was playing around with O3DE some months ago, and tried switching to Mold to see if it could improve my build-run cycle times, and it absolutely did. I don't remember the exact numbers, but it was something crazy like multiple seconds with gold and lld, down to under a second with mold.


Multiple seconds...I've worked on binaries that take 20 minutes to link.

Having a ridiculously long cycle time massacres productivity and job satisfaction.


TWENTY minutes? O_o what kinda software/binary was that if you don't mind me asking?


I'm not OP but I've worked on projects with similar link times. Large games that are one monolithic executable and have enormous debug information files (sym or pdb, depending on platform). Here's a good blog post [0].

[0] https://devblogs.microsoft.com/cppblog/improved-linker-funda...


Not the OP either, but I also had terrible link times in SQLPage before I switched to mold. SQLPage is a web server that bundles a lot of html component templates directly in the server binary. This is very practical for users, but was painful for developers, before mold.


I've seen linking take that long or longer when I worked in defense. :)


Something big, decades old, financial. Not pleasant.


We've also done some PoCs at work, and the total build time for our UI layer (complex C++/Qt stuff) dropped from 44-45 mins to 29 by going from lld to mold on a smaller test build machine.


`mold -run ninja` works really well for me.
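For anyone who hasn't seen it: `mold -run CMD` wraps an arbitrary build command and (per mold's README) intercepts the build's exec() calls to ld/lld/gold via LD_PRELOAD, substituting itself, so no build files need to change. The commands below are illustrative:

    # No changes to build scripts needed; mold transparently replaces
    # whatever linker the build would have invoked:
    mold -run ninja
    mold -run make -j"$(nproc)"
    mold -run cargo build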


Very happy to read about the license change! I hope they are able to earn money from the project, but the likelihood of being able to integrate it into any work projects is much higher with MIT licensing. If we do use it, I'll try to get our company to sponsor the project.


The license on a linker shouldn't matter. It isn't injecting copyrighted code and there's already precedent for excepting trivial boilerplate in the GPL ecosystem so nothing in the generated binary should be affected by a copyleft license on the tooling. AGPL would only restrict deploying a privately modified linker via a network service which isn't a realistic scenario for a basic dev tool.


While that is true, it requires a very specific understanding of both intellectual property law and the inner workings of development tooling that is pretty rare in practice. As such, companies are generally conservative in such decisions. Most organizations have a single list of licenses allowed for use rather than getting into specific use cases for specific licenses. These types of environments will generally limit uptake of copyleft tools.

All that said, I'm not convinced the licensing is the issue here (although I wish them the best). We are in a world that has grown accustomed to free development tools, and building a commercially viable business in tooling is incredibly difficult. I'm always amazed by how many developers I know who make a living by creating software yet are unwilling to pay for software themselves.


All of those companies have no issues using `ld`, right? Binutils is GPL and always has been. Seems like a direct analogy.



AGPL and GPL are quite different beasts


> Most organizations have a single list of licenses allowed for use rather than getting into specific use cases for specific licenses.

Bullshit. Most organizations are absolutely fine with different ad-hoc licenses for various closed source software. Applying different standards to copyleft licenses is just due to FUD.


Ignoring the problem of convincing the legal team that the linker's license doesn't affect its output, there is the problem of including the linker in prepackaged toolchains - like the ones that are part of Visual Studio, Xcode, Android Studio, or embedded toolchains provided by hardware vendors. In some of those cases you might still get away with AGPL, but it will require putting in some effort to comply with the license, since in those cases the linker itself is redistributed. It's no secret that Apple avoids GPL as much as they can, with many of the command-line tools they provide being either BSD versions or very old GPLv2 versions.

> AGPL would only restrict deploying a privately modified linker via a network service which isn't a realistic scenario for a basic dev tool.

Interacting with a linker over a network service may sound weird, but it's not that uncommon. For example, Unity offers a Cloud Build service for their engine, which means indirectly interacting with the Android and iOS toolchains. All major cloud providers are making solutions where the development tools and libraries are tightly integrated with their cloud service, in an attempt to make it harder to migrate your project away from them. Regular CI/CD service providers include the most popular development tools in their default environment, both to simplify the development process and so that they can better cache them, thus saving network costs and speeding up builds compared to each customer downloading the toolchain manually. There was also a period when multiple companies were pushing remote dev environments as a solution to minimize the hassle of having each developer set things up locally, thus improving onboarding speed, ensuring everyone is working in the same environment, and simplifying work for company-wide IT management.

In many of those cases there might be 2 or even 3 companies repackaging and redistributing between the original software (linker) author and final user (programmer).

Even in the land of open source things aren't that simple. Not sure if it's still a thing, but there was a period when FreeBSD was trying to remove GPL from its base packages.


Third party CI builds aren't distributed to anyone. Even if the compiled binary was subject to GPL rules, the binary will almost never exit the CI server where it was built so copyleft protection won't be invoked.

You can use modified GPL code to your heart's content in a corporate setting. As all your coworkers are part of one legal entity, private use within the organization is not distribution per the terms of GPL. You have to distribute a binary to someone who can make a claim under the terms of the license before copyleft is activated. Furthermore, you only ever have to disclose source to someone with possession of a derived binary.


I was talking about the case where the linker is the software and the user is the developer performing linking. And mold was previously AGPL-licensed, meaning that having access to the binary isn't required; it's sufficient to interact with the software over a network for copyleft to activate.

Let's imagine Google wanted to include mold in the Android SDK, and a CI service company (like Travis, or GitHub with their Actions) wants to include the Android SDK in their VM images. I would consider using such a CI service for building an Android app to be interacting with the linker software over a network. Meaning everyone in the middle (Google and the CI service) has to deal with the license requirements.


The main purpose of using AGPL as the original license is to force companies that have a blanket-ban on AGPL to buy a commercial license or just buy the entire project, as told by the author.

https://gist.github.com/lleyton/9c0b75d065f37333ea9851b6cad1...

It did not work out, so they're switching to MIT.


Good luck convincing corporate lawyers that the license terms don't matter since we're just using it for linking. I don't know many (any?) that would risk it.


I'm curious about the license change. This is an executable, is it not? Invoking it as a separate process does not require you to make the software calling it GPL, so switching to MIT should have no effect in the common case.

If the authors really wanted a more permissive license, then instead of relicensing from AGPL to MIT they should have relicensed from AGPL to AGPL with linking exception. An example of a project that is GPL with linking exception is libgit2 [1]. This licensing is more permissive but still permits the author to sell commercial licenses to those making closed-source code changes.

[1] https://github.com/libgit2/libgit2#license


> This licensing is more permissive but still permits the author to sell commercial licenses to those making closed-source code changes.

I think the point is that the authors don't want to continue selling licenses, as it wasn't worth the hassle. I guess `sold`, the macOS version, is an exception.


> it wasn't worth the hassle

If selling licenses is a hassle, then that indicates a problem with the open source ecosystem as GitHub and other code hosting websites should offer monetization tools for selling closed-source licenses directly from their web interface. I'm talking legal forms, templates, payment processors, and product tracking. Selling licenses should be easy, not a hassle.



The MPL or the BSL would be a better license. The best would be the MPL with something like Facebook's "700 million MAU" clause, with bigger users requiring a support contract. The MANGA corps benefit the most from mold, given the inherent size of their C++ codebases.


This linker noticeably improves rust development happiness on an exploratory, chunky repo of mine that is trying to be a big ole web monolith (uses SeaORM and axum/tokio). You don't want to know the size of my `target` directory, but incremental builds are snappier!


I need to play around with mold:

  15:55 $ du -Hs --si target/
   11G    target/


Just in case I caused confusion: I'm not sure mold helps with the size of that directory :D. I was commenting on how big my Rust repo really is; just saying that mold seems to help big builds like mine.


Mold is great, here are some usecases I love:

1. Faster rust builds!

    [target.x86_64-unknown-linux-gnu]
    linker = "clang"
    rustflags = [
        "-C", "link-arg=-fuse-ld=mold",
        "-C", "target-cpu=native"
    ]

2. Faster `makepkg` with ArchLinux, by adding "-fuse-ld=mold" to CFLAGS.
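For the Arch case, that's a makepkg.conf tweak along these lines (a sketch; your distro's existing flag values will differ, and appending to the shell variables as shown is just one way to do it):

    # /etc/makepkg.conf (or ~/.makepkg.conf): append to the existing flags
    CFLAGS="$CFLAGS -fuse-ld=mold"
    CXXFLAGS="$CXXFLAGS -fuse-ld=mold"
    LDFLAGS="$LDFLAGS -fuse-ld=mold"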


Is there a way to override this linker setting only for your local install? I.e. I don't want to change production code or binaries, but it would be nice to have faster builds.


Yes, it should just work for all projects if you put it in ~/.cargo/config.toml, or as a per-project setting.

Git life pro tip (that you didn't know you needed until now): You can use .git/info/exclude (present in every Git repository) as a local, private version of .gitignore. It has the same syntax as .gitignore, but isn't tracked by Git. So you could add .cargo/config.toml to .git/info/exclude, changing the linker locally without Git considering it as an untracked file. I generally use this feature to ignore files that are specific to my development setup which I don't want to list in .gitignore.
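To make that concrete (a throwaway-repo demo; `.git/info/exclude` is the standard path in every Git checkout):

    # Hide a local-only .cargo/config.toml from `git status` without
    # touching the shared .gitignore.
    cd "$(mktemp -d)" && git init -q .
    mkdir -p .cargo && echo '# local linker override' > .cargo/config.toml

    # .git/info/exclude is a private .gitignore: same syntax, never tracked.
    echo '.cargo/config.toml' >> .git/info/exclude

    git status --porcelain   # prints nothing: the file is ignored locally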


Great tip! Didn't know about this.

A similar one: ~/.config/git/ignore is a global private gitignore.

Great for files across repos which you always want to ignore, such as temporary files from an editor.


Some projects add project/.cargo/config.toml which is unfortunate, but cool to learn about a global config for this purpose! Ty


What was the before/after on your Rust link times?


A quick and dirty comparison on my project:

    -- GNU ld 2.31.1
    >> cargo build --release
       Compiling mfl v0.1.0 (/mnt/d/Programming/Projects/Rust/mfl/crates/mfl)
        Finished release [optimized + debuginfo] target(s) in 10.14s

    -- LLD-14
    >> cargo build --release
       Compiling mfl v0.1.0 (/mnt/d/Programming/Projects/Rust/mfl/crates/mfl)
        Finished release [optimized + debuginfo] target(s) in 6.20s

    -- Mold 2.0
    >> cargo build --release
       Compiling mfl v0.1.0 (/mnt/d/Programming/Projects/Rust/mfl/crates/mfl)
        Finished release [optimized + debuginfo] target(s) in 6.02s
I did the following for each test:

    cargo clean
    cargo build --release
    touch crates/mfl/src/main.rs
    cargo build --release
With the timings coming from the second build. The project links in LLVM, and the tests were done under WSL1 on Windows 10 with the files on the Windows file system, which is how the project is developed in general.


About a year ago, I was seeing incremental builds go from ~8s to ~2.7s with Mold. Huge quality of life improvement.

(fairly large Rust project, 24-core machine, Linux)


Personal anecdote: linking a debug build of buck2 (big Rust codebase) went from 30s to 3s for me. Pretty wonderful.


How does this latest release compare to lld? Can it run on alpine/musl?


Very wise decision. I know in my own products/code I sometimes have to swallow the tough pill and admit that some things just aren't a good fit for monetization. The market speaks, unfortunately, and it is important to try not to fight it, but instead find something else the market rewards.


Does anyone know if this change of license/business strategy is expected to expand to sold[0] (mold for macOS)?

[0] https://github.com/bluewhalesystems/sold


Apple has a new parallel linker which may or may not be of interest: https://twitter.com/davidecci/status/1665835119331135488


At the very top of mold's README (my guess is no, it won't become OSS):

> This repository contains a free version of the mold linker. If you are looking for a commercial version that supports macOS please visit the repository of the sold linker.


That has been there for a while, so I think it's still an open question whether this change of strategy applies to that repo too.


Looks like someone filed an issue about this https://github.com/bluewhalesystems/sold/issues/35


I don’t get why Apple doesn’t sponsor the project and make Mold the default linker in Xcode.

Give him a stack of cash for his work and make all well with the Universe.

It’d be so easy for them.


They're developing their own parallel linker - https://twitter.com/davidecci/status/1665835119331135488


License changed from AGPL to MIT.

But shouldn't such a fine craftsman be rewarded somehow? Companies (maybe of all sizes) should allocate some funds to their top dependencies/tools or something, and this should be listed on their main websites. More of a cultural norm, I guess?


I had forgotten that it was a fast linker:

> Mold 2.0.0 is a new major release of our high-speed linker. With this release, we've transitioned our license from AGPL to MIT, aiming to expand the user base of our linker. This was not an easy decision, as those who have been following our progress know that we've been attempting to monetize our product through an AGPL/commercial license dual-licensing scheme. Unfortunately, this approach didn't meet our expectations. The license change represents our acceptance of this reality. We don't want to persist with a strategy that didn't work well.


Anyone remember having to pay for cc on Solaris? [0] It was horrible and a terrible way to treat developers who are writing software for your OS!

We have been conditioned for a very long time to not need to pay for low level developer tools and to pay for support instead. I'm surprised they even tried to license it like that.

[0] https://unix.stackexchange.com/questions/12731/usr-ucb-cc-la...


There’s a distinction between the maker of the OS/hardware and an independent vendor trying to monetize the tools.

The OS/hardware maker’s primary interest is in selling more OS or hardware, so it makes sense for them to give away the tools (or their work enhancing the already-free tools) that enable more and better applications.

An independent tool vendor is in a different position, and there have always been vendors with better tools, maybe for a specialized market, or maybe just going above and beyond what comes for free, which is what Mold tries to do.

In other words, it’s clear to everyone now that the essential tools should be free, but surely not that all tools should be free!


> In other words, it’s clear to everyone now that the essential tools should be free, but surely not that all tools should be free!

Obviously not, since their business model changed.


They were charging $100/yr [1]. Literal peanuts. Actually, that is unfair to peanuts; it is literal pocket lint.

The average cost of a developer is more than $100/hr. mold needs to make you 0.05% (not 5%, 0.05%) more productive to directly pay for itself. If mold saves you 13 seconds a day in linking time, it directly pays for itself. This ignores the knock-on effects of a reduced build time shortening the feedback cycle, which completely dwarf the direct benefits.
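To put rough numbers on that (a back-of-envelope sketch; the $100/hr fully-loaded cost, 2000 hrs/yr, and 230 workdays are my assumptions, not figures from the thread):

```python
# Back-of-envelope check of the "$100/yr pays for itself" claim.
hourly_cost = 100        # assumed fully-loaded developer cost, $/hr
license_cost = 100       # mold's former commercial price, $/yr
hours_per_year = 2000    # ~250 workdays * 8 hrs

yearly_cost = hourly_cost * hours_per_year    # $200,000
needed_fraction = license_cost / yearly_cost  # productivity gain to break even

# Linking seconds saved per workday needed to recover the license cost:
workdays = 230
needed_seconds = license_cost / workdays / hourly_cost * 3600

print(f"{needed_fraction:.2%}")       # -> 0.05%
print(f"{needed_seconds:.0f} s/day")  # -> 16 s/day (same ballpark as ~13 s)
```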

It is ridiculous that such dirt cheap pricing for tools is viewed as a problem. As we saw in a post yesterday [2], this is why they say there is no money is tools.

[1] https://bluewhale.systems/

[2] https://news.ycombinator.com/item?id=36869747


Only $100/yr? Likely that is the problem then. Who's going to bother filing a P.O. with their mega company for only that amount?

It also wasn't paywalled, meaning tons of people probably just added it as a dependency in CI and ignored the license restrictions because it wasn't directly part of the shipped binary. There's no good way to enforce that.


> and just ignored the license restrictions because it wasn't directly part of the shipped binary. There's no good way to enforce that.

There's not anything to enforce. The AGPL allows you to run the application for any purpose. (If you modify the source, you have to make the modified source available to all users, including over the network users.)


Oh, you're right on that one. Kind of entertaining that the AGPL is all about linking, and this is the linker.


The GPL is all about linking. The AGPL is about extending what it means to distribute the software to SAAS-like uses.


At work in the 90s I was allowed to buy a brand new sparc workstation with a shrinkwrap compiler (and shrinkwrap sybase). One of the first things I did was compile gcc so I could install the whole gnu world.


They did the same at my first job out of college (Sun based company). It was my first introduction to free software. That was circa 1990... it's amazing that gcc is still going strong.


Exactly! I begged someone random for a gcc tarball.


Now I am imagining you panhandling on the sidewalk, stopping random strangers and shaking slightly: “hey man, can you spare a gcc tarball?”


This is unfortunately still the norm in the FPGA world, which I think is a poor decision in the long term for Intel/Xilinx. I wanted to seriously get into FPGA dev a decade or so ago; I learned to write Verilog and everything, but the tooling made me give up. It was too painful to deal with all the quirks and limitations of the free tools.


It is possible to download and use the Xilinx tools for free. My company sells bare-metal access to FPGAs and our customers do it all the time.

I agree with the tooling being a nightmare though. It is an 80GB+ install that fails half the time.


You forgot that it crashes at random times, and that the text editor refuses to open files because Sigasi can't be initialized.


I didn't forget, I just didn't get past the install. ;-)


I admit I'm mostly familiar with the Intel (formerly Altera) side. There are no restrictions at all on the Xilinx side? That's pretty neat.


You sign up on their website and click ok on the long eula nobody reads and then click download. So yea, there are probably restrictions, but they don't have a paywall in front of them. This isn't legal advice at all.


You had to buy a C compiler on many operating systems. At least once upon a time, it came with another giant set of books.


I am so glad GCC came and took over, even if I didn't have to live through this era.

The standard choice being free and good (best, in many cases) really can't be overstated.


GCC was largely ignored until the day Sun introduced the concept of UNIX user and developer editions.

Only then did it start to gain momentum, from people not willing to pay for the Solaris Developer SDK.

In any case, I bet people using free-beer tools enjoy being paid for their work, so just maybe they should also think about giving some money to the authors of those tools.


CodeWarrior for the win! I used to hate all the times I would mess up a pointer and my whole machine would freeze up.


Over many years I've come across several discussions about fast linkers and how compile times sped up so much by replacing one linker with a faster one, but I've never found out what about them makes them normally so slow. Can anyone shed some light please?


This isn't the whole story, but linking is CPU-intensive and older linkers are mostly single-threaded. A lot of the performance gains come from doing work in parallel, which makes for a big improvement on beefy modern multicore CPUs.

Rui's given some good talks about Mold if you want more info: https://www.youtube.com/watch?v=hAt3kCalE0Y


Even single-threaded, modern linkers (lld, mold, even gold) are faster than BFD ld, which is notoriously slow, and sadly still the default you get on Linux.


Yep. A linker in the best case would run as fast as cat. Paste the binaries together, done. Disk I/O was a problem back when we used spinning rust but less so now.

What takes time is rewriting stuff as you go. Running the relocation tables to insert addresses into the code is cheap. Deadstripping sections is fairly cheap, deadstripping individual basic blocks within functions takes a lot more analysis and thus time.

Deduplicating constant strings is a good idea but involves streaming them all into a hashtable of some sort. Maybe you want them to share common suffixes, more work.

Deduplicating, deadstripping, rewriting debug information takes time. Debug builds can feature many gigabytes of dwarf to rewrite.

Oddly enough, the fact that the linker is scriptable (as in, you can give it a program that it interprets) doesn't seem to be a significant cost. Probably because the script in question is quite short and somewhat limited in functionality.

Historically lld was very fast because it didn't bother doing any of the debug munging or other deduplication. Lld ran fast but the output binary was big.

I'm several years out of the linker performance game now so don't know the current status. In particular I don't know where mold or lld are in terms of quality of output vs their own performance.
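On the string-deduplication point above: one classic trick is "tail merging", where a string that is a suffix of another is stored inside it and shares its tail. A toy sketch of the idea (my own illustration, not mold's or lld's actual code):

```rust
// Sort strings by their reversed bytes, descending, so that any string
// that is a suffix of another lands immediately after a string that
// contains it; a single pass then absorbs the suffixes.
fn tail_merge(mut strings: Vec<&str>) -> Vec<&str> {
    strings.sort_by(|a, b| b.bytes().rev().cmp(a.bytes().rev()));
    let mut kept: Vec<&str> = Vec::new();
    for s in strings {
        // If the previously kept string ends with s, s needs no storage
        // of its own: its address is just an offset into that string.
        let absorbed = kept.last().map_or(false, |prev| prev.ends_with(s));
        if !absorbed {
            kept.push(s);
        }
    }
    kept
}

fn main() {
    // "bc" disappears into "abc"; "xyz" stays separate.
    println!("{:?}", tail_merge(vec!["abc", "bc", "xyz"])); // ["xyz", "abc"]
}
```

Real linkers do this per section-alignment class and while building the hash table mentioned above, but the suffix-sharing idea is the same.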


Best source I've read is Ian Taylor's blog about his work on the gold linker:

https://www.airs.com/blog/archives/38

"Once again, the goal is speed, in this case being faster than my second linker. That linker has been significantly slowed down over the years by adding support for ELF and for shared libraries. This support was patched in rather than being designed in. Future plans for the new linker include support for incremental linking–which is another way of increasing speed."

Just think of the apps that were written in early Unix days - simple single-purpose apps, probably just one source code file, just one obj and libc to link together, no shared libs et al.

The linker code just grew organically as new "must-have" features were added. Correctness of features was more important than speed esp. when spinning rust was a limiting factor.


A PDF for the lld linker has some more info on speedups over ld and gold:

See https://llvm.org/devmtg/2017-10/slides/Ueyama-lld.pdf


I love Mold. Got my Rust debug builds from 15 seconds to 2 seconds on one project.


I'm amazed at how quickly the author responds to requests: https://github.com/rui314/mold/issues/1057

From the report to the fix in less than two days.

I'm not sure how competitive it will be with lld, especially if we consider ThinLTO (which takes multiple minutes on a 64-core machine) - it can make the advantages of mold insignificant.


> I'm not sure how competitive it will be with lld, especially if we consider ThinLTO (which takes multiple minutes on 64-core machine) - it can make the advantages of mold insignificant.

Mold is focused on (incremental) development builds where LTO is probably not what you want anyway. For actual release builds you shouldn't really care that much about the build time.


mold will always win over lld, even if only for the much cooler name.


Anyone got any numbers/info on the impact to LTO optimizations? Some brief googling shows mold does support LTO, but is it as good/better than what you'll find in the LLVM/GCC linkers?


LTO for LLVM/clang and gcc is implemented by getting the compiler to emit internal compiler IR code rather than machine code to the object files. The linker's job is to call into the compilers at link time with the serialized IR code from the object files to produce machine code; the linker does not do the link-time optimization itself. Therefore LTO support in a linker is a pretty binary feature (does it support X compiler on Y platform?) without much of a "good/better" gradation. And when it comes to that, mold implements LTO support for both gcc and LLVM on Unix-like systems.
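A quick way to see that division of labor (gcc shown; clang's scheme with LLVM bitcode is analogous):

    # Build an object with -flto: gcc stores its GIMPLE IR in .gnu.lto_*
    # sections instead of finished machine code.
    cd "$(mktemp -d)"
    echo 'int main(void) { return 0; }' > foo.c
    gcc -flto -c foo.c -o foo.o
    readelf -W -S foo.o | grep -m3 -o '\.gnu\.lto[^ ]*'

    # At link time the linker plugin hands the IR back to gcc for codegen;
    # any plugin-aware linker (BFD ld, gold, lld, mold) can drive this.
    gcc -flto foo.o -o foo && ./foo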


In the end it's the compiler which is being called by the linker to perform LTO; 99% of the time will be spent there.


As a C++ dev, this kind of project makes me dream.

I wish I could use things like this in my day-to-day.

Sadly, stuck on very old OSs and toolchains. And you can forget about ever using Clang! :(


Does this mean Mach-O/Darwin support might get back in?


I wonder how a change to the MIT license should improve the situation. Does that mean that with AGPL, adoption was low because…?


I'm subject to a moratorium on AGPL software at work. Some legal departments forbid or highly restrict the use of AGPL-licensed software. They are concerned about the license's viral nature causing problems for their software products.

Not that I agree with them, but, also, IANAL.


Let's see that positively: with such a moratorium, we're less at risk of having some AGPL software used without respecting its license terms :-)


I'm guessing that's what the commercial license option was for.


I'm sure it is - but that leaves the engineer with the calculation to do of "time this commercial tool will save me" versus "time/goodwill I will burn trying to get the purchase approved".


This is also true of SaaS dev tools.

There are plenty of ways to reduce friction. In this case, mold could have offered a 30 day commercial license to the company to demo the linker and see if the ROI was worth it.


Moral of the story: tools don't sell, regardless of license.


If mold can't sell, I don't know which tool will, because they are really ahead in linking.


Sometimes they do. But at best they support a small development team, and they do better if they help a niche to such an extreme that there isn't a critical mass of people able to band together and make a free version. Something like Hex-Rays is an example, but IIRC open-source tools are replacing them now.


It still does not support .init_array / .fini_array sections. Too bad, I'd like to use it.


Wait, what? You mean it doesn't fill the DT_INIT_ARRAY/DT_INIT_ARRAYSZ values in PT_DYNAMIC?


If it speeds up Rust compilation time, wow!


It doesn't. It speeds up Rust linking time, tho.


Are there available benchmarks around how much the performance changes?


It depends a lot on the project and your CPU. Incremental build (build, change 1 thing, rebuild) times are affected more than clean build times, because linking is a larger % of an incremental build.

When I benchmarked it about a year ago (with a fairly large Rust project on a 24-core machine), incremental builds went from ~8s to ~2.7s.


Is what?


Still, the lack of Windows MSVC support makes us unable to employ mold. A MinGW build, maybe, but I don't think mold is even able to produce PE right now.


According to the developer, he has also written lld-link.exe[1] (i.e. lld that accepts link.exe flags for compatibility), which is already significantly faster than link.exe.

[1]: https://github.com/rui314/mold/issues/190#:~:text=I%27ve%20a...


make replaced by cmake, ouch. How do I get my -march=native -flto perf improvements in there easily? Would need at least 20 lines...


You configure with -DCMAKE_CXX_FLAGS='-march=native -flto' like any other CMake build. Or you `-G 'Unix Makefiles'` and export CFLAGS/CXXFLAGS before you build, if you really want to use make.


You should still be OK, although I agree that CMake is really annoying to approach if you don't know the project's idioms particularly well. E.g. there is a recurring bug in one of our builds where OMP initialization causes a deadlock, which can be avoided by disabling OpenMP at cmake-time: finding out how to do this when I went to disable it permanently took a good 20 minutes of guesswork, because it's CMake magic versus make bullshit.


Try this on for size as a concrete example. (The %{notation} is due to the RPM .spec file syntax -- adjust as required.)

    argv=( ${CMAKE:-cmake3} )
    argv+=( -S %{cmake_source_dir} )
    argv+=( -B %{cmake_build_dir} )
    argv+=( -G Ninja )
    argv+=( -D CMAKE_CXX_FLAGS='-march=native -flto' )
    argv+=( -D CMAKE_INSTALL_PREFIX=%{install_home} )
    argv+=( -D CMAKE_EXE_LINKER_FLAGS="-Wl,-rpath=%{openssl_root}/lib" )
    argv+=( -D OPENSSL_ROOT_DIR=%{openssl_root} )
    "${argv[@]}"



