Hacker News
Zlib-ng: a performance-oriented fork of zlib (github.com/dead2)
100 points by profquail on June 5, 2015 | 45 comments


One thing to be aware of: these replacement zlib libraries are not always "drop-in" safe, because some software such as nginx pre-allocates the zlib state buffers by hard-coding the original zlib structure sizes (the original zlib makes guarantees about memory usage that technically make this behavior somewhat OK). Depending on how well the software is designed, the forked zlib versions may cause buffer overflows or other crashes. (nginx is fine - it spams your log file with errors and re-allocates.)


I'll be watching this with some interest. A couple years ago I tried using some of the x86_64 assembly versions of inflate_fast() and longest_match() but found them to be subtly buggy -- they'd both sometimes do reads outside of the allocated memory and could crash if that happened to land at the end of a mmap'ed segment. I emailed all of the relevant authors, but nobody seemed to be maintaining that stuff any more. Honestly, just having a zlib with a bug tracker is a huge improvement.

I am sad to see that a lot of the zlib-ng changes seem to be code style updates, which will make it hard for fixes to travel back and forth between the codebases. I guess one branch or another will eventually have to "win", which is sad.

Also, is zlib-ng going to commit to keeping the same zlib license? Some of the zlib improvements floating around on the net have snuck in more restrictive licensing. It would be fantastic if zlib-ng could hold the line on license creep.


According to the linked page, "Just remember that any code you submit must be your own and it must be zlib licensed."


> The zlib code has to make numerous workarounds for old compilers that do not understand ANSI-C or to accommodate systems with limitations such as operating in a 16-bit environment.

This is true for many older open source libraries. I wonder:

- Are old platforms still tested as time passes? Is it really worth it maintaining extra cruft for old systems even if it is unknown whether it still works? Are there projects running CI for OS/2 Warp, MS-DOS, or A/UX?

- How many users of old or obscure platforms are there still around? Are they even updating such libraries?

- What platforms do we need to get rid of to get rid of autoconf/automake/configure and use e.g. CMake instead?


A lot of projects even limit their codebases to C89, which I don't understand. Are there really that many platforms/compilers that don't effectively support C99 or greater? Is sixteen years not long enough for a language feature set to stabilize?

Maybe there are tradeoffs or platforms I'm just not aware of, but I at least tend to use C99 as the LCD.


That's mostly because of Microsoft's horribly outdated C compiler.


Why would anyone use VS to compile cross-platform C on Windows when there's MinGW (and clang-cl)?


Debug info, ubiquity, simplicity of combining with an otherwise VS based project.

At least VS now supports most of C99. Most. Now about that C11...


Having code that's easy to integrate with existing projects is definitely a really good thing, and I honestly don't like to neglect Windows, but with all due respect, the platform kind of does it to itself.

C11 would be really nice to see as an LCD, especially for threading (or even just a native POSIX threads implementation).


I was more thinking of language features. threads.h would be nice, but, as you may know, despite it being a pretty simple API, neither glibc nor OS X libc has implemented it - there's only musl and then compatibility shims. Sigh. You could blame a lack of demand, but I'd call that chicken and egg: the whole advantage of it over pthreads in theory is portability due to standardization, but as long as the portability remains theory, so does the advantage.


Wasn't this one of the main reasons the NeoVim project was started?


> What platforms do we need to get rid of to get rid of autoconf/automake/configure and use e.g. CMake instead?

Why would you want that? Autotools is the devil we know. CMake has its own problems and from a statistical sample of one project implementing both, I came to the conclusion that CMake is more frustrating than Autotools when you need to do something unusual.


Exactly. autotools is exceptionally well designed for dealing with more exotic use cases like cross compiling. When designing a cross-compiling Linux distribution (http://exherbo.org/docs/multiarch.txt), we discovered that autotools is by far the easiest to deal with when it comes to arbitrary host & build targets, compilers, etc. I would be even bolder and say that:

Those who do not understand autotools are doomed to reinvent it, poorly.


> Exactly. autotools is exceptionally well designed for dealing with more exotic use cases like cross compiling.

...and it cannot deal with non-exotic use cases like non-UNIX environments (think Windows + Visual Studio) well at all. In contrast, with CMake you can generate an nmake file or Visual Studio project with one command.
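For illustration, the one-command generation being described might look like this (generator names are for VS-2013-era tooling; adjust to the installed Visual Studio version):

```shell
# From a build directory next to an existing CMakeLists.txt:
cmake -G "Visual Studio 12 2013" ..   # emits a .sln/.vcxproj build
cmake -G "NMake Makefiles" ..         # emits an nmake-compatible Makefile
```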


On the other hand, with autotools you can cross-compile for Windows using MinGW.
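A sketch of that workflow for a typical autotools project, assuming the mingw-w64 cross toolchain is installed on the Linux host (prefix path is an example):

```shell
./configure --host=x86_64-w64-mingw32 --prefix=/usr/x86_64-w64-mingw32
make
```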


Just like someone on Unix could always use Mono's xbuild to build a project from Visual Studio's .vsproj files. Of course, people tend to get grumpy when you tell them they have to jump through hoops to use your project - just like with your suggestion...


I use CMake to cross-compile firmware for my home automation hardware. It doesn't seem too bad, just a single file shared across all projects that defines the compiler binaries and settings. I like it much better than autotools.
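A sketch of the kind of shared file being described - a CMake toolchain file passed via `cmake -DCMAKE_TOOLCHAIN_FILE=...` (the ARM bare-metal toolchain here is purely an example):

```cmake
set(CMAKE_SYSTEM_NAME Generic)          # bare-metal firmware target
set(CMAKE_SYSTEM_PROCESSOR arm)

set(CMAKE_C_COMPILER   arm-none-eabi-gcc)
set(CMAKE_CXX_COMPILER arm-none-eabi-g++)

# Don't try to link a host-style test executable during compiler detection.
set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)

# Search the target sysroot only for libraries/headers, the host for programs.
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
```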


Do you have an example online somewhere? This might be useful to have as a reference in the future.



Note that autotools handles all of that (referencing also your link in another child comment) for you with no extra work. Writing an autoconf script for a native architecture means your project should also compile for a cross architecture.

It is for that reason I much prefer autotools, especially at the distribution level where you really don't want to be patching things. CMake can work, but I have had to do more patching for CMake build systems than I have had to for autotools; on that basis I consider autotools "better", at least for our use case.


> I came to the conclusion that CMake is more frustrating than Autotools when you need to do something unusual.

Unusual in terms of...? If I have learnt anything from Go, Java (per Maven), etc., it's that the vast majority of projects don't need to do anything unusual.

Conversely, if you give people the room to do unusual things, they will.


CMake. Ugh. It solves some problems and creates many, many more in the process.

There's a certain simplicity from the perspective of the user that configure && make && make install works most of the time.

CMake requires a lot of effort to get running by comparison.


It seems like this would be more compelling if accompanied by some benchmarks showing how much better the performance is than the original zlib's.


A very rough and ready, impromptu benchmark.

zlib latest took 26 seconds to compress 648 MB of text, zlib-ng latest took 16 seconds.

    alex@martha:~/src/zlib$ export TESTFILE=../enwiki-20141106-pages-articles-multistream-index.txt
    alex@martha:~/src/zlib$ du -h $TESTFILE
    648M	/home/alex/src/enwiki-20141106-pages-articles-multistream-index.txt
    alex@martha:~/src/zlib$ LD_LIBRARY_PATH=./ time ./minigzip <$TESTFILE >/dev/null
    23.75user 0.17system 0:26.05elapsed 91%CPU (0avgtext+0avgdata 1864maxresident)k
    56472inputs+0outputs (0major+144minor)pagefaults 0swaps
    alex@martha:~/src/zlib$ LD_LIBRARY_PATH=./ time ./minigzip <$TESTFILE >/dev/null
    24.29user 0.17system 0:26.80elapsed 91%CPU (0avgtext+0avgdata 1772maxresident)k
    0inputs+0outputs (0major+141minor)pagefaults 0swaps
    alex@martha:~/src/zlib$ cd ../zlib-ng
    alex@martha:~/src/zlib-ng$ LD_LIBRARY_PATH=./ time ./minigzip <$TESTFILE >/dev/null
    14.66user 0.11system 0:16.16elapsed 91%CPU (0avgtext+0avgdata 1604maxresident)k
    256inputs+0outputs (0major+140minor)pagefaults 0swaps
    alex@martha:~/src/zlib-ng$ LD_LIBRARY_PATH=./ time ./minigzip <$TESTFILE >/dev/null
    14.64user 0.12system 0:16.09elapsed 91%CPU (0avgtext+0avgdata 1840maxresident)k
    0inputs+0outputs (0major+144minor)pagefaults 0swaps
The machine is a MBP 11,2 with an i7-4750HQ CPU @ 2.00GHz, running Ubuntu 15.04 x64.


What about decompression? Are we already I/O bound? Compression is irrelevant in many cases, where you can precompress static content.
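The precompression approach mentioned above can look like this in nginx, for example (assuming the ngx_http_gzip_static_module is compiled in; nginx then serves a neighboring `file.gz` directly instead of compressing on the fly):

```nginx
gzip_static on;
```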


zlib latest takes 3 seconds, zlib-ng takes 13 seconds

    alex@martha:~/src/zlib$ du -sh $TESTFILE
    197M	/home/alex/src/enwiki-20141106-pages-articles-multistream-index.txt.gz
    alex@martha:~/src/zlib$ LD_LIBRARY_PATH=./ time ./minigzip -d <$TESTFILE >/dev/null
    3.36user 0.03system 0:03.87elapsed 87%CPU (0avgtext+0avgdata 1632maxresident)k
    8inputs+0outputs (0major+90minor)pagefaults 0swaps
    alex@martha:~/src/zlib$ LD_LIBRARY_PATH=./ time ./minigzip -d <$TESTFILE >/dev/null
    3.34user 0.04system 0:03.76elapsed 90%CPU (0avgtext+0avgdata 1704maxresident)k
    0inputs+0outputs (0major+91minor)pagefaults 0swaps
    alex@martha:~/src/zlib$ cd ../zlib-ng/
    alex@martha:~/src/zlib-ng$ LD_LIBRARY_PATH=./ time ./minigzip -d <$TESTFILE >/dev/null
    12.20user 0.04system 0:13.63elapsed 89%CPU (0avgtext+0avgdata 1576maxresident)k
    264inputs+0outputs (1major+86minor)pagefaults 0swaps
    alex@martha:~/src/zlib-ng$ LD_LIBRARY_PATH=./ time ./minigzip -d <$TESTFILE >/dev/null
    12.15user 0.08system 0:13.58elapsed 90%CPU (0avgtext+0avgdata 1428maxresident)k
    0inputs+0outputs (0major+84minor)pagefaults 0swaps


So the optimizations in zlib-ng enable it to decompress ... more than 4x slower than vanilla zlib? That seems worthy of some further investigation.


I'm curious, was your decompression test file compressed with zlib? zlib-ng? Does it make a difference?


Unfortunately, the minigzip implementation used a very minimal codebase to implement the functionality; among its problems is that it reads and decompresses 1 byte per call to inflate(). A patch and pull request for this has been available since yesterday at least, and I merged it today.

In stock zlib this codepath is only used if you defined Z_SOLO, but in zlib-ng it is the default unless you run configure with --zlib-compat (or otherwise set the correct defines).

Please retest with the updated commit, or even better with --zlib-compat if you want to get close to an apples-to-apples comparison.

EDIT: PS: This goes to show that minigzip cannot really be used as a reliable benchmark, since actual applications linking the library may use it differently (for better or worse).


Edit:

Results with zlib-ng and a newer version of the Cloudflare changes:

https://www.snellman.net/blog/archive/2015-06-05-updated-zli...

Original:

Here are some benchmarks of the Intel and Cloudflare forks that this project is based on:

https://www.snellman.net/blog/archive/2014-08-04-comparison-...

The speedups are not insignificant. (I'll see about updating the post to include results for zlib-ng).


As I mentioned in another comment, the minigzip used is very suboptimal due to reading and decompressing only 1 byte per inflate() call. Please retest with --zlib-compat or the fix that was merged today.


Sure thing, updated.

I'm not in the habit of including compression level 6 in the results, since the application I care about defaults to 5 :) And I wanted a good spread of different compression levels in the results.


Btw, I always wondered: why didn't you test compression level 6? It is the default level, and used by, for example, stock Apache/nginx settings.

Also #zlib-ng on freenode if you or anyone else want to discuss.


For decompression, miniz is already faster than zlib, and with a more permissive license: https://code.google.com/p/miniz/

And my post where I put times and code: http://fizz.buzz/posts/anything-you-zcat-i-zcat-faster-under...


Though it looks like there have been no updates since late 2013. I wonder if it makes sense for someone to step in and import it to GitHub.



If you want more performance out of zlib, step one is to recompile.

I've seen flabbergasting speedup out of basic CPU-bound utilities simply by recompiling the source code for a fifteen-year-old binary using a modern compiler.

Which also brings up the point that if you want to benchmark zlib-ng, compile zlib with the same compiler.
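A sketch of that first step for an autotools-style build like zlib's (the flags are examples, not recommendations, and `-march=native` ties the binary to the build machine):

```shell
CC=gcc CFLAGS="-O3 -march=native" ./configure
make clean && make
```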



This makes me wonder what a good strategy would be for deprecating old architectures. Remove support for them in new versions and backport bugfixes to the last version supporting the old architectures indefinitely?


It's a more difficult issue with something like OpenSSL, as new features are added. Zlib is done, though; there is zero interest in adding deflate64 or a different algorithm. They could just put a version on ice, keep it around for distribution, and make a cleaner 2.0.


How do you pronounce the "-ng"? "zetlibng", like "bang" without the "a", is kind of hard for my mouth to produce.


You spell it out. It's an acronym for “next generation”.


Ah that's what it stands for. Alright.


"zee lib en jee" was my first instinct


zed-lib-enn-gee I presume.

