Implementing software updates where you don't want to ship entire binaries again (and only the diff) would be one. In some video games the assets are also packed into massive binaries, so you don't want to ship gigabytes of data because you replaced one icon. Sadly many games do this anyway nowadays.
There are other solutions to this problem that the game industry uses. Binary diff patching is slow, incremental, involves large diffs and has the possibility of corruption. It was used back in the mid 90's (RTPatch was the big name), but really isn't used anymore because of the drawbacks.
Games frequently use an override directory or file. The patch contains only the files that have changed and is loaded after the main index and replaces the entries in the index with the updated ones. This is the most common way of doing a patch if it's not just overwriting the original files.
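A rough sketch of that index merge, assuming a made-up archive API with names() and read() (not any particular engine's format):

```python
def build_index(base_archive, patch_archives):
    """Merge archive listings so entries from later (patch) archives shadow the
    base game's entries. The patch only needs to contain the changed files."""
    index = {}
    for archive in [base_archive, *patch_archives]:   # load order matters
        for name in archive.names():                  # hypothetical archive API
            index[name] = archive                     # later archives overwrite earlier ones
    return index

def read_asset(index, name):
    return index[name].read(name)                     # hypothetical archive API
```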
Some games load their file as a virtual filesystem and then the patch just replaces the entries in the virtual store with new ones. Guild Wars 2 works this way. This is only common in MMOs though.
Games also use this because it's a straightforward way to almost guarantee a physical ordering of the files in the VFS, which is/was a common optimization strategy in the days of CDs and hard drives (profile what order the game needs files, then put them in the archive in exactly that order = tada, loading 4000 files behaves like a sequential read).
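A minimal sketch of the packing step, assuming profiled_order is the file list captured from a profiling run and using an uncompressed zip purely as the container:

```python
import zipfile

def pack_in_load_order(profiled_order, out_path="assets.pak"):
    """Write assets into the archive in exactly the order the game was observed
    requesting them, so a cold level load becomes one long sequential read."""
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in profiled_order:   # file list recorded during the profiling run
            zf.write(path)
```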
Another reason is that certain operating systems originating in the state of Washington have performance problems when you access small files or directories containing many files.
Circa 1995: we (the company) used RTPatch, as users at the time got software on floppies through the post, but enjoyed BBSes (and Prodigy, etc.) since it was a social community due to the nature of the software/industry. We could upload small RTPatch-based updates and bugfixes to our tiny company BBS, and users could dial in and download an RTPatch update a lot faster than waiting on a floppy in the mail (besides avoiding the usual corrupt floppies that plagued the tech).
Fun fact: this is not a new concept. Doom used overrides with its WAD file format. Mod authors could release mod files that replaced or added content without having to redistribute the game's own level files.
There may be prior art to that, but as a young coder that was the first time I’d seen it
Yes, overrides. I've heard talks on this at conferences with a couple big publishers. A lot of effort is put into it but obviously if we were distributing OS security updates like Apple it would be a whole different ballgame.
To my knowledge, most developers have gone back to binary patching for obfuscation purposes. Bethesda does this now (and ID, by extension), as well as many other developers I've seen.
For at least the Nintendo Switch (not sure about other modern consoles), the digital distribution infrastructure is built in terms of overlay packfiles. Games, updates, and DLC on disk are all single-file archives / filesystem images. The OS, when launching a game, mounts the game + its updates + its DLC together as a transparent overlay filesystem. The game just sees a unified representation of its newest version, with whatever DLC has been installed, sitting under (IIRC) /title.
I wouldn't be surprised if the other consoles also do things this way. It's a very sensible way to manage updates — especially when a game is running off of physical media but the updates are held in local storage. It also means there's no point where the update gets "merged in" to the base image, which means updates can be an atomic thing — either you have the whole update file downloaded + sig-checked (and thus it gets added to the overlay-list at boot) or you don't.
And, if all the consoles are doing it, I wouldn't be surprised if studios that do a lot of work on console don't just use that update strategy even on PC, for uniformity of QA, rather than for "obfuscation."
Games are directories/packfiles containing many individual files, mostly binary art assets, plus one executable that takes up a negligible proportion of the total size. When binary art assets in the directory/packfile are updated between versions, they don't really "change" in the sense that a source-code file might be changed in a git commit; instead, they get replaced. (I.e. every file change is essentially a 100% change.)
The "binary diff patching" you're talking about the game industry using, was just the result of xor-ing the old and new packfiles, and then RLE-encoding the result (so areas that were "the same" were then represented by an RLE symbol saying "run of zeros, length N"). For the particular choices being made, this is indeed much less bandwidth-efficient than just sending a new packfile containing the new assets, and then overlay-mounting the new packfile over the old packfile.
bsdiff isn't for directories full of files that get 100% rewritten on update. (There's already a pretty good solution to that — tar's differential archives, esp. as automated by a program like http://tardiff.sourceforge.net/tardiff-help.html .)
Instead, bsdiff is for updates to executable binaries themselves (think Chrome updates), or to disk images containing mostly executable binaries + library code (think OS sealed-base-image updates — like CoreOS; or, as mentioned above, macOS as of Catalina + APFS.)
In these cases, almost all the files that change, change partially rather than fully. Often with very small changes. The patches can be much smaller if they're done on the level of e.g. individual compiled functions that have changed within a library, rather than on the level of the entire library. (Also, more modern algorithms than xor+RLE can be used — and bsdiff does use them — but even xor+RLE would be a win here, given the shape of the data.)
There's also Google's Courgette (https://www.chromium.org/developers/design-documents/softwar...), which goes further in optimizing for this specific problem domain (diffing executable binaries), by having the diff tool understand the structure/format of executables well-enough to be able to create efficient patches for when functions are inserted, deleted, moved around, or updated such that their emitted code changes size — in other words, at times when the object code gets rearranged and jumps/pointers must be updated.
The goal of tools like bsdiff or Courgette isn't to reduce an update from 1GB to 200MB for ~10k customers. The goal is to reduce an update from 10MB to 50KB for 100 million customers. At those scales, you really don't want to be sending even a 10MB file if you can at all help it. The server time required to crunch the patch is more than paid off by your peering-bandwidth savings.
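Back-of-the-envelope with the numbers above:

```python
users = 100_000_000
full_tb  = users * 10 / 1_000_000        # 10 MB full update per user, in TB
patch_tb = users * 50 / 1_000_000_000    # 50 KB patch per user, in TB
print(f"full downloads: ~{full_tb:,.0f} TB shipped")   # ~1,000 TB (a petabyte)
print(f"patches:        ~{patch_tb:,.0f} TB shipped")  # ~5 TB
```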
XOR+RLE is almost useless for binaries, because almost any change will cause instructions to be added or deleted, offsetting the entire binary after the first change, making the xor fail to converge. On top of that, these changes cause changes in addresses in the first part of the binary, so you end up with a zillion similar-looking xor deltas in the first part of the file that won't compress well with RLE.
In fact, if you use smarter compression than RLE, I wouldn't be surprised if the update was larger than the original binaries after the xor, as an offset xor will likely increase chaos (entropy) in the file, making it compress worse than the original.
bsdiff was specifically designed to intelligently handle these situations, which is why it works.
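For anyone who wants to try it, a minimal sketch using the third-party bsdiff4 Python bindings (file names are placeholders):

```python
# Requires the third-party bsdiff4 package (pip install bsdiff4).
import bsdiff4

old = open("app-old.bin", "rb").read()
new = open("app-new.bin", "rb").read()

patch = bsdiff4.diff(old, new)                 # produce the delta
assert bsdiff4.patch(old, patch) == new        # applying it reconstructs the new binary
print(f"{len(patch)} byte patch for a {len(new)} byte target")
```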
Just tested it on Chromium from my package server (90.0.4430.72 vs 90.0.4430.212):
Pedantic point: it's not "almost useless for binaries." It's almost useless for compiled, PIC binaries in modern executable formats like PE or ELF that allow for lots of load-time address-space rearrangement.
XOR-and-RLE works well for binaries from non-HLL languages (assembler, mostly) where — due mostly to early assemblers' lack of support for forward-referencing subroutine labels from the data section — subroutines tend to ossify into having defined address-space positions.
You can observe this in the fact that IPS-patchfile representations (which, while a different algorithm, are basically equivalent to XOR-and-RLE in their results) of the deltas between different versions/releases of old game ROMs written in assembly are actually rather small relative to the sizes of the ROM images themselves. The v1.1 ROMs are almost always byte-for-byte identical in ROM-image layout to the v1.0 versions, except for where (presumably) explicit changes were made in the assembler source code. Translated releases are the same (sometimes, but not always, because they were actually done by the localization team bit-twiddling the original ROM, since they didn't have access to the original team's assembly code).
(This is also why archives that contain all the various versions/releases of a given game ROM, are highly compressible using generic compressors like LZMA.)
bsdiff is pretty similar to RTPatch, which is what the game industry used in the past. I'm unaware of what you're describing ever being used in practice, especially among large game houses.
That said, patches aren't really downloaded as standalone patches anymore because of Steam distribution. The way Steam handles it is documented, and if you're interested, it's available here: https://partner.steamgames.com/doc/sdk/uploading#Building_Ef...
But as an overview, Steam splits files into 1MB chunks and only downloads the 1MB chunks that have changed. The 1MB chunks are compressed in transit. Steam also dedups the 1MB chunks. I would assume that this works fine to manage the tradeoffs between size and efficiency.
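A rough sketch of the idea (not Steam's actual manifest format or hash choice):

```python
import hashlib

CHUNK = 1024 * 1024  # 1 MB, per the Steamworks docs linked above

def chunk_hashes(data: bytes):
    return [hashlib.sha1(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)]

def chunks_to_download(old: bytes, new: bytes):
    """Indices of 1 MB chunks of the new build the client doesn't already have.
    Using a set also gives you the dedup: identical chunks only count once."""
    have = set(chunk_hashes(old))
    return [i for i, h in enumerate(chunk_hashes(new)) if h not in have]
```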
One of the reasons games do this is that the data is compressed, so a "patch" might be indistinguishable from a full update.
Also, as a dev, you have no idea what version your users are updating _from_. You need to generate some number of patches covering every version users could be updating from, and decide whether in some of those cases they should just download the whole thing again anyway.
The simplest option is to generate patches for recent versions, where "recent" can reach years into the past. It's a linear amount of work, but you only run it on release, so it probably isn't a huge cost. You can also use heuristics such as: if the diff is >20% of the file, just stop and force users still on that version to do a full update.
A second option is using zsync[1]. zsync is basically a precomputed rolling checksum. The client can download this manifest and then fetch just the parts of the file it needs. This way you don't care what the source version is; if there is any similarity, the client saves resources.
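Roughly, the manifest is per-block checksums of the new file, and the client works out which blocks it already has and range-requests the rest. A simplified sketch (block size and MD5 are arbitrary choices here; real zsync pairs a rolling weak checksum with a strong one so the local scan is linear):

```python
import hashlib

BLOCK = 2048  # placeholder block size

def make_manifest(new: bytes):
    """Published next to the new file: checksum of each block -> its offset."""
    return {hashlib.md5(new[i:i + BLOCK]).digest(): i
            for i in range(0, len(new), BLOCK)}

def ranges_to_fetch(local: bytes, manifest: dict):
    """Scan whatever the client already has for matching blocks, then return the
    byte ranges of the new file that still need to be downloaded."""
    have = set()
    for i in range(max(len(local) - BLOCK + 1, 1)):     # naive O(n * BLOCK) scan
        digest = hashlib.md5(local[i:i + BLOCK]).digest()
        if digest in manifest:
            have.add(manifest[digest])
    return [(off, off + BLOCK) for off in manifest.values() if off not in have]
```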
And of course these can be combined. Generate exact deltas for recent versions and a zsync manifest for fallback.
Side note: One nice thing about zsync is that the actual download happens from the original file using range requests. This is nice for caching as a proxy only needs to cache the new data once. Is there a diff tool that generates a similar manifest for exact diffs? So instead of storing the new data in the delta file it just references ranges of the new file.
We usually don’t compress the data on disk; decompression would make loading and file access slower.
Instead, we just pack the uncompressed files together (frequently using normal zip in a no-compression mode) so that we can avoid needing to ask the OS to open and close files for us or examining the contents of a directory, both of which can be kind of startlingly slow (by video game standards) on some common OSes. Instead, we will generally cache the directory data from the zip file and just use that rather than go to disk.
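Something like this in spirit, with Python's zipfile standing in for whatever packfile code the engine actually uses:

```python
import zipfile

class PackFile:
    """Open the pack once, keep the zip's central directory cached in memory,
    and serve all lookups from that instead of touching the OS per asset."""
    def __init__(self, path):
        self.zf = zipfile.ZipFile(path)                        # directory parsed once, here
        self.entries = {i.filename: i for i in self.zf.infolist()}

    def exists(self, name):                                    # no directory scan needed
        return name in self.entries

    def read(self, name):                                      # ZIP_STORED => a raw byte copy
        return self.zf.read(self.entries[name])
```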
(of course, the whole download/patch would all be compressed for network transfer, but files would then be decompressed during the installation process)
On the Switch the reads are so slow that the fastest loading requires at least mild compression, despite the weak CPU. At least it did when I was testing packaging for my latest Switch release.
The PS4 also did the compressed-packages-by-default thing, if I remember right. The upside there being ample CPU for decompression, such that skipping compression was never the fastest option.
I could see that keeping one big file would still be advantageous in that environment, since an fopen plus read plus close on a set of small files over and over does add up, both in CPU time and memory slack. Treating it as one giant packed backing store has speed advantages, at a cost of dev time, and it can still be an advantage even if you're compressing it as well. But I would expect there to be some spot where it crosses over from being an advantage to a disadvantage; I suspect that's with small files/objects. And that's just for speed - you may have to optimize for size in some cases. In the end it is usually better to build some test code and find out. Sometimes the results are what you expect; sometimes you're spending time on things that don't matter anymore but were a big deal 20 years ago.
Oh, definitely! I haven’t worked on anything targeting a console past the PS3 generation and it completely slipped my mind that the latest-gen consoles are architected specifically for streaming compressed data.
On the Windows/Mac/Linux title I’m working on now, I definitely measured a sizeable performance improvement when loading from an uncompressed zip rather than from a compressed one. But even that could be down to the particular set of libraries I’m using to handle it.
> We usually don’t compress the data on disk; decompression would make loading and file access slower.
Did you actually benchmark this? It probably makes sense in your head, but on any vaguely modern hardware it's very unlikely to actually be true because of how exponential the memory hierarchy is.
Console hardware tends to have fast processors & cache but extremely slow RAM. Benchmarking a console's memory vs cache access tends to be one of the first things a team of principal game devs do and that information becomes bible for their titles.
IIRC in a bunch of scenarios compression makes loading and file access faster, as you're I/O limited and it's quicker to read less data and decompress. You do need to choose simple/quick/not-that-much-compressing compression algorithms for that.
I have worked on a number of embedded products which ran from compressed root file system from eMMC. The overhead was a wash because RAM is so much faster than eMMC. What you spent in decompression time was covered by reduced eMMC access time.
If you know a user is on version 3 and needs to update to version 5, then why not just send out all the patches between 3 and 5? Why do you need to generate a new patch for each pair of versions?
It feels a bit egregious when I have to download a 100MB update just because a few characters were buffed or nerfed. More involved changes end up being over 1GB.
Because it's not just version 3 to version 5, it's version 3 to version 84.
Not all versions are made equal either - one might be a character buff, another might reorder assets in the "big huge binary blob file" for performance improvements. At a certain point, rather than downloading 30MB per update for 25 versions and applying each incrementally (remember that you have to do them in order too), it's easier to just download the full 1GB once and overwrite the whole thing.
Microsoft made sure in Windows 10 that it's almost unusable without an SSD, so your big binary blob file has random r/w access anyway.
Most backup software has been able to do good binary deltas of arbitrary data for decades. Even dumb checkpointing solves the problem of downloading 25 versions - you download the latest checkpoint and the deltas from there.
Don't excuse poor design and programming: when you know the file structure, creating a differential update should be a short task. With a tiny bit of algorithmic knowledge you could even optimize the process to only download the needed assets inside your big binary blob - if an asset was changed 7 times during the last 25 versions, you only need to download the last one.
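The per-asset version of that is just a manifest diff, e.g. (hypothetical manifests mapping asset name to content hash):

```python
def assets_to_download(client_manifest: dict, latest_manifest: dict):
    """Fetch only assets whose current hash differs from (or is missing in) what
    the client has, no matter how many intermediate versions touched them."""
    return [name for name, digest in latest_manifest.items()
            if client_manifest.get(name) != digest]
```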
I'd personally like to see a company put a little thought into innovating how they store data on disk so patches can be quickly applied like with git while also not requiring a full source recompilation.
It can get worse - some cheap, badly designed Android phones, when you first buy them, download every monthly update between their build date and the current month, so maybe 10+ updates, and they aren't deltas (diffs) but full images. Ridiculous on so many levels.
It’s because they only tested updates from one version to the next, and not every version to every newer version.
It is a complete image, but phones today have nontrivial state that may be a problem - e.g. your baseband processor might have its own rom with its own update protocol, which changed between image 2 and image 7, so image 10 after image 1 will be unable to update the baseband.
If it's a cheap phone, I'd rather them do something brute force but reliable than try and be clever when they know they don't have the budget to QA it.
I honestly consider that a pretty reasonable trade-off.
> One of the reasons games do this is the data is compressed, so a "patch" might be indistinguishable from a real update.
Does this happen with more advanced compression algorithms? I've rsynced zip files of different versions of internal software and the diff was always much, much smaller than the entire package.
Zip files have all the metadata in a footer rather than a header. As a result, compressed files can be added and overwritten by appending to the file without disturbing already compressed data. Additionally, the "deflate" compression likely does not span across files, so files that did not change from version to version would have a similar compressed byte sequence, regardless of the order they were added to the archive.
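You can see that append-to-override property directly with Python's zipfile (it warns about the duplicate name, but on read the later entry wins):

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("icon.png", b"old icon data")

# "Patch" by appending: the original compressed bytes are untouched; a new entry
# and a fresh central directory are simply tacked onto the end of the file.
with zipfile.ZipFile(buf, "a", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("icon.png", b"new icon data")     # duplicate name: Python emits a warning

with zipfile.ZipFile(buf) as zf:
    assert zf.read("icon.png") == b"new icon data"   # the later entry shadows the old one
```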
I'd argue that zip is a relatively simple compressed archive format. Its simplicity is its charm and the reason it's so popular. More space-efficient algorithms would be less likely to be "patchable", as there would be less redundancy / structure in the compressed representation to exploit (the best compression would have properties similar to random data).
> Additionally, the "deflate" compression likely does not span across files
Clarification: .zip (unlike .tar.gz for example, or "solid" .7z) compresses each file separately; that has nothing to do with the compression algorithm used. In addition, DEFLATE, the LZ77-based compression which is by far the most commonly used in .zip (and also by gzip), has a window size of 32kB (uncompressed). So yes, even if you used DEFLATE on a solid stream (e.g. zipped a .tar archive) it couldn't remove any cross-file redundancy once it's gone past the first 32kB of each file.
On the other hand, the High Voltage SID Collection (HVSC) distributes a zipped zip:
Each file is 1k-20k, and there are 40,000 or so of them. But they are catalogued in directories 3-4 levels deep, so if you just zip them, the metadata takes up 30% or so of the zip.
But the metadata does compress very well, so they zip it again.
zstd strikes a nice balance here. It can inject markers in the bytestream to make it "rsync friendly", but one could just as well say "binary diff friendly".
zstd itself also has the (pretty new) ability to use a shared file as shorthand during compression. What that means in practice is that diffs can be REALLY tiny if you have the previous archive download.
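With the python-zstandard bindings that looks roughly like this (file names are placeholders; the real `zstd --patch-from` flag also takes care of window sizing / long-distance matching for large files, which this toy version glosses over):

```python
import zstandard as zstd

old = open("build-v1.pak", "rb").read()   # what the user already has
new = open("build-v2.pak", "rb").read()   # what we want them to end up with

ref = zstd.ZstdCompressionDict(old)       # previous release used as the reference
patch = zstd.ZstdCompressor(dict_data=ref).compress(new)

restored = zstd.ZstdDecompressor(dict_data=ref).decompress(patch)
assert restored == new                    # tiny download if v1 and v2 mostly overlap
```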
In general yes. After the first difference the compressed streams will be basically random compared to each other. However there are numerous things that may avoid this.
For zip files each individual file is compressed independently. So unchanged files and prefixes don't need to be resent, even if once a file changes the entire tail end of it needs to be resent.
Sometimes compression algorithms "reset" periodically. For example the `gzip --rsyncable` patch. This basically resets the compression stream so that a change will only affect part of the compressed file. This does have a cost in terms of compressed size, because the compressor can't deduplicate across resets. However, if the resets are infrequent you can maintain fairly good delta transfer with little space overhead.
Additionally some delta transfer tools detect common compression and decompress the file "in transfer", performing the delta checks on the original file.
Kind of an aside from your question, but binary patches not being much smaller than the full thing might happen more with modern games?
Hearsay, but from what I've heard modern games may ship multiple copies of some assets with different levels or features so they can be loaded as a sequential read off the disk. While a block-oriented compression algorithm might sync up more reliably, if you're packing 200MB of assets for a level and they're all compressed together to take advantage of being read sequentially, a change 25MB in could still ship ~175MB of changes.
So far, in most of the world, even platter disks (which have really poor performance with modern Windows) are faster than the network. Which means you can download a description of the difference and reorder the file locally much faster than downloading the whole thing. Yes, it needs the file to be made in a way that is update friendly - most current compression algorithms can be configured like that. Yes, compression will be slightly lower, but you will save on both download size AND disk space, because right now most patches require you to download the patch and then apply it, requiring twice the space. If you have a 5% larger asset file but can patch it with a few memcpys on a mapped file, it is a win in every way imaginable.
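Applying such a patch really is just a handful of writes into the mapped file, something like the following (the (offset, bytes) edit list is a hypothetical patch format; in-place patching also gives up easy rollback if interrupted):

```python
import mmap

def apply_in_place(path, edits):
    """edits: list of (offset, replacement_bytes) with equal-length replacements.
    Writes land directly in the existing file, so no second full-size copy is needed."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        for offset, data in edits:
            m[offset:offset + len(data)] = data   # effectively a memcpy into the mapping
        m.flush()
```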
It is just really poor programming, nothing more. And it's everywhere. If find source >/dev/null takes 6 seconds there is no reason for gradle to take 2 minutes on a rebuild. If the dev is used to that, why would they even think about patch optimisation?
Windows game devs just traditionally didn't treat things with such granularity as you might find in a *nix environment where every little thing is a file. Content is then managed as larger blobs and you have a database of offsets or mappings.
It is very unlikely that it was "continuous" (solid) compression anyway. Solid archives give up random file access, and games require exactly that for their assets. You can't decompress a few GB on average just to fetch a sprite of a cat.
The reason games (and software in general) do full downloads instead of binary patches is purely overdefensive and/or stupid. Store software could just check checksums after a patch and re-download only if they fail.
Speaking from experience, AAA games have quite a bit of architecture behind them that can date back a decade or two. So you end up with some tradeoffs. The code may be well-tuned, resource efficient, and mostly crash proof, but some elements can be a bit dated relative to the size and scale of modern assets.
Ahh, thank you for the explanation. That's an awesome tool!
I actually could really see using it, now that I understand what it does.
I've worked on some firmware projects where we did OTA updates and were guilty of shipping the entire binary. Luckily, even the entire binary was rather small, but still it would have been very cool to be able to create a diff and ship only the diff!