
There are solutions to this!

The simplest one is to generate patches against recent versions, where "recent" can go back years. The cost is linear in the number of old versions you support, but you only pay it at release time, so it probably isn't a huge cost. You can also use heuristics: for example, if the diff is more than 20% of the file, skip the patch and force users still on that version to do a full update.
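A rough sketch of that release-time decision, assuming the old builds are available as bytes. `difflib` stands in for a real binary diff tool, and `plan_release`, `literal_bytes` and the version labels are made-up names for illustration:

```python
from difflib import SequenceMatcher


def literal_bytes(old: bytes, new: bytes) -> int:
    """Count the bytes of `new` that cannot be copied from `old`."""
    # autojunk=False: the default heuristic would discard common byte values.
    sm = SequenceMatcher(None, old, new, autojunk=False)
    matched = sum(block.size for block in sm.get_matching_blocks())
    return len(new) - matched


def plan_release(old_versions: dict[str, bytes], new: bytes,
                 max_ratio: float = 0.20) -> dict[str, str]:
    """Decide per old version whether to ship a patch or force a full update."""
    plan = {}
    for version, old in old_versions.items():
        ratio = literal_bytes(old, new) / max(len(new), 1)
        plan[version] = "patch" if ratio <= max_ratio else "full"
    return plan
```

`SequenceMatcher` is far too slow for real binaries; it is only here to make the threshold check concrete.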

A second option is zsync[1]. A zsync manifest is basically a set of precomputed per-block checksums of the new file. The client downloads the manifest, runs a rolling checksum over whatever it already has to find matching blocks, and then downloads only the parts of the file it is missing. This way you don't care what the source is: if there is any similarity, the client saves bandwidth.
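To make the idea concrete, here is a toy version of that block matching. It is not zsync's actual file format or hash choice: the 4-byte block size, the SHA-256 strong hash and the function names are all assumptions for the example.

```python
import hashlib

BLOCK = 4  # tiny block size so the example is easy to follow


def weak(block: bytes) -> tuple[int, int]:
    """rsync-style weak checksum: (sum of bytes, position-weighted sum)."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
    return a, b


def make_manifest(new: bytes, block: int = BLOCK):
    """Server side: one (weak, strong) checksum pair per block of the new file."""
    return [(weak(new[off:off + block]),
             hashlib.sha256(new[off:off + block]).digest())
            for off in range(0, len(new), block)]


def ranges_to_fetch(manifest, new_len: int, local: bytes, block: int = BLOCK):
    """Client side: inclusive byte ranges of the new file not found in `local`."""
    wanted = {}
    for idx, (w, strong) in enumerate(manifest):
        wanted.setdefault(w, []).append((idx, strong))
    have = set()
    if len(local) >= block:
        a, b = weak(local[:block])
        pos = 0
        while True:
            # Cheap weak match first, then confirm with the strong hash.
            for idx, strong in wanted.get((a, b), ()):
                window = local[pos:pos + block]
                if idx not in have and hashlib.sha256(window).digest() == strong:
                    have.add(idx)
            if pos + block >= len(local):
                break
            # Roll the window one byte forward without rescanning it.
            out, inn = local[pos], local[pos + block]
            a = (a - out + inn) & 0xFFFF
            b = (b - block * out + a) & 0xFFFF
            pos += 1
    ranges = []
    for idx in range(len(manifest)):
        if idx in have:
            continue
        start = idx * block
        end = min(start + block, new_len) - 1
        if ranges and ranges[-1][1] + 1 == start:
            ranges[-1] = (ranges[-1][0], end)  # merge adjacent ranges
        else:
            ranges.append((start, end))
    return ranges
```

In this sketch a short trailing block never matches and is always fetched. The returned pairs map directly onto HTTP `Range: bytes=start-end` requests.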

And of course these can be combined. Generate exact deltas for recent versions and a zsync manifest for fallback.

[1] http://zsync.moria.org.uk/

Side note: One nice thing about zsync is that the actual download happens from the original file using range requests. This is nice for caching, as a proxy only needs to cache the new data once. Is there a diff tool that generates a similar manifest for exact diffs? That is, instead of storing the new data in the delta file, it would just reference ranges of the new file.
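A sketch of what such a manifest could look like, not a reference to any existing tool. `difflib` again stands in for a real binary differ, and `fetch_range` is a hypothetical callback that would issue an HTTP range request against the new file:

```python
from difflib import SequenceMatcher


def build_manifest(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus byte ranges of `new` to fetch."""
    ops = []
    sm = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))   # (offset in old, length)
        elif j2 > j1:                           # "replace" or "insert"
            ops.append(("fetch", j1, j2 - 1))   # inclusive range in new
        # "delete" needs no op: those old bytes are simply not copied.
    return ops


def apply_manifest(old: bytes, ops, fetch_range) -> bytes:
    """Rebuild the new file; fetch_range(start, end) returns new[start..end]."""
    out = bytearray()
    for kind, x, y in ops:
        if kind == "copy":
            out += old[x:x + y]
        else:
            out += fetch_range(x, y)
    return bytes(out)
```

The manifest contains no file data at all, so the only bytes a proxy ever sees are ranges of the new file itself.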


