Guess the Size of an Lrzip Archive of all Linux Kernel Sources Since 1.0

I've written before about the efficiency of lrzip. Now Con has done something useful and fun: he created an lrzip archive of the sources of all 2.6 Linux kernels. Guess how much space it took. Update: guess how big an archive of *all* Linux sources since 1.0 is.
A few hints:
- All sources in a tar are 10.3 GB.
- The 2.6.39.1 kernel is 73 MB as a tar.bz2 archive.
- Not every release is that big, though: the first kernel was smaller (just 32 MB).
- So taking the average, (73 + 32) / 2 = 52.5 MB, times 39 would be just about 2 GB (compressed!).
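The back-of-the-envelope estimate in the hints can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the variable names are mine, and the figures are the ones quoted in the hints (the multiplier of 39 is taken as stated).

```python
# Rough estimate of the total size of the kernels as tar.bz2 archives.
# All figures come from the hints in the post; the names are illustrative.
newest_mb = 73   # 2.6.39.1 as a tar.bz2 archive
oldest_mb = 32   # the first kernel, per the post
releases = 39    # multiplier used in the post

average_mb = (newest_mb + oldest_mb) / 2
estimate_mb = average_mb * releases

print(f"average per release: {average_mb} MB")
print(f"estimated total: {estimate_mb} MB (~{estimate_mb / 1000:.1f} GB)")
```

A simple average of the smallest and largest release assumes sizes grew roughly linearly, which is good enough for a guess.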

- 19,617,064,960 bytes linux-1.0-2.6.39.tar
- 11,067,473,920 bytes linux-2.6.0-2.6.39.tar

- ~2,000,000,000 bytes linux-2.6.0-2.6.39.tar.bz2 (estimate)
-   1,535,618,848 bytes linux-2.6.0-2.6.39.tar.xz 13.8%

With heavy, slow IO it took less than an hour to compress, and lrzip was more than twice as fast as xz.

Now that I've given you all these numbers, let's see how much it really is... (make a guess and click the link)

171,879,382 bytes linux-2.6.0-2.6.39.tar.lrz 1.6%.
Yes, that's 10.3 Gigabytes down to about 163.9 Megabytes,
in less than an hour.

Update: all Linux sources ever:

221,368,298 bytes linux-1.0-2.6.39.tar.lrz 1.1%.
That's 18.2 G down to 211 Megabytes in under 40 minutes on SSD.
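If you want to check the percentages and megabyte figures yourself, here is a small sketch that derives them from the byte counts listed above. The helper names are my own, and it assumes the "Megabytes" in this post are binary units (1 MB = 2^20 bytes), which is what the quoted figures match.

```python
# Recompute compression ratios from the byte counts quoted in the post.
# Helper names are illustrative; binary units are an assumption.

def ratio_percent(compressed_bytes, original_bytes):
    """Compressed size as a percentage of the original size."""
    return 100 * compressed_bytes / original_bytes

def to_mib(num_bytes):
    """Convert bytes to binary megabytes (2**20 bytes)."""
    return num_bytes / 2**20

# name: (compressed bytes, original tar bytes)
archives = {
    "linux-2.6.0-2.6.39.tar.lrz": (171_879_382, 11_067_473_920),
    "linux-1.0-2.6.39.tar.lrz": (221_368_298, 19_617_064_960),
}

for name, (compressed, original) in archives.items():
    print(f"{name}: {ratio_percent(compressed, original):.1f}% "
          f"of the tar, {to_mib(compressed):.1f} MB")
```

The same two helpers work for any other archive in the lists above, for example the xz one.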

And then he says: "Lrzip can compress it even further with zpaq as an option, but it makes decompression much slower so I'd personally find the archive less useful."

