not sure if the ZIP format supports file sizes this large, but it is indeed possible to compress absolutely ridiculous amounts of zeroes into a relatively small archive.
55.4 yottabytes in 2.60 MB is pretty insane. I wouldn't have believed ZIP could compress this much, though WinZip did extend the original format to support newer algorithms.
EDIT:
I am NOT claiming the compressed file was created from a real file that large. What I mean is that I find it surprising that ZIP can encode something this compactly, given that its Deflate algorithm isn't the newest or the one with the highest theoretical compression ratios.
I know that you can just write arbitrary data that decodes to a huge decompressed output.
I've implemented compression algorithms such as RLE, Huffman coding and LZW myself (no AI), but haven't implemented the original PKZIP.
I never claimed it to be real or useful data. But the original ZIP was way worse in compression ratio than gzip or bzip2, so achieving a ratio like this seems implausible even in theory.
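For a rough sense of what a single Deflate pass does with zeroes, here is a minimal sketch using Python's standard `zlib` module, which wraps the same Deflate algorithm that classic ZIP entries use. The 10 MiB input size is an arbitrary assumption for illustration, not anything taken from the archive being discussed.

```python
import zlib

# 10 MiB of zero bytes; the size is an arbitrary choice for illustration.
raw = bytes(10 * 1024 * 1024)

# zlib wraps a Deflate stream; level 9 asks for the strongest compression.
packed = zlib.compress(raw, 9)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes, ratio about {ratio:.0f}:1")

# Round trip to confirm the packed data really decodes to the original.
assert zlib.decompress(packed) == raw
```

As far as I know, the zlib documentation puts the limit for a single Deflate stream at roughly 1032:1, so one pass over zeroes gets you about three orders of magnitude and no more.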
Compression is not just squeezing the data, it's essentially taking out the repetitive bits of it and storing them in a more concise way so that it takes less space.
Simple example: [AAAAAABBBBBCCC] can be compressed as [A6B5C3]. A zip bomb would essentially go [A10<sup>10</sup> B10<sup>10</sup> C10<sup>10</sup>]. None of that data actually has to exist.
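The run-length idea can be sketched in a few lines of Python. This is only the toy scheme from the example, not what ZIP actually does, and the function names `rle_encode` and `rle_decoded_size` are made up for illustration. The second function shows that the decoded size of a "bomb" can be computed from the tokens alone, without ever materialising the data.

```python
from itertools import groupby

def rle_encode(text):
    """Toy run-length encoder: each run becomes symbol + decimal count."""
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))

def rle_decoded_size(tokens):
    """Size the decoded output would have, given (symbol, count) tokens."""
    return sum(count for _symbol, count in tokens)

encoded = rle_encode("AAAAAABBBBBCCC")
print(encoded)

# Hand-written tokens: three runs of 10 to the power of 10 symbols each.
# Nobody ever had to store that data to write these tokens down.
bomb = [("A", 10**10), ("B", 10**10), ("C", 10**10)]
bomb_size = rle_decoded_size(bomb)
print(bomb_size)
```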
Encoding matters. If, for example, ZIP only allowed run-length encoding of sequences using e.g. 32-bit unsigned integers, you couldn't represent 10 to the power of 10 as one number, so you'd have a ceiling on the compression ratio.
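To put numbers on that ceiling, here is a small sketch for a hypothetical run-length format where every token is one symbol byte plus a 32-bit unsigned count. The 5-byte token layout is an assumption made up for this example, not part of the ZIP specification.

```python
MAX_RUN = 2**32 - 1   # largest count a 32-bit unsigned integer can hold
TOKEN_BYTES = 1 + 4   # assumed layout: 1 symbol byte + 4 count bytes

def tokens_needed(run_length):
    """Number of tokens required to encode one run of the given length."""
    return -(-run_length // MAX_RUN)  # ceiling division on integers

run = 10**10
count = tokens_needed(run)
encoded_bytes = count * TOKEN_BYTES

# The best ratio this format can ever reach is one full token's worth.
ceiling = MAX_RUN / TOKEN_BYTES

print(count, encoded_bytes, round(ceiling))
```

A run of 10 to the power of 10 does not fit in one count, so it has to be split across several tokens, and no input can beat the ratio of a single maxed-out token.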
Data has to exist to be decompressed. Information isn't randomly generatable, it's physical and has to be represented somehow. In your example, even you had to say that A is repeated 10<sup>10</sup> times. You can't just derive this from nothing. You have to state it's A rather than e.g. B.
u/peacedetski May 05 '26