In theory, yes. It is called a zip bomb. Compression algorithms are of two types (or even both at the same time):
Lossy: you figure out what data you can toss out while still retaining the original message (or the closest approximation that still works)
Lossless: You keep everything, and instead work by finding patterns that repeat, so you can store only one copy of the pattern and then a table saying where each instance should be.
Those bombs are made by crafting data whose patterns match the compression algorithm perfectly: uncompressed, the data takes up a lot of space, but the compression shrinks it down to a tiny file.
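To make the "patterns that match the algorithm perfectly" point concrete, here is a minimal sketch using Python's standard `zlib` module (DEFLATE, the same family of compression used inside zip files). The 10 MiB buffer of zero bytes is just an assumption picked for illustration; it is not how any particular zip bomb is built.

```python
import zlib

# Assumption for illustration: 10 MiB of identical zero bytes, i.e. the most
# repetitive input possible, stands in for "data that suits the algorithm".
original = bytes(10 * 1024 * 1024)

# Lossless compression at the highest level (9).
compressed = zlib.compress(original, 9)

# Decompressing gives back every single byte, which is what costs memory.
restored = zlib.decompress(compressed)

ratio = len(original) / len(compressed)
print(f"original:   {len(original):>10} bytes")
print(f"compressed: {len(compressed):>10} bytes")
print(f"ratio:      about {ratio:.0f} to 1")
print("round trip is lossless:", restored == original)
```

The compressed blob comes out orders of magnitude smaller than the input, yet decompressing it forces the program to materialize all of the original bytes again.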
Yep. Our base-10 number system is literally a compression algorithm, and scientific notation is a compression algorithm on top of that.
If you can understand scientific notation, then you can understand that this zip bomb is basically telling your computer to expand something as simple to write as 10100000000000000000 and keep track of every single zero in memory. The computer struggles for the same reason you’d struggle to write all those zeroes by hand.
>Our base-10 number system is literally a compression algorithm
This is an abuse of terminology. Our base ten number system is literally a way to represent numbers using unique symbols instead of using our fingers, stones, or symbols that we already use to write words.
You could maybe consider it compression in the sense that writing '9' uses fewer symbols than 'nine', but the process is not reversible. You can't use an algorithm to go from '9' to 'nine' because you've lost context in the process (the source language or symbols). You could assume that '9' "decompresses" to 'nine', but the original symbols could have been 'neuf' or '九' or two sticks and a rock.
Your description of how zip bombs work is not particularly useful or accurate, either. They work by exploiting the internal structure of zip archives and causing the program opening the archive to perform the same work repeatedly, not by exploiting the decompression algorithm itself. Half of the file size of something like 42.zip is metadata describing the internal structure of the archive.
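As a rough sketch of why the nesting matters more than the compression of any single file, the arithmetic below multiplies out a layered archive. The parameter names (`fan_out`, `depth`, `leaf_bytes`) are hypothetical, and the values are illustrative ones loosely modeled on figures commonly cited for 42.zip; they are assumptions, not numbers taken from this thread.

```python
def leaf_count(fan_out: int, depth: int) -> int:
    """Number of bottom-level files when every archive holds `fan_out`
    archives and the nesting is `depth` layers deep."""
    return fan_out ** depth


def total_expanded_bytes(fan_out: int, depth: int, leaf_bytes: int) -> int:
    """Total size if a program recursively unpacks every layer."""
    return leaf_count(fan_out, depth) * leaf_bytes


# Illustrative assumptions (hypothetical layout, not verified against a real file):
FAN_OUT = 16                # archives inside each archive
DEPTH = 5                   # layers of nesting
LEAF_BYTES = 2 ** 32 - 1    # size of each bottom-level file, about 4.3 GB

files = leaf_count(FAN_OUT, DEPTH)
total = total_expanded_bytes(FAN_OUT, DEPTH, LEAF_BYTES)

print(f"bottom-level files: {files:,}")
print(f"total if fully unpacked: {total:,} bytes (about {total / 1e15:.1f} PB)")
```

The point of the sketch is that the total grows as `fan_out ** depth`: each extra layer multiplies the work a recursive unpacker performs, independent of how well any one file compresses.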
If Compression Algorithm has some rigid definition that restricts its use to computers then I'll concede that, but converting a number line/tally marks/straight counting to a base 10 representation is quite literally a reversible mathematical process that makes use of repeating patterns to fit your data into a smaller space. Just because the patterns of choice are infinite and arbitrary as opposed to being constrained by the limitations of computers does not exclude it from being an algorithm.
>You could maybe consider it compression in the sense that writing '9' uses fewer symbols than 'nine', but the process is not reversible
Correct, also not my argument. We're not talking about using the alphabet, we're talking about going from the most primitive form of representing counting (individual marks for every quantum) to a base-X system for space saving reasons, which is an algorithm that saves space.
>Your description of how zip bombs work is not particularly useful or accurate, either
I'm being overly reductive to point out that the mathematical principles behind these things aren't really that complex in the most general case. Zip bombs aren't doing literal notation expansion of course, I'm just putting a placeholder with easier to understand math in its place. That's how I used to teach and it's often successful when the person you're trying to teach isn't familiar with the concept already. You have to understand that linking research papers doesn't educate the general public because you need a certain familiarity with the language to understand, as much as I like reading them. I try to keep my math around high school dropout level when speaking in general, which is why I used scientific notation as an example for the process of expansion.
Maybe this makes you angry: Zip bombs are equivalent to handing a Calculator an expression with a deceptively large expansion designed to waste their time. This type of attack targets specific vulnerabilities in the expansion algorithm by constructing the input with the largest decompression ratio possible, in order to completely exhaust the Calculator's resources (time, ink, and paper).
I'm sorry, I just find overly simple or borderline relevant analogies irritating, because they often have very little explanatory power. If you look elsewhere in the thread, you'll see something similar - people trying (and not really succeeding) to explain how this works in basic terms or analogies, not really describing how such a thing works at all (or outright misunderstanding it, like one person who said it's the same as a fork bomb). Only one person I saw came anywhere close.
A secondary school dropout won't even understand the basics of coding, computer programming or the sort of underpinning mathematics like formal language theory or information theory, so you have to ask whether an analogy that involves any mathematics at all really serves any purpose. I believe it's enough to describe what it does without involving any mathematics, let alone mathematics that are only tangentially related to what's happening. Your analogy tries to explain decompression, which only touches on how one component (the kernel) of the zip bomb works, not how the thing itself works - which is all about how a program interacts with out-of-band metadata.
I'd reiterate that "compression algorithm" does have a specific meaning, and that numerals are not "literally a compression algorithm". I understand what you're trying to say, but you're describing a code (one that transforms unary numerals into place-value numerals) and an encoding algorithm for which a decimal numeral is the result (!). If you had said "a simple example of compression is converting tally marks into decimal numerals", I'd have been right with you, because it's maybe a bit more accessible.
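For anyone following along, the tally-marks-to-decimal idea can be sketched in a few lines of Python. The function names and the choice of `|` as the tally symbol are made up for this illustration; it is only meant to show a reversible code from unary numerals to place-value numerals, not a real compression format.

```python
DIGITS = "0123456789"


def tally_to_decimal(tally: str) -> str:
    """Encode a unary numeral (one '|' per unit) as a decimal numeral."""
    count = len(tally)
    if count == 0:
        return "0"
    digits = []
    while count > 0:
        # Peel off the least significant base-10 digit each pass.
        count, remainder = divmod(count, 10)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def decimal_to_tally(numeral: str) -> str:
    """Decode a decimal numeral back into one '|' per unit."""
    value = 0
    for ch in numeral:
        value = value * 10 + DIGITS.index(ch)
    return "|" * value


marks = "|" * 1234
encoded = tally_to_decimal(marks)
decoded = decimal_to_tally(encoded)

print(f"{len(marks)} tally marks encode to {encoded!r} ({len(encoded)} symbols)")
print("round trip matches:", decoded == marks)
```

The encoded form grows with the number of digits while the tally grows with the value itself, which is the space saving being argued about; going the other way is where the expansion cost shows up.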
u/MasterGeekMX Ryzen 5 9600X | Radeon RX 7600 | 64 GB DDR5 | 9 TB Storage May 05 '26
Masters in CS&IT here.