Discussion about this post

Robert Rider

Really enjoyed this, especially the framing of compression as a tradeoff between CPU and I/O rather than just “making things smaller.” That perspective feels a lot closer to how systems actually behave under load.

The layering you described also stood out to me. The idea of doing some form of semantic or structural encoding before entropy compression feels like where a lot of untapped potential is. Most systems seem to stop at entropy, even though reshaping the data beforehand could reduce the problem space significantly.

I’ve been experimenting with something similar on a smaller scale, trying to push compression slightly closer to structure and meaning instead of just pattern detection. Not replacing something like gzip or zstd, but giving them a cleaner input to work with.
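A minimal sketch of what "a cleaner input" can mean in practice, offered as an illustration rather than anything taken from the post: it assumes hypothetical, regularly spaced integer timestamps and uses delta encoding as a stand-in for the structural step, with `zlib` from the Python standard library as the entropy stage. The helper names (`delta_encode`, `delta_decode`, `pack`) are made up for this example.

```python
import struct
import zlib
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each difference from the previous one.

    Assumes a non-empty list of integers.
    """
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    return list(accumulate(deltas))


def pack(values):
    """Serialize integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(values)}q", *values)


# Hypothetical data: millisecond timestamps about one second apart with small jitter.
timestamps = [1_700_000_000_000 + i * 1000 + (i * 7) % 13 for i in range(10_000)]

# Same compressor, two different inputs: raw values vs. delta-encoded values.
raw_size = len(zlib.compress(pack(timestamps)))
delta_size = len(zlib.compress(pack(delta_encode(timestamps))))

print(f"zlib on raw values:    {raw_size} bytes")
print(f"zlib on delta values:  {delta_size} bytes")
```

The compressor is unchanged in both runs; only the shape of its input differs. How much this helps depends entirely on the data, which is the generalization cost in question: a transform tuned for monotonic timestamps does little for free-form text.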

One thing I’m curious about is how you think about the boundary between semantic encoding and complexity. At what point do the gains start getting outweighed by the cost of generalizing across different types of data?

It feels like that edge is where most of the interesting work is right now.
