'16-'17 General Discussion for Lesson 2.2

So this is the thing about this kind of text compression… there is always a tipping point where the amount of stuff in the dictionary overwhelms the benefits of the compression. (Extreme example: if you added each individual character of the alphabet to the dictionary, you’ve now at least doubled the size of the “compressed” version, since you now have to account for two characters where there was just one.)
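To put rough numbers on that extreme case, here is a minimal Python sketch. It assumes the "compressed" size is the encoded text plus every character stored in the dictionary (one for the symbol, plus the characters it stands for); the `total_size` helper and the choice of uppercase letters as symbols are just for illustration, and a real tool may count things differently.

```python
import string

def total_size(encoded, dictionary):
    """Encoded text plus every symbol and pattern character in the dictionary (assumed accounting)."""
    return len(encoded) + sum(len(symbol) + len(pattern)
                              for symbol, pattern in dictionary.items())

# Extreme case: the text is the alphabet, and every letter gets its own entry.
text = string.ascii_lowercase
dictionary = dict(zip(string.ascii_uppercase, text))  # 'A' -> 'a', 'B' -> 'b', ...

encoded = text
for symbol, pattern in dictionary.items():
    encoded = encoded.replace(pattern, symbol)

# The encoded text is no shorter, and the dictionary adds two characters per entry.
print(len(text), len(encoded), total_size(encoded, dictionary))  # 26 26 78
```

Nothing got shorter, and the dictionary is pure overhead, so this is about as far past the tipping point as you can get.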

So what you want is to substitute a single character for as many large groups of characters as possible.
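One way to see why both "large" and "many" matter is to estimate what a single substitution actually buys you. This is only a sketch under an assumed accounting: each dictionary entry costs the pattern's length plus one character for its symbol, and `net_savings` is a made-up helper name.

```python
def net_savings(text, pattern):
    """Characters saved by swapping every occurrence of `pattern` for one symbol."""
    occurrences = text.count(pattern)              # non-overlapping matches
    saved_in_text = occurrences * (len(pattern) - 1)
    dictionary_cost = len(pattern) + 1             # the pattern plus its one-character symbol
    return saved_in_text - dictionary_cost

text = "she sells sea shells"
for pattern in ["s", " s", "she", "ells"]:
    print(repr(pattern), text.count(pattern), net_savings(text, pattern))
# 's' 6 -2
# ' s' 3 0
# 'she' 2 0
# 'ells' 2 1
```

Under that accounting, a frequent but tiny pattern like "s" loses ground, and only "ells" comes out ahead in a phrase this short.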

What’s nuts is that there is no way to know what’s best, and what you choose to substitute first makes a difference. But you always reach the tipping point. For example, I’ve attached a screenshot of a way to do “she sells sea shells” that gets ~40% compression. And I followed basically the same heuristic you outlined!
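To illustrate the "what you substitute first makes a difference" part, here is a small Python sketch that applies the same two patterns in two different orders. The helper names, the digit symbols, and the size accounting (encoded text plus symbol and pattern characters in the dictionary) are all assumptions made for the example. It is not the method from the screenshot and will not reproduce its ~40% figure.

```python
def compress(text, patterns, symbols="123456789"):
    """Apply patterns in the given order; symbols are assumed not to occur in the text."""
    encoded, dictionary = text, {}
    for symbol, pattern in zip(symbols, patterns):
        if pattern in encoded:
            encoded = encoded.replace(pattern, symbol)
            dictionary[symbol] = pattern
    return encoded, dictionary

def decompress(encoded, dictionary):
    """Undo the substitutions, expanding the most recently added symbol first."""
    text = encoded
    for symbol, pattern in reversed(list(dictionary.items())):
        text = text.replace(symbol, pattern)
    return text

def total_size(encoded, dictionary):
    """Encoded text plus every symbol and pattern character in the dictionary."""
    return len(encoded) + sum(len(s) + len(p) for s, p in dictionary.items())

text = "she sells sea shells"

enc_a, dict_a = compress(text, ["she", "ells"])
enc_b, dict_b = compress(text, ["ells", "she"])

print(enc_a, total_size(enc_a, dict_a))  # 1 s2 sea 1lls 22
print(enc_b, total_size(enc_b, dict_b))  # 2 s1 sea sh1 21
```

Replacing "she" first breaks up the "ells" inside "shells", so the second pattern only matches once; going the other way costs "she" a match instead. In this tiny example both orders end up above the original 20 characters, which is the tipping point again.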