Zipf’s law says that, in a typical corpus of text, the second most-used word appears about half as often as the most-used word. The third most-used word appears about one-third as often, the fourth about one-fourth as often, and so on.

For example, this graph shows the number of times the most commonly used words in Melville’s Moby Dick appear.

[Graph: word frequencies in Moby Dick]

This pattern has turned up in nearly every book or collection of text where it’s been looked for. A similar pattern has been observed in the physical world as well. For example, the largest city in a country often has roughly twice as many people as the second largest, the third largest has about a third as many, and so on. In economics, website traffic, and the number of ‘likes’ on Facebook posts, this pattern shows up over and over.
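
If you’d like to check the pattern on a text of your own, the short Python sketch below counts words and compares each count with what Zipf’s law would predict (the top word’s count divided by the rank). The function name `rank_frequency` and the sample sentence are our own illustrations, and the word-splitting rule is a deliberately simple assumption.

```python
import re
from collections import Counter


def rank_frequency(text, top_n=10):
    """Return (rank, word, count, zipf_prediction) rows for the top_n words.

    zipf_prediction is the top word's count divided by the rank, which is
    what an idealized Zipf distribution would give.
    """
    # Simplifying assumption: a "word" is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return []
    top_count = counts[0][1]
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(counts, start=1)
    ]


# Hypothetical sample text; substitute the full text of any book.
sample = "the cat and the dog and the bird saw the fish"
for rank, word, count, predicted in rank_frequency(sample, top_n=2):
    print(f"{rank}  {word:<5} actual={count}  predicted={predicted:.1f}")
```

On a short sample the match is a coincidence of the example. On a long text such as a novel, expect the actual counts to follow the predicted ones only approximately.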

In spite of plenty of research over the years, no one has come up with a definitive answer to why Zipf’s law is so prominent in so many places. In this newsletter, we can’t even start to tell you about the leading theories. But, if you’re curious, check out this great video from Michael Stevens.

What is clear is that Zipf’s law can be applied to how you think about your website and how to get more traffic to it. In our online course, Achieving Top Search Engine Positions, we propose an exercise where we have students come up with 50 words they think are relevant to their website that someone might search for.

We then have them find out how frequently people search for each of these words, and how many other sites use those words. Using these numbers, students calculate what’s called a Keyword Effectiveness Index (KEI) in order to get an idea of which of the words are the most likely to get them more search engine traffic.
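
As a rough illustration of the calculation, here is a minimal Python sketch. It assumes one commonly cited form of the KEI, the number of searches squared divided by the number of competing pages. That formula, the function names, and all of the numbers are assumptions made for the example, not figures from the course.

```python
def kei(monthly_searches, competing_pages):
    """Keyword Effectiveness Index, ASSUMING the form searches**2 / competition."""
    if competing_pages <= 0:
        raise ValueError("competing_pages must be positive")
    return monthly_searches ** 2 / competing_pages


def rank_keywords(stats):
    """Sort {keyword: (searches, competing_pages)} by KEI, best first."""
    scored = [(keyword, kei(searches, pages))
              for keyword, (searches, pages) in stats.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical search and competition numbers, for illustration only.
keyword_stats = {
    "widgets": (5000, 5_000_000),
    "blue widgets": (1000, 50_000),
    "cheap blue widgets": (200, 4_000),
}

for keyword, score in rank_keywords(keyword_stats):
    print(f"{keyword:<20} KEI = {score:.1f}")
```

With these made-up numbers, “blue widgets” scores 20, “cheap blue widgets” scores 10, and “widgets” scores 5, even though “widgets” has the most searches. Its competition is simply too heavy.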

When we calculate KEI values for a website’s keywords, we often find that one keyword is far better than the others. The next best is usually about half as good, the third about a third as good, and so on. If you’ve chosen 50 keywords to optimize your site for, just the top few will deliver most of the benefit.
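
To get a feel for how lopsided this is, the Python sketch below assumes an idealized case in which the keyword at rank r is worth exactly 1/r as much as the best one, and works out what share of the total value the top few keywords carry. Real KEI values will only loosely follow this, and the function name `top_share` is our own.

```python
def top_share(total_keywords, top_k):
    """Share of total value held by the top_k keywords.

    Assumes an idealized Zipf pattern: the keyword at rank r is worth 1/r.
    """
    if not 0 < top_k <= total_keywords:
        raise ValueError("top_k must be between 1 and total_keywords")
    weights = [1 / rank for rank in range(1, total_keywords + 1)]
    return sum(weights[:top_k]) / sum(weights)


# Share of the benefit from the best 5 and best 10 of 50 keywords.
print(f"top 5 of 50:  {top_share(50, 5):.0%}")
print(f"top 10 of 50: {top_share(50, 10):.0%}")
```

Under that assumption, the best 5 of 50 keywords account for roughly half of the total, and the best 10 for about two-thirds.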

Economist Vilfredo Pareto is the namesake of what is called the Pareto principle, which states that “for many events, roughly 80% of the effects come from 20% of the causes.” Word usage follows a similar pattern: roughly 18% of the words in a language account for 80% of the occurrences of words.

Since commonly-used words appear so much more often than not-so-commonly-used words, search engine traffic is really about a very small percentage of words. No one searches for very common words such as “of”, “the” and “to”. If you look at the text of this newsletter, you may find one or two words that may enable it to be found on a search engine. These are the words that make it understandable to Google. The rest of the words are what make it understandable to people.

The long tail strategy of SEO says that having many keywords will pay off because you’ll get an occasional click from each of a lot of different keywords, and those clicks will add up to huge gains. The Zipf’s law strategy of SEO says to find out what your best keywords are and to focus on optimizing your site for them in order to realize the biggest gains with the least effort.

Zipf’s Law and SEO