What does Zipf's law mean?
Definition: The probability of occurrence of words or other items starts high and tapers off. Thus, a few occur very often while many others occur rarely. Formal definition: P_n ∼ 1/n^a, where P_n is the frequency of occurrence of the nth-ranked item and a is close to 1.
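As a rough illustration of the formal definition, the sketch below computes probabilities proportional to 1/n^a for the first few ranks and normalizes them so they sum to 1. The function name `zipf_probabilities` and the choice of five items are assumptions made for this example, not part of the definition.

```python
def zipf_probabilities(n_items, a=1.0):
    """Return normalized probabilities proportional to 1 / n**a for ranks 1..n_items."""
    # Unnormalized weight for each rank n (rank 1 is the most frequent item).
    weights = [1.0 / (n ** a) for n in range(1, n_items + 1)]
    total = sum(weights)
    # Divide by the total so the probabilities sum to 1.
    return [w / total for w in weights]


# Hypothetical example: five ranked items with exponent a = 1.
probs = zipf_probabilities(5, a=1.0)
for rank, p in enumerate(probs, start=1):
    print(f"rank {rank}: P = {p:.3f}")
```

With a = 1, the item at rank 1 is twice as probable as the item at rank 2 and three times as probable as the item at rank 3, which is the "starts high and tapers off" behaviour described above.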
How is Zipf's law calculated?
Suppose a word occurs f times and that, in the list of word frequencies, it has a certain rank, r. If Zipf's law holds, then for all words f = a/r^b, where a and b are constants and b is close to 1. (Here b is the exponent and a is a scaling constant.) Taking the logarithm of each side of the equation, we obtain log(f) = log(a) - b*log(r), so a plot of log(f) against log(r) is a straight line with slope -b.
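Because log(f) = log(a) - b*log(r) is a straight line in log(r), one common way to estimate a and b is an ordinary least-squares fit on the log-log values. The following is a minimal sketch under that assumption; the function name `fit_zipf` and the synthetic data are made up for illustration, and other estimators (such as maximum likelihood) may be preferable for real data.

```python
import math


def fit_zipf(frequencies):
    """Estimate a and b in f = a / r**b via least squares on log(f) = log(a) - b*log(r).

    Expects at least two strictly positive frequencies.
    """
    # Rank 1 goes to the most frequent item.
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx                      # equals -b
    intercept = mean_y - slope * mean_x    # equals log(a)
    return math.exp(intercept), -slope


# Synthetic frequencies that follow f = 1000 / r exactly (a = 1000, b = 1).
synthetic = [1000.0 / r for r in range(1, 51)]
a_hat, b_hat = fit_zipf(synthetic)
print(f"a = {a_hat:.2f}, b = {b_hat:.3f}")
```

On real word counts the points will scatter around the line, so the fitted b is only an estimate of how closely the data follow Zipf's law.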
Why does Zipf's law work?
Zipf's law, which states that the probability of an observation is inversely proportional to its rank, has been observed in many domains. One proposed explanation rests on the observation that real-world data is often generated from underlying causes, known as latent variables.
What is the Zipf mystery?
Zipf's law arose out of an analysis of language by the linguist George Kingsley Zipf, who theorised that, given a large body of language (for example a long book, or every word uttered by Plus employees during the day), the frequency of each word is close to inversely proportional to its rank in the frequency table.
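The frequency table mentioned above can be built by counting words and sorting them by count. The sketch below is one simple way to do that; the function name `rank_frequency_table`, the tokenizing pattern, and the sample sentence are assumptions for illustration, and a text this short is far too small to show Zipf's law itself.

```python
import re
from collections import Counter


def rank_frequency_table(text):
    """Return a list of (rank, word, frequency), most frequent word first."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # most_common() sorts by descending count; rank 1 is the most frequent word.
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical sample text.
sample = "the cat and the dog and the bird"
table = rank_frequency_table(sample)
for rank, word, freq in table:
    print(rank, word, freq)
```

For a large corpus, Zipf's law predicts that the product of rank and frequency in such a table stays roughly constant.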