How long would it take a monkey to write ‘Hamlet’?
The infinite monkey theorem allows us to explore probability and the limits of chance
The so-called infinite monkey theorem states that a monkey hitting keys at random on a typewriter would eventually type any literary work: Hamlet, Don Quixote, or even an original bestseller. The scenario is hardly practical, since an immortal monkey willing to type forever is difficult to come by, but the idea opens the door to interesting concepts such as randomness, behavior at infinity, and computation based on the generation of pseudorandom numbers.
The theorem is a direct consequence of the second Borel-Cantelli lemma. In the form needed here, the lemma states that if each attempt to achieve a specific outcome is independent of the others and has the same probability of success, greater than zero, then over an unlimited sequence of attempts that outcome will occur infinitely many times with probability 1. In the case of the infinite monkey theorem, if a monkey presses keys at random indefinitely, the probability that it types a given text in a single attempt is very low but not zero. Since the attempts repeat indefinitely and are independent of each other, the lemma implies that the monkey will, with probability 1, write the desired text infinitely many times.
The theorem relies on several assumptions. The first is that the monkey must type randomly. Colloquially, we understand a random phenomenon as one whose outcome cannot be determined with certainty before it occurs, even if the initial conditions are known. Examples of randomness include rolling a die or drawing lottery numbers. In the monkey’s case, it is assumed that at each keystroke, all keys have an equal probability of being pressed, regardless of the text already typed.
This condition allows us to calculate the probability that the monkey types any given sequence. For example, the probability of typing “hello” by randomly pressing five keys on a keyboard (considering only the letters and the space) is (1/27)^5, approximately 0.00000007. This very small value, for such a short sequence, already shows how complicated the matter is.
This is where the second assumption of the theorem comes into play: there is an infinite amount of time available, and therefore an infinite number of attempts. After n attempts, each taken for simplicity to be a separate block of five keystrokes, the probability that the sequence “hello” has not appeared is (1 - 0.00000007)^n. Although (1 - 0.00000007) is very close to 1, raising it to the power n gives a value that approaches zero as n grows. Therefore, by making n large enough, the probability that the monkey types “hello” can be made as close to 1 as we want.
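The arithmetic in this argument can be sketched in a few lines of Python. The function names and the 99% target below are illustrative assumptions, not part of the original argument; attempts are treated as separate, independent blocks of keystrokes on a 27-key keyboard (26 letters plus the space).

```python
import math

ALPHABET_SIZE = 27  # 26 letters plus the space, as in the text


def single_attempt_probability(length, alphabet_size=ALPHABET_SIZE):
    """Probability that `length` uniformly random keystrokes spell one fixed sequence."""
    return (1 / alphabet_size) ** length


def prob_at_least_one_hit(n_attempts, p_single):
    """1 - (1 - p)^n, computed with log1p/expm1 so tiny p values are not rounded away."""
    return -math.expm1(n_attempts * math.log1p(-p_single))


def attempts_needed(confidence, p_single):
    """Smallest n such that 1 - (1 - p)^n reaches the requested confidence."""
    return math.ceil(math.log1p(-confidence) / math.log1p(-p_single))


if __name__ == "__main__":
    for length in (4, 5):
        p = single_attempt_probability(length)
        n = attempts_needed(0.99, p)  # 0.99 is an arbitrary illustrative target
        print(f"length {length}: p = {p:.2e}, attempts for 99% = {n:,}")
```

The number of attempts needed grows by a factor of roughly 27 for every extra character, which is why the argument only works with an unlimited supply of attempts.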
The same applies to any other sequence, even the one that includes all the words of Hamlet in order, and this is what the infinite monkey theorem’s claim is based on. But can we roughly estimate how long it might take, with high probability, to obtain Shakespeare’s classic? A recent article calculated that, with near certainty, the entire current population of monkeys would not manage to write a text longer than a few words before the heat death of the universe.
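To get a feel for the scale involved, the sketch below works in logarithms, since the raw numbers are far too large to print. Both the text length and the keystroke budget are hypothetical round figures chosen for illustration; they are not measured values for Hamlet and are not taken from the article mentioned above.

```python
import math

ALPHABET_SIZE = 27  # 26 letters plus the space


def log10_inverse_probability(text_length, alphabet_size=ALPHABET_SIZE):
    """log10 of 1/p, where p is the chance that one block of random keystrokes equals the text."""
    return text_length * math.log10(alphabet_size)


def longest_length_within_budget(log10_keystrokes, alphabet_size=ALPHABET_SIZE):
    """Longest sequence length L for which alphabet_size**L does not exceed the keystroke budget."""
    return math.floor(log10_keystrokes / math.log10(alphabet_size))


if __name__ == "__main__":
    HYPOTHETICAL_TEXT_LENGTH = 130_000  # assumed placeholder for a long play, in characters
    HYPOTHETICAL_LOG10_BUDGET = 30      # assumed budget of 10**30 keystrokes in total

    digits = log10_inverse_probability(HYPOTHETICAL_TEXT_LENGTH)
    print(f"1/p is about 10^{digits:.0f}")
    print("longest sequence within budget:",
          longest_length_within_budget(HYPOTHETICAL_LOG10_BUDGET), "characters")
```

Under these assumptions, even an enormous budget of keystrokes only covers sequences a couple of dozen characters long, which is consistent with the "few words" order of magnitude quoted above.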
Another curious experiment related to this theorem is a website that lets users input any sequence and then simulates the random generation of text until the given sequence appears. To produce the text, the site uses so-called pseudorandom number generators. Since these are based on rules, the calculations these programs perform are completely deterministic: if all initial conditions are known, the generated number can be predicted. In other words, pseudorandom numbers are not truly random. However, if the initial conditions of a well-designed generator are unknown, the generated values are, in practice, very hard to distinguish from truly random numbers. Various techniques exist for this purpose, such as generators based on modular arithmetic or those based on cryptographic methods, among others.
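A minimal sketch of such a simulation follows, using a generator based on modular arithmetic (a linear congruential generator). This is not the code of the site described above: the class and function names are hypothetical, and the multiplier and increment are a common textbook choice, used here only for illustration.

```python
KEYS = "abcdefghijklmnopqrstuvwxyz "  # 26 letters plus the space


class LinearCongruentialGenerator:
    """Pseudorandom generator defined by x_next = (a * x + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_state(self):
        # Fully deterministic: the same seed always yields the same sequence.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def next_index(self, n):
        # Scale the state into [0, n); the high bits of an LCG are the better ones.
        return (self.next_state() * n) // self.m


def keystrokes_until(target, seed, max_keystrokes=10_000_000):
    """Type pseudorandom keys until `target` appears; return the keystroke count or None."""
    if not target:
        raise ValueError("target must be a non-empty string")
    rng = LinearCongruentialGenerator(seed)
    window = ""
    for count in range(1, max_keystrokes + 1):
        # Keep only the most recent len(target) characters typed so far.
        window = (window + KEYS[rng.next_index(len(KEYS))])[-len(target):]
        if window == target:
            return count
    return None  # not found within the keystroke limit


if __name__ == "__main__":
    print(keystrokes_until("hi", seed=2024))
    print(keystrokes_until("hi", seed=2024))  # same seed, same answer
```

Running the search twice with the same seed gives the same count, which illustrates the determinism described above; changing the seed changes the whole sequence of keystrokes.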
Finally, in the era of large language models, could these be used as substitutes for the monkeys in our experiment? Could ChatGPT or DeepSeek spontaneously write Don Quixote if asked to write for an infinite amount of time? The above reasoning doesn’t apply directly, since these models generate text based on the probability of words appearing in a given context; their output is not a sequence of independent, uniformly random keystrokes. And since Don Quixote is among the texts they have been trained on, it might seem that the probability of them reproducing the entire work would be higher than in the previous case.
However, several factors make this extremely unlikely. First, these models are not trained to faithfully replicate texts in Golden Age Spanish, but rather in modern languages, making it difficult for them to accurately follow Cervantes’s style. Furthermore, these programs are designed not to copy verbatim large portions of the texts they learned from, further reducing the chances of reproducing complete works. This, combined with other limitations of the program, means that although the model could get closer than monkeys to certain parts of the text, the probability of reproducing it in its entirety is tiny.
Pablo García Arce is a predoctoral researcher at the Spanish National Research Council (CSIC) at the Institute of Mathematical Sciences (ICMAT).