I wrote an article for Nature News about a recent unexpected discovery about prime numbers. A prime number that ends in a 1 is less likely to be followed by another one ending in 1 than random chance would dictate. The finding is more general, extending to all primes, not just those that end in 1, and covering last digits in any base other than 2. The paper behind the finding is on the arXiv if you want to read it for yourself.

The result is surprising because prime numbers, although they are definitely not random, seem to behave like random numbers. As Erica Klarreich writes in her (excellent, as usual) Quanta article about the phenomenon, treating the primes as essentially random “does an excellent job of predicting certain features of the real primes, such as how many to expect between two consecutive perfect squares.” If we assume primes are random, we expect a prime ending in 1 to be followed by another prime ending in 1 about 25 percent of the time, because primes larger than 5 can end only in 1, 3, 7, or 9. For primes under a billion, the real figure is 18 percent of the time.
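For readers who want to see the bias for themselves, here is a minimal Python sketch that tallies last-digit pairs of consecutive primes. The function names and the bound of one million are my own illustrative choices, not anything taken from the paper, and the exact share you get depends on the bound you pick.

```python
from collections import Counter


def primes_below(limit):
    """Return all primes p < limit with a Sieve of Eratosthenes."""
    if limit < 3:
        return []
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out every multiple of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [n for n in range(limit) if sieve[n]]


def last_digit_transitions(limit):
    """Count (last digit of p, last digit of the next prime) pairs.

    Only primes larger than 5 are used, so every last digit is 1, 3, 7, or 9.
    """
    primes = [p for p in primes_below(limit) if p > 5]
    return Counter((a % 10, b % 10) for a, b in zip(primes, primes[1:]))


def share_followed_by_same(counts, digit):
    """Fraction of primes ending in `digit` whose successor ends in `digit` too."""
    total = sum(n for (a, _), n in counts.items() if a == digit)
    return counts[(digit, digit)] / total if total else 0.0


if __name__ == "__main__":
    counts = last_digit_transitions(1_000_000)  # illustrative bound (assumption)
    for d in (1, 3, 7, 9):
        print(d, "->", d, f"{share_followed_by_same(counts, d):.3f}")
```

If the primes behaved like independent random draws from the four allowed last digits, each printed share would sit near 0.25.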

Robert Lemke Oliver and Kannan Soundararajan, the Stanford mathematicians who noticed this pattern, explain the phenomenon using the Hardy-Littlewood k-tuple conjecture. In talking about their work with James Maynard, another number theorist, I learned something peculiar and upsetting about this conjecture, or rather about it and its sibling conjecture.

The k-tuple conjecture is one of two Hardy-Littlewood conjectures about the distribution of prime numbers. It improves on the heuristic of primes being random by taking into account the fact that prime numbers have no small prime factors. As I wrote in my Nature News article,

The idea behind it is that there are some configurations of primes that can’t occur, and that this makes other clusters more likely. For example, consecutive numbers cannot both be prime — one of them is always an even number. So if the number n is prime, it is slightly more likely that n + 2 will be prime than random chance would suggest. The k-tuple conjecture quantifies this observation in a general statement that applies to all kinds of prime clusters.

The k-tuple conjecture, then, gives fairly precise estimates for how many twin primes, sexy primes, and prime triplets there should be, and so far, these estimates match observed data very well.
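As a rough illustration of what such an estimate looks like, the sketch below compares an actual count of twin prime pairs with the standard Hardy-Littlewood prediction for twin primes, 2·C₂·∫₂ˣ dt/(ln t)², where C₂ ≈ 0.6601618158 is the twin prime constant. The function names, the bound, and the simple midpoint-rule integration are my own assumptions for the sake of a short example.

```python
import math

TWIN_PRIME_CONSTANT = 0.6601618158  # C2, the twin prime constant


def count_twin_primes(limit):
    """Count pairs (p, p + 2) with both prime and p + 2 < limit."""
    if limit < 3:
        return 0
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return sum(1 for p in range(2, limit - 2) if sieve[p] and sieve[p + 2])


def hardy_littlewood_twin_estimate(limit, steps=200_000):
    """Approximate 2 * C2 * integral from 2 to limit of dt / (ln t)^2.

    Uses a plain midpoint rule, which is accurate enough for a comparison.
    """
    a, b = 2.0, float(limit)
    h = (b - a) / steps
    total = sum(1.0 / math.log(a + (i + 0.5) * h) ** 2 for i in range(steps))
    return 2 * TWIN_PRIME_CONSTANT * total * h


if __name__ == "__main__":
    limit = 1_000_000  # illustrative bound (assumption)
    print("actual twin prime pairs:", count_twin_primes(limit))
    print("predicted:", round(hardy_littlewood_twin_estimate(limit)))
```

The same conjecture gives analogous formulas, with different constants, for other patterns such as sexy primes and prime triplets.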

The other Hardy-Littlewood conjecture is the seemingly innocuous statement that there are more primes in the first n numbers than in a string of n numbers starting anywhere else on the number line. For example, there should be more primes under 100 than primes between 900 and 1,000 and more under 1,000 than between 130,000 and 131,000.
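This one is easy to test by hand for small numbers. The sketch below, with function names and a search bound that are my own choices, counts primes in a window and looks for the densest window of a given length in a modest range. I use the usual formal statement of the conjecture, π(x + y) ≤ π(x) + π(y) for x, y ≥ 2, so a window here means the numbers from x + 1 through x + y.

```python
def prime_counts_up_to(limit):
    """Return a list pi where pi[k] is the number of primes <= k."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    pi, running = [0] * (limit + 1), 0
    for k in range(limit + 1):
        running += sieve[k]
        pi[k] = running
    return pi


def primes_in_window(pi, x, y):
    """Number of primes p with x < p <= x + y."""
    return pi[x + y] - pi[x]


def densest_window(pi, y):
    """Largest prime count in any window (x, x + y] with x >= 2 that fits in the table."""
    return max(primes_in_window(pi, x, y) for x in range(2, len(pi) - y))


if __name__ == "__main__":
    pi = prime_counts_up_to(200_000)  # illustrative search bound (assumption)
    print("primes under 100:", pi[99])
    print("primes between 900 and 1,000:", primes_in_window(pi, 900, 100))
    print("densest later window of length 100:", densest_window(pi, 100))
```

A search like this only covers the range you sieve, so it can support the conjecture for small numbers but says nothing about what happens much further out.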

To me, this conjecture is even more plausible than the k-tuple conjecture, in part because it is more straightforward. The primes are just so bottom-heavy! The prime number theorem, an excellent theorem if ever there was one, says that primes thin out further along the number line and tells us exactly how quickly. The probability that a number is prime is roughly inversely proportional to its number of digits. A number with 4 digits is about half as likely to be prime as a number with 2.
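To make the thinning-out concrete: the prime number theorem says the density of primes near n is roughly 1/ln n, and ln n grows in step with the number of digits of n. The short sketch below, whose function names are mine, compares that approximation with actual counts. It is a heuristic comparison, not a statement about error terms.

```python
import math


def count_primes_below(limit):
    """Count primes p < limit with a Sieve of Eratosthenes."""
    if limit < 3:
        return 0
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return sum(sieve)


def pnt_density(n):
    """Approximate chance that a number near n is prime, per the prime number theorem."""
    return 1.0 / math.log(n)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        print(n, count_primes_below(n), round(n / math.log(n)))
    # ln(10^4) is twice ln(10^2), so the estimated density halves.
    print(pnt_density(10 ** 2) / pnt_density(10 ** 4))
```

The simple n/ln n estimate undercounts at these sizes, but the ratio between the two densities is exactly the factor of two that comes from doubling the number of digits.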

There are 25 primes less than 100 and 168 less than 1,000. I find it difficult to believe that there are places along the number line where the primes bunch up enough to make up for those very dense areas, and that's why the second conjecture seems so reasonable.

If you asked me about the conjectures, I’d say they both sound very plausible, and I’m in good company. Number theory giants G. H. Hardy and John Littlewood made these two conjectures in the same 1923 paper, but they cannot both be true. In 1974, Ian Richards published a paper showing that the second contradicts the first and positing that the first is more likely to be true. Today, number theorists generally assume the first, the k-tuple conjecture, and many proofs are contingent on it.

Based on the consensus of number theorists and the many, many computations that seem to support the k-tuple conjecture, I have reluctantly come to accept that somewhere up there, in the vast expanse of primes, sits a cluster that outweighs the first chunk of prime numbers.