An AI's Walk to the Edge of Erdős
Paul Erdős asked in 1946 a question whose elementary statement disguises the depth of the problem: among n points in the plane, how many pairs can lie at exactly distance one from each other? The known upper and lower bounds have stood, in essentially their current form, for forty years. A recently produced AI chain of thought — 125 pages of mathematical reasoning — argues that a particular conjectural upper bound on this count is wrong. This note explains, without assuming mathematics beyond AP calculus, what the problem is, what the proposed disproof tries to do, and where the genuine novelty in the argument lies. The story is, in roughly equal parts, about arithmetic geometry and about how an automated reasoning system can stalk a hard problem in the way a working mathematician would.
§1The Question
Place n points anywhere in the plane. Count the pairs that happen to be at exactly distance one apart. What is the largest this count can be, as a function of n?
This is the entire question. You can pose it to a high-school student. The answer has been one of the most stubborn open problems in combinatorial geometry for eighty years.
Call the maximum ν(n) — "nu of n," just a name for the quantity we're trying to pin down. A few small examples build intuition. Three points at the corners of a unit equilateral triangle give three unit distances. Four points at the corners of a unit rhombus (a tilted square stretched into a diamond, with all four sides of length one) give five. By the time you reach a few dozen points, you can already arrange them in ways that produce surprisingly many unit pairs — and the question of how surprisingly many is the whole game.
§2Two Bounds and a Gap
In number theory and combinatorics, when an exact answer is out of reach, you settle for bounds — a "no fewer than" floor and a "no more than" ceiling. The hope is that you can squeeze them together until they meet. Two such estimates frame the Erdős problem, and they have stubbornly refused to meet for forty years.
The lower bound (Erdős, 1946). A lattice is just a regular grid of points — like graph paper. Take a √n × √n square of integer-spaced grid points (so the total count is n), and then rescale the whole picture so that some chosen distance becomes exactly one. Erdős's clever observation: if you choose that distance well, an enormous number of point-pairs will land at unit distance.
The reason involves a small piece of classical number theory. How many ways can a given integer be written as a sum of two squares? For most integers, very few. But for special integers — ones that factor in particular ways — there can be very many. For instance, 25 = 0² + 5² = 3² + 4² = 4² + 3² = 5² + 0², counted with sign and order, gives multiple representations. The divisor function of an integer counts how many divisors it has, and integers with many divisors are precisely the ones with many sum-of-two-squares representations. The maximum value of the divisor function grows like
— a peculiar expression we'll come back to. (Here "exp" means "e to the power of," "log" is the natural logarithm, and "log log n" means "the log of the log of n" — so for n equal to ten billion, log log n is about three.) This unusual growth rate is the signature of highly composite numbers — integers like 720, 5040, 55440, which have unusually many divisors for their size. Erdős's lattice, choosing the right integer to scale by, achieves
for some fixed positive c. That weird "1/log log n" in the exponent is not a flourish; it is a fingerprint of number theory hiding inside what looked like a plane-geometry problem. It says: the geometry of the plane is somehow communicating with the arithmetic of highly composite numbers.
The upper bound (Spencer–Szemerédi–Trotter, 1984). The ceiling comes from a completely different argument. If you draw all the unit-distance pairs as line segments, the segments have to cross each other a lot in any dense arrangement. The "crossing lemma" makes this quantitative: it gives a lower bound on how many crossings a dense graph drawn in the plane must have. Pushing this hard yields
for some constant C. This bound has not been improved in forty years.
The gap is enormous. The lower bound grows like n times something almost-but-not-quite-constant (since 1/log log n shrinks toward zero, slowly). The upper bound grows like n times n-to-the-one-third. Decades of work have not been able to bring them together. Erdős himself conjectured that the lower bound is essentially the truth — that
for some fixed C. This is the statement that the 125-page chain of thought sets out to disprove.
The argument is a one-way attack: produce, for every constant C, an arbitrarily large family of point configurations that beat the conjectured bound. A single such family suffices to refute the conjecture. Proving the conjecture is harder by an entirely different measure — you have to control every possible arrangement of points. The asymmetry between proving and disproving is the seam the argument tries to exploit.
§3Recasting Geometry as Arithmetic
The first move — standard, and not the novelty — is to identify the plane with the complex numbers. Here's the dictionary: a complex number is just a pair of real numbers (a, b) written as a + bi, where i is the formal symbol for √(−1). The point (a, b) in the plane is the complex number a + bi. Addition of complex numbers is the same as adding vectors. The modulus of a complex number a + bi, written |a + bi|, is just its distance from the origin, namely √(a² + b²). So two points are at distance one if and only if their difference is a complex number of modulus one — that is, sits on the unit circle.
Now we add a layer of structure. An additive group is just a set of objects you can add and subtract, where the result stays inside the set. The integers ℤ are an additive group; so are the real numbers, and so are the complex numbers. A more interesting one for our purposes: the Gaussian integers, written ℤ[i], are the complex numbers a + bi where a and b are both integers. They form a square lattice in the plane. You can add and subtract them and stay inside the set.
If our point configuration sits inside some additive group, then a unit distance between two points corresponds to an element of the group that happens to lie on the unit circle. So the question "how many unit distances can n points have?" becomes "how many elements of modulus one can a group of n points contain — measured by how many translations within the group happen to be by unit-modulus elements?"
Erdős's lattice construction is exactly this idea applied to the Gaussian integers. The unit-circle elements of ℤ[i] (after rescaling) correspond to integers that can be written as sums of two squares in many ways — which is the divisor function calculation from §2, viewed from the algebraic side. That control is precisely what gives Erdős's exponent.
To beat Erdős, you need a different additive group — one whose intersection with the unit circle is somehow richer than the Gaussian integers allow. The whole 125-page document is, at its core, a search for such a group.
§4Why the Easy Constructions Fail
A large part of the chain of thought is a parade of plausible-looking attempts that don't work. This is not padding. Each failure is informative: it tells you what the missing ingredient cannot be.
Subset sums of unit vectors. Pick k unit vectors in the plane. Form every possible sum of subsets of those vectors. You get 2k points (one for each subset). Two such points lie at unit distance precisely when their subsets differ by including or excluding a single vector. So you get roughly k · 2k−1 unit distances, which works out to about n log n total. That sounds like a lot, but compared to what Erdős already gives us, it's far too few. Independent directions are too "expensive" — each one only adds a logarithmic amount of structure.
Roots of unity. These are the m-th roots of 1 — complex numbers z satisfying zm = 1, equally spaced around the unit circle like the corners of a regular m-gon. There are m of them, all at modulus one. So they look like a promising supply. But the additive structure they generate (sums and differences of integer multiples) is smaller than you'd think — it has dimension roughly m/(log log m) when measured properly. The savings are real but only logarithmic — not enough.
Rational points on the unit circle. These are complex numbers a/c + (b/c)i where a, b, c are integers and a² + b² = c². They look interesting (Pythagorean triples!), but they all turn out to be variants of the same family generated by (3+4i)/5. Clearing denominators puts you right back in the Gaussian integers.
Algebraic numbers of modulus one. An algebraic number is a number that's a root of some polynomial with integer coefficients — like √2 (a root of x² − 2), or i (a root of x² + 1). One can study algebraic numbers that happen to sit on the unit circle. But their complexity (measured by how big the coefficients of the relevant polynomial must be) grows exponentially as you take powers and combinations. You get only logarithmically many usable directions in a polynomial-sized region — again too few.
The recurring lesson: you need many unit-modulus elements all living in a low-complexity additive structure, and arithmetic is unusually careful about rationing exactly that combination.
§5The Pivot: A Loophole in the Algebra
The genuine conceptual move of the argument lives here, and it requires a few definitions.
A number field is what you get when you take the rational numbers ℚ and adjoin some algebraic number. Adjoining i gives ℚ(i), the Gaussian rationals. Adjoining √2 gives ℚ(√2). Adjoining both gives ℚ(√2, i), a more complicated number field. These are like extended number systems, sitting between the rationals and the complex numbers. Each has its own arithmetic, its own notion of "integer" (analogous to the integers inside the rationals), and its own structural quirks.
Each number field comes with a notion of embeddings into the complex numbers — different ways the field can be realized concretely as complex numbers. ℚ(i) has two embeddings: one sending i to +i, one sending it to −i. These are related by complex conjugation. A totally real number field is one all of whose embeddings land inside the real numbers (no i's anywhere); ℚ(√2) is the simplest example. A totally imaginary field is one where every embedding produces genuinely complex (non-real) numbers.
A CM field (the name comes from "complex multiplication" in classical algebraic geometry) is a totally imaginary number field that is built as a "doubling" of a totally real one — specifically, a quadratic extension of a totally real field, meaning you've adjoined the square root of some negative number. ℚ(i) is the smallest example: the totally real field is just ℚ itself, and you've doubled it by adjoining √(−1) = i. There are infinitely many CM fields, of arbitrarily large dimension, and they have a remarkable structural feature:
In a CM field K, complex conjugation — which we'll call c — is an honest algebraic automorphism, meaning it's a symmetry of the field's arithmetic that respects addition and multiplication. (Compare: in the ordinary complex numbers, complex conjugation is also algebraically nice, but in a generic field it might not be definable purely algebraically.) For any nonzero element α of K, the ratio
α / c(α)
has modulus exactly one at every complex embedding of K. The reason: at any embedding, the numerator and denominator are complex conjugates of each other, and a complex number divided by its conjugate always has modulus one. (Quick check: (3+4i)/(3−4i), if you multiply it out, has |numerator| = |denominator| = 5, so the ratio sits on the unit circle.) Their absolute values cancel structurally, not by accident.
This is the loophole. There's a famous result called Kronecker's theorem that says: if an algebraic integer (a number that's a root of a polynomial with integer coefficients and leading coefficient 1, like √2 or (1+√5)/2 but not 1/2) has all of its complex conjugates of modulus one, then it must be a root of unity — that is, a corner of some regular polygon inscribed in the unit circle. This is a severe constraint: it says you can't manufacture exotic modulus-one things by integer-algebra alone.
But α/c(α) is generally not an integer. The numerator and denominator are integers, but the ratio is a fraction — what number theorists call an S-unit, meaning it has controlled denominators only at certain specified primes. It evades Kronecker by a hair's breadth, and that hair is the entire construction.
Concretely in the Gaussian integers: the element (2+i)/(2−i) has both its complex conjugates on the unit circle, and is not a root of unity. Its denominator has norm five (where the norm of a Gaussian integer a + bi is just a² + b², i.e., its modulus squared — a notion of "size" that respects multiplication). That single tiny example, scaled up to towers of arbitrarily large CM fields, is the engine of the disproof.
§6The Construction in Three Movements
With CM fields as the source of unit-modulus elements, the argument has three parts to execute. None is individually unprecedented; their combination is what is striking.
Movement One: Sign vectors
Pick a CM field K and a collection of rational primes (ordinary prime numbers 2, 3, 5, 7, 11, …) that split in K. Splitting is a phenomenon from algebraic number theory: when you go from the ordinary integers to a number field's integers, the prime numbers can behave differently. Some stay "inert" (still prime), some "ramify" (become a perfect power), and some "split" into a product of two or more new primes. In a CM field, the splitting primes break into pairs of complex-conjugate prime ideals — pairs swapped by the conjugation c we met above.
With t rational primes and the field having dimension d (its "degree"), you get t · d such conjugate pairs. For each pair, you can choose which of the two primes in the pair to include in a product. That gives 2t·d distinct ideal products — call them "sign ideals," because each is specified by a string of t·d binary choices. This is where the exponential supply of candidate directions comes from.
Movement Two: Pigeonhole into the class group
Not all of those 2t·d sign ideals are immediately useful. We want to apply the Hilbert 90 trick (α/c(α)), and that trick requires a principal ideal — one that's generated by a single element α. In ordinary integers, every ideal is principal: the ideal of "multiples of 6" is just 6 times everything. But in general number fields, not every ideal can be expressed as the multiples of a single element. The obstruction is captured by an invariant called the ideal class group of the field. Its size — the class number — measures how badly principality fails. The Gaussian integers have class number 1 (everything is principal). More exotic number fields can have class numbers in the thousands.
For CM fields built carefully, the class number is bounded — at most Hd for some H depending on the field's discriminant (a kind of complexity measure for a number field). By the pigeonhole principle, since there are 2t·d sign ideals and only Hd ideal classes, at least
of the sign ideals fall into the same ideal class. Within that class, ratios of pairs are principal, the Hilbert 90 trick fires, and we get unit-modulus elements in bulk.
t · log 2 vs. log H
The left side is the entropy from sign-vector freedom — how much "information" you have in choosing which prime from each conjugate pair. The right side is the cost paid to the class group. If sign-vector entropy beats the class-group cost, you win. If not, the construction collapses back into Erdős-scale, and you've gained nothing.
Movement Three: Class field towers
To win the entropy fight you need very specific number fields: large dimension d, controlled discriminant (so H stays small), and many completely split primes. These don't grow on trees — but a celebrated 1964 theorem of Golod and Shafarevich constructs them. A class field tower is an infinite sequence of larger and larger number fields, each unramified over the previous (no prime acquires multiplicity going up), with chosen primes forced to split completely at every level. The condition for the tower to be infinite is a relatively simple inequality between two invariants of the base field, and it can be met by starting with a base field whose class group has large rank — meaning, intuitively, a base field that is already "complicated" in a measurable sense.
With dimension growing while H stays bounded, and with as many split primes as desired, the entropy inequality t · log 2 > log H becomes achievable. The supply of unit-modulus elements per dimension becomes truly exponential.
The remaining work is geometric harvesting: embed the field in higher-dimensional complex space (each embedding gives one coordinate), take a window — a bounded region — average over translations of the lattice through that window, project everything back down to the plane via just one of the embeddings, and count what survives. Each step is standard machinery from the "cut-and-project" technique used in the theory of quasicrystals. The point set you get is, after due bookkeeping,
for some fixed δ > 0 — possibly microscopic, but fixed, while the family ranges to arbitrarily large sizes. Since δ is a fixed positive number and 1/log log n shrinks to zero as n grows, eventually δ exceeds C/log log n for any prescribed C. The Erdős upper-bound conjecture fails.
§7Where the Novelty Lives
None of the individual ingredients is new. CM fields, Hilbert 90, class field towers, the Golod–Shafarevich inequality, sign-ideal pigeonhole, cut-and-project — every one of these is well-established mathematics, some of it nearly a century old. What is novel is the combination: the recognition that
- the right algebraic source of unit-modulus elements is not the field's units (Dirichlet's theorem rations those too tightly) but Hilbert 90 ratios — fractions that evade Kronecker;
- the right counting framework is an entropy comparison between sign-vector freedom and class-group cost;
- the right existence theorem is a Golod–Shafarevich tower with prescribed prime splitting — and one specific group-theoretic trick (a "Frattini-subgroup" argument) keeps the dimension growing while imposing the splitting conditions;
- the right class-number bound is an elementary divisor-function estimate, side-stepping a substantial analytic-number-theory thicket that the natural approach would have run into.
Each of these recognitions individually is the kind of move a working number theorist might make on a good day. The striking thing about the document is that an automated system produced all of them together, with the wrong attempts visibly tried first, and the right combination assembled out of a long chain of failed candidates. The 125-page transcript is structured exactly as a working mathematician's notebook would be — including the moments where it backtracks, switches conventions (it tries a construction over the prime 2, finds technical noise, then redoes it over the prime 3 to clean things up), and stress-tests its own steps for hidden errors.
§8A Note on the Status
This summary should not be read as endorsement of the argument's correctness. The load-bearing parts — the Golod–Shafarevich tower with prescribed splitting, the uniformity of class-number bounds along the tower, the precise version of the entropy inequality — are exactly the steps that a referee would scrutinize hardest. The document itself returns to them repeatedly, which is both reassuring (the author noticed they were delicate) and a flag (they are not routine).
But as a piece of mathematical reasoning, the trajectory from "many naïve attempts fail" to "CM ratios plus class field towers plus sign entropy" is the kind of synthesis that, if it survives expert review, would resolve an eighty-year-old problem. And whatever the final verdict on this particular argument, the fact that an AI system can hold together a 125-page chain of mathematical reasoning of this character is, by itself, a piece of news.