Openai/696202aa-e614-800c-a993-2d8abc734eb7
This conversation was exported from ChatGPT.[1]
gpt-5-2-thinking
User: Please, rewrite this historical paragraph about a mathematical finding process more slowly, in form of a true short sto…
Please rewrite this historical paragraph (about a mathematical finding process) more slowly, in the form of a true short story, preferably in simple English. Ask me if you are unsure about the meaning of something.
The story of this proof
While the mathematical and literary content of the writeup is already complete, this proof has the significance of being the first recognized Erdős problem (a problem proposed by Paul Erdős and collected on the Erdős problems website https://www.erdosproblems.com) that was solved autonomously by an AI system, with no prior literature found as of the date of this writing. As this is a milestone, it is of interest to ask about the nature of the proof (e.g. how complex/novel it is compared to other mathematical results). Unfortunately (or fortunately) the resulting proof is in formalized form in Lean. While this boosts confidence in correctness, it may not be readily digestible. This writeup attempts to address this issue. The author has received substantial assistance from ChatGPT in writing this manuscript. Indeed, a large fraction of the words were penned by ChatGPT. However, the author does not take this to mean we compromise on correctness; on the contrary, everything has been manually checked up to roughly the level that the author himself would be confident in were he to write it himself and send it off as a journal submission! Of course, mistakes are still possible and any correction is welcome.
It is of interest to record the story of this proof as follows. The “raw” story may be read in the Erdős problems website thread: https://www.erdosproblems.com/728. On Jan 4, 2026, Kevin Barreto announced that he had a proof from the AI system Aristotle by Harmonic. This is a Lean system, and the input to Aristotle was based on an informal argument from GPT-5.2 Pro. However, there is one nuance here: at that time, the problem as stated on the website was vague, with the intended interpretation from Erdős unclear. Barreto’s version resolved one version of the problem, but it wasn’t clear whether this was the intended version. Afterwards, forum participants commented on this vagueness; a consensus then emerged that this should be regarded as a partial result, which did not yet fully resolve the problem. But on Jan 5, 2026, Barreto again asked GPT-5.2 Pro whether its argument could be upgraded to tackle the version that forum participants had now identified as the problem. GPT-5.2 Pro responded in the affirmative; Aristotle was then run based on this latest response and produced a formal proof on Jan 6, 2026. There was a misunderstanding around this time that this second output had been based on a human’s mathematical observation, which would render the result not autonomous by the AIs involved. Nevertheless, Barreto has since clarified that this was a misunderstanding and that a human observation was not provided in order to reach this result. With autonomy confirmed, the problem was now understood to be fully resolved by AI. In particular, Terence Tao vouched for this autonomous status, which helped cement the consensus. Afterwards, a forum participant who goes by KoishiChan attempted a literature review to locate any prior literature resolving this problem. KoishiChan has been highly successful in locating prior literature in the past for problems claimed to be solved by AI; however, this time around no existing literature resolving this problem was found. It is of course possible that such literature exists, as it is impossible to disprove the possibility, but we have not found such literature as of the date of this writing. Thus, the current status is that Erdős Problem 728 was fully resolved autonomously by AI, with no prior human literature resolving the problem found.
On Jan 6, 2026, Boris Alexeev ran Aristotle again on the Aristotle proof in order to simplify it, producing a new proof. This is the Lean proof in [1] on which the present writeup is based. The author then worked with ChatGPT in order to extract a human-readable proof from [1] to make it accessible to a wider audience. Later, the presentation was iteratively improved and relevant literature was included in order to make it a fuller presentation with maximum value to the mathematical community; this is the present writeup. For reference, the conversation with ChatGPT that the author collaborated with in producing this writeup can be accessed here1. Note that while there are many drafts there, none of them is the current writeup, in which the author further applies significant rewrites in various places by himself.
Assistant:
A short true story: how the proof came to be
When the author began writing this manuscript, the mathematics was already done. The ideas were there, the statements were clear, and the argument existed as a real proof. But there was a special reason to tell the story behind it.
This proof mattered for another reason: it seemed to be the first time an Erdős problem—a problem suggested by Paul Erdős and listed on the Erdos Problems website—had been solved autonomously by an AI system, with no earlier human solution found in the literature (at least up to the date this was written). That felt like a milestone. And milestones make people curious. They ask: What kind of proof is this? Is it deep? Is it new? Is it complicated?
There was one problem, though.
The proof was written in Lean, a formal proof assistant. Lean proofs are very strict and very reliable—but they are not easy to read. They are like a machine-checked blueprint: correct, but not friendly. So the author decided to do something else: to take that formal proof and turn it into something a human mathematician could read without pain.
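To get a feel for what that means, here is a toy Lean example. It is not taken from the actual proof of Problem 728; it only shows the style: even a trivial statement and its proof are written as terse code that the machine checks.

-- A toy Lean 4 example, unrelated to the actual proof of Problem 728:
-- the statement "every natural number n is at most n + 1",
-- stated and proved as code that Lean verifies mechanically.
theorem toy_example (n : Nat) : n ≤ n + 1 := by
  exact Nat.le_succ n   -- library lemma: n ≤ n.succ, and n.succ is n + 1

A real research proof looks like this, only far longer and denser, which is exactly why a human-readable account was worth writing.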
To do that, the author worked with ChatGPT. In fact, ChatGPT wrote many of the sentences in the manuscript. Still, the author did not treat the text as “automatic truth.” He checked everything carefully—roughly as carefully as he would check his own writing before sending it to a journal. He also admitted something honest: even with careful checking, mistakes can still happen, and readers are invited to point them out.
Then the author looked back and tried to record the story of how the proof appeared in the first place.
The “raw” story, with all the forum posts and real-time reactions, can be read in a public thread on the Erdos Problems website (the thread for Problem 728).
On January 4, 2026, a person named Kevin Barreto posted an announcement. He said he had a proof produced by an AI system called Aristotle, made by Harmonic. Aristotle works in Lean, so what it produced was a formal Lean proof. Barreto also explained something important about the input: Aristotle did not invent the whole thing from nothing. It was guided by an informal argument produced by GPT-5.2 Pro, and then Aristotle was asked to formalize that argument in Lean.
At first, this looked like a clean success. But soon another issue appeared, and it was not a technical issue—it was a meaning issue.
The statement of the problem on the website was vague. People were not sure what Erdős had really intended. Barreto’s proof solved one version of the problem, but it was unclear whether that was the version everyone actually cared about.
So the forum began to discuss the wording. Participants pointed out the ambiguity again and again. Slowly, a rough agreement formed: the proof should be seen as a partial result. It was real progress, but it might not fully settle the intended question.
Then Barreto tried again.
On January 5, 2026, after the forum had identified what they now believed was the intended version of the problem, Barreto went back to GPT-5.2 Pro and asked a direct question: could the argument be strengthened to handle this clearer, stronger interpretation?
GPT-5.2 Pro answered: yes.
So Barreto used that new response as the basis for another run of Aristotle. And on January 6, 2026, Aristotle produced a formal Lean proof for the stronger version—the version the forum now treated as the real target.
Around this time, a misunderstanding spread. Some people believed that the second successful proof depended on a key mathematical observation made by a human. If that had been true, the achievement would no longer count as fully autonomous AI problem solving. It would be AI plus a crucial human hint.
But Barreto later clarified that this was not what happened. According to his clarification, no human mathematical observation was supplied in order to reach that result. The AI systems did the work from the problem and the prompts alone. With that, the community’s view shifted again: the problem now seemed to be fully solved by AI, autonomously.
One important detail helped settle this consensus. Terence Tao publicly supported the claim that the result was autonomous, and that endorsement made the community more confident about how to describe what had happened.
After that, someone in the forum—using the name KoishiChan—tried to do something very difficult: a literature search. The goal was to find whether any earlier paper, note, or obscure source had already solved this problem long ago. KoishiChan had done this successfully in the past, finding prior literature for other problems that people first thought were newly solved by AI.
But this time, KoishiChan did not find an earlier solution.
Of course, not finding something does not prove it does not exist. Mathematics is old, and the literature is huge. Still, as of the time of writing, no prior human literature resolving the problem had been located. So the status became: Erdős Problem 728 appears to have been fully resolved autonomously by AI, with no earlier human solution found.
The story did not end there.
On January 6, 2026, Boris Alexeev ran Aristotle again, this time focusing on simplification. The goal was to take the existing Aristotle proof and produce a cleaner one. That run produced a new, simpler Lean proof—the one cited as [1] and used as the foundation for the present writeup.
Now the author had a formal proof, but he still had the original problem: Lean is not easy to read.
So he returned to ChatGPT and worked step by step to extract a human-readable proof from the Lean file. The manuscript was revised again and again. The explanations became clearer. Relevant literature was added. The presentation was improved to make it as useful as possible to mathematicians who wanted to understand the argument, not just trust that Lean checked it.
That final product—the readable proof and the fuller discussion—is what this writeup aims to be.
The author also notes that the collaboration conversation with ChatGPT is publicly accessible. It contains many drafts, many attempts, and many intermediate versions. But it is not identical to the final text, because the author later made significant rewrites on his own.
And that is the story: a proof born inside a formal system, tested by a community, strengthened when the meaning of the problem became clearer, checked for prior literature, simplified, and then translated—carefully—into ordinary mathematical language.
Quick questions (only where I’m genuinely unsure)
- When you say “autonomously,” do you want the story to mean: no human supplied any key mathematical idea, but humans did choose prompts / run tools / decide which version of the problem to ask about?
- The line “the conversation … can be accessed here1” looks like a missing footnote or link. Do you want me to keep it as “here (see footnote 1)” or remove it from the story version?
References
- ↑ "AI Solves Erdos Problem". ChatGPT. Retrieved 2026-01-11.