A New Era in Mathematics

Math Will Never Be the Same Again… | Yang-Hui He

Summary

In this engaging discussion with Yang-Hui He, we delve into the transformative potential of AI in the realm of mathematics. Yang-Hui shares his insights on how AI isn’t just a tool for quicker calculations but a revolutionary force that's reshaping how we conceptualize and solve complex mathematical problems. By looking at mathematics through the lens of AI, Yang-Hui suggests new approaches to enduring challenges like the Birch and Swinnerton-Dyer conjecture, while emphasizing how AI can bridge the abstract world of algebraic geometry and practical image processing techniques.

Highlights

AI is changing math by offering new insights, not just faster calculations 🧠.
Yang-Hui discovered AI's potential in math during quiet nights with his son 👶.
Calabi-Yau manifolds are reimagined as image problems, showing AI's prowess 📸.
The Birch Test challenges AI to create new mathematical theories 🌟.
AI highlights connections in math akin to leaps made by Gauss 🔍.

Key Takeaways

AI is not just speeding up math but reshaping how we understand it 🧠.
Yang-Hui He stumbled upon AI-assisted math during sleepless nights with his newborn 👶.
Calabi-Yau manifolds modeled as images showcase AI's potential in math 📸.
The Birch Test could be the next step for AI in generating novel mathematical ideas 🌟.
AI's role mirrors intuitive insights of great mathematicians like Gauss 🔍.

Overview

In this episode of Theories of Everything, we explore the intersection of artificial intelligence and mathematics with Yang-Hui He, an expert advocating for AI's transformative role beyond just speeding up calculations. Yang-Hui shares his journey into AI-assisted mathematics, which began during late nights with his newborn. This unexpected inspiration led him to explore how AI could revolutionize the way we approach and solve mathematical problems.

Yang-Hui illustrates how viewing complex algebraic structures, such as Calabi-Yau manifolds, as image recognition tasks has allowed AI to uncover new patterns and insights. This novel approach has even ventured into profound mathematical challenges, like the Birch and Swinnerton-Dyer conjecture, demonstrating AI’s capability to facilitate significant breakthroughs in mathematics.

Furthermore, the conversation delves into the notion of the Birch Test, a hypothetical scenario envisioning AI's capacity to independently generate meaningful new mathematical concepts and conjectures. This evolution in mathematics sees AI not just as a computational tool but as a partner capable of fostering groundbreaking advancements akin to historical mathematical insights.

Chapters

00:00 - 02:30: Introduction and Initial Discussion This chapter introduces Yang-Hui He, the guest on the podcast, as the host expresses admiration for his work and lectures. Yang-Hui reciprocates by acknowledging his admiration for the host, especially highlighting the interviews with prominent figures like Roger Penrose and Edward Frenkel. The atmosphere is one of mutual respect and appreciation as the conversation begins.
02:30 - 10:00: AI and Mathematics The chapter titled 'AI and Mathematics' delves into the intricate relationship between artificial intelligence, machine learning, and mathematics. It explores three levels of understanding mathematics in this context: bottom-up, top-down, and meta perspectives. Before diving into these concepts, the chapter hints at a discussion about the specific mathematical and physics disciplines that initially captivated the speaker's interest, as well as an exploration of their collaboration with Roger. The speaker highlights their expertise in mathematical physics, particularly focusing on the intersection with algebraic geometry.
10:00 - 20:00: Interplay of Physics and Mathematics The chapter discusses the speaker's background in string theory and his collaboration with C.N. Yang, a notable figure in physics known for Yang-Mills theory. Yang is remarkable not only for his scientific contributions but also for being one of the world's oldest living Nobel laureates, having received the Nobel Prize in 1957. The discussion highlights the intertwined nature of physics and mathematics, particularly through influential figures like Yang.
20:00 - 22:30: Future of AI in Mathematical Discovery This chapter delves into the collaborative relationship between the speaker and notable physicist Roger Penrose, initiated through a joint editorial project on a book related to topology in physics. The discussion highlights the importance of collaboration and personal relationships in advancing mathematical and scientific discovery, particularly in the context of future developments in AI. Insights into the editing process for academic works and the potential of AI to transform complex disciplines like topology and physics are explored.

Math Will Never Be the Same Again… | Yang-Hui He Transcription

00:00 - 00:30 Yang-Hui He, hey, welcome to the podcast. I'm so excited to speak with you. You have an energetic humility and your expertise and your passion comes across whenever I watch any of your lectures, so it's an honor. It's a great pleasure and great honor to be here. In fact, I'm a great admirer of yours. You've interviewed several of my very distinguished colleagues like, you know, Roger Penrose and Edward Frenkel. I actually watched some of them. It's actually really nice. Wonderful, wonderful. Well, that's humbling to hear. So firstly,
00:30 - 01:00 people should know that we're going to talk about or you're going to give a presentation on AI and machine learning mathematics and the relationship between them, as well as the three different levels of what math is in terms of production and understanding, bottom-up, top-down, and then the meta. But prior to that, what specific math and physics disciplines initially sparked your interest? And how did the collaboration with Roger come about? So my, you know, my bread and butter was mathematical physics, especially, you know, sort of the interface between algebraic geometry
01:00 - 01:30 and string theory. So that's my background, what I did my PhD on. And so at some point, I was editing a book with C.N. Yang, who is an absolute legend. You know, he's 102, he's still alive, and he's the world's oldest living Nobel laureate. You know, Penrose is a mere 93 or something. So C.N. Yang of the Yang-Mills theory, so he's an absolute legend. He got the Nobel Prize in 1957. So at some
01:30 - 02:00 point, I got involved in editing a book with C.N. called Topology in Physics. And, you know, with a name like that, you can just invite anybody you want, and they'll probably say yes. And that was my initial friendship with Roger Penrose started through working together on that editorial. I mean, I have Roger as a colleague in Oxford, and I've known him on and off for a number of years. But that's when we really started getting, working together. So when Roger snickers at
02:00 - 02:30 string theorists, what do you say? What do you do? How does that make you feel? Oh, that's totally fine. I mean, I'm not a diehard string theorist. And, you know, I'm just generally interested in the interface between mathematics and physics. And, you know, Roger's totally chill with that. So you just happened to study the mathematics that would be interesting to string theorists, though you're not one. Exactly, and vice versa. I just completely chanced on this. It was kind
02:30 - 03:00 of interesting, you know, I was recently giving a talk, a public lecture in Dublin, about the interactions between physics and mathematics. And I still find that just, you know, string theory is still very much a field that gives the best cross-disciplinary, you know, kind of feedback. I've been doing that for decades. It's a fun thing. You know, I talked to my friends in pure mathematics, especially in algebraic geometry, 100% of them are convinced that string theory
03:00 - 03:30 is correct. Because for them, it's inconceivable for a physics theory to give so much interest in mathematics. Interesting. And that's kind of a, I think that's a story that hasn't been told so much, you know, in media. You know, if you talk to a physicist, they're like, you know, string theory doesn't predict anything, this and the other thing. But there's a big chapter of string theory, you know, to me, more than 50% of the story, backstory of string theory, is just constantly
03:30 - 04:00 giving new ideas in mathematics. And, you know, historically, when a physical theory does that, it's very unlikely for it to be completely wrong. Yeah, you watch the podcast with Edward Frenkel, and he takes the opposite view, although he initially took the former view that, okay, string theory must be on the correct track because of the positive externalities. It's like the opposite of fossil fuels. It doesn't give you what you want for your field, like physics, but it gives you what you want for other fields, as a serendipitous outgrowth. But then he's no longer
04:00 - 04:30 convinced after being at a string conference. So you still feel like the pure mathematicians that you interact with see string theory as on the correct track as a physical theory, not just as a mathematical theory. Yeah, so he, yeah, absolutely. He does make a good point. And so, I think, you know, Frenkel and algebraic geometers like Richard Thomas and various people, they appreciate what string theory is constantly doing in terms of mathematics.
04:30 - 05:00 And the challenge is whether it is a theory of physics based on the fact that it's giving so much mathematics. I guess, you know, you've got to be a mystic. Some of them are mystics. Some of us are mystics. And I actually, I don't, I don't personally have an opinion on that. I just, you know, some days when I'm like, well, you know, it's, this is such a cool mathematical structure, and there's so much internal consistency. It's got to be, there's got to be something there. So it's
05:00 - 05:30 just, but of course, you know, it being a science, you need the experimental evidence, you know, you need to go through the scientific process. And that I have absolutely no idea. It could take years and decades. Wouldn't you also have to weight the field, like w-e-i-g-h-t, weight the, whatever field, like the subdiscipline of string theory, with how much IQ power has been poured into it, how much raw talent has been poured into it, versus others. So you would imagine that if it was the big daddy field, which it happens to be, that it should produce more and more insights.
05:30 - 06:00 And it's unclear to me, at least, if that this much time and effort went into asymptotic safety, or loop quantum gravity, or what have you, or causal set theory, if that would produce mathematical insights of the same level of quality, we don't have a comparison. I mean, I don't know. I want to know what your thoughts are on that. I think the reason for that is just that, you know, we follow our own nose as a community. And the contending theories like, you know, loop
06:00 - 06:30 quantum gravity and stuff, you know, there are people who do it. There are communities of people who do it. And, you know, there's a reason why the top mathematicians are doing string-related stuff, is because, you know, you follow the right nose. You feel like it is actually giving the right mathematics. Things like, you know, mirror symmetry, you know, or vertex algebras, that's kind of giving the right ideas constantly. And it's been doing this since the very beginning. So, and people do, you know, the other alternative, you know, the alternative theories of everything.
06:30 - 07:00 But so far, it hasn't produced new math. You can certainly prove us wrong. But I think, you know, I follow, you know, there's a reason why Witten is the one who gets the Fields Medal. Because it's just somehow is at the right interface of the right ideas in geometry, number theory, representation theory, algebra, that this idea tends to produce the right, you know, the right mathematics. Whether it is a theory of physics, that's still, you know, that's the next mystical
07:00 - 07:30 level. But, you know, it's kind of, it's an exciting time, actually. Witten didn't get the Fields Medal for string theory, though. It was his work on the Jones polynomial, and Chern-Simons theory, and Morse theory with supersymmetry, and topological quantum field theory, but not specifically string theory. That's right. That's right. But he certainly is a champion for string theory. And for him, I mean, you know, that idea, he was able to do, you know, the Morse
07:30 - 08:00 theory stuff, he was able to get because of his work on supersymmetry. He was able to realize this was a supersymmetric index theorem that generated this idea. And that's really, supersymmetry really is a cornerstone for string theory, even though there's no experimental evidence for it.
08:00 - 08:30 So I think that's one of the reasons that's guiding him towards this direction. So what's cool is that just prior, the podcast that I filmed just prior to yours was with Peter Woit, who, as you know, is a critic of string theory, and Joseph Conlon, who is a defender of string theory, and he has a book even called Why String Theory. That's right. I think it was the first time that publicly, someone like Peter Woit, along with a defender of string theory, were just on a podcast of this length, speaking about in a technical manner, what are both of their likes and dislikes of string theory, and then the string
08:30 - 09:00 community. There's three issues, string theory as a physical theory, string theory as a tool for mathematical insight, and then three, string theory as a sociological phenomenon of overhype, and does it see itself as the only game in town? Is there arrogance? Should there be arrogance? It was an interesting conversation. Yeah. Well, Joe is a good friend of mine, Joe Conlon. Yeah, right, right. In Oxford. And yeah, no, I value his comments greatly. I've always been kind of,
09:00 - 09:30 you know, for me, I've always been kind of like slightly orthogonal to the main string theory community. I'm just happy because it's constantly giving me good problems to work on. Yes. Including what I'm about to talk about in AI. Wonderful. And I'll mention a little bit about it because I got into this precisely because I had a huge database of Calabi-Yau manifolds, and I wouldn't have done that without the string community. It was, again, one of those accidents that, you know, no other, you know, the other theoretical physicist didn't happen to have this, didn't happen
09:30 - 10:00 to be thinking about this problem. There's this proliferation of Calabi-Yau manifolds, and I'll mention that a bit in my lecture later, and why this is such an interesting problem, why Calabi-Yau-ness is interesting inherently, regardless whether you're a string theorist. And that kind of launched me in this direction of AI-assisted mathematical discovery. So this is kind of really nice. And I think, I mean, for me, the most exciting thing about this whole community
10:00 - 10:30 is that, you know, science, and especially, you know, theoretical science, well, not especially, science, including theoretical science, has become so compartmentalized, right? You know, everyone is doing their tiny little bit of thing. And string theory has been breaking that mold for the last decades. It's constantly, oh, let's take a piece of algebraic geometry, let's take a bit of number theory here, elliptic curves, let's take a bit of quantum information,
10:30 - 11:00 entanglement, whatever, entropy, black holes. And it's the only field that I know that different expertise are talking to each other. I mean, this doesn't happen in any other field that I know of in sort of mathematics, theoretical physics. And that just gets me excited. And that's what I really like thinking about. Well, let's hear more about what you like thinking about and what you're enthusiastic about these days. Let's get to the presentation. Sure. Well, thank you very much for
11:00 - 11:30 having me here. And I'm going to talk about work I've been thinking about, stuff I've been thinking about for the last seven years, which is how AI can help us do mathematical discovery, you know, in theoretical physics and pure mathematics. I recently wrote this review for Nature, which is trying to summarize a lot of these ideas that I've been thinking about. And there's an earlier review that I wrote in 2021 about, you know, how machine learning can help us with
11:30 - 12:00 understanding mathematics. So let me just take it away and think about, oh, by the way, please feel free to interrupt me. I know this is one of those lectures. I always like to make my lectures interactive. So please, if you have any questions, just interrupt me anytime. And I'll just pretend there's a big audience out there and I'll just make it. So firstly, you're likely going to get to this, but what's the definition of mathematics? OK, great. So roughly, I mean, of course, you
12:00 - 12:30 know, how does one, so the first question is, how does one actually do mathematics? Right. And so one can think about, of course, in these reviews, I try to divide it into sort of three directions. Of course, these three directions are interlaced and it's very hard to pull them apart. But roughly, you can think about, you know, bottom-up mathematics, which is, you know, mathematics is a formal logical system, you know, definition and, you know, lemma proof and, you know, theorem proof.
12:30 - 13:00 And that's certainly how mathematics is presented in, you know, papers. And there's another one, which I like to call top-down mathematics, is where, you know, where the practitioner looks from above. That's why I say top-down, from like a bird's eye view. You see different ideas and subfields of mathematics. And you try to do this as a sort of an intuitive creative art. You know, you've got some experience and then you're trying to see, oh, well, maybe I can take a little bit of
13:00 - 13:30 piece from here and a piece from there and I'm trying to create a new idea or maybe a method of proof or attack or derivation. Yes. So these are these two. So that's, you know, complementary directions of research. And the third one, meta, that's just because it was short of any other creative words, because there's, you know, words like meta-science and meta-philosophy or meta-physics. I'm just thinking about mathematics as purely as a language, you know, whether the
13:30 - 14:00 person understands what's going on underneath. So meta is of secondary importance. So it's kind of like ChatGPT, if you wish, you know, can you do mathematics purely by symbol processing? So that's what I mean by meta. So I'm going to talk a little bit about, in this talk, about each of the three directions and focusing mostly on the second direction of top-down, which is what I've been thinking about for the last seven years or so. Hmm. Okay. I don't know if you know of this
14:00 - 14:30 experiment called the Chinese room experiment. Yeah. Okay. So in that, the person in the center who doesn't actually understand Chinese, but is just symbol pushing or pattern matching, I don't know if it's exactly pattern, rule following, that would be the better way of saying it. Yeah. They would be an example of bottom-up or meta in this? So I would say that's meta. As you know, on Theories of Everything, we delve into some of the most reality spiraling concepts from theoretical physics and consciousness to AI and emerging technologies. To stay informed in an
14:30 - 15:00 ever-evolving landscape, I see The Economist as a wellspring of insightful analysis and in-depth reporting on the various topics we explore here and beyond. The Economist's commitment to rigorous journalism means you get a clear picture of the world's most significant developments, whether it's in scientific innovation or the shifting tectonic plates of global politics. The Economist provides comprehensive coverage that goes beyond the headlines. What sets The Economist apart is
15:00 - 15:30 their ability to make complex issues accessible and engaging, much like we strive to do in this podcast. If you're passionate about expanding your knowledge and gaining a deeper understanding of the forces that shape our world, then I highly recommend subscribing to The Economist. It's an investment into intellectual growth, one that you won't regret. As a listener of TOE, you get a special 20% off discount. Now you can enjoy The Economist and all it has to offer for less.
15:30 - 16:00 Head over to their website, www.economist.com to get started. Thanks for tuning in, and now back to our explorations of the mysteries of the universe. So I would say that's meta, in the sense that the person doesn't even have to be a mathematician. You're just simply taking symbols, large language modeling for math, if you wish. Got it. Of course, you know, there's a bit of
16:00 - 16:30 component of others, you know, that you can see there's a little bit of component of bottom-up, because you are taking mathematics as, you know, a sequence of symbols. But I would mainly call that meta, if that's okay. I mean, these definitions are just, you know, things that I'm using. Yes, yes. But in any case, I would talk mostly about this bit, which is what I've been thinking mostly about. One thing, just to set the scene, you know, 20th century, of course, you know, computers have
16:30 - 17:00 been playing an increasingly important role in mathematical discovery. And of course, you know, it speeds up computation, all that stuff goes without saying. But something that's perhaps not so emphasized and appreciated is the fact that there are actually fundamental and major results in mathematics that could no longer have been done without the help of the computer. And so this,
17:00 - 17:30 you know, there's famous examples. Even back in 1976, this is the famous Appel-Haken-Koch proof of the four-color theorem. You know, that every map, it only takes four, every map in a plane only takes four colors to completely color it with no neighbors sharing a color. And this is a problem that was posed, I think, probably by Euler, right? And this was finally settled by reducing this whole topology problem to thousands of cases, and then they ran it through a computer and checked it case by case. So, and then other major things like, you know, the Kepler conjecture, which is, you know,
17:30 - 18:00 that stacking balls, identical balls. The best way to stack it is what you see in the supermarket, you know, in this hexagonal thing. And this was a conjecture by Kepler, but to prove that this is actually the best way to do it was settled in 1998, again, by a huge computer check. And the full acceptance by the math community, it was only as late as 2017, when proof copilots actually went through Hales's construction and then made this into a proof. Yes.
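An illustrative aside on the number behind the supermarket stacking described above: that arrangement corresponds to the face-centred cubic packing, whose density is π/(3√2), roughly 74% of space. The Python sketch below only computes this value from a single unit cell; the function name is hypothetical and the code has nothing to do with the computer-assisted proof itself.

```python
import math

def fcc_packing_density() -> float:
    """Fraction of space filled by equal spheres in a face-centred cubic packing."""
    a = 1.0                      # side length of the cubic unit cell
    r = a / (2 * math.sqrt(2))   # spheres touch along a face diagonal: 4r = a * sqrt(2)
    spheres_per_cell = 4         # 8 corners * 1/8 + 6 faces * 1/2
    sphere_volume = (4 / 3) * math.pi * r ** 3
    return spheres_per_cell * sphere_volume / a ** 3

density = fcc_packing_density()
print(f"{density:.4f}")  # about 0.7405, i.e. pi / (3 * sqrt(2))
```

The Kepler conjecture is the statement that no arrangement of identical spheres in three dimensions does better than this value.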
18:00 - 18:30 But wasn't there a recent breakthrough in the generalized Kepler conjecture? Absolutely. So this is what Maryna Viazovska got the Fields Medal for. So the Kepler conjecture is in three dimensions, you know, our world. Viazovska showed in dimensions 8, 16, and 24 what the best possible packing are. And she gave a beautiful proof of that fact. And to my knowledge, I don't
18:30 - 19:00 think she actually used the computer. There's some optimization method. Actually, what I'm referring to is that there are some researchers who generalize this for any n, not just 8, not just 24, who used methods in graph theory of selecting edges to maximize packing density to solve a sphere packing problem probabilistically for any n. Though I don't believe they used machine learning. Well, thanks for telling me. I've got to check that. That's interesting. This
19:00 - 19:30 was actually a really interesting one. I mean, that's something that's closer to me, which is the classification of finite simple groups. So simple groups are building blocks of all finite groups. And the proof is, you know, took 200 years. And the final definitive volume was by Gorenstein 2008. And what's really interesting, the lore in the finite group theory community is that nobody's actually read the entire proof. It's just not possible. It takes longer for people to actually read the entire proof than a lifetime. So this is kind of interesting that, you know, we have
19:30 - 20:00 reached the cusp in mathematical research where the computers are not just becoming computational tools, but it's increasingly becoming an integral part of who we are. So this is just set the scene. So we're very much in this, you know, we're now in the early stages of the 21st century. And this is increasingly the case where we have this, where computers can help us or AI can help us
20:00 - 20:30 in these three different directions. Great. So let me just begin with this bottom up and sort of to summarize this. This is probably the oldest attempt in where computers can help us. So this is where I'm going to define bottom up, which is, I guess it goes back to the modern version of this is this classic paper, the classic book of Russell and Whitehead, the Principia Mathematica,
20:30 - 21:00 which is 1910s, where they try to axiomize, axiomatize mathematics, you know, from the very beginning, you know, it took like 300 pages for them to prove that one plus one is equal to two, famously. Nobody has read this. So this is this is one of those impenetrable books. But I mean, this, but this tradition goes back to, you know, Leibniz or to Euclid, even, you know, that the idea that mathematics should be axiomatized. Of course, this, this program took only about 20
21:00 - 21:30 years before it was completely killed in some sense, because of Gödel and Church and Turing's incompleteness theorems that, you know, this very idea of trying to axiomatize mathematics by constructing, you know, layer by layer is proven to be, you know, logically impossible within every order of logic. But I like to quote my very distinguished colleague, Professor Minhyong Kim. He says the practicing mathematician hardly ever worries about Gödel. Because, you know,
21:30 - 22:00 if you have to worry about whether your axioms are valid to your day to day, you know, if an algebraic geometer has to worry about this, then you're sunk, right? You get depressed about everything you do. Right? So the two parts kind of cancel out. But the reason I mentioned this is that because of the fact that these two parts cancel each other out, these two negatives cancel each other out, this idea of using computers to check proofs or to compute proofs really goes back
22:00 - 22:30 to the 1950s. Right? So despite the, you know, what Gödel and Church and Turing have proved is foundational. Even back in 1956, Newell, Simon and Shaw devised this Logic Theory Machine. I have no idea how they did it, because this is really very, very, very primitive computers. And they were actually able to prove some certain theorems of Principia by building this bottom up, you know, take these axioms and use the computer to prove. And this is becoming, you know, an entire field of
22:30 - 23:00 itself with this very distinguished history. And just to mention that this 1956 is actually a very interesting year, because it's the same year, 56, 57, that the first neural networks emerged from the basement of Penn and MIT. And that's really interesting, right? So people in the 50s were really thinking about the beginnings of AI, you know, because neural networks is what we now call,
23:00 - 23:30 you know, goes under the rubric of AI. And at the same time, they were really thinking about computers to prove theorems and mathematics. So it's 56 was a kind of a magical year. And, you know, this neural network really was a neural network in the sense that, you know, they put cadmium sulfide cells in a basement. It's a wall size of photoreceptors. And they were using, you know, flashlights to try to stimulate neurons, literally to try to simulate computation.
23:30 - 24:00 That's quite an impressive thing. And then this thing really developed, right? And now, you know, half a century later, we have very advanced and very, very sophisticated, computer-aided proof, automated theorem provers. Things like the Coq system, the Lean system. And they were able to create, so Coq was what was used in this, the full verification of the proof of the Four
24:00 - 24:30 Color theorem was through the Coq system. And, you know, then there's the Feit-Thompson theorem, which got Thompson the Fields Medal. Again, they got the proof through this system. And Lean is very good. I do a little bit of Lean, but also Lean, the true champion of Lean is Kevin Buzzard at Imperial, 30 minutes down the road from here, from this spot. And he's been very much a champion for what he calls the Xena project, and using Lean to formulate, to formalize all of
24:30 - 25:00 mathematics. That's the dream. And what Lean has done now is that it has, Kevin tells me, all of the undergraduate level mathematics at Imperial, which is a non-trivial set of mathematics, but still a very, very tiny bit of actual mathematics. And they can check it, and everything that we've been taught so far at undergraduate level is good and self-consistent, so nobody needs to cry about
25:00 - 25:30 that one. Wonderful. And so that's all good. And then more recent breakthroughs is the beautiful work of, you know, so three Fields Medalists here. So two Fields Medalists, Gowers, Manners, I think it's his name, and Tao, where they proved this conjecture, which I don't know the details of, but they were actually using Lean to prove, to help prove this. And I think Terry Tao in this public lecture, which he gave recently in 2024 in Oxford, he calls this whole idea of AI co-pilot,
25:30 - 26:00 which I very much like this word. I was with Tao in August in Barcelona, we were at this conference, and he's very much into this, and of course, you know, Tao, Terry Tao for us is, you know, is a godlike figure. And the fact that he's championing this idea of AI co-pilots for mathematics is very, very encouraging for all of us. Yes. And for people who are unfamiliar
26:00 - 26:30 with Terry Tao, but are familiar with Ed Witten, Terry Tao is considered the Ed Witten of math, and Ed Witten is considered the Terry Tao of physics. Yeah, I've never heard that expression. That's kind of interesting. At Barcelona, when Terry was being introduced by the organizer, Eva Miranda, she said, Terry Tao is, this is a very beautiful sentence, Terry Tao has been described as the Gauss of mathematics. Yes, or the Mozart. But I think a more appropriate thing to describe
26:30 - 27:00 him is to describe him as the Leonardo da Vinci of mathematics, because he has such a broad impact on all fields of mathematics, and that's a very rare thing. Yeah, I remember he said something like, topology is my weakest field, and by weakest field to him, it means I can only write one or two graduate textbooks off of the top of my head on the subject of topology. Exactly, exactly. I guess
27:00 - 27:30 his intuitions are more analytic. He's very much in that world of analytic number theory, functional analysis. He's not very pictorial, surprisingly. Like Roger Penrose has to do, everything has to be in terms of pictures. But Terry is a symbol, symbolic matcher. He can just look at equations, extremely long, complicated equations, and just see which pieces should go together. That's kind
27:30 - 28:00 of very interesting. Speaking of Eva Miranda, you and I, we have several lines of connection. Eva's coming on the podcast in a week or two to talk about geometric quantization. Awesome. Eva is super fun, right? She's filled with energy. Yes. She's a good friend of mine. I think in this academic world of math and physics, I think we're at most one degree of separation from anyone else. It's a very small community, relatively small community. Back to this thing about, of course,
28:00 - 28:30 one could get overoptimistic. I was told by my friends in DeepMind that Szegedy, who I think is on this AI math team, was extrapolating that computers beat humans in chess in the 90s, beat humans in Go in 2018, so they should beat humans in proving theorems in 2030. I have no idea how he extrapolated these two points. They're only three data points. But DeepMind has a product to sell,
28:30 - 29:00 so it's very good for them to be overoptimistic. But I wouldn't be surprised that this number… Well, I'm not sure to beat humans, but it might give ideas that humans have not thought about before. So that's possible. Just moving on. So that's the bottom up. And as I said, this is very much a blossoming, or not blossoming, it's very much a long, distinguished field of automated theorem computing, of theorem provers and verifications of formalization of mathematics,
29:00 - 29:30 which Tao calls the AI copilot. Just to mention a bit with your question a bit earlier about metamathematics. So this is just kind of… I like your analogy. This is like the Chinese room. Can you do mathematics without actually understanding anything? You know, personally, I'm a little biased, because having interacted with so many undergraduate students before I moved to the London Institute so I don't have to teach anymore, teach undergraduates, I've noticed,
29:30 - 30:00 you know, maybe one can say the vast majority of undergraduates are just pattern matching, whether there's any understanding. I think this is one of the reasons why, you know, why ChatGPT does things so well. It's not just because… It's not because, oh, you know, LLMs are great, large language models are great. It's more that most things that humans do are so without comprehension anyway. So that's why it's kind of this pattern matching idea. And this is
30:00 - 30:30 also true for mathematics. What's funny is that my brother's a professor of math in the University of Toronto for machine learning, but for finance. And I recall 10 years ago, he would lament to me students that came to him who wanted to be PhD students, and he would say, okay, but Curt, some of them, they don't have an understanding. They have a pattern matching understanding. He didn't want that at the time, but now he's into machine learning, which is effectively that times 10 to the power of 10. Right, right. No, no, I completely agree. I mean, this is not… This is
30:30 - 31:00 not to criticize undergraduate studies. You know, I think in undergraduate students, it's just that, you know, it's part of being human. We kind of pattern match, and then we do it the best we can. And then, of course, if you're Terry Tao, you know, you actually understand what you're doing, but you know… Of course. But the vast majority of us doing most of the stuff is just pattern matching. So that's why… And this is true even for mathematics. So here, I just want to mention
31:00 - 31:30 something, which is a fun project that I did with my friends, Vishnu Jejjala and Brent Nelson, back in 2018, before LLM. So before all this LLM for science thing. And this is a very fun thing, because what we did, we took the arXiv, and we took all the titles of the arXiv. You know, this is the, you know, the preprint server for contemporary research in theoretical sciences. And, you know, we were doing LLM classifiers, Word2Vec, very old fashioned. This is a neural
31:30 - 32:00 network, Word2Vec. And, you know, you can classify this and do their thing. But what's really interesting, this is my favorite bit, we took, to benchmark the arXiv, we took viXra. So viXra is a very interesting repository, because it's arXiv spelled backwards. And it has all kinds of crazy stuff. I'm not saying everything on viXra is crazy, but it certainly has everything that arXiv rejects, because it thinks it's crazy. Things like, you know, three page proof
32:00 - 32:30 of the Riemann hypothesis, or Albert Einstein is wrong, it's got filled with that. It's interesting to study the linguistics, even at a title level, you could see that, you know, what they call the distinctions of quantum gravity versus the other things, they have the right words in viXra. But the word order is already quite random, that, you know, in other words, the classification matrix, the confusion matrix for viXra is certainly not as distinct as arXiv, which, you know, so kind of interesting, you know, you get all the right buzzwords. It's like, you know, kind
32:30 - 33:00 of thing, viXra, I think, is a good benchmark, that linguistically is not as sophisticated as, you know, real research articles. But this idea, so this is something much more serious, is this very beautiful work of Tshitoyan et al. in Nature, where they actually took all of material science, and they did a large language model for that, and they were able to actually generate new reactions in material science. So this, I think, this paper in 2019, this paper by Tshitoyan, is really the
33:00 - 33:30 beginnings of LLM for scientific discovery. This is quite early, I mean, this is 2019, right? Yeah, and it's remarkable how we can even say that that's quite early. The field is exploding so quickly. Absolutely. That five years ago is considered quite some time ago. Yeah, absolutely. Even five years ago, you know, I was still very much in a lot of... I've evolved in thinking a lot about this thing. I would also like to get to your personal use cases for LLMs,
33:30 - 34:00 ChatGPT, Claude, and what you see as the pros and cons between the different sorts, like Gemini was just released at 2.0, and then there's o1, and there's a variety. So at some point, I would like to get to you personally, how you use LLMs, both as a researcher and then your personal use cases. Okay, now I can mention a little bit. One of the very, very first things when ChatGPT-3 came out in, what, 2018, something, 2019, something like that? ChatGPT, oh, three. You mean GPT-3? GPT-3,
34:00 - 34:30 like the really early baby versions. Yeah, that was during, just before the pandemic. Just before pandemic. So that was just like, so I got into this AI for math through this Calabi-Yau metaphor, which I'm going to mention a bit later. And then GPT came out when I was just thinking about this large language model. So this is a great conversation. So I was typing problems
34:30 - 35:00 in calculus, freshman calculus, and it was solved fairly well. I mean, it's really quite impressive what it can do. So it's fairly sophisticated because things like, I was typing questions like, take vector fields, blah, blah, blah on a sphere, find me the grad or the curl. I mean, it's like
35:00 - 35:30 first, second year stuff, and you have to do a lot of computation. And it was actually doing this kind of thing correctly, partially because there's just so many example sheets of this type out there on the internet. And so it's kind of learned all of that. So I was getting very excited and I was trying to sell this to everybody at lunch. I was having lunch with my usual team of colleagues in Oxford over this. And of course, lo and behold, who was at lunch was the great Andrew Wiles. So
35:30 - 36:00 I felt like I was being a peddler for GPT, LLM for mathematics, to perhaps the greatest living legend in mathematics. And Andrew's super nice, and he's a lovely guy. And he just instantly asked me, he says, how about you try something much simpler? Two problems he tried. The first one was, tell me the rank of a certain elliptic curve. And he just typed it down, a certain elliptic curve,
36:00 - 36:30 or rational points, of a very simple elliptic curve, which is his baby. And I typed it, and it got completely wrong. It very quickly started saying things like, five over seven is an integer. Partially because this is a very hard thing to do. You can't really guess integer points. Unlike in calculus, where there's a routine of what you need to do. And then very quickly, we converge on an even simpler problem. How about find the 17th digit in the decimal expansion of
36:30 - 37:00 22 divided by 29, like whatever. And then it's completely random, because you can't train, you actually have to do long division. This is primary school level stuff. And yet GPT just simply cannot do, and it's inconceivable that it could do it, because no language model could possibly do this. But GPT now, o1, o1 is already clever enough. When you ask it a question like this, linguistically,
37:00 - 37:30 it knows to go to Wolfram Alpha. And then it's okay, then it's actually doing the math. So something so basic like this, you just can't train a language model to do. You get one in 10 right, and it's just a randomly distributed thing. Whereas sophisticated things, they are seemingly sophisticated things, like solving differential equations, or doing very complicated integrals. It can do, because there's somewhat of a routine, and there are enough samples out there. So anyway,
37:30 - 38:00 so that's my use case, two use cases. That's also not terribly different than the way that you and I, or the average person, or people in general, think. So for instance, we're speaking right now in terms of conversation. And then if we ask each other a math question, we move to a math part of our brain. We recognize this is a math question. So there's some modularity in terms of how we think. It's not like we're going to solve long division using Shakespeare. Even if we're in
38:00 - 38:30 a Shakespeare class, and someone just jumps in and then asks that question, we're like, okay, that's a different, that's of a different sort of mechanism. Yeah, that's a good analogy. Yeah, yeah. When you first encountered ChatGPT, or something more sophisticated that could answer even larger mathematical problems, did you get a sense of awe or terror initially? So I'll give you an example. There was this meeting with some other math friends of mine, and I was showing them ChatGPT when it first came out. And then one of the friends said, explain,
38:30 - 39:00 can you get it to explain some inequality, or prove some inequality? And then it did, and then explained it step by step. Then everyone just had their hand over their mouth like, are you serious? Can you do this? And then they're like, then one said, one friend said, this is like speaking to God. And then another friend said, had the thought like, what am I even doing? What's the point of even working, if this can just do my job for me? So did you ever get that sense? Like, yes, we're excited about the future, and it as an assistant, but did you ever feel any sense
39:00 - 39:30 of dread? I'm by nature a very optimistic person. So I think it was just awe and excitement. I don't think I've ever felt that I was threatened, or the community is being threatened. I could totally be wrong. But so far, I just say, this is such an awesome thing, because it'll save me so much time looking up references and stuff like this. I was happy. I was just like, wow, this is kind
39:30 - 40:00 of cool. I mean, I guess if I were an educator, I might get a bit of a dread, because there's like, you know, undergraduate degrees, you know, if you do an undergraduate degree, it's just basically one ChatGPT being fed to another. You know, a lot of my colleagues started setting questions in exams with ChatGPT, with fully LaTeXed out equations. I mean, this is becoming the standard thing to do. I guess even if you're an educator, you would probably worry. But I was thinking about
40:00 - 40:30 just long term discovery of, you know, what new knowledge can we generate? So in that sense, this is certainly going to be an incredible help, because it's got all the knowledge in the background. Wonderful. All right, let's move forward. Yeah, sure. So 2022 was a great year. I'm surprised this wasn't like over every single newspaper. I don't know why. At least I was told there was some obscure outlet. I can't remember. Some expert friends in the community told me that
40:30 - 41:00 ChatGPT has passed the Turing test. This is a big deal. But I don't know why it hasn't been. I was hoping to see this on BBC and every major newsletter, but it didn't catch on. But anyhow, I believe that in 2022, ChatGPT has passed the Turing test. And then, you know, we're in the last two years, this is obviously where we can, you know, this is a huge development now for large language models for mathematics. And, you know, every major company, OpenAI, Meta AI, Epoch AI,
41:00 - 41:30 you know, everything. And they've been doing tremendous work in trying to get LLM for math. Basically, you know, take the arXiv, which is a great repository for mathematics and theoretical physics, pure mathematical and theoretical physics, and then just learn that and try to generate to see it to how much. And this is very much a work in progress. And of course, you know,
41:30 - 42:00 AlphaGeo, AlphaGeo2, AlphaProof, this is all the DeepMind's success. It's kind of interesting within a year, you know, you've gone from 53% on Olympiad level to 84%, which is, you know, this is scary, right? This is scary in the sense that impressively awesome that, you know, they could do so quickly. So basically in 2022, an AI is approximately equal to the 12-year-old Terence Tao, in the sense that it could do a silver medal. But of course, this is a very specialized,
42:00 - 42:30 you know, the AlphaGeo2 was really just homing in on Euclidean geometry problems, which to be fair, extremely difficult, right? If you don't know how to add the right line or the right angle, you have no idea how to attack this problem, but it's kind of learned how to do this. So it's kind of nice. So, you know, this is all within, you know, a couple of years. And there's this very
42:30 - 43:00 nice benchmark called FrontierMath that Epoch AI has put out. I think there was a white paper and they got Gowers and Tao, you know, the usual suspects, just to benchmark. Okay, fine. So it can do 84% on Math Olympiad, which is sort of high school level. What about truly advanced research problems, right? To my knowledge, as of the beginning of this month, it was only doing 2%. So that's okay, fine. So it's not doing that great. But the beginning of this week, you learn
43:00 - 43:30 that OpenAI o3 is doing 25%. So we've gone 20% up. We've got a fifth up within four weeks of what they can do. So this is, wow, that's kind of very interesting. Such a rapid improvement. It's so, this is great. I love this, right? Because it's exciting. It's very rare to be. I remember back in the day when I was a PhD student, doing AdS/CFT related algebraic geometry, because Maldacena had
43:30 - 44:00 just come out with a paper in 97, 98, and that's just when I began my PhD. I remember that kind of excitement, the buzz in the string community. People were saying, there was a paper every couple of days on the next, that kind of excitement. And I haven't felt that kind of excitement for a very long time, just because of this. Wow. And then this is like that, right? Every week, there's this
44:00 - 44:30 new benchmark, a new breakthrough. So that's why I find this field of AI-assisted mathematics to be really, really exciting. Can you explain, perhaps it's just too small on my screen because I have to look over here, but can you explain the graph to the left with Terence Tao? Oh, gosh. I'm not sure I can, because I'm not sure I can read this graph in detail. I think it's the, it's the year. What is it trying to convey? So it's the ranking of Terence. Oh, no, this is just Terence
44:30 - 45:00 Tao's individual performances over different years, over different problems. So he's retaking the test every year? No, no, he's taken it three times. Ages 10, 11, and 12. And when he was 10, he got the bronze medal, and then he got the silver medal, then he got the gold medal within three years. Okay. And age of 12 or something. But I can't, I think... What are those bars, though? I
45:00 - 45:30 think the bars, it's a good question. I think, maybe it's to the different questions, you're given 60 questions, and what it would take to get the gold medal, I think, or what it would take to get the silver medal. I think. How many percents do you have to correct? Okay, so it wasn't a foolish question of mine. It's actually... No, no, no, no, it's a good question. I have no recollection, or maybe I never even looked at it. Somebody told me about this graph at some point. I forgot what it is. Okay, because it looks to me like Terence Tao is retaking the same test, and
45:30 - 46:00 then this is just showing his score across time, and he's only getting better. But that can't be it. Why would he retake a test? He's a professor. No, I think it goes to 66. It must be like... This is an open source graph. Oh, I thought you were going to say, this is an open problem in the field. What does this graph mean? No, no, no. It's an open source. This graph is just... You can take it from the Math Olympiad database. Got it. Which I shamelessly... See, again, perfect,
46:00 - 46:30 right? I've just done something that I have absolutely no understanding of. I've presented to you like a language model, and I just copied and pasted, because it's got a nice, cute picture of Terence Tao when he was a little... So, finally, I'll go back to the stuff that I've really been thinking about, which is this sort of top-down mathematics, right? So, and then this is kind of interesting. So, the way we do research, you know, practitioners, is completely opposite to
46:30 - 47:00 the way we write papers. I think that's important to point that out. We muck about all the time. We do all kinds... When you look at my board, right, it's just filled with all kinds of stuff. And most of it is probably just wrong. And then once we got a perfectly good story, we write it backwards. And I think writing math papers backwards, and math generally defined, math and theoretical physics papers backwards, well, theoretical physics is a bit better. At least sometimes you write the
47:00 - 47:30 process. But in pure math papers, everything is written in the style of Bourbaki, this very dry definition proof, which is completely not how it's actually done at all. This is why, you know, Arnold, the great Vladimir Arnold, says, you know, Bourbaki is criminal. He actually used this word, the criminal Bourbakization of mathematics, because it leaves out all human intuition and experience. It just becomes this dry machine-like presentation, which is exactly how things should
47:30 - 48:00 not be done. But Bourbaki is extremely important, because that's exactly the language that's most amenable to computers. So, you know, it's one way or another. But we, you know, human practitioners certainly don't do this kind of stuff, right? We muck about, you know, we have to... And sometimes even rigor is sacrificed, right? If we have to wait for proper analysis in the 19th century to come about, before Newton invented calculus, we won't even know how to
48:00 - 48:30 compute the area of an ellipse. Because we have to wait and formalize all of that. It'll just go all backwards. So kind of the historical progression of mathematics is exactly opposite to the way that it's presented. I mean, it's fine, but the way it's presented is better. It's much more amenable to a proof copilot system like Lean than what we actually do. Even science in general is like that,
48:30 - 49:00 where we say it's the scientific method, where you first come up with a hypothesis and then you test it against the world, gather data and so on. But the way that scientists, not just in math and physics, but biologists and chemists and so on, work, are based on hunches and creative intuitions and conversations with colleagues and several dead ends, and then afterward you formalize it into a paper in terms of step-by-step, but it was highly nonlinear. You don't even have a recollection most
49:00 - 49:30 of the time of how it came about. That's right. And I think one of the reasons I got so excited about all this AI for math is this direction. Because this hazy idea of intuition or experience, this is something that a neural network is actually very, very good at. Wonderful. So I'm going to give concrete examples later on about how it guides humans. But just to give
49:30 - 50:00 some classical examples, I've said this joke so many times. So what's the best neural network of the 18th century? Well, it's clearly the brain of Gauss. I mean, that's a perfectly functioning, perhaps the greatest neural network of all time. I want to use this as an example. Because what did Gauss do? Gauss plotted the number of prime numbers less than a given positive real
50:00 - 50:30 number, just to give a sort of continuity. And he plotted this, and it's kind of a really, really, you know, jaggedy curve. And it's a step function because it jumps whenever you hit a prime. But Gauss was just able to look at this when he was 16 and say, well, this is clearly x over log x. How did he even do this experience? I mean, he had to compute this by hand, and he did, and he got
50:30 - 51:00 some of them wrong even. You know, primes, he had tables. By his time, the tables of primes were up in the tens and hundreds of thousands. He has it go up in the hundred thousand range. And you can just look at this as x over log of x. But this is very important because he was able to raise a conjecture before the method by which this conjecture is proved, namely complex analysis, was even conceived of by Cauchy and Riemann. And that's a very important fact. So he just
51:00 - 51:30 kind of felt that this was x over log x. And you had to wait for 50 years before Hadamard and de la Vallée Poussin to prove this fact because this technique, which we now take for granted, this technique called complex numbers, complex analysis, wasn't invented by Cauchy. It wasn't invented yet. You had to wait for that to happen. So it happens like this in mathematics all the time. Even major things. Of course, you know, now it's called the prime number theorem,
51:30 - 52:00 which is a cornerstone of all of mathematics. This is the first major result since Euclid on the distribution of primes. How did Gauss say this was x over log x? I don't know. Because he had a really great neural network. And this happens over and over again. Like, you know, the Birch-Swinnerton-Dyer conjecture, which I'm going to talk about later, which is one of the millennium problems. And it's still open, and it's certainly one of the most important problems in mathematics of all time. And this is Birch-Swinnerton-Dyer in a basement, you know,
52:00 - 52:30 in Cambridge in the 1960s. They just plotted ranks and conductors of elliptic curves. I'm going to define those in more detail later. And they would say, oh, that's kind of interesting. You know, the rank should be related to the conductor in some strange way. And that's now the BSD conjecture, the Birch-Swinnerton-Dyer conjecture. And what they were doing was computer-aided conjectures. So here was the eyeballs of Gauss in the 19th century. But the 20th century really had seriously
52:30 - 53:00 computer-aided conjectures. And of course, the proof of this is still open in general. There's been lots of nice progress in this. And, you know, where we're going to go is very much, what technique do we need to wait to prove something like this? Now, is there a reason that you chose Gauss and not Euler? Like, is it just because Gauss had this example of data
53:00 - 53:30 points and guessing a form of a function? I'm sure Euler, who certainly is great, had conjectures. Maybe... That's an interesting question. I'll mention Euler later. But I think there's not an example as striking as this one. In fact, what's interesting, as a byproduct of Gauss inventing this, because he was kind of mucking around with statistics, right? This is before statistics existed as
53:30 - 54:00 a field as well, right? This is like early 1800s. And Gauss, I think, and you can check me on this, Gauss got the idea of statistics and the Gaussian distribution because he was thinking about this problem. So it's kind of interesting. So he was laying foundations to both analytic number theory and modern statistics in one go. He was doing regression. So I think he essentially invented
54:00 - 54:30 regression and the curve fitting, which is like 101 of modern society. He was trying to fit a curve. What was the curve that really fit this? In the process, he got x over log x. And in addition, he got this idea of regression. An impressive guy. What can we say? He's a god to us all. The
54:30 - 55:00 upshot of this is like, I love this. Again, this is something I found on the internet. And just to emphasize that this idea of... Speaking of God. Yes, speaking of God, this idea of mucking about with data in pure mathematics is a very ancient thing, right? Once you formulate something like this in conjecture, you will write your paper. Imagine writing a paper, you will say,
55:00 - 55:30 conjecture, definition, prime, definition, pi of x, then conjecture, pi of x, evidence. Rather than all of the failed stuff about inventing regression and mucking about, all that stuff just gets not written at all. That intuitive creative process is not written down anywhere. So it's great. I'm glad I'm chatting to you about it, right? Because it's nice to have an audience with this, right? So if you look at like... So pattern recognition, what do we do in terms of pure mathematical data? If I
55:30 - 56:00 gave you a sequence like this, you can immediately tell me what the next number is, to some confidence. Zeros is just, you know, this is just multiple of three or not. This one, I've tried this with many audiences, and after a few minutes of struggle, you can get the answer. And then this turns out to be the prime characteristic function. So what I've done here is to mark all the odd integers. And evens, obviously, you're going to get zero. So it's kind of pointless. You just add
56:00 - 56:30 just a sequence of odd integers. And then it's a one if it's a prime, it's zero if it's not. So 3, 5, 7, 8, and so on and so forth. No, sorry, 3, 5, 7, 9, 11. And you mark all the odd ones, which are one. And you can probably, after a while, you can muck about and you can see where this is going. The next sequence is much harder. So I'm going to give away so we won't have to spend a couple
56:30 - 57:00 of hours staring at it. So this one is what's called the shifted Möbius function. What this is, just you take an integer, and you take the parity of the number of prime factors it has up to multiplicity, starting from 2. I think I didn't start from 1 here. And then it's 0 if it's an odd number of prime factors, it's 1 if it's an even number of prime factors, for all the
57:00 - 57:30 sequence of integers. And I hope now I've gotten this right. So if I think I start with 2, 2 has, so that's all. No, let's see, 2, 3. Yeah, so I did start at 1. I'm going to mark 1 for 1, just to kick off the sequence. And then 2 is a prime number, it has only one prime factor. It's an odd number. 3 is an odd number of prime factors. 4 is 2, because it's 2 squared. So it has an even number of prime
57:30 - 58:00 factors, and so on and so forth. So 5 is prime, it has one odd number. 6 is 2 times 3, so it has 2, an even number of prime factors, and so on and so forth. It looks kind of harmless. What's really interesting, so this is even number. So if you stare at this for a while, it's very, very hard to recognize a pattern. And what's really interesting is that to know the parity of the next number,
58:00 - 58:30 if you have an algorithm that can tell me the parity of this in an efficient way, you will have an equivalent formulation of the Riemann hypothesis. So that's actually an extremely hard sequence to predict. So if you can tell me with some confidence more than 50% what the next number is, without looking up some table, then you can probably end up cracking every bank in the world. Because this is equivalent to the Riemann hypothesis. So I've just given three, so trivial,
58:30 - 59:00 kind of okay-ish, really, really, really hard. Yes. So now you can think about a question, if I were to feed sequences like this into some neural network, how would a neural network do? So one way to do it, so this goes a bend, so we go way back to the very beginning, to the question of what is mathematics? And Hardy, in his beautiful Apology, says, what mathematicians do is essentially we
59:00 - 59:30 are pattern recognizers. That's probably the best definition of what mathematics is, is that it's a study of patterns, finding regularity in patterns. And in fact, if there's one thing that AI can do better than us, it's pattern detection. Because we evolved in being able to detect patterns in
59:30 - 60:00 three dimensions and no more. So in this sense, if you have the right representation of data, you're sure that AI can do better than that. I mean, it'll generate a lot of stuff, but filtering out what is better is a very interesting problem in and of itself. So let's try to do one. I mean, there are various ways to do this representation. One way you can do it is to do a problem which is maybe best fit for an AI system, which is binary classification of binary vectors. So
60:00 - 60:30 what you do is, you know, sequence prediction is kind of difficult. So one thing you can do is just take this infinite sequence and just take, say, a window of a hundred, a thousand, a fixed window size, and then label it with the one immediately outside the window, and then shift, label, shift, label. So then you can generate a lot of training data this way. So for this sequence, I think I've just taken here, you know, whatever the sequence is, and I just,
60:30 - 61:00 with a fixed window size, and with this label. So now you have a perfectly supervised, perfectly defined binary supervised machine learning problem. Then you pass it to your standard AI, you know, algorithm that they're, you know, just, you know, out of the box ones, nothing, you don't even have to tune your particular architecture. Just take your favorite one, and then do cross validation, you know, the standard stuff, take sample, do the training,
61:00 - 61:30 you know, and then try to validate this on unseen data. So if you do this to the mod 3 problem, to this one, you immediately find that, you know, any neural network, or whatever Bayes classifier, would do it with 100% accuracy, as it should, because it would be really dumb if it didn't, because this is just a linear transformation. So even if you have a single neuron that's just doing linear transform, that's good enough to do it. The PrimeQ problem, I did some experiment,
61:30 - 62:00 some, oh gosh, like seven years ago, and it got 80% accuracy. And I was like, wow, that's kind of, this was a wow moment, I was like, why is it doing 80? I don't have a good answer to this. Why is it doing 80% accuracy to this? How is it learning? Maybe it's doing some sieve method, which is kind of interesting, somehow. The second number is just the chi-squared, just to double test what's called MCC, which is the Matthews correlation coefficient. These are just buzzwords in stats.
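[Editor's sketch] The window-and-label setup described above can be written out in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used in the experiment: the function names, the window size of 9, the train/test split, and the single-neuron perceptron are all hypothetical choices made here, and only the mod 3 case is actually trained. The prime-indicator and prime-factor-parity sequences are built so that a reader can swap them in, with no claim about the accuracy they would reach.

```python
from math import sqrt

def is_prime(n):
    """Trial-division primality test; fine for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)

def mod3_sequence(length):
    """1 at multiples of three, 0 elsewhere, for n = 1, 2, 3, ..."""
    return [1 if n % 3 == 0 else 0 for n in range(1, length + 1)]

def prime_sequence(length):
    """1 if prime else 0, over the odd integers 3, 5, 7, 9, ..."""
    return [1 if is_prime(2 * k + 1) else 0 for k in range(1, length + 1)]

def parity_sequence(length):
    """1 if n has an even number of prime factors else 0, for n = 1, 2, ..."""
    return [1 if big_omega(n) % 2 == 0 else 0 for n in range(1, length + 1)]

def windows(seq, size):
    """Slide a fixed window along seq; label each window with the next entry."""
    return [(seq[i:i + size], seq[i + size]) for i in range(len(seq) - size)]

def predict(model, x):
    """Output of a single linear neuron with a hard threshold."""
    weights, bias = model
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

def train_perceptron(data, max_epochs=100):
    """Classic perceptron rule (a stand-in for 'your favorite classifier')."""
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            step = y - predict((weights, bias), x)  # 0 if correct, else +1 or -1
            if step != 0:
                mistakes += 1
                weights = [w + step * v for w, v in zip(weights, x)]
                bias += step
        if mistakes == 0:
            break
    return weights, bias

def mcc(pairs):
    """Matthews correlation coefficient from (predicted, actual) pairs."""
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Shift-and-label on the mod 3 sequence, then validate on unseen windows.
data = windows(mod3_sequence(600), 9)
train, test = data[:400], data[400:]
model = train_perceptron(train)
results = [(predict(model, x), y) for x, y in test]
accuracy = sum(1 for p, a in results if p == a) / len(results)
print(accuracy, mcc(results))
```

Replacing `mod3_sequence` with `prime_sequence` or `parity_sequence` in the last block gives the other two problems; a single linear neuron is a deliberately weak baseline there, so a serious attempt would use a larger window and a stronger classifier.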
62:00 - 62:30 I never learned stats, but now I'm relearning. I took Coursera in 2017, so I can relearn all these buzzwords. Great. It's great, it's really useful. And then this shifted Liouville lambda function, it's, sorry, I think I made a, yeah, I mistakenly called this Möbius mu function. It's not, I mean, it's related, but it's not. It's the shifted Liouville lambda function. Got it. Sorry,
62:30 - 63:00 one of my neurons died when I said Möbius mu, but it's Liouville lambda. You were subject to the one-pixel attack. But so this one, I couldn't break 50%, right? 0.5 just means it's a coin toss. It's not doing any better guessing than whatever. And this chi-squared is 0.00, that means it's zero up to statistical error. So which means I couldn't find an AI system which could break, which could do better than random guess. I'm not saying there isn't one, it would be great if there
63:00 - 63:30 were one. And then, yeah, so it's kind of, you know, it's life. And I couldn't, if I do break it, you know, I might actually stand a good chance breaking every bank in the world. All right. But I don't, I haven't made it worse. Let's remain close friends. Yeah, that's right, that's right. So I was very proud of this because this experiment, I'm going to mention a bit later,
63:30 - 64:00 this Liouville lambda was just a thing I was just trying, like way back when. But apparently, Peter Sarnak, whom I really admire, he's one of the world's greatest number theorists currently, current number theorists. And I got to know him through this murmuration thing that I'm going to talk about later. And I reminded him that I almost became his undergraduate research student. I ended up doing, I was an undergrad at Princeton, where I had two paths I could follow for,
64:00 - 64:30 you know, it kind of defines your undergraduate thesis, right? So one was in mathematical physics, that's with Alexander Polyakov. And the other one was with, you know, two problems. And the other one was actually offered by Peter Sarnak on arithmetic problems. And I somehow just, because I wanted to understand the nature of space and time, I went through the Alexander Polyakov
64:30 - 65:00 path to do mathematical physics, which led me to do string theory. After 20, 30 years, I came full circle back to be in Peter Sarnak's world again. I met him at this conference, I reminded him of this, and he was very happy. But what's really interesting is that he was asking DeepMind the same question a few years ago about the Liouville lambda, whether DeepMind could do better than 50%. So I was glad
65:00 - 65:30 that I thought along similar lines as a great expert in number theory. And somebody who could potentially have been my supervisor, and then I would have gone into number theory instead of string theory, which is whatever, it's just how life happens. So perhaps you're going to get to this later on in the talk, but I noticed here you have the word classifier. And the recent buzz since 2020 or so has been with architecture, the transformer architecture in specific. So is there anything regarding mathematics, not just LLMs, that has to do with transformer
65:30 - 66:00 architecture that's going to come up in your talk? Not specifically. I'm actually, it's interesting, one of my colleagues here at the London Institute, he's Mikhail Burtsev. He's an AI, he's our institute's AI fellow, and he's an expert on transformer architecture. So I've been talking to him and we're trying to get, to devise a nice transformer architecture to address problems
66:00 - 66:30 in finite group theory. It's in the works. But nothing so far, even with the murmuration stuff, it's very basic neural networks, we didn't use anything more sophisticated than that. So to be determined whether it will outperform the standard ones will be kind of interesting. Got it. Yeah, so actually now we go way back to the beginning of our conversation, is how I got into this stuff. And that, I don't know, completely coincidentally was through string theory. So at this point,
66:30 - 67:00 maybe I'll just give a bit of a background of how all this stuff came about, at least personally. Why was I even thinking about this? Because I knew nothing about AI seven, eight years ago. Zero, like literally zero. I knew nothing more than to read it from the news. And this is actually a very interesting story, which shows again, the kind of ideas that the string theory community
67:00 - 67:30 is capable of generating, just because you got all these experts looking at kind of interesting problems. So let's go way back. And again, you know, I've quoted Gauss, right? I've got to quote, I have to say something about Euler. So this is a problem. Again, you can see I'm very influenced by three, the number three. You know, I'm a total numerologist, right? Trinity, name the three, three is something, right? And then there is what's called the trichotomy classification theorem by
67:30 - 68:00 Euler. This dates to 1736. So if you look at, so I'm going to say the buzzword, which is connected compact orientable surfaces. So these are, you know, I mean, the words explain themselves, you know, they have no boundaries and they're, you know, topologically, you know, whatever the topological surfaces. So Euler was able to realize that a single integer characterizes all
68:00 - 68:30 such surfaces. So this is the standard thing that people see in topology, right? So the surface of a ball is the surface of a ball, and you can deform it, you know, the surface of a football is the same as an American football, it can deform without cutting or tearing. And then the surface of a donut is the same as, you know, your cup, right? Because, you know,
68:30 - 69:00 it's everything that everyone, the standard thing, you know, this is, it has one handle. And so the surface of a donut is exactly, what they call, topologically homeomorphic to the cup. And then you got the, you know, the pretzel. So I think that's a pretzel. Or maybe, I think this is like the German pretzel, and it gets more and more complicated. But Euler is, because, you know, Euler invented the field of topology. So he realized this idea of topological equivalence,
69:00 - 69:30 in the sense that there's a single topological invariant, which we now call the Euler number, which characterizes these things. Another way to, an equivalent way to say is, the genus of these surfaces is, you know, no handles, one handle, two handles, three handles, and so on and so forth. It turns out that the Euler number is 2 minus twice the genus. So 2 minus 2g. Okay, that's great. So this is, that's the classic Euler's theorem. And then,
69:30 - 70:00 you know, comes in Gauss, right? Once you've got these three names, the Euler, Gauss, and Riemann, you know, this is, it's got to be some serious theorem, right? So Euler did this in topology. And then Gauss did this incredible work, which he himself calls the Theorema Egregium, the great theorem, which he considers his personal favorite. And this is Gauss,
70:00 - 70:30 right? And Gauss said, you can relate this number to, which is, this number is purely topological. You can relate this number to metric geometry. So he came up with this concept, which we now call Gaussian curvature, which is some complicated stuff. You can characterize this curvature, which you can define on this. Well, this is even before the word manifold existed on the surface. And then
70:30 - 71:00 you can integrate using calculus, and the integral of this Gaussian curvature divided by 2 pi is exactly equal to this topological number. And that's incredible, right? The fact that you can do an integral, it comes out to be an integer. And that integer is exactly topological. So this idea, this Gauss related geometry to topology in this one sweep. And then what's even the next level,
71:00 - 71:30 comes Riemann. Riemann says, well, what you can do is to complexify. So these are no longer, you know, real connected compact orientable surfaces. But you can think about these as complex objects. So what do we mean by that is, well, if you think about, you know, the real Cartesian plane, that's
71:30 - 72:00 a two dimensional object. But you can equally think of that as a one complex dimensional object, namely the complex plane. Or the complex line. Yeah, the complex line, exactly. So what was R2, Riemann would call C. And then Riemann realized that you can put similar structure on all of these things as well. So all of a sudden, these things are no longer two dimensional real orientable surfaces. But one complex dimensional, what's now called curves. I mean, it's a terrible name. So a
72:00 - 72:30 complex curve is actually a two real dimensional surface. And it turns out that all complex curves are orientable. So you already rule out things like, you know, Klein bottles and stuff like that, or Möbius strips. So the complex structure requires orientability. And that's partly because of Cauchy-Riemann relations, you know, it puts a direction, you can't get away. But the idea is,
72:30 - 73:00 the interesting thing is, all of these now should be thought of as one complex dimensional curves. They're called curves because they're one complex dimension, but they're not curves, right? They're surfaces in the real sense. Yes. So now here comes, so if you apply this to the Gauss thing, you get this amazing trichotomy theorem. And the theorem says, if you do this to the curvature, you can see this, I mean, the number here is two, right? You get the Euler number two,
73:00 - 73:30 which is a positive curvature thing, right? And that's consistent with the fact that the sphere is a positively curved object. Locally, everywhere, it has positive curvature. If you do it to a torus, or the surface of a donut, which is just called, you know, the algebraic donut, you integrate that, you get zero curvature. And this is not a surprise, because, you know, you have a sheet of paper, you fold it once, you get a cylinder, and you fold it again, you glue it again, you get this torus, this donut. And this sheet of paper is inherently
73:30 - 74:00 flat. Yes. So if you just take a piece of paper, you take this piece of paper, and you roll it up, you get a cylinder. And then you do it again, you get the surface of a donut, like a rubber tire. And that is inherently zero curvature. And then you can do this, and this is a consequence
74:00 - 74:30 of what's known as the Riemann uniformization theorem. If you do anything that has more than one handle, you get zero curvature. So now you have the trichotomy, right? You have positive curvature, zero curvature, and negative curvature. The one in the middle is really, obviously, it's interesting, it's the boundary case. In complex algebraic geometry, these things are called Fano varieties. Earlier you said, if you have anything that's more than one handle, you have zero curvature, you meant negative curvature. Sorry, sorry, I meant negative curvature. Negative,
74:30 - 75:00 yeah. So these fidget spinners on the right, they all have negative curvature. Everything here has negative curvature. Yeah. So now in the world of complex algebraic geometry, these positive curvature things are called Fano varieties, after this Italian guy, Fano. These negative curvature objects which proliferate are called varieties of general type. And this boundary case are called zero curvature objects. And it just so happens, we now call things in the middle,
75:00 - 75:30 Calabi-Yau. These zero curvature objects. Yes. So far, this has got nothing to do with physics. I mean, it's just the fact of topology. But this is such a beautiful diagram that took from 1736 until Riemann. Riemann died in the 1860s, I think, or something like that. So it took 100, 120 years to really formulate just this table to relate metric geometry to topology, to algebraic
75:30 - 76:00 geometry. It's kind of a beautiful thing, right? So to generalize this table is the central piece of what's now called the minimal model program in algebraic geometry, for which there have been all these Fields Medalists, you know, Birkar a couple of years ago, and then it started with Mori who got the Fields Medal, and then this whole Mukai and this whole distinguished idea. So basically, this minimal model program should just generalize this to higher dimension. This is dimension,
76:00 - 76:30 complex dimension one, right? How do you do it? It's very hard. And once you have it, I won't bore you with the details. This is very nice. You know, there's topology, algebraic geometry, differential geometry, index theorem, they all get unified in this very beautiful way. And you want to, obviously, you want to generalize this to arbitrary dimension, arbitrary complex dimension. It'd be nice. It's still an open problem. How do you do it in general? It's a very nice problem. But at least for a class of complex manifold known as Kähler manifolds,
76:30 - 77:00 I won't bore you with the details, but Kähler manifolds are ones where the metric has very nice behavior, there's a potential whose double derivative gives you the metric. And then it was conjectured by Calabi in the 50s. Again, you know, 54, 56, 57, it was a great year, right? All these different ideas, I mean, in three completely different worlds, now come together because mathematical physicists have kind of tied it up, you know, the world of neural networks,
77:00 - 77:30 the world of Calabi conjecture, the world of string theory to one. I like, you know, when things get bridged up in this way, you know, but again, the theorem itself is extremely technical. But the idea is for this Kähler manifolds, there is an analog of this diagram, basically. I love this slide. I saved this slide for my own private notes. I keep a collection of dictionaries in physics and math. Yeah. I think this is beautiful. Yeah, me too. I mean, but it took me years to do
77:30 - 78:00 this table because, you know, it's not written down anywhere. And it touches different things. I think it's not written down anywhere precisely because math textbooks are written in the Bourbaki style. But now it just becomes clear what people have been thinking about for the past 100 years, you know, after Grothendieck. It's just trying to relate these ideas. You know,
78:00 - 78:30 this is intersection theory of characteristic classes. Ah, so this is topology. And, you know, this is, I mean, this is over 200 years of work of, you know, the central part of analytics. And mathematicians like Chern, Ricci, Euler, Betti. Yeah, everything. Everybody who was ever involved in this diagram is an absolute legend. In fact, there is one more column to this diagram. I think for sure, I think when I did, I mean, this was a slide from some time ago. But when I was
78:30 - 79:00 talking to a string audience, there is one more, one more, which is relations to L-functions. And that's where number theory comes in. So there is one more column. And to understand this world, this one more column of its behavior to L-functions, that's the Langlands program. So it's actually really magical that this table actually extends more as far as, I mean, that's just as far as we know now, right? The L-functions and its relations to modularity. And this is, of course,
79:00 - 79:30 obviously, to me, like mathematics is about extending this table as much as possible to let it go into different fields of mathematics. So, but at least for sure, we know there is one because of the Langlands correspondence, there is one more column and that column should be on number theory and modularity. And soon there'll be another table on the Yang invariant, the He invariant. No, I don't think, I don't think I have enough talent to create something that, but it could well be there
79:30 - 80:00 should be something, something new to do. To me, that's really the most fun part about mathematics. It's not, not so, I mean, they're like, you know, I, who was it? I think maybe it's Arnold as well, who said there are two types of mathematicians. There are the hedgehogs and there are the birds, right? Hedgehogs really like, you know, like, like specialize, specialize. I mean, you absolutely need it. I think, you know, who is a great hedgehog? I think Zhang, the guy who,
80:00 - 80:30 you know, made this first major breakthrough in the prime gap. I mean, he's been spending his entire life, just trying to think about, can I bound, can I bound the, you know, the, how many, you know, in the, in the, what is the, what's the lim sup of the, of the distance between, between prime pairs. And the technique he uses is, it's beautifully argued, analytic number theory
80:30 - 81:00 technique, sieve methods, you know, kind of, you know, the, the, the Ben Green world of this, of, of, of sieves and James Maynard. And then there are the, the, the, the, the birds who are like, you know, I'm just going to just fly around. I may bump into trees and whatnot, but I'm just trying to see whether they can do. And, and people like Robert Langlands and, you know, they're very much in that world. Can I see from a distance? I mean, I may get very coarse grain view. And which are you? I'm 100% in the bird category. I mean, what I, I, I like to go,
81:00 - 81:30 you know, once I, once I see something and I, I, I, of course, sooner or later, you need to dig like a, like a hedgehog, but the most thrill that I get is when I say, oh, wow, this is gets, gets connected. So the results are proven when you dig, but the connections are seen when you get the overview. Yeah. Yeah, absolutely. So, I mean, of course, again, this is a division that's kind of artificial in all of us. We, we, we do a bit of both. Yes. The guy who really does it well is,
81:30 - 82:00 I forgot to mention, of course, it's, it's, it's like, it's, it's become like a grand, well, he, he passed away, John McKay, who was a Canadian, probably the, the, the great, the greatest, greatest Canadian mathematician since Coxeter. John McKay really saw unbelievable connections in fields that nobody will ever see. And he passed away, he became sort of, in the last 10 years of
82:00 - 82:30 life, he became sort of like a, like a grandfather to, to me. He, you know, he saw my kids grow up, you know, over Zoom. I know, so the, the London Math Society asked me to write an obituary. I was very touched by this, and so I wrote his obituary for it, and I was just trying to say, well, this guy is the ultimate pattern, you know, linker. So, so John McKay, absolute legend. Great. Moving on. I mean, this is a, this is very much, this is very much a huge digression for what
82:30 - 83:00 I'm actually going to tell you about, which is, you know, the Birch test for AI. And that's... Great. Do you have a limit on what these videos are? No, just so you know, some of them are one hour, some of them are four hours. And people listen for all of it. Yeah, this is great fun. Great. Yeah, same. I'm loving this. Yeah, me too. Because normally, you know, I have one hour, you know, what, 55 minute cutoff. I could give a talk, right, and then five minutes questions.
83:00 - 83:30 And they're like, oh my God, I haven't said most of the stuff I wanted to say. Yeah, yeah, exactly. Because the point of this channel is to give whoever I'm speaking to enough time to get through all of their points, rather than they're rushing and not covering something in depth. I want them to be technical and rigorous. So please continue. Sure. Sounds good to me. So in that magical year of 1957 of neural networks, the magical year of the automated theorem prover world, and the world
83:30 - 84:00 of algebraic geometry, in three completely different worlds, they didn't even know each other's names, let alone the results. Calabi conjectured that at least for Kähler manifolds, this diagram is very much well-defined, this table. And Yau proved it 20 years later. So Shing-Tung Yau, who is, again, very much like a mentor to me. And he gets the Fields Medal immediately. So
84:00 - 84:30 you can see why this is so important. He gets the Fields Medal because this idea of following through Calabi is trying to generalize this sequence of ideas of Euler, Riemann, and Euler, Gauss, and Riemann. So it's certainly very important. So there it is. We can park this idea. So Yau showed that there are these Kähler manifolds that have this property, that have the right metrical properties. So by metric, I mean distance, something you can integrate over. Because here,
84:30 - 85:00 you never think that this integral is messy, right? Even if we do this on a sphere, this R has all these cosines and sines. And they've all got to cancel at the end of the day to get 4 pi. Yes. Like, what the hell? And then divided by 2 pi, you get 2. And that's the Euler number, which is kind of amazing stuff. And now you can do this in general. Just as a caveat, Yau showed that this metric exists. He never actually gave you a metric. So the only currently known metric
85:00 - 85:30 on this thing is, for the zero curvature case, it's just the torus. Anything above that, we don't know. We just know that it exists. And if you did this integral, you're going to get like 2, 5, or whatever the number is. Which is kind of amazing. This is like a completely non-constructive proof. What's interesting is that these automated theorem provers, they seem computational. And it's my understanding that computationalists, so people who use intuitionistic logic, they don't like
85:30 - 86:00 constructive proofs. Sorry, they like constructive proofs. They don't like non-constructive proofs. In other words, existence proofs without showing the specific construction. So it's interesting to me that all of undergraduate math, which has some non-constructive proofs, are included in Lean. So I don't know the relationship between Lean and non-constructive proofs, but that's an aside. Yeah, that's an aside. I probably won't have too much to say about it. Cool. So,
86:00 - 86:30 I don't know why I went on this digression on string theory. But I just want to say, this is a side comment. So this is something since 1736, which is kind of nice. Oh, by the way, that's actually kind of interesting. I'm going to have to check this again. Just down the street from the Institute is the famous department store Fortnum & Mason, which I think was established in
86:30 - 87:00 17-something. It's a great department store. It's not where I usually do my shopping, but it's just a beautiful department store where Mozart and Haydn might have called and did their Christmas shopping. But anyhow, just random thought. So string theory was just one slide, right? I mean, in some sense, I'm not a string theorist. In a sense, I don't go quantize strings.
87:00 - 87:30 The kind of stuff that I'm more interested in is like, I didn't grow up writing conformal field theories and do all that stuff. For me, it's an input, so I can play with a little more problems in geometry. So string theory is this theory of space-time that unifies quantum gravity, blah, blah, blah. And then it works in 10 dimensions, and we've got to get down to four dimensions. So we're missing six dimensions. So that's what I want to say. And this amazing paper in 1985 by
87:30 - 88:00 Candelas, Horowitz, Strominger, and Witten, they were thinking about what are the properties of the six extra dimensions. So what is interesting is that by imposing supersymmetry, and this is why supersymmetry is so interesting to me, by imposing supersymmetry and other anomaly cancellation, not too stringent conditions, they hit on the condition that this six extra dimensions has to
88:00 - 88:30 be Ricci-flat. Ricci-flat, you can understand, because it's vacuum-style solutions. You want the vacuum string solution. And then a condition which you've never seen before, which just happens to be this Kähler condition. They didn't know about this. No physicist until 1985 would know what a Kähler manifold was. And it's complex, and it's complex dimension three. Remember, again, I said complex dimension three means real dimension six, right? That's 10 minus four is six,
88:30 - 89:00 and six needs to be complexified into three. And again, this is just an amazing fact that in 1985, Strominger, who was a physicist, was visiting Yau at the Institute for Advanced Study in Princeton. And so he went to Yau and said, can you tell me what this strange condition, this technical condition I got? And Yau says, well, you know, I just got the Fields Medal for this. I think I may
89:00 - 89:30 know a few things. It's just amazing. Again, it was a complete confluence of ideas that's totally random. And the rest is history. So in fact, these four guys named this Ricci-flat Kähler manifold Calabi-Yau. So it wasn't the mathematicians who did it. This word Calabi-Yau came from physicists. So from string theorists, which now, you know, of course, Calabi-Yau is now one of the central pieces. And so Philip Candelas was my mentor at Oxford when I was a junior fellow there. And he
89:30 - 90:00 tells me this story. He's a very lively guy. He tells me about how this whole story came about, and it's very interesting. So he and these four guys came up with the word Calabi-Yau. So all of a sudden, we now have a name for this boundary case in complex algebraic geometry. This boundary
90:00 - 90:30 case is now known as a Calabi-Yau. So remember, we had names before, right? This was the Fano variety. This was varieties of general type. And this boundary case is now called Calabi-Yau. So what we're seeing with the torus here is a Calabi-Yau 1. Exactly. In fact, the torus is the only Calabi-Yau 1. So it's the only one that's Ricci-flat. I mean, by this classification,
90:30 - 91:00 it's the only one that's topologically possible. So that's kind of interesting, right? And then this is just a comment. I like this title because I think your series is called TOE. This is a TOE on TOE. Love it. I just want to emphasize, this is a nice confluence of ideas with mathematical physics. But string theory really, what it really is, is this brainchild of interpreting problems between, interpreting and interpolating between problems in mathematics and physics. So for
91:00 - 91:30 example, we now, you know, GR should be phrased in differential geometry. The standard model gauge theory should be phrased in terms of algebraic geometry and representation theory of finite groups. And, you know, condensed matter physics of topological insulators should be phrased in terms of algebraic topology. This idea, you know, I think the greatest achievement of the 20th century physics is, to me, and I think something you would appreciate since you like tables, is that here's
91:30 - 92:00 a dictionary of a list of things, and then here's what they are in mathematics. And then, you know, you can talk to mathematicians in this language, and you can talk to physicists in that language, but they're actually the really same, same, same thing. You know, what's a fermion? You know, it's a spin representation of the Lorentz group. You know, I like that because it gives a precise definition of what we are seeing around. Then you have something you can purely play with in this platonic world. And string theory is really just the brainchild of this translation, this tradition
92:00 - 92:30 of what's on the left and what's on the right, and let's see what we can do. And sometimes you make progress on the left, you give insight and stuff on the right, and sometimes you make progress on the right and you give insight on the left. Why is it that you call the standard model algebraic geometry? Because bundles and connections are part of differential geometry, no? Oh yeah, that's true. Well, I think that's, yeah, I mean, they're interlinked. And I think algebraic, maybe it's because of Atiyah and Hitchin. Of course, you know, they are fluent in both. Yeah, they go either
92:30 - 93:00 way. But algebraic in the sense that you can often work with bundles and connections without actually doing the integral in differential geometry. So I think that's the part I want to emphasize. You know, you can understand bundles purely as algebraic objects without ever doing an
93:00 - 93:30 integral. You know, like here, for example. Like this integral is obviously something you would do in differential geometry. But this integral, the fact that it comes to be an integer, was explained through the theory of Chern classes. You know, this integral is a pairing between homology and cohomology, which is a purely algebraic thing. You know, we all try to avoid doing integrals,
93:30 - 94:00 because integrals are horrible. Because it's hard to do. And in this language, it really just becomes polynomial manipulation. And it becomes much simpler. Okay. So, you know, in that sense, I want to put it. Of course, you know, it's a bit of both. So I like doing this diagram, right? And, you know, if you look at the time lag between the mathematical idea and the physical realization of
94:00 - 94:30 that idea, there really is a confluence. Yeah. It's getting closer. I mean, these things going up and down. I mean, I'm just saying in the past, if you take the last 200 years, last 100 years or so of the groundbreaking ideas in physics, there is this. Interesting. Right. It gets shorter and shorter. So, I mean, obviously, Einstein took ideas of Riemann. And, you know, there was a sixty-year gap. Dirac was able to come up with the equation of the electron, essentially
94:30 - 95:00 because of Clifford algebras. Historically, was he motivated by Clifford algebras? Or was it later realized, hey, Dirac, what you're doing is an example of a Clifford algebra? So I believe the story goes, in order to write down the first-time derivative version of the Klein-Gordon equation, which is a second-order, you know, that's the bosonic one, he had to do some… Essentially, he factorized the matrix in a way that seemed very strange to him. And Dirac said,
95:00 - 95:30 this really reminded me of something that I've seen before. And this is one of those moments, right? Today, we can ChatGPT this. But what Dirac did was, he was at St. John's in Cambridge at the time. He said, I have seen this in a textbook before somewhere, you know, this gamma mu gamma nu thing. And then he said, I need to go to the library to check this. So he
95:30 - 96:00 really knew about this. And unfortunately, the St. John's library was closed that evening. So he waited until the morning, until the library was open, to go to Clifford's book. Or a book about Clifford. I can't remember whether it was Clifford's book, or maybe it was one of these books. And then he opened up, and he really knew that this gamma mu gamma nu anti-commutation relation really was through… So he knew about Clifford. Cool. It's kind of
96:00 - 96:30 interesting. Yeah. Just like Einstein knew about Riemann's work on curvature. But whether you say, you know, Dirac was really inspired by Clifford, well, he certainly did a funky factorization. And then he knew how to justify it immediately, by looking at the right source. And then similarly, you know, Yang-Mills theory depended on this Seifert's book on topology. And
96:30 - 97:00 then, you know, by the time you get to Witten and Borcherds, really, there's this… This diagram, for me, is what gets me excited about string theory. Because string theory is a brainchild of this curve, this orange curve. And now it's getting mixed up. I mean, of course, you know, people hear about this great quote that Witten says, you know, string theory is a piece of 21st century mathematics that happens to fall into the 20th century. And I think he means this. You know,
97:00 - 97:30 that he was using supersymmetry to prove, you know, theorems in Morse theory, and vice versa. Richard Borcherds was using vertex algebras, which is a sort of foundational thing in conformal field theory, to prove some properties about the monster group. We're at this stage. And of course, you know, this was turn of the century. And now we're here, and we have to… Where are we now? Are we crisscrossed, or are we parallel? It's hard to say. And in a meta manner, you can even interpret
97:30 - 98:00 this as the pair of pants in string theory with the world sheet. Yeah, cute. Very cute. Why not? But going back to what you were saying, how I got to… Oh, yeah. So just, yeah, this confluence idea, of course, you know, everyone quotes these two books, papers. You know, when Wigner was
98:00 - 98:30 thinking about in 59, why mathematics is so effective in physics. And there's this maybe slightly less known paper, but certainly equally important paper by the great, late Atiyah, and then Dijkgraaf and Hitchin, which is the other way around. Why is physics so effective in giving ideas in mathematics? So this is a beautiful pair of essays. This is like very much in the world of
98:30 - 99:00 a summary of the kind of physics ideas from string theory. It's making such beautiful advances in geometry. So this is a very beautiful pair of one given in the other that needs to be, you know, sort of praised more. And that's why you were mentioning earlier how I got to know, you know, Roger. So, well, it's through these editorials, we try to connect, you know, with my colleague,
99:00 - 99:30 Mo-Lin Ge, who is a former director of the Chern Institute. You know, everybody's connected, right? So it just so happens that, you know, I grew up in the West, but after my trip with my parents, after so many decades, my parents actually retired and went back to Tianjin, where Nankai University is, where Chern founded what's now called the Chern Institute for
99:30 - 100:00 Mathematical Sciences. And that's an institute devoted to the dialogue between mathematics and physics. In fact, one third of Chern's ashes is buried outside of the Math Institute. There's a great, beautiful marble tomb. And one third, not because of any mathematical reason, it's just that he considered three parts of his home. So his hometown in Zhejiang, China, and Berkeley,
100:00 - 100:30 where he did most of his professional career, and then Nankai University, where he retired to for the last 20 years of his life. So a third each. Yes, the number three comes up again. And in fact, I was going to joke, so in Chern-Simons theory in three dimensions, there's this topological theory, the Chern-Simons theory, there's a crucial factor of one third. I always joke, you know, that's why Chern chose one third for his ashes, but that's not right. Complete coincidence. But what is
100:30 - 101:00 actually interesting is that tomb, that beautiful black marble tomb, you know, for somebody as great as Chern, it mentions nothing about, you know, he's achieved this, done the other thing. It's just one page of his notebook. You think about the poor guy who had a chisel, or that he had no idea what he's chiseling, right? The guy was chiseling this thing, and it's the proof of this. And of course,
101:00 - 101:30 you can look this on the internet, just say the grave of S.S. Chern at Nankai University. Well, the whole conversation we've had is just about pattern matching without the intuitive understanding behind it. So this chiseler may have had that. Yes, that's what I do every day. I love it. So that chisel is essentially his proof of why this is equal to this. You know, why this
101:30 - 102:00 intersection product is the same as this integral. So essentially, it's where the Gauss-Bonnet theorem is a corollary of this trick in algebraic geometry, which is his great achievement. But anyhow, back to this coincidence, and it just so happens that my parents, after drifting all these years abroad, they retired back to Tianjin, where the Chern Institute is. So that's why I became an honorary professor at Nankai, because my motivation was purely just so that I could
102:00 - 102:30 spend time to hang out with my parents. But it just so happens that it happens to be there, and I can just pay my homage to Chern, just to see his grave. I mean, it's a great, you know, it's a mind-blowing experience just to see the Chern's grave and to see the derivation of this in his handwriting chiseled in stone. But anyway, so that's how I got involved with C.N. Yang, because he was very deeply involved with Chern. He and Chern are good friends. I can
102:30 - 103:00 imagine that C.N. Yang is 102 today. Yeah, it's remarkable. And that he was still doing, he wrote the preface to this when he was 99. These guys are unstoppable. And, you know, Roger Penrose, he sent his essay to this one when he was, what, 92? Yeah, these guys are... Anyhow, it's kind of,
103:00 - 103:30 you like tables, right? I love tables. So the tables, here's just a speculation of where string theory is going. Here's a list of, you know, the annual conferences, like the series where string theory has been happening. So 1986 was the first string revolution, where since then, every year, there's been a major string conference. I'm going to the first one I'm going to for years
103:30 - 104:00 in two weeks' time. It happens in Abu Dhabi, I get some sun. And then, you know, there's a series of annual ones, the String Pheno, and then String Math came in as late as in 2011. That's kind of interesting. So that's like, you know, 30 years after the first string conference. And the various other ones. What's a really interesting one is in 2017, there's the first String Data. This is when AI entered string theory. And so it's kind of, so I wrote the first paper in 2017
104:00 - 104:30 about AI-assisted stuff, and there were three other groups independently, mining different AI aspects and how to apply to string theory. So the reason I want to mention this was just how, why was, you know, the string community even thinking about these problems in AI? Oh, and also, just to be clear, briefly speaking, I'm not a fan of tables, per se. I'm a fan of dictionaries because they're like Rosetta Stones. So I'm a fan of Rosetta Stones and translating between
104:30 - 105:00 different languages. So you mentioned the siloing earlier. And mathematicians call, even physicists call them dictionaries, but technically they're thesauruses. Like a dictionary, you just have a term and then you define it. The translation. Like Rosetta Stones. Yes. No, absolutely. I guess that's why you like Langlands so much. Yeah. Yeah, for sure. Yeah, no, absolutely. In some way, this whole channel is a project of a Rosetta Stone between the different fields of math and physics and philosophy. Right. Yeah. That's fantastic. Love it. Big fan. Thank you. Okay. So do you want
105:00 - 105:30 to just, I noticed it jumped back to number 13. So it seems like, I thought we were at 39 out of 40. No, no, no. Because I've learned this nonlinear structure. Because you see, like, I've learned this, this is really dangerous. I've learned like the click button in PDF presentations. Like you click it, it jumps to another one. And you can have interludes. So, you know, it's clearly an interlude. And you say you jump back to your main. So my actual main presentation is only
105:30 - 106:00 like, you know, 30 pages. But it's got all these digressions, which is actually very typical of my personality. So I gave you this big interlude about string theory and Calabi-Yau manifolds, right? So now we've already got to the point that Calabi-Yau one-fold, the one-dimensional complex Calabi-Yau. There's only one example. That's just one of these, right? And then it turns
106:00 - 106:30 out that in complex dimension two, there are two of these. There is the four-dimensional torus, which is, and then there's this crazy thing called the K3, which is Ricci-flat and Kähler. So you got one in complex dimension one, two in complex dimension two. You would think in three dimensions, there's three of these things that are topologically distinct. And unfortunately, this is one of the sequences in mathematics that goes as one, two, we have
106:30 - 107:00 absolutely no idea. And we know at least one billion. At least. So it's kind of, it goes one, two, a billion. And so Calabi-Yau, so starting from complex dimension three just goes crazy. It's still a conjecture of Yau that in every dimension, this number is finite. So remember this positive curvature thing, this Fano thing at the very top? It is a theorem that in every dimension,
107:00 - 107:30 Fano varieties are finite in possibility, in topology, that there's only a finite number of these that are distinct topologically. It's also known that the negative curvatures is infinite in every dimension. And when it goes higher, it's like even uncountably infinite. Oh, interesting. But it's this boundary case. Yau conjectures in an ideal world, they're also finite. But we don't know. This is the open conjecture. Now the billion, are any of them constructed? Or is it
107:30 - 108:00 just the existence? Yeah, that's it. Now that's exactly what we're getting. So it's gotten one, two, and three. Three is like, you know, how are you going to list these things, right? And then algebraic geometers never really bother listing all in one mouth. This is just not something they do. So it took on, the physicists took on the challenge. So Philip Candelas and friends,
108:00 - 108:30 and then Harald Skarke and Maximilian Kreuzer started just listing these. And that's why we have these billions. There are actually databases of these. And they're presented in just like matrices like this. I won't bore you with the details of these matrices. You know, these are algebraic varieties. You can define these as, you know, like intersections of polynomials. That's one way to present them. And in Kreuzer and Skarke's database, they put in vertices of toric varieties, height, and bend. But the upshot is that, you know, there's a database of many, many
108:30 - 109:00 gigabytes that really got done by the, certainly by the turn of the century, by year 2000. These guys were running on Pentium machines. I mean, this is an absolute feat. Especially Kreuzer and Skarke. They were able to get 500 million of these things stored on a hard drive using a Pentium machine of these Calabi-Yau manifolds. And they were able to compute topological invariants
109:00 - 109:30 of these. So that's, so I happened to have this database. I could access them. And that was kind of fun. And I've been playing on and off with them for a number of years. So, and, you know, a typical calculation is like, you know, you have something like a configuration of tensors of, you know, here it's even integers. And you have some standard method in algebraic geometry to compute topological invariants. And these topological invariants, again, in this dictionary
109:30 - 110:00 means something. So for example, H1, H21, in some context, is the number of generations of fermions in the low energy world. So that's a complete problem in this computing a topological invariant in algebraic geometry. And there are methods to do it. And in these databases, you know, people took 10, 20 years to compile this database. And you got these things in. And they're not easy. It's very complicated to compute these things. So in 2017, I was playing around with this. And the
110:00 - 110:30 reason is very, why I was playing around with this was very simple, is because my son was born. And I had infinite sleepless nights, that I couldn't do anything, right? I had like, you know, there's the kid. And then, you know, there's the kid, and you know, and he wakes you up at two, you know, put him to, and, you know, and I was bottle feeding him. And while I had a daughter at the time, so that my wife's taking care of the daughter, they're passed out. And then I got this
110:30 - 111:00 kid, I passed him, I put them into bed, and I'm wide awake at this point, it's like 2am. So like, I can't fall asleep anymore. And I can't do real, you know, serious computation anymore, because I'm just too tired. So let's just play around with data, the least I can let the computer help me to do something. And then that's when I learned, well, you know, what's this thing that everybody's talking about? Well, you know, it's machine learning. So that's why I got through this. It's
111:00 - 111:30 a very simple, very simple biological reason why I was trying to learn machine learning. So then I think I was hallucinating at some point, right? I was like, well, if you look at pictures, like, you know, matrices a lot, like, you know, we're talking about, you know, 500 million of these things, right? Yes. Certainly, I wasn't going through all of them. And they're being labeled by topological invariants. How different is it if I just sort of pixelated one of these and labeled them by this? And all of a sudden, this began to look like a problem in hand-digit recognition,
111:30 - 112:00 right? This is like, how different is this or image recognition? So and I just literally started feeding in, I took 500, I mean, 500 million is too much, right? So I took like 5,000 of these, 10,000 of these, and I trained them to look and recognize this, to recognize this number. And I was like, this is going to be like, it's just going to give crap. Obviously, it's going to
112:00 - 112:30 give 0% accuracy. And to my surprise, it was giving extremely good accuracies. So somehow, the neural network that I was training, this is, I was even using standard MNIST, you know, the hand recognition, MNIST things, recognizing this. And it was recognizing it to great accuracy. And now, I mean, people have improved this, like loads of people like, you know, Finotello, there's a group there that did some serious work on just trying this problem. But this idea suddenly didn't seem
112:30 - 113:00 so crazy anymore. The idea seemed completely crazy to me because I was hallucinating at 2am. But what's the upshot of this? The upshot is, somehow the neural network was doing algebraic geometry, like this kind of algebraic geometry, really sequence-chasing, very complicated Bourbaki-style stuff, without knowing anything about algebraic geometry. It somehow was just doing pattern recognition, and somehow it's beating us. Because, you know, if you do this computation seriously,
113:00 - 113:30 it's double exponential complexity. But it's just now, by pattern recognition, it's bypassing all of that. So then I became a fanatic, right? Then I said, well, all of algebraic geometry is image processing. And so far, I have not been shot by the algebraic geometers, because it's actually true. If you really think about it, the point of algebraic geometry, the reason I like algebraic more than differential is because there's a very nice way to represent manifolds in
113:30 - 114:00 this way. Manifolds in algebraic geometry. So in differential geometry, manifolds are defined in terms of Euclidean patches. Then you do transition functions, which are differentiable, C infinity, blah, blah, blah. But in algebraic geometry, they're just vanishing loci of polynomials. And then once you have systems of polynomials, you have a very good representation. So for example, here, I'm just recording the list of polynomials, the degrees of polynomials that are embedded
114:00 - 114:30 in some space. And that really is algebraic geometry. So basically, any algebraic variety, so that's a fancy way of saying this polynomial representation of a manifold, which is called an algebraic variety, this thing is representable in terms of a matrix or a tensor, sometimes even an integer tensor. And then the computation of invariants, topological invariants,
114:30 - 115:00 is the recognition problem of such a tensor. But once you have a tensor, you can always pixelate it and picturize it. At the end of the day, it's doing this because it's just image processing algebraic geometry. Now, do you mean to say every problem in algebraic geometry is an image process? Almost. Is an image processing problem, or just problems involving invariants are image processing, or even broader than that? Well, I think it is really more broad. I think at some level, I think
115:00 - 115:30 in my view, I try to say bottom-up mathematics is language processing, and top-down mathematics is image processing. Interesting. Of course, this is, I mean, take with a caveat, but of course, at some level, there is truth in what I say. Of course, it's an extreme thing to say. But in terms of what mathematical discovery is, is that you're trying to take a pattern in mathematics.
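Editor's sketch (not part of the conversation): the idea described above, that a variety recorded as a list of polynomial degrees is an integer matrix which can be padded and "pixelated", can be illustrated in a few lines of Python. This assumes the usual configuration-matrix convention for complete intersections in products of projective spaces (one row per ambient factor, one column per defining polynomial). The function names `is_calabi_yau`, `complex_dimension` and `pixelate`, the image size and the grey-level scaling are all hypothetical choices for illustration, not the setup used in the 2017 experiments.

```python
from typing import List

Matrix = List[List[int]]


def is_calabi_yau(ambient_dims: List[int], config: Matrix) -> bool:
    """Vanishing first Chern class: the degrees in row r must sum to n_r + 1,
    where n_r is the dimension of the r-th projective space factor."""
    return all(sum(row) == n + 1 for n, row in zip(ambient_dims, config))


def complex_dimension(ambient_dims: List[int], config: Matrix) -> int:
    """Ambient dimension minus the number of defining polynomials (columns)."""
    return sum(ambient_dims) - len(config[0])


def pixelate(config: Matrix, height: int, width: int, max_degree: int) -> List[List[float]]:
    """Zero-pad the integer matrix to a fixed size and rescale entries to [0, 1],
    so that every variety becomes a grey-scale image of the same shape."""
    if len(config) > height or any(len(row) > width for row in config):
        raise ValueError("configuration matrix does not fit in the image")
    image = [[0.0] * width for _ in range(height)]
    for i, row in enumerate(config):
        for j, degree in enumerate(row):
            image[i][j] = degree / max_degree
    return image


if __name__ == "__main__":
    # Quintic threefold: one degree-5 polynomial in P^4.
    # Its Hodge numbers (h11, h21) = (1, 101) would serve as the training label.
    quintic_ambient, quintic = [4], [[5]]
    # Bicubic: one polynomial of bidegree (3, 3) in P^2 x P^2, with (h11, h21) = (2, 83).
    bicubic_ambient, bicubic = [2, 2], [[3], [3]]

    for ambient, config in [(quintic_ambient, quintic), (bicubic_ambient, bicubic)]:
        print(is_calabi_yau(ambient, config), complex_dimension(ambient, config))
        print(pixelate(config, height=3, width=3, max_degree=5))
```

A classifier of the kind used for handwritten digits would then be trained on many such (image, invariant) pairs; that training step is deliberately left out here, since the transcript does not specify the network used.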
115:30 - 116:00 So in algebraic geometry, just a perfect example, you can pixelate everything, and you can just try to see certain images have certain properties. And so you're image processing mathematics, whereas bottom-up, you're building up mathematics as a language. So it's language processing. And of course, all of this will be useless if you can't actually get human-readable mathematics out of it. So this is the first surprise, the fact that it's even doing it at all to a certain degree
116:00 - 116:30 of accuracy. Now we're talking about accuracy, it's been improved to like 99.99 percent accuracy in these databases. But that's the first level, that's the first surprise. The second surprise is that you can actually extract human-understandable mathematics from it. And I think that's the next level surprise. So in the murmuration conjectures, this beautiful work in DeepMind that Geordie Williamson's involved in, in this human-guided intuition, you can actually get human mathematics
116:30 - 117:00 out of it, and that's really quite something. So maybe that's a good point to break for part two, which is an advertisement of, you know, here is like, we've gone through many, many things about what mathematics is, and to, you know, how it got this through doing, you know, this interaction between algebraic geometry and string theory. And then a second part would be how you can actually extrapolate and extract mathematics, actual conjectures, things to prove
117:00 - 117:30 from doing this kind of experimentation, which are summarized in these books. I keep on advertising my books because I get 50 pounds per year of, what do they call it, royalties, you know, so I don't have to sell my liver for my kids. But it's actually kind of fun. It's a complete, I mean, academic publishing is a good joke, right? You get like, I don't know, like 100 pounds a year,
117:30 - 118:00 because you don't actually make money out of it. But maybe that's a good place to break. And then for part two, how we try to formulate what the Birch Test is for AI, which is sort of, you know, the Turing Test Plus. Because the Birch Test is how to get actual meaningful human mathematics out of this kind of playing around with mathematical data. I see two of your sentences that will be these maxims for the future: that machine learning is the 22nd century's math
118:00 - 118:30 that fell into the 21st. So this machine learning assisted mathematics, or that the bottom up is language processing, and then the bottom, the top down is image processing. Yeah. I like those two. Yeah. Anyone who's watching, if you have questions for Yang-Hui for part two, please leave them in the comments. Do you want to give just a brief overview? Oh, yeah, sure. So just so I'm going to talk about what the Birch Test is, and which papers so far have gone, how close they've gone
118:30 - 119:00 to the Birch Test. And then I'm going to talk about some of the more experiments, number theory, and the one that I really enjoyed doing with my collaborators, Lee, Oliver, and Pozdnyakov, which is to actually make something meaningful that's related to the Birch and Swinnerton-Dyer conjecture. Just by just letting machine go crazy and finding a new pattern in elliptic curves, which is fundamentally a new pattern in the prime numbers, which is completely amazing. You
119:00 - 119:30 mentioned Quanta earlier. So this Quanta feature that featured this one, consider this as one of the breakthroughs of 2024. Great. And that word murmuration, which was used repeatedly throughout, it was never defined, but it will be in the part two. Absolutely. I'm looking forward to it. Me too. Me too. Okay. Thank you so much. Thank you. This has been wonderful. I could continue speaking to you for four hours. Both of us have to get going, but that's so much fun. Pleasure.
119:30 - 120:00 Don't go anywhere just yet. Now I have a recap of today's episode brought to you by The Economist. Just as The Economist brings clarity to complex concepts, we're doing the same with our new AI-powered episode recap. Here's a concise summary of the key insights from today's podcast. Alright, let's dive in. We're talking about Curt Jaimungal and his deep dives into all things mind-bending. You know this guy puts in the hours, like weeks prepping to grill guests like Roger Penrose on
120:00 - 120:30 some wild topics. Yeah, it's amazing using his own background to dig in. Really challenging guests with his knowledge of mathematical physics pushes them beyond the usual. Definitely. And today we're focusing on his chat with mathematician Yang-Hui He. They're getting into AI, math, where those two worlds collide. And it's fascinating because it really makes you think differently about how math works, how we do math, and where AI might fit into the picture. You might think a mathematician's
120:30 - 121:00 life is all formulas and proofs, but Yang-Hui He actually started exploring AI-assisted math while dealing with sleepless nights with his newborn son. It's such a cool example of finding inspiration when you least expect it. Tired but inspired, he started messing around with machine learning in those quiet early morning hours. So let's break down this whole AI and math thing. Yang-Hui He talks about three levels of math. Bottom-up, top-down, and meta. Bottom-up is like building with Legos. Very structured, rigorous proofs. That's the foundation. But here's where
121:00 - 121:30 things get really interesting. It has limitations. Right. And those limitations are highlighted by Gödel's incompleteness theorems. Basically, Gödel showed us that even in perfectly logical systems, there will always be true statements that can't be proven within that system. It's mind-blowing. So if even our most rigorous math has these inherent limitations, it makes you think. Could AI discover truths that we as humans bound by our formal systems might miss? Could it explore uncharted territory? That's a really deep thought. And it's really at the core of what makes this conversation
121:30 - 122:00 revolutionary. It's not about AI just helping us with math faster. It's about AI possibly changing how we think about math altogether. So how is this all playing out? We've had computers in math for ages, from early theorem provers to proof assistants like Lean. But where are we now with AI actually doing math? Well, AI is already making some big strides. It's tackling Olympiad-level problems and doing it well. Which makes you ask, can AI really unlock the secrets of math? And that leads us to
122:00 - 122:30 the big philosophical questions. Is AI really understanding these mathematical ideas? Or is it just incredibly good at spotting patterns? It's like that famous Chinese room thought experiment. You could follow rules to manipulate Chinese symbols without truly understanding the language. Yang-Hui He shared a story about Andrew Wiles, the guy who proved Fermat's Last Theorem, trying to challenge GPT-3 with some basic math problems. It highlights how early AI models, while excelling in tasks with clear rules and plenty of examples, struggled with things that
122:30 - 123:00 needed real deep understanding. It seems like AI's strength right now is in pattern recognition. And that ties into what Yang-Hui calls top-down mathematics. It's where intuition and seeing connections between different parts of math are king. Like Gauss. He figured out the prime number theorem way before we had the tools to prove it. It shows how a knack for patterns can lead to big breakthroughs even before we have the rigorous structure. It's like AI is taking that intuitive leap, seeing connections that might have taken us humans years, even decades, to figure out.
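Editor's sketch (not part of the recap): the pattern Gauss is credited with spotting is that the number of primes up to x, written π(x), grows like x / ln x. A short Python check makes the pattern visible in data, in the same spirit of "look at the numbers first, prove later". The function names `prime_count` and `pnt_ratio` are hypothetical, and the sieve is a standard technique chosen here for illustration.

```python
import math


def prime_count(limit: int) -> int:
    """Return pi(limit), the number of primes <= limit, via a sieve of Eratosthenes."""
    if limit < 2:
        return 0
    sieve = bytearray([1]) * (limit + 1)  # sieve[n] == 1 means "n is still a prime candidate"
    sieve[0] = sieve[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if sieve[n]:
            multiples = range(n * n, limit + 1, n)
            sieve[n * n :: n] = bytearray(len(multiples))  # cross out multiples of n
    return sum(sieve)


def pnt_ratio(limit: int) -> float:
    """Ratio of pi(limit) to the prime number theorem estimate limit / ln(limit)."""
    return prime_count(limit) / (limit / math.log(limit))


if __name__ == "__main__":
    for exponent in range(2, 7):
        x = 10 ** exponent
        print(x, prime_count(x), round(pnt_ratio(x), 4))
```

The ratio stays close to 1 and drifts towards it only slowly as x grows, which is why the pattern could be guessed from tables long before it could be proved.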
123:00 - 123:30 And it's all because AI can deal with such massive amounts of data. Which brings us back to Yang-Hui He's sleepless nights. He started thinking about Calabi-Yau manifolds, super-complex mathematical things key to string theory, as image-processing problems. Wait, Calabi-Yau manifolds? Those sound like something straight out of science fiction. They're pretty wild. Think six dimensions all curled up, nearly impossible to picture. They're vital to string theory, which tries to bring all
123:30 - 124:00 the forces of nature together. Now, mathematicians typically use these really abstract algebraic geometry techniques for this. But Yang-Hui? He had a different thought. So instead of equations and formulas, he starts thinking about pixels. Yeah. Like taking a Calabi-Yau manifold, breaking it down into a pixel grid like you do with an image. He's taking abstract geometry and turning it into something a neural network built for image recognition can handle. That is a radical change in how we think about this. It's like he's making something incredibly abstract, tangible,
124:00 - 124:30 translating it for AI. Did it even work? The results blew people away. He fed these pixelated manifolds into a neural network, and it predicted their topological properties really accurately. He basically showed AI could do algebraic geometry in a whole new way. So it's not just speeding up calculations. It's uncovering hidden patterns and connections that might have stayed hidden, like opening a new way of seeing math. And that leads us to the big question. If AI can crack open
124:30 - 125:00 complex math like this, what other secrets could it unlock? We're back. Last time we were talking about AI not just helping us with math, but actually coming up with new mathematical insights, which is where the Birch test comes in. It's like, can AI go from being a supercalculator to actually being a math partner? Exactly. And now we'll look at how researchers like Yang-Hui He are trying to answer that. Remember, the Turing test was about a machine being able to hold a conversation like a human. The Birch test is a whole other level. It's not about imitation. It's
125:00 - 125:30 about creating completely new mathematical ideas. Think about Bryan Birch back in the 60s. He came up with this bold conjecture about elliptic curves, just from looking at patterns in numbers. So this test wants AI to do similar leaps, to go through tons of data, find patterns, and come up with conjectures that push math forward. Exactly. Can AI, like Birch, show us new mathematical landscapes? That's asking a lot. So how are we doing? Are there any signs AI
125:30 - 126:00 might be on the right track? There have been some promising developments. Like in 2021, Davies and his team used AI to explore knot theory. Knots, like tying your shoelaces. What's that got to do with advanced math? It's more complex than you think. Knot theory is about how you can embed a loop in three-dimensional space, and it actually connects to things like topology and even quantum physics. Okay, that's interesting. So how does AI come in? Well, every knot has certain mathematical properties called invariants. It's kind of like its fingerprint. Davies' team used machine
126:00 - 126:30 learning to analyze a massive amount of these invariants. So was the AI just crunching numbers, or was it doing something more? What's amazing is the AI didn't just process the data. It actually found hidden relationships between these invariants, which led to new conjectures that mathematicians hadn't even considered before. Like the AI was pointing the way to new mathematical truths. That's wild. Sounds like AI is becoming a powerful tool to spot patterns our human minds might miss. Absolutely. Another cool example is Lample and Charton's work
126:30 - 127:00 in 2019. They trained AI on a massive data set of math formulas. And what did they find? Well, this AI could accurately predict the next formula in a sequence, even for really complex ones. It was like the AI was learning the grammar of math and could guess what might come next. So we might not have AI writing full-blown proofs yet, but it's getting really good at understanding the structure of math and suggesting new directions. And that brings us back to Yang-Hui He. His work with those Calabi-Yau manifolds, analyzing them as pixelated forms, that was a huge breakthrough. Showed that
127:00 - 127:30 AI could take on algebraic geometry problems in a totally new way. Like bridging abstract math and the world of data and algorithms. Exactly. And that bridge leads to some really mind-bending possibilities. Yang-Hui He and his colleagues started exploring something they call murmuration. Murmuration. Like birds. It's a great analogy. Think of a flock of birds moving together like one. Each bird reacts to the ones around it, and you get these complex, beautiful patterns. Right,
127:30 - 128:00 I get it. But how does it relate to AI and math? Well, Yang-Hui He sees a parallel between how birds navigate together in a murmuration and how AI can guide mathematicians towards new insights by sifting through tons of math data. So the AI is like the flock, exploring math and showing us where things get interesting. Yeah, and they've actually used this murmuration idea to look into a famous problem in number theory, the Birch and Swinnerton-Dyer conjecture. That name sounds a bit intimidating. What's it all about? Imagine a donut shape, but in the world of numbers. These
128:00 - 128:30 are called elliptic curves. Mathematicians are obsessed with finding rational points on these curves. Points where the coordinates can be written as fractions. Okay, I'm following so far. The Birch and Swinnerton-Dyer conjecture basically says there's this deep connection between how many of these rational points there are and a specific math function, like linking the geometry of these curves to number theory. Things are definitely getting complex now. And it's a big deal in math. It's actually one of the Clay Mathematics Institute's Millennium Prize problems. Solve it,
128:30 - 129:00 you win a million bucks. Now that's some serious math street cred. So how did Yang-Hui He's team use AI for this? They trained an AI on this massive data set of elliptic curves and their functions. The AI didn't actually solve the whole conjecture, but it found this new pattern, this correlation that mathematicians hadn't noticed before. So the AI was like a digital explorer, mapping out this math territory and showing mathematicians what to look at more closely. Exactly. This discovery,
129:00 - 129:30 while not a complete proof, gives more support to the conjecture and opens up some exciting new areas for research. It shows how AI can help with even the hardest problems in mathematics. It feels like we're on the edge of something new in math. AI is not just a tool, it's a partner in figuring out the truth. What does all this mean for math in the future? That's a great question, and it's something we'll dig into in the final part of this deep dive. We'll look at the philosophical and ethical stuff around AI in math. We'll ask if AI is really understanding the math it's working
129:30 - 130:00 with, or if it's just manipulating symbols in a really fancy way. See you there. Welcome back to our deep dive. We've been exploring how AI is changing the game in math, from solving tough problems to finding hidden patterns in complex structures. But what does it all mean? What are the implications of all of this? We've touched on this question of understanding. Does AI really understand the math it's dealing with, or is it just a master of pattern matching? Yeah, we can get caught up in the cool stuff AI is doing, but we can't forget about those implications. If AI
130:00 - 130:30 is going to be a real collaborator in mathematics, this whole understanding question is huge. It goes way back to the Chinese room thought experiment. Imagine someone who doesn't speak Chinese has this rulebook for moving Chinese symbols around. They can follow the rules to make grammatically correct sentences, but do they actually get the meaning? So is AI like that, just manipulating symbols in math without grasping the deeper concepts? That's the big question, and there's no easy
130:30 - 131:00 answer. Some people say that because AI gets meaningful results, like we've talked about, it shows some kind of understanding, even if it's different from how we understand things. Others say AI doesn't have that intuitive grasp of math concepts that we humans have. It's a debate that's probably going to keep going as AI gets better and better at math. Makes you wonder how it's going to affect the foundations of mathematics itself. That's a key point. Traditionally, mathematical proof has been all about logic, building arguments step by step using established axioms and theorems. But AI brings something new, inductive reasoning, finding patterns
131:00 - 131:30 and extrapolating from those patterns. So could we see a change in how mathematicians approach proof? Could we move toward a way of doing math that's driven by data? It's possible. Some mathematicians are already using AI as a partner in the proving process. AI can help generate potential theorems or find good strategies for tackling conjectures. But others are more cautious, worried that relying too much on AI could make math less rigorous, more prone to errors. It's like with any new tool.
131:30 - 132:00 There's good and bad. Finding that balance is important. We need to be aware of the limitations and not rely on AI too much. Right. And as AI becomes more important in math, it's crucial to have open and honest conversations. We need to talk about what AI means, not just for math, but for everything we do. It's not just about the tech. It's about how we choose to use it. We need to make sure AI helps humanity and the benefits are shared. That's everyone's responsibility. A responsibility that goes way beyond just mathematicians and computer scientists. We need
132:00 - 132:30 philosophers, ethicists, social scientists, and most importantly, the public. We need all sorts of voices and perspectives to guide us as we go into this uncharted territory. This has been an amazing journey into the world of AI and math. From sleepless nights to those mind-bending manifolds, we've seen how AI is pushing the boundaries of what's possible. And as we wrap up, we encourage you to keep thinking about these things. What does it really mean for a machine to understand math?
132:30 - 133:00 How will AI change the way we prove things and make discoveries in math? How can we make sure we're using AI responsibly and ethically in our search for knowledge? These are tough questions, but they're worth asking. The future of mathematics is being shaped right now, and AI is a major player. Thanks for joining us on this deep dive. We'll catch you next time, ready to explore some other fascinating corner of the universe of knowledge. New update! Started a Substack. Writings on there are currently about language and ill-defined concepts, as well as
133:00 - 133:30 some other mathematical details. Much more being written there. This is content that isn't anywhere else. It's not on Theories of Everything, it's not on Patreon. Also, full transcripts will be placed there at some point in the future. Several people ask me, "Hey Curt, you've spoken to so many people in the fields of theoretical physics, philosophy, and consciousness. What are your thoughts?" While I remain impartial in interviews, this Substack is a way to peer into my present deliberations on these topics. Also, thank you to our partner, The Economist. Plus, it helps
133:30 - 134:00 out Curt directly, aka me. I also found out last year that external links count plenty toward the algorithm, which means that whenever you share on Twitter, say, or on Facebook, or even on Reddit, etc.,
134:00 - 134:30 it shows YouTube, "Hey, people are talking about this content outside of YouTube," which in turn greatly aids the distribution on YouTube. Thirdly, there's a remarkably active Discord and subreddit for Theories of Everything, where people explicate TOEs, they disagree respectfully about theories, and build, as a community, our own TOE. Links to both are in the description. Fourthly, you should know this podcast is on iTunes, it's on Spotify, it's on all of the audio platforms.
134:30 - 135:00 All you have to do is type in Theories of Everything and you'll find it. Personally, I gain from re-watching lectures and podcasts. I also read in the comments that, Hey, TOE listeners also gain from replaying. So how about instead you re-listen on those platforms, like iTunes, Spotify, Google Podcasts, whichever podcast catcher you use. And finally, if you'd like to support more conversations like this, more content like this, then do consider visiting patreon.com
135:00 - 135:30 slash CURTJAIMUNGAL and donating with whatever you like. There's also PayPal, there's also crypto, there's also just joining on YouTube. Again, keep in mind, it's support from the sponsors and you that allow me to work on TOE full-time. You also get early access to ad-free episodes, whether it's audio or video. It's audio in the case of Patreon, video in the case of YouTube. For instance, this episode that you're listening to right now was released a few days earlier. Every dollar helps far more than you think. Either way, your viewership is generosity enough. Thank you so much.