A New Era in Mathematics

Math Will Never Be the Same Again… | Yang-Hui He


    In this engaging discussion with Yang-Hui He, we delve into the transformative potential of AI in the realm of mathematics. Yang-Hui shares his insights on how AI isn’t just a tool for quicker calculations but a revolutionary force that's reshaping how we conceptualize and solve complex mathematical problems. By looking at mathematics through the lens of AI, Yang-Hui suggests new approaches to enduring challenges like the Birch and Swinnerton-Dyer conjecture, while emphasizing how AI can bridge the abstract world of algebraic geometry and practical image processing techniques.


      • AI is changing math by offering new insights, not just faster calculations 🧠.
      • Yang-Hui discovered AI's potential in math during quiet nights with his son 👶.
      • Calabi-Yau manifolds are reimagined as image problems, showing AI's prowess 📸.
      • The Birch Test challenges AI to create new mathematical theories 🌟.
      • AI highlights connections in math akin to leaps made by Gauss 🔍.

      Key Takeaways

      • AI is not just speeding up math but reshaping how we understand it 🧠.
      • Yang-Hui He stumbled upon AI-assisted math during sleepless nights with his newborn 👶.
      • Calabi-Yau manifolds modeled as images showcase AI's potential in math 📸.
      • The Birch Test could be the next step for AI in generating novel mathematical ideas 🌟.
      • AI's role mirrors intuitive insights of great mathematicians like Gauss 🔍.


      In this episode of Theories of Everything, we explore the intersection of artificial intelligence and mathematics with Yang-Hui He, an expert advocating for AI's transformative role beyond just speeding up calculations. Yang-Hui shares his journey into AI-assisted mathematics, which began during late nights with his newborn. This unexpected inspiration led him to explore how AI could revolutionize the way we approach and solve mathematical problems.

        Yang-Hui illustrates how viewing complex algebraic structures, such as Calabi-Yau manifolds, as image recognition tasks has allowed AI to uncover new patterns and insights. This novel approach has even ventured into profound mathematical challenges, like the Birch and Swinnerton-Dyer conjecture, demonstrating AI’s capability to facilitate significant breakthroughs in mathematics.

          Furthermore, the conversation delves into the notion of the Birch Test, a hypothetical scenario envisioning AI's capacity to independently generate meaningful new mathematical concepts and conjectures. This evolution in mathematics sees AI not just as a computational tool but as a partner capable of fostering groundbreaking advancements akin to historical mathematical insights.
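
          As a rough illustration of the image-recognition framing described above, the sketch below treats the integer "configuration matrix" of a complete-intersection Calabi-Yau manifold as a small grayscale image. It is not taken from the episode: the function names, the 12 x 15 grid size, and the maximum entry of 5 are assumptions chosen for illustration, and the sketch stops at building the pixel grid that a classifier would consume.

```python
from typing import List

Matrix = List[List[int]]


def is_calabi_yau(dims: List[int], config: Matrix) -> bool:
    """Check the Calabi-Yau condition row by row.

    dims[r] is the dimension n_r of the r-th projective space factor.
    config[r][j] is the degree of the j-th defining polynomial in that factor.
    The condition is that each row sums to n_r + 1.
    """
    return all(sum(row) == n + 1 for n, row in zip(dims, config))


def complex_dimension(dims: List[int], config: Matrix) -> int:
    """Ambient dimension minus the number of defining polynomials (columns)."""
    return sum(dims) - len(config[0])


def to_image(config: Matrix, height: int = 12, width: int = 15,
             max_entry: int = 5) -> List[List[float]]:
    """Zero-pad the matrix to a fixed grid and scale entries into [0, 1].

    height, width and max_entry are illustrative assumptions, not values
    stated in the source.
    """
    image = [[0.0] * width for _ in range(height)]
    for r, row in enumerate(config):
        for c, value in enumerate(row):
            image[r][c] = value / max_entry
    return image


# The quintic threefold: one degree-5 polynomial in P^4.
quintic_dims, quintic = [4], [[5]]

# A two-factor example: P^3 x P^3 with polynomials of bidegree (3,0), (0,3), (1,1).
two_factor_dims, two_factor = [3, 3], [[3, 0, 1], [0, 3, 1]]

quintic_image = to_image(quintic)
```

          Padding every matrix to the same grid is what lets a standard image classifier take them as input; the manifold's topological invariants would then play the role of the labels to predict.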


            • 00:00 - 02:30: Introduction and Initial Discussion. This chapter introduces Yang-Hui He, the guest on the podcast, as the host expresses admiration for his work and lectures. Yang-Hui reciprocates by acknowledging his admiration for the host, especially highlighting the interviews with prominent figures like Roger Penrose and Edward Frenkel. The atmosphere is one of mutual respect and appreciation as the conversation begins.
            • 02:30 - 10:00: AI and Mathematics. The chapter titled 'AI and Mathematics' delves into the intricate relationship between artificial intelligence, machine learning, and mathematics. It explores three levels of understanding mathematics in this context: bottom-up, top-down, and meta perspectives. Before diving into these concepts, the chapter hints at a discussion about the specific mathematical and physics disciplines that initially captivated the speaker's interest, as well as an exploration of their collaboration with Roger. The speaker highlights their expertise in mathematical physics, particularly focusing on the intersection with algebraic geometry.
            • 10:00 - 20:00: Interplay of Physics and Mathematics. The chapter discusses the speaker's background in string theory and his collaboration with C.N. Yang, a notable figure in physics known for Yang-Mills theory. Yang is remarkable not only for his scientific contributions but also for being one of the world's oldest living Nobel laureates, having received the Nobel Prize in 1957. The discussion highlights the intertwined nature of physics and mathematics, particularly through influential figures like Yang.
            • 20:00 - 22:30: Future of AI in Mathematical Discovery. This chapter delves into the collaborative relationship between the speaker and notable physicist Roger Penrose, initiated through a joint editorial project on a book related to topology in physics. The discussion highlights the importance of collaboration and personal relationships in advancing mathematical and scientific discovery, particularly in the context of future developments in AI. Insights into the editing process for academic works and the potential of AI to transform complex disciplines like topology and physics are explored.

            Math Will Never Be the Same Again… | Yang-Hui He Transcription

            • 00:00 - 00:30 Yang-Hui He, hey, welcome to the podcast. I'm so excited to speak with you. You have an energetic humility and your expertise and your passion comes across whenever I watch any of your lectures, so it's an honor. It's a great pleasure and great honor to be here. In fact, I'm a great admirer of yours. You've interviewed several of my very distinguished colleagues like, you know, Roger Penrose and Edward Frenkel. I actually watched some of them. It's actually really nice. Wonderful, wonderful. Well, that's humbling to hear. So firstly,
            • 00:30 - 01:00 people should know that we're going to talk about  or you're going to give a presentation on AI and   machine learning mathematics and the relationship  between them, as well as the three different   levels of what math is in terms of production  and understanding, bottom-up, top-down, and then   the meta. But prior to that, what specific math  and physics disciplines initially sparked your   interest? And how did the collaboration with Roger  come about? So my, you know, my bread and butter   was mathematical physics, especially, you know,  sort of the interface between algebraic geometry
            • 01:00 - 01:30 and string theory. So that's my background, what I did my PhD on. And so at some point, I was editing a book with C.N. Yang, who is an absolute legend. You know, he's 102, he's still alive, and he's the world's oldest living Nobel laureate. You know, Penrose is a mere 93 or something. So C.N. Yang of the Yang-Mills theory, so he's an absolute legend. He got the Nobel Prize in 1957. So at some
            • 01:30 - 02:00 point, I got involved in editing a book with C.N. called Topology and Physics. And, you know, with a name like that, you can just invite anybody you want, and they'll probably say yes. And that was how my initial friendship with Roger Penrose started, through working together on that editorial. I mean, I have Roger as a colleague in Oxford, and I've known him on and off for a number of years. But that's when we really started getting, working together. So when Roger snickers at
            • 02:00 - 02:30 string theorists, what do you say? What do you do?  How does that make you feel? Oh, that's totally   fine. I mean, I'm not a diehard string theorist.  And, you know, I'm just generally interested in   the interface between mathematics and physics.  And, you know, Roger's totally chill with that.   So you just happened to study the mathematics  that would be interesting to string theorists,   though you're not one. Exactly, and vice versa.  I just completely chanced on this. It was kind
            • 02:30 - 03:00 of interesting, you know, I was recently giving a talk, a public lecture in Dublin, about the interactions between physics and mathematics. And I still find that just, you know, string theory is still very much a field that gives the best cross-disciplinary, you know, kind of feedback. I've been doing that for decades. It's a fun thing. You know, I talked to my friends in pure mathematics, especially in algebraic geometry, 100% of them are convinced that string theory
            • 03:00 - 03:30 is correct. Because for them, it's inconceivable  for a physics theory to give so much interest in   mathematics. Interesting. And that's kind of a,  I think that's a story that hasn't been told so   much, you know, in media. You know, if you talk to  a physicist, they're like, you know, string theory   doesn't predict anything, this and the other  thing. But there's a big chapter of string theory,   you know, to me, more than 50% of the story,  backstory of string theory, is just constantly
            • 03:30 - 04:00 giving new ideas in mathematics. And, you know,  historically, when a physical theory does that,   it's very unlikely for it to be completely wrong.  Yeah, you watch the podcast with Edward Frenkel,   and he takes the opposite view, although  he initially took the former view that,   okay, string theory must be on the correct track  because of the positive externalities. It's like   the opposite of fossil fuels. It doesn't give you  what you want for your field, like physics, but   it gives you what you want for other fields, as a  serendipitous outgrowth. But then he's no longer
            • 04:00 - 04:30 convinced after being at a string conference. So you still feel like the pure mathematicians that you interact with see string theory as on the correct track as a physical theory, not just as a mathematical theory. Yeah, so he, yeah, absolutely. He does make a good point. And so, I think, you know, Frenkel and algebraic geometers like Richard Thomas and various people, they appreciate what string theory is constantly doing in terms of mathematics.
            • 04:30 - 05:00 And the challenge is whether it is a theory  of physics based on the fact that it's giving   so much mathematics. I guess, you know, you've  got to be a mystic. Some of them are mystics.   Some of us are mystics. And I actually, I don't, I  don't personally have an opinion on that. I just,   you know, some days when I'm like, well, you know,  it's, this is such a cool mathematical structure,   and there's so much internal consistency. It's got  to be, there's got to be something there. So it's
            • 05:00 - 05:30 just, but of course, you know, it being a science,  you need the experimental evidence, you know,   you need to go through the scientific process.  And that I have absolutely no idea. It could   take years and decades. Wouldn't you also have to  weight the field, like w-e-i-g-h-t, weight the,   whatever field, like the subdiscipline of string  theory, with how much IQ power has been poured   into it, how much raw talent has been poured into  it, versus others. So you would imagine that if it   was the big daddy field, which it happens to be,  that it should produce more and more insights.
            • 05:30 - 06:00 And it's unclear to me, at least, if that this  much time and effort went into asymptotic safety,   or loop quantum gravity, or what have you,  or causal set theory, if that would produce   mathematical insights of the same level of  quality, we don't have a comparison. I mean, I   don't know. I want to know what your thoughts are  on that. I think the reason for that is just that,   you know, we follow our own nose as a community.  And the contending theories like, you know, loop
            • 06:00 - 06:30 quantum gravity and stuff, you know, there are  people who do it. There are communities of people   who do it. And, you know, there's a reason why the  top mathematicians are doing string-related stuff,   is because, you know, you follow the right  nose. You feel like it is actually giving the   right mathematics. Things like, you know, mirror  symmetry, you know, or vertex algebras, that's   kind of giving the right ideas constantly. And  it's been doing this since the very beginning. So,   and people do, you know, the other alternative,  you know, the alternative theories of everything.
            • 06:30 - 07:00 But so far, it hasn't produced new math. You can  certainly prove us wrong. But I think, you know,   I follow, you know, there's a reason why Witten  is the one who gets the Fields Medal. Because   it's just somehow is at the right interface  of the right ideas in geometry, number theory,   representation theory, algebra, that this idea  tends to produce the right, you know, the right   mathematics. Whether it is a theory of physics,  that's still, you know, that's the next mystical
            • 07:00 - 07:30 level. But, you know, it's kind of, it's an  exciting time, actually. Witten didn't get   the Fields Medal for string theory, though. It was  his work on the Jones polynomial, and Chern-Simons   theory, and Morse theory with supersymmetry,  and topological quantum field theory,   but not specifically string theory. That's right.  That's right. But he certainly is a champion   for string theory. And for him, I mean, you know,  that idea, he was able to do, you know, the Morse
            • 07:30 - 08:00 theory stuff, he was able to get because of his  work on supersymmetry. He was able to realize this   was a supersymmetric index theorem that generated  this idea. And that's really, supersymmetry   really is a cornerstone for string theory, even  though there's no experimental evidence for it.
            • 08:00 - 08:30 So I think that's one of the reasons that's guiding him towards this direction. So what's cool is that just prior, the podcast that I filmed just prior to yours was with Peter Woit, who, as you know, is a critic of string theory, and Joseph Conlon, who is a defender of string theory, and he has a book even called Why String Theory. That's right. I think it was the first time that publicly, someone like Peter Woit, along with a defender of string theory, were just on a podcast of this length, speaking about in a technical manner, what are both of their likes and dislikes of string theory, and then the string
            • 08:30 - 09:00 community. There's three issues, string theory  as a physical theory, string theory as a tool   for mathematical insight, and then three, string  theory as a sociological phenomenon of overhype,   and does it see itself as the only game in town?  Is there arrogance? Should there be arrogance?   It was an interesting conversation. Yeah. Well,  Joe is a good friend of mine, Joe Conlon. Yeah,   right, right. In Oxford. And yeah, no, I value  his comments greatly. I've always been kind of,
            • 09:00 - 09:30 you know, for me, I've always been kind of like  slightly orthogonal to the main string theory   community. I'm just happy because it's constantly  giving me good problems to work on. Yes. Including   what I'm about to talk about in AI. Wonderful. And  I'll mention a little bit about it because I got   into this precisely because I had a huge database  of Calabi-Yau manifolds, and I wouldn't have done   that without the string community. It was, again,  one of those accidents that, you know, no other,   you know, the other theoretical physicist  didn't happen to have this, didn't happen
            • 09:30 - 10:00 to be thinking about this problem. There's  this proliferation of Calabi-Yau manifolds,   and I'll mention that a bit in my lecture later,  and why this is such an interesting problem,   why Calabi-Yau-ness is interesting inherently,  regardless whether you're a string theorist. And   that kind of launched me in this direction of  AI-assisted mathematical discovery. So this is   kind of really nice. And I think, I mean, for me,  the most exciting thing about this whole community
            • 10:00 - 10:30 is that, you know, science, and especially, you  know, theoretical science, well, not especially,   science, including theoretical science, has  become so compartmentalized, right? You know,   everyone is doing their tiny little bit of  thing. And string theory has been breaking   that mold for the last decades. It's constantly,  oh, let's take a piece of algebraic geometry,   let's take a bit of number theory here, elliptic  curves, let's take a bit of quantum information,
            • 10:30 - 11:00 entanglement, whatever, entropy, black holes. And  it's the only field that I know that different   expertise are talking to each other. I mean, this  doesn't happen in any other field that I know   of in sort of mathematics, theoretical physics.  And that just gets me excited. And that's what I   really like thinking about. Well, let's hear more  about what you like thinking about and what you're   enthusiastic about these days. Let's get to the  presentation. Sure. Well, thank you very much for
            • 11:00 - 11:30 having me here. And I'm going to talk about work  I've been thinking about, stuff I've been thinking   about for the last seven years, which is how AI  can help us do mathematical discovery, you know,   in theoretical physics and pure mathematics.  I recently wrote this review for Nature,   which is trying to summarize a lot of these  ideas that I've been thinking about. And there's   an earlier review that I wrote in 2021 about,  you know, how machine learning can help us with
            • 11:30 - 12:00 understanding mathematics. So let me just take  it away and think about, oh, by the way, please   feel free to interrupt me. I know this is one of  those lectures. I always like to make my lectures   interactive. So please, if you have any questions,  just interrupt me anytime. And I'll just pretend   there's a big audience out there and I'll just  make it. So firstly, you're likely going to get to   this, but what's the definition of mathematics?  OK, great. So roughly, I mean, of course, you
            • 12:00 - 12:30 know, how does one, so the first question is, how does one actually do mathematics? Right. And so one can think about, of course, in these reviews, I try to divide it into sort of three directions. Of course, these three directions are interlaced and it's very hard to pull them apart. But roughly, you can think about, you know, bottom-up mathematics, which is, you know, mathematics is a formal logical system, you know, definition and, you know, lemma, proof and, you know, theorem, proof.
            • 12:30 - 13:00 And that's certainly how mathematics is presented  in, you know, papers. And there's another one,   which I like to call top-down mathematics, is  where, you know, where the practitioner looks   from above. That's why I say top-down, from like  a bird's eye view. You see different ideas and   subfields of mathematics. And you try to do this  as a sort of an intuitive creative art. You know,   you've got some experience and then you're trying  to see, oh, well, maybe I can take a little bit of
            • 13:00 - 13:30 piece from here and a piece from there and I'm  trying to create a new idea or maybe a method   of proof or attack or derivation. Yes. So these  are these two. So that's, you know, complementary   directions of research. And the third one,  meta, that's just because it was short of any   other creative words, because there's, you know,  words like meta-science and meta-philosophy or   meta-physics. I'm just thinking about mathematics  as purely as a language, you know, whether the
            • 13:30 - 14:00 person understands what's going on underneath. So meta is of secondary importance. So it's kind of like ChatGPT, if you wish, you know, can you do mathematics purely by symbol processing? So that's what I mean by meta. So I'm going to talk a little bit about, in this talk, about each of the three directions and focusing mostly on the second direction of top-down, which is what I've been thinking about for the last seven years or so. Hmm. Okay. I don't know if you know of this
            • 14:00 - 14:30 experiment called the Chinese room experiment. Yeah. Okay. So in that, the person in the center who doesn't actually understand Chinese, but is just symbol pushing or pattern matching, I don't know if it's exactly pattern, rule following, that would be the better way of saying it. Yeah. They would be an example of bottom-up or meta in this?
            • 15:30 - 16:00 So I would say that's meta, in the sense that the person doesn't even have to be a mathematician. You're just simply taking symbols, large language modeling for math, if you wish. Got it. Of course, you know, there's a bit of
            • 16:00 - 16:30 component of others, you know, that you can see  there's a little bit of component of bottom-up,   because you are taking mathematics as, you know, a  sequence of symbols. But I would mainly call that   meta, if that's okay. I mean, these definitions  are just, you know, things that I'm using. Yes,   yes. But in any case, I would talk mostly about  this bit, which is what I've been thinking mostly   about. One thing, just to set the scene, you know,  20th century, of course, you know, computers have
            • 16:30 - 17:00 been playing an increasingly important role in  mathematical discovery. And of course, you know,   it speeds up computation, all that stuff goes  without saying. But something that's perhaps   not so emphasized and appreciated is the fact that  there are actually fundamental and major results   in mathematics that could no longer have been done  without the help of the computer. And so this,
            • 17:00 - 17:30 you know, there's famous examples. Even back in 1976, this is the famous Appel-Haken-Koch proof of the four-color theorem. You know, that every map, it only takes four, every map in a plane only takes four colors to completely color it with no neighbors. And this is a problem that was posed, I think, probably by Euler, right? And this was finally settled by reducing this whole topology problem to thousands of cases, and then they ran it through a computer and checked it case by case. So, and then other major things like, you know, the Kepler conjecture, which is, you know,
            • 17:30 - 18:00 that stacking balls, identical balls. The best way to stack it is what you see in the supermarket, you know, in this hexagonal thing. And this was a conjecture by Kepler, but to prove that this is actually the best way to do it was settled in 1998, again, by a huge computer check. And the full acceptance by the math community, it was only as late as 2017, when proof copilots actually went through Hales's construction and then made this into a proof. Yes.
            • 18:00 - 18:30 But wasn't there a recent breakthrough in the generalized Kepler conjecture? Absolutely. So this is what Maryna Viazovska got the Fields Medal for. So the Kepler conjecture is in three dimensions, you know, our world. Viazovska showed in dimensions 8, 16, and 24 what the best possible packing are. And she gave a beautiful proof of that fact. And to my knowledge, I don't
            • 18:30 - 19:00 think she actually used the computer. There's  some optimization method. Actually, what I'm   referring to is that there are some researchers  who generalize this for any n, not just 8,   not just 24, who used methods in graph theory of  selecting edges to maximize packing density to   solve a sphere packing problem probabilistically  for any n. Though I don't believe they used   machine learning. Well, thanks for telling me.  I've got to check that. That's interesting. This
            • 19:00 - 19:30 was actually a really interesting one. I mean, that's something that's closer to me, which is the classification of finite simple groups. So simple groups are building blocks of all finite groups. And the proof is, you know, took 200 years. And the final definitive volume was by Gorenstein 2008. And what's really interesting, the lore in the finite group theory community is that nobody's actually read the entire proof. It's just not possible. It takes longer for people to actually read the entire proof than a lifetime. So this is kind of interesting that, you know, we have
            • 19:30 - 20:00 reached the cusp in mathematical research where  the computers are not just becoming computational   tools, but it's increasingly becoming an integral  part of who we are. So this is just set the scene.   So we're very much in this, you know, we're now  in the early stages of the 21st century. And this   is increasingly the case where we have this,  where computers can help us or AI can help us
            • 20:00 - 20:30 in these three different directions. Great. So let me just begin with this bottom up and sort of to summarize this. This is probably the oldest attempt in where computers can help us. So this is where I'm going to define bottom up, which is, I guess it goes back to the modern version of this is this classic paper, the classic book of Russell and Whitehead, the Principia Mathematica,
            • 20:30 - 21:00 which is 1910s, where they try to axiomatize mathematics, you know, from the very beginning, you know, it took like 300 pages for them to prove that one plus one is equal to two, famously. Nobody has read this. So this is one of those impenetrable books. But I mean, this tradition goes back to, you know, Leibniz or to Euclid, even, you know, the idea that mathematics should be axiomatized. Of course, this program took only about 20
            • 21:00 - 21:30 years before it was completely killed in some sense, because of Gödel and Church and Turing's incompleteness theorems that, you know, this very idea of trying to axiomatize mathematics by constructing, you know, layer by layer is proven to be, you know, logically impossible within every order of logic. But I like to quote my very distinguished colleague, Professor Minhyong Kim. He says the practicing mathematician hardly ever worries about Gödel. Because, you know,
            • 21:30 - 22:00 if you have to worry about whether your  axioms are valid to your day to day, you know,   if an algebraic geometer has to worry about this,  then you're sunk, right? You get depressed about   everything you do. Right? So the two parts kind  of cancel out. But the reason I mentioned this   is that because of the fact that these two parts  cancel each other out, these two negatives cancel   each other out, this idea of using computers to  check proofs or to compute proofs really goes back
            • 22:00 - 22:30 to the 1950s. Right? So despite the, you know, what Gödel and Church and Turing have proved is foundational. Even back in 1956, Newell, Simon and Shaw devised this Logic Theory Machine. I have no idea how they did it, because this is really very, very, very primitive computers. And they were actually able to prove some certain theorems of Principia by building this bottom up, you know, take these axioms and use the computer to prove. And this is becoming, you know, an entire field of
            • 22:30 - 23:00 itself with this very distinguished history. And  just to mention that this 1956 is actually a very   interesting year, because it's the same year, 56,  57, that the first neural networks emerged from   the basement of Penn and MIT. And that's really  interesting, right? So people in the 50s were   really thinking about the beginnings of AI, you  know, because neural networks is what we now call,
            • 23:00 - 23:30 you know, goes under the rubric of AI. And at  the same time, they were really thinking about   computers to prove theorems and mathematics.  So it's 56 was a kind of a magical year. And,   you know, this neural network really was a  neural network in the sense that, you know,   they put cadmium sulfide cells in a basement.  It's a wall size of photoreceptors. And they were   using, you know, flashlights to try to stimulate  neurons, literally to try to simulate computation.
            • 23:30 - 24:00 That's quite an impressive thing. And then this  thing really developed, right? And now, you know,   half a century later, we have very advanced and  very, very sophisticated, computer-aided proof,   automated theorem provers. Things like the Coq  system, the Lean system. And they were able to   create, so Coq was what was used in this, the  full verification of the proof of the Four
            • 24:00 - 24:30 Color theorem was through the Coq system. And, you know, then there's the Feit-Thompson theorem, which got Thompson the Fields Medal. Again, they got the proof through this system. And Lean is very good. I do a little bit of Lean, but also Lean, the true champion of Lean is Kevin Buzzard at Imperial, 30 minutes down the road from here, from this spot. And he's been very much a champion for what he calls the Xena project, and using Lean to formulate, to formalize all of
            • 24:30 - 25:00 mathematics. That's the dream. And what Lean has  done now is that it has, Kevin tells me, all of   the undergraduate level mathematics at Imperial,  which is a non-trivial set of mathematics, but   still a very, very tiny bit of actual mathematics.  And they can check it, and everything that we've   been taught so far at undergraduate level is good  and self-consistent, so nobody needs to cry about
            • 25:00 - 25:30 that one. Wonderful. And so that's all good. And  then more recent breakthroughs is the beautiful   work of, you know, so three Fields Medalists here.  So two Fields Medalists, Gowers, Manners, I think   it's his name, and Tao, where they proved this  conjecture, which I don't know the details of,   but they were actually using Lean to prove, to  help prove this. And I think Terry Tao in this   public lecture, which he gave recently in 2024 in  Oxford, he calls this whole idea of AI co-pilot,
            • 25:30 - 26:00 which I very much like this word. I was with  Tao in August in Barcelona, we were at this   conference, and he's very much into this, and  of course, you know, Tao, Terry Tao for us is,   you know, is a godlike figure. And the fact that  he's championing this idea of AI co-pilots for   mathematics is very, very encouraging for all  of us. Yes. And for people who are unfamiliar
            • 26:00 - 26:30 with Terry Tao, but are familiar with Ed Witten,  Terry Tao is considered the Ed Witten of math,   and Ed Witten is considered the Terry Tao of  physics. Yeah, I've never heard that expression.   That's kind of interesting. At Barcelona, when  Terry was being introduced by the organizer, Eva   Miranda, she said, Terry Tao is, this is a very  beautiful sentence, Terry Tao has been described   as the Gauss of mathematics. Yes, or the Mozart.  But I think a more appropriate thing to describe
            • 26:30 - 27:00 him is to describe him as the Leonardo da Vinci of  mathematics, because he has such a broad impact on   all fields of mathematics, and that's a very rare  thing. Yeah, I remember he said something like,   topology is my weakest field, and by weakest  field to him, it means I can only write one or two   graduate textbooks off of the top of my head on  the subject of topology. Exactly, exactly. I guess
            • 27:00 - 27:30 his intuitions are more analytic. He's very much in that world of analytic number theory, functional analysis. He's not very pictorial, surprisingly. Roger Penrose, for instance, has to do everything in terms of pictures. But Terry is a symbolic matcher. He can just look at equations, extremely long, complicated equations, and just see which pieces should go together. That's kind
            • 27:30 - 28:00 of very interesting. Speaking of Eva Miranda,  you and I, we have several lines of connection.   Eva's coming on the podcast in a week or two to  talk about geometric quantization. Awesome. Eva   is super fun, right? She's filled with energy.  Yes. She's a good friend of mine. I think in this   academic world of math and physics, I think we're  at most one degree of separation from anyone else.   It's a very small community, relatively small  community. Back to this thing about, of course,
            • 28:00 - 28:30 one could get overoptimistic. I was told by my friends in DeepMind that Szegedy, who I think is on this AI math team, was extrapolating that computers beat humans at chess in the 90s and beat humans at Go in 2018, so they should beat humans in proving theorems in 2030. I have no idea how he extrapolated these two points. They're only three data points. But DeepMind has a product to sell,
            • 28:30 - 29:00 so it's very good for them to be overoptimistic.  But I wouldn't be surprised that this number…   Well, I'm not sure to beat humans, but it  might give ideas that humans have not thought   about before. So that's possible. Just moving  on. So that's the bottom up. And as I said,   this is very much a blossoming, or not blossoming,  it's very much a long, distinguished field of   automated theorem computing, of theorem provers  and verifications of formalization mathematics,
            • 29:00 - 29:30 which Tao calls the AI copilot. Just to mention a bit with your question a bit earlier about metamathematics. So this is just kind of… I like your analogy. This is like the Chinese room. Can you do mathematics without actually understanding anything? You know, personally, I'm a little biased, because having interacted with so many undergraduate students before I moved to the London Institute so I don't have to teach anymore, teach undergraduates, I've noticed,
            • 29:30 - 30:00 you know, maybe one can say the vast majority of undergraduates are just pattern matching, whether there's any understanding. I think this is one of the reasons why, you know, why ChatGPT does things so well. It's not just because… It's not because, oh, you know, LLMs are great, large language models are great. It's more that most things that humans do are done without comprehension anyway. So that's why it's kind of this pattern matching idea. And this is
            • 30:00 - 30:30 also true for mathematics. What's funny is that my  brother's a professor of math in the University of   Toronto for machine learning, but for finance.  And I recall 10 years ago, he would lament to   me students that came to him who wanted to be  PhD students, and he would say, okay, but Curt,   some of them, they don't have an understanding.  They have a pattern matching understanding. He   didn't want that at the time, but now he's into  machine learning, which is effectively that times   10 to the power of 10. Right, right. No, no, I  completely agree. I mean, this is not… This is
            • 30:30 - 31:00 not to criticize undergraduate studies. You know,  I think in undergraduate students, it's just that,   you know, it's part of being human. We kind of  pattern match, and then we do it the best we   can. And then, of course, if you're Terry Tao, you  know, you actually understand what you're doing,   but you know… Of course. But the vast majority  of us doing most of the stuff is just pattern   matching. So that's why… And this is true even  for mathematics. So here, I just want to mention
            • 31:00 - 31:30 something, which is a fun project that I did with my friends, Vishnu Jejjala and Brent Nelson, back in 2018, before LLMs. So before all this LLM for science thing. And this is a very fun thing, because what we did, we took the arXiv, and we took all the titles on the arXiv. You know, this is the, you know, the preprint server for contemporary research in theoretical sciences. And, you know, we were doing LLM classifiers, Word2Vec, very old fashioned. This is a neural
            • 31:30 - 32:00 network, Word2Vec. And, you know, you can classify this and do their thing. But what's really interesting, this is my favorite bit, we took, to benchmark the arXiv, we took viXra. So viXra is a very interesting repository, because it's arXiv spelled backwards. And it has all kinds of crazy stuff. I'm not saying everything on viXra is crazy, but it certainly has everything that arXiv rejects, because it thinks it's crazy. Things like, you know, three page proof
            • 32:00 - 32:30 of the Riemann hypothesis, or Albert Einstein is wrong, it's filled with that. It's interesting to study the linguistics, even at a title level, you could see that, you know, what they call the distinctions of quantum gravity versus the other things, they have the right words in viXra. But the word order is already quite random, that, you know, in other words, the classification matrix, the confusion matrix for viXra is certainly not as distinct as arXiv, which, you know, so kind of interesting, you know, you get all the right buzzwords. It's like, you know, kind
            • 32:30 - 33:00 of thing, viXra, I think, is a good benchmark, that linguistically is not as sophisticated as, you know, real research articles. But this idea, so this is something much more serious, is this very beautiful work of Tshitoyan et al. in Nature, where they actually took all of materials science, and they did a large language model for that, and they were able to actually generate new reactions in materials science. So this, I think, this paper in 2019, this paper by Tshitoyan, is really the
            • 33:00 - 33:30 beginnings of LLM for scientific discovery. This  is quite early, I mean, this is 2019, right? Yeah,   and it's remarkable how we can even say that  that's quite early. The field is exploding so   quickly. Absolutely. That five years ago  is considered quite some time ago. Yeah,   absolutely. Even five years ago, you know, I was  still very much in a lot of... I've evolved in   thinking a lot about this thing. I would also  like to get to your personal use cases for LLMs,
            • 33:30 - 34:00 ChatGPT, Claude, and what you see as the pros and cons between the different sorts, like Gemini was just released at 2.0, and then there's o1, and there's a variety. So at some point, I would like to get to you personally, how you use LLMs, both as a researcher and then your personal use cases. Okay, now I can mention a little bit. One of the very, very first things when ChatGPT-3 came out in, what, 2018, something, 2019, something like that? ChatGPT, oh, three. You mean GPT-3? GPT-3,
            • 34:00 - 34:30 like the really early baby versions. Yeah,  that was during, just before the pandemic.   Just before pandemic. So that was just like,  so I got into this AI for math through this   Calabi-Yau metaphor, which I'm going to mention a  bit later. And then GPT came out when I was just   thinking about this large language model. So this  is a great conversation. So I was typing problems
            • 34:30 - 35:00 in calculus, freshman calculus, and it was solved fairly well. I mean, it's really quite impressive what it can do. So it's fairly sophisticated because things like, I was typing questions like, take vector fields, blah, blah, blah on a sphere, find me the grad or the curl. I mean, it's like
            • 35:00 - 35:30 first, second year stuff, and you have to do a lot of computation. And it was actually doing this kind of thing correctly, partially because there's just so many example sheets of this type out there on the internet. And so it's kind of learned all of that. So I was getting very excited and I was trying to sell this to everybody at lunch. I was having lunch with my usual team of colleagues in Oxford over this. And of course, lo and behold, who was at lunch was the great Andrew Wiles. So
            • 35:30 - 36:00 I felt like I was being a peddler for GPT, LLM for  mathematics, to perhaps the greatest living legend   in mathematics. And Andrew's super nice, and he's  a lovely guy. And he just instantly asked me,   he says, how about you try something much  simpler? Two problems he tried. The first one was,   tell me the rank of a certain elliptic curve. And  he just typed it down, a certain elliptic curve,
            • 36:00 - 36:30 or rational points, of a very simple elliptic curve, which is his baby. And I typed it, and it got it completely wrong. It very quickly started saying things like, five over seven is an integer. Partially because this is a very hard thing to do. You can't really guess integer points. Unlike in calculus, where there's a routine of what you need to do. And then very quickly, we converge on an even simpler problem. How about find the 17th digit in the decimal expansion of
            • 36:30 - 37:00 22 divided by 29, like whatever. And that is completely random, because you can't train for it, you actually have to do long division. This is primary school level stuff. And yet GPT just simply cannot do it, and it's inconceivable that it could do it, because no language model could possibly do this. But GPT now, o1, is already clever enough. When you ask it a question like this, linguistically,
            • 37:00 - 37:30 it knows to go to Wolfram Alpha. And then it's okay, then it's actually doing the math. So something so basic like this, you just can't train a language model to do. You get one in 10 right, and it's just a randomly distributed thing. Whereas sophisticated things, or seemingly sophisticated things, like solving differential equations, or doing very complicated integrals, it can do, because there's somewhat of a routine, and there are enough samples out there. So anyway,
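The long-division question mentioned here is easy to check mechanically. The following is a minimal sketch, not anything shown in the conversation; the function name `decimal_digit` is a hypothetical helper introduced for illustration. It carries the remainder forward one digit at a time, exactly as in schoolbook long division.

```python
def decimal_digit(numerator: int, denominator: int, position: int) -> int:
    """Return the digit at `position` (1-based) after the decimal point
    of numerator / denominator, using schoolbook long division."""
    remainder = numerator % denominator  # drop the integer part
    digit = 0
    for _ in range(position):
        remainder *= 10                  # "bring down a zero"
        digit, remainder = divmod(remainder, denominator)
    return digit


if __name__ == "__main__":
    # 22 / 29 = 0.75862068965517241...
    print(decimal_digit(22, 29, 17))     # prints 1
```

Because the loop only tracks an integer remainder, it needs no floating point and no training data, which is the contrast being drawn with a pure language model.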
            • 37:30 - 38:00 so that's my use case, two use cases. That's also not terribly different than the way that you and I, or the average person, or people in general, think. So for instance, we're speaking right now in terms of conversation. And then if we ask each other a math question, we move to a math part of our brain. We recognize this is a math question. So there's some modularity in terms of how we think. It's not like we're going to solve long division using Shakespeare. Even if we're in
            • 38:00 - 38:30 a Shakespeare class, and someone just jumps in  and then asks that question, we're like, okay,   that's a different, that's of a different sort  of mechanism. Yeah, that's a good analogy. Yeah,   yeah. When you first encountered ChatGPT,  or something more sophisticated that could   answer even larger mathematical problems, did  you get a sense of awe or terror initially?   So I'll give you an example. There was this  meeting with some other math friends of mine,   and I was showing them ChatGPT when it first came  out. And then one of the friends said, explain,
            • 38:30 - 39:00 can you get it to explain some inequality,  or prove some inequality? And then it did,   and then explained it step by step. Then everyone  just had their hand over their mouth like,   are you serious? Can you do this? And then they're  like, then one said, one friend said, this is like   speaking to God. And then another friend said,  had the thought like, what am I even doing? What's   the point of even working, if this can just do my  job for me? So did you ever get that sense? Like,   yes, we're excited about the future, and it as  an assistant, but did you ever feel any sense
            • 39:00 - 39:30 of dread? I'm by nature a very optimistic person.  So I think it was just awe and excitement. I don't   think I've ever felt that I was threatened, or the  community is being threatened. I could totally be   wrong. But so far, I just say, this is such an  awesome thing, because it'll save me so much   time looking up references and stuff like this.  I was happy. I was just like, wow, this is kind
            • 39:30 - 40:00 of cool. I mean, I guess if I were an educator, I might get a bit of a dread, because there's like, you know, undergraduate degrees, you know, if you do an undergraduate degree, it's just basically one ChatGPT being fed to another. You know, a lot of my colleagues started setting questions in exams with ChatGPT, with fully LaTeXed out equations. I mean, this is becoming the standard thing to do. I guess even if you're an educator, you would probably worry. But I was thinking about
            • 40:00 - 40:30 just long term discovery of, you know, what new knowledge can we generate? So in that sense, this is certainly going to be an incredible help, because it's got all the knowledge in the background. Wonderful. All right, let's move forward. Yeah, sure. So 2022 was a great year. I'm surprised this wasn't like over every single newspaper. I don't know why. At least I was told there was some obscure outlet. I can't remember. Some expert friends in the community told me that
            • 40:30 - 41:00 ChatGPT has passed the Turing test. This is a big deal. But I don't know why it hasn't been. I was hoping to see this on BBC and every major newsletter, but it didn't catch on. But anyhow, I believe that in 2022, ChatGPT has passed the Turing test. And then, you know, we're in the last two years, this is obviously where we can, you know, this is a huge development now for large language models for mathematics. And, you know, every major company, OpenAI, MetaAI, EpochAI,
            • 41:00 - 41:30 you know, everything. And they've been doing tremendous work in trying to get LLM for math. Basically, you know, take the arXiv, which is a great repository for mathematics and theoretical physics, pure mathematical and theoretical physics, and then just learn that and try to generate to see it to how much. And this is very much a work in progress. And of course, you know,
            • 41:30 - 42:00 AlphaGeo, AlphaGeo2, AlphaProof, this is all the DeepMind's success. It's kind of interesting within a year, you know, you've gone from 53% on Olympiad level to 84%, which is, you know, this is scary, right? This is scary in the sense of impressively awesome, that, you know, they could do so quickly. So basically in 2022, an AI is approximately equal to the 12-year-old Terence Tao, in the sense that it could do a silver medal. But of course, this is a very specialized,
            • 42:00 - 42:30 you know, the AlphaGeo2 was really just homing in  on Euclidean geometry problems, which to be fair,   extremely difficult, right? If you don't know  how to add the right line or the right angle,   you have no idea how to attack this problem, but  it's kind of learned how to do this. So it's kind   of nice. So, you know, this is all within, you  know, a couple of years. And there's this very
            • 42:30 - 43:00 nice benchmark called FrontierMath that Epoch AI has put out. I think there was a white paper and they got Gowers and Tao, you know, the usual suspects, just to benchmark. Okay, fine. So it can do 84% on Math Olympiad, which is sort of high school level. What about truly advanced research problems, right? To my knowledge, as of the beginning of this month, it was only doing 2%. So that's okay, fine. So it's not doing that great. But the beginning of this week, you learn
            • 43:00 - 43:30 that OpenAI o3 is doing 25%. So we've gone 20% up. We've got a fifth up within four weeks of what they can do. So this is, wow, that's kind of very interesting. Such a rapid improvement. It's so, this is great. I love this, right? Because it's exciting. It's very rare to be. I remember back in the day when I was a PhD student, doing AdS/CFT related algebraic geometry, because Maldacena had
            • 43:30 - 44:00 just come out with a paper in 97, 98, and that's  just when I began my PhD. I remember that kind   of excitement, the buzz in the string community.  People were saying, there was a paper every couple   of days on the next, that kind of excitement. And  I haven't felt that kind of excitement for a very   long time, just because of this. Wow. And then  this is like that, right? Every week, there's this
            • 44:00 - 44:30 new benchmark, a new breakthrough. So that's why I find this field of AI-assisted mathematics to be really, really exciting. Can you explain, perhaps it's just too small on my screen because I have to look over here, but can you explain the graph to the left with Terence Tao? Oh, gosh. I'm not sure I can, because I'm not sure I can read this graph in detail. I think it's the, it's the year. What is it trying to convey? So it's the ranking of Terence. Oh, no, this is just Terence
            • 44:30 - 45:00 Tao's individual performances over different  years, over different problems. So he's retaking   the test every year? No, no, he's taken it three  times. Ages 10, 11, and 12. And when he was 10,   he got the bronze medal, and then he got the  silver medal, then he got the gold medal within   three years. Okay. And age of 12 or something. But  I can't, I think... What are those bars, though? I
            • 45:00 - 45:30 think the bars, it's a good question. I think,  maybe it's to the different questions, you're   given 60 questions, and what it would take to get  the gold medal, I think, or what it would take to   get the silver medal. I think. How many percents  do you have to correct? Okay, so it wasn't a   foolish question of mine. It's actually... No,  no, no, no, it's a good question. I have no   recollection, or maybe I never even looked at it.  Somebody told me about this graph at some point. I   forgot what it is. Okay, because it looks to me  like Terence Tao is retaking the same test, and
            • 45:30 - 46:00 then this is just showing his score across time,  and he's only getting better. But that can't be   it. Why would he retake a test? He's a professor.  No, I think it goes to 66. It must be like... This   is an open source graph. Oh, I thought you were  going to say, this is an open problem in the   field. What does this graph mean? No, no, no.  It's an open source. This graph is just... You   can take it from the Math Olympiad database. Got  it. Which I shamelessly... See, again, perfect,
            • 46:00 - 46:30 right? I've just done something that I have  absolutely no understanding. I've presented to you   like a language model, and I just copy and pasted,  because it's got a nice, cute picture of Terence   Tao when he was a little... So, finally, I'll go  back to the stuff that I really be thinking about,   which is this sort of top-down mathematics,  right? So, and then this is kind of interesting.   So, the way we do research, you know,  practitioners, is completely opposite to
            • 46:30 - 47:00 the way we write papers. I think that's important  to point that out. We muck about all the time. We   do all kinds... When you look at my board, right,  it's just filled with all kinds of stuff. And most   of it is probably just wrong. And then once we got  a perfectly good story, we write it backwards. And   I think writing math papers backwards, and math  generally defined, math and theoretical physics   papers backwards, well, theoretical physics is  a bit better. At least sometimes you write the
            • 47:00 - 47:30 process. But in pure math papers, everything is written in the style of Bourbaki, this very dry definition proof, which is completely not how it's actually done at all. This is why, you know, Arnold, the great Vladimir Arnold, says, you know, Bourbaki is criminal. He actually used this word, the criminal Bourbakization of mathematics, because it leaves out all human intuition and experience. It just becomes this dry machine-like presentation, which is exactly how things should
            • 47:30 - 48:00 not be done. But Bourbaki is extremely important, because that's exactly the language that's most amenable to computers. So, you know, it's one way or another. But we, you know, human practitioners certainly don't do this kind of stuff, right? We muck about, you know, we have to... And sometimes even rigor is sacrificed, right? If we had to wait for proper analysis in the 19th century to come about, before Newton invented calculus, we wouldn't even know how to
            • 48:00 - 48:30 compute the area of an ellipse. Because we have to  wait and formalize all of that. It'll just go all   backwards. So kind of the historical progression  of mathematics is exactly opposite to the way that   it's presented. I mean, it's fine, but the way  it's presented is better. It's much more amenable   to a proof copilot system like Lean than what we  actually do. Even science in general is like that,
            • 48:30 - 49:00 where we say it's the scientific method, where  you first come up with a hypothesis and then you   test it against the world, gather data and so on.  But the way that scientists, not just in math and   physics, but biologists and chemists and so on,  work, are based on hunches and creative intuitions   and conversations with colleagues and several dead  ends, and then afterward you formalize it into a   paper in terms of step-by-step, but it was highly  nonlinear. You don't even have a recollection most
            • 49:00 - 49:30 of the time of how it came about. That's right.  And I think one of the reasons I got so excited   about all this AI for math is this direction.  Because this hazy idea of intuition or experience,   this is something that a neural network is  actually very, very good at. Wonderful. So   I'm going to give concrete examples later on  about how it guides humans. But just to give
            • 49:30 - 50:00 some classical examples, I've said this joke so  many times. So what's the best neural network   of the 18th century? Well, it's clearly the brain  of Gauss. I mean, that's a perfectly functioning,   perhaps the greatest neural network of all  time. I want to use this as an example. Because   what did Gauss do? Gauss plotted the number of  prime numbers less than a given positive real
            • 50:00 - 50:30 number, just to give a sort of continuity. And he  plotted this, and it's kind of a really, really,   you know, jaggedy curve. And it's a step function  because it jumps whenever you hit a prime. But   Gauss was just able to look at this when he was  16 and say, well, this is clearly x over log x.   How did he even do this experience? I mean, he had  to compute this by hand, and he did, and he got
            • 50:30 - 51:00 some of them wrong even. You know, primes, he had  tables. By his time, the tables of primes were up   in the tens and hundreds of thousands. He has it  go up in the hundred thousand range. And you can   just look at this as x over log of x. But this  is very important because he was able to raise   a conjecture before the method by which this  conjecture is proved, namely complex analysis,   was even conceived of by Cauchy and Riemann.  And that's a very important fact. So he just
            • 51:00 - 51:30 kind of felt that this was x over log x. And you had to wait for 50 years before Hadamard and de la Vallée Poussin could prove this fact, because this technique, which we now take for granted, this technique called complex numbers, complex analysis, hadn't been invented by Cauchy yet. You had to wait for that to happen. So it happens like this in mathematics all the time. Even major things. Of course, you know, now it's called the prime number theorem,
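The comparison Gauss is described as making by eye, the prime-counting step function against x over log x, can be reproduced numerically. This is a minimal sketch under stated assumptions: the helper names `primes_up_to` and `prime_count_table` and the choice of checkpoints are illustrative, and the sieve stands in for the hand-computed prime tables mentioned above.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def prime_count_table(checkpoints: list[int]) -> list[tuple[int, int, float, float]]:
    """For each x, return (x, pi(x), x / ln x, pi(x) / (x / ln x))."""
    primes = primes_up_to(max(checkpoints))
    rows = []
    for x in checkpoints:
        pi_x = sum(1 for p in primes if p <= x)   # the step function pi(x)
        estimate = x / math.log(x)                # the conjectured smooth curve
        rows.append((x, pi_x, estimate, pi_x / estimate))
    return rows


if __name__ == "__main__":
    for x, pi_x, estimate, ratio in prime_count_table([10**3, 10**4, 10**5]):
        print(f"x={x:>6}  pi(x)={pi_x:>5}  x/ln x={estimate:9.1f}  ratio={ratio:.3f}")
```

In the hundred-thousand range that the tables reached, the ratio is still visibly above 1 and only drifts toward it slowly, which gives a sense of how much of the guess was intuition rather than a tight numerical fit.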
            • 51:30 - 52:00 which is a cornerstone of all of mathematics. This is the first major result since Euclid on the distribution of primes. How did Gauss say this was x over log x? I don't know. Because he had a really great neural network. And this happens over and over again. Like, you know, the Birch-Swinnerton-Dyer conjecture, which I'm going to talk about later, which is one of the millennium problems. And it's still open, and it's certainly one of the most important problems in mathematics of all time. And this is Birch and Swinnerton-Dyer in a basement, you know,
            • 52:00 - 52:30 in Cambridge in the 1960s. They just plotted ranks  and conductors of related curves. I'm going to   define those in more detail later. And they would  say, oh, that's kind of interesting. You know, the   rank should be related to the conductor in some  strange way. And that's now the BSD conjecture,   the Birch-Swinnerton-Dyer conjecture. And what  they were doing was computer-aided conjectures.   So here was the eyeballs of Gauss in the 19th  century. But the 20th century really had seriously
            • 52:30 - 53:00 computer-aided conjectures. And of course, the  proof of this is still open in general. There's   been lots of nice progress in this. And, you  know, where we're going to go is very much,   what technique do we need to wait to prove  something like this? Now, is there a reason   that you chose Gauss and not Euler? Like, is  it just because Gauss had this example of data
            • 53:00 - 53:30 points and guessing a form of a function? I'm sure  Euler, who certainly is great, had conjectures.   Maybe... That's an interesting quote. I'll mention  Euler later. But I think there's not an example as   striking as this one. In fact, what's interesting,  as a byproduct of Gauss inventing this, because it   was kind of mucking around with statistics,  right? This is before statistics existed as
            • 53:30 - 54:00 a field as well, right? This is like early 1800s.  And Gauss, I think, and you can check me on this,   Gauss got the idea of statistics and the Gaussian  distribution because he was thinking about this   problem. So it's kind of interesting. So he was  laying foundations to both analytic number theory   and modern statistics in one go. He was doing  regression. So I think he essentially invented
            • 54:00 - 54:30 regression and the curve fitting, which is like  101 of modern society. He was trying to fit a   curve. What was the curve that really fit this? In  the process, he got x over log x. And in addition,   he got this idea of regression. An impressive  guy. What can we say? He's a god to us all. The
            • 54:30 - 55:00 upshot of this is like, I love this. Again, this  is something I found on the internet. And just   to emphasize that this idea of... Speaking  of God. Yes, speaking of God, this idea of   mucking about with data in pure mathematics is  a very ancient thing, right? Once you formulate   something like this in conjecture, you will write  your paper. Imagine writing a paper, you will say,
            • 55:00 - 55:30 conjecture, definition, prime, definition, pi of  x, then conjecture, pi of x, evidence. Rather than   all of the failed stuff about inventing regression  and mucking about, all that stuff just gets not   written at all. That intuitive creative process is  not written down anywhere. So it's great. I'm glad   I'm chatting to you about it, right? Because it's  nice to have an audience with this, right? So if   you look at like... So pattern recognition, what  do we do in terms of pure mathematical data? If I
            • 55:30 - 56:00 gave you a sequence like this, you can immediately  tell me what the next number is, to some   confidence. Zeros is just, you know, this is just  multiple of three or not. This one, I've tried   this with many audiences, and after a few minutes  of struggle, you can get the answer. And then this   turns out to be the prime characteristic function.  So what I've done here is to mark all the odd   integers. And evens, obviously, you're going to  get zero. So it's kind of pointless. You just add
            • 56:00 - 56:30 just a sequence of odd integers. And then it's a  one if it's a prime, it's zero if it's not. So 3,   5, 7, 8, and so on and so forth. No, sorry, 3, 5,  7, 9, 11. And you mark all the odd ones, which are   one. And you can probably, after a while, you can  muck about and you can see where this is going.   The next sequence is much harder. So I'm going  to give away so we won't have to spend a couple
            • 56:30 - 57:00 of hours staring at it. So this one is what's called the shifted Möbius function. What this is, just you take an integer, and you take the parity of the number of prime factors it has up to multiplicity, starting from 2. I think I didn't start from 1 here. And then it's 0 if it's an odd number of prime factors, and it's 1 if it's an even number of prime factors, for all the
            • 57:00 - 57:30 sequence of integers. And I hope now I've gotten this right. So if I think I start with 2, 2 has, so that's all. No, let's see, 2, 3. Yeah, so I did start from 1. I'm going to mark 1 for 1, just to kick off the sequence. And then 2 is a prime number, it has only one prime factor. It's an odd number. 3 has an odd number of prime factors. 4 has 2, because it's 2 squared. So it has an even number of prime
            • 57:30 - 58:00 factors, and so on and so forth. So 5 is prime, it  has one odd number. 6 is 2 times 3, so it has 2,   an even number of prime factors, and so on and so  forth. It looks kind of harmless. What's really   interesting, so this is even number. So if you  stare at this for a while, it's very, very hard to   recognize a pattern. And what's really interesting  is that to know the parity of the next number,
            • 58:00 - 58:30 if you have an algorithm that can tell me the  parity of this in an efficient way, you will   have an equivalent formulation of the Riemann  hypothesis. So that's actually an extremely   hard sequence to predict. So if you can tell me  with some confidence more than 50% what the next   number is, without looking up some table, then  you can probably end up cracking every bank in the   world. Because this is equivalent to the Riemann  hypothesis. So I've just given three, so trivial,
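The three sequences described in this stretch (multiples of three, the prime indicator over odd integers, and the parity of the number of prime factors) can be generated with a few lines. This is an illustrative sketch only; all function names are hypothetical. One labeled observation: the parity rule as stated, counting prime factors with multiplicity, is the one usually associated with the Liouville function λ(n) = (−1)^Ω(n), here mapped to 1 for an even count and 0 for an odd count.

```python
def multiple_of_three_flag(n: int) -> int:
    """1 if n is a multiple of 3, else 0 (the 'trivial' sequence)."""
    return 1 if n % 3 == 0 else 0


def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def prime_flags_on_odds(count: int) -> list[int]:
    """Prime indicator over the odd integers 3, 5, 7, 9, ... (the 'okay-ish' sequence)."""
    return [1 if is_prime(n) else 0 for n in range(3, 3 + 2 * count, 2)]


def prime_factor_count(n: int) -> int:
    """Number of prime factors of n counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)


def factor_parity_flag(n: int) -> int:
    """1 if n has an even number of prime factors, 0 if odd (the 'hard' sequence)."""
    return 1 if prime_factor_count(n) % 2 == 0 else 0


if __name__ == "__main__":
    print([multiple_of_three_flag(n) for n in range(1, 10)])
    print(prime_flags_on_odds(9))
    print([factor_parity_flag(n) for n in range(1, 10)])
```

Generating the third sequence is easy; what is hard, as the discussion says, is predicting its next term efficiently without doing the factorization.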
            • 58:30 - 59:00 kind of okay-ish, really, really, really hard.  Yes. So now you can think about a question, if I   were to feed sequences like this into some neural  network, how would a neural network do? So one way   to do it, so this goes a bend, so we go way back  to the very beginning, to the question of what is   mathematics? And Hardy, in his beautiful apology,  says, what mathematicians do is essentially we
            • 59:00 - 59:30 are pattern recognizers. That's probably the best  definition of what mathematics is, is that it's a   study of patterns, finding regularity in patterns.  And in fact, if there's one thing that AI can do   better than us, it's pattern detection. Because  we evolved in being able to detect patterns in
            • 59:30 - 60:00 three dimensions and no more. So in this sense,  if you have the right representation of data,   you're sure that AI can do better than that. I  mean, it'll generate a lot of stuff, but filtering   out what is better is a very interesting problem  in and of itself. So let's try to do one. I mean,   there are various ways to do this representation.  One way you can do it is to do a problem which   is maybe best fit for an AI system, which is  binary classification of binary vectors. So
            • 60:00 - 60:30 what you do is, you know, sequence prediction  is kind of difficult. So one thing you can do   is just take this infinite sequence and just  take, say, a window of a hundred, a thousand,   a fixed window size, and then label it with  the one immediately outside the window,   and then shift, label, shift, label. So then you  can generate a lot of training data this way. So   for this sequence, I think I've just taken here,  you know, whatever the sequence is, and I just,
            • 60:30 - 61:00 with a fixed window size, and with this label.  So now you have a perfectly supervised, perfectly   defined binary supervised machine learning  problem. Then you pass it to your standard AI,   you know, algorithm that they're, you know,  just, you know, out of the box ones, nothing,   you don't even have to tune your particular  architecture. Just take your favorite one,   and then do cross validation, you know, the  standard stuff, take sample, do the training,
            • 61:00 - 61:30 you know, and then try to validate this on unseen data. So if you do this to the mod 3 problem, to this one, you immediately find that, you know, any neural network, or whatever Bayes classifier, would do it with 100% accuracy, as it should, because it would be really dumb if it didn't, because this is just a linear transformation. So even if you have a single neuron that's just doing a linear transform, that's good enough to do it. The PrimeQ problem, I did some experiment,
            • 61:30 - 62:00 some, oh gosh, like seven years ago, and it got 80% accuracy. And I was like, wow, that's kind of, this was a wow moment, I was like, why is it doing 80? I don't have a good answer to this. Why is it doing 80% accuracy on this? How is it learning? Maybe it's doing some sieve method, which is kind of interesting, somehow. The second number is just a chi-squared, just to double test, what's called MCC, which is the Matthews correlation coefficient. These are just buzzwords in stats.
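Editor's sketch of the experiment described above, under stated assumptions: the function names (`shifted_liouville`, `windowed_dataset`, `mcc`) are hypothetical and not from the talk, the window size is arbitrary, and no neural network is trained here. The code only builds the 0/1 sequence (1 for an even number of prime factors counted with multiplicity, 0 for odd), cuts it into fixed-size windows labelled by the next entry, and scores a trivial majority-vote baseline with accuracy and the Matthews correlation coefficient. Any off-the-shelf classifier could be dropped in where the baseline sits.

```python
from math import sqrt


def shifted_liouville(n_max):
    """Return the 0/1 sequence (1 + lambda(n)) / 2 for n = 1..n_max.

    lambda(n) = (-1)**Omega(n), where Omega(n) counts prime factors with
    multiplicity, so an entry is 1 for an even count and 0 for an odd count.
    """
    # Smallest-prime-factor sieve: spf[m] is the least prime dividing m.
    spf = list(range(n_max + 1))
    for p in range(2, int(n_max ** 0.5) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, n_max + 1, p):
                if spf[m] == m:
                    spf[m] = p
    # Omega(n) = Omega(n / spf[n]) + 1, with Omega(1) = 0.
    omega = [0] * (n_max + 1)
    for n in range(2, n_max + 1):
        omega[n] = omega[n // spf[n]] + 1
    return [1 - (omega[n] % 2) for n in range(1, n_max + 1)]


def windowed_dataset(seq, window):
    """Slide a fixed window along seq; label each window with the next entry."""
    count = len(seq) - window
    X = [seq[i:i + window] for i in range(count)]
    y = [seq[i + window] for i in range(count)]
    return X, y


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels (0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    seq = shifted_liouville(10_000)
    X, y = windowed_dataset(seq, window=100)  # window size is an assumption
    split = int(0.8 * len(X))                 # simple train/validation split
    y_train, y_test = y[:split], y[split:]
    # Placeholder "model": always predict the majority label of the training set.
    majority = 1 if 2 * sum(y_train) >= len(y_train) else 0
    y_pred = [majority] * len(y_test)
    accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
    print("baseline accuracy:", accuracy, "MCC:", mcc(y_test, y_pred))
```

A constant predictor always scores an MCC of 0 by construction, which is why the talk quotes MCC alongside accuracy: it separates genuine learning from guessing the more common label.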
            • 62:00 - 62:30 I never learned stats, but now I'm relearning. I  took Coursera in 2017, so I can relearn all these   buzzwords. Great. It's great, it's really useful.  And then this shifted Liouville lambda function,   it's, sorry, I think I made a, yeah, I mistakenly  called this Möbius mu function. It's not, I mean,   it's related, but it's not. It's the shifted  Liouville lambda function. Got it. Sorry,
            • 62:30 - 63:00 one of my neurons died when I said Möbius mu, but it's Liouville lambda. You were subject to the one-pixel attack. But so this one, I couldn't break 50%, right? 0.5 just means it's a coin toss. It's not doing any better guessing than whatever. And this chi-squared is 0.00, that means it's zero up to statistical error. So which means I couldn't find an AI system which could break, which could do better than random guess. I'm not saying there isn't one, it would be great if there
            • 63:00 - 63:30 were one. And then, yeah, so it's kind of, you  know, it's life. And I couldn't, if I do break it,   you know, I might actually stand a good chance  breaking every bank in the world. All right. But   I don't, I haven't made it worse. Let's  remain close friends. Yeah, that's right,   that's right. So I was very proud of this because  this experiment, I'm going to mention a bit later,
            • 63:30 - 64:00 this Liouville lambda was just a thing I was just trying, like way back when. But apparently, Peter Sarnak, whom I really admire, he's one of the world's greatest number theorists currently, current number theorists. And I got to know him through this murmuration thing that I'm going to talk about later. And I reminded him that I almost became his undergraduate research student. I ended up doing, I was an undergrad at Princeton, where I had two paths I could follow for,
            • 64:00 - 64:30 you know, it kind of defines your undergraduate  thesis, right? So one was in mathematical physics,   that's with Alexander Polyakov. And the other  one was with, you know, two problems. And the   other one was actually offered by Peter Sarnak  on arithmetic problems. And I somehow just,   because I wanted to understand the nature of space  and time, I went through the Alexander Polyakov
            • 64:30 - 65:00 path to do mathematical physics, which led me to do string theory. After 20, 30 years, I came full circle back to be in Peter Sarnak's world again. I met him at this conference, I reminded him of this, and he was very happy. But what's really interesting is that he was asking DeepMind the same question a few years ago about the Liouville lambda, whether DeepMind could do better than 50%. So I was glad
            • 65:00 - 65:30 that I thought along similar lines as a great expert in number theory. And somebody who could potentially have been my supervisor, and then I would have gone into number theory instead of string theory, which is whatever, it's just how life happens. So perhaps you're going to get to this later on in the talk, but I noticed here you have the word classifier. And the recent buzz since 2020 or so has been with architecture, the transformer architecture specifically. So is there anything regarding mathematics, not just LLMs, that has to do with transformer
            • 65:30 - 66:00 architecture that's going to come up in your talk?  Not specifically. I'm actually, it's interesting,   one of my colleagues here at the London  Institute, he's Mikhail Burtsev. He's an AI,   he's our institute's AI fellow, and he's an expert  on transformer architecture. So I've been talking   to him and we're trying to get, to devise a nice  transformer architecture to address problems
            • 66:00 - 66:30 in finite group theory. It's in the works. But nothing so far, even with the murmuration stuff, it's very basic neural networks, we didn't use anything more sophisticated than that. So to be determined whether it will outperform the standard ones will be kind of interesting. Got it. Yeah, so actually now we go way back to the beginning of our conversation, is how I got into this stuff. And that, I don't know, completely coincidentally was through string theory. So at this point,
            • 66:30 - 67:00 maybe I'll just give a bit of a background of how  all this stuff came about, at least personally.   Why was I even thinking about this? Because I knew  nothing about AI seven, eight years ago. Zero,   like literally zero. I knew nothing more than  to read it from the news. And this is actually   a very interesting story, which shows again, the  kind of ideas that the string theory community
            • 67:00 - 67:30 is capable of generating, just because you got all these experts looking at kind of interesting problems. So let's go way back. And again, you know, I've quoted Gauss, right? I've got to, I have to say something about Euler. So this is a problem. Again, you can see I'm very influenced by three, the number three. You know, I'm a total numerologist, right? Trinity, name the three, three is something, right? And then there is what's called the trichotomy classification theorem by
            • 67:30 - 68:00 Euler. This dates to 1736. So if you look at, so  I'm going to say the buzzword, which is connected   compact orientable surfaces. So these are, you  know, I mean, the words explain themselves,   you know, they have no boundaries and they're,  you know, topologically, you know, whatever   the topological surfaces. So Euler was able to  realize that a single integer characterizes all
            • 68:00 - 68:30 such surfaces. So this is the standard thing  that people see in topology, right? So the   surface of a ball is the surface of a ball, and  you can deform it, you know, the surface of a   football is the same as an American football,  it can deform without cutting or tearing. And   then the surface of a donut is the same as,  you know, your cup, right? Because, you know,
            • 68:30 - 69:00 it's everything that everyone, the standard thing, you know, this is, it has one handle. And so the surface of a donut is exactly, what they call, topologically homeomorphic to the cup. And then you got the, you know, the pretzel. So I think that's a pretzel. Or maybe, I think this is like the German pretzel, and it gets more and more complicated. But Euler is, because, you know, Euler invented the field of topology. So he realized this idea of topological equivalence,
            • 69:00 - 69:30 in the sense that there's a single topological invariant, which we now call the Euler number, which characterizes these things. Another way to, an equivalent way to say it is, the genus of these surfaces is, you know, no handles, one handle, two handles, three handles, and so on and so forth. It turns out that the Euler number, which we now call the Euler number, is 2 minus twice the genus. So, 2 minus 2g. Okay, that's great. So this is, that's the classic Euler's theorem. And then,
            • 69:30 - 70:00 you know, comes in Gauss, right? Once you've got these three names, the Euler, Gauss, and Riemann, you know, this is, it's got to be some serious theorem, right? So Euler did this in topology. And then Gauss did this incredible work, which he himself calls the Theorema Egregium, the great theorem, which he considers his personal favorite. And this is Gauss,
            • 70:00 - 70:30 right? And Gauss said, you can relate this number  to, which is, this number is purely topological.   You can relate this number to metric geometry.  So he came up with this concept, which we now   call Gaussian curvature, which is some complicated  stuff. You can characterize this curvature, which   you can define on this. Well, this is even before  the word manifold existed on the surface. And then
            • 70:30 - 71:00 you can integrate using calculus, and the integral of this Gaussian curvature divided by 2 pi is exactly equal to this topological number. And that's incredible, right? The fact that you can do an integral, it comes out to be an integer. And that integer is exactly a topological number. So this idea, this Gauss related geometry to topology in this one sweep. And then what's even the next level,
            • 71:00 - 71:30 comes Riemann. Riemann says, well, what you can  do is to complexify. So these are no longer, you   know, real connected compact orientable surfaces.  But you can think about these as complex objects.   So what do we mean by that is, well, if you think  about, you know, the real Cartesian plane, that's
            • 71:30 - 72:00 a two dimensional object. But you can equally think of that as a one complex dimensional object, namely the complex plane. Or the complex line. Yeah, the complex line, exactly. So what we call R2, Riemann would call C. And then Riemann realized that you can put similar structure on all of these things as well. So all of a sudden, these things are no longer two dimensional real orientable surfaces. But one complex dimensional, what's now called curves. I mean, it's a terrible name. So a
            • 72:00 - 72:30 complex curve is actually a two real dimensional  surface. And it turns out that all complex curves   are orientable. So you already rule out things  like, you know, Klein bottles and stuff like that,   or Möbius strips. So the complex structure  requires orientability. And that's partly because   of Cauchy-Riemann relations, you know, it puts a  direction, you can't get away. But the idea is,
            • 72:30 - 73:00 the interesting thing is, all of these now should  be thought of as one complex dimensional curves.   They're called curves because they're one complex  dimension, but they're not curves, right? They're   surfaces in the real sense. Yes. So now here  comes, so if you apply this to the Gauss thing,   you get this amazing trichotomy theorem. And the  theorem says, if you do this to the curvature,   you can see this, I mean, the number here  is two, right? You get the Euler number two,
            • 73:00 - 73:30 which is a positive curvature thing, right?  And that's consistent with the fact that the   sphere is a positively curved object. Locally,  everywhere, it has positive curvature. If you do   it to a torus, or the surface of a donut, which  is just called, you know, the algebraic donut,   you integrate that, you get zero curvature.  And this is not a surprise, because, you know,   you have a sheet of paper, you fold it once,  you get a cylinder, and you fold it again,   you glue it again, you get this torus, this  donut. And this sheet of paper is inherently
            • 73:30 - 74:00 flat. Yes. So if you just take a piece of paper,  you take this piece of paper, and you roll it up,   you get a cylinder. And then you do it again,  you get the surface of a donut, like a rubber   tire. And that is inherently zero curvature. And  then you can do this, and this is a consequence
            • 74:00 - 74:30 of what's known as Riemann's uniformization theorem. If you do anything that has more than one handle, you get zero curvature. So now you have the trichotomy, right? You have positive curvature, zero curvature, and negative curvature. The one in the middle is really, obviously, it's interesting, it's the boundary case. In complex algebraic geometry, these things are called Fano varieties. Earlier you said, if you have anything that's more than one handle, you have zero curvature, you meant negative curvature. Sorry, sorry, I meant negative curvature. Negative,
            • 74:30 - 75:00 yeah. So these fidget spinners on the right, they all have negative curvature. Everything here has negative curvature. Yeah. So now in the world of complex algebraic geometry, these positive curvature things are called Fano varieties, after this Italian guy, Fano. These negative curvature objects which proliferate are called varieties of general type. And this boundary case are called zero curvature objects. And it just so happens, we now call things in the middle,
            • 75:00 - 75:30 Calabi-Yau. These zero curvature objects. Yes. So far, this has got nothing to do with physics. I mean, it's just a fact of topology. But this is such a beautiful diagram that took from 1736 until Riemann. Riemann died in the 1860s, I think, or something like that. So it took 100, 120 years to really formulate just this table to relate metric geometry to topology, to algebraic
            • 75:30 - 76:00 geometry. It's kind of a beautiful thing, right? So to generalize this table is the central piece of what's now called the minimal model program in algebraic geometry, for which there have been all these Fields Medalists, you know, Birkar a couple of years ago, and then it started with Mori who got the Fields Medal, and then this whole Mukai and this whole distinguished idea. So basically, this minimal model program should just generalize this to higher dimension. This is dimension,
            • 76:00 - 76:30 complex dimension one, right? How do you do it?  It's very hard. And once you have it, I won't   bore you with the details. This is very nice.  You know, there's topology, algebraic geometry,   differential geometry, index theorem, they all  get unified in this very beautiful way. And   you want to, obviously, you want to generalize  this to arbitrary dimension, arbitrary complex   dimension. It'd be nice. It's still an open  problem. How do you do it in general? It's   a very nice problem. But at least for a class  of complex manifold known as Kähler manifolds,
            • 76:30 - 77:00 I won't bore you with the details, but Kähler manifolds are ones where the metric has very nice behavior: there's a potential whose double derivative gives you the metric. And then it was conjectured by Calabi in the 50s. Again, you know, 54, 56, 57, it was a great year, right? All these different ideas, I mean, in three completely different worlds, now come together because mathematical physicists have kind of tied it up, you know, the world of neural networks,
            • 77:00 - 77:30 the world of Calabi conjecture, the world of  string theory to one. I like, you know, when   things get bridged up in this way, you know, but  again, the theorem itself is extremely technical.   But the idea is for this Kähler manifolds, there  is an analog of this diagram, basically. I love   this slide. I saved this slide for my own private  notes. I keep a collection of dictionaries in   physics and math. Yeah. I think this is beautiful.  Yeah, me too. I mean, but it took me years to do
            • 77:30 - 78:00 this table because, you know, it's not written down anywhere. And it touches different things. I think it's not written down anywhere precisely because math textbooks are written in the Bourbaki style. But now it just becomes clear what people have been thinking about for the past 100 years, you know, after Grothendieck. It's just trying to relate these ideas. You know,
            • 78:00 - 78:30 this is intersection theory of characteristic classes. Ah, so this is topology. And, you know, this is, I mean, this is over 200 years of work of, you know, the central part of analytics. And mathematicians like Chern, Ricci, Euler, Betti. Yeah, everything. Everybody who was ever involved in this diagram is an absolute legend. In fact, there is one more column to this diagram. I think for sure, I think when I did, I mean, this was a slide from some time ago. But when I was
            • 78:30 - 79:00 talking to a string audience, there is one more, one more, which is relations to L-functions. And that's where number theory comes in. So there is one more column. And to understand this world through this one more column, its relation to L-functions, that's the Langlands program. So it's actually really magical that this table actually extends more as far as, I mean, that's just as far as we know now, right? The L-functions and their relations to modularity. And this is, of course,
            • 79:00 - 79:30 obviously, to me, like mathematics is about  extending this table as much as possible to let it   go into different fields of mathematics. So, but  at least for sure, we know there is one because   of the Langlands correspondence, there is one more  column and that column should be on number theory   and modularity. And soon there'll be another table  on the Yang invariant, the He invariant. No, I   don't think, I don't think I have enough talent to  create something that, but it could well be there
            • 79:30 - 80:00 should be something, something new to do. To me, that's really the most fun part about mathematics. It's not, not so, I mean, they're like, you know, I, who was it? I think maybe it's Arnold as well, who said there are two types of mathematicians. There are the hedgehogs and there are the birds, right? Hedgehogs really like, you know, like, like specialize, specialize. I mean, you absolutely need it. I think, you know, who is a great hedgehog? I think Zhang, the guy who,
            • 80:00 - 80:30 you know, made this first major breakthrough in the prime gap. I mean, he's been spending his entire life just trying to think about, can I bound, can I bound the, you know, the, how many, you know, in the, in the, what is the, what's the lim sup of the, of the distance between, between prime pairs. And the technique he uses is, it's beautifully argued, analytic number theory
            • 80:30 - 81:00 technique, sieve methods, you know, kind of, you  know, the, the, the Ben Green world of this, of,   of, of sieves and James Maynard. And then there  are the, the, the, the, the birds who are like,   you know, I'm just going to just fly  around. I may bump into trees and whatnot,   but I'm just trying to see whether they can  do. And, and people like Robert Langlands and,   you know, they're very much in that world. Can I  see from a distance? I mean, I may get very coarse   grain view. And which are you? I'm 100% in the  bird category. I mean, what I, I, I like to go,
            • 81:00 - 81:30 you know, once I, once I see something and I,  I, I, of course, sooner or later, you need to   dig like a, like a hedgehog, but the most thrill  that I get is when I say, oh, wow, this is gets,   gets connected. So the results are proven when you  dig, but the connections are seen when you get the   overview. Yeah. Yeah, absolutely. So, I mean, of  course, again, this is a division that's kind of   artificial in all of us. We, we, we do a bit of  both. Yes. The guy who really does it well is,
            • 81:30 - 82:00 I forgot to mention, of course, it's, it's, it's like, it's, it's become like a grand, well, he, he passed away, John McKay, who was a Canadian, probably the, the, the great, the greatest, greatest Canadian mathematician since Coxeter. John McKay really saw unbelievable connections in fields that nobody will ever see. And he passed away, he became sort of, in the last 10 years of
            • 82:00 - 82:30 life, he became sort of like a, like a grandfather to, to me. He, you know, he saw my kids grow up, you know, over Zoom. I know, so the, the London Math Society asked me to write an obituary. I was very touched by this, and so I wrote his obituary for it, and I was just trying to say, well, this guy is the ultimate pattern, you know, linker. So, so John McKay, absolute legend. Great. Moving on. I mean, this is a, this is very much, this is very much a huge digression from what
            • 82:30 - 83:00 I'm actually going to tell you about, which is,  you know, the Birch test for AI. And that's...   Great. Do you have a limit on what these videos  are? No, just so you know, some of them are one   hour, some of them are four hours. And people  listen for all of it. Yeah, this is great fun.   Great. Yeah, same. I'm loving this. Yeah, me too.  Because normally, you know, I have one hour, you   know, what, 55 minute cutoff. I could give a  talk, right, and then five minutes questions.
            • 83:00 - 83:30 And they're like, oh my God, I haven't said most  of the stuff I wanted to say. Yeah, yeah, exactly.   Because the point of this channel is to give  whoever I'm speaking to enough time to get through   all of their points, rather than they're rushing  and not covering something in depth. I want them   to be technical and rigorous. So please continue.  Sure. Sounds good to me. So in that magical year   of 1957 of neural networks, the magical year of  the automated theorem prover world, and the world
            • 83:30 - 84:00 of algebraic geometry, in three completely different worlds, they didn't even know each other's names, let alone the results. Calabi conjectured that at least for Kähler manifolds, this diagram is very much well-defined, this table. And Yau proved it 20 years later. So Shing-Tung Yau, who is, again, very much like a mentor to me. And he gets the Fields Medal immediately. So
            • 84:00 - 84:30 you can see why this is so important. He gets the Fields Medal because this idea of following through Calabi is trying to generalize this sequence of ideas of Euler, Riemann, and Euler, Gauss, and Riemann. So it's certainly very important. So there it is. We can park this idea. So Yau showed that there are these Kähler manifolds that have this property, that have the right metrical properties. So by metric, I mean distance, something you can integrate over. Because here,
            • 84:30 - 85:00 you never think that this integral is messy,  right? Even if we do this on a sphere, this   R has all these cosines and sines. And they've  all got to cancel at the end of the day to get   4 pi. Yes. Like, what the hell? And then divided  by 2 pi, you get 2. And that's the Euler number,   which is kind of amazing stuff. And now you can  do this in general. Just as a caveat, Yau showed   that this metric exists. He never actually gave  you a metric. So the only currently known metric
            • 85:00 - 85:30 on this thing is, for the zero curvature case, it's just the torus. Anything above that, we don't know. We just know that it exists. And if you did this integral, you're going to get like 2, 5, or whatever the number is. Which is kind of amazing. This is like a completely non-constructive proof. What's interesting is that these automated theorem provers, they seem computational. And it's my understanding that computationalists, so people who use intuitionistic logic, they don't like
            • 85:30 - 86:00 constructive proofs. Sorry, they like constructive  proofs. They don't like non-constructive proofs.   In other words, existence proofs without showing  the specific construction. So it's interesting   to me that all of undergraduate math, which has  some non-constructive proofs, are included in   Lean. So I don't know the relationship between  Lean and non-constructive proofs, but that's   an aside. Yeah, that's an aside. I probably  won't have too much to say about it. Cool. So,
            • 86:00 - 86:30 I don't know why I went on this digression on string theory. But I just want to say, this is a side comment. So this is something since 1736, which is kind of nice. Oh, by the way, that's actually kind of interesting. I'm going to have to check this again. Just down the street from the Institute is the famous department store Fortnum & Mason, which I think was established in
            • 86:30 - 87:00 17-something. It's a great department store.  It's not where I usually do my shopping,   but it's just a beautiful department store where  Mozart and Haydn might have called and did their   Christmas shopping. But anyhow, just random  thought. So string theory was just one slide,   right? I mean, in some sense, I'm not a string  theorist. In a sense, I don't go quantize strings.
            • 87:00 - 87:30 The kind of stuff that I'm more interested in is  like, I didn't grow up writing conformal field   theories and do all that stuff. For me, it's an  input, so I can play with a little more problems   in geometry. So string theory is this theory of  space-time that unifies quantum gravity, blah,   blah, blah. And then it works in 10 dimensions,  and we've got to get down to four dimensions. So   we're missing six dimensions. So that's what I  want to say. And this amazing paper in 1985 by
            • 87:30 - 88:00 Candelas, Horowitz, Strominger, and Witten, they  were thinking about what are the properties of the   six extra dimensions. So what is interesting is  that by imposing supersymmetry, and this is why   supersymmetry is so interesting to me, by imposing  supersymmetry and other anomaly cancellation,   not too stringent conditions, they hit on the  condition that this six extra dimensions has to
            • 88:00 - 88:30 be Ricci-flat. Ricci-flat, you can understand, because it's vacuum-style solutions. You want the vacuum string solution. And then a condition which you've never seen before, which just happens to be this Kähler condition. They didn't know about this. No physicist until 1985 would know what a Kähler manifold was. And it's complex, and it's complex dimension three. Remember, again, I said complex dimension three means real dimension six, right? That's 10 minus four is six,
            • 88:30 - 89:00 and six needs to be complexified into three. And again, this is just an amazing fact that in 1985, Strominger, who was a physicist, was visiting Yau at the Institute for Advanced Study in Princeton. And so he went to Yau and said, can you tell me what this strange condition is, this technical condition I got? And Yau says, well, you know, I just got the Fields Medal for this. I think I may
            • 89:00 - 89:30 know a few things. It's just amazing. Again, it was a complete confluence of ideas that's totally random. And the rest is history. So in fact, these four guys named this Ricci-flat Kähler manifold Calabi-Yau. So it wasn't the mathematicians who did it. This word Calabi-Yau came from physicists. So from string theorists, which now, you know, of course, Calabi-Yau is now one of the central pieces. And so Philip Candelas was my mentor at Oxford when I was a junior fellow there. And he
            • 89:30 - 90:00 tells me this story. He's a very lively guy. He tells me about how this whole story came about, and it's very interesting. So he and these four guys came up with the word Calabi-Yau. So all of a sudden, we now have a name for this boundary case in complex algebraic geometry. This bounding
            • 90:00 - 90:30 case is now known as a Calabi-Yau. So remember, we had names before, right? This was the Fano variety. This was varieties of general type. And this bounding case is now called Calabi-Yau. So what we're seeing with the torus here is a Calabi-Yau 1. Exactly. In fact, the torus is the only Calabi-Yau 1. So it's the only one that's Ricci-flat. I mean, by this classification,
            • 90:30 - 91:00 it's the only one that's topologically possible.  So that's kind of interesting, right? And then   this is just a comment. I like this title because  I think your series is called TOE. This is a TOE   on TOE. Love it. I just want to emphasize, this  is a nice confluence of ideas with mathematical   physics. But string theory really, what it really  is, is this brainchild of interpreting problems   between, interpreting and interpolating between  problems in mathematics and physics. So for
            • 91:00 - 91:30 example, we now, you know, GR should be phrased  in differential geometry. The standard model gauge   theory should be phrased in terms of algebraic  geometry and representation theory of finite   groups. And, you know, condensed matter physics of  topological insulators should be phrased in terms   of algebraic topology. This idea, you know, I  think the greatest achievement of the 20th century   physics is, to me, and I think something you would  appreciate since you like tables, is that here's
            • 91:30 - 92:00 a dictionary of a list of things, and then here's  what they are in mathematics. And then, you know,   you can talk to mathematicians in this language,  and you can talk to physicists in that language,   but they're actually the really same, same, same  thing. You know, what's a fermion? You know,   it's a spin representation of the Lorentz group.  You know, I like that because it gives a precise   definition of what we are seeing around. Then you  have something you can purely play with in this   platonic world. And string theory is really just  the brainchild of this translation, this tradition
            • 92:00 - 92:30 of what's on the left and what's on the right, and  let's see what we can do. And sometimes you make   progress on the left, you give insight and stuff  on the right, and sometimes you make progress on   the right and you give insight on the left. Why  is it that you call the standard model algebraic   geometry? Because bundles and connections are  part of differential geometry, no? Oh yeah,   that's true. Well, I think that's, yeah, I mean,  they're interlinked. And I think algebraic, maybe   it's because of Atiyah and Hitchin. Of course, you  know, they are fluid in both. Yeah, they go either
            • 92:30 - 93:00 way. But algebraic in the sense that you can often  work with bundles and connections without actually   doing the integral in differential geometry.  So I think that's the part I want to emphasize.   You know, you can understand bundles purely  as algebraic objects without ever doing an
            • 93:00 - 93:30 integral. You know, like here, for example. Like this integral is obviously something you would do in differential geometry. But this integral, the fact that it comes to be an integer, was explained through the theory of Chern classes. You know, this integral is a pairing between homology and cohomology, which is a purely algebraic thing. You know, we all try to avoid doing integrals,
            • 93:30 - 94:00 because integrals are horrible. Because it's  hard to do. And in this language, it really just   becomes polynomial manipulation. And it becomes  much simpler. Okay. So, you know, in that sense,   I want to put it. Of course, you know, it's a bit  of both. So I like doing this diagram, right? And,   you know, if you look at the time lag between the  mathematical idea and the physical realization of
            • 94:00 - 94:30 that idea, there really is a confluence. Yeah.  It's getting closer. I mean, these things going   up and down. I mean, I'm just saying in  the past, if you take the last 200 years,   last 100 years or so of the groundbreaking ideas  in physics, there is this. Interesting. Right. It   gets shorter and shorter. So, I mean, obviously,  Einstein took ideas of Riemann. And, you know,   there was a six-year gap. Dirac was able to come  up with the equation of electron, essentially
            • 94:30 - 95:00 because of Clifford algebras. Historically, was  he motivated by Clifford algebras? Or was it   later realized, hey, Dirac, what you're doing is  an example of a Clifford algebra? So I believe the   story goes, in order to write down the first-time  derivative version of the Klein-Gordon equation,   which is a second-order, you know, that's the  bosonic one, he had to do some… Essentially,   he factorized the matrix in a way that  seemed very strange to him. And Dirac said,
            • 95:00 - 95:30 this really reminded me of something that I've  seen before. And this is one of those moments,   right? Today, we can ChatGPT this. But  what Dirac did was, he was at St. John's   in Cambridge at the time. He said, I have seen  this in a textbook before somewhere, you know,   this gamma mu gamma nu thing. And then he said,  I need to go to the library to check this. So he
            • 95:30 - 96:00 really knew about this. And unfortunately, the  St. John's library was closed that evening. So   he waited until the morning, until the library  was open, to go to Clifford's book. Or a book   about Clifford. I can't remember whether  it was Clifford's book, or maybe it was   one of these books. And then he opened up,  and he really knew that this gamma mu gamma nu   anti-commutation relation really was through…  So he knew about Clifford. Cool. It's kind of
            • 96:00 - 96:30 interesting. Yeah. Just like Einstein knew about Riemann's work on curvature. But whether you say, you know, Dirac was really inspired by Clifford, well, he certainly did a funky factorization. And then he knew how to justify it immediately, by looking at the right source. And then similarly, you know, Yang-Mills theory depended on this Seifert's book on topology. And
            • 96:30 - 97:00 then, you know, by the time you get to Witten and Borcherds, really, there's this… This diagram, for me, is what gets me excited about string theory. Because string theory is a brainchild of this curve, this orange curve. And now it's getting mixed up. I mean, of course, you know, people hear about this great quote that Witten says, you know, string theory is a piece of 21st century mathematics that happens to fall into the 20th century. And I think he means this. You know,
            • 97:00 - 97:30 that he was using supersymmetry to prove, you know, theorems in Morse theory, and vice versa. Richard Borcherds was using vertex algebras, which are a sort of foundational thing in conformal field theory, to prove some properties about the monster group. We're at this stage. And of course, you know, this was turn of the century. And now we're here, and we have to… Where are we now? Are we crisscrossed, or are we parallel? It's hard to say. And in a meta manner, you can even interpret
            • 97:30 - 98:00 this as the pair of pants in string theory with  the world sheet. Yeah, cute. Very cute. Why not?   But going back to what you were saying, how I got  to… Oh, yeah. So just, yeah, this confluence idea,   of course, you know, everyone quotes these  two books, papers. You know, when Wigner was
            • 98:00 - 98:30 thinking about in 59, why mathematics is so effective in physics. And there's this maybe slightly less known paper, but certainly equally important paper by the great, late Atiyah, and then Dijkgraaf and Hitchin, which is the other way around. Why is physics so effective in giving ideas in mathematics? So this is a beautiful pair of essays. This is like very much in the world of
            • 98:30 - 99:00 a summary of the kind of physics ideas from string theory. It's making such beautiful advances in geometry. So this is a very beautiful pair of one given in the other that needs to be, you know, sort of praised more. And that's why you were mentioning earlier how I got to know, you know, Roger. So, well, it's through these editorials, we try to connect, you know, with my colleague,
            • 99:00 - 99:30 Mo-Lin Ge, who is a former director of the Chern Institute. You know, everybody's connected, right? So it just so happens that, you know, I grew up in the West, but after my trip with my parents, after so many decades, my parents actually retired and went back to Tianjin, where Nankai University is, where Chern founded what's now called the Chern Institute for
            • 99:30 - 100:00 Mathematical Sciences. And that's an institute  devoted to the dialogue between mathematics and   physics. In fact, one third of Chern's ashes is  buried outside of the Math Institute. There's a   great, beautiful marble tomb. And one third,  not because of any mathematical reason, it's   just that he considered three parts of his home.  So his hometown in Zhejiang, China, and Berkeley,
            • 100:00 - 100:30 where he did most of his professional career, and  then Nankai University, where he retired to for   the last 20 years of his life. So a third each.  Yes, the number three comes up again. And in fact,   I was going to joke, so in Chern-Simons theory in  three dimensions, there's this topological theory,   the Chern-Simons theory, there's a crucial factor  of one third. I always joke, you know, that's why   Chern chose one third for his ashes, but that's  not right. Complete coincidence. But what is
            • 100:30 - 101:00 actually interesting is that tomb, that beautiful  black marble tomb, you know, for somebody as great   as Chern, it mentions nothing about, you know, his  chief done this, done the other thing. It's just   one page of his notebook. You think about the poor  guy who had a chisel, or that he had no idea what   he's chiseling, right? The guy was chiseling this  thing, and it's the proof of this. And of course,
            • 101:00 - 101:30 you can look this on the internet, just say the  grave of S.S. Chern at Nankai University. Well,   the whole conversation we've had is just  about pattern matching without the intuitive   understanding behind it. So this chiseler may have  had that. Yes, that's what I do every day. I love   it. So that chisel is essentially his proof of  why this is equal to this. You know, why this
            • 101:30 - 102:00 intersection product is the same as this integral.  So essentially, it's where the Gauss-Bonnet   theorem is a corollary of this trick in algebraic  geometry, which is his great achievement. But   anyhow, back to this coincidence, and it just so  happens that my parents, after drifting all these   years abroad, they retired back to Tianjin,  where the Chern Institute is. So that's why I   became an honorary professor at Nankai, because  my motivation was purely just so that I could
            • 102:00 - 102:30 spend time to hang out with my parents. But it  just so happens that it happens to be there,   and I can just pay my homage to Chern, just to  see his grave. I mean, it's a great, you know,   it's a mind-blowing experience just to see the  Chern's grave and to see the derivation of this   in his handwriting chiseled in stone. But anyway,  so that's how I got involved with C.N. Yang,   because he was very deeply involved with  Chern. He and Chern are good friends. I can
            • 102:30 - 103:00 imagine that C.N. Yang is 102 today. Yeah, it's  remarkable. And that he was still doing, he wrote   the preface to this when he was 99. These guys  are unstoppable. And, you know, Roger Penrose,   he sent his essay to this one when he was, what,  92? Yeah, these guys are... Anyhow, it's kind of,
            • 103:00 - 103:30 you like tables, right? I love tables. So the  tables, here's just a speculation of where string   theory is going. Here's a list of, you know, the  annual conferences, like the series where string   theory has been happening. So 1986 was the first  string revolution, where since then, every year,   there's been a major string conference. I'm  going to the first one I'm going to for years
            • 103:30 - 104:00 in two weeks' time. It happens in Abu Dhabi, I get some sun. And then, you know, there's a series of annual ones, the String Pheno, and then String-Math came in as late as in 2011. That's kind of interesting. So that's like, you know, 30 years after the first string conference. And the various other ones. What's really interesting one is in 2017, there's the first String Data. This is when AI entered string theory. And so it's kind of, so I wrote the first paper in 2017
            • 104:00 - 104:30 about AI-assisted stuff, and there were three  other groups independently, mining different   AI aspects and how to apply to string theory. So  the reason I want to mention this was just how,   why was, you know, with the string community even  thinking about these problems in AI? Oh, and also,   just to be clear, briefly speaking, I'm not a  fan of tables, per se. I'm a fan of dictionaries   because they're like Rosetta Stones. So I'm a  fan of Rosetta Stones and translating between
            • 104:30 - 105:00 different languages. So you mentioned the siloing  earlier. And mathematicians call, even physicists   call them dictionaries, but technically they're  thesauruses. Like a dictionary, you just have a   term and then you define it. The translation.  Like Rosetta Stones. Yes. No, absolutely. I   guess that's why you like Langlands so much. Yeah.  Yeah, for sure. Yeah, no, absolutely. In some way,   this whole channel is a project of a Rosetta Stone  between the different fields of math and physics   and philosophy. Right. Yeah. That's fantastic.  Love it. Big fan. Thank you. Okay. So do you want
            • 105:00 - 105:30 to just, I noticed it jumped back to number 13. So  it seems like, I thought we were at 39 out of 40.   No, no, no. Because I've learned this nonlinear  structure. Because you see, like, I've learned   this, this is really dangerous. I've learned  like the click button in PDF presentations.   Like you click it, it jumps to another one.  And you can have interludes. So, you know, it's   clearly an interlude. And you say you jump back to  your main. So my actual main presentation is only
            • 105:30 - 106:00 like, you know, 30 pages. But it's got all these  digressions, which is actually very typical of my   personality. So I gave you this big interlude  about string theory and Calabi-Yau manifolds,   right? So now we've already got to the point that  Calabi-Yau one-fold, the one-dimensional complex   Calabi-Yau. There's only one example. That's  just one of these, right? And then it turns
            • 106:00 - 106:30 out that in complex dimension two, there are two of these. There is the four-dimensional torus, which is, and then there's this crazy thing called the K3, which is Ricci-flat and Kähler. So you got one in complex dimension one, two in complex dimension two. You would think in three dimensions, there's three of these things that are topologically distinct. And unfortunately, this is one of the sequences in mathematics that goes as one, two, we have
            • 106:30 - 107:00 absolutely no idea. And we know at least one billion. At least. So it's kind of, it goes one, two, a billion. And so Calabi-Yau, so starting from complex dimension three just goes crazy. It's still a conjecture of Yau that in every dimension, this number is finite. So remember this positive curvature thing, this Fano thing at the very top? It is a theorem that in every dimension,
            • 107:00 - 107:30 Fano varieties are finite in possibility, in topology, that there are only a finite number of these that are distinct topologically. It's also known that the negative curvature case is infinite in every dimension. And when it goes higher, it's like even uncountably infinite. Oh, interesting. But it's this boundary case. Yau conjectures in an ideal world, they're also finite. But we don't know. This is the open conjecture. Now the billion, are any of them constructed? Or is it
            • 107:30 - 108:00 just the existence? Yeah, that's it. Now that's exactly what we're getting. So it's gotten one, two, and three. Three is like, you know, how are you going to list these things, right? And then algebraic geometers never really bother listing all in one mouth. This is just not something they do. So it took on, the physicists took on the challenge. So Philip Candelas and friends,
            • 108:00 - 108:30 and then Harald Skarke and Maximilian Kreuzer started just listing these. And that's why we have these billions. There are actually databases of these. And they're presented in just like matrices like this. I won't bore you with the details of these matrices. You know, these are algebraic varieties. You can define these as, you know, like intersections of polynomials. That's one way to present them. And in Kreuzer and Skarke's database, they put in vertices of toric varieties, height, and bend. But the upshot is that, you know, there's a database of many, many
            • 108:30 - 109:00 gigabytes that really got done by the, certainly by the turn of the century, by year 2000. These guys were running on Pentium machines. I mean, this is an absolute feat. Especially Kreuzer and Skarke. They were able to get 500 million of these things stored on a hard drive using a Pentium machine, of these Calabi-Yau manifolds. And they were able to compute topological invariants
            • 109:00 - 109:30 of these. So that's, so I happened to have this  database. I could access them. And that was kind   of fun. And I've been playing on and off with  them for a number of years. So, and, you know,   a typical calculation is like, you know, you  have something like a configuration of tensors   of, you know, here is even in integers. And you  have some standard method in algebraic geometry   to compute topological invariants. And this  topological invariants, again, in this dictionary
            • 109:30 - 110:00 means something. So for example, H1, H21, in some  context, is the number of generations of fermions   in the low energy world. So that's a complete  problem in this computing a topological invariant   in algebraic geometry. And there are methods to  do it. And in these databases, you know, people   took 10, 20 years to compile this database. And  you got these things in. And they're not easy.   It's very complicated to compute these things. So  in 2017, I was playing around with this. And the
            • 110:00 - 110:30 reason is very, why I was playing around with  this was very simple, is because my son was   born. And I had infinite sleepless nights, that I  couldn't do anything, right? I had like, you know,   there's the kid. And then, you know, there's the  kid, and you know, and he wakes you up at two,   you know, put him to, and, you know, and I was  bottle feeding him. And while I had a daughter   at the time, so that my wife's taking care of the  daughter, they're passed out. And then I got this
            • 110:30 - 111:00 kid, I passed him, I put them into bed, and I'm  wide awake at this point, it's like 2am. So like,   I can't fall asleep anymore. And I can't do real,  you know, serious computation anymore, because I'm   just too tired. So let's just play around with  data, the least I can let the computer help me   to do something. And then that's when I learned,  well, you know, what's this thing that everybody's   talking about? Well, you know, it's machine  learning. So that's why I got through this. It's
            • 111:00 - 111:30 a very simple, very simple biological reason why  I was trying to learn machine learning. So then   I think I was hallucinating at some point, right?  I was like, well, if you look at pictures, like,   you know, matrices a lot, like, you know, we're  talking about, you know, 500 million of these   things, right? Yes. Certainly, I wasn't going  through all of them. And they're being labeled   by topological invariants. How different is it if  I just sort of pixelated one of these and labeled   them by this? And all of a sudden, this began to  look like a problem in hand-digit recognition,
            • 111:30 - 112:00 right? This is like, how different is this or  image recognition? So and I just literally started   feeding in, I took 500, I mean, 500 million is  too much, right? So I took like 5,000 of these,   10,000 of these, and I trained them to look and  recognize this, to recognize this number. And I   was like, this is going to be like, it's just  going to give crap. Obviously, it's going to
            • 112:00 - 112:30 give 0% accuracy. And to my surprise, it was giving extremely good accuracies. So somehow, the neural network that I was training, this is, I was even using standard MNIST, you know, the hand recognition, MNIST things, recognizing this. And it was recognizing it to great accuracy. And now, I mean, people have improved this, like loads of people like, you know, Finotello, there's a group there that did some serious work on just trying this problem. But this idea suddenly didn't seem
            • 112:30 - 113:00 so crazy anymore. The idea seemed completely crazy  to me because I was hallucinating at 2am. But   what's the upshot of this? The upshot is, somehow  the neural network was doing algebraic geometry,   like this kind of algebraic geometry, really  sequence-chasing, very complicated Bourbaki-style   stuff, without knowing anything about algebraic  geometry. It somehow was just doing pattern   recognition, and somehow it's beating us. Because,  you know, if you do this computation seriously,
            • 113:00 - 113:30 it's double exponential complexity. But it's just now, by pattern recognition, it's bypassing all of that. So then I became a fanatic, right? Then I said, well, all of algebraic geometry is image processing. And so far, I have not been shot by the algebraic geometers, because it's actually true. If you really think about it, the point of algebraic geometry, the reason I like algebraic more than differential is because there's a very nice way to represent manifolds in
            • 113:30 - 114:00 this way. Manifolds in algebraic geometry. So in differential geometry, manifolds are defined in terms of Euclidean patches. Then you do transition functions, which are differentiable, C infinity, blah, blah, blah. But in algebraic geometry, they're just vanishing loci of polynomials. And then once you have systems of polynomials, you have a very good representation. So for example, here, I'm just recording the list of polynomials, the degrees of polynomials that are embedded
            • 114:00 - 114:30 in some space. And that really is algebraic geometry. So basically, any algebraic variety, so that's a fancy way of saying this polynomial representation of a manifold, which is called an algebraic variety, this thing is representable in terms of a matrix or a tensor, sometimes even an integer tensor. And then the computation of invariants, of topological invariants,
            • 114:30 - 115:00 is the recognition problem of such a tensor. But once you have a tensor, you can always pixelate it and picturize it. At the end of the day, it's doing this because it's just image processing algebraic geometry. Now, do you mean to say every problem in algebraic geometry is an image process? Almost. Is an image processing problem, or just that problems involving invariants are image processing, or even broader than that? Well, I think it is really more broad. I think at some level, I think
            • 115:00 - 115:30 in my view, I try to say bottom-up mathematics is  language processing, and top-down mathematics is   image processing. Interesting. Of course, this  is, I mean, take with a caveat, but of course,   at some level, there is truth in what I say.  Of course, it's an extreme thing to say. But in   terms of what mathematical discovery is, is that  you're trying to take a pattern in mathematics.
            • 115:30 - 116:00 So in algebraic geometry, just a perfect example,  you can pixelate everything, and you can just try   to see certain images have certain properties.  And so you're image processing mathematics,   whereas bottom-up, you're building up mathematics  as a language. So it's language processing. And   of course, all of this will be useless if you  can't actually get human-readable mathematics   out of it. So this is the first surprise, the fact  that it's even doing it at all to a certain degree
            • 116:00 - 116:30 of accuracy. Now we're talking about accuracy, it's been improved to like 99.99 percent accuracy in these databases. But that's the first level, that's the first surprise. The second surprise is that you can actually extract human-understandable mathematics from it. And I think that's the next level surprise. So in the murmuration conjectures, this beautiful work in DeepMind that Geordie Williamson's involved in, in this human-guided intuition, you can actually get human mathematics
            • 116:30 - 117:00 out of it, and that's really quite something. So  maybe that's a good point to break for part two,   which is an advertisement of, you know,  here is like, we've gone through many,   many things about what mathematics is, and to,  you know, how it got this through doing, you know,   this interaction between algebraic geometry  and string theory. And then a second part would   be how you can actually extrapolate and extract  mathematics, actual conjectures, things to prove
            • 117:00 - 117:30 from doing this kind of experimentation, which are  summarized in these books. I keep on advertising   my books because I get 50 pounds per year of,  what do they call it, royalties, you know, so I   don't have to sell my liver for my kids. But it's  actually kind of fun. It's a complete, I mean,   academic publishing is a good joke, right? You  get like, I don't know, like 100 pounds a year,
            • 117:30 - 118:00 because you don't actually make money out of it. But maybe that's a good place to break. And then for part two, how we try to formulate what the Birch Test is for AI, which is sort of, you know, the Turing Test Plus. Because the Birch Test is how to get actual meaningful human mathematics out of this kind of playing around with mathematical data. I see two of your sentences that will be maxims for the future: that machine learning is the 22nd century's math
            • 118:00 - 118:30 that fell into the 21st. So this machine learning  assisted mathematics, or that the bottom up is   language processing, and then the bottom, the top  down is image processing. Yeah. I like those two.   Yeah. Anyone who's watching, if you have questions  for Yang-Hui for part two, please leave them in   the comments. Do you want to give just a brief  overview? Oh, yeah, sure. So just so I'm going   to talk about what the Birch Test is, and which  papers so far have gone, how close they've gone
            • 118:30 - 119:00 to the Birch Test. And then I'm going to talk about some of the more experiments, number theory, and the one that I really enjoyed doing with my collaborators, Lee, Oliver, and Pozdnyakov, which is to actually make something meaningful that's related to the Birch and Swinnerton-Dyer conjecture. Just by just letting machine go crazy and finding a new pattern in elliptic curves, which is fundamentally a new pattern in the prime numbers, which is completely amazing. You
            • 119:00 - 119:30 mentioned Quanta earlier. So this Quanta feature that featured this one considered this as one of the breakthroughs of 2024. Great. And that word murmuration, which was used repeatedly throughout, it was never defined, but it will be in the part two. Absolutely. I'm looking forward to it. Me too. Me too. Okay. Thank you so much. Thank you. This has been wonderful. I could continue speaking to you for four hours. Both of us have to get going, but that's so much fun. Pleasure.
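The pixelation idea described in the conversation above (an integer matrix standing for a variety, treated as an image and labeled by a topological invariant) can be sketched in a few lines. This is only an illustrative sketch, not the pipeline used in the actual work: the function name `matrix_to_image`, the 12 x 15 target size, the assumption of non-negative entries, and the toy matrices are all hypothetical choices made for this example.

```python
import numpy as np

def matrix_to_image(config, size=(12, 15)):
    """Zero-pad a non-negative integer matrix to a fixed shape and rescale to [0, 1].

    The fixed shape lets matrices of different sizes be fed to the same
    image classifier; the (12, 15) default is an assumption for this sketch.
    """
    arr = np.asarray(config, dtype=float)
    rows, cols = arr.shape
    if rows > size[0] or cols > size[1]:
        raise ValueError("matrix is larger than the target image size")
    image = np.zeros(size, dtype=float)
    image[:rows, :cols] = arr          # original entries sit in the top-left corner
    peak = image.max()
    return image / peak if peak > 0 else image  # grey-scale pixel intensities

# Hypothetical input: a one-entry matrix recording a single degree-5 polynomial.
img = matrix_to_image([[5]])

# Hypothetical 2 x 2 example with a smaller canvas.
toy = matrix_to_image([[2, 0], [1, 3]], size=(4, 4))
```

Each resulting array could then be paired with its known invariant as a label and handed to any off-the-shelf image classifier, which is the sense in which the computation becomes a recognition problem.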
            • 119:30 - 120:00 Don't go anywhere just yet. Now I have a recap of today's episode brought to you by The Economist. Just as The Economist brings clarity to complex concepts, we're doing the same with our new AI-powered episode recap. Here's a concise summary of the key insights from today's podcast. Alright, let's dive in. We're talking about Curt Jaimungal and his deep dives into all things mind-bending. You know this guy puts in the hours, like weeks prepping to grill guests like Roger Penrose on
            • 120:00 - 120:30 some wild topics. Yeah, it's amazing using his own  background to dig in. Really challenging guests   with his knowledge of mathematical physics pushes  them beyond the usual. Definitely. And today we're   focusing on his chat with mathematician Yang-Hui  He. They're getting into AI, math, where those two   worlds collide. And it's fascinating because it  really makes you think differently about how math   works, how we do math, and where AI might fit into  the picture. You might think a mathematician's
            • 120:30 - 121:00 life is all formulas and proofs, but Yang-Hui He actually started exploring AI-assisted math while dealing with sleepless nights with his newborn son. It's such a cool example of finding inspiration when you least expect it. Tired but inspired, he started messing around with machine learning in those quiet early morning hours. So let's break down this whole AI and math thing. Yang-Hui He talks about three levels of math. Bottom-up, top-down, and meta. Bottom-up is like building with Legos. Very structured, rigorous proofs. That's the foundation. But here's where
            • 121:00 - 121:30 things get really interesting. It has limitations.  Right. And those limitations are highlighted by   Gödel's incompleteness theorems. Basically, Gödel  showed us that even in perfectly logical systems,   there will always be true statements that can't be  proven within that system. It's mind-blowing. So   if even our most rigorous math has these inherent  limitations, it makes you think. Could AI discover   truths that we as humans bound by our formal  systems might miss? Could it explore uncharted   territory? That's a really deep thought. And it's  really at the core of what makes this conversation
            • 121:30 - 122:00 revolutionary. It's not about AI just helping us with math faster. It's about AI possibly changing how we think about math altogether. So how is this all playing out? We've had computers in math for ages, from early theorem provers to proof assistants like Lean. But where are we now with AI actually doing math? Well, AI is already making some big strides. It's tackling Olympiad-level problems and doing it well. Which makes you ask, can AI really unlock the secrets of math? And that leads us to
            • 122:00 - 122:30 the big philosophical questions. Is AI really understanding these mathematical ideas? Or is it just incredibly good at spotting patterns? It's like that famous Chinese room thought experiment. You could follow rules to manipulate Chinese symbols without truly understanding the language. Yang-Hui He shared a story about Andrew Wiles, the guy who proved Fermat's last theorem, trying to challenge GPT-3 with some basic math problems. It highlights how early AI models, while excelling in tasks with clear rules and plenty of examples, struggled with things that
            • 122:30 - 123:00 needed real deep understanding. It seems like AI's  strength right now is in pattern recognition. And   that ties into what Yang-Hui calls top-down  mathematics. It's where intuition and seeing   connections between different parts of math are  king. Like Gauss. He figured out the prime number   theorem way before we had the tools to prove it.  It shows how a knack for patterns can lead to big   breakthroughs even before we have the rigorous  structure. It's like AI is taking that intuitive   leap, seeing connections that might have taken  us humans years, even decades, to figure out.
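The Gauss example mentioned above can be checked numerically. The short sketch below counts primes up to x with a sieve and compares the count with the estimate x / ln(x) from the prime number theorem; the function name `prime_count` and the sample values of x are choices made for this illustration only.

```python
import math

def prime_count(n):
    """Count the primes <= n with a sieve of Eratosthenes."""
    if n < 2:
        return 0
    sieve = bytearray([1]) * (n + 1)   # sieve[k] == 1 means "k is still a prime candidate"
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)

# Compare the true count with the estimate x / ln(x) at a few sample points.
for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = prime_count(x)
    estimate = x / math.log(x)
    print(x, pi_x, round(estimate), round(pi_x / estimate, 3))
```

The last printed column is the ratio of the true count to the estimate; the prime number theorem says this ratio tends to 1 as x grows, which is the kind of pattern one can guess from tables long before a proof is available.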
            • 123:00 - 123:30 And it's all because AI can deal with such massive amounts of data. Which brings us back to Yang-Hui He's sleepless nights. He started thinking about Calabi-Yau manifolds, super-complex mathematical things key to string theory, as image-processing problems. Wait, Calabi-Yau manifolds? Those sound like something straight out of science fiction. They're pretty wild. Think six dimensions all curled up, nearly impossible to picture. They're vital to string theory, which tries to bring all
            • 123:30 - 124:00 the forces of nature together. Now, mathematicians  typically use these really abstract algebraic   geometry techniques for this. But Yang-Hui? He had  a different thought. So instead of equations and   formulas, he starts thinking about pixels. Yeah.  Like taking a Calabi-Yau manifold, breaking it   down into a pixel grid like you do with an image.  He's taking abstract geometry and turning it into   something a neural network built for image  recognition can handle. That is a radical   change in how we think about this. It's like he's  making something incredibly abstract, tangible,
            • 124:00 - 124:30 translating it for AI. Did it even work? The  results blew people away. He fed these pixelated   manifolds into a neural network, and it predicted  their topological properties really accurately. He   basically showed AI could do algebraic geometry  in a whole new way. So it's not just speeding up   calculations. It's uncovering hidden patterns  and connections that might have stayed hidden,   like opening a new way of seeing math. And that  leads us to the big question. If AI can crack open
            • 124:30 - 125:00 complex math like this, what other secrets could  it unlock? We're back. Last time we were talking   about AI not just helping us with math, but  actually coming up with new mathematical insights,   which is where the Birch test comes in. It's  like, can AI go from being a supercalculator   to actually being a math partner? Exactly. And now  we'll look at how researchers like Yang-Hui He are   trying to answer that. Remember, the Turing  test was about a machine being able to hold   a conversation like a human. The Birch test is a  whole other level. It's not about imitation. It's
            • 125:00 - 125:30 about creating completely new mathematical ideas. Think about Bryan Birch back in the 60s. He came up with this bold conjecture about elliptic curves, just from looking at patterns in numbers. So this test wants AI to make similar leaps, to go through tons of data, find patterns, and come up with conjectures that push math forward. Exactly. Can AI, like Birch, show us new mathematical landscapes? That's asking a lot. So how are we doing? Are there any signs AI
            • 125:30 - 126:00 might be on the right track? There have been some promising developments. Like in 2021, Davies and his team used AI to explore knot theory. Knots, like tying your shoelaces. What's that got to do with advanced math? It's more complex than you think. Knot theory is about how you can embed a loop in three-dimensional space, and it actually connects to things like topology and even quantum physics. Okay, that's interesting. So how does AI come in? Well, every knot has certain mathematical properties called invariants. They're kind of like its fingerprint. Davies' team used machine
            • 126:00 - 126:30 learning to analyze a massive amount of these  invariants. So was the AI just crunching numbers,   or was it doing something more? What's amazing is  the AI didn't just process the data. It actually   found hidden relationships between these  invariants, which led to new conjectures   that mathematicians hadn't even considered  before. Like the AI was pointing the way to   new mathematical truths. That's wild. Sounds  like AI is becoming a powerful tool to spot   patterns our human minds might miss. Absolutely.  Another cool example is Lample and Charton's work
            • 126:30 - 127:00 in 2019. They trained AI on a massive data set of math formulas. And what did they find? Well, this AI could accurately predict the next formula in a sequence, even for really complex ones. It was like the AI was learning the grammar of math and could guess what might come next. So we might not have AI writing full-blown proofs yet, but it's getting really good at understanding the structure of math and suggesting new directions. And that brings us back to Yang-Hui He. His work with those Calabi-Yau manifolds, analyzing them as pixelated forms, that was a huge breakthrough. Showed that
            • 127:00 - 127:30 AI could take on algebraic geometry problems in a totally new way. Like bridging abstract math and the world of data and algorithms. Exactly. And that bridge leads to some really mind-bending possibilities. Yang-Hui He and his colleagues started exploring something they call murmuration. Murmuration. Like birds. It's a great analogy. Think of a flock of birds moving together as one. Each bird reacts to the ones around it, and you get these complex, beautiful patterns. Right,
            • 127:30 - 128:00 I get it. But how does it relate to AI and math? Well, Yang-Hui He sees a parallel between how birds navigate together in a murmuration and how AI can guide mathematicians towards new insights by sifting through tons of math data. So the AI is like the flock, exploring math and showing us where things get interesting. Yeah, and they've actually used this murmuration idea to look into a famous problem in number theory, the Birch and Swinnerton-Dyer conjecture. That name sounds a bit intimidating. What's it all about? Imagine a donut shape, but in the world of numbers. These
            • 128:00 - 128:30 are called elliptic curves. Mathematicians are obsessed with finding rational points on these curves. Points where the coordinates can be written as fractions. Okay, I'm following so far. The Birch and Swinnerton-Dyer conjecture basically says there's this deep connection between how many of these rational points there are and a specific math function, like linking the geometry of these curves to number theory. Things are definitely getting complex now. And it's a big deal in math. It's actually one of the Clay Mathematics Institute's Millennium Prize problems. Solve it,
            • 128:30 - 129:00 you win a million bucks. Now that's some serious math street cred. So how did Yang-Hui He's team use AI for this? They trained an AI on this massive data set of elliptic curves and their functions. The AI didn't actually solve the whole conjecture, but it found this new pattern, this correlation that mathematicians hadn't noticed before. So the AI was like a digital explorer, mapping out this math territory and showing mathematicians what to look at more closely. Exactly. This discovery,
            • 129:00 - 129:30 while not a complete proof, gives more support  to the conjecture and opens up some exciting new   areas for research. It shows how AI can help with  even the hardest problems in mathematics. It feels   like we're on the edge of something new in math.  AI is not just a tool, it's a partner in figuring   out the truth. What does all this mean for math  in the future? That's a great question, and it's   something we'll dig into in the final part of  this deep dive. We'll look at the philosophical   and ethical stuff around AI in math. We'll ask if  AI is really understanding the math it's working
            • 129:30 - 130:00 with, or if it's just manipulating symbols in a  really fancy way. See you there. Welcome back to   our deep dive. We've been exploring how AI is  changing the game in math, from solving tough   problems to finding hidden patterns in complex  structures. But what does it all mean? What are   the implications of all of this? We've touched  on this question of understanding. Does AI really   understand the math it's dealing with, or is it  just a master of pattern matching? Yeah, we can   get caught up in the cool stuff AI is doing, but  we can't forget about those implications. If AI
            • 130:00 - 130:30 is going to be a real collaborator in mathematics,  this whole understanding question is huge. It goes   way back to the Chinese room thought experiment.  Imagine someone who doesn't speak Chinese has this   rulebook for moving Chinese symbols around. They  can follow the rules to make grammatically correct   sentences, but do they actually get the meaning?  So is AI like that, just manipulating symbols in   math without grasping the deeper concepts?  That's the big question, and there's no easy
            • 130:30 - 131:00 answer. Some people say that because AI gets  meaningful results, like we've talked about,   it shows some kind of understanding, even if it's  different from how we understand things. Others   say AI doesn't have that intuitive grasp of math  concepts that we humans have. It's a debate that's   probably going to keep going as AI gets better  and better at math. Makes you wonder how it's   going to affect the foundations of mathematics  itself. That's a key point. Traditionally,   mathematical proof has been all about logic,  building arguments step by step using established   axioms and theorems. But AI brings something  new, inductive reasoning, finding patterns
            • 131:00 - 131:30 and extrapolating from those patterns. So could we  see a change in how mathematicians approach proof?   Could we move toward a way of doing math that's  driven by data? It's possible. Some mathematicians   are already using AI as a partner in the proving  process. AI can help generate potential theorems   or find good strategies for tackling conjectures.  But others are more cautious, worried that relying   too much on AI could make math less rigorous,  more prone to errors. It's like with any new tool.
            • 131:30 - 132:00 There's good and bad. Finding that balance is  important. We need to be aware of the limitations   and not rely on AI too much. Right. And as AI  becomes more important in math, it's crucial   to have open and honest conversations. We need  to talk about what AI means, not just for math,   but for everything we do. It's not just about the  tech. It's about how we choose to use it. We need   to make sure AI helps humanity and the benefits  are shared. That's everyone's responsibility.   A responsibility that goes way beyond just  mathematicians and computer scientists. We need
            • 132:00 - 132:30 philosophers, ethicists, social scientists, and  most importantly, the public. We need all sorts of   voices and perspectives to guide us as we go into  this uncharted territory. This has been an amazing   journey into the world of AI and math. From  sleepless nights to those mind-bending manifolds,   we've seen how AI is pushing the boundaries of  what's possible. And as we wrap up, we encourage   you to keep thinking about these things. What does  it really mean for a machine to understand math?
            • 132:30 - 133:00 How will AI change the way we prove things and  make discoveries in math? How can we make sure   we're using AI responsibly and ethically in our  search for knowledge? These are tough questions,   but they're worth asking. The future of  mathematics is being shaped right now,   and AI is a major player. Thanks for joining us  on this deep dive. We'll catch you next time,   ready to explore some other fascinating corner of  the universe of knowledge. New update! Started a   substack. Writings on there are currently about  language and ill-defined concepts, as well as
            • 133:00 - 133:30 some other mathematical details. Much more being  written there. This is content that isn't anywhere   else. It's not on Theories of Everything, it's not  on Patreon. Also, full transcripts will be placed   there at some point in the future. Several people  ask me, Hey Curt, you've spoken to so many people   in the fields of theoretical physics, philosophy,  and consciousness. What are your thoughts?   While I remain impartial in interviews, this  substack is a way to peer into my present   deliberations on these topics. Also, thank you  to our partner, The Economist. Plus, it helps
            • 133:30 - 134:00 out Curt directly, aka me. I also found out last  year that external links count plenty toward the   algorithm, which means that whenever you share, on  Twitter, say on Facebook, or even on Reddit, etc.,
            • 134:00 - 134:30 it shows YouTube, Hey, people are talking about  this content outside of YouTube, which in turn   greatly aids the distribution on YouTube. Thirdly,  there's a remarkably active discord and subreddit   for Theories of Everything, where people explicate  TOEs, they disagree respectfully about theories,   and build, as a community, our own TOE. Links  to both are in the description. Fourthly,   you should know this podcast is on iTunes, it's  on Spotify, it's on all of the audio platforms.
            • 134:30 - 135:00 All you have to do is type in Theories of  Everything and you'll find it. Personally,   I gain from re-watching lectures and podcasts. I  also read in the comments that, Hey, TOE listeners   also gain from replaying. So how about instead  you re-listen on those platforms, like iTunes,   Spotify, Google Podcasts, whichever podcast  catcher you use. And finally, if you'd like to   support more conversations like this, more content  like this, then do consider visiting patreon.com
            • 135:00 - 135:30 slash CURTJAIMUNGAL and donating with whatever you  like. There's also PayPal, there's also crypto,   there's also just joining on YouTube. Again,  keep in mind, it's support from the sponsors   and you that allow me to work on TOE full-time.  You also get early access to ad-free episodes,   whether it's audio or video. It's audio in the  case of Patreon, video in the case of YouTube.   For instance, this episode that you're listening  to right now was released a few days earlier.   Every dollar helps far more  than you think. Either way,   your viewership is generosity  enough. Thank you so much.
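
As a companion to the elliptic-curve discussion at 127:30 - 129:00, here is a minimal Python sketch of the two ideas mentioned there: checking that a point with fractional coordinates lies on an elliptic curve, and counting the curve's points modulo a prime. The episode does not describe any code or the exact data the AI was trained on; the function names, the short Weierstrass form y^2 = x^3 + a*x + b, and the sample curves are all assumptions chosen for illustration. The quantity a_p = p + 1 - (number of points mod p) is the standard ingredient from which a curve's L-function is built, which is the "specific math function" the conjecture refers to.

```python
from fractions import Fraction


def on_curve(x, y, a, b):
    """Return True if (x, y) satisfies y^2 = x^3 + a*x + b exactly."""
    return y * y == x ** 3 + a * x + b


def count_points_mod_p(a, b, p):
    """Count points on y^2 = x^3 + a*x + b over the integers mod p.

    The count includes the single point at infinity. p is assumed to be
    an odd prime; the loop is brute force, so keep p small.
    """
    # square_counts[r] = number of y in 0..p-1 with y^2 congruent to r mod p
    square_counts = [0] * p
    for y in range(p):
        square_counts[y * y % p] += 1

    total = 1  # the point at infinity
    for x in range(p):
        total += square_counts[(x ** 3 + a * x + b) % p]
    return total


def trace_of_frobenius(a, b, p):
    """Return a_p = p + 1 - #E(F_p), the coefficient used in the L-function."""
    return p + 1 - count_points_mod_p(a, b, p)


if __name__ == "__main__":
    # A rational point with fractional coordinates on y^2 = x^3 - 2x.
    print(on_curve(Fraction(9, 4), Fraction(-21, 8), -2, 0))

    # Point counts and a_p values for y^2 = x^3 - x at a few small primes.
    for p in (3, 5, 7):
        print(p, count_points_mod_p(-1, 0, p), trace_of_frobenius(-1, 0, p))
```

Collecting values like a_p across many curves and many primes gives the kind of large numerical table in which statistical patterns can be searched for; how the team in the episode actually did this is not stated here.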