Summary
In this engaging discussion with Yang-Hui He, we delve into the transformative potential of AI in the realm of mathematics. Yang-Hui shares his insights on how AI isn’t just a tool for quicker calculations but a revolutionary force that's reshaping how we conceptualize and solve complex mathematical problems. By looking at mathematics through the lens of AI, Yang-Hui suggests new approaches to enduring challenges like the Birch and Swinnerton-Dyer conjecture, while emphasizing how AI can bridge the abstract world of algebraic geometry and practical image processing techniques.
Highlights
AI is changing math by offering new insights, not just faster calculations 🧠.
Yang-Hui discovered AI's potential in math during quiet nights with his son 👶.
Calabi-Yau manifolds are reimagined as image problems, showing AI's prowess 📸.
The Birch Test challenges AI to create new mathematical theories 🌟.
AI highlights connections in math akin to leaps made by Gauss 🔍.
Key Takeaways
AI is not just speeding up math but reshaping how we understand it 🧠.
Yang-Hui He stumbled upon AI-assisted math during sleepless nights with his newborn 👶.
Calabi-Yau manifolds modeled as images showcase AI's potential in math 📸.
The Birch Test could be the next step for AI in generating novel mathematical ideas 🌟.
AI's role mirrors intuitive insights of great mathematicians like Gauss 🔍.
Overview
In this episode of Theories of Everything, we explore the intersection of artificial intelligence and mathematics with Yang-Hui He, an expert advocating for AI's transformative role beyond just speeding up calculations. Yang-Hui shares his journey into AI-assisted mathematics, which began during late nights with his newborn. This unexpected inspiration led him to explore how AI could revolutionize the way we approach and solve mathematical problems.
Yang-Hui illustrates how viewing complex algebraic structures, such as Calabi-Yau manifolds, as image recognition tasks has allowed AI to uncover new patterns and insights. This novel approach has even ventured into profound mathematical challenges, like the Birch and Swinnerton-Dyer conjecture, demonstrating AI’s capability to facilitate significant breakthroughs in mathematics.
Furthermore, the conversation delves into the notion of the Birch Test, a hypothetical scenario envisioning AI's capacity to independently generate meaningful new mathematical concepts and conjectures. This evolution in mathematics sees AI not just as a computational tool but as a partner capable of fostering groundbreaking advancements akin to historical mathematical insights.
Chapters
00:00 - 02:30: Introduction and Initial Discussion This chapter introduces Yang-Hui He, the guest on the podcast, as the host expresses admiration for his work and lectures. Yang-Hui reciprocates by acknowledging his admiration for the host, especially highlighting the interviews with prominent figures like Roger Penrose and Edward Frenkel. The atmosphere is one of mutual respect and appreciation as the conversation begins.
02:30 - 10:00: AI and Mathematics The chapter titled 'AI and Mathematics' delves into the intricate relationship between artificial intelligence, machine learning, and mathematics. It explores three levels of understanding mathematics in this context: bottom-up, top-down, and meta perspectives. Before diving into these concepts, the chapter hints at a discussion about the specific mathematical and physics disciplines that initially captivated the speaker's interest, as well as an exploration of their collaboration with Roger. The speaker highlights their expertise in mathematical physics, particularly focusing on the intersection with algebraic geometry.
10:00 - 20:00: Interplay of Physics and Mathematics The chapter discusses the guest's background in string theory and his collaboration with C.N. Yang, a notable figure in physics known for Yang-Mills theory. Yang is remarkable not only for his scientific contributions but also for being the world's oldest living Nobel laureate, having received the Nobel Prize in 1957. The discussion highlights the intertwined nature of physics and mathematics, particularly through influential figures like Yang.
20:00 - 22:30: Future of AI in Mathematical Discovery This chapter delves into the collaborative relationship between the speaker and notable physicist Roger Penrose, initiated through a joint editorial project on a book related to topology in physics. The discussion highlights the importance of collaboration and personal relationships in advancing mathematical and scientific discovery, particularly in the context of future developments in AI. Insights into the editing process for academic works and the potential of AI to transform complex disciplines like topology and physics are explored.
Math Will Never Be the Same Again… | Yang-Hui He Transcription
00:00 - 00:30 Yang-Hui He, hey, welcome to the podcast. I'm so
excited to speak with you. You have an energetic humility and your expertise and your passion comes
across whenever I watch any of your lectures, so it's an honor. It's a great pleasure
and great honor to be here. In fact, I'm a great admirer of yours. You've interviewed
several of my very distinguished colleagues like, you know, Roger Penrose and Edik Frenkel.
I actually watched some of them. It's actually really nice. Wonderful, wonderful.
Well, that's humbling to hear. So firstly,
00:30 - 01:00 people should know that we're going to talk about
or you're going to give a presentation on AI and machine learning and mathematics, and the relationship
between them, as well as the three different levels of what math is in terms of production
and understanding, bottom-up, top-down, and then the meta. But prior to that, what specific math
and physics disciplines initially sparked your interest? And how did the collaboration with Roger
come about? So my, you know, my bread and butter was mathematical physics, especially, you know,
sort of the interface between algebraic geometry
01:00 - 01:30 and string theory. So that's my background, what I
did my PhD on. And so at some point, I was editing a book with C.N. Yang, who is an absolute legend.
You know, he's 102, he's still alive, and he's the world's oldest living Nobel laureate. You know,
Penrose is a mere 93 or something. So C.N. Yang of the Yang-Mills theory, so he's an absolute
legend. He got the Nobel Prize in 1957. So at some
01:30 - 02:00 point, I got involved in editing a book with C.N.
called Topology and Physics. And, you know, with a name like that, you can just invite anybody you
want, and they'll probably say yes. And that was how my initial friendship with Roger Penrose started,
through working together on that editorial work. I mean, I have Roger as a colleague in Oxford, and
I've known him on and off for a number of years. But that's when we really started getting,
working together. So when Roger snickers at
02:00 - 02:30 string theorists, what do you say? What do you do?
How does that make you feel? Oh, that's totally fine. I mean, I'm not a diehard string theorist.
And, you know, I'm just generally interested in the interface between mathematics and physics.
And, you know, Roger's totally chill with that. So you just happened to study the mathematics
that would be interesting to string theorists, though you're not one. Exactly, and vice versa.
I just completely chanced on this. It was kind
02:30 - 03:00 of interesting, you know, I recently gave
a talk, a public lecture in Dublin, about the interactions between physics and mathematics. And
I still find that just, you know, string theory is still very much a field that gives the best
cross disciplinary, you know, kind of feedback. I've been doing that for decades. It's a fun
thing. You know, I talked to my friends in pure mathematics, especially in algebraic geometry,
100% of them are convinced that string theory
03:00 - 03:30 is correct. Because for them, it's inconceivable
for a physics theory to give so much interest in mathematics. Interesting. And that's kind of a,
I think that's a story that hasn't been told so much, you know, in media. You know, if you talk to
a physicist, they're like, you know, string theory doesn't predict anything, this and the other
thing. But there's a big chapter of string theory, you know, to me, more than 50% of the story,
backstory of string theory, is just constantly
03:30 - 04:00 giving new ideas in mathematics. And, you know,
historically, when a physical theory does that, it's very unlikely for it to be completely wrong.
Yeah, you watch the podcast with Edward Frenkel, and he takes the opposite view, although
he initially took the former view that, okay, string theory must be on the correct track
because of the positive externalities. It's like the opposite of fossil fuels. It doesn't give you
what you want for your field, like physics, but it gives you what you want for other fields, as a
serendipitous outgrowth. But then he's no longer
04:00 - 04:30 convinced after being at a string conference.
So you still feel like the pure mathematicians that you interact with see string theory as
on the correct track as a physical theory, not just as a mathematical theory. Yeah, so he,
yeah, absolutely. He does make a good point. And so, I think, you know, Frenkel and algebraic
geometers like Richard Thomas and various people, they appreciate what string theory is
constantly doing in terms of mathematics.
04:30 - 05:00 And the challenge is whether it is a theory
of physics based on the fact that it's giving so much mathematics. I guess, you know, you've
got to be a mystic. Some of them are mystics. Some of us are mystics. And I actually, I don't, I
don't personally have an opinion on that. I just, you know, some days when I'm like, well, you know,
it's, this is such a cool mathematical structure, and there's so much internal consistency. It's got
to be, there's got to be something there. So it's
05:00 - 05:30 just, but of course, you know, it being a science,
you need the experimental evidence, you know, you need to go through the scientific process.
And that I have absolutely no idea. It could take years and decades. Wouldn't you also have to
weight the field, like w-e-i-g-h-t, weight the, whatever field, like the subdiscipline of string
theory, with how much IQ power has been poured into it, how much raw talent has been poured into
it, versus others. So you would imagine that if it was the big daddy field, which it happens to be,
that it should produce more and more insights.
05:30 - 06:00 And it's unclear to me, at least, if this
much time and effort went into asymptotic safety, or loop quantum gravity, or what have you,
or causal set theory, if that would produce mathematical insights of the same level of
quality, we don't have a comparison. I mean, I don't know. I want to know what your thoughts are
on that. I think the reason for that is just that, you know, we follow our own nose as a community.
And the contending theories like, you know, loop
06:00 - 06:30 quantum gravity and stuff, you know, there are
people who do it. There are communities of people who do it. And, you know, there's a reason why the
top mathematicians are doing string-related stuff, is because, you know, you follow the right
nose. You feel like it is actually giving the right mathematics. Things like, you know, mirror
symmetry, you know, or vertex algebras, that's kind of giving the right ideas constantly. And
it's been doing this since the very beginning. So, and people do, you know, the other alternative,
you know, the alternative theories of everything.
06:30 - 07:00 But so far, it hasn't produced new math. You can
certainly prove us wrong. But I think, you know, I follow, you know, there's a reason why Witten
is the one who gets the Fields Medal. Because it just somehow is at the right interface
of the right ideas in geometry, number theory, representation theory, algebra, that this idea
tends to produce the right, you know, the right mathematics. Whether it is a theory of physics,
that's still, you know, that's the next mystical
07:00 - 07:30 level. But, you know, it's kind of, it's an
exciting time, actually. Witten didn't get the Fields Medal for string theory, though. It was
his work on the Jones polynomial, and Chern-Simons theory, and Morse theory with supersymmetry,
and topological quantum field theory, but not specifically string theory. That's right.
That's right. But he certainly is a champion for string theory. And for him, I mean, you know,
that idea, he was able to do, you know, the Morse
07:30 - 08:00 theory stuff, he was able to get because of his
work on supersymmetry. He was able to realize this was a supersymmetric index theorem that generated
this idea. And that's really, supersymmetry really is a cornerstone for string theory, even
though there's no experimental evidence for it.
08:00 - 08:30 So I think that's one of the reasons that's
guiding him towards
this direction. So what's cool is that just prior, the podcast that I filmed just prior to yours was
Peter Woit, as you know, is a critic of string theory. And Joseph Conlon, who is a defender of
string theory, and he has a book even called Why String Theory. That's right. I think it was the
first time that publicly, someone like Peter Woit, along with a defender of string theory, were just
on a podcast of this length, speaking about in a technical manner, what are both of their likes
and dislikes of string theory, and then the string
08:30 - 09:00 community. There's three issues, string theory
as a physical theory, string theory as a tool for mathematical insight, and then three, string
theory as a sociological phenomenon of overhype, and does it see itself as the only game in town?
Is there arrogance? Should there be arrogance? It was an interesting conversation. Yeah. Well,
Joe is a good friend of mine, Joe Conlon. Yeah, right, right. In Oxford. And yeah, no, I value
his comments greatly. I've always been kind of,
09:00 - 09:30 you know, for me, I've always been kind of like
slightly orthogonal to the main string theory community. I'm just happy because it's constantly
giving me good problems to work on. Yes. Including what I'm about to talk about in AI. Wonderful. And
I'll mention a little bit about it because I got into this precisely because I had a huge database
of Calabi-Yau manifolds, and I wouldn't have done that without the string community. It was, again,
one of those accidents that, you know, no other, you know, the other theoretical physicist
didn't happen to have this, didn't happen
09:30 - 10:00 to be thinking about this problem. There's
this proliferation of Calabi-Yau manifolds, and I'll mention that a bit in my lecture later,
and why this is such an interesting problem, why Calabi-Yau-ness is interesting inherently,
regardless whether you're a string theorist. And that kind of launched me in this direction of
AI-assisted mathematical discovery. So this is kind of really nice. And I think, I mean, for me,
the most exciting thing about this whole community
10:00 - 10:30 is that, you know, science, and especially, you
know, theoretical science, well, not especially, science, including theoretical science, has
become so compartmentalized, right? You know, everyone is doing their tiny little bit of
thing. And string theory has been breaking that mold for the last decades. It's constantly,
oh, let's take a piece of algebraic geometry, let's take a bit of number theory here, elliptic
curves, let's take a bit of quantum information,
10:30 - 11:00 entanglement, whatever, entropy, black holes. And
it's the only field that I know that different expertise are talking to each other. I mean, this
doesn't happen in any other field that I know of in sort of mathematics, theoretical physics.
And that just gets me excited. And that's what I really like thinking about. Well, let's hear more
about what you like thinking about and what you're enthusiastic about these days. Let's get to the
presentation. Sure. Well, thank you very much for
11:00 - 11:30 having me here. And I'm going to talk about work
I've been thinking about, stuff I've been thinking about for the last seven years, which is how AI
can help us do mathematical discovery, you know, in theoretical physics and pure mathematics.
I recently wrote this review for Nature, which is trying to summarize a lot of these
ideas that I've been thinking about. And there's an earlier review that I wrote in 2021 about,
you know, how machine learning can help us with
11:30 - 12:00 understanding mathematics. So let me just take
it away and think about, oh, by the way, please feel free to interrupt me. I know this is one of
those lectures. I always like to make my lectures interactive. So please, if you have any questions,
just interrupt me anytime. And I'll just pretend there's a big audience out there and I'll just
make it. So firstly, you're likely going to get to this, but what's the definition of mathematics?
OK, great. So roughly, I mean, of course, you
12:00 - 12:30 know, how does one, so the first question is, how
does one actually do mathematics? Right. And so one can think about, of course, in these reviews,
I try to divide it into sort of three directions. Of course, these three directions are interlaced
and it's very hard to pull them apart. But roughly, you can think about, you know, bottom-up
mathematics, which is, you know, mathematics as a formal, logical system, you know, definition and, you
know, lemma proof and, you know, theorem proof.
12:30 - 13:00 And that's certainly how mathematics is presented
in, you know, papers. And there's another one, which I like to call top-down mathematics, is
where, you know, where the practitioner looks from above. That's why I say top-down, from like
a bird's eye view. You see different ideas and subfields of mathematics. And you try to do this
as a sort of an intuitive creative art. You know, you've got some experience and then you're trying
to see, oh, well, maybe I can take a little bit of
13:00 - 13:30 piece from here and a piece from there and I'm
trying to create a new idea or maybe a method of proof or attack or derivation. Yes. So these
are these two. So that's, you know, complementary directions of research. And the third one,
meta, that's just because it was short of any other creative words, because there's, you know,
words like meta-science and meta-philosophy or meta-physics. I'm just thinking about mathematics
as purely as a language, you know, whether the
13:30 - 14:00 person understands what's going on underneath.
So the meaning is of secondary importance. So it's kind of like ChatGPT, if you wish, you know, can
you do mathematics purely by symbol processing? So that's what I mean by meta. So I'm going to talk
a little bit about, in this talk, about each of the three directions and focusing mostly on the
second direction of top-down, which is what I've been thinking about for the last seven years or
so. Hmm. Okay. I don't know if you know of this
14:00 - 14:30 experiment called the Chinese room experiment.
Yeah. Okay. So in that, the person in the center who doesn't actually understand Chinese, but is
just symbol pushing or pattern matching, I don't know if it's exactly pattern, rule following,
that would be the better way of saying it. Yeah. They would be an example of bottom-up or meta in
this? So I would say that's meta. As you know, on Theories of Everything, we delve into
some of the most reality spiraling concepts from theoretical physics and consciousness to AI
and emerging technologies. To stay informed in an
14:30 - 15:00 ever-evolving landscape, I see The Economist as
a wellspring of insightful analysis and in-depth reporting on the various topics we explore here
and beyond. The Economist's commitment to rigorous journalism means you get a clear picture of the
world's most significant developments, whether it's in scientific innovation or the shifting
tectonic plates of global politics. The Economist provides comprehensive coverage that goes beyond
the headlines. What sets The Economist apart is
15:00 - 15:30 their ability to make complex issues accessible
and engaging, much like we strive to do in this podcast. If you're passionate about expanding your
knowledge and gaining a deeper understanding of the forces that shape our world, then I highly
recommend subscribing to The Economist. It's an investment into intellectual growth, one that
you won't regret. As a listener of TOE, you get a special 20% off discount. Now you can enjoy
The Economist and all it has to offer for less.
15:30 - 16:00 Head over to their website, www.economist.com to
get started. Thanks for tuning in, and now back to our explorations of the mysteries of the
universe. So I would say that's meta, in the sense that the person doesn't even have to be a
mathematician. You're just simply taking symbols, large language modeling for math, if you wish.
Got it. Of course, you know, there's a bit of
16:00 - 16:30 component of others, you know, that you can see
there's a little bit of component of bottom-up, because you are taking mathematics as, you know, a
sequence of symbols. But I would mainly call that meta, if that's okay. I mean, these definitions
are just, you know, things that I'm using. Yes, yes. But in any case, I would talk mostly about
this bit, which is what I've been thinking mostly about. One thing, just to set the scene, you know,
20th century, of course, you know, computers have
16:30 - 17:00 been playing an increasingly important role in
mathematical discovery. And of course, you know, it speeds up computation, all that stuff goes
without saying. But something that's perhaps not so emphasized and appreciated is the fact that
there are actually fundamental and major results in mathematics that could not have been done
without the help of the computer. And so this,
17:00 - 17:30 you know, there's famous examples. Even back in
1976, this is the famous Appel-Haken-Koch proof of the four-color theorem. You know, that every
map in a plane only takes four colors to completely color it so that no
two neighboring regions share a color. And this is a problem that was posed, I think, probably by Euler, right? And this was
finally settled by reducing this whole topology problem to thousands of cases, and then they ran
it through a computer and checked it case by case. So, and then other major things like, you know,
the Kepler conjecture, which is, you know,
17:30 - 18:00 that stacking balls, identical balls. The best way
to stack it is what you see in the supermarket, you know, in this hexagonal thing.
And this was a conjecture by Kepler, but to prove that this is actually the best way
to do it was settled in 1998, again, by a huge computer check. And the full acceptance by the
math community, it was only as late as 2017, when proof copilots actually went through Hales's
construction and then made this into a proof. Yes.
18:00 - 18:30 But wasn't there a recent breakthrough in the
generalized Kepler conjecture? Absolutely. So this is what Maryna Viazovska got the Fields
Medal for. So the Kepler conjecture is in three dimensions, you know, our world. Viazovska
showed in dimensions 8 and 24 what the best possible packings are. And she gave a beautiful
proof of that fact. And to my knowledge, I don't
18:30 - 19:00 think she actually used the computer. There's
some optimization method. Actually, what I'm referring to is that there are some researchers
who generalize this for any n, not just 8, not just 24, who used methods in graph theory of
selecting edges to maximize packing density to solve a sphere packing problem probabilistically
for any n. Though I don't believe they used machine learning. Well, thanks for telling me.
I've got to check that. That's interesting. This
19:00 - 19:30 was actually a really interesting one. I mean,
that's something that's closer to me, which is the classification of finite simple groups. So simple
groups are building blocks of all finite groups. And the proof, you know, took 200 years. And
the final definitive volume was by Gorenstein in 2008. And what's really interesting, the lore in the
finite group theory community is that nobody's actually read the entire proof. It's just not
possible. It takes longer for people to actually read the entire proof than a lifetime. So this
is kind of interesting that, you know, we have
19:30 - 20:00 reached the cusp in mathematical research where
the computers are not just becoming computational tools, but it's increasingly becoming an integral
part of who we are. So this is just set the scene. So we're very much in this, you know, we're now
in the early stages of the 21st century. And this is increasingly the case where we have this,
where computers can help us or AI can help us
20:00 - 20:30 in these three different directions. Great. So
let me just begin with this bottom up and sort of to summarize this. This is probably the oldest
attempt at where computers can help us. So this is where I'm going to define bottom-up, which is,
I guess it goes back to, well, the modern version of this is the classic book of
Russell and Whitehead, the Principia Mathematica,
20:30 - 21:00 which is 1910s, where they try to axiomatize
mathematics, you know, from the very beginning. You know, it took like 300 pages
for them to prove that one plus one equals two, famously. Nobody has read this. So this
is one of those impenetrable books. But I mean, this, but this tradition goes back to, you know,
Leibniz or to Euclid, even, you know, that the idea that mathematics should be axiomatized. Of
course, this, this program took only about 20
21:00 - 21:30 years before it was completely killed in some
sense, because of Gödel and Church and Turing's incompleteness theorems that, you know, this
very idea of trying to axiomatize mathematics by constructing, you know, layer by layer is proven
to be, you know, logically impossible within every order of logic. But I like to quote my very
distinguished colleague, Professor Minhyong Kim. He says the practicing mathematician hardly
ever worries about Gödel. Because, you know,
21:30 - 22:00 if you have to worry about whether your
axioms are valid to your day to day, you know, if an algebraic geometer has to worry about this,
then you're sunk, right? You get depressed about everything you do. Right? So the two parts kind
of cancel out. But the reason I mentioned this is that because of the fact that these two parts
cancel each other out, these two negatives cancel each other out, this idea of using computers to
check proofs or to compute proofs really goes back
22:00 - 22:30 to the 1950s. Right? So despite the, you know,
what Gödel and Church and Turing have proved is foundational. Even back in 1956, Newell, Simon and
Shaw devised this Logic Theory Machine. I have no idea how they did it, because this is really
very, very, very primitive computers. And they were actually able to prove some certain theorems
of Principia by building this bottom up, you know, take these axioms and use the computer to prove.
And this is becoming, you know, an entire field of
22:30 - 23:00 itself with this very distinguished history. And
just to mention that this 1956 is actually a very interesting year, because it's the same year, 56,
57, that the first neural networks emerged from the basement of Penn and MIT. And that's really
interesting, right? So people in the 50s were really thinking about the beginnings of AI, you
know, because neural networks is what we now call,
23:00 - 23:30 you know, goes under the rubric of AI. And at
the same time, they were really thinking about computers to prove theorems and mathematics.
So it's 56 was a kind of a magical year. And, you know, this neural network really was a
neural network in the sense that, you know, they put cadmium sulfide cells in a basement.
It's a wall-sized array of photoreceptors. And they were using, you know, flashlights to try to stimulate
neurons, literally to try to simulate computation.
23:30 - 24:00 That's quite an impressive thing. And then this
thing really developed, right? And now, you know, half a century later, we have very advanced and
very, very sophisticated, computer-aided proof, automated theorem provers. Things like the Coq
system, the Lean system. And they were able to create, so Coq was what was used in this, the
full verification of the proof of the Four
24:00 - 24:30 Color theorem was through the Coq system. And,
you know, then there's the Feit-Thompson theorem, which got Thompson the Fields Medal. Again, they
got the proof through this system. And Lean is very good. I do a little bit of Lean, but also
Lean, the true champion of Lean is Kevin Buzzard at Imperial, 30 minutes down the road from
here, from this spot. And he's been very much a champion for what he calls the Xena project,
and using Lean to formulate, to formalize all of
24:30 - 25:00 mathematics. That's the dream. And what Lean has
done now is that it has, Kevin tells me, all of the undergraduate level mathematics at Imperial,
which is a non-trivial set of mathematics, but still a very, very tiny bit of actual mathematics.
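To give a flavor of what this kind of formalization looks like in practice, here is a minimal Lean 4 sketch of a toy statement and its machine-checked proof; it is an illustrative example only, not a claim about what mathlib actually covers.

```lean
-- A toy formalized statement: commutativity of addition on the natural numbers.
-- Lean checks the proof term mechanically; nothing is accepted on faith.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Even simple numerical facts are certified rather than assumed.
example : 2 + 2 = 4 := rfl
```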
And they can check it, and everything that we've been taught so far at undergraduate level is good
and self-consistent, so nobody needs to cry about
25:00 - 25:30 that one. Wonderful. And so that's all good. And
then more recent breakthroughs is the beautiful work of, you know, so three Fields Medalists here.
So two Fields Medalists, Gowers, Manners, I think it's his name, and Tao, where they proved this
conjecture, which I don't know the details of, but they were actually using Lean to prove, to
help prove this. And I think Terry Tao in this public lecture, which he gave recently in 2024 in
Oxford, he calls this whole idea of AI co-pilot,
25:30 - 26:00 which I very much like this word. I was with
Tao in August in Barcelona, we were at this conference, and he's very much into this, and
of course, you know, Tao, Terry Tao for us is, you know, is a godlike figure. And the fact that
he's championing this idea of AI co-pilots for mathematics is very, very encouraging for all
of us. Yes. And for people who are unfamiliar
26:00 - 26:30 with Terry Tao, but are familiar with Ed Witten,
Terry Tao is considered the Ed Witten of math, and Ed Witten is considered the Terry Tao of
physics. Yeah, I've never heard that expression. That's kind of interesting. At Barcelona, when
Terry was being introduced by the organizer, Eva Miranda, she said, Terry Tao is, this is a very
beautiful sentence, Terry Tao has been described as the Gauss of mathematics. Yes, or the Mozart.
But I think a more appropriate thing to describe
26:30 - 27:00 him is to describe him as the Leonardo da Vinci of
mathematics, because he has such a broad impact on all fields of mathematics, and that's a very rare
thing. Yeah, I remember he said something like, topology is my weakest field, and by weakest
field to him, it means I can only write one or two graduate textbooks off of the top of my head on
the subject of topology. Exactly, exactly. I guess
27:00 - 27:30 his intuitions are more analytic. He's very much
in that world of analytic number theory, functional analysis. He's not very pictorial, surprisingly.
Like Roger Penrose has to do, everything has to be in terms of pictures. But Terry is a symbol,
symbolic matcher. He can just look at equations, extremely long, complicated equations, and just
see which pieces should go together. That's kind
27:30 - 28:00 of very interesting. Speaking of Eva Miranda,
you and I, we have several lines of connection. Eva's coming on the podcast in a week or two to
talk about geometric quantization. Awesome. Eva is super fun, right? She's filled with energy.
Yes. She's a good friend of mine. I think in this academic world of math and physics, I think we're
at most one degree of separation from anyone else. It's a very small community, relatively small
community. Back to this thing about, of course,
28:00 - 28:30 one could get overoptimistic. I was told by my
friends in DeepMind that Shaggedy, who I think is on this AI math team, was extrapolating:
computers beat humans at chess in the 90s, beat humans at Go in the late 2010s, so you should beat humans
at proving theorems in 2030. I have no idea how he extrapolated that; there are only two
data points. But DeepMind has a product to sell,
28:30 - 29:00 so it's very good for them to be overoptimistic.
But I wouldn't be surprised that this number… Well, I'm not sure to beat humans, but it
might give ideas that humans have not thought about before. So that's possible. Just moving
on. So that's the bottom up. And as I said, this is very much a blossoming, or not blossoming,
it's very much a long, distinguished field of automated theorem computing, of theorem provers
and verification and formalization of mathematics,
29:00 - 29:30 which Tao calls the AI copilot. Just to mention
a bit with your question a bit earlier about metamathematics. So this is just kind of… I
like your analogy. This is like the Chinese room. Can you do mathematics without actually
understanding anything? You know, personally, I'm a little biased, because having interacted
with so many undergraduate students before I moved to the London Institute so I don't have to
teach anymore, teach undergraduates, I've noticed,
29:30 - 30:00 you know, maybe one can say the vast majority
of undergraduates are just pattern matching, without necessarily any understanding. I think
this is one of the reasons why, you know, why ChatGPT does things so well. It's not
just because… It's not because, oh, you know, LLMs are great, large language models are great.
It's more that most things that humans do are so without comprehension anyway. So that's why it's
kind of this pattern matching idea. And this is
30:00 - 30:30 also true for mathematics. What's funny is that my
brother's a professor of math at the University of Toronto for machine learning, but for finance.
And I recall 10 years ago, he would lament to me about students that came to him who wanted to be
PhD students, and he would say, okay, but Curt, some of them, they don't have an understanding.
They have a pattern matching understanding. He didn't want that at the time, but now he's into
machine learning, which is effectively that times 10 to the power of 10. Right, right. No, no, I
completely agree. I mean, this is not… This is
30:30 - 31:00 not to criticize undergraduate studies. You know,
I think in undergraduate students, it's just that, you know, it's part of being human. We kind of
pattern match, and then we do it the best we can. And then, of course, if you're Terry Tao, you
know, you actually understand what you're doing, but you know… Of course. But the vast majority
of us doing most of the stuff is just pattern matching. So that's why… And this is true even
for mathematics. So here, I just want to mention
31:00 - 31:30 something, which is a fun project that I did
with my friends, Vishnu Jejjala and Brent Nelson, back in 2018, before LLMs. So before all this LLM
for science thing. And this is a very fun thing, because what we did, we took the arXiv, and
we took all the titles on the arXiv. You know, this is the, you know, the preprint server for
contemporary research in theoretical sciences. And, you know, we were doing NLP classifiers,
Word2Vec, very old fashioned. This is a neural
31:30 - 32:00 network, Word2Vec. And, you know, you can
classify this and do their thing. But what's really interesting, this is my favorite bit, we
took, to benchmark the arXiv, we took viXra. So viXra is a very interesting repository, because
it's arXiv spelled backwards. And it has all kinds of crazy stuff. I'm not saying everything
on viXra is crazy, but it certainly has everything that the arXiv rejects, because it thinks it's
crazy. Things like, you know, three page proof
32:00 - 32:30 of the Riemann hypothesis, or Albert Einstein is
wrong, it's got filled with that. It's interesting to study the linguistics, even at a title level,
you could see that, you know, what they call the distinctions of quantum gravity versus the other
things, they have the right words in viXra. But the word order is already quite random, that, you
know, in other words, the classification matrix, the confusion matrix for viXra is certainly
not as distinct as the arXiv's, which, you know, so kind of interesting, you know, you get all
the right buzzwords. It's like, you know, kind
32:30 - 33:00 of thing, viXra, I think, is a good benchmark,
that linguistically is not as sophisticated as, you know, real research articles. But this idea,
so this is something much more serious, is this very beautiful work of Tshitoyan et al. in Nature,
where they actually took all of material science, and they did a large language model for that, and
they were able to actually generate new reactions in material science. So this, I think, this paper
in 2019, this paper by Tshitoyan, is really the
33:00 - 33:30 beginnings of LLM for scientific discovery. This
is quite early, I mean, this is 2019, right? Yeah, and it's remarkable how we can even say that
that's quite early. The field is exploding so quickly. Absolutely. That five years ago
is considered quite some time ago. Yeah, absolutely. Even five years ago, you know, I was
still very much in a lot of... I've evolved in thinking a lot about this thing. I would also
like to get to your personal use cases for LLMs,
33:30 - 34:00 ChatGPT, Claude, and what you see as the pros and
cons between the different sorts, like Gemini was just released at 2.0, and then there's O1, and
there's a variety. So at some point, I would like to get to you personally, how you use LLMs, both
as a researcher and then your personal use cases. Okay, now I can mention a little bit. One of the
very, very first things when ChatGPT-3 came out in, what, 2018, something, 2019, something like
that? ChatGPT... oh, three? You mean GPT-3? GPT-3,
34:00 - 34:30 like the really early baby versions. Yeah,
that was during, just before the pandemic. Just before pandemic. So that was just like,
so I got into this AI for math through this Calabi-Yau metaphor, which I'm going to mention a
bit later. And then GPT came out when I was just thinking about this large language model. So this
is a great conversation. So I was typing problems
34:30 - 35:00 in calculus, freshman calculus, and it solved them
fairly well. I mean, it's really quite impressive what it can do. So it's fairly sophisticated
because things like, I was typing questions like, take vector fields, blah, blah, blah on a sphere,
find me the grad or the curl. I mean, it's like
35:00 - 35:30 first, second year stuff, and you have to do a
lot of computation. And he was actually doing this kind of thing correctly, partially because there's
just so many example sheets of this type out there on the internet. And so he's kind of learned all
of that. So I was getting very excited and I was trying to sell this to everybody at lunch. I was
having lunch with my usual team of colleagues in Oxford over this. And of course, lo and behold,
who was at lunch was the great Andrew Wiles. So
35:30 - 36:00 I felt like I was being a peddler for GPT, LLM for
mathematics, to perhaps the greatest living legend in mathematics. And Andrew's super nice, and he's
a lovely guy. And he just instantly asked me, he says, how about you try something much
simpler? Two problems he tried. The first one was, tell me the rank of a certain elliptic curve. And
he just typed it down, a certain elliptic curve,
36:00 - 36:30 or rational points, of a very simple elliptic
curve, which is his baby. And I typed it in, and it got it completely wrong. It very quickly started
saying things like, five over seven is an integer. Partially because this is a very hard thing
to do. You can't really guess integer points. Unlike in calculus, where there's a routine
of what you need to do. And then very quickly, we converge on an even simpler problem. How about
find the 17th digit in the decimal expansion of
36:30 - 37:00 22 divided by 29, like whatever. And on that it's
completely random, because you can't train for it; you actually have to do long division. This is primary
school level stuff. And yet GPT just simply cannot do, and it's inconceivable that it could do it,
because no language model could possibly do this. But GPT now, o1, is already clever enough.
When he asks a question like this, linguistically,
37:00 - 37:30 it knows to go to O from alpha. And then it's
okay, then it's actually doing the math. So something so basic like this, you just can't train
a language model to do. You get one in 10 right, and it's just a randomly distributed thing.
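To make the contrast concrete, here is a minimal Python sketch of what "actually doing the long division" amounts to; the fraction 22/29 and the 17th digit come from the anecdote above, and the function name is just illustrative.

```python
# Minimal sketch: the nth decimal digit of a fraction requires carrying out
# schoolbook long division step by step; there is no linguistic pattern to match.
def nth_decimal_digit(numerator: int, denominator: int, n: int) -> int:
    remainder = numerator % denominator
    digit = 0
    for _ in range(n):                # n steps of long division
        remainder *= 10
        digit, remainder = divmod(remainder, denominator)
    return digit

print(nth_decimal_digit(22, 29, 17))  # the 17th digit of 22/29 after the decimal point
```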
Whereas sophisticated things, they are seemingly sophisticated things, like solving differential
equations, or doing very complicated integrals. It can do, because there's somewhat of a routine,
and there are enough samples out there. So anyway,
37:30 - 38:00 so that's my use case, two use cases. That's
also not terribly different than the way that you and I, or the average person, or people in
general, think. So for instance, we're speaking right now in terms of conversation. And then if we
ask each other a math question, we move to a math part of our brain. We recognize this is a math
question. So there's some modularity in terms of how we think. It's not like we're going to solve
long division using Shakespeare. Even if we're in
38:00 - 38:30 a Shakespeare class, and someone just jumps in
and then asks that question, we're like, okay, that's a different, that's of a different sort
of mechanism. Yeah, that's a good analogy. Yeah, yeah. When you first encountered ChatGPT,
or something more sophisticated that could answer even larger mathematical problems, did
you get a sense of awe or terror initially? So I'll give you an example. There was this
meeting with some other math friends of mine, and I was showing them ChatGPT when it first came
out. And then one of the friends said, explain,
38:30 - 39:00 can you get it to explain some inequality,
or prove some inequality? And then it did, and then explained it step by step. Then everyone
just had their hand over their mouth like, are you serious? Can you do this? And then they're
like, then one said, one friend said, this is like speaking to God. And then another friend said,
had the thought like, what am I even doing? What's the point of even working, if this can just do my
job for me? So did you ever get that sense? Like, yes, we're excited about the future, and it as
an assistant, but did you ever feel any sense
39:00 - 39:30 of dread? I'm by nature a very optimistic person.
So I think it was just awe and excitement. I don't think I've ever felt that I was threatened, or the
community is being threatened. I could totally be wrong. But so far, I just say, this is such an
awesome thing, because it'll save me so much time looking up references and stuff like this.
I was happy. I was just like, wow, this is kind
39:30 - 40:00 of cool. I mean, I guess if I were an educator, I
might get a bit of a dread, because there's like, you know, undergraduate degrees, you know, if you
do an undergraduate degree, it's just basically one ChatGPT being fed to another. You know, a
lot of my colleagues started setting questions in exams with ChatGPT, with fully LaTeXed-out
equations. I mean, this is becoming the standard thing to do. I guess even if you're an educator,
you would probably worry. But I was thinking about
40:00 - 40:30 just long term discovery of, you know, what new
knowledge can we generate? So in that sense, this is going to be a certainly an incredible
help, because it's got all the knowledge in the background. Wonderful. All right, let's move
forward. Yeah, sure. So 2022 was a great year. I'm surprised this wasn't like over every single
newspaper. I don't know why. At least I was told there was some obscure outlet. I can't remember.
Some expert friends in the community told me that
40:30 - 41:00 ChatGPT has passed the Turing test. This is
a big deal. But I don't know why it hasn't been. I was hoping to see this on BBC and every major
news outlet, but it didn't catch on. But anyhow, I believe that in 2022, ChatGPT passed the
Turing test. And then, you know, where in the last two years, this is obviously where we can, you
know, this is a huge development now for large language models for mathematics. And, you know,
every major company, OpenAI, MetaAI, EpochAI,
41:00 - 41:30 you know, everything. And they've been doing
tremendous work in trying to get LLMs for math. Basically, you know, take the arXiv, which is a
great repository for mathematics and theoretical physics, pure mathematical and theoretical
physics, and then just learn that and try to generate to see it to how much. And this is very
much a work in progress. And of course, you know,
41:30 - 42:00 AlphaGeometry, AlphaGeometry 2, AlphaProof, this is all
DeepMind's success. It's kind of interesting, within a year, you know, you've gone from 53%
on Olympiad level to 84%, which is, you know, this is scary, right? This is scary in the sense
that impressively awesome that, you know, they could do so quickly. So basically in 2022, an AI
is approximately equal to the 12-year-old Terence Tao, in the sense that it could do a silver
medal. But of course, this is a very specialized,
42:00 - 42:30 you know, AlphaGeometry 2 was really just homing in
on Euclidean geometry problems, which to be fair, extremely difficult, right? If you don't know
how to add the right line or the right angle, you have no idea how to attack this problem, but
it's kind of learned how to do this. So it's kind of nice. So, you know, this is all within, you
know, a couple of years. And there's this very
42:30 - 43:00 nice benchmark called Frontier Math that Epoch
AI has put out. I think there was a white paper and they got Gowers and Tao, you know, the usual
suspects, just to benchmark. Okay, fine. So it can do 84% on Math Olympiad, which is sort of
high school level. What about truly advanced research problems, right? To my knowledge, as of
the beginning of this month, it was only doing 2%. So that's okay, fine. So it's not doing that
great. But at the beginning of this week, you learn
43:00 - 43:30 that OpenAI o3 is doing 25%. So we've gone up more than 20
points, to a quarter of the problems, within four weeks. So this is, wow, that's kind of very
interesting. Such a rapid improvement. It's so, this is great. I love this, right? Because it's
exciting. It's very rare to be. I remember back in the day when I was a PhD student, doing AdS/CFT
43:30 - 44:00 related algebraic geometry, because Maldacena had
43:30 - 44:00 just come out with a paper in 97, 98, and that's
just when I began my PhD. I remember that kind of excitement, the buzz in the string community.
People were saying, there was a paper every couple of days on the next, that kind of excitement. And
I haven't felt that kind of excitement for a very long time, just because of this. Wow. And then
this is like that, right? Every week, there's this
44:00 - 44:30 new benchmark, a new breakthrough. So that's
why I find this field of AI-assisted mathematics to be really, really exciting. Can you explain,
perhaps it's just too small on my screen because I have to look over here, but can you explain
the graph to the left with Terence Tao? Oh, gosh. I'm not sure I can, because I'm not sure I can
read this graph in detail. I think it's the, it's the year. What is it trying to convey? So it's the
ranking of Terence. Oh, no, this is just Terence
44:30 - 45:00 Tao's individual performances over different
years, over different problems. So he's retaking the test every year? No, no, he's taken it three
times. Ages 10, 11, and 12. And when he was 10, he got the bronze medal, and then he got the
silver medal, then he got the gold medal within three years. Okay. And age of 12 or something. But
I can't, I think... What are those bars, though? I
45:00 - 45:30 think the bars, it's a good question. I think,
maybe it's to the different questions, you're given 60 questions, and what it would take to get
the gold medal, I think, or what it would take to get the silver medal. I think. How many percent
do you have to get correct? Okay, so it wasn't a foolish question of mine. It's actually... No,
no, no, no, it's a good question. I have no recollection, or maybe I never even looked at it.
Somebody told me about this graph at some point. I forgot what it is. Okay, because it looks to me
like Terence Tao is retaking the same test, and
45:30 - 46:00 then this is just showing his score across time,
and he's only getting better. But that can't be it. Why would he retake a test? He's a professor.
No, I think it goes to 66. It must be like... This is an open source graph. Oh, I thought you were
going to say, this is an open problem in the field. What does this graph mean? No, no, no.
It's an open source. This graph is just... You can take it from the Math Olympiad database. Got
it. Which I shamelessly... See, again, perfect,
46:00 - 46:30 right? I've just done something that I have
absolutely no understanding. I've presented to you like a language model, and I just copy and pasted,
because it's got a nice, cute picture of Terence Tao when he was a little... So, finally, I'll go
back to the stuff that I've really been thinking about, which is this sort of top-down mathematics,
right? So, and then this is kind of interesting. So, the way we do research, you know,
practitioners, is completely opposite to
46:30 - 47:00 the way we write papers. I think that's important
to point that out. We muck about all the time. We do all kinds... When you look at my board, right,
it's just filled with all kinds of stuff. And most of it is probably just wrong. And then once we got
a perfectly good story, we write it backwards. And I think writing math papers backwards, and math
generally defined, math and theoretical physics papers backwards, well, theoretical physics is
a bit better. At least sometimes you write the
47:00 - 47:30 process. But in pure math papers, everything
is written in the style of Bourbaki, this very dry definition-proof style, which is completely not how
it's actually done at all. This is why, you know, Arnold, the great Vladimir Arnold, says, you know,
Bourbaki is criminal. He actually used this word, the criminal Bourbakization of mathematics,
because it leaves out all human intuition and experience. It just becomes this dry machine-like
presentation, which is exactly how things should
not be done. But Bourbaki is extremely important,
because that's exactly the language that's most amenable to computers. So, you know,
it's one way or another. But we, you know, human practitioners certainly don't do this kind
of stuff, right? We muck about, you know, we have to... And sometimes rigor is even sacrificed,
right? If we had had to wait for proper analysis in the 19th century to come about before Newton
invented calculus, we wouldn't even know how to
48:00 - 48:30 compute the area of an ellipse. Because we would have to
wait and formalize all of that. It would all just go backwards. So kind of the historical progression
of mathematics is exactly opposite to the way that it's presented. I mean, it's fine, but the way
it's presented is better. It's much more amenable to a proof copilot system like Lean than what we
actually do. Even science in general is like that,
48:30 - 49:00 where we say it's the scientific method, where
you first come up with a hypothesis and then you test it against the world, gather data and so on.
But the way that scientists, not just in math and physics, but biologists and chemists and so on,
work, are based on hunches and creative intuitions and conversations with colleagues and several dead
ends, and then afterward you formalize it into a paper in terms of step-by-step, but it was highly
nonlinear. You don't even have a recollection most
49:00 - 49:30 of the time of how it came about. That's right.
And I think one of the reasons I got so excited about all this AI for math is this direction.
Because this hazy idea of intuition or experience, this is something that a neural network is
actually very, very good at. Wonderful. So I'm going to give concrete examples later on
about how it guides humans. But just to give
49:30 - 50:00 some classical examples, I've said this joke so
many times. So what's the best neural network of the 18th century? Well, it's clearly the brain
of Gauss. I mean, that's a perfectly functioning, perhaps the greatest neural network of all
time. I want to use this as an example. Because what did Gauss do? Gauss plotted the number of
prime numbers less than a given positive real
50:00 - 50:30 number, just to give a sort of continuity. And he
plotted this, and it's kind of a really, really, you know, jaggedy curve. And it's a step function
because it jumps whenever you hit a prime. But Gauss was just able to look at this when he was
16 and say, well, this is clearly x over log x. How did he even do this? Experience? I mean, he had
to compute this by hand, and he did, and he got
50:30 - 51:00 some of them wrong even. You know, primes, he had
tables. By his time, the tables of primes went up into the tens and hundreds of thousands. He had them
go up into the hundred-thousand range. And you could just look at this as x over log of x. But this
is very important because he was able to raise a conjecture before the method by which this
conjecture is proved, namely complex analysis, was even conceived of by Cauchy and Riemann.
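As a rough illustration of the comparison Gauss eyeballed, here is a minimal Python sketch that sets the exact prime count pi(x) against his guess x / ln(x); sympy's primepi is used for convenience and the sample points are arbitrary.

```python
# Compare the prime-counting function pi(x) with Gauss's guess x / ln(x).
from math import log
from sympy import primepi

for x in [10**3, 10**4, 10**5, 10**6]:
    exact = int(primepi(x))          # number of primes up to x
    guess = x / log(x)
    print(f"x = {x:>8}  pi(x) = {exact:>6}  x/ln x = {guess:>10.1f}  ratio = {exact / guess:.3f}")
```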
And that's a very important fact. So he just
51:00 - 51:30 kind of felt that this was x over log x. And you
had to wait for 50 years before Hadamard and de la Vallée Poussin proved this fact, because
this technique, which we now take for granted, this technique called complex numbers, complex
analysis, hadn't yet been invented by Cauchy. You had to wait for that to
happen. So it happens like this in mathematics all the time. Even major things. Of course, you
know, now it's called the prime number theorem,
51:30 - 52:00 which is a cornerstone of all of mathematics.
This is the first major result since Euclid on the distribution of primes. How did Gauss say
this was x over log x? I don't know. Because he had a really great neural network. And this
happens over and over again. Like, you know, the Birch-Swinnerton-Dyer conjecture, which
I'm going to talk about later, which is one of the millennium problems. And it's still open,
and it's certainly one of the most important problems in mathematics of all time. And this is
Birch-Swinnerton-Dyer in a basement, you know,
52:00 - 52:30 in Cambridge in the 1960s. They just plotted ranks
and conductors of elliptic curves. I'm going to define those in more detail later. And they would
say, oh, that's kind of interesting. You know, the rank should be related to the conductor in some
strange way. And that's now the BSD conjecture, the Birch-Swinnerton-Dyer conjecture. And what
they were doing was computer-aided conjectures. So here was the eyeballs of Gauss in the 19th
century. But the 20th century really had seriously
52:30 - 53:00 computer-aided conjectures. And of course, the
proof of this is still open in general. There's been lots of nice progress in this. And, you
know, where we're going to go is very much, what technique do we need to wait to prove
something like this? Now, is there a reason that you chose Gauss and not Euler? Like, is
it just because Gauss had this example of data
53:00 - 53:30 points and guessing a form of a function? I'm sure
Euler, who certainly is great, had conjectures. Maybe... That's an interesting question. I'll mention
Euler later. But I think there's not an example as striking as this one. In fact, what's interesting,
as a byproduct of Gauss inventing this, because it was kind of mucking around with statistics,
right? This is before statistics existed as
53:30 - 54:00 a field as well, right? This is like early 1800s.
And Gauss, I think, and you can check me on this, Gauss got the idea of statistics and the Gaussian
distribution because he was thinking about this problem. So it's kind of interesting. So he was
laying foundations to both analytic number theory and modern statistics in one go. He was doing
regression. So I think he essentially invented
54:00 - 54:30 regression and the curve fitting, which is like
101 of modern society. He was trying to fit a curve. What was the curve that really fit this? In
the process, he got x over log x. And in addition, he got this idea of regression. An impressive
guy. What can we say? He's a god to us all. The
54:30 - 55:00 upshot of this is like, I love this. Again, this
is something I found on the internet. And just to emphasize that this idea of... Speaking
of God. Yes, speaking of God, this idea of mucking about with data in pure mathematics is
a very ancient thing, right? Once you formulate something like this in conjecture, you will write
your paper. Imagine writing a paper, you will say,
55:00 - 55:30 definition, prime; definition, pi of
x; then conjecture about pi of x; then evidence. Rather than all of the failed stuff about inventing regression
and mucking about, all that stuff just gets not written at all. That intuitive creative process is
not written down anywhere. So it's great. I'm glad I'm chatting to you about it, right? Because it's
nice to have an audience with this, right? So if you look at like... So pattern recognition, what
do we do in terms of pure mathematical data? If I
55:30 - 56:00 gave you a sequence like this, you can immediately
tell me what the next number is, to some confidence. The first one is just, you know, whether it's a
multiple of three or not. This one, I've tried with many audiences, and after a few minutes
of struggle, you can get the answer. And then this turns out to be the prime characteristic function.
So what I've done here is to mark all the odd integers. And evens, obviously, you're going to
get zero. So it's kind of pointless. You just add
56:00 - 56:30 just a sequence of odd integers. And then it's a
one if it's a prime, it's zero if it's not. So 3, 5, 7, 8, and so on and so forth. No, sorry, 3, 5,
7, 9, 11. And you mark all the odd ones, which are one. And you can probably, after a while, you can
muck about and you can see where this is going. The next sequence is much harder. So I'm going
to give away so we won't have to spend a couple
56:30 - 57:00 of hours staring at it. So this one is what's
called the shifted Möbius function. What this is, just you take an integer, and you take the
parity of the number of prime factors it has up to multiplicity, starting from 2. I think I didn't
start from 1 here. And then if it's 1, if it's 0, if it's an odd number of prime factors, it's 1 if
it's an even number of prime factors for all the
57:00 - 57:30 sequence of integers. And I hope now I've gotten
this right. So if I think I start with 2, 2 has, so that's all. No, let's see, 2, 3. Yeah, so I did
start I'm going to mark 1 for 1, just to kick off the sequence. And then 2 is a prime number, it has
only one prime factor. It's an odd number. 3 is an odd number of prime factors. 4 is 2, because
it's 2 squared. So it has an even number of prime
57:30 - 58:00 factors, and so on and so forth. So 5 is prime, it
has one prime factor, an odd number. 6 is 2 times 3, so it has two, an even number of prime factors, and so on and so
forth. It looks kind of harmless. So if you
stare at this for a while, it's very, very hard to recognize a pattern. And what's really interesting
is that to know the parity of the next number,
58:00 - 58:30 if you have an algorithm that can tell me the
parity of this in an efficient way, you will have an equivalent formulation of the Riemann
hypothesis. So that's actually an extremely hard sequence to predict. So if you can tell me
with some confidence more than 50% what the next number is, without looking up some table, then
you can probably end up cracking every bank in the world. Because this is equivalent to the Riemann
hypothesis. So I've just given three, so trivial,
58:30 - 59:00 kind of okay-ish, really, really, really hard.
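As a concrete illustration of the three sequences just described, here is a small Python sketch; the 0/1 convention for the shifted Liouville lambda is my reading of the description above, and the names are mine, not the speaker's.

```python
def is_prime(n):
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def big_omega(n):
    """Number of prime factors of n counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)

N = 30
mod3_seq    = [1 if n % 3 == 0 else 0 for n in range(1, N + 1)]            # trivial
odd_prime   = [1 if is_prime(n) else 0 for n in range(3, 2 * N + 3, 2)]    # odd n: prime or not
liouville01 = [1 if big_omega(n) % 2 == 0 else 0 for n in range(1, N + 1)] # shifted Liouville lambda

print(mod3_seq)
print(odd_prime)
print(liouville01)
```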
Yes. So now you can think about a question, if I were to feed sequences like this into some neural
network, how would a neural network do? So one way to do it, and this takes a bit of a bend, is to go way back
to the very beginning, to the question of what is mathematics? And Hardy, in his beautiful apology,
says, what mathematicians do is essentially we
59:00 - 59:30 are pattern recognizers. That's probably the best
definition of what mathematics is, is that it's a study of patterns, finding regularity in patterns.
And in fact, if there's one thing that AI can do better than us, it's pattern detection. Because
we evolved in being able to detect patterns in
59:30 - 60:00 three dimensions and no more. So in this sense,
if you have the right representation of data, you can be pretty sure that AI can do better than us. I
mean, it'll generate a lot of stuff, but filtering out what is better is a very interesting problem
in and of itself. So let's try to do one. I mean, there are various ways to do this representation.
One way you can do it is to do a problem which is maybe best fit for an AI system, which is
binary classification of binary vectors. So
60:00 - 60:30 what you do is, you know, sequence prediction
is kind of difficult. So one thing you can do is just take this infinite sequence and just
take, say, a window of a hundred, a thousand, a fixed window size, and then label it with
the one immediately outside the window, and then shift, label, shift, label. So then you
can generate a lot of training data this way. So for this sequence, I think I've just taken here,
you know, whatever the sequence is, and I just,
60:30 - 61:00 with a fixed window size, and with this label.
So now you have a perfectly supervised, perfectly defined binary supervised machine learning
problem. Then you pass it to your standard AI algorithm, you know, just
the out-of-the-box ones, nothing fancy, you don't even have to tune your particular
architecture. Just take your favorite one, and then do cross validation, you know, the
standard stuff, take a sample, do the training,
61:00 - 61:30 you know, and then try to validate this on unseen
data. So if you do this to the mod-3 problem, to this one, you immediately find that, you know,
any neural network, or whatever Bayesian classifier, would do it with 100% accuracy, as it should,
because it would be really dumb if it didn't, because this is just a linear transformation.
So even if you have a single neuron that's just doing linear transform, that's good enough to do
it. The primality problem, I did some experiments,
61:30 - 62:00 some, oh gosh, like seven years ago, and it got
80% accuracy. And I was like, wow, that's kind of, this was a wow moment, I was like, why is it
doing 80? I don't have a good answer to this. Why is it doing 80% accuracy to this? How is it
learning? Maybe it's doing some sieve method, which is kind of interesting, somehow. The second
number is just to chi-squared, just to double test what's called MCC, which is Matthew's correlation
coefficient. These are just buzzwords in stats.
62:00 - 62:30 I never learned stats, but now I'm relearning. I
took Coursera in 2017, so I can relearn all these buzzwords. Great. It's great, it's really useful.
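Here is one way the window-shift-label setup described above could look in code. This is a hedged sketch, not the original experiment: the window size, the network architecture, and the sequence length are placeholders I chose, and it leans on off-the-shelf scikit-learn and SymPy.

```python
import numpy as np
from sympy import isprime
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import matthews_corrcoef

# 0/1 sequence: is the n-th odd number (3, 5, 7, 9, ...) prime?
seq = [1 if isprime(n) else 0 for n in range(3, 40001, 2)]

def windows(sequence, width):
    """Fixed window of previous terms as features, the next term as the label."""
    X = [sequence[i : i + width] for i in range(len(sequence) - width)]
    y = [sequence[i + width] for i in range(len(sequence) - width)]
    return np.array(X), np.array(y)

X, y = windows(seq, width=100)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)  # an untuned, out-of-the-box network
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", (pred == y_te).mean())
print("MCC     :", matthews_corrcoef(y_te, pred))
```

Swapping in the shifted Liouville lambda sequence for the primality one gives the experiment discussed next, the one that refuses to climb above coin-toss accuracy.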
And then this shifted Liouville lambda function, it's, sorry, I think I made a, yeah, I mistakenly
called this Möbius mu function. It's not, I mean, it's related, but it's not. It's the shifted
Liouville lambda function. Got it. Sorry,
62:30 - 63:00 one of my neurons died when I said Möbius mu,
but it's Liouville lambda. You were subject to the one-pixel attack. But so this one, I
couldn't break 50%, right? 0.5 just means it's coin toss. It's not doing any better guessing
than whatever. And this chi-squared is 0.00, that means I'm up to statistic error. So which means
I couldn't find an AI system which could break, which could do better than random guess. I'm not
saying there isn't one, it would be great if there
63:00 - 63:30 were one. And then, yeah, so it's kind of, you
know, it's life. And I couldn't, if I do break it, you know, I might actually stand a good chance
breaking every bank in the world. All right. But I don't, I haven't made it work yet. Let's
remain close friends. Yeah, that's right, that's right. So I was very proud of this because
this experiment, I'm going to mention a bit later,
63:30 - 64:00 this Liouville lambda was just a thing I was
just trying, like way back when. But apparently, Peter Sarnak, whom I really admire, he's one of
the world's greatest number theorists currently, current number theorists. And I got to know him
through this murmuration thing that I'm going to talk about later. And I reminded him that I almost
became his undergraduate research student. I ended up doing, I was an undergrad at Princeton,
where I had two paths I could follow for,
64:00 - 64:30 you know, it kind of defines your undergraduate
thesis, right? So one was in mathematical physics, with Alexander Polyakov. And the other
one was actually offered by Peter Sarnak
on arithmetic problems. And I somehow just, because I wanted to understand the nature of space
and time, I went through the Alexander Polyakov
64:30 - 65:00 path to do mathematical physics, which led to do
string theory. After 20, 30 years, I came full circle back into Peter Sarnak's world again. I met him
at this conference, I reminded him of this, and he was very happy. But what's really interesting is
that he was asking DeepMind the same question a few years ago about the Liouville lambda, whether
DeepMind could do better than 50%. So I was glad
65:00 - 65:30 that I thought along the similar lines as a great
expert in number theory. And somebody who could potentially have been my supervisor, and then
I would have gone into number theory instead of string theory, which is whatever, it's just how
life happens. So perhaps you're going to get to this later on in the talk, but I noticed here
you have the word classifier. And the recent buzz since 2020 or so has been with architecture,
the transformer architecture in specific. So is there anything regarding mathematics, not
just LLMs, that has to do with transformer
65:30 - 66:00 architecture that's going to come up in your talk?
Not specifically. I'm actually, it's interesting, one of my colleagues here at the London
Institute, he's Mikhail Burtsev. He's an AI, he's our institute's AI fellow, and he's an expert
on transformer architecture. So I've been talking to him and we're trying to get, to devise a nice
transformer architecture to address problems
66:00 - 66:30 in finite group theory. It's in the works. But
nothing so far; even with the murmuration stuff, it's very basic neural networks, we didn't use
anything more sophisticated than that. So to be determined whether it will outperform the standard
ones will be kind of interesting. Got it. Yeah, so actually now we go way back to the beginning
of our conversation, is how I got into this stuff. And that, I don't know, completely coincidentally
was through string theory. So at this point,
66:30 - 67:00 maybe I'll just give a bit of a background of how
all this stuff came about, at least personally. Why was I even thinking about this? Because I knew
nothing about AI seven, eight years ago. Zero, like literally zero. I knew nothing more than
to read it from the news. And this is actually a very interesting story, which shows again, the
kind of ideas that the string theory community
67:00 - 67:30 is capable of generating, just because you got
all these experts looking at kind of interesting problems. So let's go way back. And again, you
know, I've quoted Gauss, right? I've got to say something about Euler. So this is a
problem. Again, you can see I'm very influenced by three, the number three. You know, I'm a total
numerologist, right? Trinity, the number three, three is something, right? And then there is what's
called the trichotomy classification theorem by
67:30 - 68:00 Euler. This dates to 1736. So if you look at, so
I'm going to say the buzzword, which is connected compact orientable surfaces. So these are, you
know, I mean, the words explain themselves: they have no boundaries, and they're, you know,
topological surfaces. So Euler was able to
realize that a single integer characterizes all
68:00 - 68:30 such surfaces. So this is the standard thing
that people see in topology, right? So the surface of a ball is the surface of a ball, and
you can deform it, you know, the surface of a football is the same as an American football,
it can deform without cutting or tearing. And then the surface of a donut is the same as,
you know, your cup, right? Because, you know,
68:30 - 69:00 it's everything that everyone, the standard thing,
you know, this is, it has one handle. And so the surface of a donut is exactly the topologically,
what they call topologically homeomorphic to the cup. And then you've got the, you know, the pretzel.
So I think that's a pretzel. Or maybe, I think this is like the German pretzel, and it gets
more and more complicated. But Euler, you know, Euler invented the field of topology, so
he realized this idea of topological equivalence,
69:00 - 69:30 in the sense that there's a single topological
invariant, which we now call the Euler number, which characterizes these things. Another way to,
an equivalent way to say is, the genus of these surfaces is, you know, no handles, one handle, two
handles, three handles, and so on and so forth. It turns out that this number, which we now
call the Euler number, is 2 minus twice the genus. So it's 2 minus 2g. Okay, that's great. So this
is, that's the classic Euler's theorem. And then,
69:30 - 70:00 you know, comes in Gauss, right? Once you've got
these three names, the Euler, Gauss, and Riemann, you know, this is, it's got to be some serious
theorem, right? So Euler did this in topology. And then Gauss did this incredible work, which
he himself calls the Theorema Egregium, the remarkable theorem, which he considers
his personal favorite. And this is Gauss,
70:00 - 70:30 right? And Gauss said, you can relate this number
to, which is, this number is purely topological. You can relate this number to metric geometry.
So he came up with this concept, which we now call Gaussian curvature, which is some complicated
stuff. You can characterize this curvature, which you can define on the surface. Well, this is even before
the word manifold existed. And then
70:30 - 71:00 you can integrate using calculus, and the integral
of this Gaussian curvature divided by 2 pi is exactly equal to this topological number. And
that's incredible, right? The fact that you can do an integral, it comes out to be an integer. And
that integer is exactly topological. So with this idea, Gauss related geometry to topology in one
sweep. And then, at what is really the next level,
71:00 - 71:30 comes Riemann. Riemann says, well, what you can
do is to complexify. So these are no longer, you know, real connected compact orientable surfaces.
But you can think about these as complex objects. So what do we mean by that is, well, if you think
about, you know, the real Cartesian plane, that's
71:30 - 72:00 a two dimensional object. But you can equally
think of that as a one complex dimensional object, namely the complex plane. Or the complex line.
Yeah, the complex line, exactly. So what we write as R^2, Riemann would call C. And then Riemann realized
that you can put similar structure on all of these things as well. So all of a sudden, these things
are no longer two dimensional real orientable surfaces. But one complex dimensional, what's now
called curves. I mean, it's a terrible name. So a
72:00 - 72:30 complex curve is actually a two real dimensional
surface. And it turns out that all complex curves are orientable. So you already rule out things
like, you know, Klein bottles and stuff like that, or Möbius strips. So the complex structure
requires orientability. And that's partly because of Cauchy-Riemann relations, you know, it puts a
direction, you can't get away. But the idea is,
72:30 - 73:00 the interesting thing is, all of these now should
be thought of as one complex dimensional curves. They're called curves because they're one complex
dimension, but they're not curves, right? They're surfaces in the real sense. Yes. So now here
comes, so if you apply this to the Gauss thing, you get this amazing trichotomy theorem. And the
theorem says, if you do this to the curvature, you can see this, I mean, the number here
is two, right? You get the Euler number two,
73:00 - 73:30 which is a positive curvature thing, right?
And that's consistent with the fact that the sphere is a positively curved object. Locally,
everywhere, it has positive curvature. If you do it to a torus, or the surface of a donut, which
is just called, you know, the algebraic donut, you integrate that, you get zero curvature.
And this is not a surprise, because, you know, you have a sheet of paper, you fold it once,
you get a cylinder, and you fold it again, you glue it again, you get this torus, this
donut. And this sheet of paper is inherently
73:30 - 74:00 flat. Yes. So if you just take a piece of paper,
you take this piece of paper, and you roll it up, you get a cylinder. And then you do it again,
you get the surface of a donut, like a rubber tire. And that is inherently zero curvature. And
then you can do this, and this is a consequence
74:00 - 74:30 of what's known as Riemann uniformization theorem.
If you do anything that has more than one handle, you get zero curvature. So now you have the
trichotomy, right? You have positive curvature, zero curvature, and negative curvature. The one in
the middle is really, obviously, it's interesting, it's the boundary case. In complex algebraic
geometry, these things are called Fano varieties. Earlier you said, if you have anything
that's more than one handle, you have zero curvature, you meant negative curvature. Sorry,
sorry, I meant negative curvature. Negative,
74:30 - 75:00 yeah. So these fidget spinners on the right, they
all have negative curvature. Everything here has negative curvature. Yeah. So now in the world
of complex algebraic geometry, these positive curvature things are called Fano varieties,
after this Italian guy, Fano. These negative curvature objects, which proliferate, are called
varieties of general type. And this boundary case, the zero curvature objects, it just
so happens, we now call the things in the middle
75:00 - 75:30 Calabi-Yau. These zero curvature objects. Yes. So
far, this has got nothing to do with physics. I mean, it's just the fact of topology. But this
is such a beautiful diagram that took from 1736 until Riemann. Riemann died in the 1860s, I
think, or something like that. So it took 100, 120 years to really formulate just this table to
relate metric geometry to topology, to algebraic
75:30 - 76:00 geometry. It's kind of a beautiful thing, right?
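The table being described can be compressed into one line of Gauss-Bonnet plus a three-row trichotomy; this is a standard summary in my own formatting, not a slide from the talk.

```latex
\[
  \chi(\Sigma_g) \;=\; 2 - 2g \;=\; \frac{1}{2\pi}\int_{\Sigma_g} K \, \mathrm{d}A
  \qquad \text{(Gauss--Bonnet)}
\]
\[
\begin{array}{c|c|c|l}
  g      & \chi = 2-2g & \text{curvature} & \text{algebro-geometric name} \\ \hline
  0      & 2 > 0       & \text{positive}  & \text{Fano (the sphere, } \mathbb{P}^1\text{)} \\
  1      & 0           & \text{flat}      & \text{Calabi--Yau (the torus)} \\
  \geq 2 & < 0         & \text{negative}  & \text{general type}
\end{array}
\]
```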
So to generalize this table is the central piece of what's now called the minimal model program in
algebraic geometry, for which there have been all these Fields Medalists, you know, Birkar a couple
of years ago, and then it started with Mori, who got the Fields Medal, and then this whole Mukai
and this whole distinguished line of ideas. So basically, this minimal model program should just generalize
this to higher dimension. This is dimension,
76:00 - 76:30 complex dimension one, right? How do you do it?
It's very hard. And once you have it, I won't bore you with the details. This is very nice.
You know, there's topology, algebraic geometry, differential geometry, index theorem, they all
get unified in this very beautiful way. And you want to, obviously, you want to generalize
this to arbitrary dimension, arbitrary complex dimension. It'd be nice. It's still an open
problem. How do you do it in general? It's a very nice problem. But at least for a class
of complex manifold known as Kähler manifolds,
76:30 - 77:00 I won't bore you with the details, but Kähler
manifolds are ones on which the metric has very nice behavior: there's a potential whose double
derivative gives you the metric. And then it was conjectured by Calabi in the 50s.
Again, you know, 54, 56, 57, it was a great year, right? All these different ideas, I mean, in three
completely different worlds, now come together because mathematical physicists have kind of tied
it up, you know, the world of neural networks,
77:00 - 77:30 the world of Calabi conjecture, the world of
string theory to one. I like, you know, when things get bridged up in this way, you know, but
again, the theorem itself is extremely technical. But the idea is for this Kähler manifolds, there
is an analog of this diagram, basically. I love this slide. I saved this slide for my own private
notes. I keep a collection of dictionaries in physics and math. Yeah. I think this is beautiful.
Yeah, me too. I mean, but it took me years to do
77:30 - 78:00 this table because, you know, it's not written
down anywhere. And it touches different things. I think it's not written down anywhere precisely
because math textbooks are written in the Bourbaki style. But now it just becomes clear what people
have been thinking about for the past 100 years, you know, after Grothendieck. It's just
trying to relate these ideas. You know,
78:00 - 78:30 this is intersection theory of characteristic
classes. Ah, so this is topology. And, you know, this is, I mean, this is over 200 years of work
of, you know, the central part of analysis. And mathematicians like Chern, Ricci, Euler,
Betti. Yeah, everything. Everybody who was ever involved in this diagram is an absolute legend.
In fact, there is one more column to this diagram. I think for sure, I think when I did, I mean, this
was a slide from some time ago. But when I was
78:30 - 79:00 talking to a string audience, there is one more,
one more, which is relations to L-function. And that's what number theory comes in. So there
is one more column. And to understand how this world relates to that extra column, its behavior with respect to
L-functions, that's the Langlands program. So it's actually really magical that this table actually
extends more as far as, I mean, that's just as far as we know now, right? The L-functions and its
relations to modularity. And this is, of course,
79:00 - 79:30 obviously, to me, like mathematics is about
extending this table as much as possible to let it go into different fields of mathematics. So, but
at least for sure, we know there is one because of the Langlands correspondence, there is one more
column and that column should be on number theory and modularity. And soon there'll be another table
on the Yang invariant, the He invariant. No, I don't think, I don't think I have enough talent to
create something that, but it could well be there
79:30 - 80:00 should be something, something new to do. To me,
that's really the most fun part about mathematics. It's not, not so, I mean, they're like, you know,
I, who was it? I think maybe it's Arnold, who said there are two types of mathematicians.
There are the hedgehogs and there are the birds, right? Hedgehogs really like to, you know,
specialize, specialize. I mean, you absolutely need that. I think, you know, who
is a great hedgehog? I think Zhang, the guy who,
80:00 - 80:30 you know, made this first major breakthrough
on the prime gap. I mean, he spent his entire life just trying to think about, can I
bound, you know, what is the
lim inf of the distance between prime pairs. And the technique he uses is
a beautifully argued analytic number theory
80:30 - 81:00 technique, sieve methods, you know, kind of the
Ben Green world of sieves, and James Maynard. And then there
are the birds, who are like, you know, I'm just going to fly
around. I may bump into trees and whatnot, but I'm just trying to see what I can
do. And people like Robert Langlands, you know, they're very much in that world. Can I
see from a distance? I mean, I may get a very coarse-grained view. And which are you? I'm 100% in the
bird category. I mean, what I like to do,
81:00 - 81:30 you know, once I see something,
of course, sooner or later, you need to dig like a hedgehog, but the most thrill
that I get is when I say, oh, wow, this gets connected. So the results are proven when you
dig, but the connections are seen when you get the overview. Yeah. Yeah, absolutely. So, I mean, of
course, again, this is a division that's kind of artificial; in all of us, we do a bit of
both. Yes. The guy who really does it well is,
I forget to mention, of course, it's
like, he became like a grand, well, he passed away, John McKay, who was a Canadian,
probably the greatest Canadian mathematician since Coxeter.
John McKay really saw unbelievable connections in fields that nobody else would ever see. And before he passed
away, in the last 10 years of
82:00 - 82:30 his life, he became sort of like a grandfather
to me. He, you know, he saw my kids grow up, you know, over Zoom. So the London
Math Society asked me to write an obituary. I was very touched by this, and so I wrote his
obituary for them, and I was just trying to say, well, this guy is the ultimate pattern, you know,
linker. So John McKay, absolute legend. Great. Moving on. I mean, this is very
much, this is very much a huge digression for what
82:30 - 83:00 I'm actually going to tell you about, which is,
you know, the Birch test for AI. And that's... Great. Do you have a limit on what these videos
are? No, just so you know, some of them are one hour, some of them are four hours. And people
listen for all of it. Yeah, this is great fun. Great. Yeah, same. I'm loving this. Yeah, me too.
Because normally, you know, I have one hour, you know, what, 55 minute cutoff. I could give a
talk, right, and then five minutes questions.
83:00 - 83:30 And they're like, oh my God, I haven't said most
of the stuff I wanted to say. Yeah, yeah, exactly. Because the point of this channel is to give
whoever I'm speaking to enough time to get through all of their points, rather than they're rushing
and not covering something in depth. I want them to be technical and rigorous. So please continue.
Sure. Sounds good to me. So in that magical year of 1957 of neural networks, the magical year of
the automated theorem prover world, and the world
83:30 - 84:00 of algebraic geometry, in three completely different
worlds, they didn't even know each other's names, let alone the results. Calabi conjectured that at
least for Kähler manifolds, this diagram is very much well-defined, this table. And Yau proved
it 20 years later. So Shing-Tung Yau, who is, again, very much like a mentor to me. And
he gets the Fields Medal immediately. So
84:00 - 84:30 you can see why this is so important. He gets the
Fields Medal because this idea, following Calabi, is trying to generalize this sequence
of ideas of Euler, Gauss, and Riemann. So it's certainly very important.
So there it is. We can park this idea. So Yau showed that there are these Kähler manifolds that
have this property, that have the right metrical properties. So by metric, I mean distance,
something you can integrate over. Because here,
84:30 - 85:00 you would think that this integral is messy,
right? Even if we do this on a sphere, this R has all these cosines and sines. And they've
all got to cancel at the end of the day to get 4 pi. Yes. Like, what the hell? And then divided
by 2 pi, you get 2. And that's the Euler number, which is kind of amazing stuff. And now you can
do this in general. Just as a caveat, Yau showed that this metric exists. He never actually gave
you a metric. So the only currently known metric
85:00 - 85:30 on this thing is, for the zero curvature case,
it's just the torus. Anything above that, we don't know. We just know that exists. And if you did
this integral, you're going to get like 2, 5, or whatever the number is. Which is kind of amazing.
This is like a completely non-constructive proof. What's interesting is that these automated
theorem provers, they seem computational. And it's my understanding that constructivists, so
people who use intuitionistic logic, they don't like
85:30 - 86:00 constructive proofs. Sorry, they like constructive
proofs. They don't like non-constructive proofs. In other words, existence proofs without showing
the specific construction. So it's interesting to me that all of undergraduate math, which has
some non-constructive proofs, are included in Lean. So I don't know the relationship between
Lean and non-constructive proofs, but that's an aside. Yeah, that's an aside. I probably
won't have too much to say about it. Cool. So,
86:00 - 86:30 I don't know why I went on this digression
on string theory. But I just want to say, this is a side comment. So this is something
since 1736, which is kind of nice. Oh, by the way, that's actually kind of interesting. I'm going
to have to check this again. Just down the street from the Institute is the famous department store
Fortnum & Mason, which I think was established in
86:30 - 87:00 17-something. It's a great department store.
It's not where I usually do my shopping, but it's just a beautiful department store where
Mozart and Haydn might have called and did their Christmas shopping. But anyhow, just random
thought. So string theory was just one slide, right? I mean, in some sense, I'm not a string
theorist. In a sense, I don't go quantize strings.
87:00 - 87:30 The kind of stuff that I'm more interested in is
like, I didn't grow up writing conformal field theories and do all that stuff. For me, it's an
input, so I can play with a little more problems in geometry. So string theory is this theory of
space-time that unifies quantum gravity, blah, blah, blah. And then it works in 10 dimensions,
and we've got to get down to four dimensions. So we're missing six dimensions. So that's what I
want to say. And this amazing paper in 1985 by
87:30 - 88:00 Candelas, Horowitz, Strominger, and Witten, they
were thinking about what are the properties of the six extra dimensions. So what is interesting is
that by imposing supersymmetry, and this is why supersymmetry is so interesting to me, by imposing
supersymmetry and other anomaly cancellation, not too stringent conditions, they hit on the
condition that these six extra dimensions have to
88:00 - 88:30 be Ricci-flat. Ricci-flat, you can understand,
because it's vacuum-style solutions. You want the vacuum string solution. And then a condition
which you've never seen before, which just happens to be this Kähler condition. They didn't know
about this. No physicist until 1985 would know what a Kähler manifold was. And it's complex,
and it's complex dimension three. Remember, again, I said complex dimension three means real
dimension six, right? That's 10 minus four is six,
88:30 - 89:00 and six needs to be complexified into three. And
again, this is just an amazing fact that in 1985, Strominger, who was a physicist, was visiting Yau
at the Institute for Advanced Study in Princeton. And so he went to Yau and said, can you tell
me what this strange condition, this technical condition I got? And Yau says, well, you know, I
just got the Fields Medal for this. I think I may
89:00 - 89:30 know a few things. It's just amazing. Again, it
was a complete confluence of ideas that's totally random. And the rest is history. So in fact, these
four guys named this Ricci-flat Kähler manifold Calabi-Yau. So it wasn't the mathematicians who
did it. This word Calabi-Yau came from physicists. So from string theorists, and now, you know,
of course, Calabi-Yau is one of the central pieces. And so Philip Candelas was my mentor at
Oxford when I was a junior fellow there. And he
89:30 - 90:00 tells me this story. He's a very lively guy. He
tells me about how this whole story came about, and it's very interesting. So he and these four
guys came up with the word Calabi-Yau. So all of a sudden, we now have a name for this boundary
case in complex algebraic geometry. This boundary
90:00 - 90:30 case is now known as a Calabi-Yau. So remember,
we had names before, right? This was the Fano variety. This was varieties of general type.
And this boundary case is now called Calabi-Yau. So what we're seeing with the torus here is
a Calabi-Yau 1. Exactly. In fact, the torus is the only Calabi-Yau 1-fold. So it's the only one
that's Ricci-flat. I mean, by this classification,
90:30 - 91:00 it's the only one that's topologically possible.
So that's kind of interesting, right? And then this is just a comment. I like this title because
I think your series is called TOE. This is a TOE on TOE. Love it. I just want to emphasize, this
is a nice confluence of ideas with mathematical physics. But string theory really, what it really
is, is this brainchild of interpreting problems between, interpreting and interpolating between
problems in mathematics and physics. So for
91:00 - 91:30 example, we now, you know, GR should be phrased
in differential geometry. The standard model gauge theory should be phrased in terms of algebraic
geometry and representation theory of finite groups. And, you know, condensed matter physics of
topological insulators should be phrased in terms of algebraic topology. This idea, you know, I
think the greatest achievement of the 20th century physics is, to me, and I think something you would
appreciate since you like tables, is that here's
91:30 - 92:00 a dictionary of a list of things, and then here's
what they are in mathematics. And then, you know, you can talk to mathematicians in this language,
and you can talk to physicists in that language, but they're actually the really same, same, same
thing. You know, what's a fermion? You know, it's a spin representation of the Lorentz group.
You know, I like that because it gives a precise definition of what we are seeing around. Then you
have something you can purely play with in this platonic world. And string theory is really just
the brainchild of this translation, this tradition
92:00 - 92:30 of what's on the left and what's on the right, and
let's see what we can do. And sometimes you make progress on the left, you give insight and stuff
on the right, and sometimes you make progress on the right and you give insight on the left. Why
is it that you call the standard model algebraic geometry? Because bundles and connections are
part of differential geometry, no? Oh yeah, that's true. Well, I think that's, yeah, I mean,
they're interlinked. And I think algebraic, maybe it's because of Atiyah and Hitchin. Of course, you
know, they are fluid in both. Yeah, they go either
92:30 - 93:00 way. But algebraic in the sense that you can often
work with bundles and connections without actually doing the integral in differential geometry.
So I think that's the part I want to emphasize. You know, you can understand bundles purely
as algebraic objects without ever doing an
93:00 - 93:30 integral. You know, like here, for example. Like
this integral is obviously something you would do in differential geometry. But this integral, the
fact that it comes out to be an integer, was explained through the theory of Chern classes. You know,
this integral is a pairing between homology and cohomology, which is a purely algebraic thing.
You know, we all try to avoid doing integrals,
93:30 - 94:00 because integrals are horrible. Because it's
hard to do. And in this language, it really just becomes polynomial manipulation. And it becomes
much simpler. Okay. So, you know, in that sense, I want to put it. Of course, you know, it's a bit
of both. So I like doing this diagram, right? And, you know, if you look at the time lag between the
mathematical idea and the physical realization of
94:00 - 94:30 that idea, there really is a confluence. Yeah.
It's getting closer. I mean, these things going up and down. I mean, I'm just saying in
the past, if you take the last 200 years, last 100 years or so of the groundbreaking ideas
in physics, there is this. Interesting. Right. It gets shorter and shorter. So, I mean, obviously,
Einstein took ideas of Riemann. And, you know, there was a six-year gap. Dirac was able to come
up with the equation of electron, essentially
94:30 - 95:00 because of Clifford algebras. Historically, was
he motivated by Clifford algebras? Or was it later realized, hey, Dirac, what you're doing is
an example of a Clifford algebra? So I believe the story goes, in order to write down the first-order-in-time
derivative version of the Klein-Gordon equation, which is second-order, you know, that's the
bosonic one, he had to do some… Essentially, he factorized the operator in a way that
seemed very strange to him. And Dirac said,
95:00 - 95:30 this really reminded me of something that I've
seen before. And this is one of those moments, right? Today, we can ChatGPT this. But
what Dirac did was, he was at St. John's in Cambridge at the time. He said, I have seen
this in a textbook before somewhere, you know, this gamma mu gamma nu thing. And then he said,
I need to go to the library to check this. So he
95:30 - 96:00 really knew about this. And unfortunately, the
St. John's library was closed that evening. So he waited until the morning, until the library
was open, to go to Clifford's book. Or a book about Clifford. I can't remember whether
it was Clifford's book, or maybe it was one of these books. And then he opened up,
and he really knew that this gamma mu gamma nu anti-commutation relation really was through…
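The relation in question, written out in standard textbook notation (my rendering, not a formula shown in the conversation), is the requirement that makes the factorization of the Klein-Gordon operator work:

```latex
\[
  (i\gamma^\mu \partial_\mu - m)\,(i\gamma^\nu \partial_\nu + m)
  \;=\; -\Bigl(\tfrac{1}{2}\{\gamma^\mu,\gamma^\nu\}\,\partial_\mu \partial_\nu + m^2\Bigr)
  \;=\; -\bigl(\partial^\mu \partial_\mu + m^2\bigr)
  \quad\text{precisely when}\quad
  \{\gamma^\mu,\gamma^\nu\} = 2\,\eta^{\mu\nu}\mathbf{1},
\]
```

and that anti-commutation relation is the defining relation of a Clifford algebra.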
So he knew about Clifford. Cool. It's kind of
96:00 - 96:30 interesting. Yeah. Just like Einstein knew about
Riemann's work on curvature. But whether you say, you know, Dirac was really inspired by
Clifford, well, he certainly did a funky factorization. And then he knew how to justify
it immediately, by looking at the right source. And then similarly, you know, Yang-Mills theory
depended on this book on topology. And
96:30 - 97:00 then, you know, by the time you get to Witten and
Borcherds, really, there's this… This diagram, for me, is what gets me excited about string
theory. Because string theory is a brainchild of this curve, this orange curve. And now it's
getting mixed up. I mean, of course, you know, people hear about this great quote that Witten
says, you know, string theory is a piece of 21st century mathematics that happens to fall into the
20th century. And I think he means this. You know,
97:00 - 97:30 that he was using supersymmetry to prove, you
know, theorems in Morse theory, and vice versa. Richard Borcherds was using vertex algebras,
which are sort of a foundational thing in conformal field theory, to prove some properties about the
monster group. We're at this stage. And of course, you know, this was turn of the century. And now
we're here, and we have to… Where are we now? Are we crisscrossed, or are we parallel? It's hard to
say. And in a meta manner, you can even interpret
97:30 - 98:00 this as the pair of pants in string theory with
the world sheet. Yeah, cute. Very cute. Why not? But going back to what you were saying, how I got
to… Oh, yeah. So just, yeah, this confluence idea, of course, you know, everyone quotes these
two books, papers. You know, when Wigner was
98:00 - 98:30 thinking about in 59, why mathematics is so
effective in physics. And there's this maybe slightly less known paper, but certainly equally
important paper by the great Leitler, Thea, and then Dijkgraaf and Hitchen, which is the other
way around. Why is physics so effective in giving ideas in mathematics? So this is a beautiful pair
of essays. This is like very much in the world of
98:30 - 99:00 a summary of the kind of physics ideas from string
theory. It's making such beautiful advances in geometry. So this is a very beautiful pair of one
given in the other that needs to be, you know, sort of praised more. And that's why you were
mentioning earlier how I got to know, you know, Roger. So it was through these editorials;
we tried to connect, you know, with my colleague,
99:00 - 99:30 Mo-Lin Ge, who is a former director of the Chern
Institute. You know, everybody's connected, right? So it just so happens that, you know, I
grew up in the West, but after my trip with my parents, after so many decades, my parents
actually retired and went back to Tianjin, where Nankai University is, where Chern founded
what's now called the Chern Institute for
99:30 - 100:00 Mathematical Sciences. And that's an institute
devoted to the dialogue between mathematics and physics. In fact, one third of Chern's ashes is
buried outside of the Math Institute. There's a great, beautiful marble tomb. And one third,
not because of any mathematical reason, it's just that he considered three parts of his home.
So his hometown in Zhejiang, China, and Berkeley,
100:00 - 100:30 where he did most of his professional career, and
then Nankai University, where he retired to for the last 20 years of his life. So a third each.
Yes, the number three comes up again. And in fact, I was going to joke, so in Chern-Simons theory in
three dimensions, there's this topological theory, the Chern-Simons theory, there's a crucial factor
of one third. I always joke, you know, that's why Chern chose one third for his ashes, but that's
not right. Complete coincidence. But what is
100:30 - 101:00 actually interesting is that tomb, that beautiful
black marble tomb, you know, for somebody as great as Chern, it mentions nothing about, you know, how he
achieved this, did the other thing. It's just one page of his notebook. You think about the poor
guy who had the chisel, who had no idea what he was chiseling, right? The guy was chiseling this
thing, and it's the proof of this. And of course,
101:00 - 101:30 you can look this on the internet, just say the
grave of S.S. Chern at Nankai University. Well, the whole conversation we've had is just
about pattern matching without the intuitive understanding behind it. So this chiseler may have
had that. Yes, that's what I do every day. I love it. So that chisel is essentially his proof of
why this is equal to this. You know, why this
101:30 - 102:00 intersection product is the same as this integral.
So essentially, it's where the Gauss-Bonnet theorem is a corollary of this trick in algebraic
geometry, which is his great achievement. But anyhow, back to this coincidence, and it just so
happens that my parents, after drifting all these years abroad, they retired back to Tianjin,
where the Chern Institute is. So that's why I became an honorary professor at Nankai, because
my motivation was purely just so that I could
102:00 - 102:30 spend time to hang out with my parents. But it
just so happens that it happens to be there, and I can just pay my homage to Chern, just to
see his grave. I mean, it's a great, you know, it's a mind-blowing experience just to see the
Chern's grave and to see the derivation of this in his handwriting chiseled in stone. But anyway,
so that's how I got involved with C.N. Yang, because he was very deeply involved with
Chern. He and Chern are good friends. I can
102:30 - 103:00 imagine that C.N. Yang is 102 today. Yeah, it's
remarkable. And that he was still doing, he wrote the preface to this when he was 99. These guys
are unstoppable. And, you know, Roger Penrose, he sent his essay to this one when he was, what,
92? Yeah, these guys are... Anyhow, it's kind of,
103:00 - 103:30 you like tables, right? I love tables. So the
tables, here's just a speculation of where string theory is going. Here's a list of, you know, the
annual conferences, like the series where string theory has been happening. So 1986 was the first
string revolution, where since then, every year, there's been a major string conference. I'm
going to the first one I'm going to for years
103:30 - 104:00 in two weeks' time. It happens in Abu Dhabi,
I get some sun. And then, you know, there's a series of annual ones, the String Pheno, and then
StringMath came in as late as in 2011. That's kind of interesting. So that's like, you know, 30
years after the first string conference. And the various other ones. What's really interesting one
is in 2017, there's the first string data. This is when AI entered string theory. And so it's
kind of, so I wrote the first paper in 2017
104:00 - 104:30 about AI-assisted stuff, and there were three
other groups independently, mining different AI aspects and how to apply to string theory. So
the reason I want to mention this was just, why was, you know, the string community even
thinking about these problems in AI? Oh, and also, just to be clear, briefly speaking, I'm not a
fan of tables, per se. I'm a fan of dictionaries because they're like Rosetta Stones. So I'm a
fan of Rosetta Stones and translating between
104:30 - 105:00 different languages. So you mentioned the siloing
earlier. And mathematicians call, even physicists call them dictionaries, but technically they're
thesauruses. Like a dictionary, you just have a term and then you define it. The translation.
Like Rosetta Stones. Yes. No, absolutely. I guess that's why you like Langlands so much. Yeah.
Yeah, for sure. Yeah, no, absolutely. In some way, this whole channel is a project of a Rosetta Stone
between the different fields of math and physics and philosophy. Right. Yeah. That's fantastic.
Love it. Big fan. Thank you. Okay. So do you want
105:00 - 105:30 to just, I noticed it jumped back to number 13. So
it seems like, I thought we were at 39 out of 40. No, no, no. Because I've learned this nonlinear
structure. Because you see, like, I've learned this, this is really dangerous. I've learned
like the click button in PDF presentations. Like you click it, it jumps to another one.
And you can have interludes. So, you know, it's clearly an interlude. And you say you jump back to
your main. So my actual main presentation is only
105:30 - 106:00 like, you know, 30 pages. But it's got all these
digressions, which is actually very typical of my personality. So I gave you this big interlude
about string theory and Calabi-Yau manifolds, right? So now we've already got to the point that
Calabi-Yau one-fold, the one-dimensional complex Calabi-Yau. There's only one example. That's
just one of these, right? And then it turns
106:00 - 106:30 out that in complex dimension two, there are two
of these. There is the four-dimensional torus, which is, and then there's this crazy thing
called the K3, which is Ricci-flat and Kähler. So you got one in complex dimension one,
two in complex dimension two. You would think in three dimensions, there's three of these
things that are topologically distinct. And unfortunately, this is one of the sequences
in mathematics that goes as one, two, we have
106:30 - 107:00 absolutely no idea. And we know at least one
billion. At least. So it's kind of, it goes one, two, a billion. And so Calabi-Yau, so starting
from complex dimension three just goes crazy. It's still a conjecture of Yau that in every dimension,
this number is finite. So remember this positive curvature thing, this Fano thing at the very
top? It is a theorem that in every dimension,
107:00 - 107:30 Fano varieties are finite in possibility, in
topology, that there are only a finite number of these that are distinct topologically. It's also known
that the negative curvature ones are infinite in every dimension. And when it goes higher, it's like
even uncountably infinite. Oh, interesting. But it's this boundary case. Yau conjectures in
an ideal world, they're also finite. But we don't know. This is the open conjecture. Now the
billion, are any of them constructed? Or is it
107:30 - 108:00 just the existence? Yeah, that's it. Now that's
exactly what we're getting. So it's gotten one, two, and three. Three is like, you know, how are
you going to list these things, right? And then algebraic geometers never really bother listing
them all out en masse. This is just not something they do. So the physicists took on
the challenge. So Philip Candelas and friends,
108:00 - 108:30 and then Harald Skarke and Maximilian Kreuzer
started just listing these. And that's why we have these billions. There are actually databases of
these. And they're presented in just like matrices like this. I won't bore you with the details of
these matrices. You know, these are algebraic varieties. You can define these as, you know,
like intersections of polynomials. That's one way to present them. And in Kreuzer and Skarke's
database, they put in vertices of toric varieties. But the upshot is that,
you know, there's a database of many, many
108:30 - 109:00 gigabytes that really got done by the, certainly
by the turn of the century, by year 2000. These guys were running on Pentium machines. I mean,
this is an absolute feat. Especially Kreuzer and Skarke. They were able to get 500 million
of these Calabi-Yau manifolds stored on a hard drive using a Pentium machine. And
they were able to compute topological invariants
109:00 - 109:30 of these. So that's, so I happened to have this
database. I could access them. And that was kind of fun. And I've been playing on and off with
them for a number of years. So, and, you know, a typical calculation is like, you know, you
have something like a configuration tensor of, you know, integers, even just integers. And you
have some standard method in algebraic geometry to compute topological invariants. And this
topological invariants, again, in this dictionary
means something. So for example, h^{1,1} and h^{2,1}, in some
context, give the number of generations of fermions in the low energy world. So that's a concrete
problem, this computing of a topological invariant in algebraic geometry. And there are methods to
do it. And in these databases, you know, people took 10, 20 years to compile this database. And
you got these things in. And they're not easy. It's very complicated to compute these things. So
in 2017, I was playing around with this. And the
110:00 - 110:30 reason is very, why I was playing around with
this was very simple, is because my son was born. And I had infinite sleepless nights, that I
couldn't do anything, right? I had like, you know, there's the kid. And then, you know, there's the
kid, and, you know, he wakes you up at two, you put him back down, and, you know, and I was
bottle feeding him. And I also had a daughter at the time, so my wife was taking care of the
daughter; they're passed out. And then I got this
110:30 - 111:00 kid, he's passed out, I put him into bed, and I'm
wide awake at this point, it's like 2am. So like, I can't fall asleep anymore. And I can't do real,
you know, serious computation anymore, because I'm just too tired. So let's just play around with
data, the least I can let the computer help me to do something. And then that's when I learned,
well, you know, what's this thing that everybody's talking about? Well, you know, it's machine
learning. So that's why I got through this. It's
111:00 - 111:30 a very simple, very simple biological reason why
I was trying to learn machine learning. So then I think I was hallucinating at some point, right?
I was like, well, if you look at pictures, like, you know, matrices a lot, like, you know, we're
talking about, you know, 500 million of these things, right? Yes. Certainly, I wasn't going
through all of them. And they're being labeled by topological invariants. How different is it if
I just sort of pixelated one of these and labeled them by this? And all of a sudden, this began to
look like a problem in hand-digit recognition,
111:30 - 112:00 right? This is like, how different is this or
image recognition? So and I just literally started feeding in, I took 500, I mean, 500 million is
too much, right? So I took like 5,000 of these, 10,000 of these, and I trained them to look and
recognize this, to recognize this number. And I was like, this is going to be like, it's just
going to give crap. Obviously, it's going to
112:00 - 112:30 give 0% accuracy. And to my surprise, it was
giving extremely good accuracies. So somehow, the neural network that I was training, this is, I
was even using standard MNIST, you know, the handwritten digit recognition, MNIST-type things, recognizing this. And
it was recognizing it to great accuracy. And now, I mean, people have improved this, like loads of
people like, you know, Finatello, there's a group there that did some serious work on just trying
this problem. But this idea suddenly didn't seem
112:30 - 113:00 so crazy anymore. The idea seemed completely crazy
to me because I was hallucinating at 2am. But what's the upshot of this? The upshot is, somehow
the neural network was doing algebraic geometry, like this kind of algebraic geometry, really
sequence-chasing, very complicated Bourbaki-style stuff, without knowing anything about algebraic
geometry. It somehow was just doing pattern recognition, and somehow it's beating us. Because,
you know, if you do this computation seriously,
113:00 - 113:30 it's double exponential complexity. But it's just
now, by pattern recognition, it's bypassing all of that. So then I became a fanatic, right?
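A schematic version of that 2 a.m. experiment, in the spirit described here, might look like the following; this is a hedged sketch with synthetic placeholder matrices and labels, since the real inputs are the integer configuration matrices from the Calabi-Yau databases and the real labels are topological invariants such as Hodge numbers.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data standing in for integer configuration matrices and an
# integer-valued topological invariant; shapes and values are invented for illustration.
rng = np.random.default_rng(0)
n_samples, max_rows, max_cols = 5000, 12, 15
matrices = [rng.integers(0, 5, size=(rng.integers(1, max_rows + 1),
                                     rng.integers(1, max_cols + 1)))
            for _ in range(n_samples)]
labels = rng.integers(0, 20, size=n_samples)

def pixelate(m, shape=(max_rows, max_cols)):
    """Zero-pad a variable-size integer matrix to a fixed 'image' and flatten it."""
    img = np.zeros(shape, dtype=float)
    img[: m.shape[0], : m.shape[1]] = m
    return img.ravel()

X = np.stack([pixelate(m) for m in matrices])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=200)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))  # near-random on fake labels; the pipeline is the point
```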
Then I said, well, all of algebraic geometry is image processing. And so far, I have not been
shot down by the algebraic geometers, because it's actually true. If you really think about it,
the point of algebraic geometry, the reason I like algebraic more than differential is because
there's a very nice way to represent manifolds in
113:30 - 114:00 this way. Manifolds in algebraic geometry. So in
differential geometry, manifolds are defined in terms of Euclidean patches. Then you do transition
functions, which are differentiable, C infinity, blah, blah, blah. But in algebraic geometry,
they're just vanishing loci of polynomials. And then once you have systems of polynomials, you
have a very good representation. So for example, here, I'm just recording the list of polynomials,
the degrees of polynomials that are embedded
114:00 - 114:30 in some space. And that really is algebraic
geometry. So basically, any algebraic variety, so that's a fancy way of saying this polynomial
representation of a manifold, which is called an algebraic variety, this thing is representable in
terms of a matrix or a tensor, sometimes even an integer tensor. And then the computation
of invariants, topological invariants,
114:30 - 115:00 is a recognition problem on such tensors. But
once you have a tensor, you can always pixelate it and picturize it. At the end of the day, it's
doing this because it's just image processing algebraic geometry. Now, do you mean to say every
problem in algebraic geometry is an image processing problem? Almost. Is it an image processing problem, or just
problems involving invariants, or even broader than that? Well, I think it is
really more broad. I think at some level, I think
115:00 - 115:30 in my view, I try to say bottom-up mathematics is
language processing, and top-down mathematics is image processing. Interesting. Of course, this
is, I mean, take with a caveat, but of course, at some level, there is truth in what I say.
Of course, it's an extreme thing to say. But in terms of what mathematical discovery is, is that
you're trying to take a pattern in mathematics.
115:30 - 116:00 So in algebraic geometry, just a perfect example,
you can pixelate everything, and you can just try to see certain images have certain properties.
And so you're image processing mathematics, whereas bottom-up, you're building up mathematics
as a language. So it's language processing. And of course, all of this will be useless if you
can't actually get human-readable mathematics out of it. So this is the first surprise, the fact
that it's even doing it at all to a certain degree
116:00 - 116:30 of accuracy. Now we're talking about accuracy,
it's been improved to like 99.99 percent accuracy in these databases. But that's the first level,
that's the first surprise. The second surprise is that you can actually extract human-understandable
mathematics from it. And I think that's the next level surprise. So in the murmuration conjectures,
this beautiful work at DeepMind that Geordie Williamson's involved in, in this human-guided
intuition, you can actually get human mathematics
116:30 - 117:00 out of it, and that's really quite something. So
maybe that's a good point to break for part two, which is an advertisement of, you know,
here is like, we've gone through many, many things about what mathematics is, and to,
you know, how it got this through doing, you know, this interaction between algebraic geometry
and string theory. And then a second part would be how you can actually extrapolate and extract
mathematics, actual conjectures, things to prove
117:00 - 117:30 from doing this kind of experimentation, which are
summarized in these books. I keep on advertising my books because I get 50 pounds per year of,
what do they call it, royalties, you know, so I don't have to sell my liver for my kids. But it's
actually kind of fun. It's a complete, I mean, academic publishing is a good joke, right? You
get like, I don't know, like 100 pounds a year,
117:30 - 118:00 because you don't actually make money out of it.
But maybe that's a good place to break. And then for part two, how we try to formulate what the
Birch Test is for AI, which is sort of, you know, the Turing Test Plus. Because the Birch Test is
how to get actual meaningful human mathematics out of this kind of playing around with
mathematical data. I see two of your sentences that could become maxims for the future. One is
that machine-learning-assisted mathematics is the 22nd century's math
118:00 - 118:30 that fell into the 21st. The other is
that bottom-up mathematics is language processing, and top-down
is image processing. Yeah. I like those two. Yeah. Anyone who's watching, if you have questions
for Yang-Hui for part two, please leave them in the comments. Do you want to give just a brief
overview? Oh, yeah, sure. So just so I'm going to talk about what the Birch Test is, and which
papers so far have gone, how close they've gone
118:30 - 119:00 to the Birch Test. And then I'm going to talk
about some of the more recent experiments in number theory, and the one that I really enjoyed doing with
my collaborators, Lee, Oliver, and Pozdnyakov, which is to actually make something meaningful
that's related to the Birch and Swinnerton-Dyer conjecture, just by letting the machine go crazy
and finding a new pattern in elliptic curves, which is fundamentally a new pattern in the
prime numbers, which is completely amazing. You
119:00 - 119:30 mentioned Quanta earlier. So there's this Quanta feature
that covered this work and considered it one of the breakthroughs of 2024. Great. And that word
murmuration, which was used repeatedly throughout, was never defined, but it will be in part
two. Absolutely. I'm looking forward to it. Me too. Me too. Okay. Thank you so much. Thank
you. This has been wonderful. I could continue speaking to you for four hours. Both of us have
to get going, but that was so much fun. Pleasure.
119:30 - 120:00 Don't go anywhere just yet. Now I have a recap of
today's episode brought to you by The Economist. Just as The Economist brings clarity to complex
concepts, we're doing the same with our new AI-powered episode recap. Here's a concise summary
of the key insights from today's podcast. Alright, let's dive in. We're talking about Curt Jaimungal
and his deep dives into all things mind-bending. You know this guy puts in the hours, like weeks
prepping to grill guests like Roger Penrose on
120:00 - 120:30 some wild topics. Yeah, it's amazing using his own
background to dig in. Really challenging guests with his knowledge of mathematical physics pushes
them beyond the usual. Definitely. And today we're focusing on his chat with mathematician Yang-Hui
He. They're getting into AI, math, where those two worlds collide. And it's fascinating because it
really makes you think differently about how math works, how we do math, and where AI might fit into
the picture. You might think a mathematician's
120:30 - 121:00 life is all formulas and proofs, but Yang-Hui,
he actually started exploring AI-assisted math while dealing with sleepless nights with his
newborn son. It's such a cool example of finding inspiration when you least expect it. Tired but
inspired, he started messing around with machine learning in those quiet early morning hours. So
let's break down this whole AI and math thing. Yang-Hui, he talks about three levels of math.
Bottom-up, top-down, and meta. Bottom-up is like building with Legos. Very structured, rigorous
proofs. That's the foundation. But here's where
121:00 - 121:30 things get really interesting. It has limitations.
Right. And those limitations are highlighted by Gödel's incompleteness theorems. Basically, Gödel
showed us that even in perfectly logical systems, there will always be true statements that can't be
proven within that system. It's mind-blowing. So if even our most rigorous math has these inherent
limitations, it makes you think. Could AI discover truths that we as humans bound by our formal
systems might miss? Could it explore uncharted territory? That's a really deep thought. And it's
really at the core of what makes this conversation
121:30 - 122:00 revolutionary. It's not about AI just helping us
with math faster. It's about AI possibly changing how we think about math altogether. So how is this
all playing out? We've had computers in math for ages, from early theorem provers to AI assistants
like Lean. But where are we now with AI actually doing math? Well, AI is already making some big
strides. It's tackling Olympiad-level problems and doing it well. Which makes you ask, can AI really
unlock the secrets of math? And that leads us to
122:00 - 122:30 the big philosophical questions. Is AI really
understanding these mathematical ideas? Or is it just incredibly good at spotting patterns?
It's like that famous Chinese room thought experiment. You could follow rules to manipulate
Chinese symbols without truly understanding the language. Yang-Hui, he shared a story about Andrew
Wiles, the guy who proved Fermat's last theorem, trying to challenge GPT-3 with some basic math
problems. It highlights how early AI models, while excelling in tasks with clear rules and
plenty of examples, struggled with things that
122:30 - 123:00 needed real deep understanding. It seems like AI's
strength right now is in pattern recognition. And that ties into what Yang-Hui calls top-down
mathematics. It's where intuition and seeing connections between different parts of math are
king. Like Gauss. He figured out the prime number theorem way before we had the tools to prove it.
It shows how a knack for patterns can lead to big breakthroughs even before we have the rigorous
structure. It's like AI is taking that intuitive leap, seeing connections that might have taken
us humans years, even decades, to figure out.
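The Gauss example can be made concrete in a few lines: the prime-counting function pi(x) tracks x / ln x, and that empirical pattern was visible in tables of primes long before the prime number theorem was proved. The brute-force check below is only an illustration of the kind of pattern-spotting being described.

```python
import math

def primes_up_to(n):
    """Simple sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

# Compare the prime count pi(x) with Gauss's guess x / ln x.
for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = len(primes_up_to(x))
    approx = x / math.log(x)
    print(f"x={x:>8}  pi(x)={pi_x:>7}  x/ln x={approx:>10.0f}  ratio={pi_x/approx:.3f}")
```

The ratio creeping toward 1 is the pattern; the rigorous proof arrived roughly a century after the observation.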
123:00 - 123:30 And it's all because AI can deal with such massive
amounts of data. Which brings us back to Yang-Hui He's sleepless nights. He started thinking about
Calabi-Yau manifolds, super-complex mathematical objects key to string theory, as image-processing
problems. Wait, Calabi-Yau manifolds? Those sound like something straight out of science fiction.
They're pretty wild. Think six dimensions all curled up, nearly impossible to picture. They're
vital to string theory, which tries to bring all
123:30 - 124:00 the forces of nature together. Now, mathematicians
typically use these really abstract algebraic geometry techniques for this. But Yang-Hui? He had
a different thought. So instead of equations and formulas, he starts thinking about pixels. Yeah.
Like taking a Calabi-Yau manifold, breaking it down into a pixel grid like you do with an image.
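Building on the pixelation sketch earlier, here is a heavily simplified sketch of the end-to-end recipe being described: assume arrays X (flattened, pixelated configuration matrices) and y (a topological invariant such as a Hodge number) have been assembled from a public source like the CICY list. The random placeholder arrays and the small scikit-learn network below merely stand in for the real data and for whatever architecture was actually used.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Placeholder data with plausible shapes; replace with real (matrix, invariant)
# pairs from a public dataset. These random arrays are illustrative only.
rng = np.random.default_rng(0)
X = rng.random((500, 12 * 15))   # 500 pixelated 12x15 configuration "images", flattened
y = rng.integers(1, 20, 500)     # 500 integer invariants (e.g. Hodge numbers), illustrative

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A small stand-in network: train on known (variety, invariant) pairs,
# then score how well it predicts the invariant on held-out varieties.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X_train, y_train)
print("held-out R^2:", model.score(X_test, y_test))
```

On real (matrix, invariant) pairs, this train-and-score loop is the kind of experiment behind the accuracy figures mentioned in the conversation.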
He's taking abstract geometry and turning it into something a neural network built for image
recognition can handle. That is a radical change in how we think about this. It's like he's
making something incredibly abstract, tangible,
124:00 - 124:30 translating it for AI. Did it even work? The
results blew people away. He fed these pixelated manifolds into a neural network, and it predicted
their topological properties really accurately. He basically showed AI could do algebraic geometry
in a whole new way. So it's not just speeding up calculations. It's uncovering hidden patterns
and connections that might have stayed hidden, like opening a new way of seeing math. And that
leads us to the big question. If AI can crack open
124:30 - 125:00 complex math like this, what other secrets could
it unlock? We're back. Last time we were talking about AI not just helping us with math, but
actually coming up with new mathematical insights, which is where the Birch test comes in. It's
like, can AI go from being a supercalculator to actually being a math partner? Exactly. And now
we'll look at how researchers like Yang-Hui He are trying to answer that. Remember, the Turing
test was about a machine being able to hold a conversation like a human. The Birch test is a
whole other level. It's not about imitation. It's
125:00 - 125:30 about creating completely new mathematical
ideas. Think about Brian Birch back in the 60s. He came up with this bold conjecture
about elliptic curves, just from looking at patterns in numerical data. So this test wants AI to
do similar leaps, to go through tons of data, find patterns, and come up with conjectures that
push math forward. Exactly. Can AI, like Birch, show us new mathematical landscapes? That's asking
a lot. So how are we doing? Are there any signs AI
125:30 - 126:00 might be on the right track? There have been some
promising developments. Like in 2021, Davies and his team used AI to explore knot theory. Knots,
like tying your shoelaces. What's that got to do with advanced math? It's more complex than you
think. Knot theory is about how you can embed a loop in three-dimensional space, and it actually
connects to things like topology and even quantum physics. Okay, that's interesting. So how does AI
come in? Well, every knot has certain mathematical properties called invariants. It's kind of
like its fingerprint. Davies' team used machine
126:00 - 126:30 learning to analyze a massive amount of these
invariants. So was the AI just crunching numbers, or was it doing something more? What's amazing is
the AI didn't just process the data. It actually found hidden relationships between these
invariants, which led to new conjectures that mathematicians hadn't even considered
before. Like the AI was pointing the way to new mathematical truths. That's wild. Sounds
like AI is becoming a powerful tool to spot patterns our human minds might miss. Absolutely.
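As a toy version of that workflow (not the actual dataset or model from the published knot-theory work), the sketch below fits a simple regressor to predict one knot invariant from a few others and then asks which input carries the signal. The invariant names and the synthetic data are hypothetical placeholders; in the real setting, that attribution step is what points mathematicians toward a relationship worth turning into a conjecture.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000

# Hypothetical knot-invariant features (names and values are placeholders).
features = {
    "volume": rng.random(n) * 20,
    "cusp_shape_real": rng.normal(size=n),
    "injectivity_radius": rng.random(n),
}
X = np.column_stack(list(features.values()))

# Pretend the target invariant depends mostly on one feature, plus noise
# (purely illustrative; real data would come from a knot-invariant database).
y = 2.0 * features["cusp_shape_real"] + rng.normal(scale=0.1, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
model = GradientBoostingRegressor(random_state=1).fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 3))

# Which invariant drives the prediction? This is the step that suggests a conjecture.
for name, importance in zip(features, model.feature_importances_):
    print(f"{name:>20}: {importance:.2f}")
```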
Another cool example is Lample and Charton's work
126:30 - 127:00 in 2019. They trained AI on a massive data set of
math formulas. And what did they find? Well, this AI could solve symbolic math problems, like integrals,
by treating formulas as sequences of symbols. It was like the AI was learning the grammar of math and
could guess what might come next. So we might not have AI writing full-blown proofs yet, but it's
getting really good at understanding the structure of math and suggesting new directions. And that
brings us back to Yang-Hui He. His work with those Calabi-Yau manifolds, analyzing them as pixelated
forms, that was a huge breakthrough. Showed that
127:00 - 127:30 AI could take on algebraic geometry problems in
a totally new way. Like bridging abstract math in the world of data and algorithms. Exactly. And
that bridge leads to some really mind-bending possibilities. Yang-Hui He and his colleagues
started exploring something they call murmuration. Murmuration? Like birds? It's a great analogy.
Think of a flock of birds moving together like one. Each bird reacts to the ones around it, and
you get these complex, beautiful patterns. Right,
127:30 - 128:00 I get it. But how does it relate to AI and math?
Well, Yang-Hui He sees a parallel between how birds navigate together in a murmuration and how AI
can guide mathematicians towards new insights by sifting through tons of math data. So the AI
is like the flock, exploring math and showing us where things get interesting. Yeah, and they've
actually used this murmuration idea to look into a famous problem in number theory, the Birch
and Swinnerton-Dyer conjecture. That name sounds a bit intimidating. What's it all about? Imagine
a donut shape, but in the world of numbers. These
128:00 - 128:30 are called elliptic curves. Mathematicians
are obsessed with finding rational points on these curves. Points where the coordinates can be
written as fractions. Okay, I'm following so far. The Birch and Swinnerton-Dyer conjecture basically
says there's this deep connection between how many of these rational points there are and a specific
function attached to the curve, its L-function, linking the geometry of these curves to number theory. Things are definitely
getting complex now. And it's a big deal in math. It's actually one of the Clay Mathematics
Institute's Millennium Prize problems. Solve it,
128:30 - 129:00 you win a million bucks. Now that's some serious
math street cred. So how did Yang-Hui He's team use AI for this? They trained an AI on this massive
data set of elliptic curves and their functions. The AI didn't actually solve the whole conjecture,
but it found this new pattern, this correlation that mathematicians hadn't noticed before. So the
AI was like a digital explorer, mapping out this math territory and showing mathematicians what
to look at more closely. Exactly. This discovery,
129:00 - 129:30 while not a complete proof, gives more support
to the conjecture and opens up some exciting new areas for research. It shows how AI can help with
even the hardest problems in mathematics. It feels like we're on the edge of something new in math.
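For a sense of the raw data behind that discovery, here is a minimal sketch: for a curve y^2 = x^3 + ax + b, the coefficient a_p = p + 1 - #E(F_p) can be computed by brute force, and the murmuration-style signal appears when a_p is averaged over a large family of curves grouped by rank and plotted against p. Ranks are not computed here (that needs a database such as the LMFDB), so the tiny family below is a hypothetical stand-in assumed to share a rank.

```python
from sympy import primerange

def a_p(a, b, p):
    """a_p = p + 1 - #E(F_p) for y^2 = x^3 + a*x + b, counting points naively.
    Assumes p does not divide the curve's discriminant (good reduction)."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        # add the number of y in F_p with y^2 = rhs
        count += sum(1 for y in range(p) if (y * y) % p == rhs)
    return p + 1 - count

# A hypothetical family of (a, b) pairs assumed to share the same rank;
# in practice this list would come from a curated database of curves.
family = [(-1, 1), (0, 4), (2, 3)]
primes = list(primerange(3, 200))
averages = [sum(a_p(a, b, p) for a, b in family) / len(family) for p in primes]
print(list(zip(primes, [round(v, 2) for v in averages]))[:10])
```

With thousands of curves per rank class instead of three, those averages oscillate in the wave-like pattern that gave murmurations their name.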
AI is not just a tool, it's a partner in figuring out the truth. What does all this mean for math
in the future? That's a great question, and it's something we'll dig into in the final part of
this deep dive. We'll look at the philosophical and ethical stuff around AI in math. We'll ask if
AI is really understanding the math it's working
129:30 - 130:00 with, or if it's just manipulating symbols in a
really fancy way. See you there. Welcome back to our deep dive. We've been exploring how AI is
changing the game in math, from solving tough problems to finding hidden patterns in complex
structures. But what does it all mean? What are the implications of all of this? We've touched
on this question of understanding. Does AI really understand the math it's dealing with, or is it
just a master of pattern matching? Yeah, we can get caught up in the cool stuff AI is doing, but
we can't forget about those implications. If AI
130:00 - 130:30 is going to be a real collaborator in mathematics,
this whole understanding question is huge. It goes way back to the Chinese room thought experiment.
Imagine someone who doesn't speak Chinese has this rulebook for moving Chinese symbols around. They
can follow the rules to make grammatically correct sentences, but do they actually get the meaning?
So is AI like that, just manipulating symbols in math without grasping the deeper concepts?
That's the big question, and there's no easy
130:30 - 131:00 answer. Some people say that because AI gets
meaningful results, like we've talked about, it shows some kind of understanding, even if it's
different from how we understand things. Others say AI doesn't have that intuitive grasp of math
concepts that we humans have. It's a debate that's probably going to keep going as AI gets better
and better at math. Makes you wonder how it's going to affect the foundations of mathematics
itself. That's a key point. Traditionally, mathematical proof has been all about logic,
building arguments step by step using established axioms and theorems. But AI brings something
new, inductive reasoning, finding patterns
131:00 - 131:30 and extrapolating from those patterns. So could we
see a change in how mathematicians approach proof? Could we move toward a way of doing math that's
driven by data? It's possible. Some mathematicians are already using AI as a partner in the proving
process. AI can help generate potential theorems or find good strategies for tackling conjectures.
But others are more cautious, worried that relying too much on AI could make math less rigorous,
more prone to errors. It's like with any new tool.
131:30 - 132:00 There's good and bad. Finding that balance is
important. We need to be aware of the limitations and not rely on AI too much. Right. And as AI
becomes more important in math, it's crucial to have open and honest conversations. We need
to talk about what AI means, not just for math, but for everything we do. It's not just about the
tech. It's about how we choose to use it. We need to make sure AI helps humanity and the benefits
are shared. That's everyone's responsibility. A responsibility that goes way beyond just
mathematicians and computer scientists. We need
132:00 - 132:30 philosophers, ethicists, social scientists, and
most importantly, the public. We need all sorts of voices and perspectives to guide us as we go into
this uncharted territory. This has been an amazing journey into the world of AI and math. From
sleepless nights to those mind-bending manifolds, we've seen how AI is pushing the boundaries of
what's possible. And as we wrap up, we encourage you to keep thinking about these things. What does
it really mean for a machine to understand math?
132:30 - 133:00 How will AI change the way we prove things and
make discoveries in math? How can we make sure we're using AI responsibly and ethically in our
search for knowledge? These are tough questions, but they're worth asking. The future of
mathematics is being shaped right now, and AI is a major player. Thanks for joining us
on this deep dive. We'll catch you next time, ready to explore some other fascinating corner of
the universe of knowledge. New update! Started a substack. Writings on there are currently about
language and ill-defined concepts, as well as
133:00 - 133:30 some other mathematical details. Much more being
written there. This is content that isn't anywhere else. It's not on Theories of Everything, it's not
on Patreon. Also, full transcripts will be placed there at some point in the future. Several people
ask me, Hey Curt, you've spoken to so many people in the fields of theoretical physics, philosophy,
and consciousness. What are your thoughts? While I remain impartial in interviews, this
substack is a way to peer into my present deliberations on these topics. Also, thank you
to our partner, The Economist. Plus, it helps
133:30 - 134:00 out Curt directly, aka me. I also found out last
year that external links count plenty toward the algorithm, which means that whenever you share on
Twitter, say, on Facebook, or even on Reddit, etc.,
134:00 - 134:30 it shows YouTube, Hey, people are talking about
this content outside of YouTube, which in turn greatly aids the distribution on YouTube. Thirdly,
there's a remarkably active discord and subreddit for Theories of Everything, where people explicate
TOEs, they disagree respectfully about theories, and build, as a community, our own TOE. Links
to both are in the description. Fourthly, you should know this podcast is on iTunes, it's
on Spotify, it's on all of the audio platforms.
134:30 - 135:00 All you have to do is type in Theories of
Everything and you'll find it. Personally, I gain from re-watching lectures and podcasts. I
also read in the comments that, Hey, TOE listeners also gain from replaying. So how about instead
you re-listen on those platforms, like iTunes, Spotify, Google Podcasts, whichever podcast
catcher you use. And finally, if you'd like to support more conversations like this, more content
like this, then do consider visiting patreon.com
135:00 - 135:30 slash CURTJAIMUNGAL and donating with whatever you
like. There's also PayPal, there's also crypto, there's also just joining on YouTube. Again,
keep in mind, it's support from the sponsors and you that allow me to work on TOE full-time.
You also get early access to ad-free episodes, whether it's audio or video. It's audio in the
case of Patreon, video in the case of YouTube. For instance, this episode that you're listening
to right now was released a few days earlier. Every dollar helps far more
than you think. Either way, your viewership is generosity
enough. Thank you so much.