Exploring the Limits of AI

Yann LeCun: Human Intelligence is not General Intelligence // AI Inside 63

Estimated read time: 1:20


    Summary

In this illuminating episode of the "AI Inside" podcast, Yann LeCun, Chief AI Scientist at Meta and Turing Award winner, discusses the current limitations and future directions of AI technology, particularly focusing on Large Language Models (LLMs). LeCun provides insights into why Artificial General Intelligence (AGI) isn't just around the corner, emphasizing the need for machines that truly understand and interact with the physical world. He critiques the hype surrounding LLMs, asserting that while they're useful, they fall short of representing a comprehensive model of intelligence. Instead, he advocates for a future where AI progresses through open collaboration, akin to the early days of the internet, and stresses diversity in AI systems to truly capture global perspectives.

      Highlights

      • Yann LeCun, often dubbed the 'godfather of AI,' sheds light on why LLMs like ChatGPT are not the ultimate path to AGI 📚.
      • LeCun emphasizes the importance of AI that understands and interacts with the real world 🌍.
      • He argues that human intelligence is specialized, not general, challenging the notion of AGI as a close reality 🔍.
      • Meta's strategy of open-sourcing LLAMA is aimed at democratizing AI development and fostering faster innovation 🌐.
      • LeCun envisions a future where AI assistants aid daily life, but notes achieving human-level intelligence is a significant challenge 🌟.

      Key Takeaways

      • LLMs are useful tools, but they aren't the breakthrough to general intelligence 🎯.
      • AI needs to understand the physical world and have a world model to advance 🚀.
      • True intelligence isn't just about solving known problems but adapting to new ones 🌟.
      • Open-source AI, like Meta's LLAMA, opens doors for innovation beyond a few big players 🤝.
      • We shouldn't feel threatened by AI; it should empower us, working as our 'smart assistants' 🔧.

      Overview

Yann LeCun, the Chief AI Scientist at Meta, candidly discusses the real-world applications and limitations of Large Language Models on the AI Inside podcast. He shares his skepticism about the prevalent notion that AI is on the brink of achieving Artificial General Intelligence.

        LeCun proposes a forward-thinking perspective that AI systems need to deeply understand and model the physical world to reach new heights. In his view, creating open-source platforms like Meta's LLAMA is pivotal. This openness is intended to democratize AI research, accelerating innovation through global collaboration.

          Looking to the future, LeCun imagines a world where AI assistants become integral companions in daily life. However, he maintains that reaching human-level AI isn't just around the corner—it requires groundbreaking advancements in understanding and technology.

            Chapters

• 00:00 - 01:30: Introduction and Guest Welcome. The chapter begins with an introduction to the main topics and themes that will be covered in the discussion. The host warmly welcomes the guest, outlining his background and expertise related to the subject matter. The conversation is set to explore various aspects of the topic, promising an in-depth and insightful dialogue.
• 01:30 - 05:00: Discussion on LLMs and AI Reliability. In this chapter, Jason welcomes Yann LeCun, the chief AI scientist at Meta, to discuss topics related to Large Language Models (LLMs) and the reliability of AI systems.
• 05:00 - 10:00: Limits of AI and Future Research Directions. This chapter features a dialogue with a Turing Award winner, often referred to as the godfather of AI. It opens with a welcome to Yann, setting the stage for a discussion centered on the current limitations of artificial intelligence and potential avenues for future research in the field. The presence of a prominent figure like Yann suggests that the conversation will offer deep insights into AI challenges and innovations.
• 10:00 - 15:00: Challenges in Achieving Human-Level Intelligence. The chapter opens with a conversation between Jason and Yann, where Jason refers to Yann as the 'godfather of AI.' Yann modestly responds by saying he shuts his ears to avoid turning red, indicating his humility despite the recognition.
• 15:00 - 30:00: Meta's Open Source Strategy and AI Future. The chapter discusses the dynamics of the AI industry, specifically focusing on the differing opinions on the effectiveness and potential of Large Language Models (LLMs). Despite OpenAI securing substantial funding on the strength of its LLM technology, there are concerns about diminishing returns. The chapter explores why some companies continue to invest heavily in generative AI and LLMs, possibly overlooking the limitations highlighted by certain experts.
• 30:00 - 45:30: Reflecting on Education and AI's Potential. The chapter discusses the current and potential future applications of Large Language Models (LLMs) in various fields, particularly in coding and as AI assistants. Yann acknowledges the usefulness of these models but points out that they are not yet fully reliable, especially when considering their application as agentic systems.

            Yann LeCun: Human Intelligence is not General Intelligence // AI Inside 63 Transcription

            • 00:00 - 00:30
            • 00:30 - 01:00 Jason: I am thrilled to welcome to AI Inside  Yann LeCun, chief AI scientist at Meta,
            • 01:00 - 01:30 Turing Award winner, known by many as  the godfather of AI. Welcome to the show,   Yann. It's really nice to meet you. Yann: Thanks for having me on.
            • 01:30 - 02:00 Jason: Does it ever get old  hearing someone introduce   you as the godfather of AI? Are you  kind of like, yeah, here we go again. Yann: I shut my ears so I don't turn red. Jason: But you can accept it at this point because  it's the truth. Well, there's so many different   directions that we can go on this conversation.  And we end up talking about your work a lot,   obviously, the work of Meta and this moment in  LLM. I think the kind of question that I have to   kick things off is that we are so firmly implanted  into the current realm of artificial intelligence   which really seems to be the LLM generation, and  there's probably something on the horizon around
            • 02:00 - 02:30 that. But we're still firmly implanted there, and  you've been pretty opinionated on the limits of   LLM at a time when we're also seeing things like  OpenAI securing a record breaking round of funding   largely built on its success in LLM technology.  And so I see diminishing returns on one side.   On the other, companies betting everything on  generative AI and LLM. And I'm curious to know   what you think as far as why they might not be  seeing what you're seeing about this technology.
            • 02:30 - 03:00 Or maybe they are and they're just approaching  it differently. What are your thoughts there? Yann: Oh, maybe they are. There's no  question that LLMs are useful. I mean,   particularly for coding assistants and  stuff like that. And in the future,   probably for more general AI assistants, jobs.  People are talking about agentic systems. It's still not totally reliable yet. For these  kinds of applications, the main issue, and it's
• 03:00 - 03:30 been a recurring problem with AI and computer technology more generally, is the fact that you can see impressive demos. But when it comes time to actually deploy a system that's reliable enough that you put it in the hands of people and they use it on a daily basis, there's a big distance. It's much harder to make those systems reliable enough. Right? Ten years ago, we were seeing demos of cars driving themselves in the countryside, streets, for about ten minutes before you had to intervene. And we made a lot of progress
• 03:30 - 04:00 but we're still not to the point of having cars that can drive themselves as reliably as humans, except if we cheat, which is fine, which is what Waymo and others are doing. So there's been sort of a repeated history over the last seventy years in AI of people coming up with a new paradigm and then claiming, okay, that's it. This is going to take us to human level AI. Within ten years
• 04:00 - 04:30 the most intelligent entity on the planet would be a machine. And every time it's turned out to be false because the new paradigm either hits a limitation that people didn't see or turns out to be really good at solving a subcategory of problems that didn't turn out to be the general intelligence problem.
• 04:30 - 05:00 And so there's been generation after generation of AI researchers, and industrialists, and founders, making those claims, and they've been wrong every time. So, I don't want to poo-poo LLMs. They're very useful. There should be a lot of investment in them. There should be a lot of investment in infrastructure to run them, which is where most of the money is going, actually. It's not to train them or anything. It's to run them in the end, serving billions of users potentially. But,
            • 05:00 - 05:30 like every other computer technology, it can be  useful even if it's not human level intelligence. Now if we want to shoot for human  level intelligence, I think we should.   We need to invent new techniques.  We're just nowhere near matching that. Jeff: I'm really grateful you're here, Yann,   because I quote you constantly on this show and  elsewhere because you are the voice of realism,   I think, in AI. And I don't hear you  spouting the hype that I hear elsewhere.
• 05:30 - 06:00 And you've been very clear about where we are now. I think you've equated us to, maybe we're getting to the point of a smart cat or a three-year-old. Yann: Not even ***** Jeff: Right. And I think you've also talked about that we've hit the limits of what LLMs can do. So there is a next paradigm, a next leap. And I think you've talked about that being about understanding reality better. But can you talk about where you think research, where you're taking it
• 06:00 - 06:30 or where it should be going next? Where we should be putting resources next, to get more out of AI? Yann: So I wrote a long paper three years ago where I explained where I think AI research should go over the next ten years. This was before the world learned about LLMs. And, of course, I knew about it because we were working on it before. But this vision hasn't changed. It has not been affected by the success of LLMs.
            • 06:30 - 07:00 So here's the thing. We need machines  that understand the physical world.   We need machines that are capable of  reasoning and planning. We need machines   that have persistent memory. And we need  those machines to be controllable and safe,   which means that they need to be driven by  objectives we give them. We give them a task,   they accomplish it, or they give us  the answer to the question we ask,   and that's it. Right? And they can't escape  whatever it is that we're asking them to do.
            • 07:00 - 07:30 So, what I explained in that document  is how we might potentially one way   get to that point. And it's centered on a  central concept called "world model." So   we all have world models in our head. And  animals do too, right? And it's basically   the mental model that we have in our head that  allows us to predict what's going to happen in   the world. Either because the world is being  the world or because of an action we might
• 07:30 - 08:00 take. So if we can predict the consequences of our actions, then what we can do with that is, if we set ourselves an objective, a goal, a task to accomplish, we can, using our world model, imagine whether a particular sequence of actions will actually fulfill that goal. Okay? And that allows us to plan. So planning and reasoning really is manipulating our
            • 08:00 - 08:30 mental model to figure out if a particular  sequence of actions is going to accomplish   a task that we set for ourselves. Okay? So that is  what psychologists call "System 2." A deliberate,   I don't want to say conscious  because it's a loaded term, but   deliberate process of thinking about  how to accomplish a task, essentially.
            • 08:30 - 09:00 And that we don't know how to do really. I mean,   we're making some progress at the research level.  A lot of the most interesting research in that   domain is done in the context of robotics.  Because when you need to control a robot,   you need to know in advance what the effect  of sending a torque on an arm is going to be. So this process, in fact, in control theory  and robotics, of imagining the consequences
• 09:00 - 09:30 of a sequence of actions and then basically, by optimization, searching for a sequence of actions that satisfies the task, even has a name, even has an acronym. It's called Model Predictive Control, MPC. It's a very classical method in optimal control going back decades. The main issue with this is that in robotics and control theory, the way this works, the world model is a bunch of equations that are written by someone, by an engineer. If you only control a robot arm or a rocket or something, you can just write down the dynamical equations of it.
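To make that concrete, here is a minimal, illustrative sketch of the Model Predictive Control idea described above: roll a hand-written world model forward over many candidate action sequences and keep the sequence that best reaches the goal. The toy point-mass dynamics, the random-shooting search, and names like dynamics, rollout_cost, and plan are assumptions made for illustration, not code from any particular system.

```python
# A minimal MPC sketch (random shooting), assuming a toy 1-D point mass.
# The hand-written `dynamics` function stands in for the engineer-specified
# world model mentioned above.
import numpy as np

def dynamics(state, action, dt=0.1):
    """Hand-written world model: position and velocity of a point mass."""
    pos, vel = state
    vel = vel + action * dt            # the action is an acceleration (torque-like input)
    pos = pos + vel * dt
    return np.array([pos, vel])

def rollout_cost(state, actions, goal_pos):
    """Imagine the consequences of an action sequence and score how well it reaches the goal."""
    cost = 0.0
    for a in actions:
        state = dynamics(state, a)
        cost += (state[0] - goal_pos) ** 2 + 0.01 * a ** 2   # distance to goal + effort penalty
    return cost

def plan(state, goal_pos, horizon=20, n_candidates=500, seed=0):
    """Search over random candidate action sequences and keep the best one."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    costs = [rollout_cost(state, seq, goal_pos) for seq in candidates]
    return candidates[int(np.argmin(costs))]

state = np.array([0.0, 0.0])               # start at rest at position 0
best_actions = plan(state, goal_pos=1.0)
print("first planned action:", best_actions[0])
```

In a real MPC loop only the first planned action would be executed before re-planning from the newly observed state. The point made next in the conversation is that for general AI this dynamics function cannot be hand-written; it has to be learned from observation.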
• 09:30 - 10:00 But what we need to do for AI systems, we need this world model to be learned from experience or learned from observation. So this is the kind of process that seems to be taking place in the minds of animals and maybe humans, infants, learning how the world works by observation. That's the part that seems really complicated to reproduce. Now, this can be based on a very simple principle which people have been playing with for a long
            • 10:00 - 10:30 time without much success, called Self-supervised  Learning. And Self-supervised Learning has been   incredibly successful in the context of natural  language understanding and LLMs and things like   that. In fact, it's the basis of LLM. Right?  So you take a piece of text, and you train   a big neural net to predict the next word in the  text. Okay? That's basically what it comes out to.
• 10:30 - 11:00 There are tricks for how to make this efficient and everything. But that's the basis of LLMs. You just train it to predict the next word in the text. And then when you use it, you have it predict the next word, shift the predicted word into its viewing window, and then predict the second word, then shift that in, predict the third. Right? That's autoregressive prediction. That's what LLMs are based on. And the trick is how much money you can afford to hire people to fine-tune it so it can answer questions correctly. Which is what a lot of money is going into right now.
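The autoregressive loop described here can be sketched in a few lines. The TinyLM class below is a stand-in for a trained next-token predictor, and greedy decoding is used for simplicity; both are assumptions made for illustration, not any production system.

```python
# Sketch of autoregressive generation: predict the next token, shift it into
# the context window, and repeat. `TinyLM` is a fake model used only to make
# the loop runnable.
import numpy as np

class TinyLM:
    """Placeholder language model: returns a probability distribution
    over a tiny vocabulary given the current context."""
    vocab = ["<eos>", "the", "cat", "sat", "on", "mat"]

    def next_token_probs(self, context_ids):
        rng = np.random.default_rng(len(context_ids))   # fake but deterministic scores
        logits = rng.normal(size=len(self.vocab))
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()                          # softmax over the vocabulary

def generate(model, prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        probs = model.next_token_probs(ids)   # distribution over possible next tokens
        next_id = int(np.argmax(probs))       # greedy pick (real systems usually sample)
        ids.append(next_id)                   # shift the prediction into the window
        if model.vocab[next_id] == "<eos>":
            break
    return ids

model = TinyLM()
out = generate(model, prompt_ids=[1, 2])      # context: "the cat"
print(" ".join(model.vocab[i] for i in out))
```

The "window shifting" in the conversation corresponds to appending each chosen token to the context before predicting the next one; production systems sample from the distribution rather than always taking the most likely token.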
            • 11:00 - 11:30 So you could imagine using this principle  of Self-supervised Learning for learning   representations of images, learning to predict  what's going to happen in a video. Right? So if   you show a video to a computer, and train some big  neural net to predict what's going to happen next   in a video, if the system is capable of learning  this and doing a good job at that prediction,   it will probably have understood a  lot about the underlying nature of
• 11:30 - 12:00 the physical world. It learns that objects move according to particular laws, right? So animate objects can move in ways that are more unpredictable but still satisfy some constraints. You're going to have objects that are not supported fall because of gravity, etcetera. Right? Now, human babies take nine months to learn about gravity. It's a long process.
• 12:00 - 12:30 Young animals, I think, learn this much quicker, but they don't have the same kind of grasp of what gravity really is in the end. Although cats and dogs are really good at this, obviously. So how do we reproduce this kind of training? So if we do the naive thing, which I've been working on for twenty years, doing a similar thing as with a piece of text but just taking a video and then training a system to predict what happens next in the video, it doesn't really work. So if you're
            • 12:30 - 13:00 training to predict the next frame, it doesn't  learn anything useful because it's too easy. If you're training to predict longer term, it  really cannot predict what's going to happen in   the video because there's a lot of plausible  things that might happen. Okay? So in the   case of text, that's a very simple problem  because you only have a finite number of   words in the dictionary. And so you can never  predict exactly what word follows a sequence,   but you can predict a probability distribution  of all words in the dictionary. And that's good
            • 13:00 - 13:30 enough. You can represent uncertainty in the  prediction. You can't do this with video. We do   not know how to represent appropriate probability  distribution over the set of all images or video   frames or a video segment, for that matter. It's  actually a mathematically intractable problem. So it's not just a question of we don't have  big enough computers. It's just intrinsically   intractable. So until maybe five, six  years ago, I didn't have any solution
            • 13:30 - 14:00 to this. I don't think anybody had any  solution to this. And one solution that   we came up with is a kind of architecture  that changes the way we would do this. Instead of predicting everything that happens  in the video, we basically train a system to   learn a representation of the video, and we  make the prediction in that representation   space. And that representation eliminates  a lot of details in the video that are just   not predictable or impossible to figure out.  That kind of architecture is called a JEPA,
• 14:00 - 14:30 Joint Embedding Predictive Architecture. And I can tell you a little bit about that later. But what may be surprising about this is that it's not generative. So everybody is talking about generative AI. My hunch is that the next generation of AI systems will be based on non-generative models, essentially. Jason: So, what occurs to me in hearing you talk about the real limitations of where we're at when we take a look at what everybody seems to be claiming is so great about LLM and that "we're right on the precipice of AGI, Artificial General Intelligence, and here's the reason
• 14:30 - 15:00 why." It depends on who you ask. Right? Some people are like, "it's right around the corner." Other people are like, "oh, it's already here. Take a look at this. Isn't that amazing?" Jeff: Others are "it'll never be here." Jason: Yes. And then others are, "it'll never be here." I think often on this show we talk about this topic a little bit in disbelief, and I think what you just said kind of punctuates that a little bit for me. How do you model around or create a model that can really analyze all aspects of what you're talking about? Like, we've got LLMs focusing on reasoning. Although maybe it's a different type of reasoning compared to what we are looking at now. Maybe that is actual reasoning in the way that humans reason. But then you've got the physical world. You've got the planning, this persistent memory. All those components that you talk about, when you put it that way, it really makes me more confident that AGI is not right around the corner, that AGI is
• 15:00 - 15:30 really this distant theory that may never come true or, at least, not for a very, very long time down the road? What are your thoughts on that? Yann: Okay. So, first of all, there is absolutely no question in my mind that at some point in the future, we'll have machines that are at least as smart as humans in all the domains where humans are smart. Okay? That's not a question. People have kind of had big philosophical questions about this. A lot of people still believe that human nature is kind of impalpable, and we're never going to be able to reduce this to computation.
• 15:30 - 16:00 I'm not a skeptic on that dimension. There's no question in my mind that at some point, we'll have machines that are more intelligent than us. They already are in narrow domains. Right? So then there is the question of what AGI really means. What do you mean by general intelligence? Do you mean intelligence that is as general as
            • 16:00 - 16:30 human intelligence? If that's the case, then  okay, you can use that phrase, but it's very   misleading because human intelligence is not  general at all. It's extremely specialized.   We are shaped by evolution to only do the tasks  that are worth accomplishing for survival. And,   we think of ourselves as having general  intelligence, but we're just not at all general. It's just that all the problems that we're not  able to apprehend, we can't think of them. And
• 16:30 - 17:00 so that makes us believe that we have general intelligence, but we absolutely do not have general intelligence. Okay. So I think this phrase is nonsense, first of all. It is very misleading. The phrase we prefer to use to designate the concept of human level intelligence within Meta is AMI, Advanced Machine Intelligence. Okay? This is
            • 17:00 - 17:30 kind of a much more open concept.  We actually pronounce it "ami",   which means friend in French. But let's call  it human level intelligence if you want. Right? So no question it will happen. It's  not going to happen next year. It's not going   to happen two years from now. It may happen or  happen to some degree within the next ten years. Okay. So it's not that far away. If all of the  things that we are working on at the moment turn
            • 17:30 - 18:00 out to be successful, then maybe within ten years,  we'll have a good handle on whether we can reach   that goal. Okay. But it's almost certainly harder  than we think. And probably much harder than we   think because it's always harder than we think.  Over the history of AI, it's always been harder   than we think. You know, it's the story I was  telling you earlier. So, I'm optimistic. Okay?
• 18:00 - 18:30 I'm not one of those pessimists who say we'll never get there. I'm not one of those pessimists that says all the stuff we're doing right now is useless. It's not true. It's very useful. I'm not one of those people who say we're going to need some quantum computing or some completely new principle, blah blah blah. No. I think it's going to be based on deep learning, basically. And that underlying principle, I think, is going to stay with us for a long time. But within this domain, the type of things that we need to
            • 18:30 - 19:00 discover and implement, we're not there  yet. We're missing some basic concepts. And the best way to convince yourself of this is  to say, okay. We have systems that can answer any   question that has a response somewhere on the  Internet. We have systems that can pass the bar   exam, which is basically information retrieval  to a large extent. We have systems that can
• 19:00 - 19:30 shorten the text and help us understand it. They can criticize a piece of writing that we're doing, they can generate code. But generating code is actually, to some extent, relatively simple because the syntax is strong, and a lot of it is stupid. Right? We have systems that can solve equations, that can solve problems as long as they've been trained to solve those problems. If they see a new problem from scratch,
            • 19:30 - 20:00 current systems just cannot find a solution.  There was actually a paper just recently that   showed that if you test all the best LLMs on  the latest math Olympiad, they basically get   zero performance, because there are new problems  they have not been trained to solve. So okay. So we have those systems that can manipulate  language, and that fools us into thinking   that they are smart because we're  used to smart people being able to
• 20:00 - 20:30 manipulate language in smart ways. Okay. But where is my domestic robot? Where is my level-five self-driving car? Where is a robot that can do what a cat can do? Even a simulated robot that can do what a cat can do. Right? And the issue is not that we can't build a robot. We can actually build robots that have the physical abilities.
            • 20:30 - 21:00 It's just that we don't know how to  make them smart enough. And it's much,   much harder to deal with the real world and  to deal with systems that produce actions than   to deal with systems that understand language.  And again, it's related to the part that I   was mentioning before. Language is  discrete. It has strong structure. The real world is a huge mess,  and it's unpredictable. It's   not deterministic. You know? It's  high dimensional. It's continuous.
• 21:00 - 21:30 It's got all the problems. So let's try to build something that can learn as fast as a cat, first of all. Jeff: I've got so many questions for you, but I'm going to stay on this for another minute. Should human-level activity or thought even be the model? Is that limiting? There's a wonderful book from some years ago by Alex Rosenberg called How History Gets Things Wrong, in which he debunks the theory of mind, arguing that we don't have this reasoning that we go through.
• 21:30 - 22:00 That in fact, we're kind of doing what an LLM does in the sense that we have a bunch of videotapes in our head. And when we hit a circumstance, we find the nearest videotape and play that and decide yes or no in that way. And so that does sound like the human mind a bit. But the model we tend to have for the human mind is one of reasoning and weighing things and so on. And, also, as you say, we are not generally intelligent, but the machine conceivably could do things that we cannot; right now it already does things we cannot do. It could do more. So when you think about
• 22:00 - 22:30 success and that goal, what is that model? A cat would be a big victory, to get to the point of being a cat. But what's your larger goal? Is it human intelligence, or is it something else? Yann: Well, it's a type of intelligence that is similar to human and animal intelligence in the following way. Current AI systems have a very hard time solving new problems
• 22:30 - 23:00 that they've never faced before. Right? So they don't have this mental model, this world model I was telling you about earlier, that allows them to kind of imagine what the consequences of their actions would be, or whatever. They don't reason in that way. Right? I mean, an LLM certainly doesn't because the only way it can do anything is just produce words, produce tokens. Right? So one way you trick an LLM into spending more time thinking about a question,
• 23:00 - 23:30 a complex question than a simple question, is you ask it to go through the steps of reasoning. And as a consequence, it produces more tokens and then spends more computation answering that question. But it's a horrible trick. It's a hack. It's not the way humans reason. Another thing LLMs do is, for writing code or answering questions,
• 23:30 - 24:00 you get an LLM to generate lots and lots of sequences of tokens that have some decent level of probability or something like that. And then you have a second neural net that sort of tries to evaluate each of those and then picks the one that is best. Okay? It's sort of like producing lots and lots of answers to a question and then having a critic tell you which of those answers is the best. Now there are a lot of AI systems that work this
• 24:00 - 24:30 way, and it works in certain situations. If you want a system, your computer system, to play chess, that's exactly how it works. It produces a tree of all the possible moves from you, and then from your opponent, and then from you, and then from your opponent. That tree grows exponentially. So you can't generate the entire tree. You have to have some smart way of only generating a piece of the tree. And then you have what's called an evaluation function or value function that picks
• 24:30 - 25:00 out the best branch in the tree that results in a position that is most likely to win. And all of those things are trained nowadays. Okay? They're basically neural nets that generate the good branches in the tree and select them. That's a limited form of reasoning. Why is it limited? And it's, by the way, a type of reasoning that humans are terrible at. The fact that a $30 gadget that you buy at a toy
• 25:00 - 25:30 store can beat you at chess demonstrates that humans totally suck at this kind of reasoning. Okay? We're just really bad at it. We just don't have the memory capacity, the computing speed, and everything. Right? So we're terrible at this. What we are really good at, though, the kind of reasoning that cats and dogs and rats are also really good at, is sort of planning actions in the real world and planning them
• 25:30 - 26:00 in a hierarchical manner. So knowing that if we want to, let me take an example in the human domain, but there are similar ones in sort of animal tasks. Right? I mean, you see cats learning to open jars and jump on doors to open them and open the lock of a door and things like that. So, they learn how to do this, and they learn how to plan that sequence of actions to arrive at a goal,
• 26:00 - 26:30 which is getting to the other side, perhaps to get food or something. You see squirrels doing this. Right? I mean, they're pretty smart actually in the way they learn how to do this kind of stuff. Now this is a type of planning that we don't know how to reproduce with machines. And all of it is completely internal. It has nothing to do with language. Right? We think, as humans, we think that thinking is related to language, but it's not. Animals can think. People who don't talk can think.
• 26:30 - 27:00 And there are types of reasoning. Most types of reasoning have nothing to do with language. So if I tell you, imagine a cube floating in the air in front of you or in front of us, okay? Now rotate that cube 90 degrees along a vertical axis. So probably, you made the assumption that the cube was horizontal, that the bottom was horizontal. You didn't imagine a cube that was kind of sideways. And then you rotate it 90 degrees, and you know that it looks just like the cube you started with because it's a
• 27:00 - 27:30 cube. It's got 90-degree symmetry. There's no language involved in this reasoning. It's just, you know, images and sort of abstract representations of the situation. And how do we do this? Like, we have those abstract representations of thought, and then we can manipulate those representations through sort of virtual actions that we imagine taking, like rotating the cube,
            • 27:30 - 28:00 and then imagine the result. Right? And that  is what allows us to actually accomplish tasks,   in the real world, at an abstract level. It doesn't matter what the cube is made  of, how heavy it is, whether it floats in   front of us or not. You know? I mean, all of the  details don't matter, and the representation is   abstract enough to really not care about those  details. If I plan to, I'm in New York. Right? If I plan to be in Paris tomorrow, I could  try to plan my trip to Paris in terms of
• 28:00 - 28:30 elementary actions I can take, which basically are millisecond-by-millisecond controls of my muscles. But I can't possibly do this because it's several hours of muscle control, and it will depend on information I don't have. Like, I can go on the street and hail a taxi. I don't know how long it's going to take for a taxi to come by. I don't know if the light is going to be red or green.
• 28:30 - 29:00 I cannot plan my entire trip. Right? So I have to do hierarchical planning. I have to imagine that if I were to be in Paris tomorrow, I first have to go to the airport and catch a plane. Okay. Now I have a subgoal: going to the airport. How do I go to the airport? I'm in New York, so I can go down on the street, hail a taxi. How do I go down on the street? Well, I have to walk to the elevator or the stairs, hit the button, go down, walk out of the building.
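A toy sketch of this kind of hierarchical decomposition is shown below: a high-level goal is refined into subgoals, and each subgoal is refined again until we reach steps we can just execute without further planning. The plan library here is purely illustrative, made up to mirror the example in the conversation.

```python
# Hierarchical planning sketch: decompose a goal into subgoals recursively.
# Anything not listed in SUBGOALS is treated as a primitive action we
# "just know how to do" without further planning.
SUBGOALS = {
    "be in Paris tomorrow": ["get to the airport", "catch the plane"],
    "get to the airport": ["get down to the street", "hail a taxi"],
    "get down to the street": ["go to the elevator or the stairs", "ride down", "walk out of the building"],
    "go to the elevator or the stairs": ["stand up from the chair", "walk to the elevator"],
}

def plan(goal, depth=0):
    """Print the goal tree, expanding each subgoal until primitives are reached."""
    print("  " * depth + goal)
    for sub in SUBGOALS.get(goal, []):
        plan(sub, depth + 1)

plan("be in Paris tomorrow")
```

Each level of the tree is planned at its own level of abstraction, which is why the millisecond-level muscle control at the bottom never has to be reasoned about explicitly at the top.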
• 29:00 - 29:30 And before that I have a subgoal: going to the elevator or to the stairs. How do I even stand up from my chair? So can you explain in words how you climb a stair or how you stand up from your chair? You can't. Like, this is low-level understanding of the real world. And at some point, in all those subgoals that I just described, you get to a situation where you can just accomplish the task without really kind of planning and thinking because you're used to
• 29:30 - 30:00 standing up from your chair. But the complexity of this process, of imagining what the consequences of your actions are going to be with your internal world model and then planning a sequence of actions to accomplish this task, that's the big challenge of AI for the next few years. We're not there yet. Jeff: So one question I've been wanting to ask. This has been a great lesson, professor. I'm
            • 30:00 - 30:30 really grateful for that, but I also want to get  to the current view of Meta's strategy on this. And the fact that Meta has decided  to go, what do we call open source   or open or available or whatever,  but LLAMA is a tremendous tool. I,   as an educator myself, am grateful. I'm  Emeritus of CUNY, but now I'm at Stony Brook,   and it's because of LLAMA that universities  can run models and learn from them and build   things. And it struck me, and I've said this  often, that I think that the Meta strategy,
• 30:30 - 31:00 your strategy here on LLAMA and company, is a spoiler for much of the industry, but an enabler for tremendous open development, whether it's academic or entrepreneurial. And so I'd love to hear from the horse's mouth here, what's the strategy behind opening up LLAMA in the way that you've done? Yann: Okay. It's a spoiler for exactly three companies. Jeff: Yeah. Well, exactly.
• 31:00 - 31:30 Yann: It's an enabler for thousands of companies. So obviously, from a pure ethical point of view, it's obviously the right thing to do. Right? I mean, LLAMA, LLAMA two, the release of LLAMA two in qualified open source, has basically completely jump-started the AI ecosystem not just in industry and startups,
• 31:30 - 32:00 but also in academia, as you were saying. Right? I mean, academia basically doesn't have the means to train their own foundation models at the same level as companies. And so they rely on this kind of open source platform to be able to make contributions to AI research. And that's kind of one of the main reasons for Meta to actually release those foundation models in open source: to enable innovation, faster innovation.
            • 32:00 - 32:30 And the question is not whether this or that  company is three months ahead of the other,   which is really the case right now. The question  is, do we have the capabilities in the AI   systems that we have at the moment to enable the  products we want to build? And the answer is no. The product that Meta wants to build  ultimately is an AI assistant, or maybe   a collection of AI assistants, that is with us  at all times, maybe lives in our smart glasses,
• 32:30 - 33:00 that we can talk to. Maybe it displays information in the lens and everything. And for those things to be maximally useful, they would need to have human level intelligence. Now we know that moving towards human level intelligence, first of all, is not going to be an event. There's not going to be a day where we don't have AGI and a day after which we have AGI. It's just not going to happen this way.
• 33:00 - 33:30 Jeff: I'll buy you the drinks if that happens. Yann: Well, I should be buying you drinks because it's not happening. It's not going to happen this way. Right? So the question really would be how do we make the fastest possible progress towards human level intelligence? And since it's one of the biggest
• 33:30 - 34:00 scientific and technological challenges that we've faced, we need contributions from anywhere in the world. There are good ideas that can come up from anywhere in the world. And we've seen an example with DeepSeek recently, right, which surprised everybody in Silicon Valley. It didn't surprise many of us in the open source world that much. Right? I mean, that's the point. It's sort of a validation of the whole idea of open source. And so good ideas can come from anywhere. Nobody has a monopoly
• 34:00 - 34:30 on good ideas, except people who have an incredibly inflated superiority complex. Jeff: Not that we're talking about anybody in particular. Right? Yann: No. No. We're not talking about anybody in particular. There is a high concentration of those people in certain areas of the country. And, of course, they have a vested interest in sort of disseminating this idea that they somehow,
            • 34:30 - 35:00 they are better than everybody else. So I  think it's still a major scientific challenge,   and we need everybody to contribute. So  the best way we know how to do this in the   context of academic research is you publish your  research, you publish your code in open source,   as much as you can, and you get people to  contribute. And I think the history of AI   over the last dozen years really shows that, I  mean, the progress has been fast because people   were sharing code and scientific information.  And some, a few players in the space, started
• 35:00 - 35:30 clamming up over the last three years because they need to generate revenue from the technology. Now at Meta, we don't generate revenue from the technology itself. We generate revenue from ads, and those ads rely on the quality of products that we build on top of the technology. And they rely on the network effect of the social networks as a conduit to the people and the users. And so
            • 35:30 - 36:00 the fact that we distribute our technology doesn't  hurt us commercially. In fact, it helps us. Jason: Yeah. 100%. Hearing you talk, you mentioned  the topic of wearables and glasses, and that,   of course, always sparks my attention. I had  the opportunity to check out Google's Project   Astra Glasses last December. And it has stuck  with me ever since, and really solidified my
            • 36:00 - 36:30 view of, and we're not talking AI ten, twenty  years down the line and what it will become,   but kind of more punctuating this moment in AI,  and that being a really wonderful next step for   contextualizing the world while wearing a piece of  hardware that we might already be wearing. If it's   a pair of glasses and looks like our normal  glasses suddenly we have this extra context. And I guess the line that I've been able to draw  in talking with you between where we are now   and where we're going potentially is not only  the context that experience gives the wearer,   but for you, for Meta and for those creating these  systems, smart glasses out in the real world,   taking in information on how humans live  and operate in our physical world could
• 36:30 - 37:00 be a really good source of knowledge to pull from for what you were talking about earlier. Am I on the right track, or is that just one piece, one very small piece of the puzzle? Yann: Well, it's a piece, an important piece. But, yeah, I mean, the idea that you have an assistant with you at all times that sees what you see, hears what you hear, if you let it, obviously, if you let it, for sure. You know, but to some extent, is your confidant and can help you perhaps even better than
• 37:00 - 37:30 how a human assistant could help you. I mean, that's certainly an important vision. In fact, the vision is that you won't have a single assistant. You will have a whole staff of intelligent virtual assistants working with you. It's like all of us would be a boss. Okay? I mean, people feel threatened. Some people feel threatened by the fact that machines would be smarter than us, but we should feel empowered by it. I mean,
• 37:30 - 38:00 they're going to be working for us, you know? I don't know about you, but as a scientist or as a manager in industry, the best thing that can happen to you is you hire students or engineers or people working for you that are smarter than you. That's the ideal situation. And you shouldn't feel threatened by that. You should feel empowered by it. So I think that's the future we should envision. A smart collection of assistants that helps you in your daily life.
            • 38:00 - 38:30 Maybe smarter than you. You give  them a task, they accomplish it,   perhaps better than you. And that's great.  Now that connects to another point I wanted   to make related to the previous question,  which is about open source. Which is that,   in that future, most of our interactions with  the digital world will be mediated by AI systems. Okay. And that's why Google is a little  frantic right now because they know that
• 38:30 - 39:00 nobody is going to go to a search engine anymore. Right? You're just going to talk to your AI assistant. So they're trying to experiment with this within Google. That's going to be through glasses, so they realize they probably have to build those. Like, I realized this several years ago. So we have a bit of a head start, but that's really what's going to happen. We're going to have those AIs sitting with us at all times. And they're going to mediate all of our information diet.
• 39:00 - 39:30 Now if you think about this, if you are a citizen anywhere in the world, you do not want your information diet to come from AI assistants built by a handful of companies on the West Coast of the US or China. You want a high diversity of AI assistants that, first of all, speak your own language, whether it's an obscure dialect or a local language. Second of all,
• 39:30 - 40:00 understand your culture, your value system, your biases, whatever they are. And so we need a high diversity of such assistants for the same reason we need a high diversity of the press. Right? And I realize I'm talking to a journalism professor here. But am I right?
• 40:00 - 40:30 Jeff: Amen. In fact, I think that's what I celebrate: what the Internet, and next AI, can do is to tear down the structure of mass media and open up media once again at a human level. AI lets us be more human, I hope. Yann: I hope too. So the only way we can achieve this with current technology is if the people building those assistants with cultural diversity and everything,
• 40:30 - 41:00 have access to powerful open source foundation models. Because they're not going to have the resources to train their own models. Right? We need models that speak all the languages in the world, understand all the value systems, and have all the biases that you can imagine in terms of culture, political biases, whatever you want. And so there's going to be thousands of those that we're going to have to choose from, and they're going to be built by small shops everywhere around the world. And they're going to have to be built on top of foundation models trained by a large company like Meta or
• 41:00 - 41:30 maybe an international consortium that trains those foundation models. The picture I see, the evolution of the market that I see, is similar to what happened with the software infrastructure of the Internet in the late nineties or the early two thousands, where in the early days of the Internet, you had Sun Microsystems, Microsoft, HP, IBM, and a few others kind of pushing to provide the hardware and software infrastructure of the Internet,
• 41:30 - 42:00 their own version of UNIX or whatever, or Windows NT, and their own web server, and their own racks, and blah blah blah. All of this got completely wiped out by Linux and commodity hardware. Right? And the reason it got wiped out is because Linux, as a platform software, is more portable, more reliable, more secure, cheaper, everything. And so Google was one of the first to do this, building infrastructure on commodity hardware and an open source operating system. Meta,
• 42:00 - 42:30 of course, did exactly the same thing, and everybody is doing it now, even Microsoft. So, I think there's going to be a similar pressure from the market to make those AI foundation models open and free because it's an infrastructure like the infrastructure of the Internet. Jeff: How long have you been teaching? Yann: Twenty-two years. Twenty-two years.
            • 42:30 - 43:00 Jeff: So what differences do you see in students  and their ambitions today in your field? Yann: I don't know. It's hard for me to  tell, because in the last dozen years or so,   I've only taught graduate students. So I don't  see any significant change in PhD students,   other than the fact that they come from all over  the world. I mean, there is something absolutely
• 43:00 - 43:30 terrifying happening in the US right now where funding for research is being cut, and then there are sort of threats of visas not being given to foreign students and things like that. I mean, it's completely going to destroy the technological leadership in the US, if it's actually implemented the way it seems to be going. Most PhD students in STEM,
• 43:30 - 44:00 science, technology, engineering, mathematics, are foreign. And it's even higher in most engineering disciplines at the graduate level. It's mostly foreign students. Most founders or CEOs of tech companies are foreign born. Jeff: French universities are offering the opportunity for American researchers to go there. I've got one more question for you. Do you have a cat?
            • 44:00 - 44:30 Yann: I don't, but, our youngest son has  a cat, and we watch the cat occasionally. Jeff: Okay. I wondered if that was your model. Jason: Alright. Well, Yann, this has been  wonderful. I know we've kept you just a   slight bit longer than we had agreed to for your  schedule. So we really appreciate you carving out   some time. Yeah. It's been really wonderful, and  it's wonderful to hear some of this, as Jeff said,
• 44:30 - 45:00 from the horse's mouth earlier because you come up in our conversations quite a lot, and we really appreciate your perspective in the world of AI and all the work that you've done over the years. Thank you for being here with us. This has been an honor. Jeff: Thank you for the sanity you bring to the conversation. Yann: Well, thank you so much. It's really been a pleasure talking with you.