I Went to GOOGLE'S Office to Uncover their EPIC AI Tools | Gemini 2.5 & Google AI Studio
Estimated read time: 1:20
Summary
In a captivating exploration inside Google's Mountain View office, Ishan Sharma delves into the world of AI, focusing on Google's cutting-edge Gemini 2.5 and the Google AI Studio. The discussion covers Google's ambitious visions for AI, notably its commitment to making powerful AI tools accessible to the public. This conversation highlights the blend of Google's focus on both developer and non-developer audiences, showcasing the innovative models, tools, and applications such as deep research, multimodal AI capabilities, and real-time video understanding.
Highlights
- Ishan Sharma explores Google's Mountain View office and reveals the innovative AI technologies at play 🤩.
- Gemini 2.5 is highlighted as the most advanced AI model available, with groundbreaking capabilities 🌟.
- Google's commitment to free AI through products like Google AI Studio is underscored 🌍.
- Exciting AI demos show potential applications for both developers and general users 🎬.
- Discussion on how AI, particularly Google's offerings, is changing the landscape for tech and non-tech consumers alike 🤓.
Key Takeaways
- Gemini 2.5 Pro is presented in the video as the world's most powerful AI model, one that can integrate seamlessly into daily life 🎉.
- Google offers free access to incredibly powerful AI tools through platforms like AI Studio 🌐.
- Tools like deep research and real-time capabilities make AI even more practical and accessible 🤖.
- Google's AI is not just for developers – it's designed to benefit everyone, with various user-friendly tools 💡.
- The Gemini app and AI Studio are poised to drive the next wave of AI adoption by offering innovative interfaces and functionalities 🚀.
Overview
Ishan Sharma takes us on an adventure through the vast possibilities Google is presenting with its AI technology, directly from their Mountain View office. He introduces viewers to the groundbreaking Gemini 2.5 model, touted as the world's most powerful AI, capable of transforming daily life with ease. With engaging conversations and insights, the video ensures every tech enthusiast's curiosity is piqued.
The video dives into Google's philosophy of democratizing AI, highlighting how powerful tools like Google AI Studio and the Gemini app are available for free. Through various demonstrations, viewers learn about deep research capabilities, real-time video understanding, and multimodal AI applications, reflecting Google's vision of making AI useful and accessible to everyone, not just the tech-savvy.
This exploration touches on the dual audience Google caters to – developers and general public alike. With special features like long context awareness and built-in tools that elevate the AI experience, Google's products are not only technical marvels but also user-friendly. The conversation looks ahead to a future where AI integrations become second nature in our daily interactions, thanks to innovations by companies like Google.
Chapters
- 00:00 - 00:30: Introduction and Overview The 'Introduction and Overview' chapter discusses Gemini 2.5 Pro, described as the world's most powerful AI model. The speaker's mother finds new technology tools complicated to use and wonders why the AI can't simply look at her screen to help, which highlights the potential of real-time mode. Google is praised for its commitment to making powerful AI tools accessible to the public for free. The interviewer asks about Google's vision for AI in 2025 and where the guest sees himself within the industry.
- 01:00 - 01:30: Interview with Logan from Google AI Studio In the chapter titled 'Interview with Logan from Google AI Studio,' the primary focus is the transformative potential of AI tools in everyday life. Logan, who leads Google AI Studio, discusses the lack of awareness and adoption of AI tools among the general public, despite their powerful capabilities. The interview takes place at Google's Mountain View office, highlighting practical ways AI, particularly through initiatives like Gemini and Google AI Studio, can seamlessly integrate into daily routines. The chapter also promises to feature impressive demonstrations showcasing the practical applications of these AI technologies.
- 01:30 - 02:30: Google's Vision for AI and DeepMind In this chapter, the speaker expresses excitement about discussing Google's vision for AI, having followed Google's progress from Bard to its current products. The conversation takes place in Mountain View, where the speaker has visited to learn more about Google's latest technologies like Gemini. The chapter sets the scene for a discussion of Google's strategic aims for AI in 2025 and how it positions itself within the broader tech industry.
- 03:00 - 04:00: Google AI Tools and Products The chapter titled 'Google AI Tools and Products' discusses the diverse range of research and development activities at Google DeepMind. The lab's research spans multiple domains, including genomics, weather modeling, and generative AI models such as Gemini, Veo, and Imagen. It emphasizes the environment fostered by DeepMind for innovation and advancement in AI technologies.
- 04:00 - 06:00: Introduction to Gemini App and AI Studio The chapter titled 'Introduction to Gemini App and AI Studio' discusses the integration of research advancements into practical applications that impact daily life. It highlights the influence of math-focused systems like AlphaProof and AlphaGeometry on the development of Gemini models, particularly in enhancing their mathematical capabilities. The chapter indicates ongoing research across various domains and how this research feeds into the main models.
- 09:30 - 11:00: Demonstration of AI Tools The chapter titled 'Demonstration of AI Tools' discusses the broad array of AI tools available within the Google ecosystem, particularly focusing on the Gemini consumer experience and AI Studio for developers. The discussion is targeted at both developers and newcomers to AI technologies, and notes that the Gemini app itself is the starting point for many people. It suggests that, regardless of one's development expertise, beginning with the Gemini app is a good way to understand and use Google's AI tools.
- 13:30 - 14:30: Stream Real Time Mode The chapter "Stream Real Time Mode" discusses the renaming of the app from Bard to Gemini, which can be accessed at gemini.google.com. The app provides users with a sleek, modern interface to explore the latest AI models and features. These include Deep Research, which Google introduced in December, and personalization that draws on the user's Google Search history. The app aims to consolidate Google's various technologies into a single interface powered by Gemini.
- 49:00 - 50:30: Gemini 2.5 Pro Launch The chapter titled 'Gemini 2.5 Pro Launch' discusses the launch and the excitement surrounding the new features and tools being released. It highlights Gemini as the main application and mentions the various innovative products Google is developing to enhance AI experiences. The discussion acknowledges the wide range of products Google offers, including Gemini and AI features integrated into Google Search, which provide users with comprehensive AI overviews.
- 58:30 - 60:00: Conclusion and Final Thoughts The final chapter, titled "Conclusion and Final Thoughts," highlights the range of products available in Google Labs. It mentions the core Gemini app and various other tools, such as ImageFX and VideoFX, for multimodal use cases. Other examples include Whisk and NotebookLM, with AI Studio being particularly popular among developers. The chapter also touches on enterprise solutions available in Google Cloud, indicating a broad scope of offerings in Google's ecosystem.
I Went to GOOGLE'S Office to Uncover their EPIC AI Tools | Gemini 2.5 & Google AI Studio Transcription
- 00:00 - 00:30 Gemini 2.5 Pro it's actually the world's most powerful AI model my mom wants to go and like use some new tool it's like actually not easy to like understand how come the model can't just look at my screen to help me and I think that's the magic of this real time mode and I think the best part about Google is that they're able to keep everything available to the general public for free I can't say that I visited 56 websites for anything ever to try to make a decision Google's vision for AI in 2025 and where do you see yourself in this whole industry the thing that makes me
- 00:30 - 01:00 most worried about the AI moment is not sort of like existential worry about AI and the impact that it has um it really is that like it's this incredibly powerful tool that most people still today don't know about and don't use in their everyday life hi everyone I'm Ishan and today I'm at Google's Mountain View office with Logan who's leading Google AI Studio and in this podcast we'll be talking about how you can use AI tools in day-to-day life and how Gemini and Google AI Studio can blend into your life and there's some amazing demos that we'll be showing
- 01:00 - 01:30 on this laptop so make sure you watch till the end and Logan thanks a lot for taking out the time how are you doing there I'm hanging in here thank you for coming all the way to Mountain View to chat and to look at all the Gemini stuff and and hang out with us I'm hopeful this conversation will be fun amazing I've been following all the development that Google has been doing in the space right from Bard to where we are today and the latest products that Google has been launching love the work that has been put out there but I want to start with understanding Google's vision for AI in 2025 and where do you see yourself in this whole industry
- 01:30 - 02:00 yeah that's a great question I think one of the beautiful things about Google DeepMind specifically is you have this if you look at like how we compare to other labs and other people doing research DeepMind has this breadth of research across everything from like genomics and AlphaFold to you know we have the world's best weather models and all of this stuff in between and then of course like Gemini and Veo and Imagen and all the like generative AI models that folks are familiar with um but I think the vision is really like how can you create an environment where you
- 02:00 - 02:30 bring together all the best of all this research and make sure that it ends up in products and and models that that impact people's lives every day so I think um we're we're sort of seeing the fruits of this already with Gemini if you look at like AlphaProof and and AlphaGeometry and some of the math models like a lot of that research actually like directly went into helping make the Gemini models better at math so I think we'll continue to see this like breadth of research across uh all the different things that GDM is doing and then how that sort of manifests itself into these like main models that power
- 02:30 - 03:00 the Gemini consumer experience and AI Studio for developers and and all the other products that are in the Google ecosystem mhm so you'll talk about both the segments of people people who are developers and people who are new to all the ecosystem of AI and AI tools in general so let's talk from the start what are all the tools from Google that people can use today and they can get started with building whatever they want yeah I think if you're um even if you're even if you are a developer I still think the like starting place for a lot of folks to experience Gemini is in the Gemini app itself so it used to be
- 03:00 - 03:30 called Bard as you were referring to before now it's called the Gemini app and it's gemini.google.com and and that sort of has this like very slick consumer experience where you can try all the latest models you can have all the features like Deep Research which Google pioneered back in December it now has like personalization with your own like Google Search history so it's like really bringing together um at least my mental model of it is like how do you take all the cool things that Google does bring them into a single interface and then sort of put AI Gemini in this
- 03:30 - 04:00 case to sort of power an experience that really brings all that to life so um it's been a ton of fun to see all the new stuff they've been shipping like actually come to life and you know Canvas and other things like that so it's awesome so Gemini is the main app what are the other tools and products that Google has been working on for helping people with AI yeah that's a great question I think you know one of the one of the blessings and one of the curses of Google is we have a ton of products out there so I mean you can get Gemini in or you can get sort of AI Overviews in Search there's now AI Mode in Search
- 04:00 - 04:30 um there's the core Gemini app as well there's uh a whole slew of of Google Labs products so there's uh all the ImageFX VideoFX for a bunch of different like multimodal use cases there there's now Whisk which is another image video generation part of Labs uh folks love NotebookLM NotebookLM is another one of those examples uh for developers AI Studio is one of those things that that folks come to uh there's a whole like enterprise you know setup in Cloud as well so there's there's a a lot of different places um
- 04:30 - 05:00 my sort of I think ai.google actually has a sort of landing page that sort of gives you the lay of the land of like all of the different products that you could use um across Google's AI offering so if you're sort of trying to get the full gamut of things ai.google probably has all the all the relevant information I was just talking to the audience like couple days ago about NotebookLM's mind map feature which is really helpful for people to just learn everything or revise about topics that they're learning about I saw a tweet and it was someone being like hey bring mind map to everyone or something and then they shipped it like literally the week
- 05:00 - 05:30 after I was like that's so awesome I'm I'm happy that folks are liking it and I think the best part about Google is that they're able to keep everything available to the general public for free which I think is something which really helps bring AI adoption to the masses like people who are outside of the whole AI race on Twitter especially who are just like people who are in jobs people who are trying to build something how do you think about that have you always thought about keeping it free or what is the thought process yeah that that's such a great point and I I honestly like just like abstracting one level higher I
- 05:30 - 06:00 think this is one of the like biggest CH like the thing that makes me most worried about the AI moment is not sort of like existential worry about AI and the impact that it has um it really is that like it's this incredibly powerful tool that most people still today don't know about and don't use in their everyday life and like the discrepancy in what's going to be possible with someone who's using these tools in their daily life and someone who's not is like actually pretty extreme and I feel like that's actually already the case today like you could you know people who know
- 06:00 - 06:30 how to use these tools can Vibe code and do all this crazy stuff and if you're a developer can build stuff really quickly um so I think it's a very intentional part of the strategy and I think if you look at this from an AI Studio perspective which is the product I work on um the whole experience is free you show up and it's like try literally like our most Frontier highest capability models are available to the world to try for free and I think you know we we'll see how long we're able to continue to do that I think as more and more people show up it becomes harder and harder to
- 06:30 - 07:00 to offer that um but I I do agree fundamentally it's it's super important that the stuff's available to everyone especially in this like transition moment where like so many folks don't use AI and are ramping up to use AI in the future uh so I'm I'm happy that lots of the products have this and then also like at the end of the day it's also super important to me that there's like available premium offerings like because there's a limit of how much stuff you can give people for free and like you there needs to be if you really want to go crazy with NotebookLM and generate podcasts all day like there needs to be a
- 07:00 - 07:30 way for you to do that so I think giving people that flexibility is also super important yeah I think uh Google's offerings especially in the AI space are truly underrated and I just keep telling people about you know use Gemini use NotebookLM use Google AI Studio but we'll come to Google AI Studio before that I want to talk about Gemini and for people who are watching this for the first time who haven't used Gemini how would you describe it and its applications both for developers as well as for the general public yeah that's a great question so I think there's sort of core model capabilities and then
- 07:30 - 08:00 there's a bunch of um features that have been built in the context of like the Gemini app that have been built on top of the model itself so from a core model capability standpoint which people care a lot about right now the model race is is sort of a a top level narrative that everyone's talking about um the models are natively multimodal and and a couple weeks ago we released this like native image Generation image editing which has been going crazy and viral and the world's loving it um but basically the Gemini models from the ground up were built to be natively multimodal so
- 08:00 - 08:30 originally it was just input modality and now it's actually output modality for both images and audio so the model can actually output both of those things um and yeah I think that's that's been one of the core differentiators the second sort of pillar is is how the models actually use tools which if folks think about like what do models actually do on a regular basis it's just basically like tokens in and out uh or like text strings in and out uh tools are like how you actually make models useful do stuff for you um and in the case of
- 08:30 - 09:00 Gemini the model has been trained to actually know when to use tools and when not to specifically a bunch of our first-party tools so things like uh Search it can actually like natively know when it's when a query that a user is asking might be a query that would benefit from from going and actually using the Google Search infrastructure to retrieve information from the internet same thing with code execution which actually allows the model to be able to run code to answer like if you ask like a really complicated math question LLMs by default will just guess basically and uh code
- 09:00 - 09:30 execution is really nice because you can sort of ground the model in in a source of truth um so I think that's one pillar of it and I think the third one that continues to get people excited is long context like basically context is the only thing that matters in the world of AI the more context you give the models the better the model is able to actually answer the question that you have um so the more context you can put in the better chances you get a good answer and there's also just lots of data sources which require a lot of context videos like long documents things like that um
- 09:30 - 10:00 so folks love the 1 million and the 2 million token context window with Gemini amazing so I tell everyone that this podcast that I run is not like any other podcast in which we're just yapping all the time so it's time to actually get in and show you a real demo of what Google AI Studio can do all of its applications I I remember a reel I made about Stream Realtime and it went viral it got like 8 million views on on Instagram or something but there's a lot more that you can experience and that's what we'll just do so let's get into Google AI Studio and tell us about all its applications
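To put the 1 million and 2 million token context windows mentioned above in perspective, here is a minimal Python sketch that estimates whether a piece of text would fit. It is only an illustration: the function names are hypothetical, and the 4-characters-per-token ratio is a rough assumption based on the "three or four characters" rule of thumb given later in this conversation, not the output of a real tokenizer.

```python
# Rough context-window check.
# ASSUMPTION: about 4 characters per token. This is a rule of thumb, not a
# real tokenizer; actual counts vary with language and content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count by ceiling-dividing the character count."""
    return -(-len(text) // chars_per_token)


def fits_in_window(text: str, window_tokens: int = 1_000_000) -> bool:
    """Return True if the estimated token count fits in the given window."""
    return estimate_tokens(text) <= window_tokens


if __name__ == "__main__":
    doc = "word " * 1_000_000  # 5,000,000 characters of filler text
    print(estimate_tokens(doc))            # 1250000
    print(fits_in_window(doc))             # False with a 1M-token window
    print(fits_in_window(doc, 2_000_000))  # True with a 2M-token window
```

For real workloads an exact count from the model's own tokenizer is the safer choice, since video, audio, and images consume tokens in ways a character count cannot capture.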
- 10:00 - 10:30 cool I love it let's do it so we are on aistudio.google.com yes what all do we have here yeah I love that and and actually so the the experience you start with when you get dropped in is this question of like what will you build um which I think is like a great framing of of how folks should think about AI Studio I do think if you're not a developer there's lots of cool stuff you could do in AI Studio you can access a bunch of our latest models for free you can sort of see what's possible with AI um I'll add the caveat that like it is built for developers so if you're using this and you're like this is kind of a
- 10:30 - 11:00 weird experience for me as someone who's not a developer like that is partially by design like we the Gemini app itself is built for the sort of non-developer persona but I still think we've made it simple enough that like anyone who's interested in AI should be able to use AI Studio this is Google AI Studio what do we see on the screen here yeah we're we're looking at a bunch of example prompts to start we're looking at and we can actually sort of hide away some of the complexity um it's just a chat environment to start with uh the model selector on the right hand side is sort of the magic I think of AI Studio
- 11:00 - 11:30 is sort of this interface that sits on top of the models um and is really meant to not augment them in any weird way like everything you see in AI Studio is sort of the native capabilities of the model we didn't like none of the things you see in AI Studio are like quote unquote features that um aren't actually available for you to go and build yourself so we've made this intentional decision to like not do a bunch of fancy stuff that a developer wouldn't be able to replicate so everything you see should be able to be directly replicated um
- 11:30 - 12:00 so yeah I think of the models as sort of the driver of this experience you can look through what are the differences between all these models that we're looking at so we have Imagen then we have Pro experimental and then plenty of others this is a very common question so Gemini 2.0 Flash is our sort of generally available model if you want to build something if you need sort of an everyday workhorse model 2.0 Flash-Lite is a sort of cheaper faster alternative to that um this image generation is a specific variant that lets us do really cool things like dynamically editing images so you can see this picture of a
- 12:00 - 12:30 bunch of croissants and the prompt is add some chocolate drizzle to the croissants and you can see the image actually like it it keeps the very like fine grain details behind the scenes of of the image stays the same so there's just really good um object permanence As you move from image to image and you can keep doing weird things like make them strawberry and we'll see I'm not maybe that's a bad prompt so we'll see if we'll see if the model to yeah there's
- 12:30 - 13:00 now strawberries in the image I was hoping for I don't know if strawberry croissant exists anywhere in the world but doesn't sound that good but maybe you want to do it but so you can do this really iterative prompting and and one of the coolest things about doing this with Gemini 2.0 Flash the image generation variant is that the model benefits from all the world knowledge of Gemini so if you if you look at like how you do this type of prompting versus like if you've used Midjourney or even our image generation models like Imagen like those models um you have to prompt in a very specific
- 13:00 - 13:30 way it's like it's like you know uh all these like words like aspect ratio Pixar style hyper realistic hyper realistic 3D HD quality like you know pixel or something all this like very like so you actually have to learn that art of prompting those tools which is really interesting and I think the cool thing and like not that that's a bad thing I think it's the way you get those models to do what you want but one of the cool things about this is you can just use the very like English level
- 13:30 - 14:00 prompting of like how you actually go and do this every day in order to get the models to do what you want like we could say like remove some of the croissants I can't spell croissants let's see if it gets it right um and it's also pretty quick like 4 seconds and you have an image ready yeah this I mean this is the beautiful piece of uh let's see did it I'm sure it removed maybe one or two or something like that it definitely moved them around um yeah there's there's definitely quality stuff that we're working um but also like I think this is just
- 14:00 - 14:30 one of the coolest use cases and and to me the most exciting piece of this is like developers can actually take this and build it into the products that they want so this it's available in experimental today um but hopefully it'll be sort of generally available in near future and that's just like one of the models there's lots of lots of other models we have our Pro model which is our our strongest model available today we have our thinking model which is like the reasoning model so if you've really complicated questions that you need answered um you can access that and and do that type of analysis um we also have
- 14:30 - 15:00 the original 1.5 series of models and then Gemma which is the Gemma models yeah open open weight model so if you if you need to actually like have the weights on your computer you want to run it locally or you have a server somewhere and you want to sort of control every aspect of the deployment um Gemma is really great for that and it's actually the the research from Gemini is what ends up powering the Gemma model so it's you know a lot of the same technology the models are you know natively multimodal they have longer not full long context
- 15:00 - 15:30 but they have longer context and a bunch of stuff like that and they run on a single a single GPU so or TPU so it's like really remarkable performance for the size of this uh this Gemma 3 model mhm understood and and let's talk about developers so as you showed around the 2.0 pro models what can people build as a developer with it and using Google AI Studio if you could show a demo or something yeah yeah so I think the the sort of traditional workflow um and we can actually show some of these things
- 15:30 - 16:00 if you go to the left hand side here you can see this starter apps and this sort of categorizes like and this is a very early version of what I think we want this experience to be but to answer exactly this question is like hey I want to build something with AI like what are the things that you could build with AI so we have a GitHub repo which has all these things open source but there's a few examples of this um let's try this spatial understanding one it's an example so this opens up this bespoke user experience um and then can ask the model to in this context take this
- 16:00 - 16:30 picture and add the bounding boxes around where these images are so you could imagine there's lots of like really interesting use cases of like um OCR which is like you know putting in a bunch of uh files and stuff and doing processing on them which is like not the most exciting use case but like extremely relevant in a world where like so much of everything is is still based on paper um and you can do all types of things like you could obviously crop an image to only show the specific thing that has the bounding box around it lots of cool things like this um there's lots
- 16:30 - 17:00 of other interesting starter apps I think native video generation is is one of these things so if we um you can add a video yeah yeah if you have let's add a video and I can maybe add this video for example yeah let's do it it's a a small reel about San Francisco cool I like and 234 the Google Wi-Fi is fast this thing's uploading um yeah so we we'll see what happens when we upload this video but another example of like the type of use
- 17:00 - 17:30 case that you could have with with native video understanding so I think again this like native multimodal piece is super interesting if you think about like all the value that AI can create so much of it is actually locked behind these like visual representations of things um in a world where like text-only LLMs like actually aren't like aren't going to be able to solve those problems so I think having this video version is is really cool um let's see explore this video A/V captions table haiku
- 17:30 - 18:00 charts key moments generate see so because it has the multimodal capability that can analyze what's actually happening in the video and give you insights from that 100% and and it doesn't like there's a lot of um you know obviously there's ways that you could sort of like hack your way around this like you could you know use a uh uh audio model or you could use like a in individual frames of an image and then use like an image understanding model but like actually the the thing that's
- 18:00 - 18:30 magical is you sort of bring it all together and you don't lose the like very specific Nuance of like you know the flow of a story whereas like if you were looking at just like frames of of an image or something like that or just an audio transcript or like you know all the things like how is someone actually saying something can you sort of pick up on the Nuance of like are they animated are they talking in a weird you know you know what what's the what's what's actually happening there so there's lots of cool ways in which like video understanding is this really quintessential use case that again isn't
- 18:30 - 19:00 possible with with other models which is super cool and have you seen like what is the most crazy application you've seen of people using Google a studio or Gemini in general to build some crazy app using its multimodal capabilities that's a good question I think the most like magical one right now that I'm seeing is and we can I don't know how long it's going to take because this is a long video but if you just take a YouTube link and you drop it into the AI Studio chat it'll automatically populate and I think that's a good example of there's so much um there's so much
- 19:00 - 19:30 information that's like Locked Away on YouTube and I think giving developers a path to be able to tie into that directly for for people's own content or like studying and all that like it's just going to be like such a cool world to Let's actually do that so you said that you can you can paste a YouTube video link so if I open another another Google studio we go to YouTube and let's say we take any video let's just take this video about nextjs last Fray My Fire Ship lots of xjs Stu
- 19:30 - 20:00 happening and they used right now okay so I have pasted I have that video with me now what do what can it do and I'll switch to this is using stream real time I'll switch to 2.0 and then automatically the video is just loaded into the context window of so you can basically paste any YouTube video on Google AI studio and you can ask anything about it exactly so MH what is the controversy with
- 20:00 - 20:30 majs and yeah shout out to the folks that verell they're they're fighting the guy I don't of all the context but I've been watching all the threads on Twitter of uh of Lee and and G Mo talking about all the nextjs stuff so we'll see uh see what it says um yeah I feel like that's my at least the first sentence you know captures my understanding of of some of the stuff that's happening um so yeah really cool feature and I think like again it's if if you're not a developer like this is cool cuz you can put
- 20:30 - 21:00 YouTube videos in and sort of get context but as a developer like we're making this accessible and right now it's in preview so there's limits on like how many hours of video you can pass through the API but we're working on a path forward to actually make this like at scale available for developers and really opening YouTube up as sort of a place where um where you know developers can build on top of it and get access to some of this content which is so it not only uses the captions it has the video understanding of it exactly I can ask it that if you see a Tesla tell me all the
- 21:00 - 21:30 time stamps in this video where you see a Tesla and it'll give me those time stamps for it exactly and that's actually one of the at least temporary deltas between how the Gemini app sees these videos and how AI Studio does so from a developer perspective we're using like the full tokens like 66,000 tokens for this video and what's a token for people who don't know yeah tokens are um sort of a chunk of text I think it's for what it's worth I think it's kind of this made up complexity like the atom of AI sort
- 21:30 - 22:00 of to explain yeah yeah and and for every language model there's sort of a a tokenization process where you sort of try to take all of the Corpus of information that the model was trained on and break it up into the most like Optimal set of chunks and those chunks represent tokens and it's usually like one token is like three or four characters um but it's this abstraction to try to make the model training process and the inference process uh more algorithmically efficient and I again for what it's worth I think
- 22:00 - 22:30 it's this added complexity that doesn't really need to exist as far as the narrative for AI goes but is it's obviously helpful uh computationally I see it's saying Gemini makes mistakes so double check it what does that really mean and how can people be aware of this because I see this happen all the time yeah this is one of the most difficult problems and I I I think model hallucination stuff is like still this really unsolved problem um I think in the context of like video of this exact use case like you could watch the video and you can sort of know like does this
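The rule of thumb from the token discussion above (one token is roughly three or four characters) can be turned into a quick back-of-the-envelope estimate. This is only a sketch of that heuristic: the function name `estimate_tokens` and the default of four characters per token are assumptions made for illustration, and a real tokenizer will give different counts, especially for code, non-English text, or video.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate a token count from the character count.

    Uses the "one token is about three or four characters" heuristic;
    it is not the tokenizer any particular model actually uses.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    # Round up so a short non-empty string never estimates to zero tokens.
    return math.ceil(len(text) / chars_per_token)
```

Calling `estimate_tokens(prompt)` and `estimate_tokens(prompt, 3.0)` gives a rough lower and upper figure for a text prompt. For anything that matters, such as billing or context-window limits, the count reported by the API itself is the number to trust.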
- 22:30 - 23:00 actually work like where the and and my general guidance is like it doesn't tend to be that like the models are like making there there's like certain classes of problems that the models make mistakes on so I think if you do like a bunch of testing to know like hey the model's actually really good at like science videos like it understands the concepts but like maybe it's not good at like you know the Nuance of politics or like some other so I think that's the level of like testing to make sure that the model makes M uh doesn't make mistakes is there's like certain areas where the model's stronger or certain
- 23:00 - 23:30 areas where the model's weaker and sort of doing that experimentation can be really helpful amazing so we had a look at the YouTube feature um there is something over here as well nice this pulled up the time stamps of all of the different things that are happening in and it's describing every single frame so this is an aerial view of San Francisco yeah person looking up at the skyscraper can you play this video that's actually okay so so that's me then sports park that's a Waymo that's some robots so it's
- 23:30 - 24:00 basically able to identify and label every single frame and what could be like applications of something like this yeah that's a great question I think there's a lot of um one of the ones that jumps to my mind which again is not the most exciting use case but there's just a lot of like there's a lot of context that humans usually like right now the only way to get context out of a video is to watch the video and I think there's a lot of like really useful information that's sort of locked behind
- 24:00 - 24:30 like one to one if there's an hour video a human has to spend an hour watching this um and if you think about the world there's like obviously YouTube there's billions of hours of video every day that are being uploaded to YouTube so like I think um and there's a lot of use cases that developers have in which you know they maybe have hundreds or thousands of hours of video from whatever their use case is that um that aren't being like it's just not possible to understand one of my friends is building a company using Gemini using this capability to do like product analytics so like understanding
- 24:30 - 25:00 how users are interacting with products um and having the sort of video understanding like sort of do the analysis of how those people are using products so that he doesn't have to sit there yeah so that teams who build products don't have to sit there and watch hundreds of hours of videos of their customers using their product they can sort of just have AI analyze it pull out all the insights that's exactly what I used to do when I was working at a startup so there's a company called Hotjar which allows you to record customers sessions so I'd literally watch it and see them move their cursor and open different web
- 25:00 - 25:30 pages in the website and figure out what are they really using and now for example an app could be something which you can just feed in all of those videos it analyzes where what people are using and give you insights on what can you build and what to focus on that's one example of what you can build but this is starter apps we talked about the create prompt and the image uh feature and the YouTube feature what else U I think there was a new canvas feature that came up what what is that and how can people use that yeah so canvas is in the Gemini app so we can jump over to the Gemini app let me show one more thing before we do that which is stream real time um real time streaming I think
- 25:30 - 26:00 is one of the things that has really sort of captured people's imaginations and I think it's this my sort of General umbrella is this idea of AI co-presence which is like usually when you engage with a model all of the onus and the work is on you as the user to like get the right context of the model so that it can actually help you and the cool thing and like one of the or maybe not the cool thing one of the interesting things is like most of the context is actually usually already on your screen somewhere it's like hey I have this other thing somewhere else and I want to
- 26:00 - 26:30 make that accessible to the model it's an email it's a video it's a DM it's a Slack message whatever it is it's on my screen somewhere how come the model can't just look at my screen to help me and I think that's the magic of this real time mode and again this is something that what you're looking at in AI Studio is like a demo but you could also build this for developers and we can actually share our screen um and we could share you know a Chrome tab for example and I'll share this or maybe I'll share this other video and I'll ask the model to tell us
- 26:30 - 27:00 what's happening hey Gemini what's happening on this screen right now the screen shows a Google AI Studio interface displaying a video titled Next.js security advisory controversy along with the transcript of the video critical security advisory related to Next.js explaining how an nice thank you um is there anything else I can help you with yeah there's a couple things you
- 27:00 - 27:30 can help me with what is the sort of simplified in one sentence description of what's happening in this video the video discusses a critical security flaw in Next.js that allows attackers to bypass authentication and authorization beautiful um and lots of like I think this is like the most rough version of this experience we actually have a bunch of stuff that
- 27:30 - 28:00 we're shipping in the next couple of weeks specifically like how um uh voice activity detection VAD I think is the acronym for of like how do you interject into a conversation with AI in like a fluid way where I think like humans have the ability to understand the social cues in a conversation AI doesn't have that so you need to like algorithmically approximate how you sort of do that engagement with humans um I think the thing with this that I'm most excited about is there's so many uh there's so many interfaces in which
- 28:00 - 28:30 people have problems navigating websites and like understanding the kind I think about like you know if my grandma wants to go in or like my mom wants to go and like use some new tool it's like actually not easy to like understand and we've seen a bunch of cool use cases where people who make applications are actually like building this in as a feature as like a sort of visual co-pilot feature to help their customers like hey how do I actually use this product like now you know I'm stuck here like I'm on this page in AI Studio I
- 28:30 - 29:00 want to do this thing like where do I go um and I think that's one of the coolest use cases you could also imagine like um this getting built into the OS level which would be really cool you can imagine this in an IDE for a developer like you have a pair programmer who's just there with you who can like see your screen you can just one click be like hey I'm stuck here like and they can be like okay navigate to this file and like check out all there's like so many cool applications of this um and it's available for developers right now to start experimenting with and we'll
- 29:00 - 29:30 have a sort of fully production ready version uh in the next few weeks which I'm super excited about amazing so that's stream real time I also see tune a model and library and prompt gallery what are those features about yeah tune a model is one of the original features we shipped which is just fine tuning if folks have data they want to make the model better at um you can actually come in and fine tune it's only available for the previous generation of Gemini and we're working to make it available for the next generation so there's not a ton of value here to be honest at the present moment because it's still the older model um but
- 29:30 - 30:00 prompt gallery is sort of just a very basic sort of set of different prompts that you know cover a very varied level of use cases from everything like writing a letter to Santa using the models to like analyzing hurricane patterns and all this other stuff so there's a ton of different stuff you could do with the models and this is sort of meant to showcase like just a sliver of the breadth of things that you can do with it yeah 100% and with different tools and models enabled so you'll see like different models
- 30:00 - 30:30 specifically different tools enabled for all these different ones um as folks sort of try to get inspiration for what's possible yep I see stuff for worksheets I see unit testing I see marketing and so much more I think the funny thing about this is like it's now becoming very clear that like you can do everything with AI it's like the prompt gallery is like trying to approximate all the human experience you know captured down into one single page in a UI which is not easy to use there's a whole lot of stuff that's not possible here but um these are at least like some
- 30:30 - 31:00 of those use cases you could explore and then what is the canvas feature about in Gemini app yeah so do you want to we'll switch over to we go to Gemini app yeah so there's actually two big features available inside of um inside of the Gemini app and the two that I'm most excited about are deep research and canvas um so canvas is a way that you can actually go in and it's actually showcasing a bunch of these examples here um being able to make a study plan for chemistry 101
- 31:00 - 31:30 write an essay make a quiz app uh create a landing page for my robotics project so you can basically two core new artifacts you can both create documents and edit them inside of the Gemini app um and you can actually create and edit like simple web apps directly inside of of the Gemini app so let's try this landing page one just because robotics is cool um and you can see on the right hand side all of this content is being dynamically generated and it's writing code for you and after the code is done being written it'll actually preview
- 31:30 - 32:00 yeah you'll be able to preview and interact with the code um this is actually well this is pretty it's pretty decent uh yeah it's not perfect but it definitely is faster like it gives you a faster starting place than you would be able to do yourself and for a prompt like this it's pretty good yeah and and the nice thing is you can actually like directly iterate so we could play just like the image capabilities exactly so this is cool but I need it to be a little more
- 32:00 - 32:30 retro just around like uh robots retro stuff I don't know so we'll see and the model can actually go in and sort of dynamically edit based on this context that I've given it um and you can again continue to see these things show up in real time you can download the code you can go and like build this into a real app if you wanted to I think the good part is that it's not only creating the functionality but it's also helping you write the CSS and style it I actually like this the colors need to be slightly fixed but I like this sort of vibe a little bit more this is like the text of um of like old Mario
- 32:30 - 33:00 Mario games stuff like that yeah yeah so this is cool so this is really nice um and then you can actually say something like okay nice now make me a deep report on how to create my first robotics project and again I can't spell that's a feature of AI that does all the dirty work for me um so we'll be able to yeah and so we can see automatically dynamically canvas mode switches over
- 33:00 - 33:30 from hey before we were doing code and seeing a code preview now we're actually in sort of a documentation editing experience which is like a a you know loose approximation of Google docs so if you're a developer or you're you know a daily active uh Google Docs user like me um you can sort of experience this and it has a bunch of like cool like small AI you know features built in so we can do something called Suggest edits and the model will sort of dynamically critique itself from the generation um you can change the lengths of the
- 33:30 - 34:00 model and then actually the length of the output from the model and then at the end you can just which is really cool um and we'll see some of the suggestions from the model nice and it's like actually kind of again it's an approximation of um the Google Docs experience and you can imagine like an AI sort of critiquing your work and your writing directly inside this um so very cool I think canvas mode is awesome um and a nice way to sort of start a project or something like that I think
- 34:00 - 34:30 the other one is deep research and research will take a little bit of time to actually do but if you have some topic you want to explore like what are the best I'm just going to continue with this robotics thread um what are the best home robots for people who want to be on The Cutting Edge of what is available in robotics today give me only the best okay how is
- 34:30 - 35:00 this different than using like the normal chat working with any chatbot like ChatGPT or just any other app that's a great question so the normal sort of chatbot experience and we'll sort of assume that search is not enabled in those experiences is basically the model is taking all the context that it was trained on and returning that to you in some reasonable way so like basically if you know really the best home robotics thing like isn't well represented in the training data um it's some new company or it's some like you know dynamic new thing uh it's not
- 35:00 - 35:30 going to come out of like the base model itself you can sort of augment some of that with search I think the challenge with that is it usually is again it's like a rough approximation maybe you get like three or four items you know it hasn't actually like gone and read all the reviews and done some really deep thinking um this is the cool thing about deep research is that it actually is going in like visiting like potentially hundreds of pages um in order to get you the answer in real time in all the results yeah yeah so here here the model
- 35:30 - 36:00 actually wrote out like a little plan of how it's going to do the different research and we can also change the plan that's the good part about it if you want it to be specifically focused on particular things exactly and and so this is not intended it is it is sort of doing the searching in real time but it does a lot of like I've seen cases where it visits like a thousand websites and comes back so it takes a little bit of time we might have to come back to this um and once the research is done we'll be able to see the full report we can look at the other ones that I've done I think the the do I need to change it no
- 36:00 - 36:30 it had given some error message that uh something didn't work so yeah refreshing oops we are not robots unfortunately oh cool and it actually it's made good progress so so far it's researched 36 websites and you can sort of see all the different things going on every place like Amazon and YouTube and whatnot Tom's Guide CNET etc so it's doing all this very interesting work um and I actually just bought a new home cleaning robot so I'm trying to see if it's going to make the cut or not uh if it's
- 36:30 - 37:00 on and it's going to the DeepMind website which is really awesome so super super cool stuff and in this time we've been talking it has already visited 56 websites and if like I think about how I do my um sort of research and shopping and planning all this stuff like way way more thorough than I would ever do like I can't say that I visited 56 websites for anything ever to try to make a decision so super cool to see this so the craziest thing that I did with it is to find the cheapest business class tickets so when I was coming here
- 37:00 - 37:30 I was like uh you it's a long haul flight I need to get a better seat so I just asked it find me the best business class seats I need to be here from this date to this date I can have plus minus two days here and there but find me the best dates and then it researches and finds every Airline every possible combination and every possible date and it got me the best offer so that's just another like you know quirky version that I found out I love that and so deep research is actually available for everyone for free so if you want to sort of try this experience and see like hey
- 37:30 - 38:00 I have some you know travel or or shopping you whatever it is like you can try and see how good the response is and sort of validate this for yourself but I think the it's the breadth of the research that it's doing that gets me excited like again it's up to 134 different websites that has visited for you and um it's it's crazy yeah it's it's it's just so um it's another really quick anecdotal example is I think one of the biggest challenges of of AI products today is uh often times you're you're you're as a user you're being
- 38:00 - 38:30 asked to sort of extend belief that this thing's going to be useful and I think like email is the best example of this like when I see some new email product I'm like cool it might do the thing that it's saying it's going to do but I need to like give them access to all my emails and like do all this stuff and I think deep research is one of these things where it's like such a minimal investment from a user to then see like oh my gosh it's visited 134 different websites like to me that like as a user of the product I feel like attached somewhat more to this
- 38:30 - 39:00 thing now because I've seen this like work that it's been doing um so I think there's a lot of like for folks who are building products I think there's a lot of interesting threads around this of like how do you show users how do you sort of meet them halfway and show them this and and this feature people can access as a developer as well to build stuff like deep research yeah that's a good question so we haven't made the full deep research experience so this is a good example of like what's the difference between a a sort of you know product level feature and a model level feature so the model knows hey you know if the user is asking for this type of
- 39:00 - 39:30 query we should go and use search um or we should go and do function calling um so and you could sort of roughly approximate part of this experience but you definitely won't get the breadth the depth of this experience um in the API today for developers so it's definitely something that we're looking at as far as how we can make this available for more developers so this is going to take some time we'll come back to it but basically this is like a general purpose use case for anyone who's watching this video but let's go deeper for the developers let's I want to build an app let's say I want to build an AI agent and
- 39:30 - 40:00 I want to use Gemini for that how can that be done like if you can show a simple demo of building a simple AI agent which could do a particular task using Gemini yeah yeah that's a great question so I think my suggestion is I don't know do you have like a favorite agentic framework or something like that that you use for these types of use cases could be make.com or n8n or anything like that yeah yeah I've never used make.com but um we can look at this so I think the way a developer would do this is we as sort of the
- 40:00 - 40:30 Gemini API team don't have a sort of agentic framework so like we're sort of framework agnostic whatever you want to build with it could be LangChain it could be make.com it could be Zapier it could be anything um all you need to do is go into AI Studio click get API key um and then you're able to like directly copy for free take that API key and go into whatever framework you want and build something um so we can do make.com I've never used it but I'm happy we can try building something so this is one thing what I think developers would
- 40:30 - 41:00 want to use is like Visual Studio Code yeah yeah so if you go over here we have something called Cline and so the good thing that Cline does is I could go here and I can change it to Google Gemini and I can enter my Gemini key so I'll just mute the screen recording for a minute and grab my API key if I can find it let me see if this works here you can also make a key and then just delete it if you want the problem with
- 41:00 - 41:30 that is I've exhausted my credits so I do here and I generate a key does it work fail to generate a key access it's probably based on what account you're in I don't know yeah yeah but let's see if we can go to a different account okay let's see if this should work I'll get you some more credits that seems easy you should me okay so we have created an API key I'll just take this create API key so now I have an API key I can go in here and I can replace it with this and
- 41:30 - 42:00 so now I can choose all the models that I have available over here what model would you recommend yeah what do you want to build I think that's my let's say we build something which recommends me some you know something to buy from Amazon or does web scraping for me to find some contacts cool let's uh let's use the thinking model I think it's usually uh if you're looking for like the best performance on coding it's usually split between Pro and thinking so let's try the thinking
- 42:00 - 42:30 model and see um and see what it does and what do you want to build again let's say we're trying to build a web scraper I can tell it to like get me contacts like lead generation of a particular company web scraper lead generation company and role so this would actually like go on a browser and search for it is that how it
- 42:30 - 43:00 works yeah so um my understanding and I haven't used Cline before but like this will give us sort of a code experience like generate the code for exactly this use case if we wanted to specifically enable like Gemini's search functionality we can do so using um like we can tell the model to do that I think my guess is it's not going to do that by default but if we switch over to the developer documentation we can copy a
- 43:00 - 43:30 bunch of stuff in and then actually have it do that so that's basically in the AI Studio API documentation exactly so if we go it'll be like search grounding as one of the options grounding with Google Search and then we can go to this so we can copy this code sample we actually and the example is when is the next solar eclipse going to happen in the United States so we can actually just really quickly uh switch over to
- 43:30 - 44:00 a new file if we were to is that Python yeah this is Python call it search.py and then we actually use this can we save this here yeah yeah sure we'll use this as the um as the starter for having Cline I think Cline can actually like build around this file perhaps if we want to do it that way um
- 44:00 - 44:30 and then we go back to I forgot how you got to the API key but we'll hard code the API key here too yep cool it's right here see if that will actually API key this really cool uh nope that's not what we wanted let's go back to AI Studio and get that key again cool
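The `search.py` starter described above could look roughly like the sketch below. It is not the exact sample copied in the video: the helper names `load_api_key` and `ask_with_search`, the `GEMINI_API_KEY` variable name, and the `gemini-2.0-flash` model id are assumptions chosen for illustration, and the SDK calls reflect the `google-genai` Python package as I understand it. One deliberate difference from the demo is that the key is read from the environment rather than hard-coded, so it does not end up in a screen recording or a repository.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to an API key created in AI Studio")
    return key


def ask_with_search(question: str, model: str = "gemini-2.0-flash") -> str:
    """Send one question with Grounding with Google Search enabled."""
    # Imported lazily so load_api_key() works even without the SDK installed
    # (install it with: pip install google-genai).
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=load_api_key())
    response = client.models.generate_content(
        model=model,  # assumed model id; pick one listed in AI Studio
        contents=question,
        config=types.GenerateContentConfig(
            # Lets the model issue Google searches while answering.
            tools=[types.Tool(google_search=types.GoogleSearch())],
        ),
    )
    return response.text
```

With the key exported in the shell, `print(ask_with_search("When is the next total solar eclipse in the United States?"))` would reproduce the query used in the demo. As in the video, the answer is worth checking against a regular search before relying on it.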
- 44:30 - 45:00 so this is um this is telling us we need to just install the right packages so I do not have all the right packages installed so we'll see if uh if it has the Google GenAI package installed and then awesome Echo cool awesome so we installed the
- 45:00 - 45:30 Google GenAI package which is how we get access to the models um and then we're seeing sort of the raw response object which uh the query was uh when is the next solar eclipse in the United States and then the model response is the next total solar eclipse that will be seen from the contiguous United States will be August 23rd 2044 wow um the path of totality for this eclipse will only be touching the states Montana North Dakota and South Dakota and let's actually just you know just to make sure that we're not getting duped by the model um let's
- 45:30 - 46:00 switch over to uh regular Google Search and we'll just validate that this is true okay cool awesome August 22nd 2044 um so this becomes really really powerful going back to this use case that you were describing of like can we actually go and scrape the internet for certain things um it is definitely possible I think one of the features that we're working on that folks have asked for is like can I constrain it to like a certain website or like a specific criteria this is doing
- 46:00 - 46:30 like very broad search so like it is possible that I could get the answers like we just say something like um I forgot what was the what was the script like is there a specific company that you're interested in trying to get let's get in touch with DeepMind yeah um and I'm trying to think of like there is a little bit of a mental model of like this is doing uh like it really is using search behind the scenes so it's not like a traditional like prompt the model and like a normal uh it's like
- 46:30 - 47:00 think about how you would think about doing search so I'm wondering like Google DeepMind uh Google DeepMind contact info people to get in touch with at Google DeepMind I don't know we'll see if this like please only return email addresses we'll see if that works oh that's interesting
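Asking the model to "only return email addresses" is a request it may or may not follow, so one option is to filter the reply on the client side as well. The sketch below is an assumption-laden illustration, not something shown in the video: `extract_emails` and `EMAIL_PATTERN` are hypothetical names, and the regular expression is a simple approximation that will miss some valid addresses.

```python
import re

# Simplified pattern: local part, "@", then a dotted domain name.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)


def extract_emails(reply: str) -> list[str]:
    """Return unique, lowercased email addresses in order of appearance."""
    found: list[str] = []
    for match in EMAIL_PATTERN.findall(reply):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found
```

Running `extract_emails(reply_text)` over the model's text keeps only the addresses and drops the surrounding prose. It cannot tell whether an address is real or current, so the result still needs the same kind of manual validation the demo applies to the eclipse answer.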
- 47:00 - 47:30 yeah I'm not sure that that will actually work well but we'll find out this is the beauty and fun of of AI stuff and that was very very fast uh goodness that was almost literally instantaneous and so it's returning just for context of why we're scrolling through all this stuff um it actually returns like a Google search results for um so that you can like render and if we go back to the developer documentation it'll return some like HTML and CSS so that you can render this like Google
- 47:30 - 48:00 search box underneath your results so that if you wanted to like go just like the example we did of like validating that this thing was a real result uh you could just like one click go over to Google Search and validate that it's a real result um but let's see what it comes up with well I can't provide a directory of personal information for Google DeepMind employees good that seems pretty reasonable um but is the format basically yeah the most common is is the format yeah so there's probably and it has like press contact AlphaFold and it's probably pulling these um Google DeepMind
- 48:00 - 48:30 scholarship their careers website um so you know you could probably do some more prompt tuning to get the model to try to be more specific about it um yeah so all of this is possible yeah and basically Cline helps you build like a complete infrastructure this is just the output but you can like build an app with it yeah so this is Cline like people can also be using something like Cursor yeah I like I actually think Cline powers a bunch of their stuff with uh
- 48:30 - 49:00 Gemini which is awesome so I'm a Cline supporter oh okay I mean I like Cursor too and I love the Cursor team and they obviously are also using Gemini in certain contexts but yeah um yeah lots of cool stuff amazing so this is basically Cursor and it's very similar to VS Code and like it works very similarly so basically this is an overview of how you can start building with the Gemini API what else is there anything else that is in store for people to use for Google AI Studio or Gemini or any other tool yeah
- 49:00 - 49:30 that's a good question so I think the thing that's in store at the time of this recording we haven't actually launched it yet um but hopefully by the time the recording goes out we will have come out with um Gemini 2.5 Pro which is our sort of most advanced state of the art uh it's actually the world's most powerful AI model which is crazy uh it's crazy to to sort of say that in a sentence but we we've made a ton of progress on um on the the sort of latest model and it's uh specifically coding performance has been something that
- 49:30 - 50:00 we've leaned into a lot so this whole vibe coding sort of build anything with cursor and Klein and all these other products uh it matters a lot to us and and we've sort of put a bunch of effort into making sure that that's possible amazing so that's four that's to 2.5 2.5 and how is it better like what exactly is better in that in the new model yeah yeah so it's like literally across the board almost steady our performance on like almost every Benchmark um so it's everything from you know coding performance to mlu to
- 50:00 - 50:30 Humanity's last exam um and how that materializes in like real use cases is you know because it's a larger model it's just able to solve problems that other models aren't able to and it has um it has reasoning enabled by default so the model is like reasoning for every query that you send it so takes a little bit longer to get an answer but it is like directly comparable with o1 and and sort of in claw 3.7 son it and and the drop models and deep seek yeah I think it's um it's much stronger than a lot of those models actually so it's it's super
- 50:30 - 51:00 cool to see I'm really excited to see what people build and I actually think that like Klein use case and the cursor use case become really really powerful with with this example and the best part is that you can get all of that power from Google studio available for everyone available for everyone for free yeah and with this model we're we're trying to make I think one of the pieces of feedback has been like you know free models are great but like we obviously again because it's free we have to limit it um people don't want those limits they want want to use the model as much as possible so I think the model is going to be available in both the Gemini
- 51:00 - 51:30 app for people to use um we're also trying to figure out like a way for developers to be able to actually pay for the model so they can go and like use it as much as you want I think it's uh back to this thread of like um how Google thinks about the world and what we're doing with models I think it's really important that the models are available to everyone to actually go and build with like having a state-of-the-art model and not and this is my you know my impassioned pitch to the team is like having a state-of-the-art model that people can't actually build with doesn't create value for the world like it's a great like you
- 51:30 - 52:00 know PR headline for for Google which is awesome and and you know we should take PR headline wins when we can get them but um we need people to build with this stuff and the only way to build with it is like put it in the hands of developers and and sort of now the sort of AI builders that are that are using all these new tools so I'm super excited yep amazing and just as you were speaking we have the Deep Research ready for us to find the home robot that you want yeah I love this um this like what six 16 pages 16 pages I'm like here's what I need I need AI to then and you actually can so you can
- 52:00 - 52:30 ask and be like okay this is a little too long what is the main point people people say that this is going to take the consulting market away yeah do you do you agree with that uh that's a good question I think um I do think you need a human to really drive a lot of this stuff so I I am like very staunchly uh AI-augmented humans will like create a ton of value in the
- 52:30 - 53:00 world human in the loop human in the loop 100% I I I really do fundamentally believe that um and this is telling me what the main point of this so like this is really helpful as you sort of go back and forth between these different abstraction layers you now have Gemini and and sort of the um the whole suite of tools available in Docs and actually if we go back we could say if we were to oh actually and the other thing is Audio Overviews so Audio Overviews are built directly into into this Canvas mode and you can take the Deep Research report
- 53:00 - 53:30 you can turn it and if folks haven't tried Audio Overviews in NotebookLM it's been like one of the most popular features where you can basically two podcast hosts replace us yapping all day um and and make the experience come to life inside of inside of NotebookLM or inside of the Gemini app now which is just super cool so Deep Research to Audio Overviews I was talking to Dave Citron who who's one of the leads for the Gemini product team and he's saying that like that's now his like de facto work stream um when he drives into the office is he does like a deep research before
- 53:30 - 54:00 he leaves and then he listens to the audio overview in the office about all these different and it's just like personalized podcasts uh on demand which this is really cool so basically a workflow for all of you who are watching you could literally just wake up whatever field you are in maybe you could be in robotics you could be in in bioengineering you could literally just ask it give me the latest news on this and it'll create a complete research report like 16-page-long report you can turn it into a podcast hear it while going going to the office and you can be that smart person who knows of what's
- 54:00 - 54:30 happening in the field yeah I I think this this literally like changes how people think about the relationship with learning honestly in many ways like I think there's just so many people and like I find myself in this camp all the time which is like you know you're being forced to learn something is a lot less fun in a lot of ways and I think this is like the type of interface which like uh this infinite content repurposing platform really makes content fun and if you're if you're sort of a student or you're just like learning something right now NotebookLM has all these and you mentioned mind maps has all these
- 54:30 - 55:00 great features where you can just like dump in all the documents and be like teach this to me in a way that's actually interesting and exciting and I think that's one of the like um most positive impacts of AI is that exactly that workflow uh so hopefully we'll yeah hope there's one more thing called Vertex AI um what what is exactly is it and how can developers use that yeah so Vertex um if you think about like what AI Studio does we're really focused on providing Google's first-party models to developers um if you're like you know if
- 55:00 - 55:30 you're Walmart or you're like a very large company that like needs a lot of stuff that's not just like a basic set of tools for developers that's really where Vertex comes in Vertex is able to provide this like really verbose um like full end-to-end experience for for enterprises and it's everything from like you want to train your own foundation model and there's like you know some of the world's best foundation models are trained on on through Vertex on Google Cloud to you need M and and you need like billions of tokens a minute and like huge you know production
- 55:30 - 56:00 enterprise-grade scale um to you actually want to access third-party models like Anthropic's models for example are available on Google Cloud and a lot of the open source models are available there and others so there's a whole like Model Garden of like hundreds of of third-party not Google models available there versus AI Studio is like we're there to provide specifically Google's first-party models for the developer startup uh user persona wow and we have a 10-minute podcast ready for you to watch we'll listen for 10 seconds and we'll
- 56:00 - 56:30 see if it's any good if you're listening to this you're probably as fascinated as we are by what's happening with home robots yeah I mean we're moving way beyond just robot vacuums now right oh absolutely things are getting really interesting and uh we're lucky enough to have this report on the latest humans yeah it goes really deep and we're going to try and unpack it all for you today so what are we looking at well I think the biggest I love that trying to convince the the Deep Research team that they need to have a um they need to
- 56:30 - 57:00 have their own podcast because I feel like that would actually like using Deep Research to power the podcast experience to like do like really like I come full circle yeah it'd be some like really really interesting use cases and takeaways and you can download this and hear it whenever you want to yeah and then you can take the model and go back to AI Studio and get all the timestamps and like you know do all this all this so I think we're like very um the Gemini app is is sort of taking full form with like bringing together all of these parts of the Google ecosystem into a single
- 57:00 - 57:30 interface and there's actually this personalization now where you can go in and ask questions and it basically pulls in a bunch of your search history and like really personalizes the experience to you so like lots of cool it has those nuances about me and what I've like talked about EXA in the past it's it's super cool so I think all of those experiences are coming together in the Gemini app and it's it's been super fun to watch this team build all this and bring it together amazing this was really insightful thank you so much Logan last question that I have is for people who are still watching this video till the very end and they want to build
- 57:30 - 58:00 an AI startup in 2025 what has your experience told you the advice that you want to give to them yeah that's a great question I think the two points one is I think everyone's looking for like what is their advantage what is their differentiator I think there's a lot of these things actually baked into the models today and like obviously you know somewhat you know selfishly I I like talking about Gemini stuff I think Gemini is a great example of this where like we talked about all these multimodal use cases we talked about these video use cases like you
- 58:00 - 58:30 know most people are not building in that space so I think there's like real competitive edges from the models themselves that you can get like just out of the box because other people aren't using whatever that model is so I think I would explore that I would try to find like a model edge if you can um and I think there's like this emergent UX paradigm of like how do you interact with models like it's not a chatbot so I think if your startup is like we're building an AI chatbot pivot and try to do something else as soon as possible because there's there's too many AI
- 58:30 - 59:00 chatbots it's not differentiated and actually for you know most use cases you don't need a chatbot like there's a lot better experiences that you could build to like actually solve your your customer's need um so I would try to innovate on that like user experience layer of like what is the right way to solve this problem what is the ideal customer experience and actually in a lot of ways how do you abstract away the AI stuff like users most the average person doesn't care about AI they're not they're like I have this problem I have a you know video problem I have a you
- 59:00 - 59:30 know a cooking problem I have a whatever a cleaning problem and whatever your example is like they don't have an AI problem AI is the solution if it's possible abstract it away from your user and I think because of the moment that we're in with AI there's such a desire to like put that in front of users because in some cases they are looking for it but I think as much as possible fight against that abstract away the AI find your model edge find your UX edge there's never been a better time to build this has been an amazing time talking to you thanks a lot Logan this
- 59:30 - 60:00 was a complete walkthrough of Gemini Google AI Studio and all that you can do with it as a developer or as a general person watch this video till the very end re-watch it take notes and start executing thanks a lot for watching share this video with a friend and we'll see you in the next one