The 6 Steps to Master AI

Estimated read time: 1:20


    Summary

    In the most comprehensive guide to AI currently available, Riley Brown takes you on an exhaustive journey through the rapidly evolving world of artificial intelligence. Whether you are a complete beginner or someone looking to enhance your AI toolset, this video is your ultimate resource. Covering everything from basic chat AI tools to advanced coding applications, Riley introduces over 100 practical and fun ways to leverage AI in today’s world without needing technical expertise. By following the outlined steps, you'll be equipped to solve complex problems effectively using the latest AI technologies and platforms.

      Highlights

      • Riley Brown's guide makes AI accessible to everyone, not just the tech-savvy! 🧠
      • Learn the concept of 'Vibe Stack' to master AI tools and platforms efficiently. 🎯
      • Explore the latest AI tools that enable easier content creation and editing. 🎬
      • Transform workflows with AI-driven automations and efficient coding techniques. 🛠️
      • Escape the complexity of traditional coding through intuitive AI applications! 🌟

      Key Takeaways

      • Mastering AI doesn't require a PhD or technical expertise; practical tools are now at everyone's fingertips! 🤓
      • The 'Vibe Stack' is your gateway to understanding and creating with AI, from automation to agent tools. 🚀
      • Visual and sound editing, once reserved for experts, are now accessible, fun, and easy with AI technologies. 🎨
      • Automation is key to efficient workflows—learn how to automate using tools like Zapier and GPT models. 🔄
      • AI tools can solve creative challenges, such as designing products and creating content, with innovative approaches. 💡
      • With AI, app creation is simplified; even complex projects can be tackled without coding from scratch. 📱

      Overview

      Dive into the comprehensive AI guide crafted by Riley Brown, where complex AI tools are democratized for everyone. This video spells out how even a novice can jump onto the AI bandwagon and leverage popular platforms and tools without previous technical know-how. With a focus on practical application, you'll understand the 'Vibe Stack,' enabling you to create and innovate using AI technologies effectively.

        Discover how AI tools are reshaping content creation and workflow automation. Visual and audio editing, tasks once designated for experts, are now fun and straightforward with AI innovations. Riley illustrates this by demonstrating key AI tools and the immense benefits they bring to realizing your creative projects. Empower your creative side and streamline tasks like never before!

          Riley unwraps the potential of 'vibe coding,' where app development, regardless of complexity, is possible without starting from scratch. Learn how to integrate APIs into your projects, adopt creative AI solutions, and see how these tools can transform traditional processes. This video not only equips you with the knowledge to utilize AI but also inspires a mindset shift towards innovation.

            Chapters

            • 00:00 - 00:30: Introduction The introduction presents the premise that the viewer might feel left behind in the rapidly advancing AI revolution, and positions the video as the ultimate beginner's guide to AI. It promises to be the most comprehensive resource on AI available online, assuring viewers that even if this is the only content they've consumed in the past two and a half years, they'll still be ahead of the curve. The video plans to showcase over 100 practical and enjoyable applications of AI.
            • 00:31 - 01:00: Video Overview The chapter 'Video Overview' introduces the structure of the video, which is divided into three sections. The first section, 'Vibe Stack,' covers popular use cases for AI tools and platforms, emphasizing their importance in creating useful, innovative, and enjoyable automations and agents. The foundation for this understanding is the Vibe Stack. The chapter also hints at moving toward building an app in section three.
            • 01:01 - 01:30: Exploring ChatGPT and Its Applications The chapter titled 'Exploring ChatGPT and Its Applications' discusses building an application using Vibe Stack tools without any need for coding expertise. It emphasizes the accessibility of these tools, allowing individuals to create functional applications for various use cases, including business, to potentially generate revenue. The chapter aims to provide an ultimate guide on utilizing AI, starting from ChatGPT.
            • 01:31 - 02:00: Using ChatGPT for Business and Projects This chapter begins by highlighting the power and utility of AI-powered coding tools, specifically focusing on their ability to significantly enhance problem-solving capabilities. The chapter promises that by understanding these tools, particularly ChatGPT, users will become much more effective in their respective fields. It starts with an introduction to chat AI tools, with a focus on ChatGPT, emphasizing the benefits the author has personally experienced using them.
            • 02:01 - 02:30: Deep Dive into Vibe Stack and Coding Tools The chapter discusses the advantages of using LLM-based tools for enhancing productivity and quality. These tools have made vast amounts of information—from books to the entire internet—immediately searchable. The discussion begins with the relevance and significance of asking questions.
            • 02:31 - 03:00: Automating with ChatGPT and AI Tools In this chapter titled 'Automating with ChatGPT and AI Tools', the focus is on how to effectively use ChatGPT for practical tasks. The speaker begins by introducing the ChatGPT platform and mentions that many readers might already be familiar with it. The chapter starts with basic interactions, such as asking ChatGPT questions like how to negotiate a better salary. The platform is highlighted for its capability to provide detailed and useful responses tailored to specific contexts, exemplified by queries related to working in finance. The examples underscore the utility of AI tools in assisting with everyday professional challenges.
            • 03:01 - 03:30: AI Use Cases and Applications The chapter titled 'AI Use Cases and Applications' provides advice on professional development and career management. It emphasizes the importance of documenting everything, strengthening one's leverage, and quietly exploring other roles. The chapter also discusses the capabilities of asking questions using search and highlights the growing capability of Large Language Models (LLMs) in this context.
            • 03:31 - 04:00: Utilizing ChatGPT in Real World Scenarios The chapter explores how ChatGPT can be utilized in real-world scenarios by demonstrating the use of its web search feature. It illustrates this by starting a new chat to find information on who made the NBA playoffs. The process involves ChatGPT searching the web, retrieving the relevant details about the playoffs, and generating a concise summary, which can then be used, for example, as content for a tweet.
            • 04:01 - 04:30: Customizing AI Projects The chapter "Customizing AI Projects" discusses the project management feature in ChatGPT, which allows users to organize their chats into projects. For instance, a project named 'Riley's tweets' can be created to store relevant interactions. This feature is useful for customizing and managing different discussions or themes for convenient access and overview.
            • 04:31 - 05:00: Comparative Analysis of AI Models The chapter titled 'Comparative Analysis of AI Models' provides a walkthrough on setting up a project, with an emphasis on the user interface differences between project and non-project views, using the 'Riley's tweets' project as the running example. The viewer is guided through adding files and custom instructions within a project, particularly the text box used to enter those instructions.
            • 05:01 - 05:30: AI Video and Image Tools The chapter discusses the use of AI tools in generating video and image content. It highlights the capability of these tools to adapt to specific styles and beliefs as instructed, using examples from the project to guide their output. It also emphasizes maintaining a specific voice in content creation by refraining from using emojis and hashtags.
            • 05:31 - 06:00: Practical Examples and Demonstrations The chapter titled 'Practical Examples and Demonstrations' focuses on illustrating the application of theoretical concepts through real-life examples. It emphasizes the importance of using a conversational tone while engaging with users, akin to the examples shown. The text is organized in a format where each line or sentence stands independently with a full line break in between, enhancing readability and clarity. The examples serve not only to demonstrate practical applications but also to encourage the adoption of a clear and engaging writing style.
            • 06:01 - 06:30: Integrating AI Tools into Workflows The chapter discusses the integration of AI tools, particularly how to tailor AI outputs to better fit user preferences. It starts with a reflection on how generic AI outputs, such as auto-generated tweets, might not align with a user's personal style, citing examples like added hashtags and emojis that do not resonate with the user's voice. The discussion progresses to emphasizing the importance of custom instructions. By employing these instructions, AI can generate more personalized and suitable content. The speaker underscores the value of referencing specific examples to guide AI more effectively, suggesting a shift from generic to customized AI interactions.
            • 06:31 - 07:00: Advanced AI Tool Usage In this chapter, the focus is on effectively using AI tools like ChatGPT to manage social media content, specifically Twitter. The chapter discusses organizing highlights by pinning tweets, allowing for easier access and management. The user demonstrates copying a pinned tweet and utilizing document editing tools like Google Docs for further processing or archiving of content. Overall, this chapter illustrates a practical workflow for enhancing productivity through strategic use of AI and digital tools.
            • 07:01 - 07:30: Building with AI and API Integration The chapter 'Building with AI and API Integration' discusses the process of integrating artificial intelligence and API capabilities into workflows. In the given excerpt, the focus is on automating the documentation of tweets using a Google document. The procedure involves opening a Google document, pasting tweets, and organizing them systematically, starting with headers indicating each tweet, such as 'tweet one'. The approach described enables efficient collection and review of tweets. The narrative implies that this method can streamline the selection and documentation of the best tweets, illustrating a practical application of AI and API integration.
            • 07:31 - 08:00: Developing AI-Enhanced Applications The chapter focuses on developing AI-enhanced applications with a specific emphasis on creating content designed for platforms like Twitter. It discusses the process of selecting high-performing videos to serve as examples in projects, highlighting the importance of quality and engagement metrics, such as views. The transcript outlines the practical steps of saving and categorizing useful examples for future reference.
            • 08:01 - 08:30: Exploring AI Video Generation Tools This chapter delves into how to integrate PDF documents into the workflow. The discussion begins with navigating back to ChatGPT, where users are informed about the option to add files. By adding files such as the top-video-tweets PDF, users can easily manage and integrate these resources. The capability to search within this setup is highlighted, allowing for tasks such as drafting a tweet for a vibe-code tutorial.
            • 08:31 - 09:00: Sound and Video Synchronization The chapter discusses drafting the tweet that promotes the tutorial. It specifically mentions 'MCPs,' the useful ones covered in the tutorial being announced. The chapter refers to creating a tweet-like output, indicating that concise and clear communication is a key component of the exercise. It also touches on searching the web and utilizing the project's context to support the result.
            • 09:01 - 09:30: Creating AI-Powered Experiences The chapter 'Creating AI-Powered Experiences' discusses utilizing AI to enhance personal computing environments. It describes a tutorial where readers can connect AI to their file systems, calendar, and Notion application using voice commands. The focus is on transforming the user experience by crafting a personalized AI operating system. This approach aims to increase productivity by enabling voice command operations without requiring typed inputs.
            • 09:31 - 10:00: Automation with Zapier and Other Tools The chapter discusses the use of customization in personal AI projects, specifically focusing on using an advanced reasoning model. It highlights how projects like 'Riley's tweets' help in managing and storing chats and information in one place. The chapter previews using the latest model, o3, to enhance the functionality of these tools.
            • 10:01 - 10:30: AI Agents and Workflow Optimization In this chapter, the focus is on utilizing AI Agents to enhance workflow optimization. The chapter discusses the concept of 'search mode' and illustrates a practical application where an AI is tasked with conducting research on Riley Brown AI's perspectives. The AI agent is then asked to produce a mini essay, mimicking Riley Brown's tone, about the transformative impact of Vibe Code on a global scale. This highlights the innovative ways AI can be used to imitate human cognition and produce insightful analyses on specific topics.
            • 10:31 - 11:00: Case Studies and Practical Examples In this chapter, the focus is on using real-world data to highlight a growing trend. The power of the o3 model is emphasized, particularly its ability to search the internet for expert perspectives, exemplified through the mention of Riley Brown's views. The chapter underscores the model's persuasive capabilities and utility, even if the initial prompt didn't fully showcase its potential.
            • 11:01 - 11:30: AI in App Development and Deployment The chapter titled 'AI in App Development and Deployment' delves into the concept of 'vibe coding,' which entails focusing on the essence of problem-solving and the spirit of the project rather than getting bogged down in syntax. The term was coined by Andrej Karpathy, who urged developers to embrace the 'vibes,' the overall feeling and objective of the code, as if the code itself doesn't exist. The chapter discusses how AI can aid in this process by searching the internet to understand 'vibe coding' and adapting the project's tone according to provided instructions and uploaded files.
            • 11:31 - 12:00: Conclusion and Future Directions In this chapter titled 'Conclusion and Future Directions', the discussion focuses on the functionalities of the chat interface: searching, creating projects with custom prompt types, and setting custom instructions. It explains the ability to upload files directly within a chat by using the plus sign, and discusses the process in a new ChatGPT chat, including uploading files and taking screenshots.

            The 6 Steps to Master AI Transcription

            • 00:00 - 00:30 Do you ever feel like you're falling behind the most important revolution the world has ever seen? Well, you're in the right place. This is the beginner's guide to AI. This is going to be the most in-depth video on the entire AI space on the entire internet. In fact, if you had been sleeping under a rock for the past 2.5 years and this was the only video you'd watched on AI, you'd be ahead of the curve. In this video, you're going to see over 100 useful, practical, and fun ways to use AI. This
            • 00:30 - 01:00 video is going to be divided into three sections. Section number one is called the Vibe Stack, which is all of the popular use cases for AI tools and popular platforms. It's important that you understand this so that you know how to create automations and agents that actually do useful, cool, and fun things; the foundation for that is this first section, the Vibe Stack. And then we're going to move on to section three, and we're going to build an app
            • 01:00 - 01:30 with all of this in mind. We're going to build an app that leverages the Vibe Stack tools. We're going to mix and match tools, and you are going to have a better understanding of how to create apps without writing a single line of code using these popular tools. You do not need to be technical in order to code and build applications that you can use to make money, that you can use in your business, for whatever use case you want. This is going to be the ultimate AI video, from ChatGPT all the way up to
            • 01:30 - 02:00 the most intense vibe coding tools. I'm going to break it down very simply. And by the end of this video, you are going to be able to tackle problems 10 times better than you could before, knowing that you have all of these tools at your disposal that you can use at any time to dominate whatever field you're in. So, let's go ahead and start with the chat AI tools. And we are going to start with none other than ChatGPT. The main benefit that I've gotten from these chat
            • 02:00 - 02:30 tools is I can do everything significantly faster and at just as high quality, because one thing that these LLM-based tools did is they basically took all books, all of Wikipedia, all of Reddit, and basically all of the internet, and they made it searchable, immediately searchable. Which brings me to the first use case, which is asking a question. So, we're going to start here on
            • 02:30 - 03:00 chatgpt.com. This is where we are. And most of you guys already know this. We're going to start off with the basics. You can ask a question like, "How do I negotiate a better salary?" And it is going to literally give you the information right here. It is going to tell you exactly how to negotiate a better salary. And you can say, "I work in finance and my boss is being horribly
            • 03:00 - 03:30 mean to me." And so here it tells you to document everything, strengthen your leverage, start quietly exploring other roles, negotiate or exit. Very good advice. Not only can we ask questions, we can ask questions using search, and more and more of these LLMs have this capability.
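
            (If you'd rather script this step than click through the UI, the same question can be sent through the official openai Python SDK. A minimal sketch, assuming an OPENAI_API_KEY environment variable; the model name is just an illustration:)

                # pip install openai
                from openai import OpenAI

                client = OpenAI()  # reads OPENAI_API_KEY from the environment

                # Same question the video types into the ChatGPT UI.
                response = client.chat.completions.create(
                    model="gpt-4o",  # illustrative; any chat-capable model works
                    messages=[{"role": "user", "content": "How do I negotiate a better salary?"}],
                )
                print(response.choices[0].message.content)
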
            • 03:30 - 04:00 If we just start a new chat here, we can turn on search and we can say, "Please tell me who made the NBA playoffs." And write a little tweet about it. And as you can see here, it says searching the web. And so it searched the web. It tells us exactly who's made the playoffs. It gives us a little information on it. And then it
            • 04:00 - 04:30 gives us a tweet. And obviously it doesn't have any information about how I would tweet, which brings me to projects. If you open up this side window here on ChatGPT, you'll see this projects header. If we come over here and we hit new projects, we can now create a folder of all of the chats that we want to keep in this folder or project. And so what I want to do is let's call this Riley's tweets. And what
            • 04:30 - 05:00 we can do is we can just create this project. Now this is going to open up a slightly different view. This looks different than before; here we're not in a project, but if we click Riley's tweets, we see this slightly different view. And when you are in this view and within a project, you can add files and you can add custom instructions. Let's go ahead and add some instructions. So, it's going to open up a text box, and what we
            • 05:00 - 05:30 can do is we can say: whenever I ask for a tweet, or any content for that matter, please use the examples in this project to influence the style and beliefs. Please don't ever use emojis or hashtags. Maintain the voice of the
            • 05:30 - 06:00 user while also using the tone of voice from the examples. When writing content, every line or every sentence should be on its own line, like this: a full line break in between each sentence. Now, if
            • 06:00 - 06:30 you remember the chat we just created down here, I asked for a tweet, and the tweet added hashtags. It added multiple emojis. It just sounded dumb, right? Or, you know, it sounded not like me. But when we have these custom instructions, it won't do that. But if you notice here in the instructions, I said reference the examples. So, we need to give ChatGPT
            • 06:30 - 07:00 some examples. So, let's go ahead and go to Twitter. I'm going to go to my Twitter account. And what I'm going to do is I'm going to go to highlights. Anytime you pin a tweet, it actually goes to your highlights. And so what we can do is we can actually take all of my pinned tweets, and let's go ahead and copy this one right here. So we're going to copy this, we're going to go back to ChatGPT, and we're actually going to open up... let's go to Google Docs. Let's just go
            • 07:00 - 07:30 to a Google Doc right here. And we're going to open up a new Google Doc. And I'm going to paste in that tweet. And what I'm going to do is I'm just going to make a header and say tweet one. So we created this first tweet right here. Now we can move to tweet two. So let's go ahead and throw these next to each other. Now I can just keep going down. Let's look for the best tweets. And so all of these tweets right here are made
            • 07:30 - 08:00 specifically for Twitter videos. And you can see this video got 200,000 views. And I just went through and grabbed the top five videos that I've made on Twitter and grabbed those tweets, because when you're finding examples, you want to choose the best and highest quality ones. We're going to put this in the project to be used as an example, and you want it to have really good examples. So, let's save these as top video style tweets. Now, we are going to hit file and we're going to download
            • 08:00 - 08:30 this as a PDF document. Now we can go back to ChatGPT. So we can press this. We can add files. And now we can select that PDF, the top video tweets PDF, and we can enter that right there. Now what we can do is we can search again just like before. And now what we can type in is: I am creating a vibe code
            • 08:30 - 09:00 tutorial video on Cursor, specifically going over MCPs, the useful ones. Please create a tweet like in the files. Please give me four options. And so now it is going to search the web and it's going to have context, right?
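
            (The project pattern above, custom instructions plus example files, maps onto plain API calls as well: the instructions become a system message and the examples become context in the prompt. A rough sketch, with a hypothetical top_video_tweets.txt standing in for the PDF of examples:)

                from openai import OpenAI

                client = OpenAI()

                # Stand-ins for the project's custom instructions and example file.
                instructions = (
                    "Whenever I ask for a tweet or any content, use the examples below to "
                    "influence the style and beliefs. Never use emojis or hashtags. "
                    "Put every sentence on its own line with a full line break between sentences.\n\n"
                    "Examples:\n" + open("top_video_tweets.txt").read()
                )

                response = client.chat.completions.create(
                    model="gpt-4o",
                    messages=[
                        {"role": "system", "content": instructions},
                        {"role": "user", "content": "I am creating a vibe code tutorial video on "
                                                    "Cursor, specifically going over the useful MCPs. "
                                                    "Create a tweet like in the files. Give me four options."},
                    ],
                )
                print(response.choices[0].message.content)
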
            • 09:00 - 09:30 Tweet option one. Remember, I specifically asked for each of the sentences to be on its own line, and it's doing that. Vibe coding with MCPs transforms Cursor into a personal AI operating system. In this tutorial, we connect Claude to your file system, calendar, and Notion, all without typing, just voice commands and the agent. That's exactly how I would say it. And so now we have created a
            • 09:30 - 10:00 customized personal AI project that uses our tone. Okay. So now let's say we wanted to start over within projects. We can press Riley's tweets. Now we're back here, and it's actually going to organize all of the chats that we do within Riley's tweets in this project. It's going to keep all of that information right here. And now what we can do is we can actually change the model. And so now I'm going to be using o3, which is the latest advanced reasoning model. And so
            • 10:00 - 10:30 what we can do is we can turn on the search mode and we can say: I want you to do research on how Riley Brown AI thinks, and I want you to write a mini essay in his tone on how vibe coding is changing the world. Find
            • 10:30 - 11:00 real-world data that people might have missed on this growing trend and be very persuasive. And this model is incredible. By the way, this isn't a very good prompt to understand its full utility, but o3 is a very strong model. So what this did is it actually searched the internet for me, Riley Brown, and it got my perspectives on
            • 11:00 - 11:30 vibe coding, and then it also searched the internet on what vibe coding is. And then it used the project that we were in to create the proper tone based on our instructions and the files that we uploaded. Forget the syntax, feel the problem, speak the solution. That's core vibe coding, the term that Andrej Karpathy coined when he told builders to fully give in to the vibes and pretend the code doesn't exist. So far we've
            • 11:30 - 12:00 used this chat interface, where we can turn on search. We can also create a project for specific types of prompts that you want to use, in which you can set your own custom instructions and add project files. Well, we actually don't need to add files directly within a project. We can actually do it directly within the chat by pressing this plus sign. If we go back to ChatGPT, we have a new chat. We can actually upload files. We could take a screenshot of this right
            • 12:00 - 12:30 here. And we could save it to our clipboard. And we can go back to ChatGPT. And we can just paste that bad boy in there. And we can say: please look at the image tools and write an essay in the tone of Sam Altman on the image tools I mentioned. Short,
            • 12:30 - 13:00 concise, as if it was an internal memo to the OpenAI team. And now this new o3 model, which we're going to talk about later, we are going to test out right here. And it actually just analyzed the image, and now it's analyzing the image at a deeper level, which we are going to
            • 13:00 - 13:30 talk about later. The level to which o3, the new OpenAI model, can analyze images especially is truly remarkable. And so it went through all of this analysis before doing this write-up, an internal memo from Sam Altman on where we stand on the current crop of image generation tools: "Team, a quick reality check on the images column from the Vibe Stack map Riley shared."
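
            (Screenshots work the same way over the API: an image is just another part of the message. A minimal sketch, assuming a local screenshot.png; the memo prompt mirrors the one used in the video:)

                import base64
                from openai import OpenAI

                client = OpenAI()

                # Encode a local screenshot and send it alongside the text prompt.
                with open("screenshot.png", "rb") as f:
                    image_b64 = base64.b64encode(f.read()).decode()

                response = client.chat.completions.create(
                    model="gpt-4o",  # vision-capable; o3 also accepts images
                    messages=[{
                        "role": "user",
                        "content": [
                            {"type": "text",
                             "text": "Look at the image tools in this screenshot and write a short, "
                                     "concise essay on them, as if it were an internal memo."},
                            {"type": "image_url",
                             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                        ],
                    }],
                )
                print(response.choices[0].message.content)
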
            • 13:30 - 14:00 And we can actually copy this. Or, one of the most underutilized tools in all of ChatGPT: you can actually edit this in canvas. And so we can actually just press edit in canvas right here. So the chat is here on the left, and it put the canvas right here. And now we can actually make edits to this. We can say: actually, shorten this, and please write it
            • 14:00 - 14:30 in sentences in the tone of Jeff Bezos. And so it is actually just going to directly edit this right here. Wait, sorry. In order to do that, I think you need to use GPT-4o to use the canvas. And we can say: add a sentence at the end. Yes. So here it literally edits it just like that. So it added this sentence here at the end. And then you
            • 14:30 - 15:00 can actually just edit this, right? So we can actually make this a heading, and we can actually add a sentence here. And we can say: hello everyone. And then we can say: add two paragraphs and write it in the tone of Elon Musk. And now it will edit it, and you can watch it edit this, which is a very
            • 15:00 - 15:30 fun and interesting thing. And you can see here that this is a much more direct, Elon Musk style voice. And so here are those six use cases for ChatGPT. However, I do want to mention that Gemini.google.com, which is Google's version of ChatGPT, their LLM tool, is virtually identical in many different ways. We can even use this canvas mode. Please write an essay on
            • 15:30 - 16:00 themes of racism in To Kill a Mockingbird. And here we're using Gemini Advanced 2.5 Experimental, which is a very good model. And as you can see here with Gemini, we also have, in my opinion, a better design, and it is very easy to just edit things. Right here we can add
            • 16:00 - 16:30 paragraphs. I think we actually have more styling options. And then one thing we have here is a one-click export to Docs. And this allows you to immediately get it in Google Docs. And this is incredibly useful because I personally use the Google suite and I'm constantly using Google Docs. So we can put down Gemini canvas export to Google Docs. In terms of the search feature, Gemini also has the search feature. So,
            • 16:30 - 17:00 we can always do this deep research, or it'll actually search the internet by default. But there's also a lot of other LLM tools; we can actually just use Perplexity. This was the first of these tools that did search, and now all of the tools have search, so I think they might be trying to figure out what they're going to do that is differentiated. But again, this is another tool that has a lot of similar features. This tool does not have a canvas feature, but it does have a projects feature, except they call them spaces.
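
            (Perplexity's search-grounded models are also available over an API that, at the time of writing, is OpenAI-compatible, so the same SDK works with a different base URL. The "sonar" model name is an assumption; check Perplexity's docs for current names:)

                from openai import OpenAI

                client = OpenAI(
                    api_key="YOUR_PERPLEXITY_API_KEY",     # from your Perplexity account settings
                    base_url="https://api.perplexity.ai",  # OpenAI-compatible endpoint
                )

                # Perplexity models search the web by default, like the product does.
                response = client.chat.completions.create(
                    model="sonar",
                    messages=[{"role": "user", "content": "Who made the NBA playoffs?"}],
                )
                print(response.choices[0].message.content)
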
            • 17:00 - 17:30 You can actually create a space, and it's very similar to the ChatGPT project. If you remember, it has custom instructions. After you create a space, you can see here that you can upload files, and you can actually upload links, which is very new to the spaces feature. So yeah, these are the different chat tools. We looked at ChatGPT, Gemini, and Perplexity. We're going to talk a lot more about Claude later in this video. And you can also use Grok. As you can see here,
            • 17:30 - 18:00 they all mostly have project features. They all have the canvas feature except for Perplexity. They all have search. I think Perplexity is the best at search. I think Claude has the best artifacts feature; I think Gemini is really close, though. And they all have deep research. I think ChatGPT has the best deep research feature. They all have file upload; they're not even worth comparing. And the most unique feature that Gemini has is video analysis. You can upload videos and have AI analyze them. But this
            • 18:00 - 18:30 actually isn't in their Gemini app. It's actually in their AI Studio app. So just type in Google AI Studio. We are going to go to Google AI Studio, which is very similar; it's kind of confusing. But if you actually go to this Flash model... where is it? So if you go to Gemini 2.0. So what we can do is we can actually just drag a video in right here. Here is a TikTok video that I made. Please give me
            • 18:30 - 19:00 timestamp segments on where I could add some B-roll popups that would be best for education, and describe them. So this AI model literally takes in a video here. It takes in a video and it understands the video at every single timestamp, and it
            • 19:00 - 19:30 is truly insane. And so here it's thinking. It literally tells us the exact timestamps. Spoken cue: "Both of these models have gained the same superpower." It has control over every single pixel it generates. B-roll suggestion: so it can select objects, remove them, and add a new object nearly perfectly. B-roll suggestion: use the existing hat example sequence from 26 to 30. It's already
            • 19:30 - 20:00 excellent B-roll demonstrating exactly this. You could slightly slow down the transition where the hat appears. This is how we get AI agents that do video editing for you: you have AI models that understand the video. This video you're watching right now could be uploaded to this model and fully understood, and I could say "as you can see by this chart," and if it had access to all my charts, the AI model could actually just add the chart right there. I wish I could do that right now. That is where this technology
            • 20:00 - 20:30 is headed, and this is so under-talked about. It isn't talked about at all, honestly. This model right here literally understands videos. All of Gemini's AI models do, and that is incredible.
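
            (The video understanding shown in AI Studio is also exposed through the Gemini API via its File API. A sketch using the google-generativeai Python package; the model name is illustrative, and large uploads take a moment to process:)

                # pip install google-generativeai
                import time
                import google.generativeai as genai

                genai.configure(api_key="YOUR_GEMINI_API_KEY")

                # Upload the clip, then wait for the File API to finish processing it.
                video = genai.upload_file("tiktok_clip.mp4")
                while video.state.name == "PROCESSING":
                    time.sleep(5)
                    video = genai.get_file(video.name)

                model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative video-capable model
                response = model.generate_content([
                    video,
                    "Give me timestamped segments where I could add B-roll popups "
                    "that would be best for education, and describe them.",
                ])
                print(response.text)
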
            • 20:30 - 21:00 So this is a pretty good start for chat. You can ask a question. You can ask a question using search. You can create custom AI projects for your specific use case, and within those projects, you can also search. You can add files such as images and PDFs. You can use the canvas feature to edit. And then within Gemini, you can actually use the canvas and export it to Google Docs. And you can also use AI Studio, which is Gemini's AI, to analyze videos. And you can actually use all of these functionalities within your own apps that you can create, as we will get to later. But what I really want to move on to is images, and how to use visuals in
            • 21:00 - 21:30 your workflows, and how it relates to this chat, because the best AI image model right now is actually directly inside ChatGPT. And so that's where we're going to start. So up until recently, Midjourney was the best AI image model by far. You could feel the difference between Midjourney and all of the rest of the image models. Currently, to this day, it's actually still the best at realism, in my opinion. However,
            • 21:30 - 22:00 I would say for the best overall image model for everyday use cases, especially in the business and content setting, ChatGPT 4o is the best. It is slower than Midjourney. Midjourney is a lot more fun of a platform because you can create a lot more images faster; it's faster, and you can create batches. But GPT-4o is the best at
            • 22:00 - 22:30 styling. Obviously, it had the Ghibli phenomenon, which was the most viral AI feature in a very long time, and it's really, really good at text in images, like the best by far. And let's dive into that right now. So just like earlier, we're going to go to ChatGPT. And all we have to do is make sure that we're using the 4o model. So
            • 22:30 - 23:00 let's pretend for a second that this is my house. This is not my house, by the way. I just found it on Zillow. But we're going to take a screenshot of this. And this is an idea that I got yesterday. Someone posted this on Twitter, and I can't find the tweet, but they were saying that paint companies, or companies that paint houses, are actually just using AI to change the paint of the house to show you what it looks like. And let me show you what this very
            • 23:00 - 23:30 useful business use case looks like. So, we can just upload an image of this house and we can say, "Please paint this house dark gray. Keep everything else the exact same. I just want to change the paint color to dark gray." And the GPT-4o model will immediately
            • 23:30 - 24:00 just generate one singular image, which is different than Midjourney, where you can have it generate four at a time and run it over and over again. So I could literally have it generate 30 images in the time that it would take for this to generate one image. And so in that way, Midjourney is better. But in terms of the editing quality and using natural chat language to edit an image, GPT-4o is far and away the best.
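
            (On the API side, this kind of natural-language image editing is exposed through the images endpoint; gpt-image-1 is the API-facing image model, which may differ from exactly what the ChatGPT UI runs. A sketch of the house repaint:)

                import base64
                from openai import OpenAI

                client = OpenAI()

                # Edit an existing photo with a plain-language instruction.
                result = client.images.edit(
                    model="gpt-image-1",
                    image=open("house.png", "rb"),
                    prompt="Paint this house dark gray. Keep everything else exactly "
                           "the same; only change the paint color.",
                )

                # gpt-image-1 returns base64-encoded image data.
                with open("house_dark_gray.png", "wb") as f:
                    f.write(base64.b64decode(result.data[0].b64_json))
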
            • 24:00 - 24:30 And there we go. Look at this. So everything about the house is basically the same. And so we started with this, and now it looks like this. And obviously it painted these bricks that might not get painted, but I think this actually kind of looks cool. And you can actually paint it any color. And you can say: keep
            • 24:30 - 25:00 everything the same except add a red Jeep in the driveway with black rims. Maybe you have this car. Maybe you want to put the user's car in the image. I don't know what you want to do, but those are some use cases, right? We can change the paint on houses for a paint company. And hopefully your
            • 25:00 - 25:30 mind is racing. You can be like, "Okay, what else can you do with that? What else can I do like this?" And the answers are endless. Imagine if you work for a pen company and you need to sell this pen, right? You don't want to just have to take a photo of this pen. You could say, "Please put this pen in the hand of a model as they hold it." In fact, we'll do that right after this is done loading. There you go. We have the same image and we have this red Jeep out
            • 25:30 - 26:00 front. That's pretty good. We have this new house, which is basically identical to the form of this house right here. We painted it dark, and now we have this red Jeep right here in the driveway. Let's get this photo of cologne. We'll paste that in and say, "Please turn this into a highly
            • 26:00 - 26:30 professional product photo that looks very sexy. This should have a really cool background, but the actual product should look exactly the same." Look at that. That's pretty insane. And we can actually do specific
            • 26:30 - 27:00 edits. So, let me show you something real quick. I don't know why my mouse cursor isn't showing up over the selection, but we can select a certain area on the photo, as you can see right there. And what we can do is we can say: change this text to Riley Brown, the goat. And we can now edit this highly professional product photo. And it looks
            • 27:00 - 27:30 like it got it: basically the ratio, the font, and the spacing look to be exactly the same. Same colors. It just has really clean lighting. And we could obviously change the background as well. All right. Well, here's an example of it not being perfect, right? I selected just this part. Maybe I need to describe the selection a little bit better, but ChatGPT is going to continue to get better. And it actually just changed the whole thing to Riley
            • 27:30 - 28:00 Brown the goat. Please change the background to be lighter. Make it a velvet red background. And here you go. It created this Riley Brown the goat with a velvet red background. Let's dive into Midjourney, because Midjourney has the
            • 28:00 - 28:30 most realistic AI-generated images on the market right now. It is the most photorealistic. This is Midjourney. And Midjourney is definitely more, I would say, artistic. It has better vibes. It's just a fun site to use. And the main difference here is we can actually just mass-create images. And it's a lot more fun when you're trying to come up with something, when you're after discovery of AI-generated images and AI-generated
            • 28:30 - 29:00 styles, and there's a lot more settings built into the platform. Let's imagine we wanted to create an app icon, right? And we wanted an app icon for an app called Vibe Code, right? An iOS app icon of an orange monkey, cute, minimalistic. And yeah, let's just run
            • 29:00 - 29:30 this. And so what you can do is you can actually just press Command+Enter, and you can see here four images are now being generated. And what we can do actually is we can just do that over and over again: Command+Enter, Command+Enter, Command+Enter. And so here we now have 30 images loading all at once. And so we can kind of see them in this grid. And in this way it's a little bit dangerous, because you can spend a lot of money.
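
            (Midjourney has no official public API as of this writing, so repeated Command+Enter is how you batch there. If you want the same fire-off-a-batch loop in code, here is a sketch against OpenAI's image endpoint as a stand-in; each call costs money here too:)

                import base64
                from openai import OpenAI

                client = OpenAI()
                prompt = "iOS app icon of an orange monkey, cute, minimalistic"

                # Generate a small batch and save each result, like spamming
                # Command+Enter in the Midjourney UI.
                for i in range(4):
                    result = client.images.generate(
                        model="gpt-image-1",
                        prompt=prompt,
                        size="1024x1024",
                    )
                    with open(f"icon_{i}.png", "wb") as f:
                        f.write(base64.b64decode(result.data[0].b64_json))
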
            • 29:30 - 30:00 Each of these generations costs money. And so what we can do here is we can click on one of these and we can actually download them. And if we download them and open up Finder, we can see the image is right here. It's 1024x1024. So if it gets any bigger than that, it might start to look pixelated. But they have this feature right here, which allows you to upscale an image.
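
            (The size point is easy to verify yourself. A quick Pillow sketch; note that a plain resample like this only interpolates existing pixels, whereas Midjourney's upscaler regenerates detail with a model, which is why its 2048x2048 output holds up on posters where this won't:)

                # pip install pillow
                from PIL import Image

                img = Image.open("icon_0.png")
                print(img.size)  # e.g. (1024, 1024)

                # Naive 2x resample for comparison; expect softness that a
                # model-based upscale would not have.
                img.resize((2048, 2048), Image.LANCZOS).save("icon_0_naive_2x.png")
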
            • 30:00 - 30:30 And we can also vary an image, and so it'll use the same style right here; it'll create a variation that looks kind of similar. And so as you can see here, it created this upscale. And we can click on this. And now if we download this image and we open up Finder, we will see here that this one is 2048x2048. So it's bigger. You can use it on posters or full websites, so it doesn't lose its shape once it
            • 30:30 - 31:00 gets bigger. Here we can see that these are slightly varied. So, it uses basically the same general style, because if you notice, I first generated like 30 of these app icons. And I love doing this. And so now I can basically hone in and be like, okay, what style do I want here? And then I can find the style that I want. Let's say I really like this one: okay, I really like this style. Now, we can create variations.
            • 31:00 - 31:30 And we can click on this as many times as we want: 2, 3, 4. And remember, all of these are multiplied by four. So right here it shows four, which means this is generating 16 new variations of this app icon. And you can see that it's somewhat close. So we're kind of getting closer. But let's say we actually want to edit these at a more granular level on Midjourney. We can click on this right here. And what we can do is we can just hit editor. And so this will
            • 31:30 - 32:00 automatically open this up in this editor here. Okay. And so now what we can do is we can actually use this eraser tool. And so we can actually erase this part right here. And what we can do is we can actually edit this prompt here. So we can say: app icon of an orange monkey, and I'm just going to say, wearing a purple beanie. And we're going to hit submit edit. You can see here it's
            • 32:00 - 32:30 editing this beanie, and now it's created four versions with a purple beanie, right? And now look at this. This is really, really good. This is a lot of fun. And with a purple beanie. Now we're going to go to ChatGPT for one of our old use cases, which is a question, right? We go to ChatGPT: what is the little ball thing on the top of a beanie? It's called a pom-pom. Great. I didn't
            • 32:30 - 33:00 know that. Thought those were what cheerleaders had. We can go back with pom-pom. Now what we can do is we can submit the edit. So now it should add a little pom-pom thing at the top of the beanie. And there you go. It added this little pom-pom there. That's fine. And so within Midjourney, we had
            • 33:00 - 33:30 some interesting use cases here. Even though it seemed like what we did was simple, we did a lot. To pay someone to ideate around an app icon could cost thousands of dollars. And I'm not saying what we did is of equal quality, but we were able to ideate, to have this creative ideation process around an icon. And then we were able to create variations of the style that we liked. And then we were able to, at a more granular level, make specific edits, and
            • 33:30 - 34:00 it maintained the overall style. So we ideated 30 app icons in 30 seconds. We chose a style that we liked, and then we created 30 variations, and then we upscaled the image, and then we made granular edits to the image. But you'll notice here we're actually using version six. So if we click on settings right here and we go to version, we should go to the latest version. And so let's create a photorealistic landscape image. And we can change the exact aspect ratio; let's do 6:5. And let's do: photorealistic image
            • 34:00 - 34:30 of a tiger skateboarding, doing a kickflip over a stairway at the White House. And now we're going to create... let's create 24 versions. So, Command+Enter, Command+Enter, Command+Enter. And so,
            • 34:30 - 35:00 here we can see all of the variations of this new image generating here with version seven. And you can see these coming in. And I mean, come on, these are going to be awesome. Oh, this is crazy. And so, this is Midjourney. And every time I use Midjourney, I'm actually kind of blown away by the realism. I mean, look at this. And they're all in kind of like different styles. I love this one. So, this one
            • 35:00 - 35:30 I'm going to favorite. And let's create a few subtle variations. And let's keep going through these. This one's okay. This one looks a little fake. This one, the lighting's kind of good. I kind of dig this one; I'll favorite this one. And we can kind of go through here. This one is like a tiger wearing normal clothes. I'm going to stick with that original one that I created. So here our variations are done. Okay.
            • 35:30 - 36:00 So this is pretty cool. So we have these images here of this tiger. And let's go ahead and upscale this one right here. Okay. So we got our upscaled image. This looks really cool. Let's go ahead and save this image. Now, what we can do... let's actually move on to video. And we're going to be going back and forth from images to video, because they're very much related. Very much related. And so, we're about to just mix images,
            • 36:00 - 36:30 video, and sound to create high-quality videos. And we are going to use Krea AI. Krea AI is one of my favorite tools. They just raised a ton of money. This video is not sponsored or anything like that. What we can do here is we can actually go to video, and we are going to generate a video, and we're going to use the starting frame of that image
            • 36:30 - 37:00 that we just downloaded, of the tiger riding a skateboard down these stairs here. And we're going to say: tiger riding skateboard down stairs. And let's start off with a very simple prompt here. And so here we can actually select a model. And you can see here Kling 2.0 is the best model right now, in my opinion, based on what I've
            • 37:00 - 37:30 seen. Kling 2.0, and then also Runway Gen-4, which I'll show you in just a second. Let's go ahead and use Kling 2.0. And we're going to hit generate: down stairs and falling off. Let's do two, so we'll have one of them falling. But these... look at this. This says 5 minutes remaining. So these can take a little while. However, while it's loading, let's go ahead and go to runwayml.com, which created
            • 37:30 - 38:00 Runway Gen-4. And we're going to do the same thing. So, those are the two best models. The two best video models, in my opinion, are Kling and then Runway Gen-4. These are the two best AI video models right now. Google Veo is right behind. I'll even highlight this one as this little gray color here, because Google Veo is not far behind these two. Sora, Luma Labs, and Pika Labs I think are a slight step behind, but
            • 38:00 - 38:30 all of these companies could end up being billion-dollar companies. In fact, Runway is already a billion-dollar company. So is Kling. Yeah, I mean, it's likely. I mean, who knows if Pika Labs gets there. Pika is kind of going a more consumer route. But I will actually talk about Pika in just a second. Let's go ahead and focus on these two right here. So on Runway, we can hit generate video. And now we can drop in that same asset we dropped in when using the Kling model on Krea. And here
            • 38:30 - 39:00 we can upload this image. And let's just crop that to this 4x3. And we can enter the same thing we did into Krea. So we can come down here, and I'm actually just going to enter this prompt right here. And we are going to paste this like that. And can we do 10 seconds here? We can do 10 seconds. Let's generate... let's generate another one. Can we generate another one? Cool. So now we have five videos loading. And I actually forgot that Gen-4 just added Gen-4 Turbo.
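
            (Runway also has a developer API, separate from the web UI used here. A sketch with the runwayml Python SDK; it assumes a RUNWAYML_API_SECRET environment variable, the image URL is hypothetical, and the model and parameter names should be checked against Runway's current docs:)

                # pip install runwayml
                import time
                from runwayml import RunwayML

                client = RunwayML()  # reads RUNWAYML_API_SECRET from the environment

                # Kick off an image-to-video generation, same idea as the UI flow.
                task = client.image_to_video.create(
                    model="gen3a_turbo",                                      # illustrative model name
                    prompt_image="https://example.com/tiger_skateboard.png",  # hypothetical URL
                    prompt_text="tiger riding skateboard down stairs",
                    ratio="1280:768",
                    duration=10,
                )

                # Generation is asynchronous, so poll the task until it finishes.
                while True:
                    task = client.tasks.retrieve(task.id)
                    if task.status in ("SUCCEEDED", "FAILED"):
                        break
                    time.sleep(10)
                print(task.status)
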
            • 39:00 - 39:30 And so they just added Turbo. And this actually is very fast. It generates 10-second clips in about 30 seconds, I believe, which is a lot faster than the 5 minutes that Krea takes. Which is cool. And so sometimes you don't need to use the best model. And so you see that there's actually 2 minutes remaining on this. But let's take a look at these videos. So we're generating a video here. Okay. So, it's still like pedaling down. Let's see. Okay, maybe
            • 39:30 - 40:00 we started off with a bad image. Maybe we... oh, here it's like flying down. Maybe what we should do, and we can always come back to Midjourney, right? Maybe we go to a different one, and maybe we do one where it's not over stairs. Let's actually do another image. Let's create another concept here. And let's just say: man walking down the street. Photo
            • 40:00 - 40:30 realistic man walking down the street holding a leather brown briefcase that has a red circle logo on it. Let's say we have a briefcase brand, and the logo is just this red circle. It's going to be this viral sensation. We can generate a bunch
            • 40:30 - 41:00 of these images here. And we're going to do a more simple video. I mean, this is hard. We gave it a tiger riding a skateboard down stairs, and this would be good if it was on flat ground, but it's like, what is it supposed to do from this frame? So, this is probably an error of mine. This is obviously a very unrealistic frame, but it's like, what do you do with the physics of it? I guess we could say: tiger riding skateboard
            • 41:00 - 41:30 down stairs and falls off. We tried that on the Krea example. And so this one's done. And you can see it's kind of like flowing over the stairs, like he's flying, and then he reaches the ground at the bottom below the stairs, if you see that right there. Maybe this is just a really bad prompt. Let's go back to Krea and let's see how Kling did with these 5-second clips
            • 41:30 - 42:00 here. Oh, whoa. Okay. A lot more detailed, but very interesting. Oh, wow. Oh, wow. Oh, what the hell? That's actually crazy. Oh, it wiped out. So he's like riding down the hill, and then it flips. So Kling actually is the most detailed model. Kling actually goes crazy, I'm not going to lie. Let's imagine we
            • 42:00 - 42:30 wanted to create a 30-second ad for a McLaren, right? If we wanted to create a 30-second ad for McLaren, what we would need to do, if we think about this in terms of a video timeline in CapCut or Premiere Pro: let's say this right here is a 10-second video clip. If we wanted to create a 30-second ad, we would need three of these. So, we would need 10 seconds, 10 seconds, and 10 seconds. But more realistically, it might be an 8-second
            • 42:30 - 43:00 clip. It might end up looking something like this. But this is not good enough, right? We need more than just a video layer. We're actually going to need some sound. So, we're definitely going to need some sound. And so, let's map it out: we'll probably want some music. And then we're also going to want some sound effects, right? If you have a commercial with a driving car and it doesn't make any car noises, that's a pretty bad video. So, we're going to want some sound effects. And these sound effects... the music might be consistent
            • 43:00 - 43:30 throughout the whole background, but we might have some sound effects for the car starting. Maybe we'll have a sound for the car driving. Maybe we'll have some layered sound effects, where we have, you know, just the general sound of the city. Maybe there's some wind or something like that. But ultimately, we're going to be creating something that might look more like this. And so this is what actually makes up a video, right? A video is not just the actual video frames, right? These need to be
            • 43:30 - 44:00 consistent. They need to have a consistent style, but there also needs to be music and some good sound effects. And the third layer of audio is going to be dialogue. Let's say we're going to have some dialogue. And this layer right here might actually just be at the end. It might say something, or maybe it'll say something inspiring at the beginning. It'll be a little blank, and then we'll have this dialogue, and we'll have a consistent speaker. So, let's go ahead and actually create a video just like
            • 44:00 - 44:30 this. Let's first decide what product we want to advertise in the video. I'm going to ask ChatGPT: come up with 10 ironic, simple, funny product item ideas. A do nothing button. Okay, so do nothing button. A pre-cracked egg. Pocket sand. You know what? A do nothing button. I actually like this. A sleek button that, when pressed, lights up and does absolutely nothing. So, let's go ahead
            • 44:30 - 45:00 and copy this. Now, let's actually stay within ChatGPT. Please turn this into an image of a man about to press this button. The button is sitting on a kitchen table in front of him. Make this in the style of
            • 45:00 - 45:30 Studio Ghibli. And look at this. This is great. So, do nothing button. Do nothing button. It has this big red button, and it's in the style of Studio Ghibli. I like it. Let's go ahead and download it. So, what we're going to do is we're going to take this image and drag it into Runway Gen-4. And we are going to just type in: man about to press a button but gets
            • 45:30 - 46:00 stressed and puts both hands on his head. Generate. Generate. Now, what we're going to do while that's generating: on ChatGPT, I'm going to say, "Please make a close-up image in the vertical aspect ratio that you've already created. Please make it a close-up of the do nothing button. Make it look exactly the
            • 46:00 - 46:30 same and have the case right next to it like it is on the table, but do not have the image of the man." I guess we'll say: do not have the man. God dang it. It'll be fine. The first one is done. Let's check it out. Oh, he's speaking. Please put your hands on the head. Oh, that's kind of nice. It did exactly what we said. It's kind of good at the Studio Ghibli style. Let's see. This is another option we have. I
            • 46:30 - 47:00 don't like how fast he's speaking. So, we can make an adjustment here. We go. man silent not speaking about to press a button but gets stressed puts both hands on his head we can generate this image generate this image and this is how I work with AI you know th those videos are loading so I'm going to go back to this chat GPT page and I have all these different tools open at the same time so when one stops running I'm actually going to go back to the other one now I'm going to say please make
            • 47:00 - 47:30 this again but with a professional background around. Not the case, just the button, a product photo in the style of Studio Gibli. And I like this one. Let's go ahead and download it. So remember, we changed this to silent and not speaking. So let's see if
            • 47:30 - 48:00 So remember, we changed this to silent and not speaking. Let's see if that worked on Gen-4. So let's go like this. Why is his mouth moving? It's like it has to have his mouth moving. That's so annoying. So, I just went ahead and tried it on Krea with the Kling model. Let's give it a try. There we go. See, that is what I was looking for the whole time: to have a character not speaking. For whatever reason, the Runway character could not;
            • 48:00 - 48:30 no matter how I prompted it, it always had to be speaking. Here the character is not speaking. We're going to create one more frame, and it's just going to be of the button. So we upload this image right here, which is just the do-nothing button. The plan: a man is going to press the button, it's going to make a sound, there's going to be light music in the background, and then it's going to do a close-up of the do-nothing
            • 48:30 - 49:00 button. So we'll prompt: close-up of the do-nothing button, zooms in on the button, and the lighting gets dim. Generate. Generate. Generate. Generate.
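Image-to-video tools like Kling and Runway all follow the same basic pattern: you send a start frame plus a text prompt, then poll until the clip is ready. The video doesn't show either vendor's actual API, so the sketch below uses an invented endpoint and field names purely to illustrate the shape of the call.

```typescript
// Illustrative only: the endpoint, fields, and response shape below are
// hypothetical stand-ins for a Kling/Runway-style image-to-video API.
async function imageToVideo(startFrameUrl: string, prompt: string) {
  const res = await fetch("https://api.example-video.ai/v1/generations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.VIDEO_API_KEY}`,
    },
    body: JSON.stringify({
      image_url: startFrameUrl, // the starting frame we generated above
      prompt, // e.g. "man, silent, presses button, gets stressed"
      duration_seconds: 5,
    }),
  });
  const { id } = await res.json();
  // Real services are asynchronous: you'd poll a status endpoint with this
  // job id until the rendered clip's URL comes back.
  return id;
}
```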
            • 49:00 - 49:30 Now, let's move to another fun part: the sound category. We're going to go to elevenlabs.io, and then we're also going to go to suno.ai. On ElevenLabs, we go to the app, find the sound effects section, and type: button pressed. I'm going to say: cartoonish sound of a button pressed,
            • 49:30 - 50:00 making a failure noise. Let's see what this sounds like. And this is a lot more fun, because the iterations are a lot faster. I actually just realized we can change the duration of the sound, so let's make it 3 seconds and hit generate. So here are the samples; it just generated these four sound effects. Let's go ahead and try them. That's okay. It's all right. Hm, I actually like
            • 50:00 - 50:30 this one. Let's try this sample, too. On text-to-speech, let's just type in: introducing the... wait, what did we call this item? The do-nothing button. "Introducing the do-nothing button. It does nothing." Now, we can change the voice; I'm thinking a deep voice here, so we can use Julian: "If you really
            • 50:30 - 51:00 think you're alone at night..." Great, let's select Julian and generate speech. "Introducing the do-nothing button. It does nothing." Great. Let's download that.
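Both of those ElevenLabs features are also available over its REST API. Here's a rough text-to-speech sketch (the sound-effects endpoint works in a similar spirit, with a duration setting); the voice ID is a placeholder, and you should verify field names against the ElevenLabs docs.

```typescript
// Sketch: ElevenLabs text-to-speech over REST. The voice ID is a placeholder;
// check the exact request fields against the current API docs.
import { writeFile } from "node:fs/promises";

async function speak(text: string, voiceId: string) {
  const res = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
    {
      method: "POST",
      headers: {
        "xi-api-key": process.env.ELEVENLABS_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text, model_id: "eleven_multilingual_v2" }),
    }
  );
  await writeFile("voiceover.mp3", Buffer.from(await res.arrayBuffer()));
}

speak("Introducing the do-nothing button. It does nothing.", "VOICE_ID_HERE");
```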
            • 51:00 - 51:30 Now, let's go back to Krea, where we're generating this final shot. Let's take a look at these. Amazing. That is a wonderful shot. Download. So, now we have the dialogue. If we go back to our diagram here, we have our sound effects, with the riser and the button being pressed, and we have the dialogue, right? I think we're only going to have dialogue at the end; it's obviously going to be a much shorter clip. But now we need music. So here on Suno, we can choose
            • 51:30 - 52:00 instrumental, and then ask for light, ambient, anxious music. I don't know what this means; I'm just putting in emotions here. And again, we can now create songs very easily. Okay, so these are done. Too fast. Too sad. Way too fast. Too strummy. "Do Nothing Button": I like it. Let's send it. So now if we press
            • 52:00 - 52:30 these three dots right here, we can click download MP3 audio. Now we're going to open up Premiere Pro and clip all of these together really quickly. Okay, so here we press skip, and now we drag all these files in. It's actually quite simple to get started: you have this timeline down here, and you can see your video over here. This might be easier to get started in
            • 52:30 - 53:00 CapCut if you have any experience there. But basically, here we have this button, and we have this character that presses the button, so we need the button-press sound. Let's line these up just like that. Maybe that button press wasn't the best. Maybe it's a little too early. And let's get that music in here: advertisement drum beat, ironic and
            • 53:00 - 53:30 fun. Create. Create. "Do you want to change your life? Don't press this. Don't buy this." Generate speech. "Do you want to change your life? Don't buy this." All right. Now we'll go back to Premiere Pro, and we're actually going to make our own track here. Okay. "Do you want to change your life?" Okay. Now, let's actually raise
            • 53:30 - 54:00 the volume on this. Let's go up four. "Do you want to change your life? Don't buy this." Let's see this. We can hit create captions; it's going to use AI to generate these, or rather, it used AI to generate the transcript. So, we hit export. It's done. We can go here now and watch it full screen. "Do you want to change your
            • 54:00 - 54:30 life? Don't buy this. Introducing the do-nothing button. It does nothing. Do you want to change your life?" Okay, so here's the video. And I decided to just tweet this to illustrate the fact that we used GPT-4o for the image, Kling for the video, ElevenLabs for the voice, ElevenLabs for the sound effect, and Suno for the song. I did all of that in a span of, I don't know, maybe 25 minutes total. And it definitely could
            • 54:30 - 55:00 get a lot easier. And that's why we're talking about vibe coding: there are ways to vibe-code tools to actually speed up this process. But even before this whole process, we used ChatGPT to ideate on the idea, so that's another step. So we used six AI tools to quickly ideate, generate the first frame, create the video, and add some sound, and it's kind of fun. And then we used AI to add the subtitles. I mean, that is a full AI stack right there. Here is what we did
            • 55:00 - 55:30 for this project. So, as you've seen up to this point, Kling is clearly the best AI video model. Maybe you'll have to take my word on that, but rumor on the street says it's the best, and I follow the smartest people in the AI space, the people who are actually testing these models every time new ones come out, so we have kind of a first-person view on this. Runway just doesn't have the control that Kling does. Kling is able to create better effects, like the tiger falling off the skateboard, and it also was
            • 55:30 - 56:00 able to keep the character's mouth closed and had better control in the video clip. So we're calling these video generation tools; they're just tools that generate videos, and for the other ones we're basically going to say "like Kling, but worse." We don't actually need to go through all of these video tools, because they're a lot like Kling. What I want to do is move on to another category of video tools. This is a whole other category, and it can
            • 56:00 - 56:30 be looked at as avatars. You can call this the avatar category, or, and people are going to get mad at me for this, the slop category. This is AI-generated slop. We may reach a world in the next two or three years where AI avatars are indistinguishable from humans, and there are all kinds of weird effects that will come from that. But let's go ahead and test this out. So, we can actually go to
            • 56:30 - 57:00 heygen.com. I haven't trained an avatar on myself yet, but we can just go to the avatar section, and we can use Conrad, who's sitting here on the sofa. Here's a video of Conrad: "Welcome to the new era of video creation with HeyGen. Simply type your script to get started." And it is pretty realistic. So we can hit create with AI studio. And
            • 57:00 - 57:30 we can say: welcome to the intro to vibe coding, with Kevin Williams. We say it's a video script and hit generate. We can choose either; let's choose 4K. Let's turn off the watermark, because who wants that? And we hit submit. And so this video is done. Let's go ahead and play it. "Welcome to the intro to
            • 57:30 - 58:00 Vibe Coding. I am your instructor, Kevin Williams. Welcome to the intro to Vibe Coding." See, it's a little bit creepy; there's something dead about these AI avatars. One person who does a really good job of this on Instagram is a guy by the name of Rowan Cheung. All of his videos are AI-generated, and you can't even really notice. Let's see one that did really well. This one did really well.
            • 58:00 - 58:30 "...could revolutionize cancer treatment by navigating through the body's narrowest pathways. South Korean research..." This is an avatar version of him. This is not actually him, but it works. "Researchers have created water-based micro-robots covered with a special Teflon coating that gives them abilities similar to living cells, using a clever..." And you notice here he has mostly B-roll going, and I think that's why it works; otherwise I think people would get bored of his movements, because he barely moves. "...technique. Unlike traditional robots, these rice-grain-sized water
            • 58:30 - 59:00 droplets can squeeze through tight spaces, pick up and move small objects, travel across water surfaces, and even combine with other..." You can tell it's an AI avatar, but it's pretty well executed. But anyway, that's avatar content. Okay, so I think right now is a good time to pause and reflect. I want to talk about what we've discussed so far so we can put it all in context, because we want everything to be a schema, a web of ideas rather than a list of ideas. How do these things intermingle so that you
            • 59:00 - 59:30 can very quickly think on your feet and use the tools when you need them? So far we've talked about chat-based tools, and these chat-based tools are more and more gaining access to image and video tools. I asked a question to come up with a product idea, and then we were able to go through all of this: I created a Ghibli-style image based on that idea, that was the starting frame, and then we created a video from that starting frame. In theory, you could set
            • 59:30 - 60:00 all of this up to be done as a workflow automation. Okay, so we just covered the vibe stack of tools: chat tools, video tools, image tools, and sound tools. But what if we want to use different tools in an automation? That's what we're going to talk about right now. There are two main flows we'll cover in today's video, and we're just going to do a light overview; we're not going to dive super deep into them. I'm going to show you an example of an automation flow and an agent flow. So
            • 60:00 - 60:30 the first one I want to talk about is an automation flow, and for this we're going to use Zapier. Let's open up Zapier and create a Zap. A Zap is basically just what Zapier calls an automation flow. Every automation flow has a trigger, so let's go ahead and make the trigger. By trigger, I mean the thing that actually sets off the automation; and
            • 60:30 - 61:00 the automation is just a list of things that happen once the trigger fires. So let's click on Notion and select an event, say, when a new database item is created. For those of you who aren't in Notion: it's kind of like an overcomplicated Excel spreadsheet with Wikipedia in it. But in Notion, you can create these things called databases, which are kind of like tables. So we're just going to create one; let's call it the
            • 61:00 - 61:30 Zapier database. Now we can go back to Zapier and select the "new database item" event and the account. Basically, whenever a new database item is created, the Zap fires; we choose that Zapier database we just made and hit continue. And we can just hit test trigger, even though we
            • 61:30 - 62:00 haven't actually added a new item to the database yet. So let's go add a new item to this Zapier database. Let's add a new one and call it "man walking down the street"; we just entered that as the title. And let's say we have a tag. We can create tags in this, so let's just say "image". So now let's create an automation that generates an image.
            • 62:00 - 62:30 We go back to Zapier, and for the action we choose AI image generation: we can just choose OpenAI and the DALL·E action. I think we can just type "image" and pick "generate an image with DALL·E". Here we choose the account, just the normal account, and hit continue. And now we're going
            • 62:30 - 63:00 to pick the model, DALL·E 2. Right now the new GPT-4o image model we talked about earlier isn't out in the API yet, but as soon as it comes out, we can replace DALL·E with it, so we can actually prepare these automations ahead of time for when that model arrives. For the mapping section, if we press this slash button, we can continue with the selected record and choose the title field. So we can go
            • 63:00 - 63:30 "man walking down the street", and the DALL·E model will take whatever the title is, which is in this section right here. I think we can create a files-and-media property and have the result show up right there. So we go to Zapier, and we're now sending the title right here to DALL·E 2. Number of
            • 63:30 - 64:00 results: let's just do one for now. Size: let's do 1024x1024. User: I'm not sure we need that. Let's test this step. Okay, it just finished loading, and it gave us a URL; it showed that it generated this URL. This is just a test, so we can copy it and open it in a new tab to see if it generated something. Okay, this is a man walking down the street. Wow, this
            • 64:00 - 64:30 is a really old model. This looks awful. Yeah, we were using DALL·E 2, not DALL·E 3. Let's switch it to DALL·E 3; that looked horrible. Style: let's go vivid. Sure. Continue. Now what we want to do is add another step, a Notion step that actually attaches the image as a file. So choose the event: we want to update a
            • 64:30 - 65:00 database item, and continue. What we want to do here is update the Zapier database, and the item we want is the same one, so we use a custom value here and make sure we select the ID. We want to choose the same ID that triggered the
            • 65:00 - 65:30 automation in the first place. Now we can go into the files-and-media property and map it to this data URL. I wonder if this is going to work. Let's test this step. Okay, it did; it showed up right here. If we click on the image, it shows up right there. So now, in theory, if we publish this in Zapier just like that and go back to Notion, we've
            • 65:30 - 66:00 just created a very simple workflow: when we create a new database item (the database is polled every minute), it generates an image with DALL·E 3 and uploads it back into the files-and-media property. So we can go like this: monkey on a bike. Turtle playing soccer. Banana taco balloon. James
            • 66:00 - 66:30 Bond with a cloud over it. And you see, they're coming in; I think the next two are about to come in. Oh, there's that one, and this one. So the automation did indeed work. You can see it has this monkey, and if we go into the balloon one and click on it, it should be a balloon once it loads. There we go. We have the balloons, and we created this automation.
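Under the hood, that Zap is doing something you could write yourself. Here's a rough sketch of its logic, assuming the official openai and @notionhq/client SDKs; the property names ("Files & media") are assumptions you'd match to your own database.

```typescript
// Sketch of the Zap's logic: for each new Notion row, generate a DALL·E 3
// image from the title and attach it back to the page. In a real setup, a
// scheduler would call processNewItem for every freshly created row.
import OpenAI from "openai";
import { Client } from "@notionhq/client";

const openai = new OpenAI();
const notion = new Client({ auth: process.env.NOTION_TOKEN });

async function processNewItem(pageId: string, title: string) {
  const image = await openai.images.generate({
    model: "dall-e-3",
    prompt: title, // e.g. "monkey on a bike"
    size: "1024x1024",
    style: "vivid",
    n: 1,
  });
  await notion.pages.update({
    page_id: pageId,
    properties: {
      "Files & media": {
        files: [
          {
            name: title,
            type: "external",
            external: { url: image.data[0].url! },
          },
        ],
      },
    },
  });
}
```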
            • 66:30 - 67:00 Now, let's add a few more steps. If we go back to Zapier real quick, we can edit this Zap and add another step: let's do a ChatGPT "conversation" action. Configure it, map in the title like this, use the GPT-4o model, and for the instructions say: you are Sam Altman, and you are
            • 67:00 - 67:30 writing a short piece of content on the topic given to you; make it opinionated and concise. So here we are. Now, for the update-item step, once we
            • 67:30 - 68:00 update the Notion database, we first need to add a field to the Zapier database. We'll make it a text property and call it "Sam Altman's thoughts". Now we go back to Zapier and refresh these fields. There we go: Sam Altman's thoughts. And now we can add in this
            • 68:00 - 68:30 conversation output here; we're looking for the reply. Hit continue, continue, continue, and now we can publish. I think we can give this a test: if we just retest this final step, we should see, yep, a man walking down the street. Okay, so here's Sam Altman's thoughts. So now, in theory, if we go to the database and type in "going to
            • 68:30 - 69:00 lunch", and "large language models", and "San Francisco elevators", we should get an image, and we should also get a little message by Sam Altman. And look at this: these are loading in right here. We now have a
            • 69:00 - 69:30 little thought by Sam Altman, and we also have these images. San Francisco: it also comes with an image of San Francisco, which is pretty damn cool in my opinion. There we go. Not bad. Okay, so this is just one example of a no-code automation. You can make those Zaps as many steps as you want, and you could use so many different AI tools within these automations. You can also set them up with Lindy, n8n, or Make.com.
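For the curious, the ChatGPT step we just added maps to a single chat-completions call. A minimal sketch; the system prompt is the one typed into Zapier, and the function name is just illustrative:

```typescript
// Sketch of the added Zap step: turn the row title into a short opinionated
// blurb, which the next step writes into the "Sam Altman's thoughts" field.
import OpenAI from "openai";

const openai = new OpenAI();

async function samAltmansThoughts(title: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content:
          "You are Sam Altman, and you are writing a short piece of content " +
          "on the topic given to you. Make it opinionated and concise.",
      },
      { role: "user", content: title }, // e.g. "San Francisco elevators"
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```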
            • 69:30 - 70:00 (Make.com is one I didn't put on this list, for whatever reason; doesn't matter.) But the next type of flow I want to get to is agent tools, and the first one I want to talk about is deep research. So you have your vibe stacks, which are like the individual chat experience: the one block where you can chat with AI, or generate an image, or generate a video. The automated flow is what I call a vibe flow. So you have the stack and the
            • 70:00 - 70:30 flow, but the flow is a more deterministic workflow automation. If we go back to our map here, we can see that this is just one part of these automations and flows; I consider agent tools another version of this. So let's actually run two of these at once, because they take a while. If we go
            • 70:30 - 71:00 to ChatGPT again (ChatGPT is in every category of the map) and we turn on deep research. I consider this a flow, right? Because it can actually take from 5 to 20 minutes. So we can do deep research with o3 and say: I want you to do an in-depth report on the best AI video models, AI image models, and AI text models to use in my content. I'm
            • 71:00 - 71:30 trying to build my company, Vibe Code. And because it's deep research, it automatically searches the internet. In a vibe flow like the workflow automation we just created, you could add a search step that searches the internet, but you'd have to set it up yourself with something like Tavily. This just does it automatically. And with some of these AIs you don't even have to ask; they'll just search the internet if they decide they need to. So what we
            • 71:30 - 72:00 can do here is say: vibecodeapp.com, here is the info on my company. What this will do is tell the agent to go to that site, read about it, and then it will have a little more context. Then we can say: on top of looking at the best tools for the job, I want you to diagram the flows for this. What flows of tools can I use to actually make this work? What workflow can I use
            • 72:00 - 72:30 that will best optimize the marketing strategy I use here? And please make this a very detailed report. What it's going to do is ask me some follow-up questions: "Thanks for sharing your vision with Vibe Code. What kind of content are you looking to create?" That's good. I'm looking to create short-form reels, short-form content, so we can just throw in a paragraph here. It says, "Awesome. I'm on it." And watch this: it's going to tell us what it's going to do, and now it is
            • 72:30 - 73:00 starting its research. If you have the Plus plan, I think you get something like 10 of these per month. This is literally going to search the internet, think, and compile sources; sometimes it can go up to 100 sources. This is designed so you type it in, go off with your day, and come back 5 to 15 minutes later or more. You're not meant to just sit here and wait. The same thing is true
            • 73:00 - 73:30 about Manus, right? We can go to Manus and take a look at some of the examples. But Manus is a new tool, and in order to describe it I actually want to bring up another tool: the Claude desktop app, because I don't think I've talked about it yet. Claude has access to tools, and I think I'm going to make a separate MCP video, because I could make an hour or an hour-and-a-half video
            • 73:30 - 74:00 on MCPs alone; my setup has access to 47 MCP tools, which I'll talk about later. But what we can do here is use the fact that Claude now has the ability to search the internet. So I'm just going to say: search the internet and tell me about Manus, the new AI agent tool, and what it does, in three
            • 74:00 - 74:30 paragraphs, in an artifact window. Over the course of this video I've used the term canvas a lot: when ChatGPT brings up the side window, that's a canvas; in Claude it's called an artifact, and it's the original, the OG popup side window. It was one of the best
            • 74:30 - 75:00 features ever created. And so here: Manus is a multi-agent AI system developed by a team in China and launched March 6. Unlike conventional chatbots that only provide information, Manus is designed to autonomously execute complex tasks, including researching, analyzing data, generating reports, writing code, and deploying websites, with minimal supervision. And it made a flowchart of how this tool works. This is one of my favorite
            • 75:00 - 75:30 parts of Claude: it's really, really good at generating flowcharts in what's called Mermaid. This is Mermaid syntax here, and once it's done coding, it automatically converts into a working diagram, a full flowchart. We can bring this over here; we're probably going to want to full-screen it, honestly. Make it simpler, and make it from the user
            • 75:30 - 76:00 experience perspective. And this is a much simpler explanation of what Manus is. Manus works on a task autonomously, navigates websites, processes data, writes code if needed, and creates deliverables. Manus completes the task, a notification is sent to the user, the user reviews the completed work, the user can replay the session to see how the task was done, the user can use or download the results, and the user can request modifications.
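For reference, Mermaid flowchart syntax looks roughly like this; the diagram below is a hand-written reconstruction of the kind of chart Claude produced here, not the exact one from the video:

```mermaid
flowchart TD
    A[User submits a task] --> B[Manus works on it autonomously]
    B --> C[Navigates websites / processes data / writes code if needed]
    C --> D[Creates deliverables]
    D --> E[Notification sent to the user]
    E --> F[User reviews, replays the session, or downloads results]
    F -->|Requests modifications| B
```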
            • 76:00 - 76:30 So this is almost like an AI employee that you have access to. Now that you understand it has this agentic workflow, it makes decisions about what it needs: it can navigate websites, it can go out and process the data it finds on those websites, it can write code if needed (you can also give it data, by the way), and it creates deliverables. So what deliverable do we want from Manus? Let's say: do
            • 76:30 - 77:00 research on the modern AI tools for generating videos; I want a PDF with images as the final output. Okay, let's run that. So now we have two AI agents working at once, because, as you can see, the deep research agent we kicked off on ChatGPT is only halfway done, at 24 sources. It's
            • 77:00 - 77:30 selecting 24 different sources and doing all of this activity. If we go to Manus, we can see it's doing a bunch of stuff too. What's cool about Manus is you can click on one of these steps, and it opens up this side panel and shows you what's going on: the full plan. I think that's very cool. We can screenshot this; I'm going to add it to this
            • 77:30 - 78:00 whole board, which I'll organize at the end, make super useful, and let you download. It's just a good way to see what we've covered in this video. But you can see that Manus has a plan, a research plan. It says it created a project directory and research plan, so it's created and organized files; it's built a little mini IDE for itself to work in. Now it's going to search for modern AI tools, collect detailed information on each tool, and gather images
            • 78:00 - 78:30 and examples for each tool. Each of these agents goes off for 10, 15 minutes at a time. And this is why I made a tweet a while back saying the future of marketing, and of business and startups, may end up looking like online poker. This is my ultimate prediction. If you followed poker back in the 2010s (I was really young during that era; I only saw documentaries of it, I wasn't into poker then), the best
            • 78:30 - 79:00 poker pros were playing online poker. If you've ever played poker, it's turn-based, meaning if you fold your hand, you have to wait a while before you get your next one. And you don't even play every hand; you fold a lot more often than you play. You might only play 15-20% of your hands. So these pros, in order to make money, would play online poker on multiple tables at once. When it was your turn, that table would light up. This is called multi-tabling. Poker pros would literally play 30 games
            • 79:00 - 79:30 at once; their brains ran calculations so fast that they could play profitably across 30 games simultaneously, and that's how they made the most money they possibly could. And I think that's the future of AI agents. As you can see here, we basically sent off these employees. There's nothing we can do right now other than manually do some work ourselves or take the research. But I think a lot of the work will actually be using these AI agents
            • 79:30 - 80:00 asynchronously to do a bunch of tasks, and I think there will be a CEO in the near future, within the next year, who runs a company by managing these AI agents across multiple screens. You're just going to be there going: all right, Manus one, Manus two, Manus three, Manus four. And it's going to cost a lot of money in agents. Now, Greg Eisenberg would call this vibe marketing. Greg Eisenberg thinks Vibe
            • 80:00 - 80:30 Marketing will be a few people using these workflow builders, agent platforms, and software: for example, automatically creating lead magnets, using vibe coding tools to build little mini MVPs that people can sign up for on a waitlist, and then using marketing and automation tools to create an entire workflow that amplifies the work your marketing team does. For me personally, I just don't like the name, and I think
            • 80:30 - 81:00 it's going to be wider than marketing itself. I don't know the name for it, but that's how I picture it: I think of it as multi-tabling agents, MTA, multi-table agents, orchestrating these AI agents. Okay, so Manus, I believe, is done, and deep research is done as well. Let's take a look at what these AI agents went out and did. Here we had 27 different sources, and if we look at
            • 81:00 - 81:30 all the activity, o3 went out and did a lot of different things, and we have this entire report. If we scroll all the way to the bottom and open up BareNote once again, this is the report that was created. I'm really hoping that soon these reports will include images pulled from the internet. But look at how long this is: this is probably a 20-to-30-page report. It's just insane. And it did
            • 81:30 - 82:00 all of this research; you can see it has a lot of different links. That is the report it created. If we go to Manus, we can see that it returned as well. It went through all of these different steps, and you can check on them right here; every single step is visible. This lets you actually see where in its reasoning it went wrong, so if you spot something and think, "okay, I don't want you to do it like this," that can help you with the prompt you write next. And I heard an argument on Twitter the other
            • 82:00 - 82:30 day that this is a reason to hire agents instead of humans: with humans you generally judge people on their outputs, especially contractors, whereas with an AI agent you can see every single thing it did. In theory, you could create an AI agent that looks for what went wrong and use that to improve your agent. I guarantee you a multi-million-dollar company will be created that solves that specific problem: using AI agents to improve AI agents. That is
            • 82:30 - 83:00 coming for sure. And the output was this PDF. So we can click on it. I've tried PDFs before and wasn't overly impressed, but let's see. We go to downloads and open it up. We can see we're getting some overlap here. Okay, so we have a table of contents: "The landscape of video creation has undergone a revolutionary transformation." Let's zoom out here.
            • 83:00 - 83:30 Okay, this isn't bad. It created categories of video creation tools: AI avatar and studio-quality video. Text-to-video generative AI tools. Interesting. Synthesia, Runway, Descript; future trends and developments. So it created this full-on report for us, in PDF format, and all we had to do was ask it for a report as a PDF and to include
            • 83:30 - 84:00 images. So, it created that. You can think of all the things we've covered today in that first section as tools. I know I've been calling them vibe stacks, but now that I'm saying this out loud, a better name might be vibe tools, because AI agents have access to tools. AI agents decide what the next step is, and they use tools in a loop. And I think the future of AI agents, within the next three to five years, might just look like
            • 84:00 - 84:30 an agent that has access to all of these things. You could say, "I want you to create a video that does X, Y, or Z," and the AI agent will just go out and use these different tools, any of them, in any order it wants. And not only can it use the tools, it can actually process and watch the results. Remember earlier in the video, with the Gemini model: you can upload a video, and the AI can take it in as an input, process
            • 84:30 - 85:00 it, and understand what's being said and what's being shown on screen at any timestamp. AI agents will be able to do that with the videos they generate using these tools. Then it can decide: okay, what do we need to add? I'm going to go back out and use the Kling AI tool. Generate a video clip. Video clip. Video clip. And maybe it'll process them and decide which parts of each video should be shown as B-roll
            • 85:00 - 85:30 in the giant video it's creating. Because over time these AI agents are not only going to get smarter, they're going to get access to more tools. We're seeing the very beginning with tools like Manus, which can plan; it's basically a high-functioning research agent that can also generate code when it has to. But we're going to see the version of Manus that can generate videos, that acts as a full-on video editor that can go through all of your footage, all the images you've created, your tone of voice, your style, and create a video based on all of those things.
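That "tools in a loop" idea is simple enough to sketch. Here's a rough, hypothetical agent loop using OpenAI-style tool calling; the single tool and its implementation are invented stand-ins for real integrations like Kling or ElevenLabs:

```typescript
// Hypothetical agent loop: the model picks a tool, we execute it, feed the
// result back, and repeat until the model answers in plain text.
import OpenAI from "openai";

const openai = new OpenAI();

// Invented stand-in for a real video-generation integration.
const toolImpls: Record<string, (args: any) => Promise<string>> = {
  generate_video_clip: async ({ prompt }) => `https://example.com/clip?${prompt}`,
};

async function runAgent(task: string) {
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    { role: "user", content: task },
  ];
  while (true) {
    const res = await openai.chat.completions.create({
      model: "gpt-4o",
      messages,
      tools: [
        {
          type: "function",
          function: {
            name: "generate_video_clip",
            description: "Generate a short video clip from a text prompt",
            parameters: {
              type: "object",
              properties: { prompt: { type: "string" } },
              required: ["prompt"],
            },
          },
        },
      ],
    });
    const msg = res.choices[0].message;
    if (!msg.tool_calls?.length) return msg.content; // done: plain-text answer
    messages.push(msg); // keep the assistant's tool request in history
    for (const call of msg.tool_calls) {
      const result = await toolImpls[call.function.name](
        JSON.parse(call.function.arguments)
      );
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
  }
}
```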
            • 85:30 - 86:00 What I want to talk about next is creating experiences with all of these tools and code, which is vibe coding: being able to create an app fully with AI. And I think the best apps being created right now with vibe coding are the ones using these APIs, these tools. That's what I want to talk about now. So what you can do is
            • 86:00 - 86:30 go to a site like v0. I'm going to ask it: create a landing page for a site that sells slug bugs, but in this style, and we'll upload the style reference. Keep the site minimal, because we will have images of the car in that color.
            • 86:30 - 87:00 And I want that to be the focal point. Make it just a hero section and a footer, nothing else. And so this coding agent is generating all the different files, going out and deciding exactly what we need. Here it's generating all the files, and it's about to immediately show us the result
            • 87:00 - 87:30 of what it coded. As soon as it's done, it shows us this preview. Boom. There we go: Retro Slug Bugs. This is interesting: the G is being cut off. I'll just tell it: in this word, the G's are cut off at the bottom. Make that stop, and make the buttons less
            • 87:30 - 88:00 colorful and the text less colorful. Now, let's go over to ChatGPT, upload this image to GPT-4o, and generate some images of a slug bug. So: create a simple image of a slug bug with a plain
            • 88:00 - 88:30 background. The slug bug should have the same patterns as the attached image. Okay, I like this, so I'm going to download it. Now we have this slug bug, and we're going to make it the background of the site. While this downloads, we can go back to v0, which is just an AI
            • 88:30 - 89:00 app and website builder, and drag this image in. We can say: please make this image the background of the site. And I'm just going to say: get rid of this text; I want as little text as possible. So we're now inserting this image into the site. We are vibe coding with the image that we generated. And
            • 89:00 - 89:30 it's going to get cooler, trust me. Okay, look at this. We have this clean site right here, and it shows the slug bug. Now we can go to RunwayML, just because it's faster, and generate a video: we click right here and drop this image in just like this, so it has the same aspect ratio as the
            • 89:30 - 90:00 original image we put into the app. Now we describe the animation we want on the website. I'm going to say: car driving off the screen to the left. We can make it 5 seconds, and let's generate. Generate. Generate.
            • 90:00 - 90:30 While the video is loading, we can get our prompt ready. Here's what I'm going to ask of v0: when I hover my mouse close to the car, I want you to replace the image with a video of the same dimensions (it should show up the same size as the image, because the video is this car driving off the screen). I want the
            • 90:30 - 91:00 image to turn into a video, the car drives off the screen, and then, three seconds after the video starts, I want a middle-of-the-screen popup to come up that says "buy now", with a smaller version of the image of the car, the price ($20,000), and "checkout now". The video I
            • 91:00 - 91:30 want you to use is attached in this comment right here. (You can attach videos directly in v0, which is a really cool feature.) Now let's check which clip we want to use; hopefully one of these works. Okay. Why does it pause? Okay. There we go, that's what I was hoping for. Let's download this one. So we
            • 91:30 - 92:00 have this downloaded video; let's drag it in here. The video is taking some time to upload... okay, it's good. Now, let's run it. Hopefully, once our mouse gets close to the car, it'll have the car drive off the screen and then show us the popup to buy it for $20,000. While that's loading, let's go to ElevenLabs, because again, we're applying everything we've learned and connecting it all together. What we want
            • 92:00 - 92:30 to do here is create a sound effect. Let's make it 6 seconds for now, and say: car starting and slowly driving away. A simple car noise; I don't want it to be too complicated. All right, let's check out these noises. Eh, pretty solid. Let's use that one. So, we first need
            • 92:30 - 93:00 to see if v0 nailed it. It's done. I'm going to refresh this whole page so I can see whether the animation works properly. Okay, let's see. Interesting. Okay, for the popup: dim everything behind it, and make it
            • 93:00 - 93:30 way smaller. Also, when the video starts, play this sound, and now we can upload the sound. Doesn't look awful. The car does glitch a little bit, but now it has "check out, $20,000". Very interesting. All right, let's test this out. Wait. Hold on. There we go. And you know what? It's not perfect; we don't want it to play on repeat.
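Behind v0's generated code, that interaction boils down to a few lines of React state. Here's a simplified sketch of the pattern (my own reconstruction, not v0's actual output; asset paths are placeholders):

```tsx
// Simplified hover-to-video interaction: swap the hero image for a video on
// hover, then reveal a "Buy now" popup three seconds after playback starts.
"use client";
import { useRef, useState } from "react";

export default function Hero() {
  const [playing, setPlaying] = useState(false);
  const [showBuy, setShowBuy] = useState(false);
  const timer = useRef<ReturnType<typeof setTimeout> | null>(null);

  const start = () => {
    if (playing) return; // only trigger once
    setPlaying(true);
    timer.current = setTimeout(() => setShowBuy(true), 3000);
  };

  return (
    <section onMouseEnter={start}>
      {playing ? (
        // no `loop` attribute, so the clip plays once and stops
        <video src="/car-drives-off.mp4" autoPlay muted playsInline />
      ) : (
        <img src="/slug-bug.png" alt="Slug bug" />
      )}
      {showBuy && (
        <div className="popup">
          <img src="/slug-bug.png" alt="" width={80} />
          <p>$20,000</p>
          <button>Check out now</button>
        </div>
      )}
    </section>
  );
}
```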
            • 93:30 - 94:00 All right. So, we just used AI to create a website, a landing page, and we put AI-generated images and videos inside it. But what if we actually want to create an app that uses these AI power-ups, or APIs? I call APIs power-ups because they're external tools you can add to your
            • 94:00 - 94:30 app to create really cool features. I want to start with some examples. Here's an app I built, which is actually a clone of a very popular app called Cal AI. Very simply: when the user takes a photo in the app, the app sends the photo to the OpenAI API, which analyzes the photo and generates text. It's using an image-to-text model: it is
            • 94:30 - 95:00 converting this image into text and sending that information back to the app. This is what it looks like: if we hit this plus sign, we can hit scan food, and I can take a photo of this banana. You can see it's processing the image, and there we go: 105 calories. It tells me the
            • 95:00 - 95:30 amount of carbs, protein, and fat, and we can log that information. And since I made this version of the app specifically for people with diabetes, we also get a little message that describes whether it's a good choice if you have diabetes. So it's a personalized app; we're entering the era where you can build apps specifically for yourself.
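That round trip, photo in and text out, is essentially one API call. A rough sketch, assuming the openai Node SDK and a base64-encoded photo (the prompt wording is mine, not the app's):

```typescript
// Sketch of the Cal AI-style flow: send the user's photo to GPT-4o and ask
// for a nutrition estimate.
import OpenAI from "openai";

const openai = new OpenAI();

async function analyzeFood(imageB64: string): Promise<string> {
  const res = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Estimate the calories, carbs, protein, and fat in this food.",
          },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${imageB64}` },
          },
        ],
      },
    ],
  });
  return res.choices[0].message.content ?? "";
}
```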
            • 95:30 - 96:00 The best app ideas, at least the ones that make money when you're starting out, follow two rules: What specific pain point does the app solve? And how do you get the user to solve that problem in as few touches as possible? Because of APIs and AI, it's a lot easier to make things solvable really quickly. And it's tempting, when you start vibe coding with tools like Cursor (which I'm about to go over), to add a ton of features. I highly recommend focusing on making the experience with
            • 96:00 - 96:30 one feature as good as you can make it before you start adding more. It seems fun to make a ton of features, but it's really hard when you're vibe coding to keep adding them: with every additional feature, the app gets ten times harder to maintain and a lot more confusing to the end user. Here's a list of my favorite APIs. And we are now going to use
            • 96:30 - 97:00 Cursor, which in my opinion is the best vibe coding tool, the best tool for building apps with AI. And we're going to use some APIs. One thing you need when you use an API: for the API to send anything back, you need an API key. If you don't have one, the API will send nothing back and you'll get an error in your app. There are tools coming out soon
            • 97:00 - 97:30 that handle API keys for you, at least for the testing phase, so you can go straight into testing your app with keys already provided. That's not the case with Cursor. Cursor is the most advanced vibe coding tool, but once you learn the interface, it's actually not that difficult. So what I want to do right now is build an app in Cursor that uses APIs. I have like 20 more videos on
            • 97:30 - 98:00 Cursor; in fact, I have a two-and-a-half-hour video where I go through the basics. But for a very simple app, I think you can dive in with just a five-minute intro, and I'm going to show you exactly how to use Cursor right now. If you want more in-depth content, I've made that already. The way I get started: hit open project, create a new folder wherever you want (this one is just going to be "test project"), and hit open. Now it's opened a blank folder
            • 98:00 - 98:30 right here. All you really need to understand is that this folder on your computer will contain the files that are this app, and you're going to pay attention mostly to this sidebar; in fact, I only really pay attention to this sidebar. So, I'm just going to say: create a Next.js project, a Next.js app. For now, don't worry about what Next.js is; just trust me if you're testing this out for the first time. This will
            • 98:30 - 99:00 allow you to create one that actually looks cool. Create a Next.js app. I want this app to solve the specific problem of splitting the bill at a restaurant. I want to be able to upload an image of a receipt, select which items belong to whom, and have it keep track of the totals. We are going to be
            • 99:00 - 99:30 using the OpenAI API (I'm going to put GPT-4o in parentheses) to convert the image into a structured output. And then I'm going to say: here is my
            • 99:30 - 100:00 OpenAI API key. I'm going to show you this, but always keep your OpenAI key completely secure. I'm going to create one and then delete it right after this video, so if you see my key, it won't be usable. Where we need to go is the OpenAI platform; this is where you get your API key, which gives you access to the models. The app will not work if you don't have an API
            • 100:00 - 100:30 key. So: OpenAI platform, click dashboard, then API keys, then create new secret key. I'm just going to call this one "test". Create API key, copy it, and here is my OpenAI API key; I'm going to paste it right here. And I am going to create the project, then run it locally, which will allow me to test it.
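One hygiene note worth a quick sketch: in a Next.js project you'd normally keep that key out of your prompts and source entirely. Put it in .env.local and read it only on the server, for example in a route handler; the file path and handler below are just the standard Next.js App Router conventions.

```typescript
// app/api/analyze/route.ts -- the key lives server-side in .env.local
// (OPENAI_API_KEY=sk-...), so the browser never sees it.
import OpenAI from "openai";
import { NextResponse } from "next/server";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const res = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });
  return NextResponse.json({ reply: res.choices[0].message.content });
}
```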
            • 100:30 - 101:00 And now the AI agent (and this is an AI agent, because it can actually search the web, plan, think, and do all these different things before generating your code) is creating all the files over here. You can see we have this file, package.json, and it's creating all the files needed to build this application. This has been loading for about four to five minutes now, creating so many different files, and that's just part of the game with these AI agent
            • 101:00 - 101:30 tools, these flow-based tools. All right, it says it's done, but I'm seeing an error. Whenever you get an error, just copy the whole thing and paste it in here: "Please fix the error." This is all part of vibe coding. Okay, here it gives us a localhost link. We can open it up, and now the app is working. So, let's upload a receipt. I'm just going to look up a receipt
            • 101:30 - 102:00 image. Restaurant. We'll snag this one right here. Actually, let's get a realistic one. Let's select this receipt, upload the screenshot right here, and hit process receipt. Let's see if this
            • 102:00 - 102:30 works. "Failed to process. Please try again." So it failed to parse the receipt. "Please fix this. Search the internet if you have to to find a fix." Okay, it's done. Let's see if this works: upload the screenshot, process the receipt, and see if the API is working. It appears to be working
            • 102:30 - 103:00 this time. Boom. Here is the bill it processed. We can add Riley, and we can add Kevin. And it created this little table for us, so we can go Riley, Kevin, Kevin, Kevin, Riley, Riley, and Kevin. Here's the summary, and it split the bill perfectly. So, in this case, we just created this little app that lets
            • 103:00 - 103:30 you take a photo of your receipt; it sends the photo to the OpenAI API, and because we have an API key, it analyzes the photo and generates structured data as text. Compare: if we uploaded this same receipt screenshot to a normal ChatGPT prompt and just said, "please tell me about this receipt," the response it gives
            • 103:30 - 104:00 is unstructured data. Even though you're seeing a table, even though it looks like it has structure, it is unstructured. "Show me the structured format, with JSON." Now this is an actual, legitimate structured output, and that is what this app is doing. This is exactly
            • 104:00 - 104:30 what this app is doing: the API on the backend returns a structured output, and the app loads that information into this layout so it can look the same every single time. That's one thing that's really cool about the OpenAI API: it has a really good structured output API. So if you need to create lists, tables, or anything like that, I
            • 104:30 - 105:00 recommend using the OpenAI API.
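In code, "structured output" means pinning the model's reply to a JSON schema. Here's a minimal sketch of what a receipt backend like this might do, using the openai SDK's response_format option; the schema fields are illustrative, not the app's actual schema:

```typescript
// Sketch: force the model to return receipt data as JSON matching a schema,
// so the app can render it identically every time.
import OpenAI from "openai";

const openai = new OpenAI();

async function parseReceipt(imageB64: string) {
  const res = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Extract the line items from this receipt." },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${imageB64}` },
          },
        ],
      },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "receipt",
        strict: true,
        schema: {
          type: "object",
          properties: {
            items: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  name: { type: "string" },
                  price: { type: "number" },
                },
                required: ["name", "price"],
                additionalProperties: false,
              },
            },
            total: { type: "number" },
          },
          required: ["items", "total"],
          additionalProperties: false,
        },
      },
    },
  });
  return JSON.parse(res.choices[0].message.content!); // { items, total }
}
```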
            • 105:00 - 105:30 a receipt." And it allows me to select who paid for what after I add name. Please use the GPT-4 O model uh to analyze the image and structure the
            • 105:30 - 106:00 output. Please make the app orange and black and white and gray. And so this is the Vibe Code app and this is the app that my team and I are working on. And we actually built in like we built the APIs directly into this app. So you actually don't need to go get an API key when you're testing it out. So we can actually just send this in and now the app is working. So now the code is
            • 106:00 - 106:30 generating. Okay, it's done. Now we can upload a receipt or scan one; let me scan one on my computer and test this app real quick. We open this up and take a photo of this receipt. Let's see if this works. Analyze receipt. Okay, there we go:
            • 106:30 - 107:00 $124.53, and it broke everything down. This is a good example of using an API, the same one we tried to use in Cursor, except we didn't even need an API key; the app just worked automatically. We had to fix one little error, but we created this in, I don't know, seven or eight minutes. And we can add people to the bill. Add person: let's add Riley. Now we can add Kevin
            • 107:00 - 107:30 again. Now we can tap to assign each item: we'll give that one to Riley, this one to Kevin, this one to Riley, this one to Kevin. Obviously, I don't think this user experience is the best (we could probably make it a lot quicker), but we can assign all of them, and there we go: we have the full total for each person, Riley and Kevin. And now we can actually add a little feature.
            • 107:30 - 108:00 says, "Please create new receipt option at the end after I've assigned all of the costs." That creates a little sharable list that I can very easily send to someone via text. So, this should be a native iPhone feature that lets me share this info with someone on text right away after I've assigned all of the costs on the receipt. Okay, so now the code is generating and we can actually watch the updates as it happens
            • 108:00 - 108:30 on the Vive Code app and it's building this native uh iOS app. And since it is a native iOS app and that's what we're building, we get access to the native features like share on messages. And so we should have a feature that pops up on here that allows me to very quickly share it with Kevin. As you can see here, it hit refresh. Okay, I saw that some changes are happening. I still don't see the share feature. Let's see if it adds it. Okay, so look at this. It added it in two locations actually. It said share
            • 108:30 - 109:00 bill details. And so all I have to do is press share bill details and it allows me to very quickly send this information over. And so I'm going to go ahead and send it to my girlfriend Emily, even though I did put in Kevin. But as you can see here, the total tack room receipt split Riley Kevin. And there it is. There's all of the items sent from my bills spplitter app. And that's a
            • 109:00 - 109:30 pretty cool feature: in one click you can share it with whoever you want, and that would be great in a group chat with a bigger group. So this is just another good example of a cool workflow you can create in a few prompts, directly from your phone. You could also build something like this in Cursor; I just think it's easier on your phone like this. Okay. Wow, we actually covered a lot in this video. I'm going to post the full guide online; the link will be in the description, so you can find this mind
            • 109:30 - 110:00 map and all these tools. We talked about a bunch of different tools. We talked about automation workflows, and we used Zapier to create a Notion workflow with some AI features built in. Then we discussed vibe coding tools: we built a landing page on v0, and we talked about APIs and how they can give your apps power-ups. I guess we only showed examples of the OpenAI API being used, both text and image input, but in future videos I'm going
            • 110:00 - 110:30 to show you how to use Replicate (I've actually made videos in the past about Replicate) and other APIs you can use, including Perplexity and ElevenLabs. We also talked about how to come up with the best vibe coding apps, and I showed two examples: one in Cursor and one using the Vibe Code app. If you're on a computer, you should use Cursor; if you're on your phone and want to create mobile apps, you should use the Vibe Code app. And yeah, this was a long video. We also
            • 110:30 - 111:00 made this ad, which is pretty fun: "Do you want to change your life? Don't buy this. Introducing the do-nothing button. It does nothing." And so we created this little video. Thank you for watching, and I will see you in the next video.