Summary
In her video, Tina Huang shares a comprehensive guide on building AI agents, blending her extensive experience in the field with practical insights and tools for various levels of expertise. She explains the nuances of AI agent components, implementation workflows, and the importance of choosing technologies tailored to different tasks. Covering both no-code and coding methodologies, Tina aims to equip viewers with a robust framework for creating effective AI agents, while also pointing to market opportunities and the future potential of AI businesses.
Highlights
AI agents function as autonomous systems that act within their environment to achieve set goals. 🤖
Components like models, tools, and orchestration are essential for an AI agent's operation. 📦
Different workflows for AI agents include routing, prompt chaining, and autonomous agents. 🌐
Tools and frameworks are available for both no-code solutions and those requiring coding expertise. 📊
The future holds many opportunities in AI, especially in realms like audio, video, and image processing. 🚀
Key Takeaways
AI agents can be built using both no-code and coding tools, making them accessible for different skill levels. 🔧
Understanding the components and workflows of AI agents is crucial for their effective implementation. 📚
The video provides a detailed exploration of agentic workflows like prompt chaining, routing, and autonomous agents. 🤖
Tina emphasizes the importance of prompt engineering as it holds the components of AI agents together. ✍️
Considering your own needs or business processes is a great way to decide what AI agents to develop. 💡
Overview
Tina Huang presents a detailed guide on creating AI agents, focusing on different tools and methodologies that cater to varied expertise levels. Whether you're new to coding or a seasoned developer, you'll find strategies to develop AI agents tailored to your needs. Through a deep dive into AI components, workflows, and prompt engineering, Tina equips viewers with a comprehensive understanding of building functional and effective agents.
The video breaks down crucial AI components and workflows, such as routing and autonomous systems. Tina provides insights on selecting appropriate models and tools for developing AI agents tailored to specific tasks. Her approach blends practical examples with theoretical frameworks, making it accessible and informative.
Tina further explores market opportunities for AI agent applications, encouraging viewers to think creatively about unmet needs in their lives or industries. She emphasizes the evolving nature of AI tech, especially in audio, video, and image processing, suggesting viewers stay informed to seize future opportunities.
Chapters
00:00 - 00:30: Introduction to Building AI Agents This chapter introduces AI agents and outlines the speaker's extensive experience in building them, including running a program called Lonely Octopus to teach AI skills. The chapter aims to provide a comprehensive guide to building AI agents, suitable for both non-coders using no-code tools and experienced software engineers looking to create AI startups.
00:30 - 02:30: Structure and Components of AI Agents The chapter "Structure and Components of AI Agents" starts with an introduction through real examples of AI agents built using different tools. It acknowledges the partnership with HubSpot and outlines the video structure. Initially, the chapter introduces the crucial components that constitute an AI agent, discussing their nature, available tools for each category, and guidelines on selecting the appropriate tool. The latter part of the chapter delves into common workflows of AI agents used today.
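The tools component discussed in this chapter can be sketched in code. Below is a minimal, hypothetical Python illustration; the `Tool` class and `web_search` stub are invented for this sketch and are not from any specific framework:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    """A capability an agent can invoke; the description helps the model pick it."""
    name: str
    description: str
    run: Callable[[str], str]

def web_search(query: str) -> str:
    # Stub: a real tool would call a search API here.
    return f"(results for: {query})"

TOOLS = {
    t.name: t
    for t in [Tool("web_search", "Search the web for current information", web_search)]
}

def call_tool(name: str, arg: str) -> str:
    """Dispatch a model-chosen tool call by name."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name].run(arg)
```

In a real agent framework, the model would see each tool's name and description and emit a structured tool call, which the runtime dispatches much like `call_tool` does here.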
04:00 - 05:00: Frameworks for Understanding AI Agents In this chapter, the focus is on understanding the frameworks for creating AI agents. The importance of prompt engineering is highlighted, as it can significantly influence the effectiveness of an AI agent. The chapter provides examples of implementing AI agents using both no-code tools and full-code approaches. Additionally, it discusses how to determine the most practical types of AI agents and startups to develop, along with specific recommendations for tech-enabled projects. Progress in developing agents for voice, video, and image is also noted.
05:30 - 15:30: Implementation Strategies for AI Agents The chapter 'Implementation Strategies for AI Agents' begins by introducing the concept of AI agents. It defines an AI agent as a system that perceives its environment, processes information, and autonomously takes actions to achieve specific goals. It also discusses AI agents as AI counterparts to human roles or tasks, exemplified by coding AI agents like Cursor. The chapter then explores various strategies for implementing such AI agents effectively.
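The perceive-process-act definition in this chapter can be sketched as a simple loop. This is a minimal, hypothetical Python sketch: in a real agent, `decide` would be an LLM call, whereas here it is a plain function over a toy counter environment:

```python
def run_agent(goal, perceive, decide, act, max_steps=10):
    """Minimal agent loop: observe the environment, decide, act, repeat."""
    for _ in range(max_steps):
        observation = perceive()            # perceive the environment
        action = decide(goal, observation)  # process information (an LLM in practice)
        if action == "done":                # goal achieved
            return "goal reached"
        act(action)                         # take an action autonomously
    return "step limit reached"

# Toy environment: a counter the agent increments until it hits the goal.
state = {"count": 0}
result = run_agent(
    goal=3,
    perceive=lambda: state["count"],
    decide=lambda goal, obs: "done" if obs >= goal else "increment",
    act=lambda action: state.update(count=state["count"] + 1),
)
```

The `max_steps` cap is the kind of safety bound real agent runtimes also impose so an agent cannot loop forever.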
23:30 - 26:42: Crash Course on Prompt Engineering
The chapter discusses AI-powered code editors whose agent modes use models like Claude 3.7 Sonnet and Gemini 2.5 Pro to autonomously perform coding tasks. It also covers the common application of AI agents in customer service chatbots, which companies are developing to handle inquiries, communicate with customers, file complaints, and resolve issues. The chapter provides a definition and understanding of an AI agent and hints at the complexities that arise during implementation.
27:00 - 47:00: Examples of AI Agent Implementations The chapter titled 'Examples of AI Agent Implementations' highlights the complexity involved in implementing AI agents. It starts by explaining that AI agents are not solitary entities but are often comprised of multiple sub-agents, each tasked with specific functions. These sub-agents collaborate within a multi-agent system to create a cohesive and functional AI that appears as a complete agent to users. The chapter sets the stage for an in-depth exploration of various AI agent implementations, emphasizing the intricate nature and collective operation of sub-agents within broader systems.
57:00 - 71:30: Choosing the Right AI Agent to Build This chapter discusses the process of building an AI customer service agent. It highlights the division of tasks among sub-agents: a general one handling initial customer queries and identifying issues, and specialized agents handling specific categories such as billing, IT, or sales based on the query tags. This approach streamlines customer service by directing inquiries to the most qualified agent.
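The routing pattern described in this chapter, a general sub-agent tagging queries and handing them to specialists, can be sketched briefly. In this hypothetical Python sketch, keyword rules and string-returning lambdas stand in for the LLM-backed sub-agents a real system would use:

```python
def classify(query: str) -> str:
    """Stand-in for the general sub-agent that tags a query with a category.
    A real system would use an LLM; simple keyword rules are used here."""
    q = query.lower()
    if "refund" in q or "billing" in q or "payment" in q:
        return "billing"
    if "error" in q or "crash" in q or "install" in q:
        return "it"
    return "general"

# Specialized sub-agents, one per category (stubs standing in for LLM calls).
SPECIALISTS = {
    "billing": lambda q: f"[billing agent] handling: {q}",
    "it":      lambda q: f"[IT agent] handling: {q}",
    "general": lambda q: f"[general agent] handling: {q}",
}

def route(query: str) -> str:
    """Routing workflow: tag the query, then hand it to the matching specialist."""
    return SPECIALISTS[classify(query)](query)
```

Keeping the classifier separate from the specialists is the point of the pattern: each sub-agent stays focused on one narrow job.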
72:00 - 73:30: Innovations and Future Directions The chapter titled 'Innovations and Future Directions' explores the concept of agentic workflows, particularly focusing on a type known as routing, which is employed by phone companies and other industries for effective problem solving. The discussion emphasizes the importance of understanding the underlying mechanics of these agents as they are being developed. It also addresses why multiple agent systems and diverse implementations are necessary, describing this need as intuitive when thinking about the functionality and versatility of agents.
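Another agentic workflow the video covers, prompt chaining, passes each sub-agent's output to the next like an assembly line. A minimal, hypothetical Python sketch, with plain functions standing in for LLM-backed sub-agents:

```python
def prompt_chain(stages, user_input):
    """Prompt chaining: each sub-agent consumes the previous one's output."""
    result = user_input
    for stage in stages:
        result = stage(result)
    return result

# Hypothetical report-writing pipeline: outline -> draft -> edit.
outline = lambda request: f"outline({request})"
draft = lambda o: f"draft({o})"
edit = lambda d: f"edited({d})"

report = prompt_chain([outline, draft, edit], "quarterly sales report")
```

In practice, each stage would be its own prompt and model call, optionally with a gate between stages that rejects bad intermediate output before it propagates.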
74:00 - 75:00: Conclusion and Final Advice In the concluding chapter titled 'Conclusion and Final Advice,' the discussion centers around the analogy between human roles within a company and the roles of AI agents. Just as humans have specialized roles and cannot effectively do everything themselves, AI agents too should have specialized functions to excel in their respective tasks. The chapter emphasizes the importance of specialization for achieving better outcomes, rather than overburdening a single AI agent with multiple tasks. This concept is crucial for understanding how to utilize AI agents effectively. The chapter also introduces a framework for understanding the components of AI agents, aiming to provide a structured approach to their implementation and integration.
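The components framework this chapter introduces can be summarized as a simple configuration structure. This is a hypothetical sketch, not a real framework API; the field names follow the component categories the video attributes to OpenAI (models, tools, knowledge and memory, guardrails, orchestration):

```python
from dataclasses import dataclass, field

@dataclass
class AgentComponents:
    """The interchangeable parts that make up an AI agent."""
    model: str                                           # core intelligence (an LLM)
    tools: list[str] = field(default_factory=list)       # interfaces to the world
    knowledge: list[str] = field(default_factory=list)   # static facts / documents
    memory: list[str] = field(default_factory=list)      # persistent history
    guardrails: list[str] = field(default_factory=list)  # behavior checks
    orchestration: str = "single-agent"                  # how sub-agents are wired

# Example: a customer-service agent using the routing workflow.
agent = AgentComponents(
    model="gpt-4o",
    tools=["web_search", "email"],
    guardrails=["topic_filter"],
    orchestration="routing",
)
```

Like the burger analogy used later in the video, each field can be swapped for a different option, but every agent needs some value in each slot.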
Building AI Agents In 44 Minutes Transcription
00:00 - 00:30 I learned how to build AI agents for you. I have spent hundreds of hours building AI agents and I actually run a program called Lonely Octopus where we teach people AI skills and give them the opportunity to build AI agents for companies as well. So, in this video, I'm going to attempt to distill down everything that I've learned to give you that comprehensive guide with frameworks and a variety of different tools to build any type of agent you want. Whether you're someone who doesn't know how to code and wants to stick with no code tools, or if you're a seasoned software engineer looking to build your next AI startup, I'll also be walking
00:30 - 01:00 through real examples of AI agents built using different tools. As per usual, there'll be little assessments throughout this video to help you retain the information as we go through things. Now, without further ado, let's get started. A portion of this video is sponsored by HubSpot. Here's the exact structure of the video. First, I'm going to introduce the crucial components that make up an AI agent. Talk about what they are, some of the tools for each category, and how to choose which tool you want to be using for each category. Next, we're going to go into the nitty-gritty and talk about some of the common agentic workflows that people are using today. I'll also be including a
01:00 - 01:30 crash course on prompt engineering for agents specifically, because the prompt is literally the thing that's going to make or break your agent. I'll then walk you through full examples of AI agents implemented using both no-code tools as well as full code. But what is the use of building these AI agents if they don't actually serve a purpose? That's why I'll also be covering how to figure out what kinds of AI agents and what kinds of AI startups or businesses you should be building, as well as specific tech-enabled suggestions for what to build. The progress made in voice, video, and image agents has enabled so
01:30 - 02:00 many cool use cases. Agents are coming. Agents, hi you fellas. Let's first define what is an AI agent. An AI agent is a system that perceives its environment, processes information, and autonomously takes actions to achieve specific goals. Now, from a more human perspective, oftentimes we tend to think about AI agents as an AI counterpart to a human role or a task that a human performs. That's why you often hear about AI agents in the context of a coding AI agent like Cursor
02:00 - 02:30 or Windsurf, which are AI-powered code editors that have an agent mode that can autonomously perform coding tasks, either with Claude 3.7 Sonnet or Gemini 2.5 Pro. Another very common AI agent use case is customer service chatbots. Many companies are now experimenting with customer service agents that are able to do things like handle inquiries, communicate with the customer, file a complaint for them, or resolve specific issues. Now, this is the definition and the experience of an AI agent. But when it comes to implementation, there's actually a lot
of different ways to implement these agents, and there's a lot of nuance to it. I'll give you a little preview about what I mean by this. Now, I'll be going into a lot more detail about this when I cover the exact implementation of different agents. But for now, I just want you to note that when we say AI agent, we're not just talking about, you know, an AI just sitting there doing its AI agent things by itself. It's oftentimes a bunch of sub-agents that do specific things and ultimately come together in multi-agent systems to form what we perceive as the actual, like, complete agent. For example,
a classic implementation of a customer service agent is oftentimes split up: first, a sub-agent that handles the customer queries, interacting with the customer and figuring out what the issue is, then tagging it to be passed along to a more specialized sub-agent. For example, my recent phone billing payment issue would be tagged as a billing and payments issue and passed along to another sub-agent that is specialized in dealing with billing and payments. There would also be other sub-agents specialized in IT and sales and other things that customer service in
phone companies do. By the way, this type of agentic workflow is called routing, and it's proven to be very, very effective at this type of problem. Anyways, we'll go into more detail about routing and other types of agentic workflows in a bit. But yes, I hope this gives you a little bit of understanding about how agents actually work under the hood, which is very important to know as we build them. Also, just to answer this question, which you might be thinking of: why is it that we have to have these multi-agent systems, these different types of implementations? And the reason for this is actually quite intuitive. If you think about agents the same way that
04:00 - 04:30 you think about humans in a company, humans have different roles. You don't have just one human that is trying to do everything at the same time. That human will get very confused, not be able to prioritize what they're supposed to do, and not be very good at any specific thing. And it's the same for agents. When we have different agents that are specialized in different things, the results of it all coming together are going to be far better than just having a single AI agent try to do everything. All right, so now I want to take a step back and give you a framework for understanding the components of AI agents. Sort of like, say, if you're
04:30 - 05:00 making a burger, a burger is made out of different components. There's a bun, a patty, vegetables, and condiments. You could switch out the type of bun, the type of vegetables, the type of patty, and the type of condiments. But you do need to have all these components for your burger to function as a burger as opposed to a weird sandwich or a hot dog. Same thing for agents. There are different components and you can switch out the different components for different things, but ultimately you need to have these components for it to be an agent. Now, unlike the components of a burger that have been long established, the components that make up
05:00 - 05:30 an AI agent are still relatively new. So people kind of have varying definitions, but the most comprehensive and well-defined one comes from OpenAI. As they explain, building agents involves assembling components across several domains such as models, tools, knowledge and memory, audio and speech, guardrails, and orchestration, and OpenAI provides composable primitives for each. Yeah, you know, obviously OpenAI is going to list its own things there first, but for each of these components, there are actually a lot of other tools available out there as well. Depending
05:30 - 06:00 on the type of agent that you want to build, some are better than others. And I will go into more detail about each of these components. But first, I just want to make a note that if you ever feel super overwhelmed because there's just like a new tool or new technology that's coming out like every single day, do not panic. Don't feel overwhelmed. It's okay because whatever this new like innovation or tool thing is that is revolutionizing AI agents, just realize that it's still going to be part of this framework. It's like a new type of condiment in the condiments category that just happens to be a little bit more spicy or something. I hope that
06:00 - 06:30 makes sense. I hope you get what I mean by that. Anyways, let's now actually move on to each of these different components. OpenAI has this handy-dandy little table. So, first you have the models component. These are your AI models, your large language models that are the core intelligence capable of reasoning, making decisions, and processing different modalities. Of course, the examples that OpenAI gives us are o1, o3-mini, GPT-4.5, GPT-4o, etc. Now, depending on the specific type of agent that you're building, you want to choose a different type of model within the OpenAI ecosystem. GPT-4o is
your flagship model. It's a thinking model that's really great at reasoning, multi-step problem solving, and complex decision-making, great at answering most questions. Now, if you want something that is more intensive, the trade-off is that it's going to be slower and more expensive. You have GPT-4.5; it's good for writing and exploring new ideas. You also have o3-mini, which has advanced reasoning capabilities but is also faster, and o3-mini-high, which is particularly good for coding and logic. Outside of the OpenAI ecosystem, Claude 3.7 Sonnet is usually the go-to model for people who do a lot of coding and
07:00 - 07:30 reasoning and STEM-subject-based stuff, although Gemini 2.5 Pro is challenging this right now. But honestly, in a month or whenever it is that you watch this video, the rankings will probably have all shifted anyway. Generally speaking, if you care the most about things being cheap, then you probably want to go with an open-source model and host it yourself. And if you care about things being fast, you want to go for smaller models. And most Google models, at least as of the time of this filming, also have longer context windows, if you care a lot about maintaining a long context window. Anyways, there are
a lot of websites out there that actually rank these different models' performance, like Vellum, for example. I don't know if that's how you pronounce it. So depending on what your use case is, you can actually just check out the rankings and see which model suits your needs the best. Next up is the tools category. Now, do not underestimate the importance of tools. Your model is simply your base model, but what really starts making models powerful is adding on different capabilities, like the ability to use tools. Tools allow agents to interface with the world, like being able to search the web, for example. And all of
the different applications that you see out there, these can potentially be turned into tools for the AI. Like, you can give it access to Google products like your Gmail, your calendar. You can give it access to the things that are on your hard drive. You can give it access to what's happening on your screen. You can give it access to your favorite apps like Slack or Discord, YouTube, Salesforce, Zapier, whatever. You can also build your own custom tools that you can give to the AI agent as well. If you use OpenAI's Agents SDK, you do need to know how to code to be able to
use this. They give you the ability to define your own tools, as well as some built-in tools like web search, file search, and computer use. You may have also heard of something called MCP, which is kind of all the rage these days and was built by Anthropic. It stands for Model Context Protocol, and it's a protocol that standardizes the way that you can provide things like tools to your large language model. This is quite a leap forward, because previously it was quite difficult for developers to provide their agents with different tools, because different software products configure their services in different ways. So as a developer you kind of had
to figure it out and piece it together. But basically, MCP has made it a lot easier. Now, do not worry if you're not a code-y person. There are also a lot of no-code or low-code tools that have built into them the ability for you to provide tools to your models. Some of the examples I'll show you later, like n8n, allow you to very easily drag and drop different tools and connect them to your large language models. For example, if you're trying to build a market research agent, it would need to have a tool to be able to search the internet and a tool to be able to analyze the data that it gathers. And if you wanted it to send an email
report to you, it would also need a tool to be able to access your email. Now, moving on to knowledge and memory. There are two different types. The first one is called the knowledge base, or static memory. This allows you to give your AI model static facts, policies, documents, just information that it can reference and access that remains relatively static over time. This is important if you're building something like an AI agent that does legal tasks. It may need to have specific legal documents for a particular case for a particular company, and maybe certain policies that are relevant for that specific company as
well. The other type of memory is persistent memory. This is memory that allows an AI agent to track conversation histories or user interactions beyond just a single session. This is really important for a lot of chatbot use cases. Say you have an AI personal assistant: you want to make sure that the personal assistant will still remember what happened yesterday. Again, OpenAI provides its own hosted services like vector stores, file search, and embeddings. There are also open-source versions of this where you can host your own databases, and then you can also perform different ways of doing RAG, which is retrieval-augmented
generation. Not going to go into way too much detail about this, but some solutions that people look into would be Pinecone, which is cloud-native and optimized for vector search, or Weaviate, which is open source. Again, if you're leaning more towards a no-code solution, you don't really have to worry about this; it's usually already taken care of by that solution. n8n, for example, already allows you to deal with this without you having to figure out all the complex code-y stuff. Next up is audio and speech. It's pretty interesting, because OpenAI does split this into its own separate category, while many other frameworks
don't really include this one specifically. And I think the reason they do this is because there have just been such innovations recently in audio formats. But basically, giving your agent the ability to handle audio and speech allows it to interact in natural language. This is really important for chatbot AI agents, because having the ability to communicate directly using natural language can be a much better user experience. Within the OpenAI ecosystem, they have their own ways of implementing this, while outside of that ecosystem, what people seem to use a lot, at least right now, is ElevenLabs, which is used for voice cloning and
generation. Oh, and for audio transcription, like audio to text, people do stick with Whisper, which is an OpenAI model. As of right now, like I said, these things change a lot. It's more important for you to understand the general category, the general component, as opposed to the specific tools within it. The next component is guardrails. Guardrails are really important in order to prevent irrelevant, harmful, or undesirable behavior. You know, once you create this agent, you've got to make sure that it's actually doing what it's supposed to be doing and not doing something else. If you have a customer service agent, you
want to make sure that it is in fact talking about customer service stuff and not giving you haikus or something like that. Outside of the OpenAI ecosystem, what's popular right now is Guardrails AI and LangChain guardrails. There are honestly a lot of different options in this category. But again, if you are using no-code tools, I think it's important for you to understand this component, but a lot of no-code tools already have solutions built into their platform. Finally, there is orchestration. And this is something that's super overlooked. Remember how we were talking about different sub-agents, how it is that you're chaining together different sub-agents in order to come up
with a final result for something. It also involves deploying the agent so it's able to do its thing in production, monitoring it, and improving it. Once you deploy the agent, you don't just run away and never look at it again, right? Over time the models keep changing, a lot of these technologies and tools change as well, and data keeps changing. So you need to keep monitoring and making sure that your agent is behaving the way it's supposed to behave. There are also a lot of different tools in this category. Oftentimes there's a framework, and the orchestration part is built into that framework. OpenAI has its own
system. There's also CrewAI, which is another framework for implementing multi-agent systems; it also has its own system for orchestrating and finally deploying. LangChain is also very popular for managing different agent interactions and deploying them, as well as LlamaIndex, which is particularly useful if you're creating an AI agent that has a lot to do with documents, static memory, and knowledge bases. Here is also a little mnemonic for you to remember the different components that make up an AI agent, which is going to be immediately useful because right now we are going to
do our first little assessment. I'm going to put some of the questions on screen. Comment below with your answers to make sure that you retain all the information we went through. Okay, so this is a very practical guide to building AI agents, and HubSpot offers a very practical free guide to building AI agents from a business perspective. I think this free resource is a really great complement to everything we're covering today, because it goes in depth on how to take these AI agents and make sure they're driving maximum business success. The playbook explains how AI agents are being used in businesses today with actual examples
14:00 - 14:30 and use cases, common pitfalls, and also discusses the future of work. It includes a checklist that helps your organization think through each phase of implementing AI agents from identifying the highest return on investment opportunities to defining success metrics as well as integration and scaling. I highly recommend that you check it out at this link over here also linked in description. Thank you so much HubSpot for creating these free practical resources and for sponsoring this portion of the video. Now back to the video. All right, now that we know the components that make up AI agents, let's now move on to the implementations. If
14:30 - 15:00 you remember what I said a bit earlier, AI agents are oftentimes not just a singular entity. They're actually broken down into different sub-agents that are interacting with each other. My favorite resource that covers these common agentic workflows and agent systems is the Building Effective Agents guide from Anthropic. So let's go through it. First of all, you have the basic building block of agentic systems. This is what Anthropic calls the augmented LLM. From this image, you can see that you have an input, you have the LLM, and you have the output. The LLM is able to generate its own search
queries, select appropriate tools, and determine what information it needs to retain in memory. If you were paying attention earlier, you'll see that there are overlaps between the components in this augmented LLM and OpenAI's components. This version is a little bit more bare-bones; it doesn't address things like guardrails or orchestration, but you can see that there is definitely overlap. That's okay. When it comes to things like testing and deployment, just remember the OpenAI components for those specific things. Just FYI, these augmented LLM building blocks are often called sub-agents as well. So now let's actually see how these building blocks,
these sub-agents fit into each other and work with each other to form your bigger AI agent. We're going to be starting with the simplest agentic workflows all the way to the more complex and the truly autonomous. All right, so the simplest common agentic workflow is called prompt chaining. Prompt chaining decomposes a task into a sequence of steps where each sub-agent processes the output of the previous one. In its simplest form, this is just like an assembly line, but you can also add in little gates where you can split it off into different paths; the logic is the same. You'll have an input, a sub-agent does something with that input,
passes it along to another sub-agent that does something else, and maybe to another one, etc., until you finally get an output. This kind of implementation is the most ideal for situations where the task can be easily broken down into subtasks and decomposed. An example of when prompt chaining could be useful is if you want your AI agent to generate a report. The input could be the description of what the user wants, and then a sub-agent would take that and maybe generate an outline, pass it along to another sub-agent that may check the outline against specific criteria, then pass it along to a writer sub-agent that would actually write the report, and then maybe
to an editor sub-agent that would actually edit the report. And the final output would be the report that follows the criteria that was specified. Routing is another type of workflow where you have an input coming in and you have a sub-agent that is dedicated to directing that specific input into a specific follow-up task. And each of these tasks is governed by a sub-agent that is specific to that task. Then finally you get the output after the processing. Routing works really well for complex tasks where there are distinct categories that are better handled separately. A classic example of when routing is useful is if you have a
17:00 - 17:30 customer service bot. You have a customer service bot that will be getting different types of queries that could be like general questions, refund requests, technical support, whatever it is that people ask customer service. Based on the nature of the query, the first sub agent should be able to route the most relevant task to the sub agent that is specialized for that task. Like if it's a refund request, then it would be routed to the specialist sub agent for refunds. Like if it's a technical support query, then it would be routed to the AI sub agent that is a specialist for handling technical support questions. Another common use case is by
routing different questions to different types of models. Some models are better at doing certain things than others. If it's a difficult STEM-related question, you might route it to Claude 3.7 Sonnet. Or if it's an easy question where you value speed, you might route it to Gemini Flash. The next workflow is parallelization. Paralleliz... oh god, I can't pronounce this. This specific agentic workflow usually has two key variations. This is when you have sub-agents that are working simultaneously on a task and then have all of their outputs aggregated
together. The first one is sectioning, which is breaking a task into independent subtasks that are run in parallel, and the second is voting, which is running the same task multiple times using different sub-agents to get diverse outputs that you aggregate together. An example of sectioning is if you're trying to evaluate how good the performance of a new model is for a given prompt. Each sub-agent could be evaluating a different aspect of the model's performance: one of them could be evaluating speed, one of them accuracy, etc. An example of voting is reviewing a
18:30 - 19:00 piece of code for vulnerabilities. You have different sub agents that are evaluating the code, and ultimately you aggregate their votes together to decide if this is in fact a vulnerability or not. The next workflow, and we're getting increasingly more complex, is orchestrator-workers. The orchestrator-workers workflow actually looks pretty similar to parallelization, but what's different about it is that you don't have a predetermined list of subtasks that will be done. So this is especially useful for more complex problems where you can't exactly predict what subtasks are going to be needed. Like for example, if you're
19:00 - 19:30 building agents that involve coding, oftentimes you don't know the exact number of files that need to be changed or the exact nature of the change itself. So you need to be dynamically making changes to multiple different files. Another example is search tasks. Like if you have a research assistant agent, this would involve gathering and analyzing a lot of different types of information from a lot of different sources, which cannot be predetermined ahead of time. Even more complex is the evaluator-optimizer workflow. This is approaching more autonomous situations where you're giving the sub agent, the
19:30 - 20:00 AI agent, a lot more autonomy and freedom in determining what it is that it should be doing. You have some sort of input, and the first sub agent would generate something, a solution, based upon that and pass it along to an evaluator sub agent. The evaluator sub agent would evaluate it, and if it's accepted, then that will be the output. Or if it feels like it's not good enough, it would send it back to the first sub agent, telling it it's rejected along with some feedback to improve. And this is like a circular loop that you keep doing until the evaluator sub agent thinks that the solution is good enough and passes it along to the output. This workflow is particularly useful if
20:00 - 20:30 there's a clear evaluation criteria and when you can see iterative refinement and improvement over time. An example where the evaluator-optimizer workflow is useful is if, say, you're doing some sort of literary translation. There may be nuances that the translator sub agent cannot capture the first time around. So the evaluator sub agent would be sending it feedback and telling it to keep going until it's able to capture all the nuances in the language. Another example is if you have a complex search task that you're trying to aggregate together into some form of ultimate report. You might
20:30 - 21:00 be doing research, and the evaluator sub agent would decide the research isn't deep enough and keep sending it back, again and again, until it's able to gather all the necessary information to fully cover your super complex report. And finally, we have the truly autonomous agent implementation. This one is tricky because it is actually the simplest implementation-wise, but it can result in very different and potentially very complex solutions. The agent will begin its work
21:00 - 21:30 with some form of human interaction. And once that task is clear, the agent will be completely independent. It will perform some sort of action or actions that will have some form of reaction in the environment. And the agent has to somehow figure out itself, from the environment, what the result of what it's doing is considered to be. Like for example, if it decides to use a tool where it executes some code, it needs to figure out itself if it's making progress towards the ultimate completion of the task or not. And it's going to keep doing that, getting the feedback
21:30 - 22:00 from the environment and judging how it's progressing, until ultimately it feels like it has completed the task that it was assigned. This kind of implementation, the very autonomous, freedom-giving type of agent implementation, is usually used for very open-ended problems where it's very difficult to predict the number of steps it should take or the exact path to get to the final result. You're basically just telling an agent, hey, do this thing, and it just kind of has to figure out itself how to do the thing, what the tasks involved are, whether it's making progress or not towards the thing, and then at some point
22:00 - 22:30 deciding that it has in fact completed the thing and coming back to you. You can get like really crazy good results from this, but oftentimes you also get some really crazy results in general. Some examples from the Anthropic article include a coding agent that's able to resolve software engineering benchmark tasks, which involve editing a lot of different files based on a task description, or their computer use implementation, where Claude was able to use a computer, with access to all of the different functionalities of this very complex machine, to accomplish specific
22:30 - 23:00 tasks. Here's a diagram that illustrates the path that a coding AI agent took in order to complete its task. You can see that there are a lot of different back-and-forth interactions with the environment, coming back and refining and everything like that, before going back to a human. As the article suggests, this kind of truly autonomous implementation is not something that you generally want to do, because in most situations you can actually go with a more predetermined agentic workflow, and it would yield more predictable results and be a lot cheaper. The article keeps repeatedly saying that you should always go
23:00 - 23:30 with the simplest implementation possible. Like, if you can achieve your AI agent goals through prompt chaining or routing, don't be doing things that are more complex. Just a general rule of thumb when you're building your AI agents, and actually when engineering and building things in general: don't overengineer. Okay, so we've covered all the different workflows. Now I want to do a quick little crash course on prompt engineering for AI agents from a practical perspective. For these AI agents, the prompts matter so much. It's really what holds everything together. Like you can have your agents and it has all these
23:30 - 24:00 tools and has access to all these really cool things, but if you don't have a good prompt, you're not able to pull all this together. So that's why I'm going to emphasize this part. When you're prompting for an AI agent, you need to have the full prompt, all of it, just there. You can't interactively correct it and add more information throughout the process. So there are six components that you should consider putting into your AI agent prompt. The first thing to specify is the role. This is where you tell it that it's an AI research assistant, but you also want to include things like the tone and how it should be behaving. So for example, you could write: you are an AI
24:00 - 24:30 research assistant tasked with summarizing the latest news in artificial intelligence; your style is succinct, direct, and focused on essential information. Next up is the task, and you can write: given a search term related to AI news, produce a concise summary of the key points. Then we have the input. This is where you can specify what the AI research assistant will be receiving. In this case, you can just write that the input is a specified AI-related search term provided by the user. But you can imagine that there could be other inputs that the AI research assistant could be receiving, like certain graphs and different documents. You want to specify and let
24:30 - 25:00 the AI assistant know exactly what it will be receiving. Fourth is the output. This is where you want to go into detail about what you want the AI research assistant to come up with. What is it supposed to ultimately look like? What's the final deliverable? In this case, you can write: provide only a succinct, information-dense summary capturing the essence of recent AI-related news relevant to the search term; the summary must be concise, approximately two to three short paragraphs totaling no more than 300 words. That way it knows exactly what it's supposed to output. Now, the fifth step of the framework is constraints. This is a
25:00 - 25:30 really, really, really important part that you want to be including in your prompt: not just what it's supposed to do, but also what it should not be doing. You could write: focus on capturing the main points succinctly; complete sentences and perfect grammar are not necessary; ignore fluff, background information, and commentary; do not include your own analysis or opinions. We don't care about the AI agent's opinions; we only want to focus on the facts. Finally, you have capabilities and reminders. This is where you want to tell the AI what it has access to, like certain tools that it may have, as well
25:30 - 26:00 as provide reminders for things that it should really, really, really keep top of mind, things that are really important. In this example, we gave the AI agent the ability to do web search. So we can tell it: you have access to the web search tool to find and retrieve recent news articles relevant to the search term. Also, we want to remind it that it needs to be very aware of the current date. A common issue that a lot of LLMs have is that they're not really aware of what date or time it currently is. So, since we're only interested in searching for things that are relevant
26:00 - 26:30 right now, we want to make sure that it's aware of what time it is and what the search window is. So we might write: you must be deeply aware of the current date to ensure the relevance of the news, summarizing only information published within the past seven days. A general tip is that the more important something is, the lower down in the prompt you want to put the reminder. It's just the way the AI processes information: it has a bias towards the most recent things. That was the crash course on AI agent prompt engineering. I hope you guys are also
26:30 - 27:00 not mad at me for making you actually learn the foundations first, because I do find that a lot of people who are doing vibe coding these days don't actually know the foundations. You end up building something and it's just not that great. Or if it's something that you want to tweak slightly, you end up making a lot of stupid mistakes because you don't understand the foundations. So now you're equipped with the information to actually go build something, with the knowledge and the confidence that it is in fact the best implementation. Here's now a little quiz
27:00 - 27:30 that I will now put on screen. Please answer these questions in the comment section to make sure that you're retaining all of this information that I am presenting. Now, in the next section I'm going to be showing you actual implementations of AI agents. I have included some no-code, low-code examples as well as fully coded examples, so there should be something for everybody here. This is a customer support AI agent, and we implemented this using n8n. n8n is a no-code, low-code platform that is super
27:30 - 28:00 easy to use that you can use to create different AI agents. In this case, we implemented this AI agent as a multi-agent system that follows the routing agentic pattern, which we talked about earlier. The way it works is that a customer will send an email inquiry, and then we have a text classifier, powered in this case by an OpenAI model, that's able to route the inquiry as technical support, billing, or general inquiry. And each of these has its own specific workflow after that. Let's see how it actually works. Let's go to my email over here and I'm going to
28:00 - 28:30 write an email to customer support. In this case it's going to go to [email protected]. I'm going to say refund because I am angry. Hello, I want a refund. Yes. Click send. You can see that the email's here. It classifies it as a billing situation. We have the AI agent, and the AI agent is able to use the email to respond back to the inquiry. And if we check our email again, we see the agent has responded to us. Hello, thank you for reaching out regarding a request for a refund, to assist you effectively, blah blah blah,
28:30 - 29:00 you know, it gives all this information, and then you can go ahead and send that information to the agent for it to process your refund. If it's classified as technical support, it also has its own workflow. If it decides that it can answer your technical support question directly using documentation, it can directly email you back the response as well. But here we also have an option where, if it can't figure out how to support you from a technical support perspective, it would actually escalate this and send it on Discord like this: hello team, customer needs help, please investigate further, the email ID is this ID over here. So a real human agent would
29:00 - 29:30 be able to jump in and start helping the customer in this case. It's really important to actually have this here, because you always want some way in your AI agent to escalate to an actual human. And of course, if it's a general inquiry, it would route to this branch over here, and then it would send a general email asking for additional information. This is another AI agent. It is an AI news aggregator agent. The way it works is that it's scheduled for 7:00 a.m. every day, and it's going to go and gather news from different newsletters as well as
29:30 - 30:00 Reddit. Then it will aggregate all of that information together and ultimately come up with a summary that it's going to send to me on WhatsApp. This is an example of the parallelization workflow pattern. So, it's not 7 a.m., but I'm just going to trigger the workflow right now and have it do its thing. It's going to be running everything over here. So, I want to actually make a note that even though it is a parallelization workflow, the limitation of n8n is that it actually still runs sequentially. If you implement this using a coded tool like OpenAI's Agents SDK, for example, which I
30:00 - 30:30 will show you an example of in just a little bit, it would actually run in parallel. But yeah, in this case, just to let you know, technically it is parallelization, but it isn't able to do that because of the limitations of the platform itself. Okay. So after running, it's going to send me a notification on WhatsApp where it gives me aggregated information from all of the different news sources. So, OpenAI launches GPT-5 alpha, AI ethics, Google's AI ethics, regulatory developments, blah blah blah, all these different things that are happening over here. And in the prompt I specified to make sure that it cites the sources. So, if I wanted to actually go
30:30 - 31:00 in and learn more about each of these different news reports, I could just click in and be able to look at the actual source itself. This is actually a really helpful AI agent to have, because in this prompt over here, you can see that I can exactly specify what I'm interested in, like an AI-related search term provided by the newsletter, for example, right? Like whatever it is I want, how I want it to be summarized, how I want everything to be aggregated together. So, it's a really handy little tool. I think it would be really useful for you as well if you are
31:00 - 31:30 someone who also has to go through a lot of information every day. Final n8n example. This is a multi-input daily expenses tracker AI agent. That is such a mouthful. The way it works is that you interact with it using WhatsApp. You can send it pictures of receipts for whatever you've spent. You can send it text as well; like if you spend $10, you can tell it that you spent $10. It is able to take all of that information and ultimately aggregate everything together to give you a final expenses report every single day. It also
31:30 - 32:00 stores it in memory on Google Sheets, and it will also send you that report on WhatsApp as well. And finally, at 9:00 p.m. every single day, it will send you a summary on WhatsApp of how much money you've spent. For example, here I said that I spent $10 on a potato. I don't know why; $10 for a potato is very expensive. Then it is able to put this on my expense tracker. So, potato over here, $10 for a potato. Here are all the other things that I've bought. You can see that I've bought a lot of things these
32:00 - 32:30 days. And at night, it tells me that my consumption has focused on living expenses, specifically with the purchase of potatoes totaling $10. This indicates a straightforward and essential spending pattern with no other itemized purchases recorded for the day. On some of my previous days, when I bought more than just the one potato, it says here: on April 7, 2025, the spending showed a significant emphasis on food, with large purchases like steak and chocolate totaling $4,000, making food the most dominant category. Minor living expenses were also recorded with the purchase of peanuts.
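To make the core step of this agent concrete, here is a minimal plain-Python sketch of the parsing-and-categorizing stage that the n8n workflow delegates to an LLM. This is not the actual workflow: `parse_expense` and the `CATEGORIES` table are made-up illustrations, and a regex stands in for the model call, just to show the shape of the record that would land in the Google Sheet.

```python
import re

# Hypothetical category table; in the real workflow an LLM would decide this
# (which is exactly where the "peanuts as living expenses" mistake crept in).
CATEGORIES = {
    "potato": "living expenses",
    "steak": "food",
    "chocolate": "food",
    "peanuts": "food",
}

def parse_expense(message: str) -> dict:
    """Turn a message like 'I spent $10 on a potato' into a structured record."""
    amount = re.search(r"\$(\d+(?:\.\d{2})?)", message)  # grab the dollar amount
    item = re.search(r"on (?:a |an )?(\w+)", message)     # grab the item name
    if not amount or not item:
        raise ValueError(f"could not parse: {message!r}")
    name = item.group(1).lower()
    return {
        "item": name,
        "amount": float(amount.group(1)),
        "category": CATEGORIES.get(name, "uncategorized"),
    }

record = parse_expense("I spent $10 on a potato")
print(record)  # {'item': 'potato', 'amount': 10.0, 'category': 'living expenses'}
```

Each parsed record would then be appended as a row to the sheet, and the 9:00 p.m. summary is just another model call over that day's rows.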
32:30 - 33:00 Okay, that is not exactly correct, as you can see. Maybe we still need to modify this prompt a little bit. Um, but yeah, this is an example of how you can track your expenses. Based upon my explanation of how this works, what agentic workflow design pattern do you think the multi-input daily expenses tracker AI agent is implemented with? Put that in the comments. Next, I wanted to show you an example that is implemented using code. Specifically, this was implemented
33:00 - 33:30 using OpenAI's Agents SDK. It's done using Python, and what it is is a financial research assistant that is able to take in an inquiry, search the internet, gather information about it, and aggregate it. It also has voice functionality, as well as language and translation functionality. And this follows the prompt-chaining agentic design workflow pattern, where we have a main manager. And actually, instead of just showing you the code to explain this, I'm going to use Cursor to show you how the AI agent works and also to run it as well. Just a little preview of my vibe coding video
33:30 - 34:00 that's going to be happening maybe in like two weeks, so stay tuned for that. Okay. So I'm going to say: could you please explain the way that the financial research assistant agent works? So, we have a main orchestrator, which is the financial research manager, and the core workflow steps are that it plans searches, performs searches, writes reports, and verifies the report. The way it does this is that after the manager kicks off the program, it passes it on to a planner agent. So it uses a
34:00 - 34:30 planner agent to break down the user's query into specific search terms. Each search term contains a query and a reason for searching, and it returns a financial search plan with multiple search items. It then passes along the search terms to a search agent, which performs each of these searches and then collects and aggregates all the search results. Then we go on to the analysis phase. It uses specialized agents for different aspects. So we have two agents over here. The first one: it passes to the financials agent, which would analyze key financial metrics, as well as the risk agent, which identifies
34:30 - 35:00 potential red flags. And both agents will return analysis summaries. Then all of these analysis summaries are passed along to the report writing phase, where you have a writer agent that's able to synthesize all that information, combine the search terms together with the financial and risk analysis, and generate a structured report using markdown, with a short summary and follow-up questions. Then we have a verifier agent, which then checks the report's accuracy and completeness. We also included voice interaction functionality so you're able to communicate and ask it questions, based upon the report that is
35:00 - 35:30 generated, using audio. And finally, you'll get your output and your results for your financial report. You can see that it's implemented based upon the prompt chaining agentic workflow, where the main orchestrator manager kicks off the query and passes it along to the planner agent, the search agent, and many other agents, until finally you're able to get a financial_report.txt with all the results contained within it. Let's actually run this now. Let's run the financial research agent, whatever. I
35:30 - 36:00 can't spell. It's fine. By the way, if you've never seen an AI coding agent or coding editor at work before, this is kind of what it's like. Honestly, after I started using Cursor, Windsurf, and AI coding agents in general, it has been a huge game changer for how people code and run code as well. So, all right, we will let it do its thing. It says it'll first help me run the financial research agent; first, let me check the workspace structure to ensure we have everything we need, blah blah blah. So let it do that. Okay. It's telling me that I need to install some things. So we'll just do that. Installing dependencies. Running into errors. Okay.
36:00 - 36:30 Run more things. Five minutes later. Okay. After running all of these dependencies, it says that we have the server running. Let's now run the financial research agent. I'm just going to write: what are the key financial metrics for Tesla? So, we're going to run this. Oh no, it didn't work. Honestly, a lot of vibe coding is just running things and then letting it install stuff and fix its own problems.
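While it sorts itself out, it's worth seeing what the orchestration described above boils down to. The sketch below is not the actual Agents SDK code from the repo: `stub_model` is a made-up stand-in for each LLM call, and the hard-coded search terms replace the planner's real output. It just shows the planner, search, analysis, writer, and verifier chain, with `asyncio.gather` doing the genuinely parallel searches mentioned earlier (the thing the n8n version couldn't do).

```python
import asyncio

async def stub_model(role: str, prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call via the Agents SDK.
    await asyncio.sleep(0)  # simulate waiting on the network
    return f"[{role}] {prompt[:40]}"

async def run_pipeline(query: str) -> tuple[str, str]:
    # 1. Planner agent breaks the query into search terms.
    plan = await stub_model("planner", query)
    # (In the real agent, these terms would come out of `plan`.)
    search_terms = [f"{query} revenue", f"{query} risks", f"{query} margins"]

    # 2. Search agents run concurrently, not one after another.
    results = await asyncio.gather(
        *(stub_model("search", term) for term in search_terms)
    )

    # 3. Specialist analysis agents, also in parallel.
    financials, risk = await asyncio.gather(
        stub_model("financials", "\n".join(results)),
        stub_model("risk", "\n".join(results)),
    )

    # 4. Writer synthesizes everything; a verifier then checks the report.
    report = await stub_model("writer", financials + "\n" + risk)
    verdict = await stub_model("verifier", report)
    return report, verdict

report, verdict = asyncio.run(run_pipeline("Tesla key financial metrics"))
print(report)
```

Swapping `stub_model` for real agent calls is essentially what the SDK version does; the chain itself stays the same.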
36:30 - 37:00 So, we're going to patiently wait for it to work. Okay, it says: enter a financial research query. Enter. Oh, looks like we don't have an OpenAI key. Let me put that in. Okay: key metrics for Tesla. It is starting the financial research and starting to do its thing. We'll perform seven searches; searching; planning report structure. And there we go. It looks like we have the report. All right. The financial agent has successfully generated a comprehensive report, and we
37:00 - 37:30 can actually find that report over here. So instead of actually having to read through everything, I'm going to use the voice functionality that has been implemented. So run the voice functionality. Tell me about the key metrics in the report. Sure. Here are the key financial metrics mentioned in the report. One, revenues. Tesla recorded revenues of $24.93 billion. The substantial revenue figure is largely attributed to the successful sales of their Model 3 and Model Y vehicles as
37:30 - 38:00 well as strategic expansions in the Berlin and Texas factories. Two... so you can communicate directly using voice. And finally, I want to show you guys how you can translate your report into Spanish. This uses MCP, which allows it to have access to a tool that can translate the report into Spanish, which it did over here. So this is an example of a coded implementation, and if you want to check out the code, I'll link it in
38:00 - 38:30 the description so you can check it out and play around with it yourself too. Remember that there are actually a lot of different ways that you can use to implement an AI agent. Choose what makes the most sense for the AI agent that you're building, as well as your own skill level. By the way, if you are interested in learning more about AI agents and how to build them, I wanted to let you know that I'll be launching an AI agents boot camp in the next few weeks. It's a four-week-long program that is really hands-on, where you're going to be building your own AI agents like the ones that you see in this video, as well as ones that are more advanced and more custom toward specific use cases. So, if you're interested, please do check out
38:30 - 39:00 the link over here, also linked in the description. Instead of just ending the video right now and being like, "All right, guys, go build your AI agents," I actually want to include this final section where I share how you should be thinking about what kind of AI agents you want to build in the first place. Because ultimately speaking, we're trying to build AI agents not just for funsies (hopefully, or maybe it is, I don't know, and then that's fine), but for a lot of us, we're trying to build AI agents so that they can be useful for us, useful for a business, useful for an enterprise, whatever, right? Maybe some of you guys also want to be starting your own AI agent
39:00 - 39:30 businesses or startups. By the way, if you haven't already, please do check out the Y Combinator YouTube channel. I have learned so much in terms of figuring out what kind of AI agents to build, what kind of startups to be doing, and what kind of things to be aware of while playing around in the AI space, and their videos are really, really worth watching. But I'm going to share with you the major insight that I got from watching this video, which is how to find your AI startup ideas. The easiest way of figuring out a useful AI agent to build is by starting with yourself first. What is it that you're currently doing that, if you were to offload it to an
39:30 - 40:00 AI agent, would make your life so much easier? Again, don't worry about what kind of tools and frameworks and tech stack it is right now, okay? Just think about what it is that would just make your life so much easier. For example, I work with a very lovely team and agency that takes care of the sponsorships that I do. And one of the people on the team actually messaged me on Slack saying that she wanted to build an AI agent that is able to access her emails and screen what are considered good leads versus bad leads, and only respond to emails that are considered to be good leads. I
40:00 - 40:30 thought that this was a great idea and I was like, "Yes, you should totally do this." And you can totally do this with no code using n8n as well. You can use the prompt I shared earlier to figure out the agentic workflow that is the most applicable in this specific situation, and then you can go build it using a no-code tool. But what if you're someone who is not currently working and solving problems every day? Like maybe you have just graduated, or you're currently still a student. Don't worry, YC also has really great advice for this. In this case, what you want to do is go undercover. Seeing as you yourself don't have the experience to understand what can be automated, instead
40:30 - 41:00 of just thinking of something in your head, the best approach is to go and meet up with someone who is in fact working, someone who either owns their own business or has a job or something like that. Just ask if you can shadow them. Try to figure out their problems. The thing is, oftentimes they might not even know their own problems, because they're so deeply entrenched in whatever they're doing on a day-to-day basis. They don't even realize there could be ways of doing things that are so much easier and so much better if they incorporated AI into their workflow. But you, you're coming in with a fresh pair of eyes. So look at
41:00 - 41:30 what they're doing and try to identify where you can build an AI agent to offload some of their tasks, automating some of what they're doing so that they're able to accomplish their goals even better. Once you start doing that and developing that, you oftentimes start to realize that whatever issue you had, or somebody else had, is something that many, many people have. And there you go. That's how you can start working on something that could eventually turn into a business or a startup as well. And finally, if you just want some high-level guidance, the absolute gold that I got from one of the YC videos as well is that for every SaaS company that
41:30 - 42:00 you see out there, every software-as-a-service company that you see out there, there will be an AI agent equivalent of that. Literally, for every company that is a SaaS unicorn, you could imagine there's a vertical AI unicorn equivalent. So there you go. That is literally such clear overarching guidance. Look at all the SaaS companies that are available right now. Think about what is the AI agent equivalent to that company, and create that. Finally, I want to talk about the specific tech-enabled innovations that
42:00 - 42:30 you can be working on right now. As always, the AI industry is just moving so quickly, and there are so many new technologies being developed every day. But the major, fundamental developments that we can see right now in 2025 are the huge leaps forward in terms of voice and audio. Audio generation is just freaking unreal right now. Here's a little excerpt from Sesame to show you what I mean. This is actually from a friend who showed me this, and I was just freaking mind-blown. Hobbies to meet people? Well, joining a club or online community can
42:30 - 43:00 be really fun, especially if you're into gaming or crafting. Volunteering is also a great way to connect with awesome people who care about the same things as you. And hey, if you're watching this, don't forget to subscribe to Tina's channel for more awesome tips. This is also why OpenAI itself, in its SDK, has a whole category dedicated to voice agents, because there are just so many use cases enabled by that. There are also massive developments in image models, like Reve, Gemini Flash image generation, as well as GPT-4o image generation. And there are also video models like
43:00 - 43:30 Sora. So, anything related to image and video, these are all things that are also ripe for disruption. Now, I'm ending this video with a final general piece of advice. There's always so much stuff happening in this industry. If you ever feel overwhelmed by what is happening, try to relax, calm down, and think back to the frameworks and components that I presented today. There's a reason why I created this video where I'm not just showing you tutorials of things, or just telling you about the new things that people are building and the new
43:30 - 44:00 agents that people are building as well. It's because, with all of this going on, if you just focus on understanding the fundamental components, the fundamental frameworks, and the fundamental technologies, then everything that comes on top of that, you're able to categorize in your mind as being actually important for you to learn about or not. So, keep up with the actual big innovations in this category, things like actual model innovations (Gemini 2.5 Pro recently came out, for example) and MCP, which enables better tool use. A lot of the other stuff is hype that you don't really need to pay so much attention to. Keep
44:00 - 44:30 learning. Keep doing your own projects. Build out your own AI agents. And when the time comes, when the opportunity comes where your skill set and your interests align with what is in demand in the world right now, you'll be off building a successful AI agent business, or startup, or just a side hustle or fun project as well. Be patient, my friend. All right, as promised, here is the final little assessment. Please write your answers to these in the comments. Now, thank you so much for watching to the end of this very long, very intensive video. I really hope that it has been helpful, and
44:30 - 45:00 I will see you guys in the next video or live stream.