Enhancing AI Interactions with Model Context Protocol

Building Agents with Model Context Protocol - Full Workshop with Mahesh Murag of Anthropic

Summary

In this workshop, Mahesh Murag from Anthropic introduces the Model Context Protocol (MCP), an open protocol that standardizes how AI applications and agents interact with external tools and data sources. He positions MCP as an integration layer comparable to what APIs did for web front ends and back ends and what the Language Server Protocol (LSP) did for IDEs. Mahesh covers the motivation behind MCP, its three interfaces (tools, resources, and prompts), and its adoption across applications, IDEs, and community-built servers. He also looks ahead to remote servers, OAuth-based authentication, and a registry for discovering servers. The session includes audience questions and a live demonstration in which Claude for Desktop acts as an MCP client.

Highlights

Mahesh Murag of Anthropic presented the Model Context Protocol 🎤
MCP acts as a bridge that lets AI systems access and use external data and tools 🌉
The protocol is inspired by APIs and the Language Server Protocol (LSP) as earlier standardization layers 🛠️
Mahesh demonstrated how MCP can make AI systems more personalized and powerful 💪
Planned MCP enhancements, such as remote servers and better authentication, aim to make AI systems more dynamic and secure 🔮

Key Takeaways

MCP standardizes how AI applications and agents interact with external systems 🌐
It simplifies connecting AI apps to data and tools through a single protocol layer 🔗
MCP makes AI systems more personalized and context-aware 📚
The protocol supports dynamic discovery and integration of new capabilities 🚀
Upcoming MCP features include remote server hosting and improved authentication 🔒

Overview

In a workshop led by Mahesh Murag from Anthropic, participants explored the Model Context Protocol (MCP). The protocol is designed to standardize how AI applications communicate with external systems, which makes those applications more capable and more personalized.

Mahesh explained how MCP acts as an integration layer, much like APIs and the Language Server Protocol (LSP), so that any MCP-compatible application or agent can connect to any MCP server for tools and data. This makes AI systems both more powerful and more context-aware.

The session also covered planned developments for MCP, such as remote servers and better authentication, which are intended to make AI systems more adaptable and secure.

Chapters

00:00 - 00:30: Introduction. Mahesh, a member of the Applied AI team at Anthropic, welcomes the audience. He is excited to see a full room and pleased that the audience chose his talk over one from OpenAI.
00:30 - 01:00: Overview of the Talk. The speaker introduces MCP (Model Context Protocol) and sets the tone for an interactive session, encouraging questions even though the format is more talk than workshop. He plans to discuss the philosophy behind MCP and why Anthropic decided to build and launch it.
01:00 - 02:00: Why MCP? The chapter covers the traction MCP (Model Context Protocol) has gained in recent months, the patterns that let it be adopted in AI applications and agents, and the roadmap ahead. The primary motivation for MCP was the realization that models are only as good as the context provided to them. This seems obvious now but was less so a year ago, when users mostly brought context into chatbots by copying, pasting, or typing.
02:00 - 03:00: Standards in AI Development. The chapter discusses the evolution of chatbot applications from basic systems where users had to supply context manually to applications whose models have hooks into user data and context. This integration makes them more powerful and more personalized. MCP is introduced as the open protocol Anthropic launched to enable that integration.
03:00 - 05:00: Benefits to Developers and Enterprises. The chapter describes MCP as an open protocol designed to enable seamless integration between AI applications, agents, tools, and data sources. It draws a parallel with APIs, which standardized interactions between the front end and back end of web applications by translating requests between the two layers. MCP aims to provide a similar standard for connecting AI applications to external systems.
05:00 - 06:30: Adoption and Community Involvement. The chapter covers LSP (Language Server Protocol) and its effect on how IDEs (integrated development environments) interact with language-specific tools. LSP standardizes these interactions: a language server is built once, and any LSP-compatible IDE can use it. The chapter presents LSP as a major inspiration for MCP.
06:30 - 08:00: Core Concepts of MCP. The chapter covers the inception and purpose of the Model Context Protocol (MCP). It explains how MCP standardizes the interaction between AI applications and external systems through three primary interfaces: prompts, tools, and resources. It begins by describing the situation before MCP existed, which illustrates the need for a standard protocol.
08:00 - 09:30: Tools, Resources, and Prompts Explained. The chapter discusses the fragmentation companies faced when building AI systems. Different teams often built AI applications with custom implementations that did not integrate well with other systems. The chapter emphasizes the need for a unified approach to improve compatibility.
09:30 - 15:00: Questions and Clarifications. The chapter covers custom prompt logic, the different ways teams bring in tools and data, and the different ways they federate access to those resources. It then describes the standardized AI development environment that MCP provides. Examples of MCP clients, such as Anthropic's first-party applications, illustrate how both individual teams within a company and the broader industry can adopt the same standard.
15:00 - 25:00: MCP in Practice: Claude for Desktop Demo. This chapter discusses applications such as Cursor, Windsurf, and Goose, which are MCP clients. It highlights the standard interface that lets these client applications connect to any MCP server without additional work. An MCP server is described as a wrapper, or a way of federating access to the systems and tools that are relevant to an AI application.
25:00 - 39:00: Agents and MCP. This chapter covers connecting large language models (LLMs) to external systems and applications, such as databases, CRMs like Salesforce, and version control systems like Git. The focus is on how LLMs can reach both remote servers and local systems to read, write, and fetch data through an MCP server.
39:00 - 48:00: Future of Agents with MCP. The chapter discusses the value of MCP for different parts of the ecosystem. Application developers who make their clients MCP-compatible can connect them to any server with no additional work. API and tool providers can build an MCP server once and see it adopted across many AI applications. Both sides give LLMs (large language models) access to the data that matters.
48:00 - 55:00: Building Effective Agents with MCP Agent Framework. This chapter discusses the N times M problem: the large number of permutations in how client applications talk to servers. MCP is presented as the layer between application developers and tool and API developers that flattens those permutations and gives language models access to data, which leads to more powerful applications.
55:00 - 66:00: Capabilities of MCP for Agents. The chapter covers what MCP enables for end users: context-rich AI applications. It mentions demos shared on Twitter featuring applications such as Cursor and Windsurf, which are context-aware and can take action in the real world. It also discusses the benefit for enterprises, which now have a clear way to separate concerns between teams, for example by letting one team own the infrastructure layer.
66:00 - 73:00: Sampling and Composability. The chapter discusses vector databases and retrieval-augmented generation (RAG) systems within organizations. Before MCP, each team in a company typically built its own way of accessing these systems, including its own prompting and chunking logic. With MCP, an enterprise can have one team own the vector database interface and turn it into an MCP server, which keeps access consistent and easier to maintain across the organization.
73:00 - 79:00: Remote Servers and OAuth. The chapter describes how a team that owns an MCP server can publish and document a set of APIs, so that other teams in the company can build AI applications quickly without consulting the owning team each time. It compares this to a microservices architecture in which each team owns a specific service and the company's roadmap moves faster.
79:00 - 83:00: Registry and Discovery. The chapter covers the recent excitement around MCP adoption, which comes up frequently in Anthropic's conversations with customers and partners. It walks through different personas, starting with applications and IDEs, where developers can provide context to the IDE while coding.
83:00 - 87:00: Self-Evolving Agents. The chapter describes how agents inside IDEs interact with external systems such as GitHub and documentation sites. It also highlights development on the server side, with roughly 1,100 community-built open-source servers available, in addition to servers built by companies themselves.
87:00 - 93:00: Complementary Technologies. The chapter mentions the official integrations that companies have published for connecting to their systems. It also highlights adoption in the open-source community, where many contributors work on the core protocol and the infrastructure layer around it.
93:00 - 102:00: Medium-Term MCP Developments. The chapter turns to what it means to build with MCP and the core concepts of the protocol. It describes a diagram with the MCP client on the left side, which invokes tools, queries for resources, and interpolates prompts.
102:00 - 104:30: Conclusion and Contact Information. The chapter describes how the server builder exposes tools, resources, and prompts in a way that any connected client can consume. A tool is model-controlled: the server exposes it to the client application, and the model decides when to invoke it.

Building Agents with Model Context Protocol - Full Workshop with Mahesh Murag of Anthropic Transcription

00:00 - 00:30 [Music] Hey everyone, hello. Thank you all for coming. My name is Mahesh and I'm on the Applied AI team at Anthropic. Really excited to see a very full room, and very excited that you chose me over OpenAI.
00:30 - 01:00 Thank you very much. So today we're going to be talking about MCP, the Model Context Protocol. This is more of a talk than a workshop, but I'll do my best to keep it interactive. If you want to ask questions, feel free to do so and I'll do my best to answer them. Today we're going to talk about the philosophy behind MCP and why we at Anthropic thought that it was an important thing to launch and build. We're going to talk about some of the
01:00 - 01:30 traction of MCP in the last couple of months, and then some of the patterns that allow MCP to be adopted for AI applications and for agents, and then the roadmap and where we're going from here. Cool. So our motivation behind MCP was the core concept that models are only as good as the context we provide to them. This is a pretty obvious thing to us now, but I think a year ago, when most AI assistants or
01:30 - 02:00 applications were chatbots, you would bring in the context to these chatbots by copy-pasting or by typing, or kind of pasting context from other systems that you're using. But over the past few months and the past year, we've seen these evolve into systems where the model actually has hooks into your data and your context, which makes it more powerful and more personalized. And so we saw the opportunity to launch MCP, which is an
02:00 - 02:30 open protocol that enables seamless integration between AI apps and agents and your tools and data sources. The way to think about MCP is by first thinking about the protocols and systems that preceded it. APIs became a thing a while ago to standardize how web apps interact between the front end and the back end. It's a kind of protocol or layer in between them that allows them to translate requests from the back end to the front end and
02:30 - 03:00 vice versa, and this allows the front end to get access to things like servers and databases and services. LSP came later, and that standardizes how IDEs interact with language-specific tools. LSP is a big part of our inspiration. It's called the Language Server Protocol, and it allows an IDE that's LSP-compatible to go and talk to, and figure out the right ways to interact with, different features of coding languages. You could build a Go LSP server once, and any IDE
03:00 - 03:30 that is LSP-compatible can hook into all the things about Go when you're coding in Go. So that's where MCP was born. MCP standardizes how AI applications interact with external systems, and it does so in three primary ways, three interfaces that are part of the protocol, which are prompts, tools, and resources. So here was the lay of the land before MCP that
03:30 - 04:00 Anthropic was seeing. We spend a lot of time with customers and people trying to use our API to build these agents and AI applications, and what we were seeing is, across the industry but also even inside of the companies that we were speaking to, there was a ton of fragmentation about how to build AI systems in the right way. One team would kind of create this AI app that hooks into their context with this custom implementation that has its own custom
04:00 - 04:30 prompt logic, with different ways of bringing in tools and data, and then different ways of federating access to those tools and data to the agents. And if different teams inside of a company are doing this, you can imagine that the entire industry is probably doing this as well. The world with MCP is a world of standardized AI development. You can see in the left box, which is the world of an MCP client, there are some client examples here, like our own first-party applications,
04:30 - 05:00 recently applications like Cursor and Windsurf, and agents like Goose, which was launched by Block. All of those are MCP clients, and there's now a standard interface for any of those client applications to connect to any MCP server with zero additional work. An MCP server, on the right side, is a wrapper or a way of federating access to various systems and tools that are relevant to the AI
05:00 - 05:30 application. So it could be a database to query and fetch data and to give the LLM access to databases and records. It could be a CRM like Salesforce, where you want to read and write to something that is hosted on a remote server but you want the LLM to have access to it. It could even be things on your local laptop or your local system, like version control and Git, where you want the LLM to be able to connect to the APIs that run on your computer itself.
05:30 - 06:00 So we can talk about the value that we've seen for different parts of the ecosystem over the past few months. The value for application developers is: once your client is MCP-compatible, you can connect it to any server with zero additional work. If you're a tool or API provider, or someone that wants to give LLMs access to the data that matters, you can build your MCP server once and see adoption of it everywhere across
06:00 - 06:30 all of these different AI applications. And just a quick aside: the way I like to frame this is, before MCP we saw a lot of the N times M problem, where there are a ton of different permutations for how these folks interact with each other, how client applications talk to servers. MCP aims to flatten that and be the layer in between the application developers and the tool and API developers that want to give LLMs access to this data. For end users, obviously, this leads to more powerful and
06:30 - 07:00 context-rich AI applications. If you've seen any of the demos on Twitter with Cursor and Windsurf, or even our own first-party applications, you've seen that these systems are kind of context-rich, and they actually know things about you and can go and take action in the real world. And for enterprises, there's now a clear way to separate concerns between different teams that are building different things on the roadmap. You might imagine that one team that owns the infrastructure
07:00 - 07:30 layer has a vector DB or a RAG system that they want to give access to, to other teams building AI apps. In a pre-MCP world, what we saw was every single individual team would build their own different way of accessing that vector database, and deal with the prompting and the actual chunking logic that goes behind all of this. But with MCP, an enterprise can have a team that actually owns the vector DB interface and turns it into an MCP server. They can own and maintain and
07:30 - 08:00 improve that, publish a set of APIs, and document it. And then all of the other teams inside their company can now build these AI apps in a centralized way where they're moving a lot faster, without needing to go and talk to that team every time that they need access to it or need a way to get that data. And so you can kind of imagine this is like a world with microservices as well, where different people, different teams can own their specific service, and the entire company and the roadmap can move a lot faster.
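As an illustration of the N times M framing above (a sketch, not something shown in the talk), the difference can be made concrete with a little arithmetic. Without a shared protocol, every client application needs its own connector to every server. With one, each client and each server implements the protocol once. The client and server names below are hypothetical placeholders.

```python
def integrations_without_protocol(num_clients: int, num_servers: int) -> int:
    """Every client needs a custom connector for every server: N * M."""
    return num_clients * num_servers


def integrations_with_protocol(num_clients: int, num_servers: int) -> int:
    """Each client and each server implements the shared protocol once: N + M."""
    return num_clients + num_servers


# Hypothetical participants, chosen only to make the numbers concrete.
clients = ["ide_agent", "desktop_chat_app", "cli_agent"]
servers = ["github", "crm", "vector_db", "local_git"]

print(integrations_without_protocol(len(clients), len(servers)))  # 12
print(integrations_with_protocol(len(clients), len(servers)))     # 7
```

The gap widens as either side grows, which is the sense in which a common layer "flattens" the problem.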
08:00 - 08:30 Cool. So let's talk about adoption. This is something that's been really exciting over the past couple of months. It kind of comes up in almost every Anthropic conversation with people that we work with and a lot of our customers. This slide covers a few different personas, but we can start with the applications and the IDEs. This has been really exciting recently, and it provides this really nice way for people that are coding in an IDE to provide context to that
08:30 - 09:00 IDE while they're working, and the agents inside those IDEs go and talk to these external systems, like GitHub, like documentation sites, etc. We've also seen a lot of development on the server side. I think to date there are something like 1,100 community-built servers that folks have built and published open source. There are also a bunch of servers built by companies themselves. I just built one as an example. There are folks like [inaudible] and a
09:00 - 09:30 bunch of others that have published official integrations for ways to hook into their systems. There's also a ton of adoption on the open-source side as well, so people that are actually contributing to the core protocol and the infrastructure layer around it. So those
09:30 - 10:00 bit about what it actually means to build with MCP and some of the core concepts that are part of the protocol itself. Here's kind of a view of the world of how to actually build with MCP. So on the left side you have the MCP client that invokes tools, that queries for resources, and
10:00 - 10:30 interpolates prompts, and kind of fills prompts with useful context for the model. On the server side, the server builder exposes each of these things: they expose the tools, the resources, and the prompts in a way that's consumable by any client that connects to it. So let's talk about each of these components. A tool is maybe the most intuitive, and the thing that's developed the most over the past few months. A tool is model-controlled, and what that means is the server will
10:30 - 11:00 expose tools to the client application, and the model within the client application, the LLM, can actually choose when the best time to invoke those tools is. So if you use Claude for Desktop or any of these agent systems that are MCP-compatible, usually the way this works is you'll interpolate various tools into the prompt, you'll give descriptions about how those tools are used as part of the server definition, and the model inside the application will choose when the best time to invoke those tools is.
11:00 - 11:30 And these tools, the possibilities are kind of endless. I mean, it's read tools to retrieve data; it's write tools to go and send data to applications or kind of take actions in various systems; it's tools to update databases, to write files on your local file system. It's kind of anything. Now we get to resources. Resources are data exposed to the application, and they're application-controlled. What that means is the
11:30 - 12:00 server could define or create images, it could create text files, JSON. Maybe it's keeping track of, you know, the actions that you've taken with the server within a JSON file, and it exposes that to the application, and then it's up to the application how to actually use that resource. Resources provide this rich interface for applications and servers to interact that goes beyond just you talking to a chatbot using text. So some of the use cases
12:00 - 12:30 we've seen for this are files, where the server either surfaces a static resource or static file, or a dynamic resource, where the client application can send the server some information about the user, about the file system that they're working in, and the server can interpolate that into this more complex data structure and send that back to the client application. Inside Claude for Desktop, resources manifest as attachments, so we let people, when
12:30 - 13:00 they're interacting with a server, go and click into our UI and then select a resource, and it gets attached to the chat and optionally sent to the model for whatever the user is working on. But resources could also be automatically attached. You could have the model decide: hey, I see that there's this list of resources, this one is super relevant to the task we're working on right now, let me automatically attach this to the chat or send it to the model, and then proceed from there. And finally, prompts. Prompts are user-controlled. We like to think of them
13:00 - 13:30 as the tools that the user invokes, as opposed to something that the model invokes. These are predefined templates for common interactions that you might have with the specific server. A really good manifestation of this I've seen is in the IDE called Zed, where you have the concept of slash commands, where you're talking to the LLM, to the agent, and you say: hey, I'm working on this PR, can you go and summarize the work that I've done so far? And you just type /gpr, you give it the PR ID, and it
13:30 - 14:00 actually will interpolate this longer prompt that's predefined by Zed inside of the MCP server, and it gets sent to the LLM, and you generate this really nice full data structure, or full prompt, that you can then send to the LLM itself. A few other common use cases that we've seen are: different teams have these standardized ways of, let's say, doing document Q&A. Maybe they have formatting rules; they have, you know, inside of a transcript they'll
14:00 - 14:30 have different speakers and different ways they want the data to be presented. They can surface that inside the server as a prompt, and then the user can choose when it makes the most sense to invoke. Cool, I'll pause there. Any questions so far about these various things and how they all fit together? Yeah, in the back.
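To make the three interfaces described above easier to compare side by side, here is a minimal sketch in plain Python. It is not the MCP SDK and not the wire protocol: `ToyServer`, its methods, the `list_issues` tool, the `log://actions.json` URI, and the `summarize_pr` prompt are all hypothetical names invented for illustration. The only thing it tries to capture is who drives each primitive: the model chooses tools from their descriptions, the application decides when to read a resource, and the user picks a prompt template.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyServer:
    """Hypothetical in-memory stand-in for a server exposing three primitives."""
    tools: Dict[str, dict] = field(default_factory=dict)
    resources: Dict[str, Callable[[], str]] = field(default_factory=dict)
    prompts: Dict[str, str] = field(default_factory=dict)

    def add_tool(self, name: str, description: str, fn: Callable) -> None:
        self.tools[name] = {"description": description, "fn": fn}

    def add_resource(self, uri: str, reader: Callable[[], str]) -> None:
        self.resources[uri] = reader

    def add_prompt(self, name: str, template: str) -> None:
        self.prompts[name] = template


server = ToyServer()

# Tool (model-controlled): the model sees the description and decides when to call it.
server.add_tool(
    "list_issues",
    "List open issues for a repository.",
    lambda repo: [f"{repo}#1", f"{repo}#2"],
)

# Resource (application-controlled): data the client app may attach to the chat.
server.add_resource("log://actions.json", lambda: '{"actions": []}')

# Prompt (user-controlled): a predefined template the user invokes, e.g. via a slash command.
server.add_prompt("summarize_pr", "Summarize the work done so far in pull request {pr_id}.")


def tool_descriptions(s: ToyServer) -> str:
    """Text a client could interpolate into the model's prompt."""
    return "\n".join(f"- {name}: {tool['description']}" for name, tool in sorted(s.tools.items()))


def call_tool(s: ToyServer, name: str, **arguments):
    return s.tools[name]["fn"](**arguments)


def read_resource(s: ToyServer, uri: str) -> str:
    return s.resources[uri]()


def render_prompt(s: ToyServer, name: str, **arguments) -> str:
    return s.prompts[name].format(**arguments)


print(tool_descriptions(server))
print(call_tool(server, "list_issues", repo="demo"))
print(read_resource(server, "log://actions.json"))
print(render_prompt(server, "summarize_pr", pr_id=42))
```

Keeping the three registries separate mirrors the separation of control in the talk: a client can treat each one differently even though all three ultimately deliver context.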
14:30 - 15:00 Yeah, I think a big part of MCP... sorry, the question is: why aren't resources modeled in the same way as tools? Why couldn't they have just been tools? A big part of the thinking behind MCP broadly is it's not just about making the model better; it's about actually defining the ways that the application itself can kind of interact with the server in these richer ways. And so tools are typically model-controlled, and
15:00 - 15:30 we want to create a clean separation between what's model-controlled and application-controlled. So you could actually imagine an application that's MCP-compatible decides when it wants to put a resource into context. Maybe that's based on predefined rules; maybe that's based on it makes an LLM call and makes that decision. But we wanted to create a clean separation for the client builder and the server builder for what should be invoked by the model and what should be invoked by the application. I saw you go first. Yeah,
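The "predefined rules" option mentioned in this answer can be sketched as follows. This is an assumption-laden illustration, not how any particular client works: the keyword-overlap rule, the `select_resources` function, and the resource URIs and descriptions are all hypothetical. The point is only that the application, not the model, decides which resources enter the context.

```python
from typing import Dict, List


def select_resources(task: str, resources: Dict[str, str], max_attach: int = 2) -> List[str]:
    """Pick resource URIs whose description shares words with the task.

    This is a predefined rule applied by the application; no model call is involved.
    """
    task_words = set(task.lower().split())
    scored = []
    for uri, description in resources.items():
        overlap = len(task_words & set(description.lower().split()))
        if overlap > 0:
            scored.append((overlap, uri))
    # Highest overlap first; ties broken alphabetically by URI for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uri for _, uri in scored[:max_attach]]


# Hypothetical resources a server might list, keyed by URI with a short description.
available = {
    "file://style_guide.md": "formatting rules for release notes",
    "file://oncall.md": "oncall rotation schedule",
    "file://changelog.md": "release notes history",
}

attached = select_resources("draft release notes for version 2", available)
print(attached)  # ['file://style_guide.md', 'file://changelog.md']
```

An application could swap this rule for an LLM call that ranks the same list, which is the other option the speaker mentions, without changing anything on the server side.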
15:30 - 16:00 glasses. Yeah, the question is: are tools the right way to expose, let's say, a vector database to a model? The answer is kind of up to you. We think that these are really good to
16:00 - 16:30 use when it's kind of ambiguous when a tool should be invoked. So maybe the LLM sometimes should go and call a vector DB, maybe sometimes it already has the information in context, and sometimes maybe it needs to go ask the user for more information before it does a search. So that's probably how we think about it. If it's predetermined, then you probably don't need to use a tool; you just always call that vector DB. Sorry, second. So the most things MCP is able to
16:30 - 17:00 more authentication, how... I'm going to get to that one later, because it's very relevant and we have a lot to say. So I may have missed this, but have you gone down the route of using an agentic framework, tool calling,
17:00 - 17:30 as you progressed? Would you just wrap the MCP with a tool if you had a sy... Yeah, I think it sounds like the broader question is: how does MCP fit in with agent frameworks? Cool. Yeah, I mean, the answer is they kind of complement each other. Actually, LangGraph just this week released a bunch of connectors, or I think they're called adapters, for LangGraph agents to connect to MCP. So if you already have a system
17:30 - 18:00 built inside LangGraph or another agent framework, if it has this connector to MCP servers, you can expose those servers to the agent without having to change your system itself, as long as that adapter is installed. So we don't think MCP is going to replace agent frameworks; we just think it makes it a lot easier to hook into servers, tools, prompts, and resources in a standardized way. Okay, so I can... the framework, a tool, the tool, MCP. So going forward, many of the tools would
18:00 - 18:30 just be a wrapper for MCP? Yeah, the framework could call a tool, and that tool could be exposed to that framework from an MCP server if the adapter exists. Does that make sense? Cool. I'll take one more if there are. Yeah.
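The adapter idea in this exchange can be sketched in a few lines. This is not the LangGraph adapter or any real framework's API: `ToyMCPServer`, `FrameworkTool`, and `adapt` are hypothetical names, and the server's `list_tools` and `call_tool` methods are simplified stand-ins for a server's tool listing and tool invocation. The sketch shows only the shape of the pattern, in which each server-side tool is wrapped as an object the framework already knows how to call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


class ToyMCPServer:
    """Hypothetical stand-in for a server that can list and call its tools."""

    def __init__(self) -> None:
        self._tools = {
            "search_docs": ("Search the documentation.", lambda query: f"results for {query}"),
        }

    def list_tools(self) -> List[Dict[str, str]]:
        return [{"name": name, "description": desc} for name, (desc, _) in self._tools.items()]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any:
        _, fn = self._tools[name]
        return fn(**arguments)


@dataclass
class FrameworkTool:
    """Hypothetical tool type that an agent framework already understands."""
    name: str
    description: str
    run: Callable[..., Any]


def adapt(server: ToyMCPServer) -> List[FrameworkTool]:
    """Wrap every tool the server lists as a framework-native tool."""

    def make_runner(tool_name: str) -> Callable[..., Any]:
        # Bind tool_name per tool so each wrapper forwards to the right server tool.
        def run(**arguments: Any) -> Any:
            return server.call_tool(tool_name, arguments)
        return run

    return [
        FrameworkTool(t["name"], t["description"], make_runner(t["name"]))
        for t in server.list_tools()
    ]


tools = adapt(ToyMCPServer())
print(tools[0].name)               # search_docs
print(tools[0].run(query="auth"))  # results for auth
```

The agent's loop is untouched: it still sees a list of framework tools, and only the adapter knows those tools are backed by a server.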
18:30 - 19:00 So the question is kind of: does MCP replace an agent framework, and why still use one? I don't think it replaces them. I think parts of it, it might replace, the parts related to bringing context into the agent and calling tools and invoking these things. But a lot of the agent frameworks' value, I think, is in the knowledge management and the agentic
19:00 - 19:30 loop, and how the agent actually responds to the data that's brought in by tools. And so I would think that there is still a lot of value in something where the agent framework defines how the LLM is running in the loop and how it actually decides when to invoke the tools and reach out to other systems. But I don't think MCP as a protocol itself fully replaces it. MCP is more focused on being the standard layer to bring that context to the agent or to the agent
19:30 - 20:00 framework. Yeah, I don't know if that's the most clear answer, but that's the one that we've at least seen so far. That might change as MCP evolves. Cool. Sorry, I saw one more, which I'll take, and then I will move on, if that exists. Yeah.
20:00 - 20:30 for me for I know
20:30 - 21:00 Yeah, so the question is: why do resources and prompts exist, and why isn't this all baked into tools, because you can serve a lot of the same context via tools themselves? So I think we touched on this a little bit, but there's actually a lot more protocol capabilities built around resources and prompts than what I'm talking about here. So part of your question was: aren't resources and prompts static, can't they just be served as static data as part of a tool? In reality, resources and prompts in MCP can also be dynamic. They
21:00 - 21:30 can be interpolated with context that's coming in from the user or from the application, and then the server can return a dynamic or kind of customized resource or customized prompt based on the task at hand. Another really valuable thing we've seen is resource notifications, where the client can actually subscribe to a resource, and anytime that resource gets updated by the server with new information, with new context, the server can actually notify the client and tell the client: hey, you
21:30 - 22:00 need to go update the state of your system or surface new information to the user. But the broader answer to your question is: yes, you can do a lot of things with just tools, but MCP isn't just about giving the model more context; it's about giving the application richer ways to interact with the various capabilities the server wants to provide. So it's not just that you want to give a standard way to invoke tools. It's also, if I'm a server builder and I want there to be a standard way
22:00 - 22:30 for people to talk to my application uh maybe that's a prompt maybe I uh you know I have a a prompt that's like a five-step plan for how someone should uh invoke my server and I want the client applications or the users to have access to that that's a different Paradigm because it's me giving the user access to something as opposed to me giving the tool access to something um and so I I kind of tried to write this out as model control application controlled and user
22:30 - 23:00 controlled the point of mCP is to give more control to each of these different parts of the system as opposed to only just the model itself yeah I hope that kind of makes sense cool all right um let's see what this actually looks like the Wi-Fi is a bit weird hopefully this works cool so what we're looking at is Claude for Desktop which is an mCP client um let me try to pause as this goes through so
23:00 - 23:30 Claude for Desktop which is on the left side an mCP client and on the right side I'm working inside of a GitHub application um let's say I'm a repo maintainer for the Anthropic Python SDK I need to get some work done what I'm doing here is I give the Claude for Desktop app the URL of the uh repo I'm working in and I say can you go and pull in the issues from this GitHub repo and help me triage them or help suggest the ones that sound most important to you the model Claude automatically decides
23:30 - 24:00 to invoke the list issues tool which it thinks is the most relevant here uh and actually calls that and pulls these into context and starts summarizing them for me you'll also notice that I told it to triage them so it's automatically using what it knows about me from my previous interactions with Claude maybe other things in this chat or in this project to uh kind of intelligently decide here are the top five that sound most important to you based on what I know about you and so that's where the interplay between just giving models
24:00 - 24:30 tools and actually the application itself having other context about who I am what I'm working on the types of ways I like to interact with it um and those things interplay with each other the next thing I do is I ask it can you triage the top three highest priority issues and add them to my Asana project um I don't give it the name of the Asana project but uh Claude knows that it needs to go and find that information autonomously so I've also installed an Asana server and that has
24:30 - 25:00 like 30 tools it first decides to use list workspaces then search projects it finds the projects and then it starts invoking tools to start adding these as tasks inside of Asana so this might be a pretty common application that you like to use but the things I want to call out are one I didn't build the Asana server or the GitHub server these were built by the community each of them is just a couple hundred lines of code um primarily it's a way of surfacing tools
25:00 - 25:30 to the server and so uh it's not a ton of additional logic to build I would expect they could be built in an hour um and they're all kind of playing together with Claude for Desktop being the central interface it's really powerful to have these various tools that someone else built for systems that I care about all interplaying on this application that I like to use every single day uh Claude for Desktop kind of becomes the central dashboard for how I bring in context from my life and I actually
25:30 - 26:00 like run my day-to-day um and so inside Anthropic we've been using things like this a ton to go and reach out to you know our git repos to uh even make PRs or to bring in context from PRs and mCP is the standard layer across all of this cool and so just to close that out um here's Windsurf and it's an example using different servers but it's Windsurf's own application layer for connecting to mCP they have their own
26:00 - 26:30 kind of UI inside of their agent uh their own way of talking to the mCP tools um other applications don't even call them mCP tools for example Goose calls them extensions um it's really up to the application Builder how to actually bring this context into the application the point is that there's a standard way to do this uh across all of these applications awesome so so far we've talked about um how to bring context in
26:30 - 27:00 how mCP brings context into a lot of AI applications that you might already be familiar with but the thing that we are most excited about and starting to see signs of is that mCP will be the foundational protocol for agents broadly um and there's a few reasons for this one is the actual protocol features and the capabilities that we're going to talk about in just a second um but it's also the fact that these agent systems are becoming better the models themselves are becoming better and they use the data you can
27:00 - 27:30 bring to them uh in increasingly effective ways and so we think that there's some really nice tailwinds here um and let's talk about how or why we think that this is going to be the case so um you might be familiar with the blog that we put out uh my friends Barry and Eric put out a couple months ago called building effective agents and one of the core things in the blog um one of the first ideas that was introduced is this idea of an augmented llm it's an llm uh in the
27:30 - 28:00 traditional way that it takes inputs it takes outputs um and it it kind of uses its intelligence to decide on some actions but the augmentation piece are those arrows that you see going to things like retrieval systems to tools and to memory um so those are the things that allow the llm to query and write data to various systems it allows the llm to go and invoke tools and respond to the results of those tools in intelligent ways and it allows the the
28:00 - 28:30 llm to actually have some kind of state such that every interaction with it isn't a brand new Fresh Start it actually kind of keeps track of the progress It's Made as it goes on and so mCP fits in as basically that entire bottom layer um mCP can Federate and make it easier for these llms to talk to retrieval systems to invoke tools to bring in memory and it does so in a standardized way it means that you don't need to pre-build um all of these
28:30 - 29:00 capabilities into the agent when you're actually building it it means that agents can expand after they've been uh programmed even after they've been initialized and they're starting to run to start discovering different capabilities uh and different interactions with the world even if they weren't programmed or built in to start um and the core thing in the blog or one of the simpler ideas in the blog is agent systems at their core aren't that complicated they are this
29:00 - 29:30 augmented llm concept running in a loop where the augmented llm goes and does a task it kind of works towards some kind of goal it uh invokes a tool looks at the response and then does that again and again and again until it's done with the task and so where mCP fits in is it gives the llm the augmented llm these capabilities in an open way uh what that means is even if you as an agent builder don't know everything that the agent needs to do at the
29:30 - 30:00 time that you're building it uh that's okay the the agent can go and discover these things uh as it's interacting with the system and as it's interacting with the real world you can let the users of the agent go and customize this and bring in their own context and their own ways that they want the agent to touch their data and you as the agent Builder can focus on the core Loop you can focus on context management uh you can focus on how it actually uses the memory what kind of model it uses the agent can be very focused on the actual interaction with the llm at its
30:00 - 30:30 core um so I want to talk about a little bit about what this actually looks like in practice um let me switch over to screen sharing my screen cool so to talk about this um we're going to be talking about this framework this open source framework called mCP agent that was built by our friends at Last Mile AI I'm just using it as a really clean and simple example of how we've seen some of these agent
30:30 - 31:00 systems uh kind of play in with mCP so I'm switching over to my code editor uh make this bigger and what you see here is a pretty simple application um the entire thing is maybe 80 lines of code um and I'm defining a set of agents inside of this python file the overall task that I want this agent to achieve is defined in this uh this task.md and uh basically I want it to go
31:00 - 31:30 and do research about quantum computing uh I want it to give me a research report about quantum computing's impact on cyber security and I tell it a few things I want I want it to go look at the internet synthesize that information and then give that back to me in this nicely formatted file and so what mCP agent the framework lets us do is define these different sub agents the first one I'm defining is what's called a research agent where I give it the task that it's an expert web researcher um its role is to uh you
31:30 - 32:00 know go look on the internet to go visit some nice URLs and to give that data back in a nice structured way in my uh file system and you'll see on the bottom is I've given it access to a few different mCP servers I've given it access to Brave for uh searching the web I've given it a fetch tool to actually go and pull in data from the internet and I've given access to my file system I did not build any of those mCP servers um and I'm just telling it the names and it's going to go and invoke them and install
32:00 - 32:30 them and making sure make sure that the agent actually has access to them the next one similarly uh is a fact Checker agent it's going to go and verify the information that's coming in from the research agent um and it's using the same tools Brave Fetch and file system um and these are just mCP servers that I'm giving it access to and finally there's the research report uh writer agent and that actually synthesizes all the data uh looks at all the references and the factchecking and then produces a report for me in this nice format this time I'm only giving it
32:30 - 33:00 the file system and fetch tools or servers um I don't need it to go look at the internet I just need it to process all the data uh that it has here um and it knows what servers each of them have access to and then once I
33:00 - 33:30 kick it off the first thing it's going to do is go and form a plan uh a plan is just a series of steps for how it should go and interact with all these systems and the various steps it should take uh until it can call the task done so as an example the first step it's going to go and look at authoritative sources on quantum computing um and it's going to invoke the searcher agent in various different ways it knows uh it creates this plan based on the context about the agent's task about the servers it has
33:30 - 34:00 access to uh and so on the next step is maybe it goes and verifies that information uh by focusing on the fact checker agent specifically and then finally it intends to use the writer agent to go and synthesize all of this the kind of uh core piece of this is mCP becomes this abstraction layer where the agent builder can really just focus on the task specifically and the way that the agent should interact with the systems around it as opposed to the
34:00 - 34:30 agent Builder having to focus on the actual servers themselves or the tools or the data it just gives uh it kind of declares this in this really nice declarative way of this is what your task is supposed to be and here are the servers or tools that you have available to you to go and accomplish that task and so just to close out that part of demo I'm just going to kick this off um and what's going to be going on in the background is it's going to start doing some research uh it's invoking the search Tool uh the search agent and it's
34:30 - 35:00 going to invoke the fact checking agent and you'll start to see these outputs uh appear on the left side of the screen um and so this is a pretty simple demo but I think it's a very powerful thing for agent Builders because you can now Focus specifically on the agent Loop and on the actual core capabilities of the agent itself and the tasks that the the sub agents are working on as opposed to on the server capabilities and the ways to provide context to those agents the other really nice piece of this which is obvious is we didn't write
35:00 - 35:30 those servers uh someone else in the community built them maybe uh the most authoritative you know source of research papers on quantum computing wrote them um but all we're doing is telling our agents to go and uh interface with them in a specific way and so you start to see the outputs form it looks like the searcher agent put a bunch of sources in here um it's already started to draft the uh the actual final report and it's going to continue to iterate in the
35:30 - 36:00 background cool try to ad something definitely yeah um so the question is have we seen agent systems uh also working for proprietary data uh the really nice thing about mCP again is that it's open and so you can actually run mCP servers uh inside your own VPC um you can run it on your employees' individual systems or
36:00 - 36:30 laptops themselves so the answer is definitely yes I'm just sep
36:30 - 37:00 yeah so uh the question is what does it mean to separate uh the agent itself and the capabilities uh that other folks uh kind of give to it I think the answer kind of varies um some of the ways that we've seen to improve agent systems are uh you know what kind of model do you use is it actually the right model for the specific task if you're building a coding agent probably you should use Claude um and there's also things like context management or knowledge management how do you store the context and
37:00 - 37:30 summarize it or compress that context as the context window gets larger there's orchestration systems like if you're using multi-agent are they uh in series are they in parallel and so there's a lot more that you can focus on based on your task uh in that sense as well as uh the interface itself like how is this surfaced to the user and the separation is then uh maybe you build a bunch of your own mCP servers for your agent that are really really customized to what you want to do but when you want to expand
37:30 - 38:00 the context to what the rest of the world is also working on or the systems that exist in the rest of the world that's where mCP fits in like you don't need to go and figure out how to hook into those systems that's all pre-built for you uh let's do yeah that most of the what tool people
38:00 - 38:30 AG don't yes uh there's a slide that we'll get to um which is exactly that um no you're good that's great uh really good questions let's I'm going to do this side of the room because I didn't
38:30 - 39:00 yeah yeah um not a ton of this is specific to Last Mile I think it's a really great framework um it's called mCP agent and specifically
39:00 - 39:30 what they worked on is they saw these things come out one one was the agents framework there's really simple ways to think about agents and they saw mCP which is there are really simple ways to think about bringing context to agents um and so they built this framework which allows you to implement the various workflows that were defined in the agent blog post using mCP and using these really nice declarative Frameworks so what's specific to mCP agent the the framework is uh these these different components or building blocks for
39:30 - 40:00 building agents so one is the concept of an agent and an agent as we've talked about is an augmented llm running in a loop so when you invoke an agent you give it a task you give it tools that it has access to and the framework takes care of running that in a loop it takes care of the llm that's under the hood and all those interactions and then using these building blocks you go a layer above and you hook those agents together uh in different ways that are more agentic and those are described in the paper but one of the things in the
40:00 - 40:30 blog post was this orchestrator workflow uh example so that's what I've implemented here which is I've initialized an orchestrator agent which is the one in charge of planning and keeping track of everything and then I uh give it access to these various sub agents uh using all these nice things that are part of mCP agent that being said it's open source like it's not that I'm blessing this as the right way to do it necessarily but it's a really simple and elegant way of doing it uh sorry there's a lot
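To make the "augmented llm running in a loop" and orchestrator ideas above concrete, here is a minimal sketch in plain Python. It is not the mCP agent framework's API: the `Agent` class, the `SERVERS` table, and the `scripted_model` stand-in for the llm are all hypothetical names invented for illustration, and real mCP servers would be reached over the protocol rather than as local functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "tool" is a plain function and a "server" is a
# named bundle of tools. Nothing here talks to a real mCP server.
Tool = Callable[[str], str]
SERVERS: Dict[str, Dict[str, Tool]] = {
    "search": {"web_search": lambda q: f"results for {q}"},
    "filesystem": {"write_file": lambda text: f"saved {len(text)} chars"},
}


@dataclass
class Agent:
    name: str
    instruction: str
    server_names: List[str]  # servers are declared by name, as in the demo
    log: List[str] = field(default_factory=list)

    def tools(self) -> Dict[str, Tool]:
        merged: Dict[str, Tool] = {}
        for server in self.server_names:
            merged.update(SERVERS[server])
        return merged

    def run(self, task: str, choose: Callable) -> str:
        """Loop: ask the model for an action, run the tool, feed the result back."""
        context = task
        for _ in range(5):  # hard cap so the loop always terminates
            action = choose(self.instruction, context, sorted(self.tools()))
            if action is None:  # the model says the task is done
                break
            tool_name, argument = action
            context = self.tools()[tool_name](argument)
            self.log.append(tool_name)
        return context


def scripted_model(plan):
    """Fake llm: returns the next (tool, argument) pair from a fixed plan."""
    steps = iter(plan)

    def choose(instruction, context, tool_names):
        step = next(steps, None)
        if step is not None and step[0] not in tool_names:
            raise ValueError(f"tool {step[0]} is not available to this agent")
        return step

    return choose


researcher = Agent("researcher", "You are an expert web researcher.",
                   ["search", "filesystem"])
writer = Agent("writer", "You write the final report.", ["filesystem"])


def orchestrate(task: str) -> str:
    """Orchestrator: run the sub agents in order and pass results along."""
    notes = researcher.run(task, scripted_model([("web_search", task)]))
    return writer.run(notes, scripted_model([("write_file", notes)]))


result = orchestrate("quantum computing and cybersecurity")
```

The point of the sketch is the split of responsibilities: the agent builder owns the loop and the plan, while the tools arrive through named servers that someone else could have written.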
40:30 - 41:00 yeah yeah uh so the question is how do resources and prompts fit in in this case uh the answer is they don't um this example was more focused on the agentic
41:00 - 41:30 loop and giving tools to them I would say resources and prompts come in more where uh the user is within the loop so you might imagine instead of me just kicking this off as a python script I have this nice UI where I'm talking to the agent and then it goes and does some asynchronous work in the background and it's a chat interface like what you might see with Claude in that case the chat interface the application could uh you know take this plan that I just showed you and surface this to me as a resource the application could have this
41:30 - 42:00 nice UI on the side that says here's the first step the second step third step and it's getting that as the server surfaces the plan to it as this kind of for line uh prompts could come in if um there's a few examples but you could say uh a slash command to summarize all of the steps that have occurred already you could say slash summarize uh and there's a predefined prompt inside of the server that says here's the right way to give
42:00 - 42:30 the user a summary here's what you should uh provide to the llm when you go and invoke the summarization prompt uh so the answer to your question is it doesn't fit in here but there are ways it could yeah okay I'll take like two more let good with you does this introduce any like new workflows as it relates to like evaluations in this if you're surfacing a bunch of different tools
42:30 - 43:00 the right I think the answer so the question is how does this fit into evaluations uh in particular evals related to uh assessing tool calls and and that's being done the right way I think um largely it should be the same as it is right now um there is potential to have
43:00 - 43:30 mCP be even a standard layer inside evals themselves I I probably need to think this through but uh you can imagine that uh there's an mCP server that surfaces you know the same five tools and you give that server to one set of evals you also uh let's say you have one eval system running somewhere to evaluate these five different use cases they have a different eval system the mCP server could be the standard way to surface the tools that are relevant to your company to both of them um but
43:30 - 44:00 largely I think it's similar to how it's been done already yeah uh the white
44:00 - 44:30 um I'll get to that
44:30 - 45:00 yeah um I can address part of this so the question is what is the separation between a lot of the uh the logic that you need to implement in these systems where should it sit should it sit with the client or the server and the specific examples are things like retry logic authentication um I'll get to auth in a bit but on things like retry logic um I think my personal opinion and I
45:00 - 45:30 think this remains to be seen how it shakes out is a lot of that should happen on the server side um the server is closer to the end application and to the end system that's actually uh running somewhere and therefore the server should have more control over the interactions with that system a big part of the design principle is ideally mCP supports clients uh that have never seen a server before they don't know anything about that server before the first time it connected and therefore they shouldn't have to uh know the right ways
45:30 - 46:00 to do retries they shouldn't have to know you know how to do logging in the exact way that the server wants uh and things like that so the server is ideally closer to the end application and it's the one that's closest to the end service and it's the one that's implementing a lot of that business logic it depends um I don't have a really strong opinion or take on where the agent frameworks themselves go um I
46:00 - 46:30 could see one counterargument being that you don't always want the server Builders to have to deal with that logic either um like maybe the server Builders want to just focus on exposing their apis and like letting all the agents do the work um and yeah honestly I don't have a really strong take on that yeah yeah that's a a really good question
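As a rough illustration of the "retry logic lives on the server side" opinion above, the sketch below hides retries inside a tool handler so that a client that has never seen this server makes one call and gets one answer. Everything here is an assumption for illustration: `TransientError`, `FlakyBackend`, and `list_issues_tool` are hypothetical names, and the handler is a plain function rather than anything registered with an mCP SDK.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (timeouts, 503s)."""


def with_retries(call: Callable[[], str], attempts: int = 3,
                 base_delay: float = 0.01) -> str:
    """Run `call`, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the client
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be at least 1")


class FlakyBackend:
    """Stand-in for the real system behind the server; fails twice, then works."""

    def __init__(self) -> None:
        self.calls = 0

    def list_issues(self) -> str:
        self.calls += 1
        if self.calls < 3:
            raise TransientError("backend busy")
        return "issue-1, issue-2"


backend = FlakyBackend()


def list_issues_tool() -> str:
    """Tool handler as the client sees it: one call, retries hidden inside."""
    return with_retries(backend.list_issues)


output = list_issues_tool()
```

The counterargument from the talk still applies: a server builder who only wants to expose an API could drop `with_retries` and leave that policy to the agent framework.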
46:30 - 47:00 why'd you ask that um so a lot of the questions that we get so sorry the question here is is there a best practice or a limit to the number of servers that you expose to an llm um in practice the models of today I think are uh good up to like 50 or 100 tools like Claude is good up to a couple hundred in my experience um but beyond that I think the question becomes how do you search through or expose tools in the right way without overwhelming the context especially if you have thousands um and I think there are a few
47:00 - 47:30 different ways like one of the ones that's exciting is a tool to search tools um and so you can imagine a tool abstraction that uh implements rag over tools uh it implements fuzzy search or keyword search um based on you know the entire library of tools that's available um that's one way um we've also seen like hierarchical systems of tools so maybe you have a group of tools that's uh you know Finance tools you have like read data then you have a group of tools that's for writing data and you can uh progressively expose those groups of
47:30 - 48:00 tools based on the current task at hand as opposed to putting them all in the system prompt for example so there are a few ways to do it I don't think everyone's landed on one way um but the answer is there's technically no limit if you implement it the right way okay methodology or best practice of like I have the steps take like first server
48:00 - 48:30 St you have that documented yeah I'm not going to go through it yet but um we do have that documented so the question is like what are the right steps to approach building an mCP server what's the order of operations um we actually have this entire docs page that's like how do you build an mCP server using Claude or using llms um all the servers that we launched with in November um there were like 15 of them I wrote all of those in like 45 minutes each with
48:30 - 49:00 Claude um and so it's like really easy to approach it and I think tools are typically the best way for people to start grokking what a server is uh and then going to prompts and resources from there yeah definitely I'll share links later yeah in the red
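Two ideas from the last two answers, starting a server with tools and "a tool to search tools", can be sketched together. This is a toy in-memory catalog, not the mCP SDK: `ToolRegistry`, `search_tools`, and the three example tools are hypothetical, and the ranking is a crude substring match standing in for the fuzzy, keyword, or RAG-style search mentioned in the talk.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical in-memory tool catalog for illustration only."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}
        self._descriptions: Dict[str, str] = {}

    def tool(self, description: str):
        """Decorator that registers a function as a tool under its own name."""
        def register(fn):
            self._tools[fn.__name__] = fn
            self._descriptions[fn.__name__] = description
            return fn
        return register

    def search_tools(self, query: str, limit: int = 3) -> List[str]:
        """Rank tools by how many query words appear in name or description."""
        words = query.lower().split()
        scored = []
        for name, description in self._descriptions.items():
            text = f"{name} {description}".lower()
            score = sum(word in text for word in words)
            if score:
                scored.append((-score, name))  # negative so best sorts first
        return [name for _, name in sorted(scored)[:limit]]

    def call_tool(self, name: str, **arguments) -> str:
        return self._tools[name](**arguments)


registry = ToolRegistry()


@registry.tool("Read finance data: fetch an invoice by id")
def read_invoice(invoice_id: str) -> str:
    return f"invoice {invoice_id}: paid"


@registry.tool("Write finance data: create a new invoice")
def create_invoice(customer: str) -> str:
    return f"created invoice for {customer}"


@registry.tool("List open issues in a repository")
def list_issues(repo: str) -> str:
    return f"no open issues in {repo}"


# Only the matching tools would be placed in the model's context.
matches = registry.search_tools("read invoice")
```

With thousands of tools, the model would see only `search_tools` up front and then the handful of names it returns, instead of every definition at once.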
49:00 - 49:30 yeah so the question is if a lot of these servers are simple can llms just generate them automatically the answer is yes um if you guys have heard of Cline
49:30 - 50:00 which is one of the most popular IDEs that's open source it has like 30k stars on GitHub they actually have an mCP autogenerator tool inside the app you can just say hey I want to start talking to GitLab can you make me a server and it just autogenerates on the fly um that being said I think that that works for like the simpler servers like the ones that are closer to just exposing an API but there are more complex things that you want to do
50:00 - 50:30 you'll want to have logging or or logic or data Transformations um but the answer is yeah for the more simple ones I I think that's a pretty normal workflow are talking to any yeah so question is are we talking to the actual owners of the services and the data answer is yes um a lot of them a lot of the servers actually are
50:30 - 51:00 official and public already so if I just scroll through official integrations these are like real companies like Cloudflare and Stripe that have already built official versions of these um we're also talking to bigger folks um but I can't speak to that yet they might also host the servers remotely yes yeah like they'll build it and then they'll also maybe provide the infrastructure to expose it yeah in the
51:00 - 51:30 back uh you're asking about versioning as it relates to the protocol or to servers yeah so uh the question is what are the best practices for versioning uh all these servers so far a lot of them are TypeScript packages on npm or on pip um therefore they also have a package version associated with them and so there shouldn't generally be
51:30 - 52:00 uh code breaking changes there should be a pretty clear upgrade path um but yeah I don't think we actually have best practices just yet for what to do when a server itself changes um for something like I mean generally I think it might break the workflow but I don't think it breaks the application if the server changes since uh as long as the the client and server are both following the mCP protocol the the tools that are available might change over time or they might evolve uh but the model can still invoke those in intelligent ways for
52:00 - 52:30 resources and prompts uh they might break users' workflows if those resources or prompts change but like they'll still work um as long as they're being exposed as part of the mCP protocol with the right list tools call tools list resources etc I don't know if that answers your question though think more fuzzy right um I think using versioning uh of the packages themselves makes sense
52:30 - 53:00 for that and then I'm going to talk a little bit about a registry and uh having a registry mCP registry layer on top of all of this will also help a lot with that yeah okay I'll take one more and then continue
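Since there are no settled best practices yet for what to do when a server changes, one cautious option on the client side is to re-list the tools after an upgrade and compare them with the previous listing. The sketch below assumes a listing can be reduced to a dictionary from tool name to input schema; `diff_tools` and the two snapshots are hypothetical and do not come from the protocol or any SDK.

```python
from typing import Dict, List


def diff_tools(previous: Dict[str, dict],
               current: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare two tool listings keyed by tool name; values are input schemas."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        name for name in set(previous) & set(current)
        if previous[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of what a server advertised before and after an upgrade.
v1 = {
    "list_issues": {"required": ["repo"]},
    "close_issue": {"required": ["repo", "number"]},
}
v2 = {
    "list_issues": {"required": ["repo", "state"]},   # new required argument
    "create_issue": {"required": ["repo", "title"]},  # new tool
}

report = diff_tools(v1, v2)

# Added tools are usually harmless; removed or changed ones may break a workflow.
needs_review = bool(report["removed"] or report["changed"])
```

This matches the distinction drawn in the answer: the application keeps working because the model can still discover and call whatever is listed, but a saved workflow that depended on `close_issue` would need attention.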
53:00 - 53:30 yeah question is how are we thinking about distribution and extension system I'll get there too yeah cool let's keep going so um so we've talked about one way to build effective agents and I showed how to do that using the mCP agent framework
53:30 - 54:00 now I want to talk about the actual protocol capabilities that relate to agents and building agentic systems um with a caveat that these are capabilities in the protocol but uh it's still early days for how people are using these and so I think a lot of this is going to evolve but these are some early ideas so one of the most powerful things that's underutilized about mCP is this uh paradigm called sampling sampling allows an mCP server to request completions um aka llm inference calls from the client
54:00 - 54:30 um instead of the server itself having to go and implement interaction with an llm or to go you know host an llm or call Claude so what this actually means is you know in typical applications the ones that we've talked about so far it's a client um where you talk to it and then it goes and invokes
54:30 - 55:00 server uh to have some kind of capability to get user inputs and then decide hey I actually don't have enough input from the user let me go ask it for more information uh or let me go formulate a question that I need to ask the user to give me more information and so there's a lot of use cases where you actually want the server to have access to intelligence and so sampling allows you to federate these requests by letting the client own all interactions with the llm the client can own
55:00 - 55:30 hosting the llm if it's open source they can own uh you know what kind of models it's actually using under the hood and the server can request inference using a whole bunch of different parameters so things like um model preferences maybe the server says hey I actually really want you know specifically this version of Claude uh or I want a big model or small model uh do your best to to get me one of those um the the server obviously will pass through a system prompt and a task prompt to the client um and then
55:30 - 56:00 things like temperature max tokens it can request uh the client doesn't have to listen to any of this the client can say hey uh this looks like a malicious call like I'm just not going to do it and uh the client has full control over things like privacy over the cost parameters maybe it wants to limit the server to you know a certain number of requests um but the point is this is a really nice interaction because one of the design principles as we talked about is oftentimes these servers are going to be something where the client has never seen them before it knows nothing about
56:00 - 56:30 them yet it still needs to have some way for that server to request intelligence um and so we're going to talk about how this builds up a little bit to agents but just uh putting this out there is something you should definitely explore uh because I think it's a bit underutilized thus far cool one of the other kind of building blocks of this is the idea of composability so I think someone over there asked about composability which is a client and a server is a logical
56:30 - 57:00 separation it's not a physical separation and so what that means is any application or API or agent can be both an mCP client and an mCP server so if you look at this very simple diagram let's say I'm the user talking to Claude for Desktop on the very left side and that's where the llm lives and then I go and make a call to an agent I say hey can you go uh you know find me this information I ask the research agent to go do that work and that research agent is an mCP server but it's also an mCP client that research agent
57:00 - 57:30 can go and invoke other servers uh maybe it decides it wants to call you know the file system server the fetch server the web search server um and it it goes and makes those calls and then brings the data back does something with that data and then brings it back to the user so there's this idea of of chaining and of these interactions kind of hopping from the user to a client server combination to the next client server combination uh and so on and so this allows you to build these really nice complicated or
57:30 - 58:00 complex architectures um of of different layers of llm systems where each of them specializes in a particular task that's particularly relevant as well any questions about composability I'll touch on uh agents as well so
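Going back to the sampling flow described a little earlier, the sketch below shows the client side of that exchange: the server asks for a completion with some preferences, and the client decides whether to honor it, caps what it will spend, and owns the actual llm call. The `SamplingRequest` fields mirror the parameters named in the talk (system prompt, prompt, max tokens, temperature, model preference), but the class names, the budget policy, and `fake_llm` are assumptions for illustration and not the protocol's wire format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SamplingRequest:
    """Hypothetical shape of a server's request for a completion."""
    server: str
    system_prompt: str
    prompt: str
    max_tokens: int = 256
    temperature: float = 0.7
    model_hint: Optional[str] = None  # a preference, not a demand


class SamplingClient:
    """Client that owns the llm and applies its own policy to every request."""

    def __init__(self, llm: Callable[[str, str, int], str],
                 max_tokens_cap: int = 200, requests_per_server: int = 2) -> None:
        self.llm = llm
        self.max_tokens_cap = max_tokens_cap
        self.requests_per_server = requests_per_server
        self.used: Dict[str, int] = {}

    def handle(self, request: SamplingRequest) -> str:
        count = self.used.get(request.server, 0)
        if count >= self.requests_per_server:  # cost control
            raise PermissionError(f"{request.server} exceeded its sampling budget")
        if "ignore previous instructions" in request.prompt.lower():
            raise PermissionError("request rejected by client policy")
        self.used[request.server] = count + 1
        tokens = min(request.max_tokens, self.max_tokens_cap)  # client has final say
        return self.llm(request.system_prompt, request.prompt, tokens)


def fake_llm(system_prompt: str, prompt: str, max_tokens: int) -> str:
    """Stand-in for whatever model the client actually hosts or calls."""
    return f"[{max_tokens} tokens max] answer to: {prompt}"


client = SamplingClient(fake_llm)
reply = client.handle(SamplingRequest(
    "research", "Be brief.", "What should I ask the user?", max_tokens=1000))
```

The server asked for 1000 tokens and got a completion capped at 200, which is the "client doesn't have to listen to any of this" point in miniature.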
58:00 - 58:30 yeah so the question is how do you deal with compounding errors if uh the system itself is is complex and multi-layered I think the answer is the same as it is for complex hierarchical like agent systems as well um I don't think mCP necessarily makes that more or less difficult um but in particular um yeah
58:30 - 59:00 in particular I think it's up to each successive layer of the agent system to uh deal with information uh or controlling data as it's structured so like to be more specific you know the third node there the kind of the middle client server node uh should collect data and fan in data from all of the other ones that it just reached out to and it should make sure it's up to par or meets whatever data structure or spec it needs to before passing that data to the system right before it um I don't
59:00 - 59:30 think that's special to mCP I think that is true for uh all these like multi-node systems um it's just this provides like a nice interface between each of them does that answer your question yeah I saw other hands
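The composability picture and the answer about compounding errors can be sketched as one small structure: a middle node that is a server to whoever calls it and a client to the nodes below it, and that checks what it fans in before passing anything upstream. The `Node` class, `is_valid`, and the reply shape (`source` and `text` keys) are hypothetical choices for this sketch and say nothing about how mCP messages are actually structured.

```python
from typing import Callable, List, Optional

Handler = Callable[[str], dict]


def is_valid(reply: object) -> bool:
    """Minimal data check: a dict with a source and non-empty text."""
    return (isinstance(reply, dict)
            and "source" in reply
            and isinstance(reply.get("text"), str)
            and bool(reply["text"].strip()))


class Node:
    """Hypothetical node that is a server (handle) and a client (downstream)."""

    def __init__(self, name: str, downstream: Optional[List["Node"]] = None,
                 local: Optional[Handler] = None) -> None:
        self.name = name
        self.downstream = downstream or []
        self.local = local

    def handle(self, task: str) -> dict:
        if self.local is not None:  # leaf server: do the work directly
            return self.local(task)
        results, rejected = [], []
        for node in self.downstream:  # client role: fan out to other servers
            reply = node.handle(task)
            if is_valid(reply):  # check each reply before passing it upstream
                results.append(reply)
            else:
                rejected.append(node.name)
        return {
            "source": self.name,
            "text": " | ".join(r["text"] for r in results),
            "rejected": rejected,
        }


search = Node("web_search",
              local=lambda t: {"source": "web_search", "text": f"3 hits for {t}"})
fetch = Node("fetch",
             local=lambda t: {"source": "fetch", "text": ""})  # malformed: empty text
research_agent = Node("research_agent", downstream=[search, fetch])

answer = research_agent.handle("post-quantum cryptography")
```

Each layer filtering its own inputs is what keeps one bad leaf from contaminating every layer above it, which as the answer notes is true of any multi-node system and not special to mCP.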
59:30 - 60:00 yeah the question is um why are these why do they have to be mCP servers as opposed to just a regular HTTP server um the the answer in this case for composability and like the layered approach is that each of these can basically be agents um like in in the
60:00 - 60:30 system that you're kind of talking about here uh it I I think that there's there are there are reasons for a bunch of protocol capabilities like resource notifications like server to client communication the server requesting uh more information from the client that are built into the mCP protocol itself so that each of these interactions are more powerful than just data passing between different nodes like let's say each of these are agents like the first agent can ask the next agent for you know a specific set of data it goes and
60:30 - 61:00 does does a bunch of asynchronous work talks to the real world brings it back and then sends that back to the first client which um that might be multi-step it might take multiple interactions between each of those two nodes um and that's a more complex interaction that's captured within the mCP protocol that might not be captured if it were just regular HTTP servers I think that the point I'm trying to
61:00 - 61:30 make is that uh each of these so you're you're asking like if uh the Google API or the file system things were just apis like regular non mCP servers but making it an mCP server in this at least in this case allows you to capture those as uh agents as in like they're more intelligent than just you know exposing data to the llm it's like the each of them has autonomy um you can give a task to the second server and it can go and make a bunch of decisions for how to pull in richer data um you could in
61:30 - 62:00 theory just make them regular apis but you lose out on like these being independent autonomous agents each node in that system and the way it interacts with the the task it's working on yeah so in terms of controlling flows or inference and rate limits is that just handled by call or is there yeah um kind of depends on on the Builder but I I do think it's Federated uh because the llm is at the application
62:00 - 62:30 layer um and so that has control over uh how rate limits work or how it should actually interact with the llm uh it doesn't have to be that way like in theory if the server Builder like the first node wanted to own the interaction with a specific llm maybe it's running open source on that specific server it could be the one that controls the llm interaction but in the example I'm giving here the llm lives at the very base layer and the application layer and it's the one that's controlling rate limits and control flow and things like
62:30 - 63:00 that so follow uh if it wants user input it does have to go all the way back yeah and mCP does allow you to pass those interactions all the way back and then all the way back forward yeah um I'm going to go on this side first yeah um you have to the primarying out a little bit there's a
63:00 - 63:30 discrepancy it's thisat uh yeah the question is how do you elect a primary how do you make decisions in network uh the answer is it's kind of up to you I I'm not opining on like Network systems themselves or how uh you know these these like logic requirement it's not a requirement it's not part of the protocol itself it's just that mCP enables this architecture to exist
63:30 - 64:00 yeah so um I think the idea so the question is how do you do observability how do you know the the other systems that are being invoked uh from a technical perspective there's no specific reason that the application or the user layer would know about those servers um in theory for example like the the first client application or the
64:00 - 64:30 first mCP server you see there uh is kind of a black box like it makes the decisions about if it wants to go invoke other sub agents or other services and I think that's just how like the internet layer like apis work today like you don't exactly know always what's going on behind the hood the protocol doesn't opine on how observability should work or enforcing that you need to know the interactions it's really up to the builders and the ecosystem itself
64:30 - 65:00 not even not even posibility even posibility it's like do you guys have best practices on this where now you don't even know like you're calling a server that's created by somebody else yeah you're right you call it API don't know I call St API I don't know exactly what that API based on the interface how describe dots but the mCP
65:00 - 65:30 server if it's more than just like that already exist how can you tell how can you debug yeah so the question is how do you actually make mCP servers debuggable um especially if it's more than just a wrapper around an API and it's actually doing more complex things the answer is it uh the protocol itself doesn't enforce like specific observability and and
65:30 - 66:00 interactions it's kind of like incentive alignment for the server Builder to expose useful data to the client um mCP does of course have ways for you to pass metadata um between the the client and the server and so if you build a good server that has good debugging and and actually provides that data back to the client you're more likely to be useful and actually like have a good ux um but the protocol itself doesn't kind of enforce that if that's kind of what you're asking um which I think is the same answer for apis like people will
66:00 - 66:30 use your API if it's ergonomic and it's good and makes sense and provides you debugging and logs um so we think servers should do that I think we do have best practices um I don't know off the top of my head but I can follow up with that so I guess somebody ask but for best practic and building servers kind of goes into that now we're just talking talking resources something what was
66:30 - 67:00 theour sounds like kind are still develop that's best practices are still emerging are practices yeah I think the answer is we will get there like either anthropic or like mCP Builders themselves or the the community will start to converge on best practices but I agree with you that there needs to be best practices on how to debug and
67:00 - 67:30 stuff that mics anal to we also want
67:30 - 68:00 service that's exactly right yeah just comment on like this is very similar to microservices except this time we're bringing in intelligence but there are patterns that exist that we should be drawing from yeah um
68:00 - 68:30 andar that language yeah the question is um let's say that the client wants some amount of control or influence over the server itself or the tool call like uh limit the number of web pages you go and look like look at how do you do that so yeah one suggestion is by doing that via the
68:30 - 69:00 prompt like that's an obvious one that you can do uh one thing we're thinking about is something called like tool annotations um these extra parameters or metadata that you can Surface uh in addition to the regular tool call or specifying the tool name to influence something like can you limit the number of tools uh or limit equals five that's something that the server Builder and the tool Builder inside that server would have to expose uh to be invoked by the the client but we're thinking about at least in the protocol a standard uh a
69:00 - 69:30 couple of standard fields that could could help with this so one example that comes to mind is maybe the server Builder exposes a a tool annotation that's read versus write and so the client actually can now know hey is this tool going to take action or is it only just like read only um and I think the opposite vice versa of that is what you're talking about where uh is there a way for the server to expose more parameters for how to control Its Behavior yeah
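[Editor's sketch] The two ideas in this answer, a read-versus-write annotation the client can inspect and a server-exposed parameter such as a limit, might look like the following. The talk describes tool annotations as something still being designed, so the `read_only` field, the `ToolInfo` class, and `clamp_limit` are hypothetical names, not fields from the protocol.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToolInfo:
    name: str
    read_only: bool  # hypothetical annotation: True means the tool takes no action


def tools_needing_approval(tools: List[ToolInfo]) -> List[str]:
    """Client side: names of tools that can take action and may deserve a user prompt."""
    return [tool.name for tool in tools if not tool.read_only]


def clamp_limit(requested: int, server_max: int) -> int:
    """Server side: honor a client-supplied limit, bounded to the range 1..server_max."""
    return max(1, min(requested, server_max))
```

For example, a client asking a web search tool for 50 pages against a server maximum of 5 would be held to 5, while the read-only flag lets the client decide which calls need explicit confirmation.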
69:30 - 70:00 you have any wrot mCP tool to diagn yeah so question on like on devx and how to actually you know look at the logs and actually respond to them so one um shout out is we have something called inspector in our repo and inspector lets you go look at logs and actually make sure that the connections to servers are making sense so definitely check that out I think your question is uh could
70:00 - 70:30 you build a server that debugs servers um pretty sure that exists and I've seen it where it goes and looks at the standard IO logs and goes and makes changes to to make that work um I've seen servers that go and set up the desktop config to make this work so yeah the answer is definitely you can have loops here I'll take the last one uh and then I'll come back to these at the end
70:30 - 71:00 should yeah the question is around governance and security and who makes the decisions
71:00 - 71:30 about what a client gets access to um I think a lot of that should be controlled by the server Builder um we're we're going to talk about auth very shortly but that's a really big part of it like uh there should be a default way in the protocol to there there is a default way in the protocol to do authorization authentication um and that should be a control layer to the end application that the server is connecting to um and yeah I think that's the design principle is like you could have not malicious clients but clients that want to ask you
71:30 - 72:00 for all the information and it's the server Builder's responsibility to control that flow yeah I'm going to keep going and then I'll make sure to get back to questions in just a sec um so I think we basically have covered this but the combination of sampling and composability um I think is really exciting for a world with agents um specifically where if I'm an end user talking to my application and chatbot uh I can can just go talk to that and it's an orchestrator agent that orchestrator agent is a server and I can reach out to
72:00 - 72:30 it from my Claude for Desktop but it's also an mCP client and it goes and talks to an analysis agent that's an mCP server a coding agent another mCP server and a research agent as well and the this is composability and sampling comes in where I am talking to Claude from Claude for Desktop and the each of these agents and servers here are federating those sampling requests through the layers to get back to my application which actually controls the interaction with Claude um so you get these really
72:30 - 73:00 nice hier... well they will exist they don't exist yet but you will get these really nice hierarchical systems of agents and sometimes these agents are going to live you know on the public web or they won't be built by you but you'll have this way to connect with them while still getting the privacy and security and control that you actually want when you're building these systems um so in a sec we're about to talk about uh what's next and registry and Discovery um but this is kind of the vision that I personally really want to see and I
73:00 - 73:30 think we're going to get there um of this like connectivity layer while there still being guarantees about who has control over the specific interactions in each of these okay I'll take questions in in a sec I'm just going to keep going so we've talked about a few things we've talked about how people are using mCP today um we've talked about how it fits in with agents um there's a lot of really exciting things that a lot of you have already asked about uh that are on the road map and coming very soon so one uh is remote servers and auth so um let
73:30 - 74:00 me pause this to say what's going on so first um this is inspector uh this is the application I was just talking about where um it lets you you know install a server and then see all the kinds of various interactions uh inspector already actually has auth support so we added auth to the protocol about two three weeks ago we then added it to inspector it's about to land in all the SDKs uh so you should go and and check for that as soon as it's available but basically what we're doing here is we
74:00 - 74:30 provide a URL to an mCP server for Slack um this is happening over SSE which uh as opposed to standard IO SSE is the the best way to do remote servers um and so I just give it the link uh which is on the left side of the screen there and then I hit connect and what happens now is that the server is orchestrating the handoff between the server and Slack it's doing the actual authentication flow and the
74:30 - 75:00 way it's doing that is uh the protocol now supports OAuth 2.0 and the server deals with the handshake where it's going out to the Slack server getting a callback URL giving it to the client the client opens that in in Chrome the user goes through the flow and clicks yeah this sounds good allow uh and then the server holds the actual uh OAuth token itself um and then the server federates the interactions between the user and and the Slack application by giving the client a session token for
75:00 - 75:30 all future interactions um so the Highlight here and I think this is the number one thing we've heard since day one of launch is this will enable remotely hosted servers this means servers that live on a public URL um and can be discoverable by people through mechanisms I'll talk about in a sec but you don't have to mess with standard IO you can have the server fully control those interactions as request and they're all happening remotely the the agent and the llm can live on a completely different system
75:30 - 76:00 than wherever the server is running uh maybe the server is an agent if you bring in that composability piece we just talked about um but this I think is going to be a big like explosion in the number of servers that you see because it removes the devx friction it removes the fact that you as a user even need to know what mCP is you don't even need to know you know how to host it or how to build it it's just there it exists like a website exists and you just go visit that website so any questions on remote servers actually because I know a lot of people are interested in this when you're using protocol are you
76:00 - 76:30 also controlling do an adaptation of it so like to increase level of access that they have yeah I think the question is um does our support of auth also allow for it sounds like scope change or like again starting off with basic permissions but allowing people to request elevated permissions and for
76:30 - 77:00 those to be respected through the server protocol yeah um like elevating from basic to Advanced permissions I think in the first version of it it does not support it out of the box but we are definitely interested in evolving our support for auth so uh the question is uh isn't it a bad
77:00 - 77:30 thing that the server holds the actual token I think if you think about the design principle of the server being the one that actually is closest to the end application of Slack or wherever you want the data to exist like let's say Slack itself uh builds a public mCP server and decides the way that people should auth into it uh I think Slack will want to control the actual interaction between that server and the Slack application um and then the the
77:30 - 78:00 way that I think the fundamental reason for this is clients and servers don't know anything about each other before they start interacting um and so giving the server more control over how the interaction with the final application exists um I think is what allows there to be a separation does that kind of make sense
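[Editor's sketch] The design point discussed here, that the server keeps the upstream OAuth token and the client only ever receives a session token, can be modeled in a few lines. This is not the mCP authorization flow itself: the OAuth handshake is skipped entirely, and `TokenHoldingServer`, `complete_auth`, and `upstream_token_for` are hypothetical names.

```python
import secrets
from typing import Dict, Optional


class TokenHoldingServer:
    """Toy server that stores upstream tokens privately and hands out opaque session tokens."""

    def __init__(self) -> None:
        # Maps session token -> upstream OAuth token; never sent to the client.
        self._upstream_tokens: Dict[str, str] = {}

    def complete_auth(self, upstream_token: str) -> str:
        """Called once the user has approved access; returns the client's session token."""
        session_token = secrets.token_urlsafe(16)
        self._upstream_tokens[session_token] = upstream_token
        return session_token

    def upstream_token_for(self, session_token: str) -> Optional[str]:
        """Server-internal lookup used when forwarding a request to the end application."""
        return self._upstream_tokens.get(session_token)
```

Because the client holds only the session token, the server (for example one run by the application owner) stays in control of how the end application is actually reached.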
78:00 - 78:30 yes uh you should be judicious about what servers you connect to I think that's true for all web apps today as well what servers they have access to but yes Trust of servers is going to be increasingly important um which we'll
78:30 - 79:00 talk about with the registry in just a sec don't like a API yeah the question is how does this
79:00 - 79:30 fit in with restful apis and does it interact I think uh mCP is particularly good when there's uh I don't know data Transformations or some kind of logic that you want to have on top of just the the interaction over rest um maybe that means there are certain things that are better for llm than they would be for just a regular old client application that's talking to a server maybe that's the way that the data is formatted maybe that's the amount of context you give back to the model um you get a request uh you get something back from a server
79:30 - 80:00 and you say hey Claude like these are the five things you need to pay attention to this is how you should handle this interaction after this the server is controlling all that logic and surfacing it restful is going to still exist forever and that's going to be more for those stateless interactions where you're just going back and forth you just want the data itself yeah Noah
80:00 - 80:30 yeah um the question is how do we think about regressions as servers change as tool descriptions change how do we do evals um so a couple of things one is
80:30 - 81:00 we're going to talk about the registry in just a sec but this is probably something we talked about with versioning where you can pin a registry and as it changes you should test that new behavior um I think that this doesn't change too much about the eval ecosystem around tools um you might imagine like a lot of the customers that we work with we help them go and build these Frameworks around how their agent talks to tools and that's you know what's the right way what when should you be triggering a tool call how do you
81:00 - 81:30 handle the response uh these are pre-existing evals that exist or should exist um I think mCP makes it easier for people to build these systems around tool calls but that doesn't change anything about how robust these evals need to be um but it does make it easier right because you could at least the way I think about it is like I have my mCP server 1.0 my Builder my developer publishes 1.1 and then I just run 1.1 against the exact same evals framework and it provides this really nice like diff I guess um yeah I don't think it
81:30 - 82:00 changes too much about the needs of building evals themselves yeah just the ergonomics right um it's in the draft spec it's in there's an open PR in the sdks so it's like I would say days away yeah it is in inspector though it's like fully implemented in there so check it out cool I want to go to registry because uh a lot of questions about registry so a
82:00 - 82:30 huge huge thing that we've seen over the past two months is there's no centralized way to discover and pull in uh mCP servers you've probably seen the the servers repo that we launched uh it's kind of a mess there there are like a bunch that we launched there are a bunch that our partners launched uh and then like 1,000 that the community launched and then a whole bunch of different ecosystems have spun up around this um which is pretty fragmented and part of the reason is like we didn't think it would grow this fast um and so we weren't quite ready to to do that but
82:30 - 83:00 what we are working on is an official mCP registry API this is a unified and hosted metadata Service uh owned by the mCP team itself but built in the the open um that that means the schema is in the open the actual uh development of this is completely in the open uh but it lives on an API that we're we're owning just for the sake of there being something hosted and what it allows you to do is have this layer above the various package systems that already exist where mCP servers already exist and are deployed these are things like
83:00 - 83:30 npm uh PyPI um we've started to see other ones develop as well around Java and rust and go but the the point is a lot of the problems that we've been talking today talking about today like uh you know how do you discover what the protocol for an mCP server is is it standard IO is it SSE does it live locally on a file that I need to go and build and install or does it live at a URL uh who built it are they trusted uh was this verified by you know if Shopify has
83:30 - 84:00 an official mCP server did Shopify bless this server um and so a lot of these problems I think are going to be solved with a registry um and we're we're going to work to make it as easy as possible for folks to Port over the entire ecosystem that already exists for mCP servers uh but the point is this is coming uh it's going to be great and we're very excited about it because I think a huge problem right now is discoverability and people don't know how to find find mCP servers and people don't know how to publish them uh and and where to put them uh so we're very
84:00 - 84:30 very excited about this and the last thing I'll touch on is is versioning uh which a lot of people are asking about but you can imagine that this has its own versioning where there's this log of hey what's changed between this and this like uh maybe the the apis themselves didn't change but I added a new tool or I added a new tool description or changed it uh and this allows you to capture that within the central ecosystem or metadata service when um soon I I it's under development we're actually working with uh Block for
84:30 - 85:00 example like they're one of the open source folks that we work pretty closely with on mCP um but it's coming there's a spec and I've I've read it yeah so question is can companies host their own registry yeah we think of it I think kind of like Artifactory where there's a public one there's there's an open registry you can still obviously do your own um the nice artifact of this as well is there are ecosystems like Cursor or like VS Code
85:00 - 85:30 where you could hook into if you have an existing application and Marketplace that you work with you just hook into the API as like a second set of servers but we are not going to opine on what the UI for that necessarily looks like we're just providing the data path to putting something in the registry that's not uh yes yeah uh that's a great point because yeah not all of these need to live on npm I think uh yeah the answer is yes
85:30 - 86:00 basically we can just let you put in a URL as long as it's like trusted and and you provide metadata oh sorry yeah
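[Editor's sketch] The registry is described as a metadata layer that answers questions such as which transport a server uses, where it lives, who published it, whether it is verified, and what changed between versions. The record below is a guess at what such metadata could contain; the registry schema was not shown in the talk, so every field name and the `tool_changes` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical registry record; field names are illustrative, not the real schema."""
    name: str
    version: str
    transport: str           # e.g. "stdio" for local or "sse" for remote
    location: str            # a package name or a URL
    publisher: str
    verified: bool
    tools: FrozenSet[str]


def tool_changes(old: RegistryEntry, new: RegistryEntry) -> Dict[str, List[str]]:
    """Changelog-style diff of the tool names exposed by two versions of one server."""
    return {
        "added": sorted(new.tools - old.tools),
        "removed": sorted(old.tools - new.tools),
    }
```

A diff like this is the kind of "what changed between this and this" log mentioned under versioning, and it gives an eval suite a concrete list of tools to re-test after an upgrade.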
86:00 - 86:30 when you say execution um do you mean like how to actually surface these tools and like let them be built or like can you say more about that you know yeah I mean it could actually just be Docker we like work really closely with Docker themselves and they have a an exact mirror of that repository the servers repo but it's all Docker images and they've done the whole build system
86:30 - 87:00 uh so it literally could just be Docker um there's also a world where it's entirely remote servers like maybe you self host and you don't want anyone to deal with building uh so you just publish it at a URL as well yeah payments and permission boundaries so haven't thought about payments yet um it's not something we're thinking about right now permission boundaries um what
87:00 - 87:30 do you mean by that does that mean like who gets to install or like look at one of these servers
87:30 - 88:00 yeah it's a good question I think we've touched on this a bit and this sounds a little bit separate from the registry API or like maybe parallel um honestly I think best practices are still emerging that that's the real answer like people are still figuring out the right way to do data governance around this um so yeah I don't really have like a like the authoritative answer on this just yet
88:00 - 88:30 um so I think our philosophy or like the principle maybe generally about open source is we built it we launched it and we want our products to be the world's best mCP clients but they're not going to be the world's only mCP clients and
88:30 - 89:00 we are totally fine with that uh we have and are talking to other foundational model providers can't comment on like who but uh the the point is this is open and we intend for it to be open and if that creates more competition um that's broadly good and I think it's good for users and it's good for developers um so I think there will be periods of time where Claude and our first-party services and apis are the best uh that might always not always be the case and I think that's that's fine and that's a good thing as well but um yeah we we'll talk to other model companies
89:00 - 89:30 if they're down can I just swap uh there is no specific Advantage relating to mCP uh that requires you to use Claude with mCP Claude is just better for many reasons but like that's I mean it's true uh but that's
89:30 - 90:00 that's more about like Claude is just really good at tool use and agentic work and that's not about something fundamental with mCP itself at least for now
90:00 - 90:30 yeah um the question is like how do we think about servers being more proactive or initiating connections to the client so uh there's a lot that we're thinking about here for Server initiated actions so the simplest one that we can think about uh that already is supported is server initiated notifications when a resource changes or the server is maintaining a file or a log list and it wants to tell the client hey I just made an update to this or a new uh a new resource is available um when it comes to sampling there isn't something in the protocol just yet for the server
90:30 - 91:00 initiating sampling from scratch um where maybe it makes some decisions on its own and it's it reaches out uh that's that is something we're going to build um where the server will say hey actually like completely unrelated like you didn't ask me any questions but I want to start this interaction with you um and it reaches out to the client and the client is ready to receive those messages
91:00 - 91:30 the the server reaching out to the client would happen if like the system itself decides it needs something like deterministically maybe it not not even predefined it could be event driven it could be like it just got a request from a user from some other system got an API request and it initiates the client thing uh also if you think about composability the server could in theory also be a client and have its own llm that controls uh so that's another reason why it could initiate connections yeah
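[Editor's sketch] The already-supported case mentioned above, a server telling the client that a resource changed, is carried as a JSON-RPC 2.0 notification, which by definition has no `id` and expects no reply. The method name below follows the mCP specification as the editor understands it; the helper function and the example URI are assumptions.

```python
import json


def resource_updated_notification(uri: str) -> str:
    """Build a JSON-RPC 2.0 notification announcing that a resource changed."""
    message = {
        "jsonrpc": "2.0",
        # No "id" field: notifications are one-way and are never answered.
        "method": "notifications/resources/updated",
        "params": {"uri": uri},
    }
    return json.dumps(message)


# Hypothetical example: a server maintaining a log file announces an update.
payload = resource_updated_notification("file:///logs/app.log")
```

Server-initiated sampling "from scratch", by contrast, is described in the talk as not yet in the protocol, so no message shape is sketched for it.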
91:30 - 92:00 [Music] you have any guidelines for that yeah the question is guidelines uh
92:00 - 92:30 between standard IO and SSE the answer is like mCP is transport agnostic um so the actual like Behavior and the interactions between the client and server don't matter about uh you know the fundamental nature of like the underlying transport that being said the divide that we've seen so far is local or in-memory communication happens over standard IO and remote is going to happen over SSE um and I think that's the pattern that makes most sense but uh again it's it's transport agnostic if
92:30 - 93:00 you want to build your own transports uh and support them with mCP you can easily do that [Music]
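[Editor's sketch] "Transport agnostic" can be read as: client logic depends only on a small send and receive interface, so standard IO, SSE, or a custom transport can be swapped underneath. The sketch below is not an mCP SDK interface; `Transport`, `InMemoryTransport`, and `echo_server` are hypothetical, and the only protocol detail borrowed is a JSON-RPC `ping` request that returns an empty result.

```python
import json
from typing import Callable, List, Protocol


class Transport(Protocol):
    """Anything that can carry a message out and hand a message back."""

    def send(self, message: str) -> None: ...
    def receive(self) -> str: ...


class InMemoryTransport:
    """Hypothetical custom transport: passes each message straight to a handler function."""

    def __init__(self, handler: Callable[[str], str]) -> None:
        self._handler = handler
        self._inbox: List[str] = []

    def send(self, message: str) -> None:
        # Deliver the message and queue the handler's reply for receive().
        self._inbox.append(self._handler(message))

    def receive(self) -> str:
        return self._inbox.pop(0)


def ping(transport: Transport) -> dict:
    """Client logic written once, independent of which transport carries it."""
    transport.send(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "ping"}))
    return json.loads(transport.receive())


def echo_server(raw: str) -> str:
    """Toy server: answers any request with an empty result for the same id."""
    request = json.loads(raw)
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": {}})
```

Replacing `InMemoryTransport` with a class that writes to a subprocess pipe or an HTTP stream would leave `ping` unchanged, which is the property the answer is pointing at.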
93:00 - 93:30 Yeah question is does the the model have to be in the loop to interact with the server the answer is no uh the server exposes a standard set of apis or um that's probably the wrong word to use but a standard set of functions so call tools list tools call resources list resources um the the client application can call deterministically
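[Editor's sketch] Since the model does not have to be in the loop, a client application can build the protocol requests itself. The messages below use the JSON-RPC method names `tools/list` and `tools/call` as the editor understands them from the mCP specification; the helper functions, the tool name `search_logs`, and its arguments are made up for illustration.

```python
import json
from typing import Any, Dict


def list_tools_request(request_id: int) -> str:
    """JSON-RPC request asking a server which tools it exposes."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def call_tool_request(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """JSON-RPC request invoking one tool by name with explicit arguments."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical deterministic call: no llm decides anything here.
request = call_tool_request(2, "search_logs", {"query": "error", "limit": 5})
```

The application code picks the tool and the arguments directly, which is what "call deterministically" means in this answer.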
93:30 - 94:00 um oh interesting are you talking about like server to server communication perhaps as well
94:00 - 94:30 yes yes I don't think there's a built-in way in the protocol to do this today uh um a lot of the interactions do have to go back to the client before it allows the tools to talk to each other and the main reason for that is servers don't really
94:30 - 95:00 know that other servers exist for the most part uh that being said I'm pretty sure it's possible like it's pretty flexible so I think you could make that happen it's just not like a first class thing that we support
95:00 - 95:30 just mentioned like code approach I generate code say outut on a variable what's your opinion that
95:30 - 96:00 how to be quite honest I don't know or like I I don't have a strong opinion on this um but yeah happy to chat with you after if that makes sense okay I'm gonna keep going um I want to talk about real quick why registry is is amazing um besides the reasons of ergonomics and verification all that stuff we've talked about but for agents specifically uh an mCP server registry allows you to make
96:00 - 96:30 agents self-evolving what that means is you can dynamically discover new capabilities new data on the fly without having to know anything about those from the time that that agent was initialized or programmed in the first place so if you're a user and you have this General coding agent that knows exactly how you work it knows the systems that you usually already have access to and it has a control flow that really works well for you you say can you go check my Grafana logs uh I think something's wrong with them and can you go fix this
96:30 - 97:00 bug the let's say the agent wasn't programmed to know that the Grafana uh server existed so it's going to go talk to our registry it's going to do a search for an official verified Grafana server uh that has access to the right apis uh and then it's going to install or invoke that server maybe it lives on a remote over SSE uh and then go and do the actual querying and go and fix fix the bug um so this is pretty simple example but the point is as Barry mentioned in his talk a couple days ago at this conference agents are going to become self-evolving by letting them
97:00 - 97:30 discover and choose their own tools and that makes that augmented llm system that we've talked about even more powerful because you don't have to pre- package these you don't have to predefine these the agent itself will go out and look for them and make itself better it gives itself context um and I just want to close that Loop because I think that's going to be really powerful and and I'm really excited for that
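[Editor's sketch] The Grafana example, where an agent that was never told about a server searches a registry, picks a verified match, and connects, can be sketched as follows. The registry API was still under development at the time of the talk, so the in-memory `REGISTRY` list, the `ServerListing` fields, and the `example.com` URLs are all stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ServerListing:
    name: str
    keywords: Tuple[str, ...]
    verified: bool
    url: str


# Hypothetical registry contents; a real registry would be queried over its API.
REGISTRY: List[ServerListing] = [
    ServerListing("grafana-community", ("grafana",), False, "https://example.com/community/sse"),
    ServerListing("grafana-official", ("grafana", "logs"), True, "https://example.com/grafana/sse"),
]


def search_registry(keyword: str, registry: List[ServerListing]) -> Optional[ServerListing]:
    """Return the first verified listing whose keywords include the search term."""
    for listing in registry:
        if listing.verified and keyword in listing.keywords:
            return listing
    return None


class Agent:
    """Toy agent that acquires capabilities on demand instead of having them pre-packaged."""

    def __init__(self) -> None:
        self.connected: Dict[str, str] = {}  # capability keyword -> server URL

    def ensure_capability(self, keyword: str) -> bool:
        if keyword in self.connected:
            return True
        listing = search_registry(keyword, REGISTRY)
        if listing is None:
            return False
        # A real agent would open a connection here; the sketch only records the URL.
        self.connected[keyword] = listing.url
        return True
```

Restricting the search to verified listings is one simple policy; the talk later mentions self-hosted registries and allowlists as other ways to bound what an agent may pull in.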
97:30 - 98:00 yes the question is how do you enforce control over arbitrary servers arbitrary access I think the artifactory example is a really common one like you'll you
98:00 - 98:30 can self host Registries and federate which ones are approved or not approved um you could also instead of using let's say if we had a search API you could have a whitelist of specific servers and allow there to be a tool in between uh where that agent has to go through that tool and the tool filters which servers it has access to um there's also the concept of verification like we'll in the registry figure out how to do this but uh allowing there to be an official Shopify server an official Grafana server um which of course helps with this just
98:30 - 99:00 a little bit but largely it it will follow similar things to like artifactory and Enterprise tools as they exist today this this is the future yeah not something that I currently am using but I don't think it's very far away trust agents I trust agents to do it correctly
99:00 - 99:30 from a functional perspective I don't trust yet like the the servers themselves because there isn't like a great registry and all that kind of stuff but the CLA or like again models are are good enough already at like deciding which tools to use among hundreds uh so I do trust that part of it cool um we're getting close to time so I'm going to keep going there's another complement to server Discovery uh that's different from a registry and that is the concept of a wellknown um on the top here this is not a real URL but
99:30 - 100:00 let's say Shopify had aw wellknown mCP dojon and that provided this this nice interface for you know here's Shopify we have an mCP endpoint that you should know about uh it has the resources and tools capabilities and you off with it uh using oaf 2 and what that means is if I'm a user and I talk to my agent and I say hey help me go manage my store on shopify.com
100:00 - 100:30 and so this is a really nice complement to the registry, where the registry is focused on discovery and verification and the ability for people to find tools from scratch. But if you also want to go with a top-down approach, where you know you want to talk to Shopify, or you have an agent that's going and looking around on the internet, it can go and check this .well-known as a verified way of, hey, these tools do exist, and this is
100:30 - 101:00 how you use them. That's really powerful. And a specific thing that I'm particularly excited about is that this is a really nice complement to computer use. Anthropic released a computer use model in October, or rather, our regular model is a computer use model, and what it allows you to do is go and click around in these systems and these UIs that it's never seen before, that don't have APIs that it can go and interact with. But what if you could have that plus mcp.json? There's a predefined way for that agent to go and call the APIs that are
101:00 - 101:30 surfaced by shopify.com, but for the long tail where that doesn't work, it can use computer use: it can click around on the UI, it can go log in, it can go interact with buttons. I think the world where those coexist inside one agent is the future, and I think that's something we're thinking about. I'm sure other people are thinking about it as well, and I think MCP is going to be a big part of that. Cool. I'm going to keep going and I'll take questions at the end. Actually, this is the last slide, but besides everything that we've talked about today,
101:30 - 102:00 there are a lot more things that we're thinking about in the medium term. This is roughly in order of how much we're thinking about it right now. There's a big discussion, this is a bit granular, about stateful versus stateless connections. Right now MCP servers are somewhat stateful: they hold state around the connection between the server and client. A lot of folks are interested in these more short-lived connections, where the client can disconnect from an MCP server, go offline for a little bit, come back later, and continue the conversation or the request
102:00 - 102:30 in the same way without having to re-provide data. So we're working on this idea that maybe there's a bifurcation between the more basic capabilities, where it's the client asking the server for things, versus capabilities where the server is asking clients for things. I think this is going to be really elegant, but you can imagine that more advanced capabilities like sampling or server-to-client notifications use something like SSE, which requires there to be a long-lived connection, but for short-lived
102:30 - 103:00 things, where it's just, say, "hey, can you help me invoke this tool", maybe that's a more short-lived HTTP or regular request that doesn't require a long-lived connection. Streaming: a big one we're thinking about is how do we stream data and actually have multiple chunks of data arrive at the client from the server over time, and how to support that first class in the protocol. Namespacing, which is also somewhat relevant to agents and registries as we've been talking about: right now, if you install 10
103:00 - 103:30 servers and they have tools of the same name, there is conflict, often, and there isn't a great way right now to separate that other than prefixing the tool name with the server name before you surface it. I think the registry is going to help a lot with this, but we also want to allow this to exist first class in the protocol, and maybe even allow people to create these logical groups of different tools that are prepackaged into a really nice package of, I don't know, finance tools that are specific to these finance services that people care about. And finally, I think someone asked about this
103:30 - 104:00 over there, but proactive server behavior: elicitation, where the server is either event-driven or has some kind of deterministic system where it decides it needs to go and ask the user for more information or notify them about something. We're just trying to figure out better patterns for that existing in the protocol. Cool, that's my talk. My name is Mahesh. You can reach out on LinkedIn. I don't really use Twitter, but I felt compelled to put it on there. I'm not going to respond to you on
104:00 - 104:30 Twitter, though. But yeah, thanks so much for listening. This is really great.
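As a footnote to the namespacing point in the talk, the workaround of combining the server name with the tool name before surfacing it might look something like this. It is a minimal sketch, not how any particular MCP client does it: the function names, the `__` separator, and the sample server and tool names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def colliding_names(servers: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return tool names that more than one server exposes."""
    owners: Dict[str, List[str]] = {}
    for server_name, tool_names in servers.items():
        for tool_name in tool_names:
            owners.setdefault(tool_name, []).append(server_name)
    return {name: who for name, who in owners.items() if len(who) > 1}


def namespace_tools(
    servers: Dict[str, List[str]], sep: str = "__"
) -> Dict[str, Tuple[str, str]]:
    """Map 'server<sep>tool' to (server, tool) so every surfaced name is unique.

    The separator is an arbitrary choice for this sketch.
    """
    table: Dict[str, Tuple[str, str]] = {}
    for server_name, tool_names in servers.items():
        for tool_name in tool_names:
            qualified = f"{server_name}{sep}{tool_name}"
            if qualified in table:
                raise ValueError(f"duplicate tool name: {qualified}")
            table[qualified] = (server_name, tool_name)
    return table


# Hypothetical servers that both expose a tool called "search".
SERVERS = {
    "github": ["search", "create_issue"],
    "files": ["search", "read"],
}

if __name__ == "__main__":
    print(colliding_names(SERVERS))
    for qualified, target in namespace_tools(SERVERS).items():
        print(qualified, "->", target)
```

The lookup table lets a client route a call on the qualified name back to the right server and original tool name, which is the part that first-class protocol support would presumably make unnecessary.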