Performance tuning for Azure Cosmos DB - Hasan Savran - CPH DevFest 2024
Summary
Hasan Savran presents on performance tuning for Azure Cosmos DB during the CPH DevFest 2024. The talk begins with an introduction and an agenda overview, focusing on making Azure Cosmos DB more scalable and cost-effective. Savran discusses the challenges related to physical partitioning and highlights the massive scalability of Cosmos DB through examples like ChatGPT's use. He emphasizes selecting the correct partition key and shares techniques for managing workloads, such as leveraging change feed functions for data synchronization, using point read operations for efficiency, and exploring various connection modes like Gateway and Direct connections for better performance. The session ends with tips on indexing, throughput settings, and the benefits of utilizing reserved capacity for cost savings.
Highlights
Hasan Savran begins his first session in Copenhagen, detailing the agenda for optimizing Cosmos DB.
Physical partitioning issues are tackled, with Cosmos DB's capacity for horizontal scaling highlighted.
Learn how ChatGPT effectively uses Cosmos DB with over 25,000 partitions for transaction management.
Best practices for connection modes and how they affect latency and throughput are discussed.
Savran shares tips on optimizing costs with auto-scaling and reserved capacities.
The importance of indexing and how it affects performance and operational expenses is emphasized.
Key Takeaways
Always choose the right partition key for better scaling; it's crucial for performance!
Scale efficiently with change feed functions and point read operations.
Explore different connection modes like Gateway and Direct for optimal performance.
Leverage serverless and auto-scaling features to adapt to varying workloads easily.
Use indexing wisely to improve performance and reduce costs.
Utilize reserved capacity to save costs in the long term.
Overview
At the CPH DevFest 2024, Hasan Savran delivered an insightful session on enhancing the performance of Azure Cosmos DB. He kicked off his talk by highlighting the necessity for making Cosmos DB configurations more scalable and cost-effective. Savran, a Cosmos DB and SQL Server expert, offered a treasure trove of tactics that attendees could apply to overcome common challenges, like improper partitioning, that often hinder performance.
Diving into the heart of Cosmos DB, Savran explained the intricacies of physical partitioning and its impacts on application scalability. He used the example of ChatGPT's robust use of Cosmos DB, pointing out how scalable and effective solutions can be when properly configured. He further discussed change feed functions for synchronizing data seamlessly and the benefits of point read operations in fetching documents efficiently without burdensome overhead.
Savran wrapped up the session with strategic advice on indexing, managing throughput, and leveraging potential cost savings through reserved capacities. He also demonstrated practical tips using his VS Code extension for monitoring database performance, which he offered to the Cosmos DB community. The crowd left equipped with practical knowledge to optimize their Cosmos DB setups while saving costs effectively.
Chapters
00:00 - 01:00: Introduction and Session Overview The chapter introduces the session and provides an overview of the key discussion points. The speaker expresses excitement about being in Copenhagen for the first time and outlines the session agenda, focusing on enhancing the scalability and affordability of Cosmos DB.
01:00 - 02:30: Presenter Introduction and Cosmos DB Overview The chapter starts with the presenter introducing himself as Hasan Savran from Cleveland, Ohio. Hasan is a Cosmos DB expert, known for his presence on platforms like YouTube and various events. He is a Data Platform MVP and expresses enthusiasm about being part of the program. The agenda of the presentation, though initially appearing sparse, actually contains a lot of content, and the focus will be on Cosmos DB and SQL Server.
02:30 - 05:00: Physical Partitioning in Cosmos DB This chapter is an introduction to the Cosmos DB SQL Studio, a tool that can be downloaded via a barcode or directly through Visual Studio Code. The presenter invites questions via LinkedIn or Twitter. Additionally, the chapter introduces the concept of physical partitioning in Cosmos DB, indicating it's a crucial aspect worth discussing.
05:00 - 07:00: Horizontal Scaling and Transactions The chapter discusses the challenges related to physical partitioning in web applications, particularly when using Cosmos DB. It explains the limitations of a single database server, referred to as a physical partition, which includes a 50-gigabyte storage limit and a maximum 10,000 request units per second capacity. These constraints become significant as data size increases.
07:00 - 10:00: Partition Keys and Performance Issues The chapter discusses the concept of horizontal scaling in Cosmos DB. It explains that as the demand on the application increases, particularly in terms of storage, Cosmos DB responds by providing additional database servers to meet these needs. This is a crucial aspect for preventing performance issues as it allows the system to scale out rather than trying to expand a single database beyond its optimal capacity.
10:00 - 13:00: Handling Partition Key Absence In this chapter titled 'Handling Partition Key Absence,' the discussion revolves around the scalability of databases, specifically focusing on Cosmos DB. The context begins with a reference to a video about Cosmos DB, noting its use in the backend of ChatGPT for its high transaction capacity. The chapter highlights significant statistics from March 2023, when Cosmos DB was handling 2.3 billion transactions, showcasing its robust performance and capability in managing massive data operations.
13:00 - 15:00: Change Feed Function and Data Syncing The chapter discusses the changes in database performance and handling as new models and features are implemented. In November 2023, there was a significant increase in transactions, reaching 10.6 billion. Correspondingly, the number of physical partitions in the database also increased to 25,000, indicating an adaptation to the rising transaction volume.
15:00 - 18:00: Point Read Function and Efficiency This chapter focuses on the 'Point Read Function and Efficiency' in the context of server management for applications like ChatGPT. It emphasizes the importance of choosing the correct partition key, specifically the user ID, to ensure scalability. The text highlights the challenges of managing large numbers of partitions, drawing attention to the significant impact of partition key selection on the system's overall performance and efficiency, noting an example of dealing with 25,000 partitions and 50 gigabytes per partition.
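The point-read economics covered in this chapter can be sketched as a rough cost model, using the figures quoted in the session: a point read of a 1 KB document costs 1 request unit, while a query returning the same document costs at least 2.3 RU because the query engine must first build an execution plan. This is an illustrative model only, not Cosmos DB's official pricing formula:

```javascript
// Back-of-the-envelope RU comparison between a point read and a query,
// based on the figures quoted in the talk. Illustrative model only.
function pointReadCost(docSizeKB) {
  // Point reads are billed roughly per KB of document size, minimum 1 RU.
  return Math.max(1, Math.ceil(docSizeKB));
}

function queryCost(docSizeKB) {
  // The session quotes a ~2.3 RU floor for a query fetching the same
  // document, due to execution-plan overhead a point read avoids.
  return Math.max(2.3, pointReadCost(docSizeKB));
}

console.log(pointReadCost(1)); // 1
console.log(queryCost(1));     // 2.3
```

The takeaway matches the talk: whenever both the document ID and the partition key are known, a point read is strictly cheaper than the equivalent single-document query.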
18:00 - 23:00: Connecting to Cosmos DB (Gateway and Direct Mode) The chapter discusses how transactions are managed in Cosmos DB with a focus on both Gateway and Direct modes. It explains the concept of physical partitioning, where a high number of transactions, such as 424,000, are handled by individual physical partitions. Additionally, it highlights overall capacity: with 25,000 partitions each roughly half full, around 625 terabytes of data are being managed in total.
23:00 - 29:00: Dedicated Gateway Mode and Caching The chapter discusses the scalability and efficiency of Cosmos DB, particularly in handling global applications. It highlights the ability of Cosmos DB to automatically manage scaling, traffic, and partitions without requiring manual adjustments or changes. This implies a high level of convenience and performance optimization for users who are processing data as recent as November 2023.
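The integrated-cache behavior described in the dedicated gateway chapter can be sketched as a tiny in-memory model: the first read of an item pays the full request charge, and repeat reads within the configured staleness window cost 0 RU. The 32.25 RU figure and the 30-minute window are the ones shown in the demo; the cache class itself is a hypothetical stand-in, not the real gateway:

```javascript
// Hypothetical stand-in for the dedicated gateway's integrated cache.
// First read pays the full RU charge; repeat reads within the staleness
// window are free. Not the real gateway implementation.
class IntegratedCacheSketch {
  constructor(maxStalenessMs) {
    this.maxStalenessMs = maxStalenessMs;
    this.cache = new Map(); // key -> timestamp of last backend read
  }

  // Returns the RU charge for this read.
  read(key, fullChargeRu, nowMs) {
    const cachedAt = this.cache.get(key);
    if (cachedAt !== undefined && nowMs - cachedAt <= this.maxStalenessMs) {
      return 0; // served from the gateway cache
    }
    this.cache.set(key, nowMs); // backend read refreshes the cache entry
    return fullChargeRu;
  }
}

const gw = new IntegratedCacheSketch(30 * 60 * 1000); // 30-minute window
console.log(gw.read("q1", 32.25, 0));           // 32.25 (first read)
console.log(gw.read("q1", 32.25, 60_000));      // 0 (cache hit)
console.log(gw.read("q1", 32.25, 31 * 60_000)); // 32.25 (stale, re-read)
```

As the talk notes, this only applies when the request's consistency level is session or eventual; stronger consistency bypasses the cache.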
29:00 - 35:00: Cosmos DB SDKs and Resource Management The chapter "Cosmos DB SDKs and Resource Management" covers the management and optimization of applications using Cosmos DB in smaller-scale environments, where ChatGPT-level traffic is not in play. It discusses a scenario with a simple application that stores orders in a single container partitioned by customer ID, holding 202 GB of data, and shows how to read the partition metrics so that decisions on resource management can be made effectively.
35:00 - 40:00: Stream API and Document Client Singleton The chapter titled 'Stream API and Document Client Singleton' discusses the splitting of a 44-gigabyte physical partition into two 22-gigabyte partitions as demand grows. As the partition is set to split, the importance of the 10,000 request units per second of provisioned throughput is emphasized: that capacity determines how many operations per second the system can handle, indicating the robustness and scalability of the setup.
40:00 - 45:00: Consistency Modes and Metadata Requests In this chapter, the discussion centers around consistency modes and metadata requests in databases. It begins with an explanation of how adding more database servers can impact the performance in terms of request units per second. Currently, each physical partition has 2,000 request units, amounting to a total of 10,000 request units divided across the physical partitions. The speaker emphasizes the importance of monitoring which physical partition is likely to be split, as this affects the distribution of request units and thus the overall performance.
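The division of throughput described above can be sketched with simple arithmetic: provisioned request units are split evenly across physical partitions, so 10,000 RU/s over 5 partitions gives each 2,000 RU/s, and after one partition splits (making 6), each gets roughly 1,667 RU/s. A partition split can therefore silently lower the per-partition ceiling:

```javascript
// Provisioned throughput is divided evenly across physical partitions,
// per the arithmetic described in the talk. Illustrative only.
function ruPerPartition(totalRu, partitionCount) {
  return totalRu / partitionCount;
}

// Before a split: 5 partitions share 10,000 RU/s.
console.log(ruPerPartition(10000, 5)); // 2000

// After one partition splits into two: 6 partitions share the same total.
console.log(Math.floor(ruPerPartition(10000, 6))); // 1666
```

This is why the speaker recommends watching which physical partition is about to split: the total never changes, but every partition's share shrinks.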
45:00 - 50:00: Optimizing Bandwidth and Partial Document Updates The chapter discusses the importance of optimizing bandwidth and implementing partial document updates. It highlights that the application's frequent queries include the partition key, keeping the system balanced and scalable. A reference is made to John, who has designed a robust architecture that operates efficiently as long as the customer ID is present, to the satisfaction of John's manager.
50:00 - 65:00: Indexing Strategies and Composite Indexes The chapter discusses challenges related to using Cosmos DB, particularly when different developers work on projects. It highlights a situation where one developer, unfamiliar with Cosmos DB's intricacies, doesn't have access to a partition key, potentially leading to issues if following the structure set by another developer, John.
65:00 - 70:00: Computed Properties and Costs The chapter discusses the implications and potential issues related to using a specific select statement in a production environment. It emphasizes the importance of resource management, noting that the statement will execute on every partition, thus consuming significant memory and CPU resources on each server. This increase in resource usage could lead to additional costs and the risk of disrupting the performance of other operations by diverting the necessary resources, ultimately leading to potential system failures or degraded performance.
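The fan-out cost this chapter warns about can be modeled with simple arithmetic: a query that cannot be routed by its partition key is executed on every physical partition, so its charge and resource use grow with the partition count. A minimal sketch (the per-partition RU figure here is made up for illustration):

```javascript
// Rough model of cross-partition query fan-out: without a partition key,
// the query runs on every physical partition, multiplying its cost.
// Illustrative arithmetic only; not an official pricing formula.
function crossPartitionCharge(perPartitionRu, partitionCount) {
  return perPartitionRu * partitionCount;
}

console.log(crossPartitionCharge(3, 1));  // 3  (routed by partition key)
console.log(crossPartitionCharge(3, 10)); // 30 (fan-out to 10 partitions)
```

Beyond the RU bill, each touched partition also spends its own CPU and memory on the query, which is how one bad query can degrade an otherwise healthy workload sharing those partitions.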
70:00 - 75:00: Throughput Modes in Cosmos DB This chapter starts with a humorous introduction about Joe, who faces a large bill due to a common situation many might encounter while using Cosmos DB. To address this issue, one recommended action is utilizing the 'change feed' function. This function helps manage data by duplicating it to another container with a different partition key, which can help alleviate the problem. The chapter likely continues with further details on effectively using throughput modes in Cosmos DB.
75:00 - 85:00: Cost-saving Tips: Reserve Capacity and Prioritization The chapter discusses cost-saving tips, focusing on the concepts of reserved capacity and prioritization. It highlights a strategy of optimizing database architecture by creating an additional container for orders under a different partition key, the product ID. This involves copying data from the original 'orders' container to the new 'orders-product' container to keep the duplicates synchronized efficiently and potentially reduce costs.
85:00 - 90:00: Q&A and VS Code Extension Demo The chapter discusses the change feed function, available in the NoSQL, Cassandra, and Gremlin APIs. It explains the use of Azure Functions to trigger actions whenever there is a data change in the orders container, highlighting the specific use case of syncing data to the orders-product container.
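The change-feed pattern from the chapters above can be sketched conceptually: every insert into the "orders" container (partitioned by customer ID) is replayed into an "orders-product" container (partitioned by product ID), so each access pattern gets a container whose partition key it can use. In production this handler would be an Azure Function with a Cosmos DB change feed trigger; here both containers are plain in-memory maps for illustration:

```javascript
// Conceptual sketch of the change-feed duplication pattern. Both
// "containers" are in-memory maps keyed by their partition key; the
// real implementation would use Cosmos DB containers and an Azure
// Function triggered by the change feed.
const orders = new Map();          // customerId -> array of orders
const ordersByProduct = new Map(); // productId  -> array of orders

// Stand-in for the change-feed handler that keeps the containers in sync.
function onOrderChanged(order) {
  const list = ordersByProduct.get(order.productId) ?? [];
  list.push(order);
  ordersByProduct.set(order.productId, list);
}

function insertOrder(order) {
  const list = orders.get(order.customerId) ?? [];
  list.push(order);
  orders.set(order.customerId, list);
  onOrderChanged(order); // in Azure this fires asynchronously
}

insertOrder({ id: "o1", customerId: "c1", productId: "p9" });
insertOrder({ id: "o2", customerId: "c2", productId: "p9" });
console.log(ordersByProduct.get("p9").length); // 2
```

Queries with a customer ID then go to the first container and queries with a product ID go to the second, so neither ever fans out across all partitions.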
Performance tuning for Azure Cosmos DB - Hasan Savran - CPH DevFest 2024 Transcription
00:00 - 00:30 All right, I think I'm going to start here. I have a lot of content to cover. Thank you for coming to my session, and it's a pleasure to be here; that's my first time in Copenhagen, it's a nice city. So here's our agenda: I'm going to talk about Cosmos DB and how to make it more scalable, and hopefully more affordable,
00:30 - 01:00 with this presentation. So our agenda kind of looks like there's not that much out there, but actually there are a lot of slides here. Before I start, let me talk about myself. My name is Hasan Savran and I am from Cleveland, Ohio. I'm a Cosmos DB MVP; you might see me on YouTube or at other events. Usually I talk about Cosmos DB and SQL Server. I'm a Data Platform MVP too, so I'm really happy to be part
01:00 - 01:30 of the program. I developed the Cosmos DB SQL Studio; you can download that from that barcode, or you can download it directly from VS Code if you like. And if you have any questions after this session, please follow me on LinkedIn or Twitter; I'd be more than happy to help you from there and answer all your questions. Now let's talk about Cosmos DB. First, I guess we should talk about the physical partitions of Cosmos DB, because that's usually where the
01:30 - 02:00 biggest problems are. With physical partitioning, when you create, let's say, one web application, Cosmos DB gives you one database server, and we call that a physical partition. This server has limits; there are two numbers here. One of them is the storage limit, 50 gigabytes of storage, and the other one is 10,000 request units per second. That's really the maximum horsepower it can give you. So as your data gets larger and your
02:00 - 02:30 application gets busier, one of those numbers is going to get close to the limit; storage is going to be 40, 45 gigabytes, and as soon as it hits that number, Cosmos DB is going to give you more database servers depending on your needs. This is horizontal scaling. That means, okay, if you get close to 50 gigabytes, we are not going to try to make this database bigger; we are going to give you another database that you can use. So you
02:30 - 03:00 might ask, okay, how big can we get with this? I was actually watching a video from the Cosmos DB team and they shared this information; I don't know if you know or not, but ChatGPT is actually using Cosmos DB in the back end. These are the numbers they shared in that video. As you can see, the first one is the number of transactions, in billions. So in March 2023 we are looking at 2.3 billion transactions that Cosmos
03:00 - 03:30 DB was handling, and as soon as they added more models and features, you can see that number kind of goes crazy from there. In November we have 10.6 billion transactions, and if you look at the physical partitions, they track those numbers; they go up with the transactions. So in November 2023 we have 25,000 partitions, which means database
03:30 - 04:00 servers, just for ChatGPT. And it looks like the partition key is the user ID, which makes sense: whenever you do something with ChatGPT you have the user ID, so you know which partition has the data, and everything scales very easily. If you pick the wrong partition key you are really in deep trouble here, because we have 25,000 partitions to deal with. Also, just by looking at those numbers, I said, okay, we have 50 gigabytes, that's the limit per
04:00 - 04:30 partition, and we also have the 10,000 request units. So just by looking at the physical partitions you can easily figure out how many transactions are handled by one physical partition, and that gives you the last number, which is right now 424,000 handled by one physical partition. My other guess here is in the right corner: if each physical partition is half full, that makes 625 terabytes of data getting
04:30 - 05:00 processed in November 2023. I'm not sure what the latest number is, but it's probably much higher than that right now. So Cosmos DB is very scalable and it can handle a global application like that very easily. And they didn't have to go and change those things by hand; with autoscale, Cosmos DB is actually controlling the scaling, the traffic, and the partitions.
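The ChatGPT numbers quoted above can be checked with back-of-the-envelope arithmetic: 10.6 billion transactions spread over 25,000 physical partitions, each partition capped at 50 GB and assumed roughly half full. A quick sketch of the speaker's math:

```javascript
// Back-of-the-envelope check of the November 2023 figures from the talk.
const transactions = 10.6e9;  // transactions handled
const partitions = 25000;     // physical partitions (database servers)
const maxPartitionGB = 50;    // storage cap per physical partition

// Transactions handled per physical partition.
const perPartition = transactions / partitions;
console.log(perPartition); // 424000

// If each partition is roughly half full (~25 GB), total data stored:
const totalTB = (partitions * maxPartitionGB / 2) / 1000;
console.log(totalTB); // 625
```

Both results match the figures the speaker reads off the slide: about 424,000 transactions per partition and roughly 625 TB of data in total.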
05:00 - 05:30 Now, not everybody has ChatGPT, right? Our applications are usually smaller than that. So let's say we have a simple application and we are just keeping orders in one container. We picked customer ID as the partition key, and it looks like we have 202 gigabytes of data. With just the two numbers that I talked about, the 50 gigabytes and the 10,000 request units, you can easily read this diagram and make decisions from it. For example, we have one physical partition that looks like
05:30 - 06:00 it's 44 gigabytes. That is going to get split soon, and you know that when it splits we are going to have another database server: half, 22 gigabytes, is going to stay here, and we are going to have another one with 22 gigabytes. That is important, because right now it looks like we have 10,000 request units per second, and 10,000 request units per second can actually deliver all these operations per second. That's really not small; it can handle all of those
06:00 - 06:30 per second. If you are going to have one more database server, though, those numbers might change. For example, as it is right now, each physical partition has 2,000 request units; your total of 10,000 is divided by the number of physical partitions. If you are going to have one more, then each of them is going to have a lower number. That's why it's important to watch which physical partition is going to get split, so you can
06:30 - 07:00 be on top of it; if something goes wrong, you will know it. Also, it looks like the frequent queries out there have the partition key in them, so that is really great. Well, meet my friend John. John feels good, because everything is working great here. He created a good architecture, and as long as the customer ID is out there this is going to scale well and stay balanced. So maybe John's manager sees that
07:00 - 07:30 Cosmos DB is looking promising, so maybe he's going to give another project to another developer. In that project, well, we don't have the customer ID anymore, and maybe the other developer is not that familiar with Cosmos DB. So if we are going to use the same structure that John created, well, we have a problem here, because we don't have the partition key available in this case. And if we are
07:30 - 08:00 not going to catch this and we go to production with it, you will probably break the other application which is actually working, because this select statement is going to go to every partition you have. And you're going to be in trouble, because you're going to start to use the memory and the CPU of each server, and it's going to bill you for that. That means you potentially broke the other one, because this one is going to need more resources, and it's going to use resources that you really don't
08:00 - 08:30 have. Well, my friend Joe here is really confused, because when this happens you're going to have a huge bill coming up, and your manager is not going to be happy with this. Now, I see this a lot, and when this happens there are a couple of things you can do. The first thing you can do is probably the change feed function. With the change feed function, all you are really doing is duplicating the data to another container with a different partition key. So in our case we have an
08:30 - 09:00 orders container and our partition key is customer ID. The first step we are going to do here is create another container, and let's call that one orders-product, with the different partition key that we actually have; product ID is the one that we were trying to use. Then we are going to move that data from orders to the orders-product container so they get in sync; we have the same data, duplicating the data
09:00 - 09:30 here. Now we are going to use a change feed function. The change feed function is available in the NoSQL, Cassandra, and Gremlin APIs. When a change feed event happens, we prefer to use an Azure Function, and the Azure Function is going to get triggered every time there is a data change in our orders container. From here you can really do anything with the Azure Function, but in this case we are just going to take this data and sync it to the orders-product container. That's how we are
09:30 - 10:00 going to keep the sync between the two containers. You are going to get another insert charge here, which is fine, because you are really going to save in the long term. In this case the query is going to be different: if you have the order ID you are going to look at the orders container, and if you have the product ID you're going to look at the new container. So duplicating the data in NoSQL databases is the cheapest way to handle a situation like that, rather than trying to repartition or
10:00 - 10:30 trying to find a perfect partition key, in NoSQL databases including Cosmos DB. Another option that might actually save you a lot is the point read function. The point read function is available and it can retrieve only one document; it will not be like a query, and it is the fastest and cheapest way to get a document from Cosmos DB. As you can see, there are some requirements
10:30 - 11:00 here. The first one is the partition key; it is required, and the document ID is also required to execute a point read operation. Compare that to a query: if you need more than one document, then you must use a query. And as you can see on the request charge, let's say our JSON data model is maybe 1 kilobyte of data. If you use the point read, that is going to cost you one request unit; the query is going to cost you at least 2.3 or higher,
11:00 - 11:30 and the main reason for that is that for a query to run you need the execution plan, which a point read doesn't even need. That's one of the main reasons a point read is much cheaper. So here is, for example, what a query looks like and what a point read looks like. They pull exactly the same document, but one of them is cheaper than the other one, which is the point read. And you can do the point read only from the SDK, so you cannot
11:30 - 12:00 run that as a query. Next one is connecting to Cosmos DB. Cosmos DB is a cloud database, so sometimes, depending on your IT structure and firewall settings, it can be a challenge to connect to Cosmos DB. So we have the first option, Gateway mode. In Gateway mode we just put one server between your application and Cosmos DB, and your application actually
12:00 - 12:30 connects to that server first, and the server is the one which actually retrieves or inserts the data for you; you almost create this network hop between Cosmos DB and your application. The first thing to know when you use this is that you are actually using HTTPS and a single DNS endpoint. Many companies use this because it might not be easy to change your firewall settings or open new ports, so this is really the
12:30 - 13:00 easiest way to reach Cosmos DB without going to IT and involving security and all kinds of stuff; that can take long, I guess. But as you can see, this server is actually getting shared by other Cosmos DB customers, so that is really not your server, but that's why it's free, I guess. The other thing to know is that we have an inactivity limit here: if you don't use that connection for 2 minutes, it's going to get disconnected and the SDK will re-enable it. We will
13:00 - 13:30 see in a little bit that opening ports or opening connections can actually be costly in Cosmos DB. And also, since this is HTTPS, just like any browser when you try to connect to applications, you have a limited number of connections, which is usually 50. If you want to change that, you can easily do it by changing the DocumentClient's connection policy and its max connection limit; that number can go from 100 to 1,000, and you can actually
13:30 - 14:00 you know make that connection uh much larger if you need to next one is the direct connection direct connection is the default way that uh from the SDK uh using the cosmos DB using the uh drag Point drag point so that's a long live connection that means that you don't have a two-minute limit anymore and it's using the TCP using TLS so it's pretty secure but because of that you might need uh a firewall change
14:00 - 14:30 or you might need to kind of go and open New Ports to make this available and in here as you can see we don't have a network hop anymore we don't have a server in the between your application and the uh database which makes things much faster and quicker uh this is available only in net SDK and Java SDK if you are using the uh the version 2 sdk2 you can actually overwrite that as you can see here I'm doing it for the cosmos client uh you
14:30 - 15:00 can easily do that to direct because sdk2 is default way to Gateway uh rather than direct next one is dedicated Gateway mode this is the same idea with the Gateway mode but as the name suggest this is your server uh and well since it's your server you have to buy it right it's not free anymore so to actually be able to use that first you have to go to Dedicated mod and buy a size of a server just pick it wisely
15:00 - 15:30 because you cannot change the size later uh as as soon as you have that uh available the connection string will be available under the keys and just use that connection string to be able to use the dedicated mode and really uh you might ask now why do I need to do that the other Gateway is free why do I want to pay that well first of all this is yours so you are not really sharing with anyone but also in dedicated Gateway
15:30 - 16:00 mode it gives you a little bit more features like caching data if you have an application and you are always using the same data again and again well caching might be really a good point for you to actually save some uh you know bills with the cosmos DB as you can see uh in here the first one our application is trying to read or query the data uh it goes our dat dedicated Gateway uh server server is the one which is getting the data and pulling it back
16:00 - 16:30 When we make the same request the next time, then that's going to cost us zero request units, because we already cached that data. And if your data is not changing in your application and you are reading the same data again and again, this will save you a lot of request units. The only thing that you really need to watch is the consistency level: your consistency level must be session or eventual for this to work. If it's something else, it might not work, but what you can do is, I'm not really saying go and
16:30 - 17:00 change your database's consistency level; you can easily change your query's consistency level and make this run if you really need it. So how do we make this work? First of all, you need SDK 3.2.1 or later. Then, as I said before, we have the dedicated Gateway right here; first you have to go here and pick your server size. As soon as it becomes available it will be under the keys; you
17:00 - 17:30 just take your connection string from there and put it in your web config, or wherever you are keeping your keys, and you are ready to go. From here, first of all, as I said before, direct mode is the default mode of SDK 3, which is the latest one right now. That means you have to go and override your connection mode when using the connection string here. As you can see, I'm picking the connection mode Gateway; I'm forcing that
17:30 - 18:00 for this connection. Then this is the query I'm running. To make the caching work, I have these request options here: first of all, I want to be sure that the consistency level is session or eventual, and here I just keep it eventual; if your database's consistency level is session or eventual already, you don't really need this one. Then you need to tell Cosmos DB how long you want to cache this data; it looks like I am caching this for 30 minutes. And this new one here, which I just commented out,
18:00 - 18:30 that one is actually new; I believe it's still in private preview right now. If for whatever reason you don't want to get the data from the integrated cache, you can easily control that: you can make it true or false, so maybe one function will always come from the database even if the data is available in the cache. You can control that with the bypass integrated cache parameter. So as you can see here, the first time when we run it, it returns 592
18:30 - 19:00 documents cost me 32.25 request units when I send the same request in next 30 minutes it's always going to cost me zero request units so yes you're going to pay for the server but if you are reading the same data again and again this can actually save you uh a lot of request UNS uh and it will be useful next one uh let's talk about the cosmos dbsd case Cosmos DB stks do actually much more than other stks of
19:00 - 19:30 databases, especially if you compare with SQL Server. In SQL Server, all the aggregation — the GROUP BY, the ORDER BY — is done by the database engine. Cosmos DB doesn't work like that; really, it cannot work like that, because it has many physical partitions and the data is not in one centralized place. For example, say we have ten servers here holding all the orders, and I want to get the top 10 orders
19:30 - 20:00 ordered by the viewCount property. If you think about it, I don't have the partition key here either, so I am going to send this query to every physical partition I have. The data is going to come back from all of those places, and the database cannot do the TOP 10 or the ORDER BY, because no partition knows what data the other physical partitions hold. So everything comes back to the Cosmos DB SDK, and the SDK is the one that eliminates most of the results, gives me
20:00 - 20:30 the top 10, and orders by viewCount. All the aggregation actually happens in the SDK, which means there are all kinds of loops running there and all kinds of resources being used. So you want to be sure that your SDK, wherever it runs, has a good amount of resources, because it's pretty busy — it's doing a lot of work. If you look at this — this is the JavaScript SDK — and I run the same query, you can see on the left we have that queryInfo object, which
20:30 - 21:00 detects what kind of aggregation you might have. For example, it finds the ORDER BY here, and it also finds the TOP 10. As long as there is something like that, it rewrites your query, so my query ends up as the one you see at the top right. That rewrite helps the SDK: when the data comes back, it can easily apply the ORDER BY or the aggregation. That's what happens on the back end.
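The cross-partition merge the SDK performs can be sketched locally like this (a conceptual sketch only — the per-partition result sets, the `viewCount` property name, and the merge logic are illustrative, not the SDK's actual code):

```javascript
// Each physical partition returns its own locally-ordered top 10;
// the client-side SDK must merge them and re-apply TOP/ORDER BY globally.
function mergeTopN(perPartitionResults, n) {
  return perPartitionResults
    .flat()                                    // combine all partitions' rows
    .sort((a, b) => b.viewCount - a.viewCount) // global ORDER BY viewCount DESC
    .slice(0, n);                              // global TOP n
}

// Three hypothetical partitions, each already sorted descending:
const partitions = [
  [{ id: "a", viewCount: 90 }, { id: "b", viewCount: 40 }],
  [{ id: "c", viewCount: 70 }, { id: "d", viewCount: 60 }],
  [{ id: "e", viewCount: 80 }, { id: "f", viewCount: 10 }],
];
console.log(mergeTopN(partitions, 3).map(x => x.id)); // [ 'a', 'e', 'c' ]
```

This is why the process hosting the SDK needs real CPU and memory headroom: most results are fetched, held, and then thrown away on the client.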
21:00 - 21:30 Next, if you are using the .NET SDK, there are a couple of things you should know. The first is that you want to be sure your project targets 64-bit — not Any CPU or 32-bit, but 64-bit. The main reason is that 64-bit is the only target that has the ServiceInterop DLL. That file can create query plans locally, which means you don't have to make a request to Cosmos DB for a query plan,
21:30 - 22:00 so you save one request just by making your application 64-bit. Next, for garbage collection, you want to be sure you are using server mode rather than workstation mode, and you want to disable the default trace listener — it can sometimes be a bottleneck for CPU and I/O in production. And if you are testing your application at 40 or 50,000 request units or more, one machine is not going to be able to
22:00 - 22:30 handle that, because that's a pretty high load. You might want to test your application with multiple machines, because if you try to drive 40 or 50,000 request units from a single server, the test machine won't be able to keep up, and you might end up with 429s and all kinds of weird problems — when really there is no problem; it's your test server that cannot handle all the requests and responses. So you want to be sure you have multiple test
22:30 - 23:00 workstations. Next one is the Stream API. This is available in SDK version 3, and you should try to use it to make things more scalable and smoother. As you can see, in SDK 2 what we usually do is deserialize the data and then serialize it again when we give it back to the application — two conversion steps. SDK 3 is much smoother; almost like Direct mode versus Gateway
23:00 - 23:30 mode, we don't have that hop in the middle. For example, let's look at the old way with SDK 2 — this is the most common pattern. As you can see, we have a list of sales, so every time we read a sale we are doing a deserialization, and when we are done and hand it back to the application, we usually convert it back to a JSON object — that's a serialization. So you are doing
23:30 - 24:00 two steps here. You can make this simpler with the Stream API. With the Stream API, as you can see, we create a JsonTextReader, we get the data as JSON, and we return it as JSON — rather than two steps, it's just one. The only catch is that you must have the partition key for this to work; without the partition key it's not going to work. That's the only requirement for the Stream API.
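The idea can be sketched locally: instead of parsing the wire payload into objects and re-serializing it for the caller, a stream-style handler passes the raw JSON through untouched (conceptual sketch only; the .NET Stream API methods themselves are not shown here):

```javascript
// SDK 2 style: two conversions (parse into objects, then stringify again).
function typedRoundTrip(rawJson) {
  const objects = JSON.parse(rawJson);   // deserialize into typed objects
  return JSON.stringify(objects);        // serialize again for the caller
}

// Stream-API style: the payload is already JSON, so just pass it through.
function streamPassThrough(rawJson) {
  return rawJson;                        // zero conversions
}

const wirePayload = '[{"id":"s1","total":10}]';
console.log(typedRoundTrip(wirePayload) === streamPassThrough(wirePayload)); // true
```

Same bytes out either way — the stream path just skips the CPU and allocation cost of materializing objects it never needed.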
24:00 - 24:30 Next one is the document client. The document client is how you access Cosmos DB, and my suggestion is to use a singleton. The last thing you want to do is create a new document client for every request; you want to create one document client and use it for all of your other calls to Cosmos DB. The main reason is what actually
24:30 - 25:00 happens when you declare or initialize the document client: the first thing it's going to do is request your account information; after that it gets all the container information — that's another request; then it gets all the partition and routing information. So we are talking about three requests before we have even made any kind of query. If you create a new client each time, that means you're actually calling Cosmos DB four times just to run one query. So you want to
25:00 - 25:30 execute this once and, with the singleton, use the same document client for all requests. Here we can also talk about the consistency mode — I just want to touch on it, because it's a big, big topic. The consistency mode is going to help the application figure out where the data is going to come from. As you know, in Cosmos DB, inside one physical partition we
25:30 - 26:00 actually have four servers: one leader and three followers — a replica set. So the consistency mode is going to determine where the data comes from, because if you think about it, you have four places to pull the data from. Depending on the consistency mode, the document client will read from the leader, from the leader plus a follower, or from just a follower. That's where the consistency mode comes in: from there it pulls the data, and you get the data after that.
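The singleton-client advice above can be sketched like this (a minimal sketch — in a real app the factory would be something like `() => new CosmosClient({ endpoint, key })` from `@azure/cosmos`; a stand-in object is used here so the sketch runs without an Azure account):

```javascript
// Lazily create one shared client and hand back the same instance forever,
// so the account/container/routing metadata requests happen only once.
function makeSingleton(factory) {
  let instance = null;
  let creations = 0;
  return {
    get() {
      if (instance === null) { instance = factory(); creations++; }
      return instance;
    },
    creations: () => creations,
  };
}

// Stand-in factory so the sketch runs without credentials.
const clientHolder = makeSingleton(() => ({ fakeCosmosClient: true }));
const a = clientHolder.get();
const b = clientHolder.get();
console.log(a === b, clientHolder.creations()); // true 1
```

In most runtimes a module-level `const client = new CosmosClient(...)` achieves the same thing; the point is one client instance per process, not per request.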
26:00 - 26:30 Next one is metadata requests. Depending on what kind of application you have, you might be dynamically creating containers and databases, or maybe you want to list all the containers in a drop-down. All of those are metadata requests, and they don't draw from the request-unit threshold you provisioned. Let's say you have 1,000 request units: when you get the list of databases, you don't use those request units. There is a
26:30 - 27:00 system-reserved request-unit budget just for metadata requests, which is 240 request units, and there are some limits, as you can see there — for example, a maximum of 100 collection creates per minute; you cannot create more than that even if you want to. So if you are doing things like that, you want to be sure you cache this information as much as you can, because that 240-request-unit budget is fixed — you cannot change it. So I guess just be careful with that.
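The caching advice can be sketched with a small TTL cache (illustrative only — the `listContainers` stand-in, the 5-minute TTL, and the injected clock are assumptions for the example, not SDK features):

```javascript
// Cache a metadata lookup (e.g. the container list) so repeated calls
// don't burn the fixed 240 RU metadata budget.
function cached(fetchFn, ttlMs, now = Date.now) {
  let value = null;
  let fetchedAt = -Infinity;
  let fetches = 0;
  return {
    get() {
      if (now() - fetchedAt >= ttlMs) {   // stale, or never fetched
        value = fetchFn();
        fetchedAt = now();
        fetches++;
      }
      return value;
    },
    fetches: () => fetches,
  };
}

// Stand-in for a real "list containers" metadata call, with a fake clock.
let t = 0;
const listContainers = () => ["orders", "customers"];
const containerCache = cached(listContainers, 300000, () => t);

containerCache.get();       // first call fetches
containerCache.get();       // within TTL: served from cache
t = 300001;                 // advance the fake clock past the TTL
containerCache.get();       // stale: fetches again
console.log(containerCache.fetches()); // 2
```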
27:00 - 27:30 Yeah. Next one is optimizing the bandwidth. Many times when you insert data — I'm not sure how big your data model is, but let's say you have a 10 KB data model — you send it to Cosmos DB, and when Cosmos DB gives you a response, usually all you need to know is whether it succeeded or not. That's usually all you care about. But when Cosmos DB gives you the response, it actually attaches your data model —
27:30 - 28:00 your insert payload — to the response, which makes the packets much bigger. You can easily control that with the enable content response on write parameter: if you set it to false, the object will not be attached to the response and the packets will be much smaller. That will help you, especially if you are doing anything in bulk mode. So this will actually help
28:00 - 28:30 you. Next one is partial document update. This is a newer feature, and you can do all kinds of things with it. Without it, if you have a property — say a flag you want to change from true to false — you have to send the whole data model to Cosmos DB and let Cosmos DB figure out what changed and update only that. With partial document update, you can simply say: I want to change this property to this value. So, as I said
28:30 - 29:00 before, your request packets will be much smaller and much faster. It is not going to cost fewer request units — it really costs the same, because if you think about it, Cosmos DB still has to open the document, change it, and save it back — but it makes the developer side much easier. You can also combine operations, so it doesn't have to be one property; you can have two or three.
29:00 - 29:30 It supports transactional batch, so you can combine it with a transactional batch, and conflict resolution support is there too. The operations we have are add, set, replace, remove, increment, and move. For increment, if you pass a negative number, you can decrease a value easily — so if you have, for example, a number of orders that you want to increase or decrease, you can pass 1 or -1 and it does it easily.
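The operation semantics can be sketched with a tiny local applier (illustrative only — the real service applies these server-side; in the JavaScript SDK the call shape is roughly `container.item(id, partitionKey).patch(operations)`):

```javascript
// Apply a list of patch operations to a plain document, mimicking the
// server-side semantics of add/set/replace/remove/increment.
function applyPatch(doc, operations) {
  const out = { ...doc };
  for (const { op, path, value } of operations) {
    const key = path.replace(/^\//, ""); // top-level paths only, for brevity
    switch (op) {
      case "add":
      case "set":       out[key] = value; break;
      case "replace":   if (key in out) out[key] = value; break;
      case "remove":    delete out[key]; break;
      case "increment": out[key] = (out[key] ?? 0) + value; break;
    }
  }
  return out;
}

const doc = { id: "order1", saleDate: "2023-01-01", itemCount: 5 };
const patched = applyPatch(doc, [
  { op: "add", path: "/status", value: "shipped" },
  { op: "increment", path: "/itemCount", value: -1 }, // negative = decrement
  { op: "remove", path: "/saleDate" },
]);
console.log(patched); // { id: 'order1', itemCount: 4, status: 'shipped' }
```

Only the small operations list crosses the wire, not the whole document — that's the bandwidth win the talk describes.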
29:30 - 30:00 So let's look at an example. In the first one here, I am creating a list of patch operations: the first thing I'm doing is adding a brand-new property named sale date and passing today's date, and I am removing the original sale date. When I run it, as you can see, I need the document ID and the partition key for this to work, and then I pass my patch operations. That's how it works — that's the basic
30:00 - 30:30 one. Another interesting option this has is a filter. Here, as you can see in the options, I am passing a filter predicate, and it says I want this update to go through only as long as the predicate is true. So if the number of items property is larger than five, the update goes through; if not, Cosmos DB does not update it. What we are doing here is adding a new property named audit and setting it to true, as long as
30:30 - 31:00 the number of items is larger than five. So with one operation I can do both together using the partial update. All right, next one is indexing. Indexing is important in any database. When you create a new container in Cosmos DB, Cosmos DB indexes everything, so you don't have to worry about what to index. Many people
31:00 - 31:30 like that, but they figure out it's really not a good option once the data gets larger. You have a data file, where you keep all the data, and you have a single index file. If you index everything, your index file might get larger than your data file. And if you have, for example, a property named description that is never going to appear in your WHERE clause, but you are still indexing it, you
31:30 - 32:00 should exclude properties like that once you figure out what you are filtering by. We have three indexing modes in Cosmos DB. The default is consistent — that refers to the consistency between the data file and the index file. Consistent means your data and index files always match perfectly, all the time: as soon as the data changes, the index changes immediately. If you make it lazy — which is a dangerous
32:00 - 32:30 one — the index file only eventually matches the data file, which means that if your data is answered from the index, you might potentially return wrong data. So you really don't want to use lazy. Lazy is a little bit cheaper, and that's why many people use it, but it's really not worth it — if you're using lazy, just go back to consistent. And then there is none, which means you don't
32:30 - 33:00 need indexes at all. Usually you use that when you bulk load a lot of data — maybe the first time you are pushing the data in, you don't need indexes, because they make things slower. You put the data into the database first, then figure out what needs to get indexed later. That's a faster way to push data into Cosmos DB the first time. There are a couple of index types in Cosmos DB. The first is the range index. The range index
33:00 - 33:30 covers most of the filters — equals, not equals, ranges, any kind of ORDER BY, JOINs: everything is handled by range indexes. We have composite indexes, which can sometimes be confusing because they have a lot of rules you need to follow, but they are very useful if you have a lot of filters in your WHERE clause; and if you need to ORDER BY multiple properties, you actually must have a composite index. We have geospatial indexes, which are
33:30 - 34:00 available only for geospatial data. As I said before, everything is indexed by default, but a geospatial index will not be created automatically — you have to create those geospatial indexes yourself. And the new ones are the vector indexes: the SQL API actually supports vectors now, and you can easily index them; that index gets used by the VectorDistance system function. This is still in preview right now, but it's available.
34:00 - 34:30 You can control the dimensions and the types easily with that index. As I said before, composite indexes can be challenging because, as you can see there, you want to be sure that everything in the composite index, in the order you created it, matches how the properties appear in your WHERE clause. For example, let's look at some of the composite indexes I created here. In the first one, I created a composite index on the name
34:30 - 35:00 and age properties together, and if you look at my query, I am using them in the same order — name, then age — so the composite index works great with this one. In the next one, I am using the same composite index, but this time age is not an equality anymore; it's greater than 18, which is a range filter. In composite indexes, all the range filters must come last — the last key in the composite index — and because of that, it works. If my
35:00 - 35:30 composite index had been age, name instead, this would not work. The next one I have here is another kind of aggregation, and it also works with the name-and-age composite index. Everything works here until we get to SELECT * FROM c WHERE c.name != 'John'. That doesn't work, because it's not an equality anymore —
35:30 - 36:00 it's almost like a range, so I would have two kinds of range filters here, and you cannot use the composite index for that. If you want, you can create multiple composite indexes. For example, look at SELECT * FROM c WHERE name = 'John' with a timestamp filter: the first index doesn't work, because its range key is in the middle rather than at the end, but the other one works, because I actually have two composite indexes there, and together they handle both queries.
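The "range filters last" rule can be sketched as a small checker (illustrative logic only — the real query engine's index-matching rules are more involved than this):

```javascript
// Decide whether a composite index (an ordered list of paths) can serve a
// query whose filters are given as { path, kind } with kind "eq" or "range".
// Rule of thumb from the talk: every indexed path must be filtered, in index
// order, and any range filter may only sit on the final key.
function compositeIndexApplies(indexPaths, filters) {
  if (filters.length !== indexPaths.length) return false;
  const byPath = new Map(filters.map(f => [f.path, f]));
  return indexPaths.every((path, i) => {
    const f = byPath.get(path);
    if (!f) return false;                          // indexed path not filtered
    const isLast = i === indexPaths.length - 1;
    return f.kind === "eq" || isLast;              // ranges only on last key
  });
}

// Index on (name, age): equality on name plus range on age works...
console.log(compositeIndexApplies(["name", "age"], [
  { path: "name", kind: "eq" }, { path: "age", kind: "range" }])); // true
// ...but with the keys flipped to (age, name), the range is no longer last.
console.log(compositeIndexApplies(["age", "name"], [
  { path: "name", kind: "eq" }, { path: "age", kind: "range" }])); // false
```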
36:00 - 36:30 So composite indexes can be tricky — test them in QA and be sure they're working before you push to production, because in production reindexing can take a long time: you have many physical partitions out there, each rebuilding its index. You want to be sure that whatever you push, you are making the right choices with the indexes. As I said before, you should exclude any properties you are not
36:30 - 37:00 filtering by, because that actually improves write performance — Cosmos DB doesn't have to make changes to the index file — it reduces the request-unit charges for inserts, and it reduces the index storage. In Cosmos DB you pay for request units and you pay for storage, so not indexing unused properties means less storage, which means cheaper — which is good. And you want to make all of your index changes together:
37:00 - 37:30 if you compare with SQL Server, there you might do this index first, that index later, maybe a third one after that. If you need to make three index changes in Cosmos DB, you want to make them in a single indexing-policy change, because — remember that ChatGPT example with 25,000 partitions — when you push an index change like that, you are kicking off 25,000 reindex operations. That can take time and
37:30 - 38:00 resources, and that's why you want to make those changes together in Cosmos DB. The next one is actually a brand-new one: computed properties. You know, this is a SQL database, and sometimes you have to use the LOWER function, because the post type here could be capitalized any which way. When you run that, LOWER is a system function, and system functions are going to be
38:00 - 38:30 costly for you — in this case it's 23.25 request units. You can easily create a computed property — they are under the computed properties tab here. It looks like I have two computed properties; the one I'm going to demo is the second one, cp_postType, and as you can see I am running LOWER for that one. That creates a new property — a virtual property — cp_postType, and if I
38:30 - 39:00 use cp_postType rather than running the function inline as in the first query, the cost drops almost by half, to 13.75 request units. So you can use computed properties if you are using any kind of system functions in your WHERE clause; if you create a computed property for them, you can easily save some request units in Cosmos DB. In the first one, as you can see, I'm using DateTimeDiff — if you have anything like that, you can easily use a computed property for it too.
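The same effect can be approximated client-side by materializing the derived value at write time — a related but different technique from the server-side computed property the demo uses, which is defined in the container's computed-properties policy with a query such as (roughly) `SELECT VALUE LOWER(c.postType) FROM c`. The `cp_postType` and `postType` names below follow the demo and may differ in your container:

```javascript
// Materialize a derived, query-friendly property when writing a document,
// so queries can filter on the precomputed value instead of calling LOWER()
// per row at query time.
function withComputedPostType(doc) {
  return { ...doc, cp_postType: doc.postType.toLowerCase() };
}

const post = withComputedPostType({ id: "p1", postType: "Answer" });
console.log(post.cp_postType); // "answer"

// A query can now filter on the precomputed (and indexable) value directly:
// SELECT * FROM c WHERE c.cp_postType = "answer"
```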
39:00 - 39:30 Next one is throughput. The default option is provisioned. When it comes to throughput, let me explain it like this: let's say we are throwing a pizza party, and either you don't know how many people are going to come, or maybe you know exactly how many people will show up. If you know how many people are coming, you go with provisioned and order exactly the right amount of pizza for everybody. That's
39:30 - 40:00 how provisioned works: for a web application or any kind of server, you know how many people are going to come to your application, and you know exactly how many request units you need. Autoscale is great — I try to use it as much as I can — and it scales your throughput up and down easily. With autoscale, you give Cosmos DB a maximum number you want it to go up to. For example, maybe your app normally runs at 1,000 request units, but tomorrow is a big sale day and
40:00 - 40:30 it might go to 5,000 or 6,000 — you don't know — so you say, OK, I'm going to put 10,000 here, and it can go up to 10,000 as people start coming to the application. With autoscale, you pick a large ceiling and the application scales up as far as it needs, up to that number. The only thing to be careful of with autoscale is to pick something you can actually afford — don't just pick something like 100,000 request units, because what can
40:30 - 41:00 happen is somebody attacks your application; if that happens, you might actually hit that ceiling and end up with a surprise bill next month. So be sure that whatever you pick here is something you can afford. Then there's serverless — and there's actually an update for that one right now. You can create a serverless account if this is the first time you are building the application and you don't really need request units per second: you get, say, 1 million request units, and each time you use
41:00 - 41:30 Cosmos DB, it subtracts from that 1 million. You can easily start your application that way, and there's a new capability now: you can convert your serverless account to provisioned if you like. So if this is a new application, you can start serverless, and when you are ready to go to production you can convert to provisioned. This works great with Azure Functions too: maybe you run an Azure Function a couple of times a day — it doesn't have to be provisioned, so you don't have to pay per second; you can go
41:30 - 42:00 serverless, and that will save you some money with Cosmos DB. Also — I know you came here to learn tips and tricks to make your Cosmos DB bills cheaper — there is one option where you don't have to change anything and it will still save you money: reserved capacity. If you know you are going to use Azure Cosmos DB for the next one or
42:00 - 42:30 three years, you can easily pick reserved capacity, and that gives you a 20% to 30% discount depending on the number of years. You don't have to change anything — you just use it. The reservation covers between 100 and 1 million request units, and if you need more than that, believe me, the discount goes even higher — it can go up to 50 or 60% if your application is using more than 1 million request units; at that scale the number becomes negotiable. That's the easiest way to save money with Cosmos DB without changing any code.
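As a back-of-the-envelope sketch of the savings (the hourly rate and RU/s figures below are made up for illustration; real Cosmos DB pricing varies by region, tier, and reservation term):

```javascript
// Estimate yearly throughput cost with and without a reservation discount.
function yearlyCost(ruPerSecond, dollarsPer100RuPerHour, discount) {
  const hoursPerYear = 24 * 365;
  const base = (ruPerSecond / 100) * dollarsPer100RuPerHour * hoursPerYear;
  return base * (1 - discount);
}

const payAsYouGo = yearlyCost(10000, 0.008, 0);    // hypothetical list rate
const reserved   = yearlyCost(10000, 0.008, 0.25); // 25% reservation discount
console.log(Math.round(payAsYouGo), Math.round(reserved)); // 7008 5256
```

Even at a modest 10,000 RU/s, a 25% discount on an always-on workload is a four-figure yearly saving — with zero code changes.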
42:30 - 43:00 If you are sure you are using Cosmos DB long term, that's the way to go. The next one is in preview too. You might be dealing with 429s — since Cosmos DB is rate limited, sometimes you hit that rate limit. Let's say you have 20 transactions going on, and 18 of them are really high priority; if the other two fail, you don't care. You can now actually control which requests
43:00 - 43:30 should be high priority and which should be low by using priority-based execution. By default everything is high, so you can either mark the low-priority ones, or keep the high ones and make the low ones throttleable. Here's an example: it looks like I am creating a Stack Overflow post and setting the priority level to high, so it will be prioritized by
43:30 - 44:00 Cosmos DB. If I'm close to that 429 limit, this request executes first, and then the other, low-priority items. So you can control what happens when you are hitting that 429 rate limit. This is still in private preview, but hopefully it will be available soon. Now let's see how many minutes we have — maybe five more minutes. So if you have any questions I can take them now, or I can show you something else — I'm not sure if you know anything about the VS Code extension I have.
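The priority-based execution behavior just described can be sketched as a tiny scheduler (a local illustration only — in the real SDK the knob is a per-request priority option, roughly high versus low; the capacity model here is an assumption for the example):

```javascript
// When capacity is tight, execute high-priority requests first and
// throttle the low-priority ones -- a rough model of priority-based execution.
function schedule(requests, capacity) {
  const ordered = [...requests].sort(
    (a, b) => (a.priority === b.priority ? 0 : a.priority === "high" ? -1 : 1)
  );
  return {
    executed: ordered.slice(0, capacity).map(r => r.id),
    throttled: ordered.slice(capacity).map(r => r.id), // these see the 429s
  };
}

const result = schedule(
  [{ id: "r1", priority: "low" },
   { id: "r2", priority: "high" },
   { id: "r3", priority: "high" }],
  2 // only two requests fit under the rate limit
);
console.log(result); // { executed: [ 'r2', 'r3' ], throttled: [ 'r1' ] }
```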
44:00 - 44:30 I can demo a few things for you. Let me see — this is going to be... let's see if I can share this. Sure, you can use serverless in production — I mean, as long as you have the right amount of requests — because in production things can get busy, so you
44:30 - 45:00 might need more resources for that. But yeah, you can use it; most of the time, though, people go back to provisioned or autoscale. So you don't need to worry about keeping track of how many request units you need — yeah. So, let's see here how we can do this — one minute.
45:00 - 45:30 All right — this is going to be tricky, but I'm going to make it work. I cannot see it on my screen, but I can see it here. So here, as you can see, this is a VS Code extension — it's free; I created it for the Cosmos DB community, and you can download it for free. Rather than the browser, you can pick your database
45:30 - 46:00 here, it goes and lists all the containers it has, and from here you can easily pick one — this will show you how many physical partitions the container has. So you can easily see the structure without worrying about which tab or button to click in the Azure portal; it can sometimes be challenging in the Azure portal. And this one, for example: if I go with a TOP 10 query and run it here — I click Execute — it shows you how
46:00 - 46:30 many request units the query cost, and it returned me 10 items. But as I said before, the SDK is the one handling that: we have five or six partitions here, so I actually pulled much more than that. Clicking Execution Metrics here, you can see we made five requests and got 50 items, not 10. You can easily click on this link and see that we actually got 10 from each physical
46:30 - 47:00 partition, because we did not define a partition key here; the SDK is the one eliminating 40 of them and displaying the 10. That's why the SDK needs resources — it does all kinds of work there, especially GROUP BY and ORDER BY; all of those happen in the SDK. And this is the last one I'll show you: here, in real time, you can see we are using LOWER on the post type. Let's actually execute this one
47:00 - 47:30 here. Neither property is my partition key, so it's going to take some time — it looks like it cost me 32.34 request units. If I use the computed property instead and run the same query again, it should be much cheaper: it was 32.34, now it's 26. That number can vary depending on how much data you have and whether you're using the partition key or not, but it definitely helps. The other thing I can probably
47:30 - 48:00 share with you: if you go under the options here, you can turn on the index metrics. This actually suggests indexes to you. So if I go back and execute this one more time, let's see what it says — as you can see, I have a new tab here, Index Metrics, and after the run you can see it says the owner user ID index is utilized, so don't touch it; but it looks like cp_postType is not indexed, so if I put an index
48:00 - 48:30 on it, that number will probably go even lower. So this is kind of useful. You can also easily see the indexing policy here — it tells you what is indexed and what is excluded. For example, score is excluded; if I put score in the query, it will probably cost much more, and it will tell me that a potential index is score. Other than that, it has all kinds of options and features. I'm always
48:30 - 49:00 working on this, and I'm hoping to make it open source so everybody can help, because, believe it or not, I already have 11 or 12,000 downloads on it, and people keep suggesting all kinds of things to me, but I don't really have that much time. So hopefully I'll make it open source — and it's in JavaScript, since it's a VS Code extension, so if you are up for it, I would love to get help from you. But that's all I have for you today. I hope
49:00 - 49:30 everybody learned something new — thanks again for coming to my session. Yeah, I enjoyed it. Thank you very [Applause] much. I can answer any questions you have, since this is the last session. We're done? Sure, sure.
49:30 - 50:00 Can you say the question one more time? OK — so that really depends. First, it depends on how big your database is: if your database is pretty large, it will help you dramatically, because
50:00 - 50:30 when you make that call, it finds the index in one call rather than three. Like when I covered the range indexes: with range indexes, that index is effectively available in three places, but with the composite index, it finds it in one call, so it's much faster. But to see that dramatic change, you also need to have a lot of data, because that's what makes the big difference. Yeah — I really find it very useful to
50:30 - 51:00 [Audience comment, largely inaudible, about checking the request-unit price of a query with the calculator.]
51:00 - 51:30 Also — sure, yeah, that's good feedback. And even when it gives you a good number, in the meantime try not to select everything either, you know what I mean? Because if you have, say, twenty-some properties and you only care about three of them, you can just select those three properties, and that makes the request units much smaller too. So that calculator is great, but it calculates for every property you have. So whenever you query the data, try not to
51:30 - 52:00 say SELECT * — that will actually help with the request units. That's one of the good tips too. Yeah — no, thank you, thank you for coming. All right, thank you, everyone.