Benefits of High Availability and Scalability in the Cloud - AZ-900 Certification Course
Estimated read time: 1:20
Summary
High availability and scalability are key concepts in cloud computing that offer incredible flexibility, agility, and cost-efficiency. Unlike on-premises systems, the cloud provides vast capacities through globally distributed data centers, allowing businesses to seamlessly run various services like virtual machines, databases, AI, and machine learning. With cloud providers like Azure, users can utilize services in proximity to their customers for optimal performance, ensuring high availability through distributed instances and disaster recovery strategies. This means businesses can easily scale their resources based on demand, providing resiliency and elasticity without upfront equipment costs.
Highlights
Cloud computing provides massive scale and flexibility with global data centers.
Services can be consumed from anywhere, increasing resiliency and decreasing latency.
Pay-as-you-go pricing in the cloud offers financial agility, minimizing waste.
Horizontal scaling is preferred for its simplicity and effectiveness in cloud environments.
High availability ensures services are uninterrupted during failures by utilizing multiple instances.
Key Takeaways
The cloud offers unmatched scalability and high availability, enabling businesses to grow their resources as needed.
Global cloud data centers ensure services are close to users, enhancing performance and user experience.
Cloud services' pay-as-you-go model alleviates the burden of paying for unused resources.
Modern architectures in the cloud support horizontal scaling for better availability and cost efficiency.
High availability in the cloud can be achieved through strategic resource distribution across multiple data centers.
Overview
Cloud computing transforms how businesses approach capacity and scalability. By using vast global data centers, companies can deploy various services like virtual machines and AI close to their user base, ensuring better performance and lower latency. This proximity not only enhances user satisfaction but provides robust high availability options, minimizing downtime risks.
The financial model of cloud services, notably the pay-as-you-go approach, liberates businesses from investing in costly hardware that often remains underutilized. With the flexibility to scale services vertically or horizontally as needed, cloud users can efficiently manage workloads, scaling up during peak demands and down during lulls.
Furthermore, cloud solutions bolster disaster recovery strategies by allowing workloads to shift between regions seamlessly in case of outages. Businesses looking to modernize can benefit immensely from the cloud's elasticity, enabling them to adapt quickly to changing demands and maintain a competitive edge in the marketplace.
Chapters
00:00 - 00:30: Introduction to Cloud Benefits In the chapter titled 'Introduction to Cloud Benefits', the lesson focuses on explaining the benefits of high availability and scalability in cloud computing. The concept of cloud computing is demystified, pointing to the common saying that the cloud is essentially just someone else's computer. The chapter delves into essential elements such as capacity, management platforms, and user experience, highlighting their roles in facilitating service operations.
00:30 - 01:30: Capacity in On-Premises vs Cloud In this chapter, the focus is on understanding the differences in capacity management between on-premises and cloud environments. On-premises capacity is limited to the physical hardware available in local data centers, which includes servers, network switches, and storage areas. In contrast, cloud capacity is more flexible as it can be adjusted according to demand, allowing for scalability without the need for physical infrastructure expansion.
01:30 - 02:30: Cloud Data Centers and Regions The chapter discusses cloud data centers and regions, focusing on networks and network-attached storage that provide capacity for on-premises workloads. It covers the use of various technologies, such as virtual machines and containers, and the creation of pods that run specific container images. The emphasis is on understanding and managing different types of capacity for efficient workload management.
02:30 - 05:00: Cloud Services and Flexibility The chapter 'Cloud Services and Flexibility' discusses the advantages of using cloud services over traditional on-premises infrastructure. It highlights how the cloud offers vast capacity and flexibility for users to install and run business applications. Unlike on-premises systems, which require upfront hardware purchases and are limited by physical capacity, cloud services provide scalable resources that can be adjusted as needed without the obligation of continual payment for unused capacity.
05:00 - 08:30: High Availability in the Cloud The chapter discusses the concept of high availability in cloud computing. It emphasizes the global distribution of data centers, specifically referencing Azure, to ensure data and services are consistently accessible worldwide. The focus is on understanding how these geographically dispersed data centers contribute to the robustness and reliability of cloud services.
08:30 - 15:00: Scalability and Elasticity The chapter titled 'Scalability and Elasticity' introduces the concept of grouping resources based on proximity, referred to as regions. The speaker mentions how different regions are identified, such as region one, region two, etc. This concept is foundational in understanding the distribution and management of resources in a scalable system.
Benefits of High Availability and Scalability in the Cloud - AZ-900 Certification Course Transcription
00:00 - 00:30 In this lesson, we're going to look at describing the benefits of high availability and scalability in the cloud. Now, when we think about cloud computing, what exactly is it? There's a joke: there's no such thing as the cloud, it's just someone else's PC. And there's a lot of truth there. We often think about capacity. Capacity is something that enables services to run. On top of that there's some management platform, some user experience, so we can consume the
00:30 - 01:00 service. Now on premises, if I think about my local data center in my on-premises world, my capacity is provided by, well, I have various types of server. I'll have network switches providing network throughput. I'll have storage area
01:00 - 01:30 networks, other types of network-attached storage, but these are all providing capacity for the workloads I want to run on premises. And then on top of that capacity, maybe I'm running virtual machines, maybe I'm using containers and so I'm creating pods that run a certain container image. So on top of the capacity, well, there's different types
01:30 - 02:00 of service that capacity provides that I can then install my business applications on and actually use. On premises I'm bound by the physical hardware I have, and obviously I have to buy it in advance and I'm always paying for it. If we now think about, well, how does that work in the cloud? In the cloud, that capacity is huge in scale, with resources that are
02:00 - 02:30 housed in data centers all around the world. I can think about there are specific sets of data centers. So if we now pivot to thinking about the cloud, I'm going to draw this massive idea and this is Azure. All around the world there are these groups of data centers. So there's these various groupings
02:30 - 03:00 available within a certain proximity to each other, and we group these based on that proximity. So I could say, well, okay, this is a region. And you'll hear that term a lot. That might be region one. This is another region. So that's region two. And so on top of all of this
03:00 - 03:30 capacity, I can then run the various different services. And in the cloud, there's more types of service available to you. Yes, there's obviously things like, hey, I can go and spin up a virtual machine. I can use container environments like Kubernetes, but I might also have database offerings that are managed for me. There might be artificial intelligence and machine learning services that I can go and consume and many many others. So we get
03:30 - 04:00 this many different types of services, hundreds of them available that I can go and consume, and I have a lot of flexibility in where I can go and consume that. If we jump over and we look at Microsoft's map, all of these blue dots are existing regions, and again, within those regions, I can click on one of them. They will have many data
04:00 - 04:30 centers, and we can see information about when it opened, the different compliance offerings that are available, plus much more. And so I as a customer have great flexibility in that I can go and consume services from points all around the world. That gives me flexibility in terms of, well, I might want them in lots of places for resiliency from any kind of failure, but also, if my customers are distributed all around the world, well, it would be great
04:30 - 05:00 to offer my services close to the customers so they get a fantastic experience. There's not a long delay from having to go a really long distance over the network. And so what all of these different types of services, available in so many different places, gives me is this fantastic agility, because I'm only paying for the service I consume. I don't get locked
05:00 - 05:30 into "I've bought this server, I have to use this server." I can change where I want to host my services. I could host them in more places. I could host them in fewer. I could change where I host them. I can switch the size of VMs, the number of VMs. I could switch from VMs to containers to app services. I can switch database sizes. There's no penalty for me, because I just pay for what I'm consuming, typically on a per-second basis. It's also fantastic because it means if I do
05:30 - 06:00 make a mistake, I've done a sizing exercise, I've picked a certain service, and then if I work out that's not the right thing, I can very easily change. Now, if I switch from VMs to containers or app service, there may be some application development effort, but that's completely different from being locked into a particular set of infrastructure I may have purchased. Another key concept when we architect any solution, and that includes the cloud, is high
06:00 - 06:30 availability. High availability is about ensuring that if certain types of disruptive events occur, our service continues to function. Now, every single Azure resource has its own specific service level agreement, or SLA. This is a financially backed guarantee around what you can expect in terms of its availability, to be communicated with, to be used, and if it breaks that SLA you
06:30 - 07:00 get a certain financial credit back. We cover SLAs later in this course. It's important to understand the SLA your solution needs so I can then architect accordingly. Now, very often, meeting my high availability target means I need multiple instances of my resource over different blast radiuses (see the worked availability example after this transcription). And a blast radius, you can think of it like this: well, a server can fail. So, I'd want to make sure my instances were on different servers. An entire
07:00 - 07:30 server rack could have a power supply or a network switch failure. So, I'd want to make sure I'm in different racks. But data centers can have failures, maybe their cooling, maybe their power. So I might want to distribute over multiple data centers. And so we think about what blast radius I want high availability for. And then by being able to distribute them, I'm able to survive larger potential blast-radius-type problems. So different servers, different racks, likely different data
07:30 - 08:00 centers. So in this case, I would think about it from a high availability perspective. If I thought that this was a specific region, well, we have this concept of availability zones, which again we'll cover later, but they're isolated sets of data centers with independent power, cooling and networking. So for my high availability, at minimum, well, I'd want to make sure I've got at least two instances of my service
08:00 - 08:30 in completely different sets of data centers, which means I'm also getting resilience from rack-level or node-level failures, because, hey, I've got that nice distance between them. Now, additionally, you do have to think about disaster recovery. With high availability I typically think within a certain location: they're close together, I can do different types of replication without any risk of data loss. But then if there was this horrible region-level outage
08:30 - 09:00 well, then we might think about disaster recovery, and so disaster recovery would mean I have the ability to run my workload in another region. So we think of this as DR. And what I would want here is a very big distance, I might think hundreds of miles, because I want to make sure that if there was some natural disaster here, it doesn't impact this region as well. Now the form that disaster
09:00 - 09:30 recovery takes will vary greatly. It depends on how long I have for my service to be up and running again. Maybe I create the new resource and restore a backup. Maybe I'm constantly replicating data and it's kind of ready to start up. Or maybe I really don't have a lot of time at all, and so it's constantly synchronizing, it's running, and I would just switch something over. In modern architectures I may even be running active in lots of regions. Then there's some endpoint that
09:30 - 10:00 enables the client to go to whichever one is currently active or is closest to them. And so that agility I spoke about, having that flexibility and only paying for what I'm consuming when I need it, also gives me something called elasticity for my service's capacity, because I can think about, well, over time the amount of work that my service needs to perform is going to vary depending on
10:00 - 10:30 maybe what's coming in. And so what I want is this ability to scale my various workloads. So let's now think about this idea that I want to provide scalability and as part of that I get this great elasticity. My service can grow and shrink based on the work it needs to actually do to
10:30 - 11:00 service the number of incoming requests. And many different types of workload have some element of seasonality. What I mean by that is, if we think about time and the work I need to do, very often it's not a flat line. Very often there's some peak of work to do, maybe a really quiet time, maybe another peak, maybe a more average amount of
11:00 - 11:30 time. And that could vary. This could be daily, weekly, monthly, yearly, maybe even every few years. Consider a tax application. It's really busy for one month out of the year. A pizza site is busy for a few hours a night, and even busier on a Friday and a Saturday night. The Olympics, hey, I'm hosting the Olympics. I'm busy every four years. And so there's this variance in the amount of work my service needs to do. And what I don't want to do is for my service to
11:30 - 12:00 always operate at a level that can handle the peak. That's what we have to do on premises. We have to buy all the servers for the busy times, and often they're very idle. What I want instead, for my service in the cloud, because I pay only when it's running on a per-second basis, is to change the amount of service I have at any moment in time to match the amount of work it actually needs to do. And that could be me manually doing that
12:00 - 12:30 scaling, or maybe I want to do that automatically. For example, I could say, hey, look, for the instances I'm running, if the CPU is over 70%, we'll add some more instances. If the CPU is less than 30%, let's remove some (a small sketch of this scaling logic follows this transcription). So we have this great flexibility. Now how can I do this scaling? One option would be to scale vertically. So I could do a
12:30 - 13:00 vertical scale. And what this means is I have my instance and, if I'm scaling vertically, well, I'm making it bigger. I'm adding more CPUs. I'm adding more memory. In reality, that would mean I have to stop the instance. I would stop the workload, resize it and start it again. That's because very few operating systems actually support adding, much less removing, CPUs or
13:00 - 13:30 memory, and even fewer applications would handle it, which means I'd have downtime. Also, I don't want just one instance. Remember, that's not good for my high availability, where I'd rather have more, smaller instances. And so while I can do that, we give this a little bit of a frowny face. We try not to do that. The much better option is to scale horizontally. So I can do horizontal scale, which means obviously
13:30 - 14:00 I'm scaling this way. And here we just have a certain number of instances. And if we're getting busier, well, we add some new ones. If we're getting quiet, well, we could remove some. And this is the type of scaling we want to do in the cloud. We give that a happy face. And this is how the cloud is designed because now I've got no downtime. I'm just adding and removing instances. Again, there'd be some kind of balancing solution that the client
14:00 - 14:30 would talk to, to then balance between the ones that exist at a moment in time. And we would typically never scale to fewer than two because, remember, for our high availability I always want at least two. Now, horizontal scaling does require your application to support more than one instance of it, but again, in modern architectures that's becoming very, very common. And so when you think of the cloud, we want to pay on a per-second basis for the amount of
14:30 - 15:00 work we're actually doing. And horizontal scale lets us do that. And we could automatically say, hey, look, CPU is busier than a certain amount or queue depth is above a certain number, let's add some; hey, we're not doing very much work, let's remove some instances. And that could be automatic, it could be scheduled, it could be manual. So those are the key concepts I wanted to cover in this. We think about: we have great agility, we can have many different types of service, we can change their sizes, we can change the service we want to
use. We have this concept of high availability by having multiple instances spread over different blast radiuses. And then we have this elasticity, the ability to scale: yes, vertically by making them bigger or smaller, but more commonly we're going to scale horizontally by adding and removing instances based on the amount of work, which varies over time.
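To make the point about multiple instances concrete, here is a minimal worked example of why running at least two instances across different blast radiuses improves availability. It assumes instances fail independently (which is exactly what spreading them across availability zones tries to achieve), and the 99.9% single-instance figure is purely illustrative, not the published Azure SLA for any particular service.

```python
# Composite availability when running N independent instances of a service.
# The service is up as long as at least one instance is up.
# The 99.9% single-instance figure is illustrative, not a real Azure SLA.

def composite_availability(single_instance_availability: float, instances: int) -> float:
    """Probability that at least one of `instances` independent copies is up."""
    probability_all_down = (1 - single_instance_availability) ** instances
    return 1 - probability_all_down

if __name__ == "__main__":
    single = 0.999  # one instance available 99.9% of the time
    for n in (1, 2, 3):
        print(f"{n} instance(s): {composite_availability(single, n):.7%}")
    # 1 instance(s): 99.9000000%  (~8.8 hours of downtime per year)
    # 2 instance(s): 99.9999000%  (~32 seconds per year)
    # 3 instance(s): 99.9999999%
```

The independence assumption is what distributing instances over different servers, racks and availability zones is meant to buy you; instances sharing one rack or one data center can fail together, and the arithmetic above no longer holds.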
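And here is a minimal sketch of the threshold-based horizontal scaling logic described in the transcript: scale out above roughly 70% CPU, scale in below roughly 30%, and never drop below two instances. It illustrates the idea only and is not Azure's autoscale engine; the 70/30 thresholds and the minimum of two come from the transcript, while the maximum of ten instances and the step size of one are assumptions made for the example.

```python
# Threshold-based horizontal autoscaling, sketched from the transcript's rule.
# Real platforms (such as Azure Monitor autoscale) add metric aggregation
# windows, cooldown periods and scheduled rules on top of logic like this.

SCALE_OUT_CPU = 70.0   # add an instance above this average CPU %
SCALE_IN_CPU = 30.0    # remove an instance below this average CPU %
MIN_INSTANCES = 2      # never fewer than two, for high availability
MAX_INSTANCES = 10     # assumed cap for the example, to bound cost

def desired_instance_count(current_instances: int, average_cpu_percent: float) -> int:
    """Return the instance count the deployment should move toward."""
    if average_cpu_percent > SCALE_OUT_CPU:
        target = current_instances + 1   # getting busier: scale out
    elif average_cpu_percent < SCALE_IN_CPU:
        target = current_instances - 1   # quiet period: scale in
    else:
        target = current_instances       # within the band: leave it alone
    return max(MIN_INSTANCES, min(MAX_INSTANCES, target))

if __name__ == "__main__":
    print(desired_instance_count(2, 85.0))  # 3: scale out under load
    print(desired_instance_count(3, 20.0))  # 2: scale in when quiet
    print(desired_instance_count(2, 10.0))  # 2: never below the HA minimum
```

Because each evaluation only adds or removes one instance, repeated evaluations track the seasonality described above: the instance count climbs during the peaks and drifts back toward the two-instance floor during the quiet periods.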