Amazon Interview Insights

STOCK EXCHANGE SYSTEM DESIGN | AMAZON INTERVIEW QUESTION DESIGN STOCK EXCHANGE

Estimated read time: 1:20

    Summary

    The video by Tech Dummies Narendra L provides an in-depth look into designing a stock exchange system, applicable to cryptocurrency or forex trading as well. It highlights the key elements required for an effective stock exchange system, such as real-time price updates and risk management. The video breaks down the architecture into major components like matching engines, API gateways, and risk management systems. It also details the process of handling and matching orders using data structures like queues and heaps. The focus is on high availability, low latency, scalability, and the use of microservices and APIs for efficient processing.

      Highlights

      • Designing a stock exchange system involves understanding software architecture suitable for various trading platforms. 💡
      • Orders have to be matched quickly and accurately, using heaps or linked lists to maintain efficiency. ⚡
      • A well-designed API gateway plays a crucial role in managing client requests and routing them to appropriate microservices. 🌐
      • The system should prioritize in-memory operations for high-speed transaction processing, minimizing latency. 🚀
      • Zookeeper helps in maintaining high availability by coordinating between active and passive matching engines. 👥
      • HFTs rely heavily on real-time data and efficient routing to perform trades with high precision and speed. 🕒
      • Using Kafka for message queuing and routing ensures robustness in order processing and coordination. 📬
      • Serializability and backup strategies are important for recovering states and maintaining order integrity in case of failures. 🔄

      Key Takeaways

      • Real-time price updates are crucial for stock exchange systems to ensure users can act swiftly amidst price fluctuations. 📈
      • A matching engine is the core component for balancing buy/sell orders efficiently, using data structures like heaps and lists. 🔄
      • High availability and scalability ensure the system can handle high volumes of orders and expand as needed. ⏱️
      • Risk management systems are essential for safeguarding transactions and maintaining system integrity before order execution. 🔒
      • HFT (High-Frequency Trading) systems use algorithms and machine learning to make rapid buying and selling decisions. 🤖
      • The system should maintain states in memory for faster processing, with periodic backups for disaster recovery. 💾

      Overview

      Creating a system for stock exchanges requires a deep understanding of matching engines and real-time processing. The matching engine acts as the brain, ensuring buyers' and sellers' orders are executed swiftly. Implementing this requires selecting the right data structure, such as heaps or linked lists, to sort and match orders efficiently.
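
As a rough illustration of that data-structure choice, here is a minimal order-book sketch in Python using two heaps: a max-heap of bids (via negated prices) and a min-heap of asks. The class and field names are our own, not from the video.

```python
import heapq

class OrderBook:
    """Toy order book: best bid and best ask in O(1), inserts in O(log n)."""

    def __init__(self):
        self._bids = []  # (-price, quantity): heapq is a min-heap, so negate
        self._asks = []  # (price, quantity)

    def add_buy(self, price, qty):
        heapq.heappush(self._bids, (-price, qty))

    def add_sell(self, price, qty):
        heapq.heappush(self._asks, (price, qty))

    def best_bid(self):
        return (-self._bids[0][0], self._bids[0][1]) if self._bids else None

    def best_ask(self):
        return self._asks[0] if self._asks else None

book = OrderBook()
book.add_buy(5250, 500)     # 500 shares bid at 5,250
book.add_sell(5252, 1650)   # 1,650 shares offered at 5,252
print(book.best_bid(), book.best_ask())  # (5250, 500) (5252, 1650)
```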

      An API gateway acts as the first point of contact for client requests, ensuring secure and efficient communication with internal services. It manages and routes thousands of requests per second, even allowing High-Frequency Trading (HFT) entities to plug into the system effectively via APIs.

      High availability and scalability are paramount. Zookeeper aids in coordinating active and passive engines to prevent downtime, while Kafka helps process and queue orders reliably. The system must perform most operations in memory to reduce latency and keep up with the fast-paced trading environment.

STOCK EXCHANGE SYSTEM DESIGN | AMAZON INTERVIEW QUESTION DESIGN STOCK EXCHANGE Transcription

            • 00:00 - 00:30 Hello everyone, my name is Narendra, and in this session let's understand the software architecture, or system design, of a stock exchange. This system design also holds good for cryptocurrency or forex trading. Thanks a lot for all the support until now and in the future as well. If you like the content on this channel and you'd like to share or buy a cup of coffee for me, you can do so by joining the channel membership; I have custom emojis for
            • 00:30 - 01:00 members, and I'm also going to feature your name at the end of the videos I upload in future. Thanks a lot for the support. We can use this system wherever we have to match supply to demand or vice versa. So here are the requirements and the goals of our system design. The requirements we want to cover: first, users should be able to buy and sell stocks on this platform. The second one
            • 01:00 - 01:30 is real-time price updates, because we need to keep sending real-time price information to the users; a user will be watching the price fluctuations and then placing an order to buy or sell, so it has to be real-time. The third one is basic functionality, like viewing all the stocks a user owns and their order history and information. The fourth one is risk management: we have to have a system which analyzes the risk before executing the order
            • 01:30 - 02:00 matching. And what are the goals of our system? The first one is that we have to process at least 0.1 million orders per second. The second: thousands of stocks from registered companies are on the platform, so we have to process orders for all of them with low latency. The third one is geo-specific: we don't have to design the system to cater to users from all over the world, because exchanges are usually specific to a country, and even though
            • 02:00 - 02:30 there are users from outside the country, that's fine; our goal here is to design the system specifically for users in one geography. The fourth one is low latency: we have to process the order matching as fast as possible, in around 300 milliseconds or so. And the fifth one is high availability and scalability: the system should always be available for users to place orders to buy or sell, and it should be able to add more companies or stocks to the
            • 02:30 - 03:00 platform and keep scaling linearly. First let's design the core component of our system, the matching engine. But before understanding the matching engine, you need to understand what exactly a stock exchange does. A stock exchange is a place, usually a regulated one, where people come together with something to buy and something to sell, and the matching engine matches those two together. So what does it look like?
            • 03:00 - 03:30 It usually looks like this. This image is for the Reliance stock, which has these many orders waiting to be bought, and the right side shows the orders waiting to be sold. This one is a screenshot taken from a cryptocurrency exchange where people trade bitcoins. It all boils down to one thing, called the order book, which usually looks like this. In this example we can see that there are four
            • 03:30 - 04:00 columns: the buy quantity is the number of shares waiting to be bought, and the buy price is the price a buyer is willing to pay for this particular share. The sell price, on the right side, is the price a seller is quoting for a share, and the sell quantity is the number of shares they want to sell. So in total we have around 500 shares waiting to be bought at a rate of 5,250, and
            • 04:00 - 04:30 we have 1,650 shares waiting to be sold at 5,252. So now let's design our matching engine. The characteristics of our matching engine should be: it should be able to add an order; it should be able to cancel an order if I no longer want to buy or sell; and it should be able to split an order, meaning that if I want to buy ten stocks and nobody is selling ten stocks as a whole, our matching engine should be able to buy in splits, whatever it can get. And it should
            • 04:30 - 05:00 be able to match, obviously. The second characteristic is that it should do all of these actions efficiently; ideally it is always good to do them in O(1), or otherwise in O(log n) time, and it should do everything reliably. So how does the matching engine work? It's not really complicated, it's very simple. Consider you have these orders to buy. Usually people try to buy shares
            • 05:00 - 05:30 or stocks or any commodity at a cheap price, so the buyer who stands out is the one who is willing to buy at the higher price. Anyone who is trying to sell doesn't want to sell cheap, so sellers will quote higher prices, but the one who stands out in the sell queue is the one who is willing to sell most cheaply. So the buy order list will be
            • 05:30 - 06:00 sorted in decreasing order, where the buyer who wants to buy at the highest price stands at the front. Here in this case there are a couple of orders from people who want to buy some stocks: here is an order with a price of $10 for 5 stocks; here, at a price of $9, someone wants to buy 10; and this one at a price of $8 wants to buy 5 stocks. And in the sell queue there are a lot
            • 06:00 - 06:30 of orders; we want to sort the sell queue in increasing order and the buy queue in decreasing order. Here the orders which stand out are the ones willing to sell cheaper: in this case there is an order at $8 for quantity 5, one at $9 for quantity 5, one at $9 for quantity 20, and so on, because no one wants to sell for cheap, right?
            • 06:30 - 07:00 Most likely prices will stay the same or go higher. So when we sort in increasing order, all the orders selling at a really high price go to the end, and the orders willing to sell at a cheaper price come to the front; likewise the orders willing to buy at a higher price come at the start of this list. So now, how do we match? It's very
            • 07:00 - 07:30 simple; it's just like matching array elements. What we have to do is take the first order from the buy list and then try to match it from the start of the sell list. $10 and 5: someone wants to buy five shares at a $10 price, but someone is also selling 5 shares at an $8 price. That's definitely a good match; we are even getting it $2 cheaper. We can match these because the quantities also match, which means
            • 07:30 - 08:00 this order and this order can be matched, and the buyer who wanted to buy at $10 is actually getting it for only $8. These two are matched and we can execute the trade. Consider it executed; I'm going to remove both from the lists. Now those two orders are gone, and who is standing out? There is a buyer who wants 10 stocks at $9, and someone selling stocks at a $9 price with
            • 08:00 - 08:30 a smaller quantity. We can definitely match these, and this is how: this one and this one get matched. The problem here is the quantity: he wants to buy 10, but this order is selling only 5 shares. There are a couple more orders we could buy from, but we can't really match across orders like that. What we do is purchase just the 5 stocks from here, and then we can either
            • 08:30 - 09:00 update the buy order's quantity to 5, or, as another strategy, remove it completely and add it back into the buy list, so it comes back as $9 for 5. Why does it come back to the front? Every time a new order comes in, the buy list is kept sorted in decreasing order and the sell list in increasing order, so the same order which was split comes back and sits at the front of the buy list,
            • 09:00 - 09:30 and we can match it now: this and this get matched, and these two can be executed. Similarly we can keep executing, and the list keeps draining. As and when new orders come in, they are inserted in order; say someone wants to buy 20 shares at $10, and if there is a sell we can match, $10 against $9, $1 cheaper, we keep matching and executing.
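
Here is a minimal sketch of that match-and-split loop using the walkthrough's numbers. It assumes the heap-based book from the earlier sketch, ignores time priority for equal prices, and executes each trade at the seller's quoted price, as in the $10-buyer-pays-$8 example.

```python
import heapq

def match(buys, sells):
    """buys: max-heap of (-price, qty); sells: min-heap of (price, qty).
    Matches while the best bid covers the best ask; a partially filled
    order is pushed back with the leftover quantity (the 'split')."""
    trades = []
    while buys and sells and -buys[0][0] >= sells[0][0]:
        neg_bid, bid_qty = heapq.heappop(buys)
        ask, ask_qty = heapq.heappop(sells)
        filled = min(bid_qty, ask_qty)
        trades.append((ask, filled))              # execute at the ask price
        if bid_qty > filled:                      # re-queue the remainder
            heapq.heappush(buys, (neg_bid, bid_qty - filled))
        if ask_qty > filled:
            heapq.heappush(sells, (ask, ask_qty - filled))
    return trades

buys = [(-10, 5), (-9, 10), (-8, 5)]   # $10x5, $9x10, $8x5
sells = [(8, 5), (9, 5), (9, 20)]      # $8x5, $9x5, $9x20
heapq.heapify(buys)
heapq.heapify(sells)
print(match(buys, sells))  # [(8, 5), (9, 5), (9, 5)] -- the $8 fill, then the $9 split
```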
            • 09:30 - 10:00 So what data structures can we use to do this? The first option is plain lists, the second is linked lists, and the third is heaps. With a linked list you can keep the list in increasing or decreasing order without a separate sort step, because we usually start from one order, and as we keep appending we can always check and insert in the right place. The same goes for the list: we start with
            • 10:00 - 10:30 an empty list, and as entries arrive we search for the proper place where each order should be inserted and insert it based on its price. Now, the time complexity: in the case of lists, insertion and deletion are O(n) while access is O(1); in the case of a linked list the insert itself is O(1), which is why it's a good candidate here. And you can also use heaps, a max heap
            • 10:30 - 11:00 and a min heap: a max heap for the buy side, because we always want the order with the highest price, and a min heap for the sell side, and then we can perform these operations in O(log n) time as well.
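
For the plain-list option, the "search for the proper place and insert" step is a binary search plus an insert. A small sketch with bisect; the insert itself shifts elements, so it is O(n) overall, while reading the front of the list stays O(1):

```python
import bisect

sells = []                       # (price, qty), kept ascending by price
for order in [(9, 5), (8, 5), (9, 20)]:
    bisect.insort(sells, order)  # binary search for position, then insert
print(sells[0])                  # best (cheapest) ask: (8, 5)
```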
            • 11:00 - 11:30 A couple of things to remember here. First, we need to do all of these operations in memory; there is no escape from that, because we want to perform all of this efficiently. As soon as we write this list to a database or to disk, we can no longer function efficiently, and we can't really process thousands or millions of orders per second. So the key takeaway is that we have to do the whole thing in memory. The second thing is that whatever data structures
            • 11:30 - 12:00 we use should be serializable, because we always want to keep this state somewhere: periodically we take a backup and keep it in a cache, so if our matching engine crashes, or the machine it runs on crashes, we can go back to the persistent storage, repopulate the list and all the orders in it, and get back to the same state. You can think of the whole thing as a state
            • 12:00 - 12:30 machine which does all of these things. The next thing we need to understand is what information we need to capture for a specific order. If we use a linked list, what do we store in each node? Or if it's a list, a list of objects or dictionaries or tuples, what do we capture? The first thing is the ID of the order; next, the price at which we want to sell or buy; then the quantity of the
            • 12:30 - 13:00 stocks or shares we want to buy; then the status, usually "queued", which we can change to "executed" once it executes (once we delete the order we won't need the status anyway, but you can use it for other use cases too); and of course metadata, to store any extra information. So if you're using lists, this could look like a struct, or a
            • 13:00 - 13:30 dictionary, or even a tuple with only the values in it; if it is a linked-list node, it will have all of these keys in it, and you basically link the nodes together.
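
A sketch of that per-order record as a Python dataclass; the field names follow the list above, and the status strings are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    order_id: int
    price: float
    quantity: int
    status: str = "queued"       # flipped to "executed" after a match
    metadata: dict = field(default_factory=dict)

order = Order(order_id=42, price=5250.0, quantity=500)
# The same record as a bare tuple, values only:
as_tuple = (order.order_id, order.price, order.quantity, order.status)
```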
            • 13:30 - 14:00 Now let's understand the system design diagram. Here you can see there are users and there is HFT. Users are normal users who use the UI to interact with the system, whereas HFT uses API calls directly. What does HFT stand for? High-frequency trading. Here, companies or individuals usually own a couple of servers and have sophisticated algorithms or machine learning techniques to identify when to place an order and when to sell. As ordinary users, we look at the information on the UI and take a
            • 14:00 - 14:30 decision, but HFT consumes all the price changes, how many stocks were sold and bought, all of this information, and based on that the algorithms decide when to place an order and when to sell those stocks for profit. Companies usually make a lot of money, crores and crores, using HFT. That brings us to the first entry point for all our APIs: the API gateway. The HFT systems might be placing calls here, and
            • 14:30 - 15:00 users on the UI or mobile apps could be placing HTTP REST API calls to the API gateway. So what is the functionality of the API gateway? It takes all the API calls from the clients (users and HFT are all clients) and routes them to the appropriate services. Here I have shown only the accumulator and
            • 15:00 - 15:30 the risk management system, but internally there could be any number of microservices to call. Just for your understanding, the API gateway mainly performs actions like protocol transformation: all of these APIs are exposed over HTTP, but internally the services might use some other protocol; it could be
            • 15:30 - 16:00 RPC, another REST call, or anything else, it doesn't matter. Usually the external APIs are accessed over HTTP or HTTPS, and inside it could be plain HTTP, so the security layer is terminated here. There could also be a firewall in front, a web application firewall, or WAF as you
            • 16:00 - 16:30 can call it. Once the request leaves the API gateway, it goes through the accumulator. In the accumulator, all the orders are given a sequence number based on a number of different criteria. For your understanding, all you have to know is that when an order comes in, it needs a unique ID, and the unique ID is assigned at this point. You could be wondering why in the world we assign a unique ID
            • 16:30 - 17:00 here rather than when we save to the database: as you will see, a lot of the functionality here is done entirely in memory, without a database, so we need a unique identification number up front, and that happens here. We assign a unique ID to every order request, whether it is a buy or a sell order.
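
A minimal sketch of that sequencing step, assuming a single in-memory counter; a real accumulator would need IDs that survive restarts and coordinate across instances:

```python
import itertools
import threading

class Accumulator:
    """Stamps every incoming order with a unique, monotonically
    increasing ID before it touches any downstream service."""

    def __init__(self):
        self._seq = itertools.count(1)
        self._lock = threading.Lock()    # serialize concurrent requests

    def assign(self, order: dict) -> dict:
        with self._lock:
            order["order_id"] = next(self._seq)
        return order

acc = Accumulator()
print(acc.assign({"symbol": "GOOG", "side": "buy", "qty": 10}))
```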
            • 17:00 - 17:30 The order information is then sent to the risk management system, also called the RMS. What it does is make sure the order is safe to enter the rest of the system; there are many checks, mostly verifying that the order is within predefined parameters of price, size, capital, and so on. These are the
            • 17:30 - 18:00 risk management checks that practically any stock market performs for any given order; I will post some links in the description where you can read all about them.
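
Purely illustrative pre-trade checks in that spirit; the parameters (a 10% price band, a maximum order size, a wallet-balance check) are assumptions, not the video's actual rules:

```python
def rms_accept(order, last_price, wallet_balance,
               max_qty=10_000, band=0.10):
    """Return True if the order passes the basic risk checks."""
    if order["qty"] <= 0 or order["qty"] > max_qty:
        return False                                  # size check
    low, high = last_price * (1 - band), last_price * (1 + band)
    if not low <= order["price"] <= high:
        return False                                  # price-band check
    if order["side"] == "buy" and order["price"] * order["qty"] > wallet_balance:
        return False                                  # capital check
    return True

print(rms_accept({"side": "buy", "price": 101.0, "qty": 5},
                 last_price=100.0, wallet_balance=1_000.0))   # True
```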
            • 18:00 - 18:30 Once the order passes through risk management, if it is safe to process, it goes to the router. What happens in the router is that, based on the kind of stock you want to sell or buy, the order is redirected to one of the queues. We will have n queues, equal to the number of companies we handle. Suppose we handle 1,000 companies in our system; obviously we'll have 1,000 stocks to handle, so there will be 1,000 queues. All the requests for Google stock, for example, will be
            • 18:30 - 19:00 redirected over here, and all the buy and sell orders for Tesla will be redirected over there. So at any given point in time, any one queue holds orders for only one stock. This router could be anything: you could implement it using Kafka Streams if you are using Kafka as the queue, or, if you're on AWS, SNS can act as a router based on your
            • 19:00 - 19:30 routing key, and these queues could be SQS, or Kafka as well.
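
One way to sketch that router, assuming the kafka-python client and a per-stock topic naming scheme of our own invention:

```python
import json
from kafka import KafkaProducer   # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def route(order):
    """One topic per stock (e.g. 'orders.GOOG'), so each queue only
    ever holds orders for a single symbol."""
    producer.send(f"orders.{order['symbol']}", order)

route({"symbol": "GOOG", "side": "buy", "price": 101.0, "qty": 5})
producer.flush()
```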
            • 19:30 - 20:00 Once an order is placed in a queue, it is picked up by our matching engine; we have already discussed how matching works, a simple algorithm which runs completely in memory. The idea here is: consider you have three queues; there will be one virtual machine per queue. This VM is responsible for handling Apple, this one for Google, and this one for Tesla. Each runs one process which reads from its queue and matches orders using two arrays, one for buy and
            • 20:00 - 20:30 one for sell. It's the same matching algorithm we discussed, running inside the virtual machine, using heaps or linked lists or arrays, consuming from the queue and matching. Alternatively, if you have very few orders per second, you could run fewer virtual machines; say, for example,
            • 20:30 - 21:00 if you're handling very little traffic, you can have just a couple of virtual machines and run one or more instances of the matching engine on the same machine, consuming from multiple queues. For example, Tesla, Google, and Apple might
            • 21:00 - 21:30 all be consumed on the same machine in different processes, with an instance of the matching engine running in each, all executing on the same machine. It all depends on how you want to design it, based on the requirements and the traffic. So the idea here, again, is that
            • 21:30 - 22:00 at any given point in time we need two matching engine instances per stock, or two matching engine processes per stock, and they should be deployed in different machines and different data centers. In this case, for Google, we will have an active matching engine over here and a passive matching engine over there. Why do we need that? Because we need high
            • 22:00 - 22:30 availability and reliability: if for some reason our active matching engine fails, the passive matching engine immediately takes over, starts processing the orders, and does everything the active matching engine would have done. We will need Zookeeper here, because we need someone to keep track of which engine is
            • 22:30 - 23:00 active, among its other responsibilities. Normally, if both of these engines, active and passive, are up and running, they cannot both be updating the databases; only the active engine should be writing the data to the database, the backups, and everything else. So we need these systems to know which is the active matching
            • 23:00 - 23:30 engine, because both will be running and neither knows on its own "am I the active matching engine?". Zookeeper helps you identify the active matching engine. Zookeeper has a concept called the sequential znode and the ephemeral znode; an ephemeral znode is a short-lived znode which only has an entry for a given server while there is an active connection from that
            • 23:30 - 24:00 machine. In our case the ephemeral znodes will have two entries, one for the primary, active matching engine and one for the passive matching engine. With sequential nodes, whoever registers first gets the first entry and the second follows, so the first one is always the active instance. If the active machine gets disconnected, its entry goes away, and
            • 24:00 - 24:30 the second becomes active; the old entry stays gone unless the machine comes back, at which point its ID is registered again and it becomes the passive one.
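
A sketch of that election using the kazoo client for Zookeeper; the znode paths are invented, and kazoo also ships a higher-level Election recipe that wraps this same ephemeral-sequential pattern:

```python
from kazoo.client import KazooClient   # pip install kazoo

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Each engine registers an ephemeral+sequential znode. The entry vanishes
# when the engine's session dies, and the lowest sequence number is active.
path = zk.create("/matching/GOOG/engine-",
                 ephemeral=True, sequence=True, makepath=True)
me = path.rsplit("/", 1)[1]
peers = sorted(zk.get_children("/matching/GOOG"))
print("active" if peers[0] == me else "passive")
```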
            • 24:30 - 25:00 One more interesting thing you need to know is that both of these matching engines run active-active: both consume the messages sent from the router to the queue at the same time. That means everything is done on both sides at roughly the same time; there could be a small difference, but the idea is that whatever the active matching engine is processing, the passive matching engine is processing too. The only difference is that the passive one does not write or send any output to the downstream systems, while the active one keeps sending. Why do we have to do this? If the active engine
            • 25:00 - 25:30 goes down, the passive matching engine should take over immediately, so it must already be in the same state as the active matching engine. This kind of design is needed because we are doing everything in memory; that's why both the active and passive matching engines keep performing the same work. So what
            • 25:30 - 26:00 happens next? Once the active matching engine finds a match for a buy order in the sell array or linked list, it sends a message to another router saying "I found a matching order", and that is sent on to a number of different partitions. In the case of Kafka, for the n
            • 26:00 - 26:30 threads or nodes on the consumer side, we will have n partitions. You can design this part like we did before: an individual queue for each stock, or a partition per stock with one server consuming each. What is the idea of the primary data server? It basically computes the stock tick. A stock tick
            • 26:30 - 27:00 looks like this: it is simply the representation, or measurement, of the minimum upward or downward movement in the price on this stock exchange. Once an order is matched, the server receives the information about the order, the trade that just happened, converts the price change into a tick, and those
            • 27:00 - 27:30 changes are written into an RDBMS. The idea in the database layer is that this RDBMS is sharded, and you can shard it by stock, so you'll have one shard per stock: this could be the Apple shard, this the Google shard, and so on, so you will be maintaining thousands of shards. Or, if
            • 27:30 - 28:00 your data volume is not too large, you can use hashed sharding or any other strategy, it's up to you. Usually sharding by the stock itself is a good approach: you have all of a specific stock's data in one shard, it is easier to scale, and you can achieve more reads and writes within the same shard.
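
A toy version of the two sharding strategies just described; the shard names are invented, and a stable hash is used so the mapping doesn't change between processes:

```python
import hashlib

def shard_by_stock(symbol: str) -> str:
    """One shard per stock, e.g. shard_AAPL, shard_GOOG."""
    return f"shard_{symbol}"

def shard_by_hash(symbol: str, n_shards: int = 16) -> str:
    """Hashed sharding for smaller datasets; md5 keeps the mapping stable."""
    digest = hashlib.md5(symbol.encode()).hexdigest()
    return f"shard_{int(digest, 16) % n_shards}"

print(shard_by_stock("AAPL"), shard_by_hash("AAPL"))
```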
            • 28:00 - 28:30 This primary data server is also going to write into time series databases, which will be used to render graphs, like the price variation at any given point in time and so on; time series databases are really good at that. Who consumes that data? The UI and the front-end application need to show graphs like this one, so you really want a time series database; it is not a good idea to use an RDBMS for that access pattern. Also,
            • 28:30 - 29:00 all of this information is consumed by your stream processing engine, where you can do a lot of other stream processing, like fraud detection, other machine learning, whatever you want; basically you will have stream processing over here. So I guess we have done one full cycle of how an order starts its journey from the
            • 29:00 - 29:30 API gateway into risk management and the router, goes into a queue, from the queue to the active matching engine, from the matching engine to the next router, and then gets saved and everything. Meanwhile, this primary data server can talk to the payment system and deduct the money from the user's wallet or account once the matching is done, and that information is
            • 29:30 - 30:00 also updated over here. So we need a payment system; that is a big system in itself, and I don't want to cover it now. If the user wants to add some money, you can make this API call and redirect it to the bank or whatever it is, so the user loads money into the payment wallet; this system takes care of that. You can also see that the UI is coupled with a
            • 30:00 - 30:30 CDN, or these APIs can be coupled to the CDN, to cache some things, like the graphs I showed you here: once this data exists it is mostly historical, so you don't need to query the time series database every time; you can cache it. The other important thing we need to discuss is the reliability of this matching
            • 30:30 - 31:00 engine. We already spoke about the passive and active engines, but what happens if both machines go down? In that case, the way it works is that on a specific interval, say every minute, the data in the active matching engine is dumped into Redis or some other storage engine. The
            • 31:00 - 31:30 idea, as I already mentioned when discussing the algorithm, is that whatever data structure you use should be serializable, so that you can dump all the recent data into snapshots. In other words, in the worst case, if both matching engines are down, when they come back they can read the last-known state of the machine from
            • 31:30 - 32:00 here, and that way you get back all the orders which were already in the order book. One more thing to understand: Kafka supports transactions and manual acknowledgments, which you can use when building
            • 32:00 - 32:30 your matching algorithm as well, to make the system more robust. Also, as and when an order is received in the active or passive engine, you can write a copy of the order into Cassandra. This is not really the job of the matching engine; there could be one more thread running which
            • 32:30 - 33:00 keeps writing into Cassandra. That way, after you load the latest snapshot, you can query the remaining order information from Cassandra and load it back into your matching engine's memory, so you have the latest state available. There are a lot of different strategies you can use here; somehow you need to persist the latest known state into Redis and be able to read it back.
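
A minimal snapshot/restore sketch with the redis-py client, assuming the book state is a JSON-serializable pair of lists as in the earlier sketches:

```python
import json
import redis   # pip install redis

r = redis.Redis()

def snapshot(symbol, buys, sells):
    """Dump the serialized book state; the video suggests roughly
    once a minute."""
    r.set(f"book:{symbol}", json.dumps({"buys": buys, "sells": sells}))

def restore(symbol):
    """Reload the last-known state after a crash of both engines."""
    raw = r.get(f"book:{symbol}")
    return json.loads(raw) if raw else {"buys": [], "sells": []}

snapshot("GOOG", buys=[[-9, 5]], sells=[[9, 15]])
print(restore("GOOG"))
```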
            • 33:00 - 33:30 So that's about it; I guess I have covered most of the information you need. The other thing to understand is that this primary data server also sends the information to the pop servers. What the pop servers do is provide updates to the users or HFT in real time: basically they broadcast using WebSockets, or over
            • 33:30 - 34:00 TCP or UDP ports. The idea here is that you should be able to send the price changes, or ticks, to the end users as soon as possible, because HFT is high-frequency trading, where they decide what stocks to purchase or sell with something like ten-millisecond
            • 34:00 - 34:30 precision, so they need to know as soon as possible, and they will be listening to these pop servers. There are other strategies as well. In real exchanges, they assign only a specific number of connections to each server, because they don't want to overload it, and instead of looping through each and every connection to send updates, they try to broadcast,
            • 34:30 - 35:00 because when you loop through, the very first connection might get the tick or price information quickly while the last connection gets it a little late; they don't want anyone to lose out like that, since everyone wants to receive it as fast as possible. There are a lot of strategies here as well, but that is out of scope for now.
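
A toy pop server using the websockets library (v10+ API assumed). websockets.broadcast() queues the frame on every connection without awaiting each client in turn, which is the "broadcast instead of looping" idea, so one slow client doesn't delay the others:

```python
import asyncio
import json
import websockets   # pip install websockets

clients = set()

async def handler(ws):
    clients.add(ws)              # each user/HFT connection registers here
    try:
        await ws.wait_closed()
    finally:
        clients.discard(ws)

def push_tick(tick):
    websockets.broadcast(clients, json.dumps(tick))

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        push_tick({"symbol": "GOOG", "price": 101.2})
        await asyncio.Future()   # serve forever

asyncio.run(main())
```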