Oracle Database Optimization

Explaining Explain Plans

Estimated read time: 1:20


    Summary

    Chris Saxon, part of the Oracle Developer Advocate team, provides insights into understanding execution plans for optimizing SQL queries. He draws parallels with real-world navigation, explaining how to read and traverse execution plans, identifying common mistakes, and providing solutions such as gathering statistics or creating indexes to improve query performance. The session emphasizes the importance of execution plans over explain plans, noting the discrepancies in row estimates and actual data retrievals. Various strategies like indexing and understanding logical I/O are discussed to enhance SQL performance.

      Highlights

      • Chris Saxon highlights the importance of reading execution plans to optimize SQL queries. πŸ“Š
      • Execution plans are likened to driving directions, guiding you on how the database retrieves data. πŸ—ΊοΈ
      • The session demonstrates transforming scalar subqueries into joins for better performance. πŸ”„
      • A practical explanation is given for when to gather statistics to correct execution plan estimations. πŸ“Š
      • Chris Saxon discusses logical I/O and its significance in assessing query performance. πŸ’‘

      Key Takeaways

      • Execution plans are crucial for understanding and improving SQL query performance. πŸš€
      • Indexes can drastically reduce I/O operations, speeding up query execution. πŸ“ˆ
      • Logical I/O is key in measuring query efficiency and scalability. πŸ”
      • Understanding execution plans' tree structure helps in identifying performance bottlenecks. 🌳
      • Not all performance improvements come from SQL tuning; sometimes business logic needs revisiting. πŸ’‘
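Two of the takeaways — logical I/O as the measure of query efficiency, and indexes as a way to cut I/O — can be checked directly. A minimal sketch in SQL*Plus; the `bricks` table and `colour` column are hypothetical, not from the session's actual schema:

```sql
-- Show execution statistics (including logical I/O) without printing rows.
SET AUTOTRACE TRACEONLY STATISTICS

SELECT * FROM bricks WHERE colour = 'red';
-- "consistent gets" in the statistics output is the logical I/O count.
-- Re-running the query after creating an index on the filter column
-- should show a markedly lower consistent-gets figure.
```

Comparing consistent gets before and after a change is a scalable measure: unlike elapsed time, it is largely unaffected by caching and machine load.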

      Overview

      Chris Saxon takes the audience on a journey through the realm of reading execution plans, wrapped in an engaging and practical session aimed at optimizing SQL queries. With a light-hearted tone, he likens execution plans to driving directions that show you the route your database takes to fetch results quickly and efficiently.

        Participants explore relatable examples where missteps in execution plans, like incorrect row cardinality or excessive full table scans, affect overall performance. Chris brilliantly folds in explanations about potential solutions, such as gathering better statistics or adding indices to make queries fly.

          The session emphasizes the significance of accurate row estimates in execution plans. Real-life database tuning stories underline the importance of aligning actual data retrieval with predicted outcomes, advising viewers on recognizing when business logic itself requires a tweak alongside SQL tuning for optimal performance.
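The scalar-subquery-to-join transformation mentioned in the highlights might look like the following sketch. Table and column names are hypothetical, and the rewrite assumes `colour` values are unique in `colors`:

```sql
-- Before: the subquery in the SELECT list may run once per colors row.
SELECT c.colour,
       ( SELECT COUNT(*)
         FROM   bricks b
         WHERE  b.colour = c.colour ) AS brick_count
FROM   colors c;

-- After: one outer join plus GROUP BY; each table is read once.
-- COUNT(b.colour) counts only matched rows, so unmatched colours get 0.
SELECT    c.colour, COUNT(b.colour) AS brick_count
FROM      colors c
LEFT JOIN bricks b ON b.colour = c.colour
GROUP BY  c.colour;
```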

            Chapters

            • 00:00 - 00:30: Introduction and Session Overview In this introductory chapter, Chris Saxon, a member of the Oracle developer advocate team, introduces the session's topic on reading execution plans. He highlights the goal of helping users maximize their use of Oracle databases, emphasizing both technical improvement and enjoyable learning experiences. The session builds upon the previous month's discussion, which focused on optimizing queries, providing a detailed continuation of enhancing query efficiency and understanding execution plans.
            • 00:30 - 01:00: Understanding SQL Execution Plans Understanding SQL Execution Plans introduces the basics of interpreting execution plans, after previously covering complex topics like date criteria. Emphasis is placed on simplifying the subject matter, with encouragement for attendees to engage actively by asking questions, either through chat or by unmuting, while observing proper protocol when not speaking.
            • 01:00 - 01:30: Reading Execution Plans The chapter titled 'Reading Execution Plans' discusses the declarative nature of SQL, highlighting its ability to determine the most efficient way to retrieve desired results. However, it also addresses the challenge that arises when the database's execution decisions lead to slow performance and how one might go about identifying and rectifying these issues to optimize query speed.
            • 01:30 - 02:00: Explaining Key Concepts The chapter titled 'Explaining Key Concepts' discusses strategies for optimizing SQL queries. It opens by highlighting a common scenario where someone seeks advice on improving SQL performance. Experts often suggest examining the execution plan as a solution. The execution plan includes various details such as joins and full table scans. However, many people find themselves confused by the technicalities of the execution plan and struggle to understand how it can help them improve SQL performance.
            • 02:00 - 02:30: Examples and Demos The chapter 'Examples and Demos' focuses on optimizing queries and understanding how database optimizers plan and execute routes to achieve efficient query performance. It draws a parallel between mapping out driving directions with arrows and how optimizers use plans to guide query execution. The narrative provides insights into interpreting these plans for better optimization.
            • 02:30 - 03:00: Scalar Subqueries and Optimization The chapter discusses the concept of scalar subqueries and the optimization of execution plans in database systems. It begins with a real-world analogy of traveling, as the speaker mentions their annual trip from the UK to San Francisco for an event. The explanation uses this travel example to relate to following directions in execution plans for databases.
            • 03:00 - 03:30: Comparing Different Plans The chapter discusses the complexity of execution plans in computing, comparing them to travel plans. While one might expect a straightforward path like flying from London to San Francisco, execution plans often resemble a convoluted map filled with full scans, joins, and other technical details that do not provide clear guidance on navigating from start to finish. The analogy highlights the confusing nature of these plans, which lack the simplicity of following direct routes like driving directions.
            • 03:30 - 04:00: Questions and Comments The chapter titled 'Questions and Comments' includes a transcript that discusses the process of traversing through a plan. The speaker mentions having an execution plan to start drawing arrows and intends to demonstrate examples of this process. The chapter aims to walk through some common problems encountered in execution plans and offers potential resolutions. The discussion begins with a prelude to the detailed process of drawing arrows.
            • 04:00 - 04:30: Execution vs. Explain Plans The chapter titled 'Execution vs. Explain Plans' focuses on the structure of an execution plan, which is depicted as a tree. The execution plan is not literally a botanical tree, but rather a data structure known as a directed acyclic graph (DAG). In this structure, each operation can have at most one parent and potentially many children. The relationship between parent and child operations within the execution plan is indicated through indentation; the more an operation is indented to the right, the further it is in the hierarchy.
            • 04:30 - 05:00: Gathering Statistics and Improving Plans The chapter titled 'Gathering Statistics and Improving Plans' discusses the importance of preserving indentation, especially in text-based plans. It highlights a common issue observed in forums where plans are often posted without proper indentation, making it difficult to understand the sequence of operations. Proper indentation is crucial in visually conveying the structure and flow of the plan.
            • 05:00 - 05:30: How to Optimize SQL Queries The chapter focuses on optimizing SQL queries, emphasizing the complexity that arises when dealing with extensive execution plans involving multiple tables and numerous lines. It underscores the challenges in understanding the hierarchical relationships between elements in such plans. To facilitate easier interpretation and collaboration, the chapter stresses the importance of maintaining formatting consistency and using fixed-width fonts when sharing text-based files. This practice helps in clearly depicting the relationships and structure of SQL execution plans.
            • 05:30 - 06:00: Real-World Examples The chapter 'Real-World Examples' explains how to discern parent-child relationships in hierarchically structured operational data using indentation as a guide. It uses a specific courier format display to showcase these relationships visually, aiming to clarify which operations are parent tasks and which are their respective children by observing the indentation patterns. A practical example given is identifying the parent operation of a 'hash join' task, which is the first operation listed above it that is less indented, and correspondingly identifying its children as those operations indented more to the right.
            • 06:00 - 06:30: Leveraging Tools like SQL Developer The chapter 'Leveraging Tools like SQL Developer' discusses the intricacies of SQL operations such as hash joins, focusing on analyzing the hierarchy of operations in a query execution plan. It highlights the structure of parent-child relationships within the SQL operations, where a 'parent' operation in a plan is one indented above, and a 'child' operation is one indented below. The chapter uses a metaphorical language, referring to operations as 'children' and 'bricks', to conceptualize the full scans and their connections in query optimization processes.
            • 06:30 - 07:00: Common Mistakes and How to Avoid Them This chapter discusses common mistakes in understanding hierarchies, specifically focusing on the parent-child relationship within data structures. It clarifies the concept of siblings and children within a hierarchical context. The chapter emphasizes understanding these relationships by considering the depth and level of indentation in diagrams or setups. Additionally, it suggests visualizing the hierarchy by drawing it out, which aids in better planning and comprehension of the structure.
            • 07:00 - 07:30: Conclusion and Final Remarks In the conclusion and final remarks chapter, the discussion is centered around joining database tables and understanding the traversal of plans in database management. The process involves starting at the top of a structure and working down to the unvisited leaves, illustrating a hierarchical approach to database traversal. Further details on how this traversal works are not provided in this segment.
            • 07:30 - 08:00: Q&A Session The chapter 'Q&A Session' delves into a process similar to navigating through a data structure, focusing on performing a depth-first search on a tree. The discussion involves a detailed explanation of starting at the top of the structure, moving down to explore each branch thoroughly before backtracking and proceeding to the next branch. Specific examples include conducting a full scan of a colors table, moving back to parent operations, and iterating this process, analogous to a depth-first search. This approach helps in understanding the intricacies of performing operations on complex data structures in a structured manner.
            • 09:30 - 10:00: Wrap Up and Goodbye In this chapter titled 'Wrap Up and Goodbye', the speaker prepares to conclude the discussion by summarizing key points and preparing to demonstrate how certain plans work in practice. The speaker asks the audience for any questions or comments before proceeding, and Peter presents a relevant question regarding SQL development, which the speaker finds valuable and intends to address with a demonstration.
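The chapters above refer to plans produced with DBMS_XPLAN; a minimal way to generate one is sketched below. The query and the `bricks` table are illustrative, not the session's actual demo objects:

```sql
-- Parse the statement and store the optimizer's estimated plan.
EXPLAIN PLAN FOR
  SELECT * FROM bricks WHERE colour = 'red';

-- Print the indented plan table, including the "Predicate Information"
-- section showing where each WHERE-clause condition is applied.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- After actually running the query, DISPLAY_CURSOR shows the plan that
-- was really used, with estimated vs actual rows per operation:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));
```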

            Explaining Explain Plans Transcription

            • 00:00 - 00:30 [Music] Hi everyone, I'm Chris Saxon, part of Steven Feuerstein's Oracle developer advocate team. My job is to help you get the best out of Oracle Database, and hopefully have a little bit of fun whilst doing so. With that in mind, we're going to be looking at how to read execution plans today. In the session last month we talked in a bit of detail about optimizing queries with range-based
            • 00:30 - 01:00 criteria in them — you know, dates between from and to dates and so on. I thought I'd bring it back down a notch this time and just focus on some of the basics: how we actually read an execution plan at all, to start with. So, as always, this is your chance to ask questions. If you've got any, put them in chat; if you want to speak and actually talk to me, that's fine too — unmute yourself. If you're not talking, please do keep yourself on mute;
            • 01:00 - 01:30 as always, the recording will be available. So, as I said, welcome, and let's get going. SQL is fantastic — one of its great powers is its declarative nature: you tell the database the result you want and it figures out the best way to get it. But if it's figuring out the fastest way to do something and your query is too slow, this gives you a bit of a problem: how the heck do you make it faster? The database is taking charge
            • 01:30 - 02:00 of the optimization process, so what do you do? Well, if you start asking around about how to make SQL faster and improve performance, you'll ask some expert and they'll say something like "look at the execution plan", and you're like, okay. So you go and get the plan for your query and start looking at things saying join, full table scan and so on, and you're thinking: how the heck does this help me? I've no idea what's going on here; I really don't know how this helps me
            • 02:00 - 02:30 optimize my query in any way. So what we'll do is look at some of these things. Now, the first thing to note is that the plan is like the driving directions for the optimizer. If you're planning a route you'll get your map out and draw some arrows on it — these days you'll get your favourite mapping software to draw the route — but you'll end up with a series of arrows on your map showing you the route you need to follow. So what we really want to do is draw our arrows on
            • 02:30 - 03:00 our execution plan to see what's actually going on. Just to relate it back to a real-world example: I live in the UK, and at least once a year I fly out to San Francisco, usually for OpenWorld — well, not next year, we're moving to Vegas for OpenWorld next year, but for the last few years it's been San Francisco. And if I want to follow these directions it's pretty obvious: I've got my map, it says leave my house, take the train and
            • 03:00 - 03:30 whatever, get to the airport at London, get on the plane, fly to San Francisco — fairly straightforward. But with an execution plan you've got something that looks a bit like this: full scans, joins and so on. How does this help us figure out what's actually going on? It doesn't look like driving directions; there's not an easy, obvious series of arrows to follow to help me figure out how to get through
            • 03:30 - 04:00 the plan, how I traverse through it. So what we'll do is show you the process of doing it: we've got our execution plan here, we're going to start drawing some arrows, I'll show some examples of how you can do that, and at the end we'll walk through some common problems — things to look for in an execution plan and how you might resolve them. All right then. Before we jump into the details of how we actually draw these arrows, a bit of
            • 04:00 - 04:30 background. The thing to note here is that the execution plan is a tree. Now, of course, I don't mean a plant that you'll see out in your garden — we're talking about the data structure; strictly, a directed acyclic graph. Each operation has at most one parent and can have many children, and the parent-child relationship in an execution plan is determined by its indentation: the further indented to the right an operation is, the more
            • 04:30 - 05:00 likely it is to be a child. And particularly with these text-based plans, it is really, really important that you preserve this indentation. Quite often you'll see plans posted on forums and so on that look kind of like this: everything perfectly left-justified, all the operations aligned with each other on the left. This makes it really, really tricky to figure out what's actually going on here;
            • 05:00 - 05:30 for a simple plan like this, two tables, we can probably cobble it together, but imagine there are 10, 15, 20, 30 lines in the execution plan — it becomes quite tricky to figure out how the things relate to each other. You can spend 10 or 15 minutes just trying to piece together what is the child of what, and in some cases it's really not very clear. So if you're sharing text-based plans like this, it's absolutely critical to preserve the formatting and use a fixed-width font
            • 05:30 - 06:00 such as Courier, so we can see the indentation clearly. Okay, so we can see parent-child — how do we know what the parent of each operation is, and all its children? Let's start with this hash join, kind of in the middle here. The parent of an operation is the first one above it that is indented to the left: for that hash join, the first one above it to the left is that SELECT statement, so that is its parent. Its children are those indented to the right that are
            • 06:00 - 06:30 below it — so the full scans are both the children of this hash join. Okay, so we've got the parent and its children. What about the full scan of the colors table here — what are its parent and children? Well, the parent is the first operation above it that's indented to the left: that's that hash join. Its children? It has no operations below it that are indented to the right, so the full scan of the
            • 06:30 - 07:00 bricks table below it is its sibling, not its child. All the operations below it, indented to the right, up until the next one at the same level — the same depth of indentation — are its children; then you've got a sibling, and then we repeat the process. Okay, so we know the parent-child relationships. What we could also do is draw the plan out like this — like you'd draw a typical hierarchy:
            • 07:00 - 07:30 the SELECT at the top, the join below it, and then its two scans of the two tables below that. So we're starting to piece things together a bit more. How do we actually follow the process — how do we traverse this plan? Well, as when traversing any hierarchy, you start at the top, the root, and then work your way down. We work our way down to the first unvisited leaf, so that is
            • 07:30 - 08:00 right down to the bottom: that would be that full scan of the colors table. Once you've got there, you go back up to the parent of this operation, and having done that you repeat the process: we go to the join, go down to the next unvisited leaf, back up to the parent, and so on and so forth through the plan. So this is a depth-first search of the tree: start at the top, go down to the first leaf,
            • 08:00 - 08:30 back up, and keep recursively doing this. So what I'm going to do is walk through a few example plans to show you how this works in practice. What questions or comments do you have at this point before we jump into that? Nope, no questions — ah, one from Peter: how do you see these in SQL Developer? Very great question, Peter. I'm going to have a demo
            • 08:30 - 09:00 later on where I'll be showing these plans in SQL Developer, and I'll show you how to get them as well. I'm going to stick with text-based plans, like you might get with DBMS_XPLAN, for the slides; we'll move to the demo in SQL Developer later, so stay tuned for that. Okay, let's bring up our plan again. This is just our basic two-table join. Remember, the first thing we do is start at the SELECT at the top and
            • 09:00 - 09:30 travel down the plan to the first unvisited leaf — that is, the top operation in the plan with no children: the full scan of the colors table. So we go down to that, read all those rows, and pass them back up to its parent, the hash join, which builds the hash table here. Then we look for the next unvisited leaf — that's the full scan of the bricks table — read all the rows out of that, start passing them back up to the hash join, find the
            • 09:30 - 10:00 matching rows and return them to the client. So far, hopefully, fairly straightforward. Let's level things up a bit and make it more complicated: we'll now go for a four-table join. It seems more complicated, but the process is exactly the same. You start at that SELECT statement at the top and travel all the way down the plan until you get to the first unvisited leaf — the top operation in the plan with no children — that's the full
            • 10:00 - 10:30 scan of the colors table. Once we've got there, we read those rows, pass them back up to the hash join, build the hash table, and look for the next unvisited leaf — that's this full scan of the toys table. Read those rows, pass them back up to the hash join. Now, at this point we've read all the rows from the children of this hash join operation, so there are no more unvisited leaves below it; rather than looking for the next unvisited leaf, we just go straight up to the parent — that
            • 10:30 - 11:00 hash join. So we've joined the first two tables, and we're now building the hash table from the result of those. Then we look for the next unvisited leaf — that's down to the full scan of the pens table — read all the rows out of that, and continue on through the plan, following a little snaking pattern that looks a bit like this. A bit more complicated, but hopefully still relatively straightforward. Let's level things up again and go for another, more complicated example.
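The traversal described so far can be annotated on the simple two-table case. The layout below is modelled on DBMS_XPLAN output; the visit-order notes are annotations added here, not part of real output:

```
    --------------------------------------------------
    | Id | Operation           | Name    |
    --------------------------------------------------
    |  0 | SELECT STATEMENT    |         |  5: return the joined rows
    |  1 |  HASH JOIN          |         |  2 and 4: build, then probe
    |  2 |   TABLE ACCESS FULL | COLORS  |  1: first unvisited leaf
    |  3 |   TABLE ACCESS FULL | BRICKS  |  3: next unvisited leaf
    --------------------------------------------------
```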
            • 11:00 - 11:30 This time we'll again have four tables in the plan, but they're not all going to be combined in the same way. We've got a plan that looks something like this, and you can see there's a UNION ALL operation in there — something a bit different going on. So, again, start at the top at that SELECT statement, travel down the plan to the first unvisited leaf — that's that colors table — read the rows out of that, and pass them up to its parent, a hash join. Now from here we need to travel
            • 11:30 - 12:00 down the plan to the next unvisited leaf: that is all the way down to that full scan of the bricks table. Now, these scans of bricks, toys and pens are all at the same depth — they are all children of that UNION ALL operation — so we'll walk down reading those rows. And do note there is an optimization available: if you run this query in parallel, the database can read all three of those tables at the same time; in normal serial operation it will
            • 12:00 - 12:30 do them one after the other. As we're reading those rows, we pass them back up through the UNION ALL operation, which combines them into a single result set. At this point there are no more unvisited leaves in the plan, so all we need to do is walk all the way back up, passing the rows back through the individual operations, so we can combine them and give our final result set. Okay — a bit more complicated, but hopefully still relatively easy to
            • 12:30 - 13:00 follow. Let's level up again. Now I'm going to show you a plan that looks much simpler at first, but it's a bit different — so let's have a look at this plan. Okay, we've just got two tables in here, but it looks a bit different to our original two-table join, right? So what's going on here — has anyone noticed something a bit different, or a bit suspicious, about this plan? The sort? Yeah — the SORT in
            • 13:00 - 13:30 the GROUP BY... Ton's got it: a scalar subquery in the SELECT list. Yeah, the key thing to note about this — what's different — is there's no join operation. Right? Where is the join in this query? Someone's saying "missing a join condition" — looking at that execution plan, there is no join there. And as Ton correctly
            • 13:30 - 14:00 identified, the query that generated this is something a bit like this. The key thing to note: I said we had this rule of walking down the plan and reading the first unvisited leaf — things change a little when it comes to scalar subqueries. The SELECT clause is processed after the FROM clause, so we need to get all the rows out of this colors table, and then we start passing them to this subquery. So
            • 14:00 - 14:30 we're passing the rows from the colors table into that subquery. We look at the execution plan and see — hang on, the colors table is down at the bottom. Hang on, didn't I say we go down to the first unvisited leaf, which in this case looks like the bricks table? Well, no — this is one of those scenarios, the exception that proves the rule, and there's a couple of reasons I'm
            • 14:30 - 15:00 showing you this. First up, just to make you aware of this scenario; but secondly, for every rule or guideline or principle I show you, there's going to be at least one edge case where it doesn't apply. This is part of what can make SQL optimization tricky: you've got some guiding principles, so to speak, but there are exceptions all over the shop — you can always find an unusual scenario or something which kind of breaks everything that
            • 15:00 - 15:30 you've learned. So thinking about how SQL queries are processed — understanding how the SQL engine works in general — is a good principle as well; I won't cover that today, but maybe we can discuss it in a future episode. The thing here is, I said the colors table is in the FROM clause, so the first thing we'll do is read its rows and pass them back up to the SELECT statement. Then, for those rows, we will run the scan of the
            • 15:30 - 16:00 bricks table. So we'll start reading that, and the interesting thing to note is that this scan could happen as much as once for every single row we get out of that colors table — so we could end up running this full scan of the bricks table a lot of times. Now, there are some optimizations in Oracle Database which mean it's usually not quite that bad, but in general, if you've got a scalar subquery, you will
            • 16:00 - 16:30 run it many times. So in this case we're quite likely going to full scan that bricks table more than once. Okay, so those were a few example plans that we walked through, showing the process the database follows to get through them. What questions or comments do you have at this point? Okay — "what does the predicate section show?" Good question:
            • 16:30 - 17:00 the predicate section shows you where the predicates are applied. You've got conditions in your WHERE clause — things like column equals value, join criteria and so on — and the predicate section shows you where exactly in the plan those criteria were applied to the operations. I'm not doing examples on the slides themselves, but when we come to SQL Developer we can talk about that in a bit more detail. All
            • 17:00 - 17:30 right then. Okay, so those are the examples I wanted to walk through, and that's the basis of how to follow a plan. The question now is: how does this help make our queries faster, and how do we try to improve things? Just a comment from Yunus: in older versions — for example 8i — the scalar subqueries did not appear at all in the plan. I really hope no one watching this is still using 8i, but
            • 17:30 - 18:00 it's a good point. Okay, so be aware that everything I say may change in future releases, and there are always exceptions — there are situations which may not follow exactly the rules, principles and guidelines I've discussed here. So yeah, that's one of the harder aspects of SQL tuning. Okay, so let's see how we can make our SQL
            • 18:00 - 18:30 faster using this plan. You've got a plan, you know the route the database followed through it — how does this help us? Well, let's return to my example of travelling to San Francisco. This is a long journey, and imagine I come to you saying "it took forever — how can you help me make it faster? Come on, Chris, how can I get there quicker?" Chances are your answer is going to be "well, um... I don't know" — you don't know how I can travel quicker,
            • 18:30 - 19:00 at least knowing nothing more than that I got from London to San Francisco. There's a whole bunch of other questions you'd want answered before you could provide any kind of meaningful answer that's actually likely to help improve things. First up: how far did I actually travel? The distance between them is something like 5,000 miles, so that is quite a long distance, and the thing to note is that no matter how fast a transport method I use, it's going to take me a while to do
            • 19:00 - 19:30 this journey. Even basic laws of physics: we can't travel at the speed of light and probably never will be able to. So if I want to make truly huge gains, then perhaps I need to think: is there a way we can eliminate this journey altogether? We're travelling 5,000 miles — can we not travel that? Maybe, instead of living in the UK, I can move to San
            • 19:30 - 20:00 Francisco. Okay, that's a big effort and I'm probably not going to do that, but it does mean I don't have to travel halfway across the world. Or maybe I won't travel at all — I won't go to OpenWorld, I'll stay at home. That's probably not going to happen for various other reasons — my boss won't let me — or maybe I could stay at home and present remotely, so we don't need to travel. The key thing here is, when you are processing lots of data, and
            • 20:00 - 20:30 particularly returning it to the client, you could ask yourself: are your performance goals here reasonable? Every now and again you get someone coming up and saying something like, "I've got a query that returns 10 million rows and takes 30 minutes to run — I need it to run in under a second," and you're going to think: well, is that actually plausible? Processing 10 million rows is going to take a little while. Now, granted, we could probably get it right down from
            • 20:30 - 21:00 30 minutes, maybe down to, for example, 30 seconds or something like that, but fundamentally, if you are processing huge gobs of data, your query is going to take some time. So think: is there a way we can process less data? Do we even need to run this query in the first place? If you've got really long-running queries processing lots of data and you need to improve their performance, try to find ways to just not do that work. That's a good principle to start with. Okay — so we know
            • 21:00 - 21:30 I've got to travel a long way, so the next thing to look at is how long this took me compared to how long it was predicted to take. The flight time was predicted at around 11 hours; the journey actually took 10 and a half hours, so I was quicker than expected, and the estimate was fairly bang-on. In terms of the transport method, flying on a commercial airliner, we did reasonably well on
            • 21:30 - 22:00 estimated time versus actual time. If the estimate was 11 hours and it actually took me over 30, then maybe I could have chosen a different flight which would have been more effective. If what we estimated is close to what we actually did, that's a good match. It's similar when it comes to processing data: look at the rows your execution plan is processing. If the number of rows the database estimates is close to
            • 22:00 - 22:30 the number of rows it actually gets, chances are it has come up with a reasonable plan. Or perhaps a better way to think about it: if those estimates are way out, there's a good chance there's a better plan. If it thinks it's going to get one row but actually gets a million rows, there's probably a better plan available. So look at what the database thinks it will do versus what it actually does and see how they stack up. And finally, of course, we've looked at estimates; we know
            • 22:30 - 23:00 I have to travel that distance and I'm not moving to San Francisco, so how else could I have made this journey? What other routes and transport methods were available? A lot of those are going to be much slower; maybe I could travel by hot air balloon, which is clearly going to be a lot slower. To go a lot faster, I'd probably have to do something like charter a private jet rather than flying on a commercial airliner. If Larry is kind enough to charter me a private
            • 23:00 - 23:30 jet from a local airfield, maybe I can halve the time it takes to actually do this journey, but is that really cost-effective? Chartering a private jet is going to be expensive. And that's another thing to think about when you are tuning your queries: how can you create other routes? A common example is creating an index, but some of the things you can create to improve performance,
            • 23:30 - 24:00 like partitioning or the In-Memory column store, have additional costs associated with them. At that point you've got to ask yourself a business question: is the benefit we get from using those features, the faster query, worth the extra cost? This isn't me saying you should go out and buy these; the point is you need to assess how much it actually costs you to work around these issues, in terms of developer time or
            • 24:00 - 24:30 having the customers live with slower queries, versus how much it costs to buy these kinds of things. And this doesn't just apply to Oracle Database. Too often people look at the sticker price and go, "that costs money, we can do it for free ourselves," and end up costing themselves more money in the long run. Whereas if you say, "In-Memory costs this much, but the performance gains we get are worth that much,"
            • 24:30 - 25:00 you can then do the analysis: is the cost greater than the business benefit you expect? So think about that. Don't just think in terms of the sticker price of things; think about how much business benefit you get by using additional tools. OK, so those are the questions when I'm flying. How does this relate to SQL execution plans? There are various questions you
            • 25:00 - 25:30 can ask here. First: are the row estimates accurate? Is the number of rows the optimizer predicts close to the number of rows you actually get? If they're close, there's a good chance you've got a good plan; if they're way out, there's a very good chance there's a better plan available. That tells you about how the plan accesses the tables and how it joins them. We also need to look at how much work it actually took to read those rows, i.e.
            • 25:30 - 26:00 how much I/O you did. This comes under various names in Oracle Database (consistent gets, buffer gets, buffer reads); they all mean roughly the same thing, which is logical I/O: how many times did we have to read a block of data to retrieve those rows? All other things being equal, the query that does less I/O will be faster and more scalable. Again, that's not always
            • 26:00 - 26:30 one hundred percent true, but generally speaking this is one of the key metrics you want to tune: how can we do less I/O? OK, so we've looked at those figures; our estimates are good and the I/O is feasible for the work we're doing. Can we find a way to access the rows more efficiently? Can we create an index? Can we reorganize the tables with table clusters, partitioning, and so on? How can
            • 26:30 - 27:00 we read those rows more efficiently, and do less I/O to read the rows we want? That's the thing to look at there. Finally, of course, it's important to remember that response time is king: how long did it actually take to run your query? The figures for all those previous things could be bad (the estimates could be way out, we could be doing lots of I/O with full scans all over the place), but if the query runs in a tenth of a second and the customer is happy, it's
            • 27:00 - 27:30 fine. Equally, all those figures could be good, but if the query takes 10 seconds to run, that's probably not good enough. The key point is that the customer doesn't care how many logical I/Os your tuning efforts save; they care that the query went from 10 seconds to less than a second. So the previous three questions are important to help you decide what you can do to make your query faster, but the final thing is the one you really need to pay
            • 27:30 - 28:00 attention to in terms of customer experience: how the query actually behaves. OK, so that question from Inglot: how much off is too much? I guess you're referring to row estimates there; keep that in mind, we'll come back to it in a little bit. Later I'm going to show some demos of execution plans and we'll look at what we can do to make them run quicker. There isn't a hard and fast figure, but
            • 28:00 - 28:30 generally speaking, once you're more than an order of magnitude out (we estimated 100 rows but got a thousand rows or more), it's a sign that you want to start thinking about it. So Ravi asks: which is generally worse, logical I/O or physical I/O? That's a good question. The point here is that all physical I/O
            • 28:30 - 29:00 is contained within logical I/O. Say we do a thousand logical I/Os: it could be that they're all served from memory, or it could be that they all involve disk reads, i.e. physical I/O. Physical I/O is a lot slower, but it depends on things like whether the data is cached, and generally you don't have much control over that. So I think you look at the logical I/O,
            • 29:00 - 29:30 because that essentially includes your physical I/O; you tune the logical I/O and hopefully the physical I/O sorts itself out. But physical I/O is substantially slower than logical I/O because it actually reads from disk. I know disks are getting a lot faster, with SSDs and so on, but they're still orders of magnitude slower than memory. So that's
            • 29:30 - 30:00 the point there. What other questions do people have at this point? No other questions at this stage. OK, so before we jump into some demos, one thing I wanted to mention: so far we've been talking about execution plans. There are also explain plans. They sound very similar, but they are very different. An explain plan is a
            • 30:00 - 30:30 prediction about what the database might do, and critically it doesn't include information such as how many rows we read or how many logical I/Os we did. To know whether the plan is right, you really need the execution plan with the runtime stats for how it performed when it actually executed. Of the various techniques for getting this, I'm just going to mention DBMS_XPLAN briefly,
            • 30:30 - 31:00 because I think it's the best one if you're sharing plans with other people, posting on forums, emailing for help, and so on; you get that nice text output we've been looking at so far. The key things here are the GATHER_PLAN_STATISTICS hint, which captures the row and I/O information, and the ALLSTATS format, which displays it. So the hint captures the information and the ALLSTATS format displays it. Once we've
            • 31:00 - 31:30 done that, you'll get some extra columns in your plan. There can be a whole bunch of them, but the core ones are these, and we'll walk through them quickly. Starts is fairly straightforward: how many times did we actually execute each operation in the plan? We could do some of them several times. Next, E-Rows, estimated rows: how many rows did the optimizer think we would get, and how does that compare to A-Rows,
            • 31:30 - 32:00 actual rows: how many rows did it really process at runtime? Next up, A-Time, actual time: how long did it take to run this? And Buffers: this is your I/O, also known as consistent gets, buffer reads, and so on; how much work did we do to process all these things? The columns on the left are all specific to
            • 32:00 - 32:30 that particular line in the plan, that particular operation. So for this second line here, we started it once, we estimated 68 rows, and we got 56; 68 versus 56 is pretty close, so that's a reasonable estimate of what we did. What you're looking for here is: are these close? Now, one little wrinkle: you want to multiply the number of Starts by the estimated rows and check that it equals the
            • 32:30 - 33:00 actual rows; we'll see an example later where that's different. So is Starts times E-Rows roughly equal to A-Rows? The columns on the right here are all cumulative, meaning the duration or the work for that particular line of the plan plus all of its children. So to find the total number of buffer gets, we don't add them all up to get a figure of around 100; the
            • 33:00 - 33:30 total is the top line, so we did just 24. To find out how much work we did for a particular step, you subtract off the Buffers of its direct children, which here is just the operation below it: 24 minus 24, so no buffer gets for that particular step in the plan. Similarly for A-Time, we need to subtract the children off. Notice that with DBMS_XPLAN the A-Time values are all rounded; the granularity is hundredths of a second.
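The technique described above can be sketched in two steps, assuming a session with the relevant privileges; the BRICKS table follows the demo schema used later in this session:

```sql
-- Step 1: run the query with the hint so the database captures
-- per-operation runtime stats (Starts, A-Rows, A-Time, Buffers)
SELECT /*+ gather_plan_statistics */ colour, COUNT(*)
FROM   bricks
GROUP  BY colour;

-- Step 2: fetch the plan for the last statement run in this session,
-- formatted with the actual (not just estimated) figures
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));
```

`ALLSTATS LAST` shows the stats for the most recent execution only; plain `ALLSTATS` aggregates over all executions of the cursor.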
            • 33:30 - 34:00 For really fast queries that may not be granular enough, so we'll see in SQL Developer how you can get a bit more accuracy than that. OK, I can see a few comments coming in the chat, so let's just look at those before we move on. So Tone says: a logical I/O is a block read from memory in the SGA; a physical I/O is a read that can still be from memory, e.g. flash memory, or from actual physical disk, and the latter takes more time. Thanks, Tone, for
            • 34:00 - 34:30 clarifying that. OK, Peter: "can you share the query again? It was a bit quick to write it down." Which query were you referring to, Peter? OK, Daniel: "is there a formula to calculate a reasonable time for the steps shown in the execution plan?" I'm not aware of a good formula to calculate this. One of
            • 34:30 - 35:00 the big drivers is how many rows you're processing and how much I/O you're doing, which depends on things like how fast your underlying hardware is. It would be really difficult to come up with a specific formula. Generally speaking, on the point about A-Time being in hundredths of a second: when we're processing small amounts of data like this, it takes well under a hundredth of a second to do most of these things, as we'll
            • 35:00 - 35:30 see in some examples in a minute. So Peter, the query to get the additional stats output was on the previous slide, but rather than go back, I'll show you something in SQL Developer in a minute. Also, all the scripts and demos I'm going to show you are in my Live SQL; I'll give you a link to that at the end of this
            • 35:30 - 36:00 session, so if you hang around there will be a link to Live SQL where you can get everything I've shown you today and play around yourself. Don't worry if you haven't captured it, and of course, as I say, we are recording this, so you can come back and review it later as well. OK, what other questions or comments do we have before we jump into some examples of bad plans, or plans that may not be quite right? So, this question: "I think
            • 36:00 - 36:30 the hardest tuning question is this: how can we know whether a query can or cannot be further optimized?" That's a good question, and partly one of the answers is that you can almost always make things a little bit faster; the question is whether it's worth the effort. Now, different people have different perspectives on what point is good enough, and really
            • 36:30 - 37:00 this is where I think it's important to work with the customers, the business, and understand what your tuning goals are. If they're happy with response times of one second, and the system you've got can cope with that load comfortably, then you can stop. Whereas other businesses may need response times of less than
            • 37:00 - 37:30 a hundredth of a second. It entirely depends, and this is where you need to work with the customers and the business so you can prioritize, because you can always take a query and ask how we can make it faster; there are loads of tricks and features in Oracle Database to do that. The question is whether it's worth your time: is this a high enough priority compared to the five
            • 37:30 - 38:00 thousand other things the boss is asking you to do? I don't think there's a single right answer to this question; different people have different perspectives. What matters is that you speak with whoever is using or owns the application and find out whether they are happy with performance. If other people have other perspectives, please let us know. OK, at this point let's jump into some demos,
            • 38:00 - 38:30 and we're going to walk through some plans that are not quite right. Let me just make sure I'm connected and my session is still alive; let's give that a second. Any other questions or comments before I jump into these? OK, good to go. All right, so I have got a full
            • 38:30 - 39:00 table join here, so we've got a whole bunch of information. Someone was asking earlier about getting the plans in SQL Developer: up at the top here there's a thing called Autotrace, which by default is the F6 key. I'll hit that little Autotrace button up there and zoom in a little so you can see what happens. Hit the button or hit F6 and it will get the execution plan for your query, and we've
            • 39:00 - 39:30 got a plan that looks a little bit like this. OK, let's zoom in on this a little; hopefully you can all see what's going on. I think this is not a good plan. Who can see something that's not great here, or something suspicious, and what might we do about it? Any suggestions, comments, ideas?
            • 39:30 - 40:00 "Cartesian join": yeah, that's a good spot. "Merge join": yeah, it's a bit suspicious, isn't it? We've got this merge join, and the first thing we're doing is reading rows out of the BRICKS table. The thing here is that the database thinks we are going to get one row out of this table, a cardinality of one, but in fact we actually get 24. So that's not good; there's a misestimate. If we look at
            • 40:00 - 40:30 those two columns, cardinality versus last output rows: it expected one, but actually we got 24. Now, it's worth noting that if this query really did return just one row and we're joining it to every row in the other table, a merge join Cartesian can be the fastest way to do this, but it relies on us getting exactly one row out of that table, and we've got 24. A side effect of that is, if you look just
            • 40:30 - 41:00 below it, we've got the buffer sort, and we do that 24 times; that's what the plan tells us. So we're doing a lot of extra work there. This is definitely a sign that something's off: we've got a big misestimate in how many rows we're getting. So how might we fix it? Any suggestions or comments on what might work? "Gather stats": there we go, that's the thing I was
            • 41:00 - 41:30 looking for. I fudged the stats for these tables specifically to get this kind of bad plan, but it's possible you might see something like this in the wild. Let's go down and gather the statistics for those tables; I'll just pin this plan here for a second. So we've gathered the statistics, and we'll get the execution plan again and rerun Autotrace. We do that, and now I can
            • 41:30 - 42:00 actually run it. Let me just make sure I've gathered those stats, and hopefully this time, there we go: the merge join Cartesian has been replaced with a whole bunch of hash joins, and at this point the cardinalities and the output rows basically all match. If you've got a good memory, you might recall this looks suspiciously like the full table join plan I showed you earlier.
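The fix above is a standard `DBMS_STATS` call; this is a minimal sketch, with the owner and table names assumed from the demo schema:

```sql
-- Refresh optimizer statistics so the row estimates reflect the real data;
-- with stale or missing stats the optimizer can pick plans like the
-- merge join Cartesian we just saw
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'BRICKS');
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'COLOURS');
END;
/
```

By default this also gathers column and index statistics for the table, so a separate pass per index is not normally needed.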
            • 42:00 - 42:30 And that's because it is: I lifted the figures straight from it. One of the reasons I pinned the earlier plan is that in SQL Developer we now have our two plans side by side and can do a little compare to see what the differences are; the things highlighted in red show us where they differ. And quickly, for whoever had a question about filters and so on: you can see in this plan here,
            • 42:30 - 43:00 if I just expand predicates and projections, that below each operation we've got the access predicates, showing which join condition applies at each step. So Hegarty asks: does the Autotrace feature show the cost of each step in the plan, and is the cost a good guide for areas to focus on? On the first part of that question, if we
            • 43:00 - 43:30 go into the preferences (Tools, Preferences, then search for Autotrace), there's a whole stack of things you can show here. I've focused on the core columns to keep it simple, but there's lots you can display. Notice that I've unchecked cost. Why? Partly because I think it's not that helpful. Generally, people get a bit too hung up on the cost and say, "well, the
            • 43:30 - 44:00 cost is bad," but for the most part there's not much you can directly do about it. Things like processing less data and getting the estimates right are things you can change; the cost is a consequence of those. So I generally avoid looking at it, though you certainly can if you want to. OK, other questions or comments at this point? One
            • 44:00 - 44:30 other thing to mention quickly while we're here: look at those elapsed times. The values there are in microseconds. Remember I said hundredths of a second might not be granular enough? It's nowhere near enough here: we can see about a thousand microseconds for the whole query, which is a thousandth of a second. So certainly for fast queries, this display in SQL Developer is much
            • 44:30 - 45:00 more useful for getting these time breakdowns. OK, "is it cumulative?" Yes, it is. One of the nice things about SQL Developer, if you're doing this yourself, is that you can expand and collapse these operations. If we collapse those down, this hash join here with a time of two and a half thousand microseconds: you have to subtract off this 797 plus 35, so 800-odd microseconds, giving
            • 45:00 - 45:30 2,400 minus 800, around 1,600 microseconds, to process the hash join itself. So yes, elapsed time is cumulative, and it's quite easy to work out in SQL Developer just by collapsing everything beneath an operation so you can see its direct children. The same advice applies for the buffer gets. OK, let's move on; let's
            • 45:30 - 46:00 close this down. So next, we've got another query here, and I'll get another plan for it. What I've got here is a hash join of the two tables, and I'll just run the query itself so you can see we get just one row in the output. So let's glance over this plan; I'll shrink things down a bit to help you see what's going on. Any
            • 46:00 - 46:30 thoughts about this? Comments, suggestions for what we might do to improve this query? We're full scanning both tables and getting one row in the output. What do we think of this: good, bad, indifferent? Everyone's thinking hard. Not sure what you're saying there, Antonis; I
            • 46:30 - 47:00 can't read that writing. "Better to avoid natural join": that's beside the point here. "Index, index, index, create an index": very good, yes. So one of the things I said before is look at the I/O. If we look at the consistent reads (buffer gets), we are doing six full-scan gets to read one row. That's a bit suspicious, right? Really
            • 47:00 - 47:30 you want to read multiple rows with one I/O, not do multiple I/Os to read one row. If you're doing more I/O than the number of rows the query outputs, it's a sign there may be something you can do to make it run faster. So we're getting one row from each table, six gets from each, twelve in total. What can we do? Some indexes; great suggestion, Lis.
            • 47:30 - 48:00 So BRICK_ID is in fact a primary key; I've been very lax in not defining it. Very naughty of me. Let's create those indexes and run the query again. Now when we look at what we're doing here, the cardinality and output rows are one and one; everything lines up nicely, and the gets have dropped considerably. Notice we've got one get to read the
            • 48:00 - 48:30 index and one get to read the row; that's two for each table, giving four in total, a big saving compared with the twelve we had previously. Agata asks: are indexes still useful on Exadata-hosted databases? That's a question with a bigger answer than we've really got time for here.
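The fix in the demo above looks something like this; a minimal sketch, where the table and constraint names are assumed rather than copied from the session:

```sql
-- Declaring the primary key also creates a unique index the optimizer can use
ALTER TABLE bricks ADD CONSTRAINT bricks_pk PRIMARY KEY (brick_id);

-- A lookup by key can now do an index unique scan plus a single table access
-- (roughly 2 gets per table) instead of a full scan (6 gets here)
SELECT * FROM bricks WHERE brick_id = 1;
```

The same reasoning applies to the join column on the second table: an index there lets the join probe rows directly rather than scanning.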
            • 48:30 - 49:00 The short answer: if you only return one or two rows from a table, you probably always want to use an index. Beyond that it all gets a bit fuzzy, and when it comes to Exadata there are a lot more optimizations which make full scans much more efficient. But primary key lookups basically always want to use an index; for other types of queries it depends, and there's a
            • 49:00 - 49:30 lot that could be discussed on that. OK, John: do we need to gather stats once an index is created? Creating the index will in fact gather its stats as well, so you don't need a separate stats-gathering step, and that's been true for a long time. OK, good questions, folks, keep them coming, but there's one other example I wanted to show you before we wrap up, and that's this scalar
            • 49:30 - 50:00 subquery. Let's run it and see the plan we get; so let's get Autotrace on it. We've got that TABLE ACCESS FULL with a sort, and notice I said we would do that full scan multiple times: we know that because the Last Starts column shows we're doing it three times. It outputs 24 rows with a cardinality of 8, and 3 times 8 is 24, so the
            • 50:00 - 50:30 estimate's right. We're getting 24 rows because we're doing this thing multiple times, reading all the rows in the table, and that's happening once for each row we get out of the COLOURS table; so we're doing a lot of unnecessary work here. One thing to mention: notice the top line's buffer gets figure is 6, and below it we have 18. This is where things get slightly confusing again: the 6 is essentially from
            • 50:30 - 51:00 processing the FROM clause, and the 18 is from processing the scalar subquery; to find the total you need to add those together to get 24. Luckily SQL Developer also has all the stats listed here, so if I bring this over it's easier to see: you can see consistent gets of 24 there, hopefully. So with scalar subqueries it can be a bit tricky to figure these out. What can we do for this query; how can we improve
            • 51:00 - 51:30 it? While people type in suggestions, I see some other comments have come in, so let me know what you think about improving that scalar subquery while I address these points. Ravi asks: are covering indexes recommended, or can they have some pitfalls? Again, that's almost a session in its own right. For those not familiar, a covering index is one where all the columns in your query appear in the index,
            • 51:30 - 52:00 which means the database can answer the query just by reading data from the index; it doesn't have to access the table at all. This can be a big saving, because indexes are usually much smaller data structures than the table. So a couple of things: first, we can search the index efficiently rather than doing a full scan, and it's a smaller data structure, so they can be very useful. The downside, of course, is that you're going
            • 52:00 - 52:30 to have a lot of different queries: are you really going to create a covering index for every single one? Probably not; that's probably not a good idea. Also, at some point the size of the covering index becomes close to, or even potentially bigger than, the table itself. If you've just taken all the columns from your table and stuck them in an index, has that really helped you in any meaningful way? In which case you might be looking at
            • 52:30 - 53:00 something more like an index-organized table. So the pitfall is really that you end up over-indexing: you create an index for every single query, which uses a lot of space and gives the optimizer headaches, because it can become challenging to choose the right index, and at some point it's actually no more efficient than a full table scan if you include most or all of the table's columns in the index.
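To make the idea concrete, here is a sketch of a covering index; the `weight` column is a hypothetical example, not from the demo:

```sql
-- Both columns the query touches are in the index, so Oracle can answer
-- the query with an index scan alone and never visit the BRICKS table
CREATE INDEX bricks_colour_weight_i ON bricks (colour, weight);

SELECT colour, weight
FROM   bricks
WHERE  colour = 'red';
```

In the plan you would see an INDEX RANGE SCAN with no TABLE ACCESS step beneath it; that absence of the table access is what "covering" means.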
            • 53:00 - 53:30 So ultimately, if you've got a query where performance absolutely matters, it's worth doing, but I wouldn't do it for every single query. "Except for function-based indexes where you need the columns": yes, or perhaps for virtual columns. OK, and I guess you're referring to the gather stats there; yes, good point. "Maybe transform the scalar subquery into an inline view and join": yes, so the first thing we could try is turning this
            • 53:30 - 54:00 into a join. Now, before we turn it into an inner join, note that this query returns every row from COLOURS and then finds a matching one in BRICKS; if the colour is null for some of those rows, an inner join would return nothing for them. So to get the same result we either have to use an outer join or make that column mandatory. Let's run this query, and we'll get some details in a second. Now we can see
            • 54:00 - 54:30 we've got an index scan on COLOURS, so we've gone from two full scans to one full scan and an index scan. That's a good saving, and our buffer gets have dropped considerably, down to eight. And notice that, well, we are actually still doing this last operation three times, but it's much improved.
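The rewrite being discussed looks roughly like this; a sketch against the demo's BRICKS/COLOURS tables, with column names assumed:

```sql
-- Scalar subquery form: the subquery can execute once per COLOURS row
SELECT c.colour,
       ( SELECT COUNT(*)
         FROM   bricks b
         WHERE  b.colour = c.colour ) AS brick_count
FROM   colours c;

-- Equivalent join form: one pass over each table. The OUTER join keeps
-- colours with no bricks; COUNT(b.colour) returns 0 for those rows,
-- matching the scalar subquery's result
SELECT c.colour, COUNT(b.colour) AS brick_count
FROM   colours c
LEFT   JOIN bricks b ON b.colour = c.colour
GROUP  BY c.colour;
```

Using `COUNT(*)` in the join form would be a bug here: it would count 1 for an unmatched colour instead of 0.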
            • 54:30 - 55:00 "Could we aggregate the BRICKS table alone?" Yes, that's a good point. If this is our entire query, we're only getting the count of things out of BRICKS, and if the colour is mandatory in the BRICKS table, we could just do this: we don't even need to read the COLOURS table at all. So we could simplify the query altogether, making things even faster. But we can still do even better: we
            • 55:00 - 55:30 could create an index on that column. Now, at this point, if we check what's happening after we've created that index, we've still got a full scan. Why, when we're just reading that colour column, have we still got a full scan? Well, in Oracle Database, wholly null entries are not included in a standard b-tree index. If there are rows where the column is null, they won't be in this index, and we'd get the wrong result by scanning it. So either we
            • 55:30 - 56:00 can add a WHERE colour IS NOT NULL, and when we run our query again, we hopefully get a nice index scan, getting even faster, down to just one get. Or, even better, assuming that the colour column really is mandatory (and note the importance of defining good constraints), we can declare it NOT NULL. If I just pin that to show what's going on, and run it again, we get the full scan of the index. There we go, so we've improved things.
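Both fixes described above can be sketched as follows, with the index name assumed:

```sql
CREATE INDEX bricks_colour_i ON bricks (colour);

-- Rows where COLOUR is entirely null are absent from a b-tree index,
-- so the optimizer cannot use it until it knows no nulls can be missed.

-- Option 1: state it in the query itself
SELECT colour, COUNT(*)
FROM   bricks
WHERE  colour IS NOT NULL
GROUP  BY colour;

-- Option 2: declare the column mandatory, so every query can use the index
ALTER TABLE bricks MODIFY colour NOT NULL;
```

Option 2 is preferable when the business rule really is "every brick has a colour": the constraint documents the rule and helps the optimizer everywhere, not just in this one query.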
            • 56:00 - 56:30 Notice at this point that even if we go back and run the scalar subquery again, we've now dropped the work we do considerably. Compare that with the first run: creating the indexes and so on has given us an advantage; we've gone from a full scan to an index scan, and our work has gone from 24 gets down to 8, so we've cut it to roughly a third. "Would COUNT(ID) have done the job?" No, because we're grouping by
            • 56:30 - 57:00 colour. I'm not sure we're using COUNT(ID) exactly, but the point is that we're grouping by colour, so unless we've got an index on colour, we would have to read the data from the table to do that. All right, we're getting close to the hour, so I wanted to jump back here briefly for a couple of things to finish up. Where do we go from here? I've shown you some ways of
            • 57:00 - 57:30 getting an execution plan. I've also got a blog post that discusses how to use DBMS_XPLAN and SQL Developer, and it also covers a couple of other techniques, notably SQL Monitor. That's probably one of my favourite techniques, but it does have licensing implications, and it only kicks in by default for slow queries: those running in parallel or taking longer than five seconds. So check that out if you want more background
            • 57:30 - 58:00 on how to create these plans. And for Peter, or anyone else interested in getting the scripts and plans, or recreating what I've just talked about: there's a Live SQL script you can access, so head out to that. And yes, you can use a hint to make SQL Monitor always kick in; it can be tricky if this is code in an application somewhere, as it's not always possible to go through and fiddle with it, but good point. Finally, just start playing around.
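The hint mentioned above is `MONITOR`; a sketch, assuming the Tuning Pack licence that SQL Monitor requires:

```sql
-- Force SQL Monitor to track a statement that would otherwise be too
-- quick to qualify (by default it monitors parallel or 5-second-plus queries)
SELECT /*+ monitor */ colour, COUNT(*)
FROM   bricks
GROUP  BY colour;

-- View the text report for the most recently monitored statement
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(type => 'TEXT')
FROM   dual;
```

As noted in the session, embedding a hint isn't always possible when the SQL lives inside an application you can't change.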
            • 58:00 - 58:30 A lot of what's going to help you learn how this works is just fiddling with things: get a complicated query, look at its plan, add some indexes, drop some indexes, gather stats, fiddle with the stats and the settings so they're wrong or right, and see what happens. Just start playing around, compare the plans, see what happens in different scenarios, and get a feel for when things are good and bad. So, Rahul, you requested a session on
            • 58:30 - 59:00 how to read parallel execution plans: they are a bit more complicated, and it's a huge topic; I'll have a think about it. "Tips on using hints": I'll give you the short answer: don't. That's the only session I'm going to give on using hints: don't use hints. OK, that's a bit simplified, but really, if you're thinking about using hints, you should probably be looking at things like SQL Plan Management. I recognize there are some hints, like GATHER_PLAN_
            • 59:00 - 59:30 STATISTICS, which don't actually change the plan and are still useful, but I'm not going to do a session on them. OK, if you have any other questions, feel free to get in touch, either via the Ask TOM site, the session itself, email me (chris.saxon at oracle.com), reach out on Twitter (@ChrisRSaxon), or put some comments in the chat. I'll hang around for a minute or two if there's anything else you want to ask, but to finish up, I
            • 59:30 - 60:00 really hope you enjoyed this, and more importantly I hope you learned something and found it useful. Thank you all for joining me. Have a great Christmas holiday, festive season, or whatever you choose to celebrate, and I hope to see you all some time in 2020. Goodbye. [Music]