Exploratory Data Analysis (EDA) Using Python | Python Data Analysis | Python Training | Edureka
Estimated read time: 1:20
Summary
In this engaging session by Edureka, the speaker delves into the world of exploratory data analysis (EDA) using Python. The video starts by explaining what EDA is, emphasizing its importance in understanding data through various steps such as cleaning data, identifying important variables, and analyzing relationships. The session then demonstrates how to perform EDA from scratch using tools like Jupyter Notebook and Python libraries such as Pandas, Seaborn, and more. From importing datasets to understanding data distributions and visualizing with plots, this video offers a comprehensive guide to mastering EDA for effective data analysis.
Highlights
Exploratory Data Analysis (EDA) is a crucial step in data analysis, providing insights into data quality and structure.
Python offers a robust set of tools for EDA, including Pandas and Seaborn for data manipulation and visualization.
Understanding data involves several steps: loading, cleaning, analyzing relationships, and visualizing.
The tutorial provides a hands-on demo using a dataset from Kaggle to perform EDA.
Various visualization techniques such as heatmaps and scatter plots help in understanding data relationships and distributions.
Key Takeaways
EDA is essential for understanding and cleaning your dataset before analysis.
Using Python libraries like Pandas and Seaborn makes EDA easier and more efficient.
Visual tools like heatmaps, scatter plots, and histograms reveal important data insights.
EDA helps in identifying relationships and potential outliers which are crucial for accurate model building.
Regular practice of EDA enhances data literacy and analytical skills.
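One standard way to flag the potential outliers mentioned above is the interquartile-range (IQR) rule — a technique the video touches via box plots rather than spelling out in code. A minimal sketch, with invented scores for illustration:

```python
import pandas as pd

# Hypothetical math scores; the lone 0 is the suspicious point
scores = pd.Series([72, 69, 75, 70, 68, 0])

q1, q3 = scores.quantile(0.25), scores.quantile(0.75)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # Tukey's fences
outliers = scores[(scores < low) | (scores > high)]
print(outliers.tolist())  # the 0 falls outside the fences
```

Points outside the fences are candidates for removal or closer inspection, not automatic deletions — an extreme value can be a genuine observation.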
Overview
Exploratory Data Analysis (EDA) is like a detective's job โ it's all about diving into the data to understand its mysteries. In this Edureka session, the importance of EDA is highlighted, emphasizing how it helps in understanding the underlying patterns, structures, and anomalies in a dataset. From identifying faulty data points to understanding variable relationships, EDA forms the backbone of any data analysis process.
Using Python for EDA is not only effective but also fun! This session guides viewers through the step-by-step process of performing EDA using popular Python libraries. The speaker starts from the basics of importing data, slowly advancing through cleaning, understanding variables, and using visualization techniques to make sense of it all. Whether it's Pandas for data manipulation or Seaborn for making attractive plots, Python's ecosystem has got your back.
The video doesn't just stop at theoretical concepts; it walks you through a practical demonstration of EDA in action. By loading a student performance dataset, the session offers a hands-on approach to each step involved in EDA. It showcases how to leverage visual tools like histograms and scatter plots to extract meaningful insights, making the complex data analysis process both manageable and exciting.
Chapters
00:00 - 01:00: Introduction and Agenda The chapter 'Introduction and Agenda' begins with music and a greeting from Versine from Edureka. Versine introduces the session's focus on exploratory data analysis (EDA) in Python. The agenda includes explaining what EDA is and discussing the objectives of performing EDA on any dataset.
01:00 - 02:30: Understanding EDA In this chapter titled 'Understanding EDA', the focus is on explaining the entire process of exploratory data analysis (EDA). The speaker goes through the steps involved in EDA and demonstrates how to perform EDA on a dataset from scratch. Additional promotions encourage subscriptions to the channel for more tutorials and provide links to further learning resources, such as a data science certification program with Python. The chapter sets the stage for a hands-on session in EDA.
02:30 - 03:30: Objective of EDA The chapter 'Objective of EDA' introduces exploratory data analysis (EDA) as a method to explore data and understand its various aspects. EDA involves a sequence of techniques aimed at data analysis and comprehension. Though specific techniques are to be discussed later, the ultimate goal of EDA is to gain insights and a deeper understanding of the data at hand.
03:30 - 05:00: Steps Involved in EDA The chapter 'Steps Involved in EDA' emphasizes critical steps in Exploratory Data Analysis (EDA) such as ensuring data cleanliness by removing redundancies, missing values, and null values. It also highlights the importance of identifying crucial variables and eliminating noise that might affect model-building accuracy. Understanding relationships between variables through EDA is crucial, as is the ability to derive meaningful insights from the analysis.
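The missing- and null-value handling described here can be sketched in pandas. This is a minimal illustration on an invented toy frame, not the video's dataset (which has no nulls):

```python
import numpy as np
import pandas as pd

# Toy frame with a couple of invented gaps (NaN = missing value)
df = pd.DataFrame({
    "math score": [66, np.nan, 70],
    "reading score": [80, 75, np.nan],
})

print(df.isnull().sum())                          # null count per column
filled = df.fillna(df.mean(numeric_only=True))    # option 1: impute with the column mean
dropped = df.dropna()                             # option 2: drop incomplete rows
print(filled)
print(len(dropped))  # only the fully complete row survives dropna
```

Which option to pick depends on how much data you can afford to lose — the presenter discusses this trade-off later for larger datasets.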
05:00 - 06:00: Practical Demo: EDA on a Dataset The chapter titled 'Practical Demo: EDA on a Dataset' focuses on the importance of Exploratory Data Analysis (EDA) in understanding and preparing data for further analysis. The transcript stresses the key objectives of EDA, which include ensuring the data is clean and free of inconsistencies such as null values. The fundamental idea highlighted is that EDA is crucial for gathering insights and facilitating a conclusive interpretation of the data before advancing to more complex data processing steps.
06:00 - 09:00: Loading and Understanding the Dataset The chapter titled 'Loading and Understanding the Dataset' emphasizes the importance of identifying and removing faulty data points as an essential part of data cleaning. It also highlights Exploratory Data Analysis (EDA) as a tool to understand relationships between variables, providing a broader perspective on the data. This understanding aids in building knowledge and utilizing variable relationships effectively. The chapter outlines these objectives as foundational steps in performing EDA on any dataset.
09:00 - 12:00: Data Cleaning The chapter titled 'Data Cleaning' explores the essential steps involved in exploratory data analysis (EDA). It emphasizes the importance of comprehending the dataset, including the variables it contains, such as the number of columns and rows. An understanding of the dataset's structure is highlighted as the crucial initial step after data loading.
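The first-look inspection described in this chapter boils down to a handful of pandas calls. A sketch on a hypothetical stand-in frame (the real demo uses the thousand-row Kaggle file):

```python
import pandas as pd

# Hypothetical stand-in for the student-performance data
data = pd.DataFrame({
    "gender": ["female", "male", "male", "female"],
    "math score": [72, 69, 90, 47],
    "reading score": [72, 90, 95, 57],
})

print(data.head())      # first rows: eyeball the variables
print(data.shape)       # (rows, columns)
print(data.describe())  # count/mean/std/min/quartiles/max for numeric columns
print(data.nunique())   # number of distinct values per column
```

`data.tail()` gives the last rows the same way, which is how the presenter confirms the index runs from 0 to 999.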
12:00 - 28:30: Relationship Analysis and Visualization This chapter delves into the processes involved in relationship analysis and visualization. It begins by emphasizing the importance of importing raw data into your program, followed by a critical step: data cleaning. The process of cleaning data is crucial as it involves removing redundancies, such as unnecessary variables or columns that do not contribute to conclusive interpretations. Additionally, it highlights the need to address outliers that can introduce noise, potentially causing models to overfit or underfit during model building. The chapter underscores that after cleaning the data, significant progress can be made in the analysis. While the transcript cuts off, it hints towards more steps that follow in the process.
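Dropping the unneeded columns works as sketched below — an invented two-row frame with the same column names the video removes:

```python
import pandas as pd

# Invented two-row frame with the columns the video drops
data = pd.DataFrame({
    "race/ethnicity": ["group A", "group B"],
    "parental level of education": ["some college", "high school"],
    "math score": [72, 69],
})

# axis=1 means "drop columns"; omitting it raises an error, as seen in the demo
student = data.drop(["race/ethnicity", "parental level of education"], axis=1)
print(student.columns.tolist())
```

`drop` returns a new frame, so the original `data` keeps all its columns unless you reassign or pass `inplace=True`.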
28:30 - 29:30: Conclusion and Next Steps In the conclusion and next steps chapter, the focus shifts towards practical application. The chapter suggests moving from theoretical discussions to practical exercises, specifically by using Jupyter Notebook for hands-on exploration. Readers are encouraged to access a dataset from Kaggle and perform Exploratory Data Analysis (EDA) to understand variable relationships.
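The relationship analysis the chapter points to starts from a correlation matrix, which pandas computes directly. A sketch with invented score columns mirroring the dataset's three numeric variables:

```python
import pandas as pd

# Invented numeric columns mirroring the dataset's three score columns
data = pd.DataFrame({
    "math score": [72, 69, 90, 47],
    "reading score": [72, 90, 95, 57],
    "writing score": [74, 88, 93, 44],
})

correlation = data.corr()  # pairwise Pearson coefficients, as in the demo
print(correlation)
# sns.heatmap(correlation, annot=True)  # the video then renders this as a heat map
```

Each cell is the correlation between one pair of columns; the diagonal is always 1.0 since every variable correlates perfectly with itself.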
29:30 - 30:00: End Credits and Further Learning Resources This chapter provides guidance on further learning resources and shortcuts for understanding specific tools. It mentions a cheat sheet for working with Jupyter Notebook and an Anaconda tutorial for installation. The chapter briefly discusses the initial step of importing necessary libraries, highlighting the importance of tools such as Pandas and Seaborn for data visualization.
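The setup step mentioned here amounts to a few imports plus a `read_csv`. The path in the commented line is hypothetical (it depends on where you saved the Kaggle file), so this sketch builds a tiny stand-in frame instead:

```python
# Minimal setup sketched from the video: pandas for tabular data,
# Seaborn for the later visualizations.
import pandas as pd
import seaborn as sns  # imported up front as in the tutorial

# data = pd.read_csv("students.csv")  # the video loads the dataset roughly like this
data = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})
print(data.shape)  # (rows, columns)
```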
Exploratory Data Analysis (EDA) Using Python | Python Data Analysis | Python Training | Edureka Transcription
00:00 - 00:30 [Music] hello everyone this is Versine from Edureka and I welcome you all to this session in which I am going to talk about exploratory data analysis in Python so let's take a look at the agenda for this session first of all I am going to explain what exactly exploratory data analysis is and then we will move on to the whole objective of doing EDA on any data set moving further
00:30 - 01:00 I will discuss all the steps that are involved in the whole process of exploratory data analysis and finally we will perform EDA on a data set from scratch I hope you guys are clear with the agenda also don't forget to subscribe to Edureka for more exciting tutorials and press the bell icon to get the latest updates on Edureka and do check out Edureka's data science with Python certification program the link is given in the description box below now without any further ado let us begin our session so
01:00 - 01:30 what exactly is exploratory data analysis exploratory data analysis or simply put we can call it as EDA as well is nothing but a data exploration technique to understand the various aspects of data it includes several techniques in a sequence that we have to follow and ok we will learn about those techniques later on in the session but the whole aim or the whole objective is to understand the data and understanding the data can be a lot of things when we are exploring the data and so few things
01:30 - 02:00 we have to keep in mind while exploring the data like we have to make sure that the data is clean and does not have any redundancies or missing values or even null values in the data set and we have to make sure that we identify the important variables in the data set and remove all unnecessary noise in the data that may actually hinder the accuracy of our conclusions when we work on model building and we must understand the relationship between the variables through EDA and last but not least we must be able to derive
02:00 - 02:30 conclusions or gather insights about the data for conclusive interpretation in order to move on to more complex processes in the data processing lifecycle now let us try to understand the objective of EDA in data exploration the very basic idea is to make sure that the data after the EDA is clean and by clean I mean the data has to be free of all redundancies including null values and all those things so we can narrow it down to two main basic objectives to perform EDA so first
02:30 - 03:00 objective is EDA helps us in identifying the faulty points in data and if you have identified the faulty points then you can easily remove them and clean your data and the next objective is that EDA helps us to understand the relationship between the variables which gives us a wider perspective on the data and it actually helps us build on it by utilizing the relationship between the variables so these are the main objectives of performing EDA on any data now let us move on and take a look at the steps
03:00 - 03:30 involved in EDA so these are the basic steps that are involved so I'll just highlight the few main points although each step has several other features as well so we will take a look at those while we are working on the demo guys the very first and the basic step is to understand the variables in the data set so you have to be pretty sure about what kind of data you are working on what are the variables like the number of columns and rows and how it actually looks like so that is your first step after loading
03:30 - 04:00 the data into your program then the next step is to clean the data from the redundancies now redundancies can be irregularity in the data it can be some variables or some columns that are not necessary for making our conclusions or interpretations so we can just remove them or there are outliers which can cause noise in the data or you know it may over fit or under fit the model when we are working on the model building as well so this is the second step guys we have to clean the data in order to move forward and last but not least we have
04:00 - 04:30 to analyze the relationship between the variables so let us move on to the fun part guys so what I will do now I will jump right to Jupyter Notebook and we will work on a demo I'm going to take a data set from Kaggle and perform EDA on it so let's switch over to Jupyter Notebook guys so I have already opened this notebook guys and if you don't already know how to work with Jupyter Notebook we have a full tutorial on how to work with Jupyter Notebook you can find it on our YouTube page guys and if you're still
04:30 - 05:00 looking for shortcuts like if you wanna just understand how it really works we have a cheat sheet as well which you can refer to for working with Jupyter Notebook and if you're looking at installation and everything we have an Anaconda tutorial as well so the very first thing you have to do is import certain libraries that you're going to need so I'm going to import pandas with an alias PD I'm going to import a few of the libraries that you may need I'm going to import Seaborn for visual representation guys because we are going to be visualizing the
05:00 - 05:30 relationship between the variables so for that I'm gonna use Seaborn so I'll run this program and this cell is successfully running right now it is going to take a while guys so meanwhile I just wanna tell you like how we are going to approach this okay we have done that so I'm just going to so I'm going to take this variable data and I'm going to use the pandas library so first of all the very first step is I have to import my data set guys so
05:30 - 06:00 this is the location of my data set and the name of the data set is students dot CSV okay we have an error which is file not found all right so we have successfully imported our dataset into the program so the very first step after you load the data into your program is you have to
06:00 - 06:30 understand the data by understanding the variables inside your data so I'll just name it as first so the very first step is understanding the data and I'm going to check the first five rows of my data guys so this is my data the first five rows we have these columns like gender race parental level of education lunch
06:30 - 07:00 test preparation course math score reading score and last we have writing scores so these are the scores that are going to be important in our data set by just looking at it I can tell you like these are the values that will be very important while working on any model or making the assumptions or making any conclusions like gender has to be there because it's decisive it has to be either male or female so this is one categorical value that we are going to need in our data set whereas race/ethnicity can maybe be dropped it's not
07:00 - 07:30 necessarily a very important column either and parental level of education if we'll check for the unique values then we'll decide so that is what we are going to do to perform EDA on it let's check the tail as well like the last five rows as well so we have all these values we have already taken a look at so one thing you can make sure of is it's starting from zero and going until 999 so we can just say that we have a thousand entries in this data set so it's not a very big data set but it's relatively not a very small data set
07:30 - 08:00 either it's perfect for us because while doing the representation it's going to be quite easy for us now let's check for the shape of the data as well so these are all the steps that you have to follow while working alright so we have checked the shape so we have thousand rows and eight columns guys let's just take a look at few other key points
08:00 - 08:30 when you use the describe it's only showing the math score the reading score and the writing score because all of the other variables that we have are string objects only the integer objects are showing over here so we have a count here like thousand and we have a mean value we have the standard deviation the minimum value 25 percent 50 percent 75 percent and the maximum value as well so as you can see for all those values 100 marks is the maximum and the minimum we have math score is zero reading score is
08:30 - 09:00 17 and writing score is 10 so all these values you can just get by the describe method after that you can just check for columns and rows separately as well so for that you just have to write like data dot columns all right it's not callable so we have gender race/ethnicity parental level of education lunch test preparation course math score reading score and writing score so we have
09:00 - 09:30 completed that okay so we'll just check for the nunique values which is nothing but a function which returns a series with a number of distinct observations over the requested axis so if we set the value of axis to be zero then it finds the total number of unique observations over the index axis so let's just check for the unique values so that's what we do now we check for the unique values in our data set so I'll just use nunique and I've already told you what it does so for all these columns it is showing us the unique
09:30 - 10:00 values so for gender we have two unique values which is basically male and female for race and ethnicity we have five values parental level of education we have six values for lunch we have two values for test preparation course we have two values for math score reading score and writing score we have quite a few unique values from out of zero to 100 we have scores of 77 unique values for writing score for reading score we have all those things and if you want to check separately for any column you can just write let's say gender and we can
10:00 - 10:30 just write unique and it will show us the unique values inside that column like so which is male and female similarly if you want to check let's say for race and ethnicity we can check so we have group B group C group A group D and group E for parental level of education also we can check all
10:30 - 11:00 right so we have bachelor's degree some college we have master's degree associate's degree high school and some high school so these are all the values that you can just make out looking at the data so by looking at these unique values I can tell you we have categorical values like we have test preparation course lunch and gender which can be converted into the dummy values out of all these values I'm just going to take these three that is math score reading score and writing score and lunch test preparation and gender and the other ones like ethnicity and
11:00 - 11:30 parental level of education can be dropped because these are not necessarily very important variables inside our dataset now we will move on to the next part of our EDA which is basically nothing but cleaning the data set so the very first thing that would come into your mind is check for the null values inside any of these for that we can just check for the null values with isnull dot sum so inside this dataset we have 0 null
11:30 - 12:00 values so we don't have to worry about dropping any column just because there is no value or replacing it with some other values but in some cases in some data sets which are relatively very large like if you have 7,000 or 8,000 values and if you have even a few null values or missing values inside those data sets you have to be pretty sure about whether you want to leave those values untouched or if you want to just drop them or replace any values from them so since we don't have any
12:00 - 12:30 null values inside this we'll move on to the next part which is dropping the redundant data which is not necessarily going to affect the performance of our table guys so now what we'll do is we'll remove a few columns that we don't actually need inside our data set so we'll remove race/ethnicity and parental level of education so these are two values that I don't need in my data set because I think these are not important values for any evaluation so I'll just remove these so I'll take one variable as student is equal to
12:30 - 13:00 data dot drop and I'm gonna provide the race/ethnicity column name right and we don't want parental level of education and it has to be axis is equal to one otherwise it will throw us an error right so when I look at student so
13:00 - 13:30 we have all these values we have gender lunch test preparation course math score reading score and writing score next step would be like checking for outliers which is not necessarily going to be a problem with us because we have a pretty clean data set so you can check for outliers as well and if you want to know more about outliers I will tell you what outliers actually are so outliers are nothing but in statistics
13:30 - 14:00 an outlier is a data point that differs significantly from other observations let's say if you have math scores which are 72 you know 69 and suddenly somebody has a 0 or 1 so that is going to be an outlier and an outlier may be due to the variability in the measurement or it may indicate experimental error as well so the latter are sometimes excluded from the data set because an outlier can actually cause serious problems in statistical analysis so that's why we have to look for
14:00 - 14:30 outliers and in this data set we don't necessarily have any outlier so we're gonna leave that and move on to the third step that we have which is basically nothing but the analysis of the relationships or we can call it as relationship analysis so I'll just mark it as three I can just write it as relationship analysis now what we'll do is we take a look at a few other measures so first of all we have the correlation matrix and before we
14:30 - 15:00 move on to relationship analysis I hope everything is clear to you guys like we start from loading the data and then we talked about how we can explore the data look at different points in the values and then we check for any missing values while cleaning the data we check for null values you have to check for it and then see outliers and remove all the unnecessary redundant variables that we have so we have done all that now we are moving on to the next step which is also the final step basically nothing
15:00 - 15:30 but relationship analysis all right so the first step I would like to do is the correlation matrix because it gives us a wider perspective on what exactly we are dealing with here and a correlation matrix is a table showing correlation coefficients between variables and each cell in the table shows the correlation between two variables and a correlation matrix is used to summarize data as an input into a more advanced analysis and also as a
15:30 - 16:00 diagnostic for advanced analysis so we'll do that guys so we start with the relationship between the variables by looking at the correlation matrix for that I'm gonna take one variable let's say correlation and I'm going to use my data that is student and I will take the corr function all right so we have no errors guys now I'm going to put it inside a heat map guys using the SNS
16:00 - 16:30 library that I have or Seaborn library SNS is basically the alias that I'm using for importing the library so I'm gonna use the heat map to actually show you what it looks like and so we have xticklabels is equal to correlation columns and then we have yticklabels is equal to correlation columns and then we'll
16:30 - 17:00 have one more argument annot is equal to True alright we have an error guys so basically I have forgotten a comma over here oh it should work fine now and we have one more error correlation so
17:00 - 17:30 this is our heat map guys so we have math score reading score and writing score and we can take a look at this heat map but since we do not have a lot of integer values we have a few categorical values as well so it is not necessarily that defining for our data set but if we change these to dummy values we'll be able to get a specific value for this for writing score and let's say for math score there is a variability of 0.8 and then we have a
17:30 - 18:00 correlation almost everything is the same here so we will not rely on this and move on to the next step for our data set guys so we are going to plot a few other plots so there is one pair plot we can actually plot okay so I'll just plot it and then tell you what it is so pair plot on the other hand is when you only want to visualize the relationship between two variables where the variables can be continuous or categorical as well and pair plot is
18:00 - 18:30 usually a grid of plots for each variable in your data set as you can see we have a writing score math score and all these scores so these are quite descriptive when you look at it so all are increasing but this is also not quite decisive when we are taking a look at a conclusion guys so we move on to the next approach that we have which is a scatter plot so a scatter plot is a type of data display that shows the relationship between two numerical variables each
18:30 - 19:00 member of the data set gets plotted as a point whose (x, y) coordinates relate to its values for the two variables okay so I'm going to use the relation plot I'll take the x axis to be let's say math score and we take y as the reading score we take the hue as gender data is equal to
19:00 - 19:30 student all right so we have a scatterplot here and we have added the hue as well so as you can see guys all these values that are in blue dots are actually the female gender and the other ones are the male counterparts and there we are having a relationship between their math score and reading score so for all the relative values or the categorical
19:30 - 20:00 values that we have inside our dataset which are basically nothing but test preparation course as well and we have lunch which has two values so we can change the hue and check for the relationship otherwise so instead of this I can write lunch so for people who have reduced and free lunch the scores are not actually looking good but for the standard lunch people the scores are pretty good guys the reading score is also quite high and the math
20:00 - 20:30 score is pretty good as well this can be used to maximize our analysis guys because we're basically plotting three different values inside the same graph so we have the scores of two different values that is a reading score and a math score and then we have the hue as well which is basically nothing but telling us the different values inside the lunch column guys and then similarly we can have for test preparation course as well so this is the scatterplot guys which we can use to analyze our data and then we
20:30 - 21:00 move on to the next plots that we have which is basically nothing but our histogram so a histogram is a graphical display of data using bars of different heights and in a histogram each bar groups numbers into ranges so the taller bars show that more data actually falls in that range and a histogram basically displays the shape and the spread of continuous sample data so now what we'll do is we'll take a look at a few histograms guys so for that I'm going to use the SNS dot distplot it's going to
21:00 - 21:30 give us the distribution so let's say first we check for math score all right so we have the distribution over here starting from zero most of the values are lying between 60 to 80 so we can take a guess that the highest number of people are getting values around you know 60 to 70 so we'll check for other values as well let's say we check for reading score so as you can
21:30 - 22:00 see we have that distribution again and we check for writing score as well all right so all these values we can check with the histogram guys and we can add bins as well so we will take the bins is equal to let's say 20 or we'll take the bins to be 5 so now you can clearly see the distribution because we have only 5 divisions over here most of the values are
22:00 - 22:30 between the 60 to 80 column so that is how you can use the histogram to analyze the relationship between or the relationship in your dataset right so there's one more plot that I want to talk about which is nothing but all right we'll just take the relation plot again or we'll just take the categorical plot and inside this let's say X is equal to say gender we want the kind to be box
22:30 - 23:00 and data is equal to student right box is not defined cannot perform reduce with flexible type alright so we'll change the values guys so we take the math score right so we have our box plot guys so for the math
23:00 - 23:30 score again we are getting the same values around 60 to 80 so most of the values are falling around that so that is how you can check the relationship for reading score again you can check those values similarly for writing score as well all right so for this analysis my suggestion would be the scatter plot guys that is quite easy to understand because all these values that are the averages we can check with the describe method that we have done in the beginning or we can just go for this one
23:30 - 24:00 as well to understand the relationship between different variables over here which is math score writing score and all those things but in reality we actually want the relationship between the gender and the math score so let's say you want to check the lunch and the math score and the test preparation course and math score for that you're going to use the scatter plot that will just give you all the values that you actually need so this is how you do EDA on any data set so this is a relatively very small data set which had only eight columns and thousand values we started
24:00 - 24:30 with importing the data into our program then we check for the first five rows then we check for the last five rows what exactly was there then we described the data in which we had the count the mean the standard deviation minimum values and the maximum values as well for all the integer values so you have to make sure that all these values inside your table should be in integer values only then you're going to get the describe function to its maximum capacity otherwise you will be missing a lot of values inside your table and then of course we check
24:30 - 25:00 for the shape and column names which you can also take a look at when we are looking at the first five or last five rows but the thing is this one has only eight columns and sometimes in a few data sets like if you're working on stock prediction there are at least 18 19 or maybe 200 columns that have entries in that data set so that is one thing you have to make sure that you are able to check the shape so that you will be able to get like how many rows and how many columns
25:00 - 25:30 and you actually have and then you can check for column separately for all these values that we already have over here and then you can check for unique values for which we have for gender we had two for all the values that are two it could be made into categorical values let's say we can say like lunch if you had lunch or not that is going to be a one categorical value either zero of one which can be a categorize and say yes or no so we'll be able to convert it into integer values for computations and similarly for gender if it's a male or a
25:30 - 26:00 female we can change it to zero and one as well and for race and ethnicity we had five and six values for parental level of education that was quite a challenge so we had to remove these redundancies later on in the session while cleaning the data and for all these math score reading score and writing score are pretty important in our data set so we have to keep those and then we actually check for unique values of different tables so that we can make out what exactly is necessary for our table or not
26:00 - 26:30 Then came the second part, in which we cleaned the data. We checked for null values; since our data set had none, we had no problem working with it. If you do find null values, one option is dropna, which drops the whole affected row from the table; another is to replace them with some other value. For example, if there were a few null values in the math score column, what we
26:30 - 27:00 would do is take the average, the mean value, which is 66, and put that value into all the null cells. Since the mean is already 66, imputing with it does not distort the data much, so that is one good way to handle null values. We also found two columns that do not add anything to our analysis, although such columns can still be useful for visual analysis, for instance relating parental level of education to the marks a student actually scores.
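Both null-handling options can be sketched like this, using a hypothetical math-score column with one gap:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"math score": [66.0, np.nan, 70.0, 62.0]})

print(df.isnull().sum())             # how many nulls per column

dropped = df.dropna()                # option 1: drop the rows containing nulls
mean_val = df["math score"].mean()   # nulls are skipped: (66 + 70 + 62) / 3 = 66.0
filled = df.fillna({"math score": mean_val})  # option 2: impute with the mean
print(filled)
```

Dropping loses a row; imputing with the mean keeps the row and, as the session points out, barely moves the column's average.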
27:00 - 27:30 Race/ethnicity, on the other hand, carries no importance in our data; it does not influence anything, so it is a column we can simply drop, and we did. After that we moved on to relationship analysis: we calculated the correlation matrix of the student data and put it into a heat map to check the values.
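A minimal sketch of that drop-then-correlate step, assuming seaborn for the heat map as in the session (the score values here are made up):

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy scores; the real values come from the Kaggle data set.
df = pd.DataFrame({
    "race/ethnicity": ["group A", "group B", "group C", "group D"],
    "math score": [47, 69, 72, 90],
    "reading score": [57, 90, 72, 95],
    "writing score": [44, 88, 74, 93],
})

df = df.drop(columns=["race/ethnicity"])  # column judged uninformative

corr = df.corr()                          # pairwise Pearson correlations
sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.savefig("corr_heatmap.png")
```

The diagonal of the matrix is always 1.0 (each score correlated with itself); the off-diagonal cells are what you read the heat map for.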
27:30 - 28:00 The heat map did not give us a very conclusive analysis, so we moved on to a pair plot. The pair plot does much the same thing, showing the relationships between every pair of columns: writing score, math score, and reading score. The most effective relationship-analysis tool we found for this data, though, was the scatter plot, which let us make a few assumptions that are quite important for this data set.
28:00 - 28:30 Then we looked at a histogram to understand the distributions in our data, and we used a box plot for the same purpose. From each step we performed here, we can draw one conclusion or another. The next step after this is model building, for which you have to change a few more things in the data, but that no longer falls under EDA.
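Both distribution views can be sketched side by side on a hypothetical score column:

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical math scores.
scores = pd.Series([44, 57, 62, 66, 70, 74, 88, 93], name="math score")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
scores.plot.hist(bins=4, ax=ax1, title="distribution")  # shape of the distribution
scores.plot.box(ax=ax2, title="spread and outliers")    # median, quartiles, outliers
fig.savefig("distributions.png")
```

The histogram shows where the bulk of the scores sits; the box plot makes the median, quartiles, and any outliers easy to read off.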
28:30 - 29:00 For example, if you wanted to predict the gender of a student from their marks in certain columns, the string values would all have to go: they need to be converted into integers, or dummy values like 0 and 1. If you want to go further into predictive analysis, we are going to make a tutorial on a stock prediction program, so hang in there. Now that we are done with this session, I hope you are clear on what exactly EDA is, how we do it, and what the objective of doing EDA on our data is.
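That string-to-number conversion can be sketched in two ways with pandas, either a manual 0/1 mapping or one-hot dummy columns (the `gender_code` column name is just an illustration):

```python
import pandas as pd

df = pd.DataFrame({"gender": ["male", "female", "female", "male"],
                   "math score": [69, 72, 90, 47]})

# Manual 0/1 mapping for a binary column...
df["gender_code"] = df["gender"].map({"female": 0, "male": 1})

# ...or one-hot "dummy" columns for any categorical column.
dummies = pd.get_dummies(df["gender"], prefix="gender")
print(df)
print(dummies)
```

The manual map keeps one compact column; get_dummies scales to columns with more than two categories, like race/ethnicity or parental level of education.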
29:00 - 29:30 Don't forget to subscribe to Edureka for more exciting tutorials, and press the bell icon to get the latest updates from Edureka. Also check out Edureka's Data Science Certification with Python program; the link is given in the description box below. And take a look at our Data Science with Python full course as well; you will find a lot of advantages to learning Python for data science. Stay tuned for more from us, thank you, and have a nice day. I hope you have enjoyed listening to
29:30 - 30:00 this video. Please be kind enough to like it, and you can comment with any of your doubts and queries; we will reply to them at the earliest. Do look out for more videos in our playlist, and subscribe to the Edureka channel to learn more. Happy learning!