Sampling&Distributions2

Summary

In this video, Erin Heerey explains how sample scores are linked to populations using probability density functions. She introduces the normal distribution and its characteristics, and the role of random sampling in making inferences about populations. The lecture also covers the uniform, binomial, and normal distributions, and introduces skewness and kurtosis, which are important for understanding how data are distributed in psychological research.

Highlights

Understanding probability density functions is crucial for linking samples to populations 🤓
The standard normal distribution is described by a mean of zero and a standard deviation of one 💡
Uniform distributions apply to scenarios like fair dice rolls or prize draws 🎲
Binomial distribution involves win/lose experiments with repeated trials, often seen in coin tosses 🪙
Skewness and kurtosis are key parameters of a distribution, indicating asymmetry and tail heaviness 📈

Key Takeaways

Probability density functions help link sample scores to populations 📊
The normal distribution, the standard bell curve, is the one used most often 🎯
Uniform distributions have equally likely outcomes, like dice rolls 🎲
Binomial distributions involve experiments with two outcomes, like coin tosses 🪙
Skewness indicates asymmetry in data distribution, with income data often positively skewed 💸
Kurtosis measures tail heaviness in distributions; the distribution used by the t-test is leptokurtic 🧪

Overview

In this session, Erin Heerey covers sampling and distributions, focusing on the probability density function. She explains how this function links sample data to the population it came from, compares it to the kernel density estimates from the previous lecture, and introduces the normal distribution - the familiar bell curve.

The lecture then moves on to the distribution types commonly encountered in psychology and statistics. Erin discusses uniform distributions using the example of dice rolls, where every outcome is equally probable. She then turns to binomial distributions, which apply to scenarios with binary outcomes like flipping a coin multiple times.

The session wraps up with skewness and kurtosis, two parameters that describe the shape of a distribution. Erin illustrates skewness with the example of income distribution and explains kurtosis as a measure of how many scores fall in the tails of a distribution.

Chapters

00:00 - 01:30: Introduction to Probability Density Function and Normal Distribution This chapter introduces the probability density function (PDF) and its role in linking a score from a sample to the population it was drawn from, which the lecture notes is not a trivial problem. It relates the PDF to the kernel density estimates from the previous lecture, describing it as a smoothed histogram, and introduces the standard normal distribution with a mean of zero and a standard deviation of one.
01:30 - 04:00: Random Sampling and Probability This chapter uses the analogy of drawing marbles from a barrel to explain random sampling from a normal distribution. Values near zero are the most common marbles, while extreme values out toward plus or minus four are rare, so a random draw is more likely to land near the center. Rare values can still be drawn; they are just less probable.
04:00 - 05:00: Overview of Common Distributions This chapter explains that random samples let us make inferences about populations, and that knowing how likely a value is tells us how it relates to the population, for example how far it sits from the population mean. It then lists the three distribution types covered in the lecture: uniform, binomial, and normal (Gaussian).
05:00 - 07:00: Uniform Distribution Explained The chapter introduces the uniform distribution, in which the outcomes are bounded, known, and equally likely. Examples include the roll of a fair six-sided die and a single-entry raffle. It also mentions the rarer continuous uniform distribution, such as where a rolled ball stops between two points.
07:00 - 10:00: Binomial Distribution Discussion This chapter covers the binomial distribution, which describes win-or-lose outcomes in an experiment repeated a fixed number of times, such as coin tosses or dart throws. The trials must be independent, with the same probability of a win on every trial.
10:00 - 13:00: Normal Distribution Characteristics The chapter describes the standard normal (Gaussian) distribution: it is symmetric, with a mean of zero and a standard deviation of one, and scores cluster around the mean. It notes that most variables in Psychology are normally distributed because they reflect the combined influence of many smaller variables.
13:00 - 15:00: Skewness and Kurtosis in Distributions This chapter discusses skewness and kurtosis. Skewness measures the asymmetry of a distribution: with positive skew, as in income data, the mode tends to be lower than the median, which tends to be lower than the mean, and with negative skew the order reverses. Kurtosis measures how heavy the tails are, distinguishing mesokurtic (normal), platykurtic (light-tailed, such as the uniform), and leptokurtic (heavy-tailed) distributions.

Sampling&Distributions2 Transcription

00:00 - 00:30 One problem we need to consider is how we link a score from a sample to the population from which it was drawn, and that's not a trivial problem. The way we do this is we consider what we call the 'probability density function' of the sample. The probability density function of the distribution plot: remember, last lecture we talked about 'kernel density estimates', so this is similar to that. It's a smoothed
00:30 - 01:00 histogram that describes the probability that any given score falls at any band in this graph. So the distribution plot is a plot of the probability density function for a given variable, and it shows the probability of observing a given data value, a particular data point within a distribution. Let's take this distribution. It's a very special distribution. It's called the normal distribution. You've heard of this distribution - it's the standard bell curve, and it has a couple
01:00 - 01:30 of interesting properties. It's described by a mean of zero and a standard deviation of exactly one. What that means is that within each band, so from the mean to minus one standard deviation, from the mean to plus one standard deviation, from plus one to plus two, plus two to plus three, and so on, there is a specific proportion or a specific percentile of the
01:30 - 02:00 data that falls within that band. Now let's unpack that because this is a really important concept. Imagine that I want to take a random sample from this population. Now let's pretend for a minute that I have a lot of little marbles, and each marble has a value from this distribution on it. Some of those marbles will have a value of zero, some of the marbles have a value
02:00 - 02:30 of 0.01, some will have a value of 0.02, some will have a value of 0.03, there will be values of 0.1, 0.2, and so on all the way up[/down] through plus[/minus] four. Let's pretend the distribution ends there. There won't be very many of those more extreme values; in fact, it's less than 0.13 percent. So what we're looking at here is the frequency with which those values occur.
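A minimal Python sketch of the band proportions described above, assuming the standard normal distribution (mean 0, standard deviation 1). The helper names `phi` and `band` are illustrative and not from the lecture; the calculation uses the standard relationship between the normal cumulative probability and the error function.

```python
import math

def phi(z):
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band(lo, hi):
    """Proportion of a standard normal distribution between lo and hi."""
    return phi(hi) - phi(lo)

# Proportion of the distribution in each standard deviation band
# on the right-hand side of the mean (the left side mirrors it).
bands = {
    "mean to +1 SD": band(0, 1),
    "+1 to +2 SD": band(1, 2),
    "+2 to +3 SD": band(2, 3),
    "beyond +3 SD": 1.0 - phi(3),
}

for label, p in bands.items():
    print(f"{label}: {100 * p:.2f}%")
```

Under these assumptions the bands come out at roughly 34.13%, 13.59%, 2.14%, and 0.13%. The 0.13 percent figure corresponds to the tail beyond three standard deviations on one side, so values out at plus or minus four are rarer still, consistent with "less than 0.13 percent".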
02:30 - 03:00 A value of zero is going to be the most commonly occurring marble, so if I put, let's say, 10 million of these marbles with these values, in appropriate proportion to this normal distribution, in what I guess would be a very large barrel, mixed them up and drew one at random, what number would I get? Well, we'll be talking about probability in the next part of the lecture,
03:00 - 03:30 but if we think about it, when there are more of a particular value within a distribution, I am more likely to draw that common value. So there are more values of 0 and 0.01 and 0.02 than there are values of 0.38 and 0.39 and so forth. There are very few of these higher values, so those values aren't likely to come up very often. It doesn't mean they won't - the first
03:30 - 04:00 marble I pick could be one of those, but on average I'm going to pick more marbles that have values closer to the center of this distribution simply because there are more of those marbles in the bag. So that's one way to imagine what the probability density function is telling us. It's telling us about how likely different values are to be sampled.
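The marble barrel can be imitated with a short simulation. This is only a sketch under stated assumptions: it uses 100,000 marbles rather than 10 million to keep it fast, draws their values from a standard normal distribution with Python's `random.gauss`, and the helper name `share` is illustrative.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

N_MARBLES = 100_000  # far fewer than the lecture's 10 million, for speed

# Each "marble" carries a value drawn from a standard normal distribution.
marbles = [random.gauss(0.0, 1.0) for _ in range(N_MARBLES)]

def share(values, lo, hi):
    """Fraction of values v with lo <= |v| < hi."""
    return sum(lo <= abs(v) < hi for v in values) / len(values)

near_center = share(marbles, 0.0, 1.0)        # within one SD of the mean
extreme = share(marbles, 3.0, float("inf"))   # more than three SDs away

print(f"within 1 SD of the mean: {near_center:.3f}")
print(f"more than 3 SD away:     {extreme:.4f}")

# Drawing one marble at random: a central value is far more likely,
# but an extreme one is still possible.
print("one random draw:", round(random.choice(marbles), 2))
```

Roughly two thirds of the marbles sit within one standard deviation of the mean and only a fraction of a percent lie beyond three, which is why a single random draw usually lands near the center.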
04:00 - 04:30 Ideally, we use random sampling to take samples from populations and those samples allow us to make generalizations to or inferences about that population. By understanding the likelihood of a given value in a distribution, that allows us to characterize the sample and therefore the population more accurately. So it tells us something about the relationship between the value and the population. For example, is the value very far away from the population mean? How often is that value likely
04:30 - 05:00 to come up? So that's what these probability density functions or distributions do for us. There are lots of different types of distributions that we use regularly in Psychology. We use the uniform distributions on a regular basis. We use binomial distributions on a regular basis. We use normal or gaussian distributed data or variables on a regular basis, and so we'll be talking about these particular distribution types today, but please know that when we talk
05:00 - 05:30 about specific statistical tests in the second part of this class, we will also be talking about distributions that are unique to those statistical tests. Each statistical test has its own distribution and we will talk specifically about those distributions when we get to those tests. For now we're going to concentrate on three common distribution types. The first one is what we call a uniform distribution. So you might get a uniform distribution if all of the outcomes
05:30 - 06:00 that you have are bounded, and you know what they are, and they're all equally likely. For example, the roll of a single die, a six-sided die let's say. We know that you can get a one, or a two, or a three, or a four, or a five, or a six. You can't get anything else: there's no seven, there's no zero, there's no 16. You can only get the values one through six. And you also know, assuming the die is fair,
06:00 - 06:30 that each one of those values is equally likely to be rolled. If you are entering into a single entry raffle or prize draw, your number is just as likely to come up as everyone else's is. In uniform distributions, there are a discrete finite number of outcomes that can occur. We can also have uniform distributions that are continuous, these are much more
06:30 - 07:00 rare, that can have an infinite number of outcomes. One good example is the distance between two points. You can have a continuous number. Think about rolling a ball down the street. How far does the ball go between two points? There are an infinite number of outcomes because you can keep slicing the space into smaller and smaller and smaller
07:00 - 07:30 values. That's a continuous distribution, and depending on the roll of the ball and how hard you rolled it, it can land in any portion of your sample space, and all positions can be equally likely. A binomial distribution is a distribution that is comprised of the probability of a win or lose outcome in an experiment that's repeated multiple times. So this
07:30 - 08:00 prefix 'bi' here in binomial, 'bi' means 'two' possible outcomes. So this might be a coin toss, this might be whether you throw a dart and it hits or misses the target. The number of observations in these experiments must be fixed. So an experiment must be repeated x times, where that x is specified beforehand. So you can think about an experiment that has 50 trials and on each trial participants will either
08:00 - 08:30 get it right or they will get it wrong. So that would be a binomial distribution. One example of this is a coin toss. So you have a 50% chance of getting heads in a single coin toss, and if you toss a coin 20 times you have close to a 100% chance of getting at least one heads in those 20 tosses of that coin. One of the important things about a binomial distribution
08:30 - 09:00 is that all the trials are independent. So what happens on trial one is totally unrelated to what happens on trial two. So the probability of a win is identical from one trial to the next and that is how we get a binomial distribution. You will notice that these are discrete points that are distributed across, in sort of a normal fashion, across the potential outcomes here.
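The coin-toss claim above can be checked with a short sketch. It assumes a fair coin (probability of heads 0.5 on every toss) and independent tosses, as the lecture describes; the function name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k wins in n independent trials, each with win probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n_tosses = 20   # number of trials, fixed beforehand
p_heads = 0.5   # assumed fair coin

# P(at least one heads) = 1 - P(no heads at all)
p_at_least_one = 1 - binomial_pmf(0, n_tosses, p_heads)
print(f"P(at least one heads in {n_tosses} tosses) = {p_at_least_one:.7f}")

# The full distribution over 0..20 heads: discrete points that peak
# in the middle and fall away on both sides.
pmf = [binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1)]
most_likely = max(range(n_tosses + 1), key=lambda k: pmf[k])
print("most likely number of heads:", most_likely)
```

The probability of at least one heads is 1 minus one half raised to the 20th power, which is within about one in a million of 100%. The most likely count is 10 heads, with probabilities falling away symmetrically on either side, which is the roughly normal shape of the discrete points mentioned above.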
09:00 - 09:30 And then finally, we have our standard normal distribution. This is the one we're going to deal with the most often. You've seen this picture before. It's also known as a Gaussian distribution after the German mathematician, Gauss. It is perfectly symmetric, so there are exactly the same number of data points on the left of the central mean as there are on the right, and it's described by a mean of zero and a standard deviation of one.
09:30 - 10:00 Now of course we can have normally distributed data that have different means and different standard deviations because we're measuring something different than this kind of standard Gaussian distribution, but in general a normal distribution is going to have similar proportions of data distributed across each standard deviation boundary. As you see in the normal distribution plot, there will be more scores clustered around the mean and as you get further away from the mean there are fewer scores there. Most of the variables we deal with in Psychology are
10:00 - 10:30 normally distributed. One of the reasons for that is they rely on the influence of lots and lots of smaller variables and when different variables influence different things in different ways, what you end up with is kind of this hodgepodge where most people are closer to the mean and fewer people are farther away from it. And that gets you what is essentially a normal distribution. Now there are two other parameters that we need to talk about when we're thinking about
10:30 - 11:00 distributions. One of those is a parameter called skewness. You can have positive skew and you can have negative skew. So what you see here is, here's our symmetrical distribution and it's got a specific characteristic: the mean, the median, and the mode are all identical values. They all line up right on top of one another or pretty darn close. When we have positive skew, the mode tends to be lower than the median which in turn tends
11:00 - 11:30 to be lower than the mean. So that usually means there are some high scores in these distributions. A classic positively skewed distribution is the distribution of income, right? Most of us don't make that much money, but there are some people who make a lot, a lot, a lot of money and that skews the data. We can also have the opposite type of skew, which is negative skew, where very few people have very low scores and then as we get higher up the number line here, we have more people
11:30 - 12:00 clustered around higher points. So in this case, when we have negative skew, the mean and the median are both lower than the mode and that indicates negative skew. The other parameter we need to think about is a parameter called 'kurtosis' and that's a measure of how heavy the tails in a distribution are. There is the symmetrical normal distribution. The normal
12:00 - 12:30 distribution is called 'mesokurtic' and it means it has these really nice evenly distributed tails where there aren't too many scores in the middle and there aren't too few scores on the end. We can also get distributions that are 'platykurtic'. The uniform distribution is platykurtic. It has very light tails. Its distribution has low tails, and you can sort of see this if you look at the colors here. This kind of purple distribution, this one here is much,
12:30 - 13:00 it has many fewer scores in the tails here; they do not extend very far. And then we can think about a 'leptokurtic' distribution, that's this blue one here, where it's very peaked in the center and what you can see is that there are more scores out in the tails than there probably should be, so that's a leptokurtic distribution. And we'll see that, there is a test, it's a very
13:00 - 13:30 famous and frequently used test statistic called the t-test that has a leptokurtic distribution. So in general, the kurtosis is a measure of how much or how many scores are in the tails of these distributions. So you'll want to be thinking about that when you're evaluating whether or not something is normally distributed. I'll break it there and we'll pick up the next section in the next video.
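Skewness and kurtosis can be computed directly from a sample. The sketch below is one common moment-based definition, not necessarily the one used later in the course: skewness as the third standardized moment and excess kurtosis as the fourth standardized moment minus 3, so a normal distribution scores close to zero on both. The simulated data are assumptions for illustration only; in particular, log-normal values stand in for income-like data.

```python
import random
import statistics

def skewness(xs):
    """Population skewness: the third standardized moment."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3, so a normal distribution scores about 0."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 4 for x in xs) / len(xs) - 3.0

random.seed(1)  # fixed seed so the sketch is repeatable
n = 50_000
normal = [random.gauss(0, 1) for _ in range(n)]
uniform = [random.uniform(0, 1) for _ in range(n)]
# Log-normal values stand in for income-like data (an assumption, not lecture data).
incomes = [random.lognormvariate(0, 0.5) for _ in range(n)]

for name, xs in [("normal", normal), ("uniform", uniform), ("income-like", incomes)]:
    print(f"{name:12s} skew={skewness(xs):6.2f}  excess kurtosis={excess_kurtosis(xs):6.2f}  "
          f"mean={statistics.fmean(xs):.2f}  median={statistics.median(xs):.2f}")
```

With these simulated samples, the normal data score near zero on both measures (mesokurtic), the uniform data have negative excess kurtosis (platykurtic, light tails), and the income-like data show positive skew with the mean above the median and positive excess kurtosis (leptokurtic, heavy tails).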