Understanding Point Estimates in Statistics

Estimation3

Summary

In this lecture, Erin Heerey discusses point estimates, where a single statistic is used to estimate a population parameter based on a sample. The focus is twofold: ensuring the estimator is unbiased and considering its efficiency. An unbiased estimator averages out to the population parameter over many samples, e.g., the sample mean for the population mean. Efficiency concerns the sampling variability of an estimator: in skewed distributions, the median might be more efficient than the mean because its sampling distribution can have a lower standard deviation. The issue with point estimates is their apparent precision, which gives a false sense of accuracy, since a single sample almost never matches the true population parameter exactly. A solution to this issue is discussed in the next video.

Highlights

Point estimates offer a single statistic to represent a population parameter based on a sample. 🔍
Unbiased estimators average out to the population parameter, balancing over and under-estimations. ⚖️
An estimator might be chosen based on efficiency, particularly in skewed distributions, where the median can be more reliable than the mean. 📊
The standard deviation is a biased estimator, more likely to underestimate than overestimate the population standard deviation, tackled by Bessel's correction. ✂️
Point estimates, though appearing precise, rarely match the true population parameter precisely; a solution will be discussed in the next segment. 🔮

Key Takeaways

Point estimates use a single statistic from a sample to estimate a population parameter. 🎯
Unbiasedness ensures estimators average out to target the population parameter. It seeks balance in over/underestimation over many samples. ⚖️
Efficiency in estimators involves selecting one with lower variability, especially in non-normal distributions. 📉
Point estimates often seem overly precise, leading to a false sense of accuracy. Estimators vary across samples. 🔍
The standard deviation as an estimator is often biased towards underestimation, corrected using Bessel’s correction. 📏
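
For reference, the two variance formulas that the lecture contrasts can be written out as below. The lecture describes them verbally, so the notation here (N for the population size, n for the sample size, x_i for a score, x-bar for the sample mean) is an assumption.

```latex
\[
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2
\qquad\text{(population variance)}
\]
\[
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
\qquad\text{(sample variance and standard deviation, with Bessel's correction)}
\]
```

Dividing by n - 1 rather than n makes the result slightly larger, which offsets the tendency to underestimate.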

Overview

In this engaging lecture, Erin Heerey dives deep into the world of point estimates in statistics. These estimates use a single statistic from a sample to infer details about a larger population, aiming to find the most representative 'point' to anchor your findings. Using the sample mean to estimate the population mean is a common example in practice.

Unbiasedness and efficiency are two pivotal concepts discussed here. An unbiased estimator avoids systematic error: over many samples, its estimates average out to the true population parameter. Efficiency involves selecting the estimator with the smaller sampling variability, particularly in datasets with non-normal distributions. This might involve choosing the median over the mean because the median is less sensitive to outliers.
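
The two properties can be stated compactly. This is a sketch in standard notation that the lecture itself does not use: theta is the population parameter, theta-hat is an estimator of it, and E and Var denote the mean and variance of the estimator's sampling distribution.

```latex
\[
\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta,
\qquad
\hat{\theta}\ \text{is unbiased when}\ E[\hat{\theta}] = \theta .
\]
\[
\text{For two unbiased estimators at the same sample size, }
\hat{\theta}_2\ \text{is more efficient than}\ \hat{\theta}_1\ \text{when}\
\operatorname{Var}(\hat{\theta}_2) < \operatorname{Var}(\hat{\theta}_1).
\]
```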

Heerey also underscores a core issue with point estimates: their deceptive precision. While these figures might appear accurate, they rarely hit the bullseye of the true population parameter because samples vary greatly, leading to a false sense of certainty. As a teaser, she assures that solutions to enhance estimate accuracy are on the horizon in the subsequent part of the series.

Chapters

00:00 - 00:30: Introduction to Point Estimates In the section titled 'Introduction to Point Estimates,' the lecture focuses on explaining the concept of point estimates. It defines a point estimate as a single statistic used to infer a population parameter based on a sample. The term 'point estimate' arises from using a single value or point to best represent the population value being estimated. An example given is estimating the population mean (mu) by calculating the sample mean.
00:30 - 01:00: Choosing the Best Estimator This chapter discusses how to choose the best estimator when relating a sample to a population. It emphasizes that the ideal estimator is the one that provides the closest estimate to the true mean or central point of the population. In Psychology, the sample mean is commonly used as an estimator of the population mean (mu), but alternatives such as the median or mode can sometimes be better choices depending on the context.
01:00 - 01:30: Key Properties of Estimators This chapter discusses key properties to consider when choosing the most accurate estimator for a population parameter. The focus is on the property known as 'unbiasedness', emphasizing the importance of not introducing bias into the estimation process.
01:30 - 02:00: Unbiasedness in Estimators The chapter discusses the importance of minimizing bias in statistical models. An estimator is deemed unbiased if, on average, it produces estimates that accurately target the population parameter. This implies that the average value of the estimator aligns with the population parameter.
02:00 - 02:30: The Role of Random Sampling The chapter discusses the importance of randomness in sampling. It highlights how random sampling aims to minimize bias in estimating population parameters. By selecting samples randomly, we ensure that there's an equal chance of overestimating or underestimating the population mean, thus providing a balanced and unbiased estimate.
02:30 - 03:00: Understanding Sampling Distribution In this chapter, we discuss the concept of sampling distribution, focusing on the behavior of an estimator in relation to the population parameter. The essence is that while an estimator may vary slightly from the true population parameter in individual samples, across many samples, it tends to average out to approximate the population parameter. This was illustrated with slides in a previous lecture that depicted the sampling distribution of the sample mean, providing a visual understanding of how sampling from a normal distribution behaves over the long run.
03:00 - 03:30: Biased vs. Unbiased Estimators The chapter 'Biased vs. Unbiased Estimators' discusses the concept of sampling distributions and how sample means converge on the population parameter. The focus is on the mean of the sampling distribution, where different sample sizes (e.g., 3, 5, 10, or 20) are considered. Observations show that on average, the mean of the sampling distribution aligns with the population parameter, albeit with some variability where it might be slightly higher or lower.
03:30 - 04:00: Standard Deviation as a Biased Estimator This chapter explains the concept of a biased estimator, particularly in the context of standard deviation. It differentiates between unbiased and biased estimators, indicating that a biased estimator tends to consistently overestimate or underestimate the parameter rather than accurately estimating it. Unlike unbiased estimators, which have an equal likelihood of overestimating or underestimating, biased estimators show a propensity to err consistently in one direction.
04:00 - 05:00: Correcting Bias in Standard Deviation The chapter titled 'Correcting Bias in Standard Deviation' discusses the concept of bias in statistical estimators. It explains that an estimator is unbiased if it is equally likely to overshoot or undershoot the population parameter. However, if the estimator tends to fall more often on one side of the true population parameter, it is considered biased. The chapter specifically highlights that the mean is an unbiased estimator as it accurately represents the population mean (mu).
05:00 - 06:00: Explanation of Bessel's Correction This chapter explains Bessel's Correction, which is used to correct the bias in the estimation of a population's standard deviation. The transcript discusses how using the standard deviation as an estimator is biased because it tends to underestimate the population standard deviation. The explanation touches on the distribution of the sample mean and how it sits in the middle with a normal distribution around it, hinting at the frequent undershooting of true variability in the population.
11:00 - 11:30: Introduction to Efficiency in Estimators This chapter first recaps unbiasedness: a biased statistic tends to either underestimate or overestimate the true population parameter, and the aim is to use unbiased estimators. It then returns to the earlier point that the sample mean is the most common estimator of the population mean (mu), noting that the median sometimes makes more sense.
11:30 - 12:00: Mean vs. Median as Estimators The chapter introduces efficiency, which concerns the sampling variability of an estimator. In a skewed or non-normal distribution, the median might be a better estimator than the mean because it is less sensitive to outliers.
12:00 - 12:30: Efficiency and Skewed Distributions The chapter uses income as an example of a skewed variable: most people earn some money, but a few very high earners substantially inflate the mean.
14:30 - 15:00: Problems with Point Estimates The chapter titled 'Problems with Point Estimates' explains that a point estimate from a single sample, such as a mean of 24.76, looks very precise and gives a false sense of the precision with which a population parameter can be estimated. Single samples almost never give exactly the true population parameter.
15:30 - 16:00: Conclusion The conclusion restates that point estimates seem very precise but can vary substantially from one sample to the next, and notes that a solution to this problem will be discussed in the next video.

Estimation3 Transcription

00:00 - 00:30 In the next part of the lecture we're going to talk about what we call "point" estimates. So this is the use of a single statistic to make an estimate about a population parameter based on a sample. These are called point estimates because they are based on a single point or a single value that best represents the population value that you're estimating. For example, we often estimate mu, or the mean of the population, by taking the sample mean.
00:30 - 01:00 And the best estimate, as we're relating a sample to a population, is the one that's going to get us the closest estimate to the true mean of that population, or the true central point. Now often, certainly in Psychology, our estimator of mu is the sample mean. But we could also pick the median or even the mode. Sometimes the median is actually a better
01:00 - 01:30 estimator for the population median than the sample mean is. So what we're going to do is look at how we choose. So there are two key properties of estimators that we need to consider if we're thinking about choosing the best, most accurate estimator of a population parameter. The first property that we need to consider is a property that I'm calling here 'unbiasedness'. An estimator is unbiased..., so remember we don't want to introduce bias into
01:30 - 02:00 our statistics or we want to do that as little as we can (there's always some that creeps in) but we want to minimize that level of bias in our statistical models. So an estimator is unbiased if on average it produces estimates that target the population parameter. That means that the average value of the estimator is equal to the population parameter.
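
The idea that the average value of the estimator equals the population parameter can be checked with a small simulation. This is a minimal sketch, not the lab exercise from the course: the normal population, its parameters (MU, SIGMA), the number of repetitions, and the helper name average_of_sample_means are all assumptions made for illustration.

```python
import random
import statistics


def average_of_sample_means(mu, sigma, n, n_samples, rng):
    """Draw n_samples random samples of size n and average their sample means."""
    sample_means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))
    return statistics.fmean(sample_means)


# Hypothetical population parameters, chosen only for illustration.
MU, SIGMA = 100.0, 15.0
rng = random.Random(0)  # fixed seed so the run is repeatable

# Sample sizes echo the ones mentioned in the lecture (3, 5, 10, 20).
averages = {
    n: average_of_sample_means(MU, SIGMA, n, 20_000, rng)
    for n in (3, 5, 10, 20)
}
for n, avg in averages.items():
    print(f"n = {n:2d}: average of sample means = {avg:.2f} (mu = {MU})")
```

Individual sample means land above or below mu, but for every sample size the long-run average should sit very close to mu.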
02:00 - 02:30 Now we also talked about randomness in sampling. If I randomly sample you, who might be extroverted, and I don't randomly sample your friend, who's introverted, that might change the value of my sample mean or my estimator for a population parameter. But the idea here is that because we're selecting randomly (we want to pick the most unbiased estimate), our random sample should be as likely to overestimate the population mean as it is to underestimate it. So
02:30 - 03:00 the estimator is sometimes a little bit greater than the population parameter and sometimes a little bit lower than that population parameter, but over a lot of samples, over the long haul, on average it works out to be similar. And we saw that a little bit when we talked about the sampling distribution of the sample mean. So if you remember back to the slides from that prior lecture, I showed you a picture where we were sampling from a normal
03:00 - 03:30 distribution and then there were sample sizes of 3, 5, 10, I think somewhere like 20 maybe, and what you noticed was that on average the mean of the sampling distribution, when it was plotting means not individual participants, but when it was plotting means, on average the mean of the sampling distribution of sample means really did converge on the population parameter. Sometimes it was a bit higher sometimes it was a bit
03:30 - 04:00 lower just due to pure random differences in sampling. But that is an unbiased estimator. A biased estimator is more likely to either overestimate the parameter or to underestimate it than to get it right, so it's more likely to fall on one side of this equation; it's not equally likely to overestimate as to underestimate it. So the mean is equally likely to overestimate as to
04:00 - 04:30 underestimate the population parameter because it's equally likely to fall on either side of it, and so it is unbiased. If, however, your population estimator is more likely to land on one side than the other of the true population parameter, then it is biased and there is a systematic difference between the estimator and the parameter. So the mean, as I said, is an unbiased estimator because it overshoots the population mean mu as
04:30 - 05:00 often as it undershoots it, right? That's why we get that mean [of the DOSM] falling right in the middle with this nice symmetric normal distribution around it. Now interestingly, the standard deviation is what we call a 'biased' estimator because it is more likely to underestimate sigma the population standard deviation than it is to overestimate sigma. That doesn't mean you'll never overestimate sigma. It just means that on average you're
05:00 - 05:30 more likely to underestimate than to overestimate it and in fact, in last week's lab, you built a sampling distribution of sample standard deviations and if I were betting, my bet is that it was slightly less than the population standard deviation [in the Monte Carlo simulation], and that's because the standard deviation is on average a biased estimator. Now we do our best to correct that bias when we calculate the
05:30 - 06:00 standard deviation of a sample; and we do that by using a correction called Bessel's correction. You also saw this in the lab. We used that in the denominator of the standard deviation formula. So if you think back all the way to the very first lecture, the very first material that you encountered in the class when we were talking about descriptive statistics, I showed you a formula for the variance and then subsequently the standard deviation, and what you'll notice was that I showed you a formula for sigma and I showed you a formula for the standard deviation
06:00 - 06:30 and both of those, they had a slightly different denominator, right? The formula for Sigma or Sigma squared really, because we were talking about the variance, the formula for Sigma squared used the number of individuals within the population as its denominator and the formula for the sample variance, that one included n minus 1 in its denominator, and I said I would explain
06:30 - 07:00 that in a future lecture. We're there now! So because the standard deviation is more likely to underestimate sigma than to overestimate it, we correct that formula by using n minus 1 in the denominator. This is not degrees of freedom! It is a correction for the fact that we undershoot more often than we overshoot the true population parameter. Why do we do that?
07:00 - 07:30 Well, again I want you to think back to the sampling distributions lecture, and think about how the scores are distributed. So we know if you look at the picture of the normal curve that you've already seen a bunch of times in this class, we know that about 68% of individuals, 68% of scores, fall within one standard deviation either above or below the mean. So within one standard deviation of the mean it is 68% of the sample; within two standard deviations
07:30 - 08:00 of the mean we're now getting into the 96-ish percent domain and it goes out from there. So there are fewer of the more extreme scores in our sample, so we are less likely to select them. And that doesn't mean you won't, right? You sometimes will. But if you happen to get a 'normal' pick, what you'll end up with is scores that are much closer to the true population mean
08:00 - 08:30 and that means you are likely to underestimate the population standard deviation because you're really not sampling those outliers with the same probability that you're sampling the scores that are closer to the mean. So the farther away the score is from the mean the fewer of those scores exist in your population and the less likely you are to come up with one by pure random chance.
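
The tendency to underestimate sigma can be seen in a short simulation, in the spirit of the lab's sampling distribution of sample standard deviations. This is a sketch under assumptions: a normal population, a hypothetical sigma of 10, samples of size 5, and a helper name (simulate_sample_sds) that is not from the course. It compares the standard deviation computed with n in the denominator against the version with n minus 1.

```python
import random
import statistics


def simulate_sample_sds(sigma, n, n_samples, rng):
    """Return (mean SD using n, mean SD using n - 1, share of n - 1 SDs below sigma)."""
    sds_n, sds_n_minus_1 = [], []
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        sds_n.append(statistics.pstdev(sample))         # denominator n
        sds_n_minus_1.append(statistics.stdev(sample))  # denominator n - 1
    share_below = sum(sd < sigma for sd in sds_n_minus_1) / n_samples
    return statistics.fmean(sds_n), statistics.fmean(sds_n_minus_1), share_below


SIGMA, N = 10.0, 5  # hypothetical population SD and sample size
rng = random.Random(0)  # fixed seed so the run is repeatable
mean_sd_n, mean_sd_bessel, share_below = simulate_sample_sds(SIGMA, N, 10_000, rng)

print(f"average SD dividing by n:     {mean_sd_n:.2f}")
print(f"average SD dividing by n - 1: {mean_sd_bessel:.2f}")
print(f"population sigma:             {SIGMA:.2f}")
print(f"share of corrected SDs below sigma: {share_below:.2f}")
```

Under these assumptions, dividing by n minus 1 moves the average closer to sigma than dividing by n does, yet the corrected standard deviation still falls below sigma in more than half of the samples, which is consistent with the lecture's remark that the correction is not perfect.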
08:30 - 09:00 So go back and think about your M&Ms, the pretend M&Ms that we made in lab, when you were sampling. If you were to think about selecting one of the rare ones in your sample... So you had a sample of 25. What if you were to select something that only came up once and if you had one of those in your sample, its probability of being selected would have been much lower than the probability of an M&M that occurred more frequently. So the same thing is happening here, because there are some scores where there are just more of them in the population.
09:00 - 09:30 You're more likely to sample those scores and as a result these outliers, these scores that are farther from the mean, those are less likely to come up [because there are fewer of them in the population], and that means that the standard deviation is likely to be smaller than the true standard deviation in the population. It's more likely to underestimate that quantity. So we have to correct it in the
09:30 - 10:00 formula. It's not a perfect correction, by the way. When we take our sample standard deviation or our sample variance formula, we calculate the sum of the squared deviations and then we divide by n minus one. It's not a perfect correction but it's closer, right? It's closer to what it should be than we would get if on average we used n in our
10:00 - 10:30 denominator. And what you can do, if you're not sure why we're using n minus 1, is take a number that you can pretend is the sum of squared deviations, divide it by a number n, and then divide it by n minus 1 instead, and see what happens to it. The outcome of your division operation, with the numerator divided by the denominator, should be just a little bit bigger when we decrease the denominator, right? If you think about taking half a pizza or how about a third of a pizza? If you take one over three
10:30 - 11:00 [a third], and then instead you take one over two, where two is n minus one, you get a little more pizza, right? So the same thing is happening here. We're reducing the size of the denominator and that increases the outcome of that calculation. So that's the idea of unbiasedness. If the estimator is unbiased then it produces estimates of a target population parameter that are, on average,
11:00 - 11:30 equal to that population parameter. If a statistic is biased, then it tends to either underestimate or overestimate the true population parameter and we're aiming to have these unbiased estimators of the population. Now I said, a couple slides ago, that the mean was our most common estimator for mu, the population mean. And that's sometimes true but sometimes the median makes more sense. Here's
11:30 - 12:00 why, so we have another element that we need to think about when we're thinking about how to take a sample and estimate a population, and that's an element we call "efficiency". This is about the sampling variability of an estimator. Now let's pretend you have a skewed distribution or a non-normal data distribution. In that case the median might be a better estimator than the mean because it's less sensitive to outliers. So think
12:00 - 12:30 about a population or a variable like income, right? We've used this one as an example before. And if you think about income, most of us earn some money but there are a few people out there who earn a lot of money - the Elon Musks of the world. There are some people out there whose scores really genuinely and quite substantially inflate the estimator when we calculate the mean.
12:30 - 13:00 However, when we calculate the median, those inflated scores are higher than the median but the degree to which they're inflated is not as significant - it doesn't cause a significant problem when we're calculating the median. And so in distributions like that, we might think about the median as a more efficient estimator because its sampling distribution may actually
13:00 - 13:30 have a lower standard deviation, and may be less sensitive to outliers. So in a highly skewed, non-normal data distribution one of the things you can think about is that if you calculate the sampling distribution of a median (rather than the mean) by taking random samples from a population that's highly skewed, if we have two competing estimators that are both unbiased, and the mean and the median are
13:30 - 14:00 both unbiased, the one with the smaller variance for a given sample size, we call that one 'more efficient'. So if we have two estimators, and their variation looks like this versus like this, the one with the smaller variance here, this green one, Estimator 2, will be more efficient and its sampling distribution will be more tightly clustered around the true population parameter.
14:00 - 14:30 So we don't always use the sample mean to estimate the population mean. Sometimes we use the sample median, especially when we have a lot of outliers or when our distribution is highly skewed, because in that case the median will be more efficient: it will estimate with greater precision because its sampling distribution has a lower standard deviation. So that's the idea with efficiency.
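
The efficiency comparison can be sketched by simulating the sampling distributions of the mean and the median for a skewed population. Everything specific here is an assumption for illustration: the log-normal "income" population and its parameters, the sample size of 25, and the helper names sampling_sd and draw_income. Note that in a skewed population the two sampling distributions centre on different values (the population mean and the population median); the sketch compares only how tightly each one clusters.

```python
import random
import statistics


def sampling_sd(estimator, draw, n, n_samples):
    """Standard deviation of an estimator's sampling distribution, by simulation."""
    estimates = [estimator([draw() for _ in range(n)]) for _ in range(n_samples)]
    return statistics.stdev(estimates)


rng = random.Random(0)  # fixed seed so the run is repeatable


def draw_income():
    # Hypothetical right-skewed "income" population (log-normal); parameters are illustrative.
    return rng.lognormvariate(10.5, 1.0)


N = 25  # sample size
sd_of_means = sampling_sd(statistics.fmean, draw_income, N, 5_000)
sd_of_medians = sampling_sd(statistics.median, draw_income, N, 5_000)

print(f"SD of the sampling distribution of the mean:   {sd_of_means:,.0f}")
print(f"SD of the sampling distribution of the median: {sd_of_medians:,.0f}")
```

With a population this skewed, the sample medians should vary less from sample to sample than the sample means, which is what the lecture means by the median being the more efficient choice here.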
14:30 - 15:00 Now there's a little problem with point estimates. We make point estimates from single samples and we assume these things are very precise. I calculate a mean, maybe it's 24.76, and that seems like a really precise number and it gives us a false sense of the precision with which we can estimate a true population parameter. But as we know and as we've seen, single samples almost never give exactly the true population parameter. It's easy to miss this. It's like taking your fishing rod and casting it in and seeing if
15:00 - 15:30 you get any hits. And random sampling we know leads to variability in the sample makeup and therefore the estimator varies from the quantity it should estimate. We almost never get an exact estimation of our population mean, for example if that's what we're using. We almost never get an estimate of the population mean that is exactly the same as the true population mean, which by the way we never actually know what that is.
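
How much a point estimate moves from one sample to the next can be shown by drawing a handful of samples from the same population and printing each sample mean. This is a minimal sketch with assumed values: a normal population with mu of 25 and sigma of 4, samples of size 30, and eight repetitions.

```python
import random
import statistics

# Hypothetical population: normal with mu = 25 and sigma = 4 (values chosen for illustration).
MU, SIGMA, N = 25.0, 4.0, 30
rng = random.Random(1)  # fixed seed so the run is repeatable

point_estimates = []
for _ in range(8):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    # Each sample yields one point estimate, reported to two decimals like 24.76.
    point_estimates.append(round(statistics.fmean(sample), 2))

print("sample means:", point_estimates)
print("range across samples:", round(max(point_estimates) - min(point_estimates), 2))
```

Each printed mean looks precise to two decimal places, yet the values differ from sample to sample even though the population never changed.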
15:30 - 16:00 So point estimates they seem very precise, but in reality they can vary substantially from one sample to the next and that's the problem with point estimates. There's a solution to that problem. We're going to talk about that in the next video.