Estimation1

    Summary

    In this engaging lecture, Erin Heerey discusses the concepts of estimation and precision, emphasizing their importance when using a sample to understand population parameters. Through the use of simple analogies and stories, Heerey clarifies how these methodologies help in making accurate inferences about a larger population, while also pointing out the potential issues like bias and sampling errors. The lecture explains the balance needed between accuracy and precision and delves into the challenges of obtaining truly random samples, drawing vivid comparisons to childhood experiences and hypothetical experiments.

      Highlights

      • Estimation aids in understanding population parameters by examining sample data. 📊
      • Accuracy relates to the truthfulness of a measurement, while precision is its detail. 🎯
      • Overcomplicating measurements can introduce noise, reducing accuracy. 📉
      • Sampling, when not random, may lead to biased results. 🤔
      • True randomness in sampling is challenging but essential for reliable conclusions. 🚀

      Key Takeaways

      • Estimation is used to infer population parameters using a sample, acknowledging that we can only make approximations. 📏
      • Accuracy and precision are critical in estimation, but striving for one can sometimes affect the other. ⚖️
      • True random sampling reduces bias but is often difficult to achieve in real-world scenarios. 🎯
      • Sampling errors occur naturally and can affect the reliability of your conclusions. 🔍
      • Exciting childhood stories can effectively illustrate complex statistical concepts. 🤓

      Overview

      In this lecture, Erin Heerey takes us on an insightful journey into the world of estimation and precision. Using the process of sampling, she explains how we approximate various population parameters from samples, even though true values remain unknown. With relatable examples, Heerey illustrates the significance of these methodologies in understanding human behavior and beyond.

        The lecture delves deep into the concepts of accuracy and precision in measurements. While accuracy pertains to the validity of a measurement, precision focuses on its reliability and detail. Heerey warns about the trade-off between these two, as sometimes increasing precision may introduce noise, thereby reducing accuracy. This section is brought to life with amusing anecdotes, making the complex subject easily digestible.
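
The trade-off described above can be sketched as a small simulation. This is an illustration under stated assumptions, not something shown in the lecture: each simulated person has a hidden trait score, "core" questionnaire items reflect that trait plus random noise, and "peripheral" items are unrelated to the trait. The function names, item counts, and noise levels are all hypothetical.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation computed by hand, so no external packages are needed."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def questionnaire_validity(n_core, n_peripheral, n_people=2000, seed=1):
    """Correlation between the hidden trait and the average questionnaire score."""
    rng = random.Random(seed)
    traits, scores = [], []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)  # hidden "true" extraversion (assumed scale)
        # Core items track the trait, each with its own measurement noise.
        core = [trait + rng.gauss(0, 1) for _ in range(n_core)]
        # Peripheral items are pure noise: they do not depend on the trait.
        peripheral = [rng.gauss(0, 1) for _ in range(n_peripheral)]
        traits.append(trait)
        scores.append(statistics.fmean(core + peripheral))
    return pearson(traits, scores)

focused = questionnaire_validity(n_core=5, n_peripheral=0)
bloated = questionnaire_validity(n_core=5, n_peripheral=20)
print(f"focused questionnaire: r = {focused:.2f}")
print(f"bloated questionnaire: r = {bloated:.2f}")
```

Under these assumptions, the questionnaire with 20 extra off-topic items tracks the hidden trait less closely than the 5-item version, even though it collects more detail per person.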

          Further, Erin tackles the practical challenges of obtaining a truly random sample from a population to avoid bias. Through entertaining stories from her own childhood experiences, she not only highlights the concept of sampling errors but also presents the importance of reducing bias in conducting any research. The lecture sets the stage for a deeper understanding of the implications of sampling and how it affects predictions and theories in real-world applications.
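
The bias problem described above can be sketched with a toy population. This is a hedged illustration rather than data from the lecture: the regional mean weights (620 g and 560 g), the spread, the sample size, and every variable name are invented assumptions.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of squirrel weights in grams, split by region.
# All numbers here are made up for illustration.
north = [rng.gauss(620, 50) for _ in range(5000)]
south = [rng.gauss(560, 50) for _ in range(5000)]
population = north + south

# Population parameter. In real research this value is unknown.
mu = statistics.fmean(population)

# Truly random sample: every squirrel in the population is equally likely.
random_sample = rng.sample(population, 400)

# Convenience sample: only the squirrels that are easy to reach (the south).
convenience_sample = rng.sample(south, 400)

random_error = statistics.fmean(random_sample) - mu
convenience_error = statistics.fmean(convenience_sample) - mu

print(f"population mean:          {mu:.1f} g")
print(f"random sample error:      {random_error:+.1f} g")
print(f"convenience sample error: {convenience_error:+.1f} g")
```

In this sketch the convenience sample lands well below the population mean because it never includes northern squirrels, while the random sample misses only by chance. Nothing inside the convenience sample itself reveals the shift, which is the point made in the lecture.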

            Chapters

            • 00:00 - 00:30: Introduction to Estimation and Precision This chapter introduces the concepts of estimation and precision in statistics. It explains how taking a sample from a population allows us to make estimates about various population parameters. The lecture will delve into the processes of estimation and the importance of precision in both sampling and estimation, establishing a foundational understanding for further topics.
            • 00:30 - 01:00: Understanding Population Parameters The chapter "Understanding Population Parameters" discusses the challenge of determining true population parameters. It highlights the limitations of approximating these values through samples, as the true value is often unknown. The chapter explains the methodology used in sampling to make informed guesses about population parameters, emphasizing the importance of this process in understanding human behavior and other phenomena through experimental design and data examination.
            • 01:00 - 01:30: Accuracy and Precision This chapter explores the concepts of accuracy and precision in data analysis. It begins by noting that examining data lets us estimate how well theories apply to a population, and estimate characteristics of that population itself. Accuracy and precision sit at the heart of this process, and the speaker introduces them with a cartoon illustration.
            • 01:30 - 02:00: Trade-off Between Accuracy and Precision This chapter discusses the concepts of accuracy and precision in measurements. Accuracy is defined as the degree to which a measurement is true or valid, while precision refers to the detail and reliability of the measurement. These two concepts are often closely related but can also present a trade-off situation. As estimates become more precise, they may simultaneously become less accurate due to potential introduction of noise from over-complicating the precision of measurements.
            • 02:00 - 02:30: Precision and Questionnaire Measures This chapter discusses the concept of 'Precision and Questionnaire Measures' in the context of psychological assessments. It highlights the potential pitfalls of making questionnaire measures too detailed, suggesting that over-detailed measures might capture irrelevant data, or 'noise', rather than accurately reflecting the intended psychological trait, such as extraversion. The chapter prompts consideration of whether the elements being measured truly represent the processes or traits of interest.
            • 02:30 - 03:00: Review of Parameters vs. Statistics The chapter continues the questionnaire example ahead of the review of parameters and statistics, using extroverts with varying preferences for social activities like parties and clubs to show how complex a construct can be. Including numerous questions that are not central to the main idea complicates the measurement of the key concept. More detailed measurements can be useful, but they may lose focus and fail to capture the construct they were meant to measure.
            • 03:00 - 04:00: Generalizing from Sample to Population The chapter titled 'Generalizing from Sample to Population' discusses the concept of improving the accuracy of estimates by selecting the right level of precision. It explains how this can reduce unexplained variance or noise, thus making outcomes more meaningful. The chapter includes a review on the difference between parameters and statistics, highlighting that population 'parameters' are theoretical constructs applicable to all possible individuals.
            • 04:00 - 05:00: Sampling Error and Random Sampling This chapter delves into key statistical concepts such as sampling error and the methodology of random sampling. It explains that within a population, the mean (mu) and standard deviation (sigma) are vital parameters we attempt to estimate through samples. A sample, described as a subset of the population, includes individuals who participate in observations—such as those attending lab sessions.
            • 05:00 - 06:00: Challenges in Achieving True Random Samples In this chapter, the focus is on the challenges associated with obtaining true random samples for data analysis. The discussion begins with the measurement of data and how it can be used to calculate the mean and standard deviation of a sample. These statistics are then used to make estimates about the larger population. The reliability of these estimates heavily depends on the manner in which the sampling is conducted. The chapter aims to delve into the intricacies of sampling methods and their implications on the accuracy of population estimates.
            • 06:00 - 07:00: Sampling Error and Unbiased Error The chapter discusses sampling error and unbiased error, using a personal anecdote as an example. The speaker shares a childhood story about their grandmother in northern Wisconsin, where the climate resembles that of Northern Ontario and signs in October hinted at how hard the winter would be, to set up how one might generalize findings from a sample to a broader population.
            • 07:00 - 09:00: Conclusion and Introduction to Sampling Bias The chapter begins with a personal anecdote about the narrator's family traditions during winter, particularly involving clearing snow from the roof of a family member's house. The story highlights themes of family bonding and childhood memories, as shoveling snow becomes both a chore and a playful activity, with the kids using the snow piles for sledding and jumping off the roof.

            Estimation1 Transcription

            • 00:00 - 00:30 In this lecture, we're going to be talking about estimation and precision. When we take a sample and use it as a model of the population, what we're doing is making an estimate about a variety of population parameters. We're going to talk about that estimation process, and the other thing that we're going to talk about is precision, in terms of both our sampling and our estimation. Estimation is the process of finding an approximate value for a population parameter,
            • 00:30 - 01:00 but this is approximate because we never know what the true value of the population parameter is. All we can do is take our sample, complete our methodology in the very best way that we can, and use that to make a guess about that population parameter. We do this because we want to learn about how people, and perhaps other animals, work. So we devise experiments, we sample from populations, and we examine the associated data,
            • 01:00 - 01:30 and on the basis of those examinations of the data we can make estimates about both how theories work or apply in our populations, and we can make estimates of that population itself. At the heart of this endeavor is an element of accuracy and an element of precision. This is a silly little cartoon that illustrates these concepts. Accuracy is really about the
            • 01:30 - 02:00 degree to which our measurement is true or valid, and precision is about its detail, its reliability. Often these ideas are tightly intertwined and often there's a trade-off between them, so as estimates become more precise they can also in some ways become less accurate. For example, we can introduce noise by overcomplicating the precision of our measurement,
            • 02:00 - 02:30 or by perhaps measuring things that may be elements of a process that is not actually the true process. So we can think about... sometimes we think about questionnaires. So if I have a questionnaire measure of extraversion, my measure of extraversion, if I make it too detailed, might actually end up sampling noise rather than the true concept of someone's extraversion. For
            • 02:30 - 03:00 example, there are extroverts out there who really enjoy going to parties and other extroverts who really don't like going to parties; or some who like going to clubs and some who really don't like going to clubs. So if I include a lot of questions that overcomplicate the construct, that are not central to the main idea, that's going to give me that trade-off. I might be able to measure some ideas that I can measure more precisely, but that measure might also become
            • 03:00 - 03:30 less accurate. So by selecting the right level of precision you can improve your estimate of an outcome by reducing unexplained variance or noise, and making that outcome more meaningful. So let's have a quick review. What's the difference between parameters and statistics? Population 'parameters' are purely theoretical constructs. They apply to all possible individuals
            • 03:30 - 04:00 within a population, so the mu or mean and sigma, standard deviation; those two constructs are population parameters. We are trying to estimate mu and sigma, for example, amongst other things, from the sample that we take. A sample is a subset of that population of individuals. It's what we observe - so it's the people who actually come into the lab and have
            • 04:00 - 04:30 their data measured. And on the basis of that, we can measure the mean and standard deviation of our sample and we use those measurements to make an estimate of our population. But how good is that estimate? It's going to depend on how we sample. We're going to talk about that today. ... So when we use our sample as a model of a population, how can we actually
            • 04:30 - 05:00 generalize from that sample to the population? I'm going to give you a bit of a silly example. This is an example from my childhood. When I was little my grandmother lived in northern Wisconsin, which is a climate very much like Northern Ontario. For those of you who are from there or who have been there in the winter, it's cold and snowy and you can kind of tell in October when it's going to be a cold winter. Now because my grandmother lived up there and she lived all
            • 05:00 - 05:30 by herself, my family would go up. In fact, usually it was myself and my family and then my cousins, and we would all go up and we'd have to shovel off her roof every four to six weeks, depending on the snow level. And if it was going to be a really good winter, we'd've got a lot of snow, then my auntie would let us take our sleds up on the roof and then we would have this great big snow pile and you could do a jump from the edge of the roof onto the snow pile and
            • 05:30 - 06:00 hopefully nobody got a concussion (that only happened once to my brother). I hope your mother never let you do that. But it's really fun, so we always hoped for a really hard winter. And one of the things we all noticed was that around about late October, if the squirrels up there were really fat we knew it was going to be a good winter - we knew we were going to have some great sledding off the roof. So, if a sample is representative of a population, it allows us
            • 06:00 - 06:30 to make a really reasonable estimate of the population parameters. We could ask this question: are we going to have a hard winter? And one way we could do that is we could measure squirrel weights in late October, when they're sort of bulked up - they are storing up fat before a difficult winter. They have a fine balance, right? Because you have to be lithe enough to escape from your predators, while having enough extra fat to keep you warm over the course of a really hard
            • 06:30 - 07:00 winter. And winters up there can be ... as I said, they're long, they're snowy and they're really cold. So it's not at all uncommon to see minus 40 and minus 50 temperatures for weeks on end. So the squirrel weights, if they were higher than average, might predict a hard winter. So let's say I wanted to ask that question and I took a random sample of squirrels or a convenience sample of squirrels. I could take an ad hoc sample of squirrels that I might be able
            • 07:00 - 07:30 to sample easily and cheaply. So my population might be the population of squirrels in Ontario, and I could take a sample of squirrels from the UWO campus. I mean they're a dime a dozen, they follow you around, they might even eat nuts out of your hand. I also have plenty of squirrels in my backyard. Now the problem with that idea is that that sample might be biased. The squirrels in Southern Ontario live in a really different climate than the squirrels in Northern Ontario. I don't think they get as fat down here as they did when I was a kid
            • 07:30 - 08:00 in northern Wisconsin. So there might be a difference between these squirrels but there might not, you never know. The problem with the sample is that the true population might be seen in these red dots here that are located in a rather different location. There it seems like their mean has shifted relative to the blue dots [of my sample], and if that's the case, then I probably have a biased sample. Now I would never know that because this squirrel-sample is
            • 08:00 - 08:30 a sample of the squirrels I could sample. I'm going to use that sample to make an estimate of the population. So I would never know that they were biased, but my conclusions might be wrong. So the better choice is to take a truly random sample where all of the squirrels all over Ontario are equally likely to be selected. I just sort of throw a dart at a map and any place that it lands that is not in a body of water, there's a lot of that in Ontario, anywhere that that dart would land, would be where I would put a squirrel trap. I would trap my
            • 08:30 - 09:00 squirrel, I would weigh it, and then I would let it go. And that would be a truly random sample. But that would be really difficult to achieve. I mean Ontario is enormous and if you've ever tried to drive from one end to the other it's a really long drive. It will take you several days. So what we'd ideally do is take our random sample of squirrels. However, we often don't have the luxury of taking the truly random sample and that leads
            • 09:00 - 09:30 us to sampling error. So the difference between a population parameter and a sample statistic is called sampling error. Now we know that if we take a sample, the sample that we take, and we calculate its mean, the mean of the sample that we take will not ever be exactly the same as the mean of the population. Or at least probably not. Most of the time it will be slightly different.
            • 09:30 - 10:00 And the difference that occurs, if we take true random samples, that difference occurs simply because of the specific population members that were randomly selected for the sample. If I randomly select you instead of your friend, you're going to give me different values on my measure. If my measure is extraversion, you might be an extrovert and have an introverted friend or vice versa, and because I've randomly sampled you and not your friend, that's going to change the
            • 10:00 - 10:30 mean of the value that I'm measuring. Now if the sample is truly random, this error is unbiased. And that means that there's no specific participant characteristic that alters the likelihood that a particular person is sampled instead of another, and that the error only occurs due to chance differences, chance variation in who was and was not sampled. Now let me ask you this: how often do you think that our samples are truly random?
            • 10:30 - 11:00 Probably not as often as we would like that to happen. We're going to talk about what we call 'sampling bias' in the next part of the lecture.
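
The idea that sampling error is unbiased under truly random sampling can be checked with a short simulation. This is a sketch under assumptions that are not in the lecture: a hypothetical population of 20,000 extraversion scores with a mean near 50 and a standard deviation near 10, and repeated random samples of 25 people. The names `sampling_error`, `errors`, and `spread` are illustrative only.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of extraversion scores (assumed scale).
population = [rng.gauss(50, 10) for _ in range(20000)]

# Population parameters: mu and sigma. Normally these are unknown.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def sampling_error(n):
    """Sample statistic minus population parameter for one random sample."""
    sample = rng.sample(population, n)  # each person equally likely to be chosen
    return statistics.fmean(sample) - mu

# Repeat the study many times to see how the error behaves.
errors = [sampling_error(25) for _ in range(2000)]
mean_error = statistics.fmean(errors)
spread = statistics.stdev(errors)

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"average sampling error: {mean_error:+.3f}")
print(f"typical size of error:  {spread:.2f}")
```

Any single sample mean differs from mu, but across many random samples the errors average out close to zero, which is what "unbiased" means here. The typical size of the error in this setup is roughly sigma divided by the square root of the sample size.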