SingleSample3

Summary

Erin Heerey explains where critical values in statistics come from. The discussion begins with Z values and their relationship to standard deviation boundaries. It covers non-directional and directional Z tests, highlighting the difference between two-tailed and one-tailed tests. Heerey then moves on to one-sample t-tests, contrasting them with Z tests and emphasizing the importance of sample size and degrees of freedom. The lecture uses examples such as passenger vehicle emission standards and IQ tests to illustrate the concepts. Heerey concludes by noting that proportions can be tested in a similar way and by stressing the requirements of each test, including the shape of the sample distribution and whether the population standard deviation is known.

Highlights

Critical values are derived from the proportion of scores within standard deviation boundaries. 🌟
Z values illustrate how far a sample mean deviates from the population mean. ▶️
Directional tests look at only one side of the distribution, where the most extreme 5% of scores fall in that tail. 📉
The one-sample t-test evaluates sample means against a benchmark rather than a whole distribution. 🧪
T-distribution curves are influenced by degrees of freedom, altering their similarity to normal distributions. 🔀

Key Takeaways

Critical values are key to understanding Z scores based on standard deviation boundaries in a normal distribution. ✨
Non-directional Z tests look at both tails of distribution, whereas directional tests focus on one. 🎯
One-sample t-tests compare a sample mean to a specific value, differing from Z tests in flexibility and use of sample data. 🚗
Degrees of freedom shape the t-distribution, which approaches normality as the degrees of freedom increase. 🔍
Sample size and distribution shape determine the choice between Z and t-tests. Make sure to check the normality! 📏

Overview

Erin Heerey takes us on an enlightening journey through the world of statistical critical values and tests. Starting with the foundational understanding of Z values, Heerey highlights their connection to standard deviation boundaries and explains how they're used in both two-tailed (non-directional) and one-tailed (directional) tests. The nuances between these tests, especially regarding what percentage of scores falls above or below certain Z values, are key insights.

Diving into the realm of one-sample t-tests, Heerey points out their similarity to Z tests but shifts focus from a population to a benchmark comparison. This part of the lecture explains the computation of the t-statistic, accounting for the sample mean and a hypothesized parameter, and discusses the importance of degrees of freedom in shaping the t-distribution, eventually leading these distributions toward normality.

Finally, Heerey brings attention to the practical implications of selecting appropriate statistical tests. With examples like vehicle emission standards and the amusing scenario of distinguishing children's lies, emphasis is placed on the necessity of knowing whether population standard deviation is known, understanding sample size and distribution shape, and the applicability of comparing proportions similarly. This thorough overview ensures readiness in approaching both single-sample Z and t-tests in various scenarios.

Chapters

00:00 - 01:00: Introduction to critical values and Z-scores In this chapter, the concept of critical values and their origin is introduced. The discussion revolves around the critical values of Z-scores, explaining that they are determined based on the proportions of scores falling within various standard deviation boundaries of a normal distribution. The normal distribution, familiar to readers, is highlighted with its mean of zero, setting the stage for further discussions on single sample tests.
01:00 - 02:00: Critical values for non-directional and directional tests The chapter discusses the interpretation of z-scores in the context of both non-directional and directional tests. It explains that a z-score of zero indicates no deviation from the mean of the population or sample, while a z-score of plus one indicates one standard deviation above the sample mean, and a z-score of minus two indicates two standard deviations below the sample mean. These explanations are linked to the concept of z-tests or Z values.
02:00 - 03:00: Introduction to one-sample t-test This chapter focuses on non-directional (two-tailed) Z tests, which involve both an upper and a lower boundary in the distribution. For a two-tailed test, about 2.5% of the scores fall above the upper boundary and about 2.5% fall below the lower one; these boundaries are set at 1.96 standard deviations from the mean, slightly under two standard deviations.
03:00 - 04:00: Differences between t-tests and Z-tests The chapter discusses the differences between t-tests and Z-tests. It introduces the concept of a 95% confidence level with an alpha of 0.05 for hypothesis testing. It highlights that in a one-tailed test, only one boundary (upper or lower) is considered, focusing on the extreme direction where 5% of the distribution falls.
04:00 - 06:00: Understanding T statistics and degrees of freedom The chapter discusses the boundary values for a z-test, specifically focusing on a one-tailed or directional test. It explains that the critical values are either -1.645 or +1.645. These values represent the number of standard deviations a score must be from the mean for only 5% of the distribution to be more extreme than the observed score.
06:00 - 08:00: How to compute a one-sample t-test This chapter explains the one-sample t-test, a statistical hypothesis test used to determine whether an unknown population mean is different from a specific value. It is very similar to the one-sample Z test but differs in a few ways. The illustrative example is passenger vehicle emission standards for cars produced from 2009 to 2012.
08:00 - 10:00: Assumptions and randomization in t-test The chapter discusses the process of evaluating nitrous oxide emissions from passenger cars, particularly whether these cars meet the standard of no more than 0.07 grams per mile. It explains how to take a sample of passenger cars produced within a certain range of years and test them for their nitrous oxide emissions. The main focus is on determining whether the mean emissions of the sample are similar to or different from the standard, rather than comparing the sample directly to a population.
10:00 - 13:00: Choosing between Z-test and t-test The chapter discusses the decision-making process between using a Z-test or a t-test in statistical analysis. It compares a scenario where population parameters (mean and standard deviation) are known, as in a Z-test, using the example of IQ scores, with a scenario where only a benchmark is available, as with Canadian emission standards for nitrous oxide emissions. To determine if cars produced during a specific time meet the benchmark of 0.07 grams per mile, a one-sample t-test is used, emphasizing when this test is appropriate.
13:00 - 16:00: Comparing proportions and concluding single sample tests The chapter begins by introducing the t-test and noting its interesting nomenclature. Although the detailed discussion of two-sample t-tests is reserved for later, the chapter emphasizes the basics of the t statistic for single-sample tests. The t statistic is described as a type of difference score, paralleling the earlier Z test, where a difference was also calculated.

SingleSample3 Transcription

00:00 - 00:30 So, I said in the last section that I would tell you a little bit about where the critical values come from, so let's talk about that before we launch into our next single-sample test. Critical values of Z are based on the proportion of scores that fall within different standard deviation boundaries. This is our normal distribution; you've seen this one many times. What you know so far is that Z scores, which are located down here in this row, have a mean of zero and a standard
00:30 - 01:00 deviation of one. So a Z score of zero is a score that doesn't differ at all from the mean of its population or the mean of its sample. If a Z score is plus one, then you know it is one standard deviation above its sample mean; when it's minus two, it's two standard deviations below its sample mean. So that's what we're thinking about when we think about Z tests, or Z values anyway.
01:00 - 01:30 Now, when we're thinking about our critical values for a non-directional test, that's a two-tailed test. When we do a two-tailed test, we have both upper and lower boundaries, so what we're looking for is the boundary where about 2.5 percent of the scores fall above it and the boundary where 2.5 percent of the scores fall below it. That boundary happens to be drawn at 1.96 standard deviations from the mean: just under two standard deviations
01:30 - 02:00 is where we have our cutoff for alpha equals 0.05, or 95 percent confidence. For a directional test, we have an upper or a lower boundary, so there's only one side that we're looking at. So for a one-tailed test, we are going to look for the boundary at which five percent of the distribution falls more extreme in the correct direction.
02:00 - 02:30 So the boundary for a Z test that's one-tailed, or directional, will be either minus 1.645 or plus 1.645, because that's the boundary, in units of standard deviation, that a score has to fall above or below its mean to have five percent of the distribution, and only five percent of the distribution, be more extreme than it.
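
The cut-offs quoted above can be recovered from the standard normal distribution. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names and the alpha = 0.05 setup are choices made here for illustration, not part of the lecture.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0.0, sigma=1.0)

alpha = 0.05

# Two-tailed (non-directional): split alpha so 2.5% lies in each tail
z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)

# One-tailed (directional): all 5% lies in a single tail
z_one_tailed = standard_normal.inv_cdf(1 - alpha)

print(f"two-tailed critical z: +/-{z_two_tailed:.3f}")  # about 1.960
print(f"one-tailed critical z: {z_one_tailed:.3f}")     # about 1.645
```

For a lower-tailed directional test, the same one-tailed value is used with a minus sign.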
02:30 - 03:00 So let's move on to one-sample t-tests. These are very similar to one-sample Z tests, but there are a few differences. A one-sample t-test is a statistical hypothesis test designed to determine whether an unknown population mean is different from a specific value. For example, we could think about passenger vehicle emission standards from 2009 to 2012, and you could think
03:00 - 03:30 about nitrous oxide emissions for driving. What we'd want is no more than 0.07 grams per mile. So you can ask whether cars produced within this range of years meet that standard. You would take a sample of passenger cars produced during that time, you would test them for nitrous oxide emissions, and then you would ask whether that sample of cars had a mean that was similar to this or different. So here, what we're doing is we're not comparing our sample to a population
03:30 - 04:00 like we were with the one-sample Z test when we did IQ, where we had a population mean and a population standard deviation. Here we have a benchmark. This is a benchmark for emission standards in Canada: nitrous oxide emissions of 0.07 grams per mile. And we could ask, do these cars that were produced during that time meet the benchmark? What we do to assess that is a one-sample t-test.
04:00 - 04:30 On the t-test: it's called a t-test for lots of very interesting reasons. We're actually going to talk about those next time, when we talk about two-sample t-tests, but let's first talk about the t statistic. You've seen a graph that looks similar to this before; actually, if you look back at the correlation lecture, we've talked about this before. A t statistic is a difference score. Just like before, when we did the Z test, we were taking the difference between
04:30 - 05:00 the mean of our sample and the mean of our population. When we do a t-test, what we're doing is taking the difference between the mean of our sample and our target parameter, or our benchmark, our hypothesized expectation for what that value should be. And again, we're creating a signal-to-noise ratio: we're normalizing by a standard error.
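
The signal-to-noise ratio just described can be sketched from summary statistics alone. The example below is a minimal sketch, not the lecturer's own code: the function name `one_sample_t` is hypothetical, and the numbers are the IQ figures worked through later in this lecture (sample mean 103.48, benchmark 100, sample standard deviation 8.65, n = 132). It also checks the effect size quoted later; the value of 0.402 corresponds to dividing the mean difference by the sample standard deviation of 8.65.

```python
import math

def one_sample_t(sample_mean, benchmark, sample_sd, n):
    """Mean difference (signal) divided by the standard error (noise)."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - benchmark) / standard_error

# IQ figures quoted later in this lecture
t_value = one_sample_t(sample_mean=103.48, benchmark=100, sample_sd=8.65, n=132)

# Effect size: mean difference expressed in standard deviation units
cohens_d = (103.48 - 100) / 8.65

print(f"t = {t_value:.2f}")           # 4.62
print(f"Cohen's d = {cohens_d:.3f}")  # 0.402
```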
05:00 - 05:30 Now, here we have some different t distributions that differ depending on degrees of freedom. We talked about degrees of freedom in a previous lecture, when we talked about chi-squared, and we talked about degrees of freedom both changing distribution shapes and also being important. So think back to what you remember a degree of freedom is, and remember it's the number of independent elements within a statistic, a computed score,
05:30 - 06:00 that are free to vary. So when we're talking about a t-test that has one degree of freedom, we're talking about this red line here, and what you can see is that the tails are very heavy. We have a name for this; does anyone remember what it was? It's called leptokurtic. The distribution gets less leptokurtic if we increase the degrees of freedom. When degrees of freedom equals three, that distribution sort of
06:00 - 06:30 starts to look a little bit more normal. When the degrees of freedom equals eight, it starts getting closer. When the degrees of freedom equals 30, the t distribution is very, very similar to a normal distribution. This little tiny dashed line here is the normal distribution, and what you can see is that our distribution very clearly depends on degrees of freedom: as we increase the number of degrees of freedom we have in our t-test, our distribution starts to approach
06:30 - 07:00 normal. I will also mention that, just like in the Z test, it has a mean of zero, because under the null hypothesis there is no difference between the sample we took and the hypothesized parameter or benchmark that we're testing it against. So how do we compute a one-sample t-test, and how is it different from a Z test? Well, here's our formula. You can see it's very similar to the Z test. We have t, this time, equals the mean of the sample minus
07:00 - 07:30 the mean of our target. I've used the word population and called it mu here, but you could also call it benchmark, whatever you want. What we're doing here now is dividing by a very similar quantity as before: this is the sample standard deviation divided by the square root of n. Remember, with the Z test we divided by the population standard deviation, so it's not quite the same quantity, but very close. And n is the number of participants
07:30 - 08:00 in our sample. So if we take the same IQ test data that we used before, what we could actually do is calculate this t-test for our specific sample. So we could say t equals 103.48, because we said that was the mean of Western students' IQ in the sample that we showed earlier, minus 100, which is our benchmark or population mean,
08:00 - 08:30 and then, instead of dividing by 10, which is the theorized population sigma or standard deviation, what we're doing here is using the standard deviation of the sample, which we said was 8.65, and dividing that number by the square root of 132. So that gives us this quantity in the denominator: 8.65 divided by 11.49,
08:30 - 09:00 and that gives us 3.48 divided by 0.75, and that gives us a t value of 4.62. Now, that one's a little bit bigger than the last one we calculated. Now, our t critical: we have a critical value of t, which I've derived here using what we call the theoretical method; I looked this number up in the back of a book. Now, because we have a large sample here, 132 people, our t critical is going to be very, very close to our
09:00 - 09:30 critical value of Z. Remember, we said our critical value of Z was 1.645; this is 1.658. It's very close. So what you can see is that as the sample size here is increasing, our critical value of t is approaching the critical value of Z. So the t distribution is approaching normality, and you can see that in this derivation of the critical value. So our Cohen's d, if we want to calculate it out, is calculated just the same as before: the mean of our sample minus
09:30 - 10:00 the population mean, which gives us, we said, 3.48, divided by the standard deviation, here the sample's 8.65, and that gives us a Cohen's d of 0.402. So this is edging toward a medium effect size now. So that's our one-sample t-test. Now, importantly, I derived that one-sample t-test
10:00 - 10:30 theoretically, by looking up that value in a t table. Actually, I didn't do it in the back of a stats book like we did in the olden days; I looked it up online. But nonetheless it's a theoretical value, and in order for me to do that, I have to have a relatively normal distribution of IQ scores, and they also have to be independently sampled. Both of those things are true. There also are not very many outliers in the sample; there are actually no significant outliers in the sample. So we'll be talking about the
10:30 - 11:00 assumptions of the t-test later on. But if we do violate our assumptions, we can also do this using a randomization test. We can use the same kind of randomization procedure that we've used before to derive our critical value empirically. For one-sample tests, that uses a bootstrap resampling procedure, where we draw a sample with replacement, we make sure that the sample size we drew is the same as the original sample
11:00 - 11:30 size, we calculate our Z statistic or our t statistic, depending on our test, and we repeat this process many, many times. We build a distribution of the sample statistic, and then we look for the proportion of that distribution that's more extreme than our original test statistic, and that ultimately can give us our p-value. So that's what we're doing when we do a randomization test to compute the t statistic. Now, one question is which test should we
11:30 - 12:00 do when we're comparing a sample mean to a population mean, or to a benchmark parameter, or anything like that? One of the things we need to determine is whether or not we know sigma, whether or not we know that population standard deviation. I gave you the example of IQ scores, where we know what the population standard deviation is because it's predefined: the IQ test that we gave is normalized such that it has a standard deviation of 10. So we know the
12:00 - 12:30 standard deviation there. Often, for example with the cars, if we wanted to compare to a benchmark, which is just a single number, we wouldn't have a standard deviation of a population, because we're comparing to a single-item benchmark. So if we want to know whether those cars are meeting their emissions targets, we're comparing to a single-item benchmark and not to a distribution. So we're not comparing
12:30 - 13:00 there to a distribution, and in that case we don't have a known population sigma. So we need to determine first whether we know or do not know the sigma, and then we also need to know whether our sample size is large or small. Now, a large sample size, by the very roughest rule of thumb, is n greater than 30. Now remember, that's a sample size that is statistically reasonable but probably not representative of your population,
13:00 - 13:30 especially if we're thinking about psychological constructs, which can vary greatly from person to person. So remember that this rough rule of thumb here, n greater than 30, is, for statistical purposes in terms of the t distribution, a reasonable number, but it is probably not a good number for suggesting that a particular phenomenon or particular outcome will generalize to a larger population.
13:30 - 14:00 So whether we have a large sample or a small sample, we need to know that. And finally, we need to know our distribution shape. If you want to conduct a Z test, you kind of need to have a normal distribution; otherwise, if it's not normal, you need to rely on that randomization procedure, which doesn't require you to have a normal distribution. So when you're going through and thinking about which test you would do, it's
14:00 - 14:30 always a good idea to consider what your sample looks like relative to these elements: whether you know the population standard deviation, whether your sample size is large or small, and what its distribution shape looks like, whether it's normal or not normal. Those things will tell you which kind of test you should be doing. I will also tell you that we can do these tests with what we call proportions. There's a fantastic study, and you can look this one up, looking at
14:30 - 15:00 whether or not adults can detect their own children's lies. They play a funny game with the kids in this study. It's a card game, and what happens is the researcher leaves the room and leaves their cards on a table face down. Then the researcher comes back into the room a few minutes later and says to the kid, "Did you peek?" Almost all the kids say no, they didn't peek. Some of the kids really
15:00 - 15:30 didn't peek, and some of the kids peeked and said they didn't. So it's a very funny study; you can look up some video footage online of kids telling this lie. What we can ask is, if we think about chance, 50-50, who can detect kids' lies? Well, it turns out that undergraduates and law students are not so good at this. Social workers are ever so slightly better than chance, but child protection lawyers are not. Judges and customs
15:30 - 16:00 officers: judges are a little bit better than chance, but customs officers are not. Police officers are slightly better than chance. Other kids' parents are also at chance for detecting kids' lies, and even kids' very own parents are not significantly above chance. So we can compare proportions as well: the proportion of people who correctly guess whether or not a particular kid is lying. So we can do a similar process with proportions. We're not going to talk about the test for proportions today, but
16:00 - 16:30 it follows a very similar logic to the t-test and the Z test for single samples that we have looked at, where we're comparing a particular proportion of individuals to a benchmark proportion, like 50 percent as chance, or, you know, we could say 75 percent or whatever we wanted. So know that we can do a very similar process with proportions in the sample as well. I will leave it there for single-sample tests; we'll talk about two-sample tests
16:30 - 17:00 in the next lecture
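
The bootstrap randomization procedure described in this lecture (resample with replacement at the original sample size, recompute the statistic, repeat many times, then take the proportion of resampled statistics more extreme than the observed one) can be sketched as follows. This is only a sketch under stated assumptions: the scores are made up, the function names are hypothetical, the test is one-tailed in the upper direction, and the data are shifted so that their mean equals the benchmark before resampling. That shift is one common way to make the resamples reflect the null hypothesis, but it is not something the lecture specifies.

```python
import math
import random
import statistics

def t_statistic(sample, benchmark):
    """(sample mean - benchmark) / (sample SD / sqrt(n))."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - benchmark) / standard_error

def bootstrap_p_value(sample, benchmark, n_resamples=5000, seed=0):
    """Upper-tailed bootstrap p-value for a one-sample t statistic."""
    rng = random.Random(seed)
    observed_t = t_statistic(sample, benchmark)

    # Assumption: shift the data so the null hypothesis (mean == benchmark) holds
    shift = benchmark - statistics.mean(sample)
    null_sample = [x + shift for x in sample]

    more_extreme = 0
    for _ in range(n_resamples):
        # Draw with replacement, same size as the original sample
        resample = rng.choices(null_sample, k=len(sample))
        if statistics.stdev(resample) == 0:
            continue  # all values identical: t is undefined, count as not extreme
        if t_statistic(resample, benchmark) >= observed_t:
            more_extreme += 1
    return observed_t, more_extreme / n_resamples

# Made-up scores for illustration only (not the lecture's data)
scores = [104, 98, 110, 101, 95, 107, 112, 99, 103, 108, 97, 106]
observed_t, p_value = bootstrap_p_value(scores, benchmark=100)
print(f"observed t = {observed_t:.2f}, bootstrap p = {p_value:.4f}")
```

Fixing the seed keeps the result reproducible; a two-tailed version would compare absolute values of the resampled and observed statistics instead.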