TwoSample1

Summary

In this lecture, Erin Heerey introduces the two-sample t-test, starting with its historical origins: William Sealy Gosset, who worked for Guinness, published the test under the pseudonym 'Student'. The lecture explains the applications and assumptions of the independent samples t-test, which compares the means of two distinct groups. It then uses sampling error to frame the central question, whether a difference between two sample means reflects a real population difference, and closes by contrasting design types: experimental designs can support causal conclusions, whereas observational designs can only establish differences.

Highlights

The two-sample t-test helps compare two populations to determine if their means differ significantly. 📈
William Sealy Gosset used the pseudonym 'Student' for his t-test publication because Guinness did not allow its statisticians to link their work to their names. 🔍
Key assumptions for the t-test include normal distribution and no significant outliers. 📊
The concept of sampling error explains why two sample means might differ even if they come from the same population. 🤔
In experimental designs, t-tests can support causal conclusions, but in observational studies, they can only reveal differences. 🔬

Key Takeaways

The two-sample t-test is used to compare the means of two independent groups. 👫
William Sealy Gosset, under the name 'Student', developed the t-test while working at Guinness. 🍻
The assumptions of a two-sample t-test include normal distribution and homogeneity of variance. 📊
A t-test can determine if differences in means are due to sampling errors or real population differences. 💡
In experimental designs, t-tests can suggest causal relationships, while in observational ones, they show differences. 🔍

Overview

The two-sample t-test has its roots in the early 20th century with William Sealy Gosset, a statistician at Guinness. Under the pseudonym 'Student', Gosset introduced the test to compare means across different groups. The method has become an essential part of data analysis because it indicates whether a difference in sample means represents an actual difference or merely sampling error.

The lecture outlines the significant assumptions necessary for conducting a two-sample t-test: normal distribution, homogeneity of variance, and independent observations. These foundational requirements ensure accurate comparisons between groups like drug vs. placebo or different demographic segments. Understanding and checking these assumptions help statisticians deduce genuine differences, steering clear of misleading conclusions.
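
Two of these assumptions can be screened with a few lines of code. The sketch below is illustrative only and is not taken from the lecture: the function names `boxplot_outliers` and `variance_ratio`, the 1.5 × IQR cut-off (the usual box-plot whisker convention), and the example scores are all assumptions made here. The variance ratio is a rough heuristic, not a formal test of homogeneity of variance.

```python
import statistics


def boxplot_outliers(scores):
    """Return scores beyond the box-plot whiskers (1.5 * IQR past the quartiles)."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in scores if x < low or x > high]


def variance_ratio(sample_1, sample_2):
    """Ratio of the larger sample variance to the smaller one (1.0 means equal)."""
    v1 = statistics.variance(sample_1)
    v2 = statistics.variance(sample_2)
    return max(v1, v2) / min(v1, v2)


# Hypothetical scores for two conditions (made up for illustration).
drug = [12, 15, 11, 14, 13, 16, 30]
placebo = [10, 11, 9, 12, 10, 11, 12]

print("possible outliers (drug):   ", boxplot_outliers(drug))
print("possible outliers (placebo):", boxplot_outliers(placebo))
print("variance ratio:", round(variance_ratio(drug, placebo), 2))
```

A ratio close to 1 is consistent with similar variances in the two groups. How large a ratio is tolerable is a judgment call that this sketch does not settle.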

Deconstructing both observational and experimental designs, the lecture underscores that while t-tests in experimental settings can reveal causative effects thanks to controlled variable manipulations, in observational contexts, they are limited to identifying differences without attributing cause. This distinction is crucial for how researchers interpret and apply their findings, ensuring clarity in drawing conclusions from data.

Chapters

00:00 - 03:00: Introduction to Two Sample Tests and History This chapter introduces two-sample tests, focusing on comparing two groups with the independent samples t-test. It starts with the history: William Sealy Gosset developed the test at the Guinness brewery to compare the quality of stout brewed from different small batches of barley, and published it in the journal 'Biometrika' in 1908 under the pseudonym 'Student' because Guinness did not allow its statisticians to link their work to their names. Ronald Fisher later popularized the test. The chapter closes by contrasting the one-sample t-test with the two-sample t-test.
03:00 - 06:00: Application and Assumptions of Two Sample T-Tests The chapter lists what the test requires: one categorical independent variable with two levels (for example, drug versus placebo) and one dependent variable, preferably measured at the interval or ratio level. It then covers the assumptions: normally distributed data in both conditions, no significant outliers, homogeneity of variance, and independent scores. It ends by asking what the t-test tells us: whether two samples came from the same population or from different populations.
06:00 - 09:00: Understanding Sampling Error and Observational Design This chapter explains that two samples drawn from the same population are unlikely to have identical means, and that this difference is called sampling error. In an observational design, such as comparing men and women on a cognitive task, the question is whether the difference between the sample means is due to sampling error alone or is large enough that the samples come from different populations.
09:00 - 12:00: Experimental Design and Interpretation of T-Statistic The chapter turns to experimental designs, using a drug versus placebo example in which participants from one population are randomly assigned to conditions. The t-test is conducted the same way and tests the same hypotheses regardless of design type, but the interpretation differs: rejecting the null hypothesis in an experimental design supports a causal conclusion, whereas in an observational design it only establishes a difference.

TwoSample1 Transcription

00:00 - 00:30 in this lecture we'll talk about two sample tests so in two sample tests our goal is to compare one group to another group and the test we're going to talk about today is called independent samples t so by way of history t-tests were invented by William Sealy Gosset who wrote under the pseudonym Student so he came up with Student's t-test and
00:30 - 01:00 published it in the journal Biometrika in 1908 now William Sealy Gosset worked for Claude Guinness so the t-test was brought to you by beer and not just any beer this beer here um Guinness had a very active research lab and supported an active research lab and allowed his statisticians to publish the work that they produced as long as they didn't link their work
01:00 - 01:30 to their name which is why this test is published under the nickname of Student so William Gosset's problem was how to compare the quality of a stout beer made from different and small batches of barley so remember Guinness is a obviously it's a stout beer and it's it's brewed with barley and different batches of barley have different chemical properties and different qualities to them and so the idea here was how do we compare the quality of
01:30 - 02:00 beer made from a batch of barley from one batch of barley versus another because Guinness was very interested in improving the quality of his beer and making sure that it was consistent across batches and so he invented the Student's t now the use of the Student's t-test for small sample comparisons along with that distribution that you saw in a previous lecture was popularized by Ronald Fisher so Ronald Fisher is a is a the sort of man who likes to run with ideas so he
02:00 - 02:30 spent a lot of or heavily used the t-test in his work and published quite a lot and popularized that test as well so what's the difference between one sample t-test and a two-sample t-test in a one sample t-test we are testing whether a population mean has a value that is specified under a null hypothesis and the researcher has prior knowledge about the population so we know what its
02:30 - 03:00 mean is usually we've defined what its mean is and when we're thinking about or considering a two-sample test what we're doing is we're asking a slightly different question we're asking the question whether the mean of one population is the same or different to the mean of another population so here we have two populations Each of which we've sampled from so we have two sample means and we're asking whether those two means are the same or different the samples must be independent and that
03:00 - 03:30 is why this test is known as an independent samples t-test or two-sample t-test it's also called a between samples t-test or an unpaired samples t-test so you'll hear all of these ways of calling this t-test and you will hear me saying independent samples t-test and two sample t-tests most frequently so what do we need to do this test well first of all we need one independent variable so this is a categorical variable that has two levels to it it might be men versus women it might be
03:30 - 04:00 Western students versus Queen's students it might be drug condition versus a placebo condition it might be patient participants versus control participants then we need a dependent variable so each participant provides data on one outcome variable and preferably those data are ratio or interval level data points although in practice we also sometimes use survey data there and that is often ordinal it has some assumptions and now these
04:00 - 04:30 assumptions are the same across different t-tests so all t-tests have these or almost all t-tests have these assumptions we'll talk about one in a future lecture that has a little change but generally speaking the t-test requires a normal distribution so the data in your sample must be normally distributed and that's true for both one and two samples so if you're doing a one sample t-test your data need to be normally distributed within your sample if you're doing a two-sample t-test they
04:30 - 05:00 need to be normally distributed in both of your conditions they need to have no significant outliers and you can use box plots to detect these there are also other methods of detecting statistical outliers or outliers that are statistically different from the rest of the sample but you can see them and visualize them very easily on box plots we also require an assumption known as homogeneity of variance and I'm going to talk a little bit more about that in depth at the end
05:00 - 05:30 of or later on in this lecture um but the idea here is that both of your samples have similar variance so the variance of your first group or your first sample is about the same as the variance of your second sample and finally each of the scores must be independent of all the other scores so these are the assumptions that we need when we are considering doing a two-sample t-test so what does the t-test tell us
05:30 - 06:00 well from a conceptual standpoint remember we have we've thought about two different populations we can ask did the two samples come from the same population or different populations now remember researchers do not know we don't have prior knowledge about the population parameters the real population parameters in the world and so we estimate population parameters from sample data and so um what we're asking here is are there
06:00 - 06:30 differences in those in the sample statistics by which we can estimate population parameters so we're asking the question is the difference between the sample means indicative of a real group difference or could it just be down to sampling error remember that if we draw two samples from exactly the same population using for example Monte Carlo statistics so we know exactly what population we're drawing from we have a defined mean a defined population standard deviation and we draw two randomly selected samples
06:30 - 07:00 from that when we do that process those two samples even though we've drawn them from the same population are unlikely to have perfectly identical means so the difference between the means of the two samples that we've drawn from the same population is called sampling error and we've talked about sampling error before and so if the two means are genuinely from the same population the only difference between them should be sampling error
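
The Monte Carlo idea described here can be sketched in a few lines. This is an illustrative simulation rather than code from the lecture: the function name `simulate_mean_differences`, the population mean of 100, the standard deviation of 15, the sample size of 20, and the number of repetitions are all assumed values chosen for the example.

```python
import random
import statistics


def simulate_mean_differences(pop_mean=100.0, pop_sd=15.0, n=20,
                              n_sims=2000, seed=1):
    """Repeatedly draw two samples from the SAME normal population and
    return the difference between their means for each repetition."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_sims):
        sample_a = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_b = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        diffs.append(statistics.mean(sample_a) - statistics.mean(sample_b))
    return diffs


diffs = simulate_mean_differences()
print("average difference between means:", round(statistics.mean(diffs), 3))
print("spread (SD) of the differences:  ", round(statistics.stdev(diffs), 3))
print("pairs with exactly equal means:  ", sum(d == 0 for d in diffs))
```

Because both samples come from one population, the differences scatter around zero, and that scatter is the sampling error the t-test has to account for.
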
07:00 - 07:30 so let's think about an observational design let's say we have sample A and we know that sample A was drawn from population one and then we have sample B now the question we're asking is the means of sample A and B are unlikely to be exactly identical but the question we're asking here is is B different enough from population one that it is truly different or are these differences down
07:30 - 08:00 to simple sampling error so we could think about a population of I don't know men and women doing the cognitive task and what we want to ask is are they members of the same population or are they members of a different population well odds are the means if we you know calculate if we sit you down at a computer and have you do a cognitive task um and then we ask you whether you're you identify as male or female the means of your scores are unlikely to be all that
08:00 - 08:30 different but they're unlikely to be exactly identical either so we can ask the question well okay are is the difference between these means simply due to sampling error or are they so different that sample B has to be from a real different population are they just slight variations on one another or is sample B really a horse of a different color is it so different that it becomes on
08:30 - 09:00 that particular test a member of a different population right so we can think about men and women all being members of the population people and that's all fine and good but sometimes men and women are different enough that they are different they have different characteristics in particular ways and so they might be members of different populations in some ways and members of the same population in others so that's what we're thinking about when we're doing these kind of t-tests so that's when we do an observational design what
09:00 - 09:30 we're asking could these two samples whose means are not perfectly identical be from two different populations or are those differences so small that they could be represented by mere sampling error now we can also think about experimental designs in this way so if I'm going to give a drug treatment for example I'm going to give some people a pill that
09:30 - 10:00 contains an active ingredient in it a drug of some sort and I'm going to give the rest of my participants a pill that contains no amount of that active ingredient it looks just the same it's got the same filler ingredients in it but it just doesn't have anything any of the active treatment in it so here what we're doing is we know that we're drawing from the same population of people and so we take a sample from that population and we know it came from the same population we randomly selected people for example and we randomly assigned those
10:00 - 10:30 participants to condition A or condition B one of those conditions is the drug group one of those conditions is the placebo group and now what we're asking again those means are unlikely to be exactly identical but what we're asking is is condition B does that treatment make people so different from one another that they become a new population so are we changing them enough to make them different to where they were originally we know originally they were drawn from
10:30 - 11:00 population one because everyone was sampled from the same population we randomly assigned them to a treatment condition and so now we can ask does that treatment condition make them so different from where they were originally that they are now new so does the design type matter here well in terms of statistics it does not we conduct this t-test exactly the same way we test the same hypotheses so a
11:00 - 11:30 non-directional hypothesis would be that the means are different or not different depending on whether it was research or null and a directional hypothesis is that the mean of group one is higher or lower than the mean of group two or that they are not that they are the the same or the opposite direction when we're thinking about interpreting our t statistic there the design type matters quite a lot so in the experimental design we can make a causal conclusion that our treatment condition
11:30 - 12:00 changed the sample if we are rejecting the null hypothesis in the observational design we can only talk about differences we can't really talk about causes I'm going to pause there on the lecture and we'll pick up in the next section
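
This section of the lecture stops before giving a formula, so the following is only a sketch of the standard pooled-variance (Student's) form of the independent samples t statistic, which relies on the homogeneity of variance assumption described earlier. The function name `pooled_t` and the drug and placebo scores are hypothetical values made up for illustration. The calculation is identical for observational and experimental data; only the interpretation changes.

```python
import math
import statistics


def pooled_t(sample_1, sample_2):
    """Return (t, df) for an independent samples t-test with pooled variance."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    var_1 = statistics.variance(sample_1)  # sample variance (n - 1 denominator)
    var_2 = statistics.variance(sample_2)
    df = n1 + n2 - 2
    # Weighted average of the two variances, assuming they are similar.
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df
    # Standard error of the difference between the two sample means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean_1 - mean_2) / std_err, df


# Hypothetical outcome scores for two randomly assigned conditions.
drug = [12, 15, 11, 14, 13, 16]
placebo = [10, 11, 9, 12, 10, 11]

t_value, df = pooled_t(drug, placebo)
print(f"t = {t_value:.3f} with df = {df}")
```

The resulting t would then be compared with the t distribution for `df` degrees of freedom to decide whether the difference is larger than sampling error alone would plausibly produce.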