Summary
In this engaging StatQuest episode, Josh Starmer breaks down the concept of alternative hypotheses in the context of hypothesis testing. The video serves as a follow-up to the previous session on hypothesis testing, emphasizing the importance of understanding when to reject or fail to reject a null hypothesis. By using a fun and accessible approach, Josh illustrates how statistical tests use data, a null hypothesis, and an alternative hypothesis to guide decision-making in experiments, particularly when comparing the effectiveness of different drugs. He also discusses the nuances that arise when comparing more than two groups and highlights the power of a well-defined alternative hypothesis in statistical testing.
Highlights
StatQuest's Josh Starmer makes hypothesis testing fun and engaging! 🎉
Alternative hypotheses guide our decisions in experiments, especially with drug comparisons. 💊
Understand the need to clearly define your alternative hypothesis in statistical tests. 🧐
With multiple groups, the alternative hypothesis can vary, impacting the outcome. 🔍
Primary goal: Reject or fail to reject the null hypothesis, not accepting alternatives! 🚀
Key Takeaways
Alternative hypotheses are crucial for hypothesis testing, contrasting the null hypothesis. 📊
Hypothesis testing involves determining whether to reject or fail to reject a null hypothesis. 🤔
Statistical tests require data, a null hypothesis, and an alternative hypothesis. 📈
For two groups, the alternative hypothesis is the opposite of the null; for more, it gets interesting! 🎯
Never say you 'accept' an alternative hypothesis, only reject or fail to reject the null. 🚫
Overview
In this StatQuest episode, Josh Starmer dives into the fascinating world of alternative hypotheses. He unravels the complexities of hypothesis testing in a straightforward and engaging manner, ensuring you're equipped to understand when to reject or fail to reject a null hypothesis. Through relatable examples, particularly involving drug comparisons, Josh makes the topic both accessible and entertaining.
Josh emphasizes the importance of the alternative hypothesis, which in two-group comparisons, simply opposes the null hypothesis. However, things get more intricate when more groups are involved, necessitating careful definition of alternative hypotheses. This clarity can significantly influence the results of statistical tests, and understanding this nuance is crucial for accurate data interpretation.
The video also underscores a critical principle: in hypothesis testing, we don't 'accept' an alternative hypothesis; rather, we only reject or fail to reject the null. Such precision in language reflects the statistical rigor that underpins scientific experimentation. Josh signs off with a call to action for viewers to explore further learning resources and support the ongoing educational mission of StatQuest.
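The two-group comparison that the video describes in hand-wavy terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the recovery times are made up, the helper `sum_sq_dist` is hypothetical, and "distance" is taken to mean squared distance, which the video does not specify.

```python
from statistics import mean

def sum_sq_dist(values, center):
    """Sum of squared distances from each observation to a center."""
    return sum((v - center) ** 2 for v in values)

# Hypothetical recovery times in days (made up for illustration).
drug_c = [4.0, 5.0, 6.0]
drug_d = [9.0, 10.0, 11.0]
all_data = drug_c + drug_d

# Null hypothesis: a single mean summarizes all of the data.
ss_null = sum_sq_dist(all_data, mean(all_data))

# Alternative hypothesis: each drug gets its own mean.
ss_alt = sum_sq_dist(drug_c, mean(drug_c)) + sum_sq_dist(drug_d, mean(drug_d))

# Fraction of the distance around the single mean removed by using two means.
reduction = (ss_null - ss_alt) / ss_null

print(f"around one mean:  {ss_null}")
print(f"around two means: {ss_alt}")
print(f"reduction:        {reduction:.3f}")
```

With these made-up numbers the two separate means remove about 90% of the squared distance, which is the kind of large gap that points toward rejecting the null hypothesis. A real statistical test would convert this comparison into a p-value rather than judging the gap by eye.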
Chapters
00:00 - 00:30: Introduction to StatQuest Josh Starmer opens with advice about watching StatQuest depending on the weather: stay inside if it's raining, or watch on a mobile device if it's sunny. The focus of this chapter is on understanding alternative hypotheses, building on previous discussions on StatQuest.
00:30 - 01:30: Quick Review of Hypothesis Testing The chapter provides a quick review of hypothesis testing, emphasizing the concept of the null hypothesis. It suggests checking out a previous 'quest' for detailed information, accessible through a link in the description. The review highlights that instead of testing numerous potential hypotheses to determine if two drugs differ, the null hypothesis is used to ascertain if a difference exists. The idea is to simplify the analysis and reduce stress by focusing on the null hypothesis during such tests.
01:30 - 02:30: Understanding the Null Hypothesis The chapter discusses how to understand and interpret the null hypothesis in the context of experiments. It explains that if an experiment with a large number of participants shows significantly shorter recovery times for those taking drug C compared to those taking drug D, and these differences are unlikely due to random factors like better diet or more exercise, then the null hypothesis can be rejected. This rejection would indicate a real difference between the effects of drug C and drug D.
02:30 - 04:00: Statistical Testing Basics The chapter provides a basic understanding of statistical testing, emphasizing the importance of the hypothesis testing process. It discusses how the results can vary due to random variations and introduces the concept of failing to reject the null hypothesis if the results are not significantly different. The chapter transitions into discussing the alternative hypothesis by presenting data on recovery times of individuals taking Drugs C and D, setting a practical context for statistical comparison.
04:00 - 05:00: The Role of the Alternative Hypothesis This chapter discusses the process of determining whether to reject or fail to reject the null hypothesis using statistical tests. It highlights that the procedure involves running data through a statistical test, which then provides a decision on the null hypothesis. Additionally, it notes that a statistical test requires three components: data, a null (or primary) hypothesis, and an alternative hypothesis.
05:00 - 07:30: Testing the Null Hypothesis with Means The chapter titled 'Testing the Null Hypothesis with Means' focuses on the foundational elements required for conducting hypothesis testing involving means. It outlines the need for a null or primary hypothesis, which serves as the basis for the testing process, where a decision is made either to reject it or fail to reject it. Additionally, it introduces the concept of an alternative hypothesis, which is essentially the opposite of the null hypothesis. The chapter hints at some complexity, suggesting that some concepts might be explained in a simplified manner for better understanding.
07:30 - 11:00: Alternative Hypotheses with Multiple Groups This chapter discusses the importance and application of alternative hypotheses in statistical tests, specifically focusing on scenarios involving multiple groups. The alternative hypothesis is crucial in determining differences between groups, such as in the case of comparing different drugs. The chapter notes that detailed methodologies for these tests are covered elsewhere, in a recommended StatQuest playlist. The discussion is conceptual, aimed at providing an understanding rather than a detailed procedural guide.
11:00 - 13:00: Importance of Stating the Alternative Hypothesis The chapter emphasizes the importance of stating the alternative hypothesis in a comparative analysis of two drugs, drug C and drug D. It explains the process of calculating distances between observations and their respective means. The null hypothesis posits no difference in these distances around a single mean, whereas the alternative hypothesis suggests there are differences between the distances around the two separate means.
13:00 - 15:00: Summary of Statistical Testing This chapter discusses the concept of statistical testing, particularly focusing on hypothesis testing involving means. It explains how to determine whether to reject or fail to reject a null hypothesis by examining the distances around the means. If the distances around two means are much shorter than those around a single mean, it suggests that using two means is more appropriate and the null hypothesis should be rejected. Conversely, if the distances are not dramatically different, we fail to reject the null hypothesis.
15:00 - 16:00: Further Learning and Conclusion The chapter emphasizes that differences between two means can often be attributed to random, unaccounted variables. An example given is an individual's reduced exercise resulting in slower recovery, suggesting the perceived difference in means might not be significant. This highlights the importance of considering potential random factors when interpreting data differences. The conclusion likely reinforces learning points from earlier chapters, urging readers to critically analyze statistical results.
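The point from the multiple-groups chapter, that different alternative hypotheses can lead to different decisions about the same null hypothesis, can be sketched as follows. This is an illustration under stated assumptions rather than the method shown in the video: the data are made up, the helper names are hypothetical, and the F statistic for nested models is one common way to compare the distances, chosen here because the video points viewers to a playlist on linear models.

```python
from statistics import mean

def sum_sq_dist(groups):
    """Sum of squared distances from each value to the mean of its own group."""
    total = 0.0
    for group in groups:
        center = mean(group)
        total += sum((v - center) ** 2 for v in group)
    return total

def f_statistic(null_groups, alt_groups):
    """F statistic comparing an alternative grouping to the null grouping.

    The number of means in each grouping is its number of parameters.
    """
    n = sum(len(g) for g in alt_groups)
    p_null, p_alt = len(null_groups), len(alt_groups)
    ss_null, ss_alt = sum_sq_dist(null_groups), sum_sq_dist(alt_groups)
    return ((ss_null - ss_alt) / (p_alt - p_null)) / (ss_alt / (n - p_alt))

# Hypothetical recovery times in days (made up for illustration).
c = [4.0, 5.0, 6.0]
d = [5.0, 6.0, 7.0]
e = [9.0, 10.0, 11.0]

null = [c + d + e]           # null: one mean for all three drugs
alt_all_differ = [c, d, e]   # alternative 1: a separate mean for each drug
alt_e_differs = [c + d, e]   # alternative 2: C and D share a mean, E has its own

f_all = f_statistic(null, alt_all_differ)
f_e = f_statistic(null, alt_e_differs)

print(f"F, all three drugs differ: {f_all:.1f}")
print(f"F, only drug E differs:    {f_e:.1f}")
```

The same null hypothesis yields a different statistic under each alternative (about 21 versus about 37.8 with these numbers), and each is judged against a different reference distribution because the alternatives use different numbers of means. That is why the alternative hypothesis has to be stated explicitly before the test is run.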
Alternative Hypotheses: Main Ideas!!! Transcription
00:00 - 00:30 if it's raining outside stay inside and watch stat quest if it's sunny outside go outside and watch stat quest on a mobile device stat quest hello I'm Josh Starmer and welcome to stat quest today we're going to talk about alternative hypotheses so that you understand the main ideas note this stat quest follows up on the stat quest on
00:30 - 01:00 hypothesis testing and the null hypothesis if you haven't already seen that one check out the quest the link is in the description below either way let's do a super quick review in the stat quest on hypothesis testing we learn that rather than get stressed out over a large number of possible hypotheses that we could test to see if two drugs are different we simply use the null hypothesis to determine if there is a difference we learn that if we do an
01:00 - 01:30 experiment with a bunch of people and a lot more people taking drug C had shorter recovery times than people taking drug D so many that it would be hard to imagine that the results were due to random things like everyone taking drug C had better diets or got more exercise than the people taking drug D then we can reject the null hypothesis and then we would know that there is a difference between drug C and
01:30 - 02:00 drug D alternatively we learn that if little random things could easily shift the result from being in favor of one drug to another then we would fail to reject the null hypothesis and then we said triple BAM now that we're done with our review let's talk about the alternative hypothesis first here's some data that shows how quickly people taking drugs C and D recovered from a virus the goal of
02:00 - 02:30 collecting all of this data is to determine if we should reject or fail to reject the null hypothesis in order to decide if we should reject or fail to reject the null hypothesis we run the data through something called a statistical test and the output of the statistical test is a decision to reject or fail to reject the null hypothesis a statistical test needs three things one
02:30 - 03:00 it needs data two it needs a null or primary hypothesis i.e. it needs something to reject or fail to reject and three it needs an alternative hypothesis in this case the alternative hypothesis is simply the opposite of the null hypothesis warning things are about to get a little hand wavy the idea is to give you a
03:00 - 03:30 general sense of why the alternative hypothesis is important and is used in statistical tests not to give you all the details of how those tests work that said if you want the details there's a stat quest playlist that goes through examples step by step the link is in the description below now one way to test the null hypothesis that there is no difference between drug C and D is to calculate a mean value for
03:30 - 04:00 all the data from both drugs and calculate the distances between each observation and the mean and compare those to distances calculated from individual means for drug C and drug D the distances around the single mean represent the null hypothesis that there is no difference and the distances around the two separate means represent the alternative hypothesis if the distances
04:00 - 04:30 around two means are much shorter than the distances around the single mean then that suggests that using two means to summarize the data makes more sense than using one so we reject the null hypothesis alternatively if the data looked like this and the distances from the single mean were not dramatically different from the distances around the separate means then that would suggest
04:30 - 05:00 that the difference between two means only reflects little random things that we can't account for for example it could be that the subtle difference in the means is due to this one guy getting less exercise than everyone else if he had exercised just a little bit more he might have recovered from the illness a little faster and then we would no longer see a difference between the two means so in this case we would fail to
05:00 - 05:30 reject the null hypothesis just if you're familiar with machine learning lingo failing to reject the null hypothesis is the same thing as realizing that using two averages means that you have overfit the data if you're not familiar with machine learning lingo ignore what I just said or better yet check out the machine learning stat quests note when we only have two groups of data the alternative
05:30 - 06:00 hypothesis is pretty obvious because it is simply the opposite of the null hypothesis however when we have three or more groups the alternative hypothesis becomes more interesting in this case the null hypothesis is that there is no difference between drug C D and E and like before we can represent the null hypothesis by measuring the distances from the data to a single mean
06:00 - 06:30 value however now we have choices for the alternative hypothesis one alternative hypothesis could be that all three drugs are different and in this case we would measure the distances from a separate mean for each drug or the alternative hypothesis could be that there is no difference between drug C and D but drug E is doing its own thing in this case we would calculate
06:30 - 07:00 the distances from a single mean value for drug C and D and a separate mean for drug E so far we have two different alternative hypotheses and depending on which one we use in the statistical test we can end up making a different decision about the null hypothesis and that is why it is important to clearly state which alternative hypothesis we want to use
07:00 - 07:30 however regardless of the alternative hypothesis we used in the test we only reject or fail to reject the primary or null hypothesis if we tested the null hypothesis using this alternative hypothesis and we rejected the null hypothesis we might say that we rejected it in favor of this alternative hypothesis however we would still not say we accept
07:30 - 08:00 the alternative hypothesis because just like we saw in the stat quest on hypothesis testing other alternatives might be better in other words there are too many possibilities to test to know if we have accepted the correct one and this is why we only reject or fail to reject the null or primary hypothesis BAM in summary a statistical test needs
08:00 - 08:30 three things one it needs data two it needs a null or primary hypothesis and three it needs an alternative hypothesis when we only have two groups of data the alternative hypothesis is super obvious because it is just the opposite of the null hypothesis but when we have three or more groups we have options for the
08:30 - 09:00 alternative hypothesis and depending on which one we use in the statistical test we can end up making a different decision about the null double bam note if you're not already familiar with p-values these stat quests would be an awesome follow-up to this one and if you want to learn more about statistical testing check out the playlist on linear models it sounds fancy but if you've
09:00 - 09:30 made it this far it will be a snap now it's time for some shameless self-promotion if you want to review statistics and machine learning offline check out the stat quest study guides at statquest.org there's something for everyone hooray we've made it to the end of another exciting stat quest if you liked this stat quest and want to see more please subscribe and if you want to support
09:30 - 10:00 stat quest consider contributing to my patreon campaign becoming a channel member buying one or two of the stat quest study guides or a t-shirt or a hoodie or just donate the links are in the description below alright until next time quest on