ANOVA1
Summary
In this lecture by Erin Heerey, the focus is on one-way analysis of variance (ANOVA), a vital technique in psychology statistics. The session begins with a discussion of the assumptions of null hypothesis significance testing, highlighting common misconceptions. Key assumptions such as normal distribution, independence, and homogeneity of variance are explored in depth. The lecture also addresses the significance of data wrangling, emphasizing the need to set procedures before data collection. Tests that check these assumptions, such as a normality test, are outlined, along with the role of p-values in hypothesis testing.
Highlights
- Discussion on one-way ANOVA, a commonly used method in psychology. 📘
- Understanding assumptions in null hypothesis testing, such as independence and normality. 🤔
- Importance of data wrangling, to refine and prepare data for analysis. 🔄
- Exploration of how to test assumptions to ensure statistical validity. ⚙️
- Role of p-values in determining the statistical significance of research findings. 📊
Key Takeaways
- ANOVA is a crucial technique in psychological statistics, frequently used in research. 📊
- Understanding null hypothesis testing assumptions is essential for accurate statistical analysis. 💡
- Check assumptions using formal tests of normality and homogeneity of variance to ensure valid results; independence is established mainly through study design. ✅
- Data wrangling is a critical step before data analysis, and procedures should be pre-defined. 🔍
- P-values help determine the statistical significance of your findings, interpreted with care. 📉
Overview
Erin Heerey leads a deep dive into the intricacies of one-way analysis of variance (ANOVA), a technique prevalently employed in psychology studies. The lecture opens by dispelling prevalent misconceptions surrounding the assumptions in null hypothesis significance testing. Critical assumptions such as normal distribution, independence of observations, and homogeneity of variance are dissected to provide a clearer understanding of their role in research.
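As a rough illustration of what a normality check looks like in code, here is a minimal sketch. The lecture does not name the specific test used in the course, so this uses the Jarque-Bera test as a stand-in, chosen only because it can be written with the Python standard library. The function name `jarque_bera` and the simulated samples are assumptions made for this example. The null hypothesis is that the sample comes from a normally distributed population, so a large p-value means we fail to reject normality.

```python
import math
import random


def jarque_bera(sample):
    """Return (JB statistic, approximate p-value) for a sequence of numbers.

    JB combines sample skewness and kurtosis; both match a normal
    distribution (skewness 0, kurtosis 3) when JB is close to zero.
    """
    n = len(sample)
    mean = sum(sample) / n
    # Central moments, dividing by n as the Jarque-Bera statistic requires.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under the null hypothesis JB is approximately chi-square with 2 degrees
    # of freedom, whose upper-tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value


# Simulated data, invented for illustration only.
rng = random.Random(42)
normal_like = [rng.gauss(100, 15) for _ in range(500)]
skewed = [rng.expovariate(1.0) for _ in range(500)]

jb_normal, p_normal = jarque_bera(normal_like)
jb_skewed, p_skewed = jarque_bera(skewed)
print(f"normal-like sample: JB = {jb_normal:.2f}, p = {p_normal:.3f}")
print(f"skewed sample:      JB = {jb_skewed:.2f}, p = {p_skewed:.3g}")
```

The chi-square approximation is only rough for small samples, so this sketch is best read as a picture of the logic (retain the null, then proceed with the standard test) rather than as a replacement for whatever test the lab materials use.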
Discussion then shifts to data wrangling, underscoring its necessity although it might not be covered extensively in the course. Heerey reminds students of the importance of establishing data preparation procedures before embarking on data collection. Tests to verify assumptions, including tests for normality and variance consistency, are crucial touchpoints to ensure that analyses remain robust and valid.
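The wrangling steps described in the lecture (renaming variables, handling incomplete cases, computing summary scores) can be sketched in a few lines. This is a hypothetical example: the column names `Q1` to `Q5`, the `extraversion_` prefix, the participant IDs, and the rule of dropping any incomplete case are all assumptions made for illustration, not procedures taken from the course.

```python
from statistics import mean

# Hypothetical raw export; names and values are invented for illustration.
raw_rows = [
    {"id": "p01", "Q1": 4, "Q2": 5, "Q3": 3, "Q4": 4, "Q5": 5},
    {"id": "p02", "Q1": 2, "Q2": None, "Q3": 3, "Q4": 2, "Q5": 1},  # incomplete
    {"id": "p03", "Q1": 1, "Q2": 2, "Q3": 2, "Q4": 1, "Q5": 3},
]

# Renaming plan, written down before any data are collected.
RENAME = {f"Q{i}": f"extraversion_{i}" for i in range(1, 6)}
ITEMS = list(RENAME.values())


def rename_columns(row):
    """Swap uninformative survey column names for descriptive ones."""
    return {RENAME.get(key, key): value for key, value in row.items()}


def is_complete(row):
    """A case counts as complete only if every scale item was answered."""
    return all(row.get(item) is not None for item in ITEMS)


def add_summary_score(row):
    """Add the mean of the scale items as a single summary score."""
    scored = dict(row)
    scored["extraversion_mean"] = mean(row[item] for item in ITEMS)
    return scored


renamed = [rename_columns(row) for row in raw_rows]
complete = [row for row in renamed if is_complete(row)]
clean_rows = [add_summary_score(row) for row in complete]

for row in clean_rows:
    print(row["id"], row["extraversion_mean"])
```

Keeping each rule in its own small function makes it easy to state the whole plan in advance and apply it unchanged once the data arrive.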
Additionally, the lecture underscores the importance of interpreting p-values correctly. P-values are crucial in evaluating the significance of one's findings but should not be confused with other statistical measures like Z scores. Through a comprehensive understanding of these elements, students are equipped to navigate statistical challenges effectively in their research endeavors.
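The point that p-values do not behave like scores on a number line can be checked numerically. The sketch below assumes the test statistic follows a standard normal distribution, which is a simplification for illustration; the function names `p_two_tailed` and `p_upper_tailed` are made up for this example. It computes the p-value for a test statistic of zero in a two-tailed test and in a one-tailed test whose rejection region is in the upper tail.

```python
from statistics import NormalDist

# Assumption for illustration: the test statistic is standard normal.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_two_tailed(z):
    """Probability of a statistic at least as far from zero as z, either side."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def p_upper_tailed(z):
    """Probability of a statistic at least as large as z (upper-tail test)."""
    return 1 - STANDARD_NORMAL.cdf(z)


for z in (-3.0, 0.0, 1.645, 1.96, 3.0):
    print(
        f"z = {z:6.3f}  two-tailed p = {p_two_tailed(z):.4f}  "
        f"upper-tailed p = {p_upper_tailed(z):.4f}"
    )
```

At a statistic of zero the two-tailed p-value is 1 and the one-tailed p-value is 0.5. In the one-tailed case the p-value keeps growing toward 1 as the statistic moves away from the rejection region.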
Chapters
- 00:00 - 03:00: Introduction to One-Way ANOVA and Null Hypothesis Assumptions This chapter introduces the concept of one-way analysis of variance (ANOVA), a widely used statistical technique in psychology. Before delving into the details of ANOVA, the lecture addresses some common assumptions related to null hypothesis significance testing, aiming to clarify existing misconceptions.
- 03:00 - 05:00: Importance of Data Wrangling and Assumptions in Statistical Testing This chapter emphasizes the importance of data wrangling in the field of statistics. While it is not extensively covered in the class, it is noted as a critical area of study for those continuing in statistics. The chapter also touches on the assumptions related to null hypothesis significance testing, highlighting that all statistical tests come with underlying assumptions that must be acknowledged.
- 05:00 - 07:00: Theoretical Distributions and Their Assumptions The chapter 'Theoretical Distributions and Their Assumptions' explores the importance of making specific assumptions when utilizing theoretical distributions of test statistics. Key assumptions include whether observations are normally distributed, the independence of observations, and the homogeneity of variance across groups. These assumptions are critical for the appropriate application of theoretical distributions in statistical testing.
- 07:00 - 10:00: Testing Assumptions: Normality, Independence, Homogeneity of Variance This chapter delves into essential assumptions that underpin statistical analyses: normality, independence, and homogeneity of variance. It explains how theoretical distributions are generated using Monte Carlo sampling methods from perfectly normally distributed data. The importance of this perfect normal distribution is highlighted, noting the challenges that arise when actual data deviates from this model, making comparisons problematic.
- 10:00 - 15:00: Implications of Failing Assumptions in Statistical Testing The chapter discusses the implications of assumptions failing in statistical testing, particularly focusing on the assumption of normal distribution. When data is not normally distributed, the test statistics derived from samples may not align with those predicted by a perfectly normal distribution. This discrepancy can affect the standard deviation and variance, leading to potentially misleading test results.
- 15:00 - 17:00: Non-Statistical Assumptions in Study Design and Measurement The chapter discusses the importance of ensuring that study samples align with theoretical distributions, which is achieved by setting non-statistical assumptions. These assumptions play a crucial role when referencing critical values from statistical tables, regardless of whether these tables are online or in textbooks.
- 17:00 - 21:00: Understanding P-Values and Critical Values in Statistical Tests The chapter discusses the importance of ensuring that data meet certain criteria when conducting statistical tests, particularly focusing on the assumptions of independence of observations. It emphasizes that the independence of observations is a critical assumption that should be established during the experimental study design process to avoid problems. The chapter further examines whether the sampling procedures and study design produce independent data and discusses the implications of related observations on the results of a study.
- 21:00 - 24:30: The Role of Data Wrangling in Research The chapter discusses the role of data wrangling in research, focusing on statistical analysis. Various techniques are mentioned that aid researchers in managing interdependent data. Although formal testing for the independence of observations exists, it is noted that such formal methods will not be covered in this class. Future statistical learning might include these formal tests.
ANOVA1 Transcription
- 00:00 - 00:30 in this lecture we're going to talk about one-way analysis of variance which is a very common technique used in psychology statistics so this is one you'll see frequently in articles and maybe you'll even do some of this yourself before we get on to the main star of the show I'd like to talk a little bit about the assumptions associated with null hypothesis significance testing because I think there are still a few misperceptions amongst you and I think it's a good idea to clear those up
- 00:30 - 01:00 and we also need to talk about data wrangling now data wrangling is a really important topic that we don't cover very much in this class if you continue on your statistics journey you will do so but for now it's important for us to mention it and talk about it and know what it is but we probably won't do a lot with it in this particular class so the assumptions associated with null hypothesis significance testing as you know all of the tests that we do have
- 01:00 - 01:30 particular assumptions often these are about whether observations are normally distributed whether the observations are independent the degree to which there is homogeneity of variance across groups and so forth now we know that when we use theoretical distributions of test statistics we have to make these assumptions and the reason why we have to do that is that when we use these
- 01:30 - 02:00 theoretical distributions the values that we're comparing to are sampled using a Monte Carlo type sampling method these samples are drawn from perfectly normally distributed data and so when those data are perfectly normally distributed and your particular data are not it becomes difficult to compare your particular data to a distribution that is perfectly
- 02:00 - 02:30 normally distributed and the reason why is because the standard deviation of a distribution of test statistics created from your specific sample might not be the same as the standard deviation or the variance in a distribution of your sample statistic that's created off of perfectly normal data and so we need to
- 02:30 - 03:00 address that and make sure our own samples match the sort of theoretical distributions that they're drawn from and that's the purpose of making those assumptions so anytime you are looking up a critical value in a table somewhere whether it's an online table or a table in the back of a textbook by merely doing that you are assuming
- 03:00 - 03:30 that your data meet these criteria and if they don't that can lead to problems so the main assumptions we have are the independence of observations and this is one that we certainly establish during the experimental study design process so our sampling procedures are they independent is our study design a design that produces independent data when observations are related the relationship is an effect that we need
- 03:30 - 04:00 to account for in the statistical analysis process now we can do that there are many techniques that allow researchers to do that but some of them get tricky to apply depending on how the data are interdependent with one another we will not do formal testing of the independence of observations although there are formal tests for it as I said earlier if you move on in your statistics journey it's highly likely that you will encounter those but we won't do that formally in this class
- 04:00 - 04:30 what we will test formally in this class are normally distributed observations so there's actually a hypothesis test that allows us to ask the question are the data that we're seeing drawn from a normally distributed population so is your sample likely to be drawn from a normally distributed population or not and so that's one of the things that we'll be looking for in some of our lab material when sample data are non-normal that can
- 04:30 - 05:00 lead to type 1 or type 2 errors depending on how it is they are non-normal and then we also test the assumption of homogeneity of variance or homoscedasticity depending on the test we're using so homogeneity of variance applies if you have your data partitioned into groups for example if you're comparing people who got treatment one to people who got treatment two there you're looking for homogeneity of variance if you're looking at data that occur across a range
- 05:00 - 05:30 so maybe the range of the number of classes you've taken at university for example that would be data that occur across a range and we could use that as a variable to predict something else if we were interested in that and there what we would be testing is homoscedasticity so across the range of data if you've taken five classes if you've taken 10 classes across each of those bands what we would expect is our data are normally distributed within
- 05:30 - 06:00 those groups as well so that's kind of what we're looking for when we consider homoscedasticity so if we've met the assumptions and the answer is we test them there are statistical tests that we use to check these assumptions and these are tests where we actually wish and desire to retain our null hypothesis we want to fail to reject the null hypothesis so if we fail to reject the relevant null
- 06:00 - 06:30 hypothesis then you can compute the standard analytical or theoretical processes so you can look your critical value up in a table right you can do a standard statistical process without doing randomization so using a standard statistical process in a statistical package or in Python if you continue in your statistics journey next year or into the next class you'll be using R
- 06:30 - 07:00 and R will be one of the things that you learn which is a package that's designed specifically for statistics a lot of psychologists use it so what we do is we check our assumptions and for example if we were going to test the Assumption of normality what we would be wanting to test is the null hypothesis that our sample is drawn from a population with a normally distributed PDF as opposed to a
- 07:00 - 07:30 population that is not normally distributed that's our research our alternate hypothesis in that one so there we kind of want to fail to reject the null hypothesis because that allows us to know that our data are normal right so the hypothesis that we're testing in that particular test is whether our distribution deviates from normal there is also a test that allows us to examine differences in the variances of the groups so there our null hypothesis is whether or not the
- 07:30 - 08:00 samples are drawn from populations that have the same variance and finally we can test independence of the observations by checking whether the observations are uncorrelated with each other and as I said we won't do that formally in this class if we find violations in these statistics then we need to examine the data in another way we can do that using randomization tests where we build up a distribution of our test statistic our
- 08:00 - 08:30 analysis of variance which we'll be talking about later today or our t-test or whatever test we're doing where we actually build the distribution of that test statistic using our specific sample data in which case we need to be much more careful about whether our data apply to the larger population if we want to think about generalizing but it's also the case that we know that our p-values and critical values are exactly accurate and we can
- 08:30 - 09:00 do that using randomization tests which we've talked about in lecture and done in lab and bootstrapping and other types of processes that involve randomizing the data that we take and finally we also need to think about the non-statistical assumptions that we need to make is the sample representative of the population is it large enough to be a reliable representation of that population so we've talked about
- 09:00 - 09:30 when we started the t-test where I showed you the distributions of t and how as the degrees of freedom got larger as it approached 30 the t distribution began to approach normal and be not statistically different from normal anymore now that's fine 30 is a fine number but if you think about the variability in human behavior across a culture across an area across you
- 09:30 - 10:00 know different kinds of situations you have to ask yourself 30 is probably not enough to be representative of the population that you're interested in studying even if it's a small population even if it's just people living in London Ontario 30 probably isn't enough and so you need to think about both the statistical reasons for sample size but also the non-statistical reasons you need to figure out whether the study design is consistent with the
- 10:00 - 10:30 theory that's driving the project are the study's measurement or manipulation procedures are they both precise and valid so are they reliable do they measure what they're expected to measure do they manipulate what they should be manipulating and sometimes it's easier to tell than others if we have for example a dose of drug we can check that manipulation really easily by giving you a dose of medicine and then measuring
- 10:30 - 11:00 the levels of that in your blood and that's a really easy process to do and your doctor has probably taken blood tests from you in the past or will do at some point in the future and they'll test for levels of a particular substance in your blood whether it's something that's endogenously produced or something that you put in there so these are important ways of testing the validity but this is something that's extremely important to think
- 11:00 - 11:30 about and consider when you're interpreting the data that you get and finally you need to consider whether the null and the research hypotheses truly complete the sample space because they don't always and there's a lot of nuance associated with that process we also need to consider p-values and how they relate to distributions so I suspect by now in this class that if I asked you what was the p-value associated with this dark blue line right here this dark blue line right here or either
- 11:30 - 12:00 one of these down on this distribution here I expect you would be able to tell me that that p-value is equal to 0.05 because that is our critical value it's our threshold in statistics I also expect you would be able to tell me that for anything that was farther out more extreme in the distribution than p equals 0.05 the p-value gets smaller as we go out toward the tails of the distribution good so what is your p-value in a non-directional test that's this
- 12:00 - 12:30 two-tailed case on the bottom here what is your p-value when you get a score of zero I'll tell you what it is but I want you to pause and think about this the p-value when you get a score of zero in a two-tailed hypothesis test is one right it's pretty close to one and that's a feature of the way
- 12:30 - 13:00 this type of test works as we move from zero toward either one of the rejection regions the p-values decline they're biggest at the height of this distribution so now let me ask you let's take one of these one-tailed tests here what's our p-value at zero here I'll give you a hint it's not one
- 13:00 - 13:30 if you guessed 0.5 you are absolutely correct so here in these single-tailed tests as you go further and further away from the rejection region the p-values are getting bigger and there at the far end of this tail is where they hit that kind of 0.99999 mark or like basically one they're
- 13:30 - 14:00 basically one over here they're 0.5 here and then they get smaller as you move closer to the rejection region but here because we're only looking in the one tail we have to account for this across this rejection region so bear in mind that p-values are not the same as z-scores or anything like that they don't work in a number line sort of fashion the way that scores typically do and finally we'll talk about data
- 14:00 - 14:30 wrangling often before a data set can be used we have to do things to it we can't just sort of take the data and start doing things from a scientific perspective what you plan to do to your data set should be specified before you begin data collection so you should have specific procedures outlined before you even start data collection it's less biased that way where you talk about how you're going to clean your variables so this includes maybe renaming variables a
- 14:30 - 15:00 variable that comes directly out of Qualtrics as Q1 is probably not very informative that might be the first item of your extroversion scale and that would be a more informative name so you want to think about renaming your variables you want to think about removing cases where there are problems in the data set where you know that participants have been behaving inattentively or have been responding carelessly so you identify what that looks like and how you're going to deal
- 15:00 - 15:30 with those cases what you're going to do when you have incomplete data right some people might not finish and then when they don't finish the data are incomplete you can think about scoring the variables so we very rarely use all five questions of an extroversion scale if there are five items on it independently we usually compute a summary score so we might compute somebody's overall average score on extraversion or some other personality trait or something else or on average what's their reaction time
- 15:30 - 16:00 to this particular type of stimulus versus that particular type of stimulus so we often compute summary scores that take the raw data and combine them in some way it's a good idea to identify how you're going to be doing that before you start the analysis process and in fact before you start the data collection process and finally one of the things that we will do in this class in lab and that we have done is check for outlier removal so sometimes we have these statistical
- 16:00 - 16:30 outliers that we know are going to invalidate our hypothesis tests and sometimes it makes sense to remove them it doesn't always but sometimes it does make sense to remove them so it's a good idea to check whether or not you have outliers before you start your analysis process
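The randomization approach mentioned earlier in the lecture, building the distribution of the test statistic from your own sample rather than looking up a critical value in a table, can be sketched for a one-way ANOVA as follows. This is an illustration under stated assumptions, not the lab's procedure: the three groups of scores are invented, and the function names, the 2,000 shuffles, and the fixed seed are choices made for this example.

```python
import random
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )
    return (ss_between / df_between) / (ss_within / df_within)


def randomization_p_value(groups, n_shuffles=2000, seed=0):
    """Shuffle group labels to build a null distribution of F from the sample."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for group in groups for x in group]
    sizes = [len(group) for group in groups]
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # Deal the shuffled scores back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_extreme += 1
    # Adding 1 to both counts treats the observed arrangement as one shuffle,
    # so the estimated p-value is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_shuffles + 1)


# Invented scores for three treatment groups.
treatment_groups = [
    [4.0, 5.0, 6.0, 5.0, 4.0, 6.0],
    [5.0, 6.0, 7.0, 6.0, 5.0, 7.0],
    [9.0, 10.0, 11.0, 10.0, 9.0, 11.0],
]

observed_f, p_value = randomization_p_value(treatment_groups)
print(f"F = {observed_f:.2f}, randomization p = {p_value:.4f}")
```

Because the null distribution comes from reshuffling the observed scores, the result does not lean on the normality assumption. As the lecture notes, though, generalizing to a larger population then needs more care.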