NHST3
Estimated read time: 1:20
Summary
The video, hosted by Erin Heerey, delves into the nuances of testing relationships within observational designs, highlighting the crucial differences between correlation and causation. It emphasizes that in observational studies, researchers remain passive observers and cannot make causal claims; instead, such claims require an experimental design with manipulation and random assignment. The video uses humorous examples of spurious correlations to drive home the point of misinterpreting correlation as causation. Finally, it underscores the importance of random sampling and assignment in making valid causal claims, setting the stage for a deeper dive into the null hypothesis significance testing process in future segments.
Highlights
- Erin Heerey explains the concept of testing relationships in observational designs. 📊
- Observational studies can't determine causation due to lack of manipulation. 🛑
- Fantastic examples of spurious correlations highlight the danger of assuming causation. 🌟
- Causation is determined through experimental designs with manipulation and random assignment. 🎲
- Understanding random sampling and assignment is crucial for making causal claims. 🎮
Key Takeaways
- Observational studies only establish relationships, not causation. 🧐
- Spurious correlations can be misleading—causation requires more than just patterns! 🚨
- Experimental designs with manipulation and random assignment help determine causation. 🔍
- Random sampling and assignment are vital for valid causal conclusions. 🎯
- The video uses fun, quirky examples to illustrate misinterpreting correlation as causation. 🍕
Overview
Erin Heerey walks through how relationships are tested in observational designs. In these designs, researchers act as passive observers: they measure variables and note correlations without interfering with or manipulating anything. The video stresses the distinction between observing a correlation and making a causal claim, which requires a different approach: an experimental design.
Heerey uses several humorous examples of spurious correlations to show how easily correlation can be mistaken for causation. In one example, the number of letters in the winning word of the Scripps National Spelling Bee rises and falls with the number of deaths from venomous spider bites, which shows how absurd it is to assume causation from correlation alone. Through these playful illustrations, the video highlights the importance of careful analysis and the dangers of misinterpretation.
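One way two unrelated quantities can end up strongly correlated is that both happen to drift in the same direction over the same years. The sketch below illustrates that idea with synthetic data only. The shared upward trend, the noise range, the seed, and the names `pearson_r` and `trending_series` are assumptions made for illustration; none of the numbers come from the video or from the website it mentions, and the video does not say which mechanism produced its examples.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def trending_series(n_years, slope, rng):
    """Hypothetical yearly measurements: a steady trend plus bounded noise."""
    return [slope * year + rng.uniform(-1.0, 1.0) for year in range(n_years)]


# Synthetic data, not real figures. The noise in each series is drawn
# independently; the only thing the two series share is the passage of time.
rng = random.Random(0)
series_a = trending_series(11, 1.0, rng)
series_b = trending_series(11, 2.0, rng)

if __name__ == "__main__":
    r = pearson_r(series_a, series_b)
    # The coefficient is large even though neither series influences the other.
    print(f"r = {r:.2f}")
```

A large coefficient here says nothing about one series driving the other, which is the same caution the video's examples are meant to teach.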
Ultimately, the video underscores the role of random sampling and random assignment in establishing valid causal claims in experimental designs. While observational studies can show a relationship and sometimes generalize findings when random sampling is used, making a causal statement is strictly the realm of experimental methods. The narrative sets the stage for a follow-up exploration on null hypothesis significance testing, promising deeper insights into empirical research methods.
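The four combinations of random sampling and random assignment described in the video can be summarized as a small lookup. This is only a sketch: the function name `supported_conclusion` and the dictionary keys are hypothetical, and it assumes that random assignment always comes together with a manipulation, as in the experimental designs the video describes.

```python
def supported_conclusion(random_sampling: bool, random_assignment: bool) -> dict:
    """Map two design features to the kind of claim the design supports.

    Hypothetical helper summarizing the four cases described in the video.
    """
    return {
        # Random assignment (with a manipulation) is what supports causal claims.
        "claim": "causal" if random_assignment else "correlational",
        # Random sampling is what lets results generalize to the population.
        "generalizes_to_population": random_sampling,
    }


if __name__ == "__main__":
    for sampling in (True, False):
        for assignment in (True, False):
            result = supported_conclusion(sampling, assignment)
            print(f"sampling={sampling}, assignment={assignment}: {result}")
```

The two features are kept as separate inputs because they answer separate questions: random assignment bears on whether a claim can be causal, and random sampling bears on who the claim applies to.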
Chapters
- 00:00 - 00:30: Introduction to Observational Design This chapter introduces the concept of observational design in research. It explains how researchers test relationships by collecting data without interfering with or manipulating the environment, acting as passive observers. It emphasizes a fundamental aspect of research methods, focusing on observation rather than intervention.
- 00:30 - 01:30: Limits of Observational Design in Establishing Causation Observational studies can establish that a relationship exists between variables, but they cannot show whether the explanatory variable caused changes in the response variable. The ability to make a causal claim comes from the design of the study, not from the statistic that is reported, so researchers need to separate the reporting of results from the design.
- 01:30 - 04:00: Example of Spurious Correlations This chapter walks through spurious correlations of the kind collected on tylervigen.com. The examples are the number of letters in the winning word of the Scripps National Spelling Bee and deaths from venomous spiders (1999 to 2009), U.S. spending on science, space, and technology and suicides by hanging, strangulation, and suffocation, and per capita cheese consumption and deaths by bedsheet entanglement. Each correlation is strong, yet there is no real relationship between the variables.
- 04:00 - 07:00: Establishing Causation Through Experimental Design This chapter explains how experimental designs support causal claims. A sample is drawn from a population, participants are randomly assigned to an experimental group that receives the treatment or to a control group that receives a placebo, and the dependent variable is measured. Because the treatment is intended to be the only difference between the groups, a difference in outcomes is attributed to the independent variable.
- 07:00 - 10:00: Random Assignment and Sampling in Experiments Random assignment is presented as the key feature that allows causal statements. An ideal experiment combines random sampling with random assignment and supports a causal conclusion that generalizes to the whole population. Most psychology studies cannot sample randomly, so careful random assignment still supports causal conclusions, but those conclusions do not generalize to the whole population.
- 10:00 - 11:00: Limits of Studies Without Random Assignment or Random Sampling In correlational research there is no random assignment: variables are simply measured, or people fall into groups on their own, for example by having a particular disorder. With good random sampling the results generalize but support only correlational statements. Without random sampling or manipulation, it is difficult either to generalize or to make causal claims.
- 11:00 - 12:00: Concluding Remarks on Random Sampling and Assignment The chapter closes by stressing that random assignment and random sampling together determine whether causal claims can be made and to which population they generalize. It then announces that the next section will use an example from a real experiment to continue the discussion of null hypothesis significance testing.
NHST3 Transcription
- 00:00 - 00:30 so let's now talk about how to test these relationships what do we do and how does this work so when we're testing relationships in an observational design and you can think back to your research methods because your research methods will also be talking about this researchers collect data in a way that does not interfere with or manipulate any elements of the environment so this is the idea here that a researcher is a passive observer
- 00:30 - 01:00 can establish the presence of a relationship but cannot provide any information about what variable caused what did the explanatory variable cause the response variable we can't provide any information about that and I want to also suggest to you that this is something that you learn about in the experimental design and not actually in the statistical reporting if I have an experimental design and I
- 01:00 - 01:30 report a correlation I can make a causal claim if I have an observational study where I didn't manipulate anything I did not manipulate or interfere with the elements of the environment I simply measure them it becomes very difficult for me to make a causal claim about anything so you need to be really careful to differentiate the reporting of the results from the design of the study in
- 01:30 - 02:00 an observational or correlational design I cannot make causal claims why not well actually let's just look at some spurious correlations so here's a really interesting one by the way if you want to see more of these this website tylervigen.com he reports loads and loads of these spurious correlations and there are some pretty fantastic ones so this is letters in the winning word
- 02:00 - 02:30 of the Scripps National Spelling Bee and that correlates with the number of people killed by venomous spiders in a year so these are data tracked from 1999 to 2009 so a whole decade and what they're calculating here is the spelling bee the winning word how many letters it had and the more letters in the word look the more people died by being bitten by venomous spiders
- 02:30 - 03:00 good gracious they should quit the spelling bee or they should keep the spelling bee winning words really really short this is a spurious correlation even though it's a strong correlation there's actually really no relationship between these elements here's another one U.S spending on science space and Technology correlates with suicides by hanging strangulation and suffocation the more the U.S spends on science funding the more people die by suicides
- 03:00 - 03:30 every year again this is a 10-year period over the course of a decade really now I'm sure that there are some senators in the U.S who are very anti-spending on U.S science that would like you to believe that this is a thing it's not a thing these two variables are totally uncorrelated with one another and here's my favorite one per capita cheese consumption is correlated with the number of people who die by becoming entangled in their bed sheets
- 03:30 - 04:00 ah you see I don't like cheese so I consume about zero pounds of it a year and that means that my risk of dying by becoming entangled in my bed sheets is going to decline again this is a spurious correlation there is no real relationship between these variables but of course it does look like it based on the numbers so we need to be really careful when we're thinking about correlation and causation
- 04:00 - 04:30 um variables might correlate with one another for all kinds of reasons so just so you need to be careful that you're not talking about observational designs that simply observe things there is a way to determine causation but it's not when the researcher is a passive observer in a correlational or observational design so how do we determine causation well in experimental designs what we do
- 04:30 - 05:00 is just I've shown you versions of these little circles before we take a sample from a population hopefully it's a random sample or at least close enough and what then happens is participants get assigned to an experimental group who gets an experimental treatment and a control group who gets some kind of a placebo treatment so how do we determine causation here there's a manipulation one group of participants is getting something special some experimental treatment and
- 05:00 - 05:30 the other group is not so already we have a difference between how participants are being treated there's a manipulation going on so the experimental group includes the treatment or independent variable and the control group is ideally identical to the experimental group except for the independent variable now the important element here is that we randomly assign people from this sample to either the experimental or
- 05:30 - 06:00 control group so we're not looking in the sample and going oh you know we're going to take all the people who have some characteristic and we're going to put them in the experimental group I'm going to take all the people who don't have that characteristic put them in the control group not like that we have random assignment where we are going to essentially flip a coin for every single person and if it comes up heads they'll be assigned to the experimental group but if it comes up tails they'll be assigned to the control group now we don't actually flip coins anymore there are random number generators that do this for us that will randomize participants to
- 06:00 - 06:30 groups for us so we don't have to do it ourselves um but the important thing is that participants are randomly assigned to groups so that everyone in this sample the idea is this sample is a model of the population especially if the participants in that sample were randomly selected and then because they're being randomly assigned either to this group or this group the only difference between the groups is the presence of that treatment we then measure the dependent variable
- 06:30 - 07:00 whoops and we can make an inference if this group differs from this group then the inference we make is that the differences must be due to the presence of this independent variable because there are no other characteristics that distinguish the experimental group from the control group that becomes a really powerful technique in our ability to make a causal inference so that is an experimental design because we've introduced a
- 07:00 - 07:30 manipulation and we've randomly selected participants into these groups we've randomly assigned them we can make causal statements about the outcomes again I can report in an experimental design I can report a correlation and make a causal claim but I can't do that if I had an observational design in an observational design I can also report a different
- 07:30 - 08:00 kind of test statistic that we'll learn about later and if I don't have if I have a quasi-experimental design it becomes very difficult to make causal claims so random assignment is the key feature here that we need to pay attention to so in an ideal experiment we have random sampling
- 08:00 - 08:30 and random assignment to groups and then what we can do is make a causal conclusion that generalizes to the whole population in an ideal observational design we have random sampling and there's no random assignment we're simply measuring then we can't make a causal conclusion we can make a correlational statement that generalizes to the population
- 08:30 - 09:00 in when we don't do random sampling so as I said most psychology studies don't allow really random sampling there's always people who are being excluded because of some characteristic associated with those people whether it's they have an internet connection and an email address that we can email them at or a mobile phone that we could you know send them a text to or make a call to whether it's
- 09:00 - 09:30 um people who have an address versus don't have an address whether it's people who live in London and can come into my lab versus don't so often we don't in most experiments we don't have random sampling and instead what we try to do then is we're really careful about our random assignment procedures that allows us to make causal conclusions but of course they won't they don't generalize to the whole population because it really depends on how representative the sample is of the
- 09:30 - 10:00 population that allows us to generalize um when we're doing correlational research there's no random assignment to groups because we don't have groups we're simply measuring things um or people are self-assigned into groups for example we might say people who have a particular disorder high cholesterol depression schizophrenia whatever it happens to be people who have a particular disorder and those are not randomly assigned
- 10:00 - 10:30 people have those things or they don't they're probably caused by lots and lots of various factors um there are lots of reasons why someone might have high cholesterol or be or have schizophrenia there's no sort of magic box there no magic button we can turn on or turn off so we don't have random assignment to groups and in fact we usually don't necessarily except in some of these studies we don't even have groups um sometimes we're just measuring and looking at relationships
- 10:30 - 11:00 um now if we've done a good job with our random sampling there which is which a lot of observational studies do um the results are generalizable but we can't we can only make correlational statements and then finally if we're not using random sampling and we are not using we're not doing any manipulation then we can really only then it becomes difficult either to generalize or to make um or to make causal claims so you need
- 11:00 - 11:30 to be careful and you need to understand how random assignment and random sampling both together cause us to be able to make causal claims and to be able to generalize to one population or the other and we're going to use an example experiment to kind of talk further about the null hypothesis significance testing process because now we're sort of through the experimental methods the experimental
- 11:30 - 12:00 design elements and so we're going to use an example experiment from a real experiment that was done a bunch of years ago and I will tell you about that in the next section of the video
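The coin-flip procedure described in the transcript (05:30 to 06:30), where a random number generator decides each participant's group, can be sketched in a few lines. This is a minimal illustration, not the procedure used in any particular study: the function name `randomly_assign`, the participant IDs, and the seed are all hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Assign each participant to a group with a virtual coin flip."""
    rng = random.Random(seed)
    groups = {"experimental": [], "control": []}
    for person in participants:
        # rng.random() < 0.5 plays the role of the coin coming up heads.
        group = "experimental" if rng.random() < 0.5 else "control"
        groups[group].append(person)
    return groups


if __name__ == "__main__":
    sample = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical participant IDs
    groups = randomly_assign(sample, seed=42)
    for name, members in groups.items():
        print(name, len(members), members)
```

Because each flip is independent, the two groups will usually differ somewhat in size. When equal group sizes are wanted, a common alternative is to shuffle the list of participants and split it in half; that variant is not described in the video and is mentioned here only as an option.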