DescriptiveStats1


    Summary

    In this lecture led by Erin Heerey, Descriptive Statistics are discussed as foundational tools for summarizing data. The session highlights the importance of graphical representations of data, referencing Florence Nightingale’s contributions. Heerey also delves into historical anecdotes such as Dr. James Lind’s scurvy trials, illustrating the need for accurate theories and representative samples in research. The lecture underscores that descriptive statistics are vital for drawing meaningful conclusions from data, aiding in the understanding and estimation of population characteristics.

      Highlights

      • Descriptive statistics help summarize and find patterns in data 📊.
      • Florence Nightingale used data visuals to pioneer effective policy-making 🌸.
      • Scurvy experiment by Dr. Lind highlights the need for accurate scientific methods 🍋.
      • Descriptive statistics are foundational to making inferences from statistical data 🔍.
      • Graphs and charts highlight data patterns missed in spreadsheets 📉.

      Key Takeaways

      • Descriptive statistics are essential for summarizing data and understanding its patterns 📊.
      • Florence Nightingale pioneered the use of graphical data representation to influence policy decisions 🌸.
      • Dr. James Lind's scurvy experiment showcased the significance of accurate theories and proper sampling 🍋.
      • Understanding graphical representations helps in viewing data patterns that are not evident in raw numbers 📉.
      • Descriptive statistics are not just about numbers; they are about drawing insights and conclusions about populations 🔬.

      Overview

      Erin Heerey begins the lecture by delving into descriptive statistics, highlighting how they serve as foundational tools for summarizing data. The focus is on understanding central tendency and dispersion, and how graphical representations like charts and graphs can reveal hidden patterns in data. Heerey underscores the importance of visuals for better comprehension and informed decision-making.

        The lecture is sprinkled with engaging historical anecdotes, most notably Florence Nightingale’s use of graphical data to influence sanitation practices during the Crimean War. Heerey highlights how Nightingale's ‘Rose Diagram’ documented preventable deaths and helped shape sanitation policy, demonstrating the practical application of descriptive statistics in real-world scenarios.

          Finally, Heerey shares the 18th-century story of Dr. James Lind who conducted early clinical trials aboard a British Navy ship. His experiments underscore the importance of theories and representative samples in research, showing the intricate relationship between theory, data, and inference. It's a compelling narrative on how descriptive statistics go beyond numbers to offer insights into population dynamics.

            Chapters

            • 00:00 - 00:30: Introduction to Descriptive Statistics The chapter introduces the concept of descriptive statistics, which focuses on summarizing and describing data. It covers key topics such as measures of central tendency and dispersion, as well as interpreting graphical representations of statistical data. The chapter begins with a question about recognizing a famous statistician.
            • 00:30 - 01:30: Florence Nightingale's Contribution Florence Nightingale is a notable figure both as a nurse and a statistician. She was the first woman to be elected as a Fellow of the Royal Statistical Society, and is recognized for her innovative approach to presenting data visually.
            • 03:00 - 05:00: The Importance of Descriptive Statistics This chapter explains why descriptive statistics are fundamental to understanding a data set. Summary statistics reveal patterns that the raw data are too detailed to show. Because most studies sample only part of a population, these summaries also help quantify the uncertainty in a sample and supply the facts on which later inferences rest.
            • 10:30 - 12:00: Dr. James Lind's Scurvy Experiment The chapter covers the remaining treatment groups and the outcome of Dr. James Lind's scurvy trial. One group received an extra portion of ship's grog, one received apple cider vinegar, and one received the juice of lemons and limes mixed into their grog. The two sailors given citrus juice recovered enough to return to their duties within about a week to a week and a half. The vinegar group improved more slowly and less thoroughly, and the conclusion drawn was that lemon and lime juice were good for scurvy.

            DescriptiveStats1 Transcription

            • 00:00 - 00:30 All right. Let's begin the  material for this course.   We're going to start by talking  about descriptive statistics. The goal for this lecture is to talk  about these basic summaries of data   including quantifying measures of central  tendency and dispersion. We're also going   to talk about how to read graphical  representations of these summaries. I'll   start by asking, does anyone  recognize this famous statistician?
            • 00:30 - 01:00 Odds are, some of you thought you recognized  this person until I said the word statistician.   This is also a famous nurse. Her name is  Florence Nightingale and Florence Nightingale   is the first woman ever to be elected as  a Fellow of the Royal Statistical Society.   She's interesting because she really started  this idea of showing data, not just presenting
            • 01:00 - 01:30 lots and lots of numbers and little spreadsheets  that are crabbed and difficult to make sense of,   but she helped pioneer this idea of using  graphical displays of data to help shape   understanding of the data and therefore shape  policy. A famous quote that she wrote in a letter   in August of 1857 was, "Whenever I am infuriated I  avenge myself with a new diagram." So her diagram,
            • 01:30 - 02:00 this is a picture of one of her most famous diagrams. It used to be called the Nightingale Rose Diagram; it's also known as the Polar Area Diagram. And what she did was she diagrammed excess deaths associated with the Crimean War. The British were in the Crimean War and there were a lot of excess deaths associated with poor sanitation conditions in hospitals. She documented these deaths and talked about how they were preventable; and in fact how to prevent
            • 02:00 - 02:30 them. These were things like digging latrines far away from drinking water, and having doctors wash or at least in some way sanitize their hands between seeing patients to avoid the spread of communicable diseases. So she's responsible for some of the earliest sanitation practices that we see in hospital settings around that time and she did that by making graphs. So looking
            • 02:30 - 03:00 at your data is probably one of the most important things you can do because a graph or visualization of the data shows patterns in the data that you wouldn't see if you were just looking at a sheet with hundreds or thousands or millions of numbers on it. We're going to talk about a number of different graphs over the course of the lecture. I'm going to talk to you about how to read them and what they mean and we're going to talk about the statistics that they show. So let's first begin with the idea of descriptive statistics. Why are we learning this? Let's talk about the
            • 03:00 - 03:30 elephant in the room and I hope that this helps you understand why this is probably one of the more important elements of statistics that you can and should take away from this course. Descriptives are absolutely fundamental to understanding a data set. We can't draw conclusions about what data mean without a summary of those data, so these summary statistics allow us to see patterns in the data where the raw data would be too detailed. One reason that's important
            • 03:30 - 04:00 is because most of the studies we run only sample  the available population, we don't sample everyone   in a population. The populations that you know of,  i.e., people in the world are just far too large,   so descriptive statistics can really help us  quantify the uncertainty in the sample and that   allows us to eventually (we're not talking about  this yet) but without descriptive statistics we
            • 04:00 - 04:30 can't make inferences about how things work or  about populations because those inferences need   to be based on facts. They need to be based on  things we know about the data we have observed   and descriptive statistics quantify some of those  facts. So, these are things that we quantify based   on the data we specifically collect so what do  we mean by descriptive statistics? This is a   term that's given to the form of data analysis  that produces meaningful summaries of data   that produces patterns in the data or that  identifies patterns in the data. Like how
            • 04:30 - 05:00 data vary across groups or across conditions,  relationships between different variables   (we won't talk about that in this lecture we  will give that its own lecture another time).   And then there's also the notion of uncertainty  so the variability within a data set tells us   about how certain we can be about the patterns  that we see. We're going to talk about how to   get measures or metrics of that variability in  this lecture, as well as in this week's lab.
            • 05:00 - 05:30 So descriptive statistics are facts that describe a specific data set, and they are critical to capturing how well a sample fits the population from which it was drawn. They're very critical to our ability to make estimates about a population. Why are they important? Well, they help us to understand the world and how it works by allowing us to gain insight about these specific data that we have collected.
            • 05:30 - 06:00 So descriptive statistics allow us  to describe a data set that we have   collected. The statistics we calculate become  estimators that tell us about the population   and without accurate descriptions of a  sample we cannot make good inferences. And remember there's a lot more to making good  inferences than just descriptive statistics. In
            • 06:00 - 06:30 fact we need research methods for that also.  The inferences we make also require accurate   descriptions of data. Accurate inferences, they  require amongst other things a good experimental   design that allows us to rule out extraneous or  confounding variables. We need representative   samples so our population is accurately  represented by the sample that we've sampled,   and that the population is the correct population  with respect to when we're thinking about to whom
            • 06:30 - 07:00 the theory applies that we're interested in. And finally, we need accurate theories. So, theories are always oversimplifications; they are always abstractions of how a process works in real life. But the theory needs to be close enough to the ground truth to be a reasonable description of that process. Now, this isn't going to be on the test but I'll give you an example: one of the very first randomized clinical trials was conducted by Dr James Lind in 1747. Dr Lind
            • 07:00 - 07:30 was a physician, or a surgeon as he was known at the time, on board a British Navy ship. This story is what happened during the 1700s and in the centuries before it. Basically, between about 1500 and probably 1850 or so, what were the British out doing? They were out colonizing various places in the world. If you haven't colonized the world
            • 07:30 - 08:00 you have to sail for a pretty long time before you get to a friendly port where you can do things like take in fresh water and take in fresh food. So, as the British began to colonize the world, because they certainly did that, what they were doing is going on very long voyages, so they would be out at sea for three or four or sometimes even six months at
            • 08:00 - 08:30 a time, and during those voyages sailors would fall ill with a disease called scurvy. Scurvy is awful: your bones disintegrate inside of your body and you develop all kinds of lesions. We know today that scurvy is caused by a deficiency in vitamin C. Vitamin C is a substance that our bodies cannot make. We have to take in food that contains it in order to avoid scurvy.
            • 08:30 - 09:00 So, because they were out sailing for such long periods of time, sailors on every voyage would fall ill with scurvy and many of them would die. So, Dr James Lind was on one of these voyages and, in a period of time, because this does tend to happen relatively regularly, he had 12 sailors who were all suffering from scurvy, and suffering to the extent that they
            • 09:00 - 09:30 were no longer able to carry out their ship's duties. They were consigned to the sick bay. Dr Lind decided that he would try some different treatments to see what worked, and he had six treatments, all of which were, or at least most of which were, adding something to the sailors' diet. One of the treatments was, of course, a control treatment in which nothing special was added to
            • 09:30 - 10:00 the sailors' diet: they got the same rations, the same water, the same of everything else. The second one was adding salt water to the sailors' diet. The idea there was that, as we all know, salt kills lots of bacteria so it's a good preservative. For example, if you've ever eaten a piece of beef jerky you've eaten a lot of salt. There are all kinds of cultures in which salt is a preservative for various vegetables and meats. Lind thought that drinking salt water might be one
            • 10:00 - 10:30 possible way of helping to cure scurvy and so the sailors who were assigned to that condition ended up getting an extra portion of salt water in their diet. Well, not really an extra portion, just a portion of salt water. The third condition that he used was a condition called elixir of vitriol. Don't try this at home, folks. Elixir of vitriol is sulfuric acid. It's relatively poisonous.
            • 10:30 - 11:00 Another group of sailors got an extra portion of ship's grog added to their diet, which I'm sure nobody was disappointed about. Ship's grog is rum, and so they had an extra portion of watered rum as part of their diet. And then the final two groups: one of those groups got apple cider vinegar as part of the treatment and the other group got the juice of lemons and limes every day mixed together with their ship's
            • 11:00 - 11:30 grog and hence we have a cocktail. It turns out that the group who were getting citrus juice mixed in with their grog was doing relatively well within the first week or a week and a half of the trial. These were all male sailors on this boat. So, the two men who'd been randomly assigned to that group
            • 11:30 - 12:00 found that they were better enough, they were healed enough to return to their normal duties. The other group, the one that got apple cider vinegar, got a little bit better, not quite so quickly and not quite so thoroughly as the lemon and lime, or citrus, group, but they got better as well. And so the idea became that lemon and lime juice were good for scurvy, and that was all fine and good, and scurvy was cured for a very long time in terms of the British Naval experience.
            • 12:00 - 12:30 Now, the problem is, scurvy wasn't really cured. Here's what happened. This is the bit where you need an accurate theory. So Lind publishes this paper in which he claims that lemon and lime juice, the juice of citrus fruits, cures scurvy. So, the British Navy, on the back of this advice, starts to bottle lemon and lime juice and they send it out with their sailors. It's boiled to purify it and then it's bottled
            • 12:30 - 13:00 so that it could be kept over the long term.   Now what you might not know is the process of  boiling citrus fruit gets rid of the vitamin C.   It's very volatile and it comes out in the  steam associated with the boiling process so   that was, it turned out, not to be a good idea or  it was, but it wasn't quite the right theory. It
            • 13:00 - 13:30 wasn't quite the right theory but nobody knew  it because by the time they started doing this   the journeys that sailors were making on boats  were substantially shorter and so by reducing   the travel times, sailors got off the ships  more quickly. They got on land more quickly   where they could eat fresh fruits and vegetables  and replenish the vitamin C in their systems
            • 13:30 - 14:00 before getting back on the boat again, and so scurvy didn't rear its ugly head again until the Shackleton expeditions, where they were walking toward the poles for very long periods of time carrying things like dried beef and not much in the way of dried vegetables to go with it. So scurvy reared its ugly head again on those expeditions and that's when they discovered
            • 14:00 - 14:30 vitamin C, which was the thing that was the active ingredient in the treatment. So, in order to make good inferences we actually need to have all the pieces in place. We need to have our descriptive statistics, and we need to have a good experimental design, and James Lind's design was pretty good. Actually his population was totally representative of the population of human beings, even though he had only men and probably only white men.
            • 14:30 - 15:00 My guess is he probably didn't  have too diverse of a crew.   However people are people in this domain. If you  don't get enough vitamin C you will get scurvy   eventually. So in this case it didn't matter  that he didn't have a really truly representative   population. His population was representative to  the group to which the theory applied. And finally   they needed an accurate theory. So the theory that  was created on the back of this experiment wasn't
            • 15:00 - 15:30 quite the right one - it was an oversimplification  of what was really needed so this is an example   of the kind of information the kind of things we  need to combine when we think about statistics.   We can't just think about statistics. You have to  think about all the other pieces as well when you   are considering what the real statistics are, what  the real story is, and what a statistic tells you.   So descriptive statistics start with  the facts. They give us the facts
            • 15:30 - 16:00 from specific data sets. And the inferences we make rely on the presence of those facts. The facts are things we know about our data: from what population did we sample; what sampling methods were used; was it a random sample; was it a convenience sample; how were participants assigned to groups? Were they randomized? In the James Lind example, he used a die, and by casting it he assigned participants to groups
            • 16:00 - 16:30 until he had two people for each group.   What are the relationships between the variables?  What do the distributions of data look like - the   range, the standard deviation and variance; what  are the central tendencies looking like in the   data as well. So these facts that we use when  we talk about data sets, those are critically   important elements of descriptive statistics  and we need them for that reason. So I'm going   to cut the lecture here and we'll pick up with the  next section in the next portion of the lecture.
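
The facts listed at the end of the lecture (central tendency, range, variance, standard deviation, and how participants were assigned to groups) can be illustrated with a minimal Python sketch. It is not taken from the lecture: the function names `describe` and `assign_to_groups`, the `scores` values, the sailor labels, and the seed are all hypothetical and chosen only for illustration. The six treatment names and the two-per-group assignment follow the Lind example as told above, but the sketch shuffles a list rather than casting a die.

```python
import random
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        # Central tendency
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Dispersion
        "range": max(values) - min(values),
        # statistics.variance and statistics.stdev use the sample (n - 1) formulas
        "variance": statistics.variance(values),
        "sd": statistics.stdev(values),
    }


def assign_to_groups(participants, group_names, per_group, seed=None):
    """Randomly assign participants to equally sized groups."""
    if len(participants) != len(group_names) * per_group:
        raise ValueError("participants must fill every group exactly")
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {
        name: shuffled[i * per_group:(i + 1) * per_group]
        for i, name in enumerate(group_names)
    }


# Hypothetical scores, invented for illustration (not data from the lecture).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)

# Twelve hypothetical sailor labels, six treatments, two sailors per treatment.
sailors = [f"sailor_{i}" for i in range(1, 13)]
treatments = [
    "control",
    "salt water",
    "elixir of vitriol",
    "grog",
    "apple cider vinegar",
    "citrus juice",
]
groups = assign_to_groups(sailors, treatments, per_group=2, seed=1747)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}")
    for treatment, members in groups.items():
        print(f"{treatment}: {members}")
```

The sample (n - 1) versions of variance and standard deviation are used here on the assumption that the data are a sample meant to estimate a population; `statistics.pvariance` and `statistics.pstdev` would be the alternatives when the data are the whole population.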