10.2 Displaying Data Part I - Dr. Travers Page of Math

Page created by Leslie Reed
 
§ 10.2 Displaying Data Part I
Types of Variables

     1   Categorical
     2   Quantitative

   Definition
   A categorical variable is a variable that can take on one of a limited,
   and usually fixed, number of possible values.

   Definition
   A quantitative variable is naturally measured as a number for which
   meaningful arithmetic operations make sense.
Pictographs

   Definition
   A pictograph is a form of a bar graph where the bars are graphics that
   each represent a predetermined quantity.
   Generally these are graphs used to display categorical variables.
Pictographs

                             What’s Good
      Gives good visual and can easily be adapted to multiple graphics
      Easy for children to understand since it includes graphics
                           What’s Not Good
      Tedious to draw
      When a graphic shows a fraction of the scale amount, it can be
      difficult to determine exactly which amount was intended
Dot Plots

   Definition
   A dot plot is the representation of a set of data over a number line.
   The number of dots over a number represents the relative quantity of
   the value.

   Example
   The following are the test scores for a particular high school student
   in their math class over the course of an academic year.
                           64   73   85    74   83
                           71   56   83    76   85
                           83   87   92    84   95
                           92   95   92    91
Dot Plots

   [Figure: dot plot "Grades of a High School Student". Horizontal axis: Grades, 50 to 100. Vertical axis: Frequency, 1 to 3.]
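
   The dot plot for these scores can also be sketched in plain text. The
   following is a minimal sketch, not the method used to draw the slide:
   the function name dot_plot_rows is made up for illustration, and the
   columns are the distinct scores in order rather than a scaled number
   line.

```python
from collections import Counter

# Test scores exactly as listed in the example above.
scores = [64, 73, 85, 74, 83,
          71, 56, 83, 76, 85,
          83, 87, 92, 84, 95,
          92, 95, 92, 91]


def dot_plot_rows(data):
    """Return the rows of a text dot plot, highest frequency first."""
    counts = Counter(data)
    values = sorted(counts)
    rows = []
    for level in range(max(counts.values()), 0, -1):
        # A value gets a dot on this row if it occurs at least `level` times.
        cells = [" •" if counts[v] >= level else "  " for v in values]
        rows.append(f"{level} |" + " ".join(cells))
    # Bottom row: the distinct scores that label each column.
    rows.append("   " + " ".join(f"{v:2d}" for v in values))
    return rows


rows = dot_plot_rows(scores)
print("Grades of a High School Student")
print("\n".join(rows))
```

   Each printed row is one frequency level, so the number of dots stacked
   in a column is the number of times that score occurs.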
Dot Plots

                              What’s Good
      Gives a good idea of distribution
      Preserves all of the data points
                            What’s Not Good
      Tedious to plot
      Can be hard to read
      Not practical for large data sets
Distributions
   Definition
   A distribution is a representation of data vs. frequency. It shows all
   possible values and how often they occur.

        Range: Highest value - lowest value
        Here, the range would be 95 − 56 = 39.
        Center: The middle value(s) of the ordered data. It could be a value
        or a class, depending on the type of graph.
        Here, the center is the 10th value, since there are 19 data points in
        the set. The value we seek is 84.
        Shape: How many peaks are there? Is it roughly in the middle or
        to one side?
        Here we have one peak, so we would say the distribution is
        unimodal. That peak is to the right, so the tail stretches out to the
        left. We would say this graph is left skewed.
   We must ALWAYS have axes labeled and have a title for our graphs.
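
   The range and center quoted above can be checked with a few lines of
   Python. This is a minimal sketch that assumes the 19 test scores exactly
   as listed in the dot plot example; the variable names are chosen only
   for illustration.

```python
import statistics

# Test scores as listed in the dot plot example.
scores = [64, 73, 85, 74, 83,
          71, 56, 83, 76, 85,
          83, 87, 92, 84, 95,
          92, 95, 92, 91]

ordered = sorted(scores)
n = len(ordered)

# Range: highest value minus lowest value.
data_range = ordered[-1] - ordered[0]

# Center: with an odd number of values, the middle one is at
# position (n + 1) / 2 when counting from 1.
middle_position = (n + 1) // 2
center = ordered[middle_position - 1]

print(f"n = {n}, range = {data_range}")
print(f"center is value number {middle_position}: {center}")
print(f"statistics.median agrees: {statistics.median(scores)}")
```

   With an even number of values there is no single middle value;
   statistics.median then averages the two middle values.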
Stem-and-Leaf Plots

                         Similarities to Dot Plots
      Gives idea of distribution
      Preserves data
      Not practical for large data sets
                       Differences from Dot Plots
      Used for quantitative variables
      Easier to read actual data elements
      Can be used for comparisons of two data sets
Stem-and-Leaf Plots
   Example
   Using the same data set as we did for the dot plot, construct a
   stem-and-leaf plot.
   First thing we need to do is order the data elements.
                          56    64   71    73   74
                          76    82   83    83   83
                          84    85   85    87   91
                          92    92   95    95
                     Grades for a High School Student
                                   5
                                   6
                                   7
                                   8
                                   9
Stem-and-Leaf Plots

      This is the spine with the stems
      The leaves are the last digit
      List in increasing order from spine
      No commas
      Repetition is important

                   Grades for a High School Student
                   5 6
                   6 4
                   7 1 3 4 6
                   8 2 3 3 3 4 5 5 7
                   9 1 2 2 5 5
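
   The rules above translate directly into code. The sketch below assumes
   the values exactly as they appear in the ordered list for this example,
   and the function name stem_and_leaf is made up for illustration. Each
   value is split into a stem (all digits but the last) and a leaf (the
   last digit).

```python
# Values as printed in the ordered list for the stem-and-leaf example.
ordered = [56, 64, 71, 73, 74,
           76, 82, 83, 83, 83,
           84, 85, 85, 87, 91,
           92, 92, 95, 95]


def stem_and_leaf(data):
    """Return one text line per stem, leaves increasing away from the spine."""
    groups = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)  # leaf is the last digit
        groups.setdefault(stem, []).append(leaf)
    lines = []
    # Walk every stem between the smallest and largest, even empty ones.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} {leaves}".rstrip())
    return lines


plot = stem_and_leaf(ordered)
print("Grades for a High School Student")
print("\n".join(plot))
```

   Repeated values such as 83 produce repeated leaves, which is why
   repetition is important: the length of each row shows how many values
   fall in that class.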
Analysis of Stem-and-Leaf Plot

                      Grades for a High School Student
                      5 6
                      6 4
                      7 1 3 4 6
                      8 2 3 3 3 4 5 5 7
                      9 1 2 2 5 5

        Same range as dot plot
        Same center as dot plot although we would only give class
        Shape is unimodal and skewed left
   Notice that the values on the right are essentially in columns - this is
   what allows us to quickly see which classes have more elements.
Stem-and-Leaf Plots

   What if we had a 3-digit number? Suppose the student got a 100 on
   the next exam. Then the stem has two digits (10) and the leaf is still
   the last digit (0).

                    Grades for a High School Student
                    5 6
                    6 4
                    7 1 3 4 6
                    8 2 3 3 3 4 5 5 7
                    9 1 2 2 5 5
                    10 0
Back-to-Back Stem-and-Leaf Plots

   Example
   Suppose we wanted to compare the careers of Babe Ruth and Mark
   McGwire in terms of their yearly home run totals to determine which
   player was the more consistent long ball hitter. Make a back-to-back
   stem-and-leaf plot to make this determination.

   Ruth: 54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22
   McGwire: 49, 32, 33, 39, 22, 42, 9, 9, 39, 52, 58, 70, 65, 32, 29
Back-to-Back Stem-and-Leaf Plots

                             Ruth v. McGwire
                                     0
                                     1
                                     2
                                     3
                                     4
                                     5
                                     6
                                     7

   We set up the graph with one set of data increasing out to the right
   and the other increasing out to the left. This way we have a
   side-by-side comparison of the data sets.
Back-to-Back Stem-and-Leaf Plots

                                 Ruth v. McGwire
                          Ruth | Stem | McGwire
                               |  0   | 9 9
                               |  1   |
                           5 2 |  2   | 2 9
                           5 4 |  3   | 2 2 3 9 9
                 9 7 6 6 6 1 1 |  4   | 2 9
                         9 4 4 |  5   | 2 8
                             0 |  6   | 5
                               |  7   | 0

   Who is more consistent and why?
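
   One way to explore this question is to build the back-to-back plot in
   code and compare how spread out each side is. This is a sketch under
   stated assumptions: the helper names are made up for illustration, and
   using the range as the measure of consistency is only one possible
   choice.

```python
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
mcgwire = [49, 32, 33, 39, 22, 42, 9, 9, 39, 52, 58, 70, 65, 32, 29]


def leaves_by_stem(data):
    """Map each stem (tens digit) to its sorted list of leaves."""
    groups = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        groups.setdefault(stem, []).append(leaf)
    return groups


def back_to_back(left, right):
    """Return text lines with `left` leaves growing leftward from the spine."""
    left_groups = leaves_by_stem(left)
    right_groups = leaves_by_stem(right)
    low = min(min(left_groups), min(right_groups))
    high = max(max(left_groups), max(right_groups))
    rows = []
    for stem in range(low, high + 1):
        # Reverse the left leaves so they increase outward from the spine.
        left_leaves = " ".join(str(d) for d in reversed(left_groups.get(stem, [])))
        right_leaves = " ".join(str(d) for d in right_groups.get(stem, []))
        rows.append((left_leaves, stem, right_leaves))
    width = max(len(left_leaves) for left_leaves, _, _ in rows)
    return [f"{l:>{width}} | {s} | {r}".rstrip() for l, s, r in rows]


lines = back_to_back(ruth, mcgwire)
print("Ruth v. McGwire")
print("\n".join(lines))

# Range (highest minus lowest) as a rough measure of spread.
ruth_range = max(ruth) - min(ruth)
mcgwire_range = max(mcgwire) - min(mcgwire)
print(f"Ruth range: {ruth_range}, McGwire range: {mcgwire_range}")
```

   A smaller range means the yearly totals sit closer together, which is
   one argument for calling that player more consistent.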
Histograms

      Used for quantitative variables
      Tracks frequency and shows distribution
      Does not preserve individual values
      Good for a large number of values
      Bars must be vertical and must touch
Histograms

  Example
  For our test scores example, construct a histogram and analyze the
  distribution.
  It is easier if the values are in order as we will be grouping them into
  classes.
                          56   64    71   73    74
                          76   82    83   83    83
                          84   85    85   87    91
                          92   92    95   95
Histograms

  We first want to create a frequency table. This is a collection of
  non-overlapping classes and the frequency of observation in each of
  those classes. We need to determine the following in this order:

  Number of classes
  The rule of thumb with the number of classes is to use the square root
  of the number of observations in the data set.
                         $\sqrt{19} \approx 4.36$

  So, we can use 4 or 5 classes. I tend to go up to the next integer to be
  sure I have enough classes. So we will use 5 for our graph.
Histograms

  Size of each class
  We want them to be the same width so that the taller classes will be
  known to have the most elements. If not then we have to find the area
  of each rectangle to determine relative size.
  To find the size, we divide the ‘range’ by the number of classes.
          $$\text{size} = \frac{95 - 56 + 1}{5} = \frac{40}{5} = 8$$
  So we will use 8 for the class size.
  If this came out to be a decimal, we can use the decimal value for the
  class size or we can round here too. Remember, this is an
  approximation tool, so as long as we are using values close to those
  we calculate, we will have an accurate representation to analyze.
Histograms

  Endpoints of each class
  We start the smallest class with a left endpoint of 56, since that was
  our minimum. Then, to find the next left endpoint, add 8 to 56.
  Continue in this manner until we have 5 classes.

                        Grade Range      Frequency
                            56-
                            64-
                            72-
                            80-
                            88-
Histograms
  Then, we subtract 1 from each left endpoint to find the right endpoint
  of the previous class.
                        Grade Range     Frequency
                           56-63
                           64-71
                           72-79
                           80-87
                           88-95
  Finally, we count how many elements go in each class.
                        Grade Range     Frequency
                           56-63            1
                           64-71            2
                           72-79            3
                           80-87            8
                           88-95            5
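
  The three steps (number of classes, class size, endpoints) and the final
  count can be sketched in Python. This assumes the ordered test scores as
  printed for this example and rounds up at both steps; rounding up is one
  of the choices the text allows, not the only one.

```python
import math

# Ordered test scores as printed for the histogram example.
ordered = [56, 64, 71, 73, 74,
           76, 82, 83, 83, 83,
           84, 85, 85, 87, 91,
           92, 92, 95, 95]

# Step 1: number of classes, square root of the count rounded up.
n_classes = math.ceil(math.sqrt(len(ordered)))

# Step 2: class size, (max - min + 1) divided by the number of classes.
class_size = math.ceil((max(ordered) - min(ordered) + 1) / n_classes)

# Step 3: endpoints and frequencies, starting at the minimum.
table = []
left = min(ordered)
for _ in range(n_classes):
    right = left + class_size - 1  # one less than the next left endpoint
    frequency = sum(left <= x <= right for x in ordered)
    table.append((left, right, frequency))
    left += class_size

print("Grade Range   Frequency")
for left, right, frequency in table:
    print(f"{left}-{right}         {frequency}")
```

  The frequencies should add up to the number of observations, which is a
  quick check that no value was missed or counted twice.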
Histograms
   [Figure: histogram "Grades of a High School Student". Horizontal axis: Grades, with classes 56-63, 64-71, 72-79, 80-87, 88-95. Vertical axis: Frequency, 2 to 8.]

  We see the same range and shape. Here, we’d have no choice but to
  give the class only for the center as we would lose the ability to see
  individual values.
Histograms
  Example
  The EPA lists most sports cars in its “two-seater” category. The table
  below gives the city mileage in miles per gallon. Make and analyze a
  histogram for the city mileage.
        Model                 Mileage     Model           Mileage
        Acura NSX               17        Insight           57
        Audi Quattro            20        S2000             20
        Audi Roadster           22        Lamborghini        9
        BMW M Coupe             17        Mazda             22
        BMW Z3 Coupe            19        SL500             16
        BMW Z3 Roadster         20        SL600             13
        BMW Z8                  13        SLK230            23
        Corvette                18        SLK 320           20
        Prowler                 18        911               15
        Ferrari 360             11        Boxster           19
        Thunderbird             17        MR2               25
Histograms

  There are 22 cars, and $4 < \sqrt{22} < 5$, so we would use 4 or 5
  classes; here I will choose 5. The size of each class would be
  $$\frac{57 - 9 + 1}{5} = \frac{49}{5} = 9.8$$
  So we will use 10.

                         Mileage   Frequency
                          9 -18        11
                          19-28        10
                          29-38        0
                          39-48        0
                          49-58        1
Histograms
   [Figure: histogram "MPG for Sports Cars". Horizontal axis: MPG, with classes 9-18, 19-28, 29-38, 39-48, 49-58. Vertical axis: Frequency, 3 to 12.]

      Center: Boundary between the first two classes
      Range: 58 − 9 = 49
      Shape: Unimodal, skewed right
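
   The histogram itself does not preserve individual values, but the table
   does, so the stated center can be checked against the raw mileages. The
   following is a minimal sketch using the 22 values from the table; the
   variable names are chosen only for illustration.

```python
import statistics

# City mileage for the 22 cars, read down the two columns of the table.
mileage = [17, 20, 22, 17, 19, 20, 13, 18, 18, 11, 17,
           57, 20, 9, 22, 16, 13, 23, 20, 15, 19, 25]

ordered = sorted(mileage)

# With 22 values the center lies between the 11th and 12th values.
middle_pair = (ordered[10], ordered[11])
median = statistics.median(mileage)

# Range from the raw data versus range read off the class endpoints.
raw_range = max(mileage) - min(mileage)
class_range = 58 - 9

print(f"11th and 12th values: {middle_pair}, median = {median}")
print(f"raw range = {raw_range}, class-endpoint range = {class_range}")
```

   The 11th value falls in the class 9-18 and the 12th in 19-28, which is
   why the center sits on the boundary between the first two classes.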
Bar Graphs

   These look similar to histograms but there are a few differences.
        Generally used for categorical variables
        Bars can be vertical or horizontal
        Cannot analyze distribution like histogram because the order of
        the classes is not necessarily in numerical order
        Can be used for comparisons
Bar Graphs

   Example
   The growth of the US population age 65 and over is given in the table.
   Create a bar graph to represent this data.
                         Year   Percent   Year    Percent
                         1900   4.1       1970    9.8
                         1910   4.3       1980    11.3
                         1920   4.7       1990    12.5
                         1930   5.5       2000    12.4
                         1940   6.9       2010    13.2
                         1950   8.1       2020    16.5
                         1960   9.2       2030    20.0
Bar Graphs

   [Figure: bar graph "Age of Seniors by Decade". Axes: Year (1900 to 2030, by decade) and Percent (ticks at 5, 10, 15, 20).]
Bar Graphs for Comparisons

   Example
   Create a bar graph for the given causes of death and analyze the
   results. Values given are the number per 100,000 people.
              Cause of Death     1970   1980    1990    2000
              Cardiovascular     640    509     387     318
              Cancer             199    208     216     201
              Accidents           62     46      36      34
Bar Graphs for Comparisons

   [Figure: grouped bar graph "Causes of Death". Horizontal axis: Year (1970, 1980, 1990, 2000). Vertical axis: Number of Deaths (per 100,000), 150 to 600. Legend: Cardiovascular, Cancer, Accidents.]

           Cancer and accident death rates each stay roughly the same
           from decade to decade
           Cardiovascular disease decreases each decade and is
           approaching the level of cancer deaths
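
   The trends read off the graph can be checked against the table. This is
   a minimal sketch using the values given in the example; the variable
   names are chosen only for illustration.

```python
years = [1970, 1980, 1990, 2000]

# Deaths per 100,000 people, from the table in the example.
rates = {
    "Cardiovascular": [640, 509, 387, 318],
    "Cancer": [199, 208, 216, 201],
    "Accidents": [62, 46, 36, 34],
}

# Change from each decade to the next, for every cause.
changes = {
    cause: [later - earlier for earlier, later in zip(values, values[1:])]
    for cause, values in rates.items()
}

# Gap between cardiovascular and cancer deaths in each year.
gap = [cv - ca for cv, ca in zip(rates["Cardiovascular"], rates["Cancer"])]

for cause, deltas in changes.items():
    print(f"{cause}: change per decade = {deltas}")
for year, difference in zip(years, gap):
    print(f"{year}: cardiovascular minus cancer = {difference}")
```

   A gap that shrinks every decade is what the graph shows as the
   cardiovascular bars coming down toward the cancer bars.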
Pie Charts
   Pie charts are used to compare different categories in relation to each
   other. A pie chart is only good for one type of variable; we cannot do
   comparisons between different types of observations with one pie
   chart.
   Example
   You sit on an overpass and record the color of the first 100 cars you
   see. The results are as follows:
                              color   frequency
                               red        15
                              blue        21
                              green       18
                              white       22
                              black       19
                              other       5
   Construct a pie chart to illustrate the relationship between the colors
   of these cars.
Pie Charts

   We have to make sure that the size of each slice is correct in relation
   to the other slices. To do this, we make sure the central angle is the
   correct size. Since we have 100 observations, the number of
   observations is the percent of the circle we need for that wedge. For
   example, for the red cars, we saw 15 of them, so we would need a
   central angle of 0.15(360°) = 54°.
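
   The same calculation works for every color: the central angle is the
   fraction of all observations times 360 degrees. The following is a
   minimal sketch using the counts from the table; the variable names are
   chosen only for illustration.

```python
# Car colors and their frequencies from the overpass example.
counts = {"red": 15, "blue": 21, "green": 18,
          "white": 22, "black": 19, "other": 5}

total = sum(counts.values())

# Central angle = (frequency / total) * 360 degrees.
angles = {color: 360 * n / total for color, n in counts.items()}

for color, angle in angles.items():
    print(f"{color:>5}: {counts[color]:2d} cars -> {angle:.1f} degrees")
print(f"sum of angles: {sum(angles.values()):.1f} degrees")
```

   The angles should add up to 360 degrees, which is a quick check that
   the slices fill the whole circle.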
Pie Charts

   [Figure: pie chart of car colors. Slices: Blue 21%, Red 15%, Other 5%, Black 19%, White 22%, Green 18%.]
Pie Charts

   Example
   The following is a breakdown of the solid waste that made up
   America’s garbage in 2000. Values given represent millions of tons.
                             Material           Weight
                               Food              25.9
                               Glass             12.8
                               Metal             18.0
                               Paper             86.7
                              Plastics           24.7
                              Rubber             15.8
                               Wood              12.7
                          Yard Trimmings         27.7
                               Other              7.5
   Create a pie chart to represent this data.
Pie Charts

   We can’t make a pie chart with this data; at least not yet. What do we
   need? We need the relative frequency of each material: its weight
   divided by the total weight.
                Material         Weight    Relative Frequency
                  Food            25.9           11.2%
                  Glass           12.8            5.5%
                  Metal           18.0            7.8%
                  Paper           86.7           37.4%
                 Plastics         24.7           10.7%
                 Rubber           15.8            6.8%
                  Wood            12.7            5.5%
             Yard Trimmings       27.7           11.9%
                  Other            7.5            3.2%
                  Total          231.8
Pie Charts

   Now we can find the central angles and create our pie chart.
        Material        Weight    Relative Frequency    Central Angle
          Food           25.9           11.2%              40.3°
          Glass          12.8            5.5%              19.8°
          Metal          18.0            7.8%              28.1°
          Paper          86.7           37.4%              134.6°
         Plastics        24.7           10.7%              38.5°
         Rubber          15.8            6.8%              24.5°
          Wood           12.7            5.5%              19.8°
     Yard Trimmings      27.7           11.9%              42.8°
          Other           7.5            3.2%              11.5°
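
   Both new columns can be reproduced from the weights alone. This sketch
   makes two assumptions: the total is taken as the sum of the nine listed
   weights (which comes to 231.8), and each central angle is computed from
   the percent after rounding to one decimal place, since that reproduces
   the angles shown in the table.

```python
# Solid waste by material, in millions of tons, from the example.
weights = {
    "Food": 25.9, "Glass": 12.8, "Metal": 18.0,
    "Paper": 86.7, "Plastics": 24.7, "Rubber": 15.8,
    "Wood": 12.7, "Yard Trimmings": 27.7, "Other": 7.5,
}

total = sum(weights.values())

table = {}
for material, weight in weights.items():
    # Relative frequency as a percent, rounded to one decimal place.
    percent = round(100 * weight / total, 1)
    # One percent of the circle is 3.6 degrees.
    angle = round(percent * 3.6, 1)
    table[material] = (percent, angle)

print(f"total weight: {total:.1f}")
for material, (percent, angle) in table.items():
    print(f"{material:<15} {percent:5.1f}%  {angle:6.1f} degrees")
```

   Because the percents are rounded before the angles are computed, the
   angles may not add up to exactly 360 degrees; computing each angle from
   the unrounded fraction avoids that.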
Pie Charts

   [Figure: pie chart of solid waste by material. Slices labeled Paper, Metal, Glass, Food, Other, Trimmings, Wood, Rubber, Plastics.]