Let Your Data Speak For You - The Power of Data Visualization RIPL 2021 / Joe Ryan

Page created by Sylvia Sanchez
 
Let Your Data Speak For You
The Power of Data Visualization
RIPL 2021 / Joe Ryan
Welcome! Here’s our agenda
     Introductions

     Definitions of and motivations for visualization

     Some best practices

     Evaluating visualizations

     A brief history of visualization

     How do I know what visualization style to use?

     Resources
Introductions
Please introduce yourself to
your colleagues via Zoom chat.
• Name / workplace
• What motivated you to join
  this session today?
A selective bio
 (former) Digital Projects Librarian, NC
 State University Libraries

   NC Architects

 (former) Digital Humanities Research
 Associate, ITS Research Computing
 Center, UNC Chapel Hill

 (former) Head, Center for Statistics and
 Visualization, University of Denver

   Votester
Defining data visualization
Defining information visualization
“The use of computer-supported, interactive, visual representation of abstract data to amplify cognition.”

Card, S. K., Mackinlay, J., and Shneiderman, B. (1999). Readings in Information Visualization: Using Vision to Think. San Francisco:
Morgan Kaufmann.
Why visualize information?
Motivation 1: Artistic expression

 Data-powered

 Often no scale or data
 source

 Can use same tools as
 other styles
Motivation 2: Analysis / Sense-making

 Patterns?

 Outliers?

 Relationships?

 Possible outcome:
 hypothesis formation
Motivation 3: Communication

 The ideal: a self-describing
 visualization

 Provide easy-to-
 understand visual cues to
 guide viewer to essential
 message
What every communication needs

     Well-defined audience

     Well-defined message

     Well-defined understanding of
     your audience’s context in
     receiving your message
Build a mission statement

“My visualization will help {my audience} understand {a specific message}.”
Your mission statement tells you:

 Visualization style choice

   How well does my audience know the subject area?

   How visually literate is my audience?

 Context: what kind and how much does your audience need?

 Label / Title: how much jargon does your audience know?
Visualization best practices
Colors in visualization

 Very powerful. What does the “stoplight palette” tell you? Red/Yellow/Green?

 Colors can carry many other meanings and are quick to draw the eye. Is
 your use of color needed? Is the color helping you highlight something
 critical?

 Be aware of colorblindness

 Keep your background a neutral color to avoid confusion
Visualization wisdom

     X/Y charts: have zero on the y axis if possible.

     Trends: use line charts to show behavior over time.

     Specific values: tables, bar charts, or lists of values.

     Test your visualization with your audience. Revise if you find confusion. You
     may have to take things away more often than add things.
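A quick worked example shows why the zero baseline matters. This is a minimal sketch with made-up numbers (not from the talk), and `drawn_ratio` is a hypothetical helper. It compares how much taller one bar is drawn than another when the y axis starts at zero and when it starts at a truncated baseline.

```python
def drawn_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of drawn bar heights (b relative to a) on an axis starting at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical values, for illustration only.
last_year, this_year = 95.0, 100.0

honest = drawn_ratio(last_year, this_year)           # y axis starts at 0
truncated = drawn_ratio(last_year, this_year, 90.0)  # y axis starts at 90

print(f"Zero baseline:  second bar is drawn {honest:.2f}x as tall")
print(f"Baseline at 90: second bar is drawn {truncated:.2f}x as tall")
```

The data differ by about 5%, but the truncated axis draws the second bar twice as tall as the first.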
Only show data relevant to your message,
      in good and sufficient context.
Evaluating visualizations
Stephen Few’s Viz Evaluation Criteria
1. Informative Criteria

         1. Usefulness: Does it meet your audience’s needs? Do they care?

         2. Completeness: Is the data on display complete? Or sliced to distort?

         3. Perceptibility: Is the right viz style in use for the dataset?

         4. Truthfulness: Is the data source credible and accurate?

         5. Intuitiveness: Is the visualization’s message easy to grasp?

2. Emotive Criteria

         1. Aesthetics: Is the visualization pleasing to gaze upon?

         2. Engagement: Does the visualization draw one in to learn more or distract from the message?

https://www.perceptualedge.com/articles/visual_business_intelligence/data_visualization_effectiveness_profile.pdf
Joe’s sample visualization evaluation
Usefulness: somewhat useful to know what transportation types
are most fuel-efficient

Completeness: an unexplained, unjustified selection of
transportation types. What about scooters?

Perceptibility: Very difficult to compare between values

Truthfulness: No data sourcing, no mention of speed of each type

Intuitiveness: Not sorted by distance, lines not parallel, lines
follow y-, not x-axis

Aesthetics: Why all lines converging at a point? Icons not from
universal set

Engagement: Most consumers do not buy oil by the barrel. What
about a gallon? Or 25 gallons?

What would I do to improve this visualization?
Live evaluation: volunteer?
A brief history of visualization
Pre-historical visualization
 Cave paintings, stone tablets,
 handwriting systems

 Navigation, city planning, record
 keeping

 Image: Babylonian world map, circa
 600 BC
Turin Map, 1150 BCE
 Includes color-coded geological
 information and mine site data.
Tabula Peutinger, 366-335 BCE

 Road map of the Roman Empire,
 showing roads and cities.

 Map coverage: Britain to India
Ptolemy’s Map, 2nd Century

 Latitude and longitude! First
 appearance of an absolute scale
 for mapping.
Celestial Bodies Chart, 950 CE

 Relative position and
 trajectory over time
Su Song’s Celestial Atlas, 1092

 Locations of various celestial
 bodies

 Uses techniques unknown in
 Europe until 16th Century

 Part of a 1,000 year tradition of
 star mapping in China
Ramon Llull, Diagrams of Relationships
Between Knowledge, 13th-14th Century
Diagram of philosophical
concepts common to all
lives

Attempt to gain insights
and uncover new truths
about life

Rotating disks can generate
new combinations

First hierarchy visualization?
Abraham Ortelius, Theatrum Orbis
Terrarum, 1570

 First single volume
 atlas of the world

 53 maps with
 descriptions and
 supplements
Edmond Halley, Contour Maps, 18th
Century

 Contour lines to connect and
 delineate differences in
 atmospheric conditions

 Contour lines live on in modern
 weather maps
Joseph Priestley, A Specimen of a
     Chart of Biography, 1765

      Lifetimes of influential people
      categorized and displayed on
      a timeline.
Charles de Fourcroy, Tableau
Poléometrique, 1782

 Demographic comparison of
 major European cities

 An inspiration for the
 development of the treemap
William Playfair, 1786
John Snow, London Cholera Map,
1854
 “X” indicates a water pump

 Black dots indicate cholera
 fatalities

 Led to decommissioning of Broad
 Street pump, the source of the
 infection
Charles Minard, 1869

 Color, line width, line direction,
 longitude as x-axis.
Luigi Perozzo, 3D Swedish Census,
1869

 Years horizontal, # of people
 vertical, age groups for depth

 Widely used in science and
 medicine today
Francis Galton, Weather Chart, 1875

 First newspaper weather map
 showing previous day’s weather

 Wind direction and air pressure
 similarities
Howard Fisher, Mapping Software,
1960s

 First general-purpose map
 software

 Harvard Laboratory for Computer
 Graphics and Spatial Analysis
Today’s landscape

 Many mapping tools

 General purpose desktop visualization and
 analytic tools

 Programming tools

 Web sandboxes
Choosing effective chart types
Identify the goal of your visualization in terms of its data

  Compare values?

  Show the composition of a phenomenon?

  Show a distribution of values?

  Show trends?

  Show relationships between entities?

  Highlight the location of data elements?
It’s really about the data type
 The data types in your dataset can point you in encouraging directions.

 Common data types

   Time series

   Geospatial

   Topical

   Tree / Network / Graph
Time series data

 “A sequence of data points, typically consisting of successive
 measurements made over a time interval.” — Wikipedia
Time series data
 Trends: tendency to decrease, increase, or stay the same

 Variability: change from one time point to following points

 Rate of change

 Co-variation: two or more variables change in relation to each other

 Cycles: repeating variability

 Exceptions: values that do not follow any emergent pattern
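Two of the features above, variability and trend, are easy to compute before charting. This is a minimal sketch: the monthly visit counts are invented, and `period_changes` and `trend_slope` are hypothetical helper names. The trend is estimated with an ordinary least-squares slope, which is one common choice among several.

```python
def period_changes(values):
    """Change from each time point to the next one."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def trend_slope(values):
    """Least-squares slope of the values against their position (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two time points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical monthly visit counts, for illustration only.
monthly_visits = [120, 135, 128, 150, 160, 155]

changes = period_changes(monthly_visits)
slope = trend_slope(monthly_visits)

print("Month-to-month changes:", changes)
print(f"Average trend: {slope:+.1f} visits per month")
```

A positive slope alongside changes of mixed sign suggests an upward trend with some month-to-month variability. A line chart would show the same thing at a glance.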
Time series common viz styles

 Line charts

 Bar charts
Time series line chart
Time series bar chart
Time series: what not to do
Geospatial data

 Data that includes physical
 location of some type as one or
 more of its attributes
Geospatial data
     Specific location data

       Latitude / longitude

       County / City / State / Nation / Continent / Hemisphere / &c.

     General location data

       Distance and direction from some point

       # of stops on a light rail system

       Building name / campus name / room number
Geospatial visualization styles

     Maps!

       Specifically, thematic maps, which show some arbitrary data overlaid on a geographic map.

              Even more specifically:

                 Proportional/graduated symbol maps

                 Choropleth maps

                 Cartograms

                 Heat maps

                 Flow maps
Proportional symbol maps

 Data can be encoded by shape,
 size, and/or color.

 Often require aggregation of data.

 Can quickly overwhelm—use with
 care.
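When data is encoded by symbol size, a common cartographic convention (an assumption here, not something the slide states) is to make the symbol's area, not its radius, proportional to the value. A minimal sketch with invented branch figures and a hypothetical `symbol_radius` helper:

```python
import math


def symbol_radius(value, max_value, max_radius=20.0):
    """Radius of a circle whose AREA is proportional to value.

    The largest value gets `max_radius`. Radius grows with the square
    root of the value, so doubling a value doubles the circle's area.
    """
    if max_value <= 0 or value < 0:
        raise ValueError("values must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical checkout counts per branch, for illustration only.
checkouts = {"Branch A": 400, "Branch B": 100, "Branch C": 25}

largest = max(checkouts.values())
radii = {name: symbol_radius(count, largest) for name, count in checkouts.items()}

for name, radius in radii.items():
    print(f"{name}: radius {radius:.1f}")
```

Scaling the radius directly would make Branch A's circle cover 16 times the area of Branch B's for only 4 times the checkouts.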
Choropleth maps
          Uses shades of a single color
          to show relative concentration
          in a defined area.

          Can be “classed” or
          “unclassed” - basically hard
          defined categories or a
          continuous color scheme

          3-7 classes recommended,
          more than that and you risk
          legibility problems
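For a classed choropleth, the classes have to come from somewhere. One simple scheme is equal intervals, sketched below with invented county rates and hypothetical helper names. Quantile or natural-breaks schemes are common alternatives and may suit skewed data better.

```python
from bisect import bisect_right


def equal_interval_breaks(values, n_classes):
    """Internal break points that split the data range into equal-width classes."""
    if n_classes < 1:
        raise ValueError("need at least one class")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]


def classify(value, breaks):
    """Class index for a value (0 = lightest shade).

    A value equal to a break point falls into the upper class.
    """
    return bisect_right(breaks, value)


# Hypothetical rates per county, for illustration only.
rates = {"County A": 2.0, "County B": 4.5, "County C": 7.0,
         "County D": 9.5, "County E": 12.0}

breaks = equal_interval_breaks(list(rates.values()), n_classes=4)
classes = {county: classify(rate, breaks) for county, rate in rates.items()}

print("Breaks:", breaks)
print("Classes:", classes)
```

Each class index would then map to one shade of the chosen color.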
Cartograms

     Distortions of cartographic areas
     in response to a measured
     variable.

     Two flavors, contiguous and non-
     contiguous.
Contiguous Cartogram

 Maintains contact of cartographic
 regions, resulting in a distorted
 map view.

 Low precision, but very eye-
 catching.
Heat maps

Overlay using semi-concentric,
graduated colored shapes to
display intensity and location of a
phenomenon.
Flow maps

     Map / flow chart hybrid used to
     show direction and amount of
     movement in a network.

     Minard’s French wine exports in
     1864.
Topical data

 Data that can answer questions
 like “What?”. Collections of
 topics, sometimes with time
 elements.

 Frequently the output of text
 mining and qualitative research
 studies.
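Topical data often starts as free text. A minimal sketch of the simplest kind of text mining, counting word frequencies, is below. The patron comments, the tiny stopword list, and the `topic_counts` helper are all invented for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "and", "to", "of", "is", "in", "for", "are"}


def topic_counts(texts):
    """Count how often each non-stopword appears across all texts."""
    words = []
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words)


# Hypothetical patron comments, for illustration only.
comments = [
    "The storytime program is great",
    "More storytime hours please",
    "Parking is hard; storytime parking is worse",
]

counts = topic_counts(comments)
top = counts.most_common(2)

print(top)
```

Counts like these can feed a word cloud, table, or bar chart.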
Word clouds
Tables
 Real data

 Absolute comparisons

 Can show all or some values

 Style cells to show majority vs
 long tail values

 Parseable beyond 10 items
Bar charts

 A simple, clean approach,
 scalable or groupable as desired.
“But what if the topics relate to each other
               in some way?”

               –Some or all of you
Tree data

 Topical data that can be arranged
 into a hierarchy.
Treemaps

Area used to show relative
measure.

Can be used for hierarchical and
non-hierarchical data.

“Share of whole” display.
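To make "area shows relative measure" concrete, here is a minimal sketch of the simplest possible layout: one row of side-by-side rectangles whose areas are proportional to the values. The circulation figures and the `slice_layout` helper are invented. Real treemap tools typically use more elaborate layouts (for example, squarified) and handle nested levels.

```python
def slice_layout(values, width=100.0, height=60.0):
    """One-level 'slice' treemap layout.

    Returns a list of (label, x, y, w, h) rectangles, largest first,
    that together fill a width-by-height canvas.
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    rects = []
    x = 0.0
    for label, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        w = width * value / total  # each width is the item's share of the whole
        rects.append((label, x, 0.0, w, height))
        x += w
    return rects


# Hypothetical circulation counts by collection, for illustration only.
circulation = {"Nonfiction": 30, "Fiction": 50, "Media": 20}

rects = slice_layout(circulation)

for label, x, y, w, h in rects:
    print(f"{label}: x={x:.0f}, width={w:.0f}, area={w * h:.0f}")
```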
Treemap / bar chart hybrid

 Area used to show relative
 measure.

 Can be used for
 hierarchical and non-
 hierarchical data.
Radial Tree Graph

 For datasets with a single parent
 and a reasonable number of
 children.

 Be mindful of legibility and
 parseability in large datasets.
“But I don’t have a hierarchy! And things are related! And I don’t know
                              what to do.”

                            –Some or all of you
Graph data
 Data that represents complex
 interrelationships between entities.

 Pairs of vertices that are
 connected by edges.

 Object type and link type may
 vary.

 Example: the social graph game!
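In data terms, a graph is a list of vertex pairs. A minimal sketch with invented names turns such a list into an adjacency structure and counts each person's connections. The `build_adjacency` helper is hypothetical, and relationships are assumed to be mutual (undirected).

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each vertex to the set of vertices it shares an edge with (undirected)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Hypothetical "who knows whom" pairs, for illustration only.
edges = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Cy", "Dee")]

graph = build_adjacency(edges)
degree = {person: len(neighbors) for person, neighbors in graph.items()}

for person, count in sorted(degree.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{person}: {count} connection(s)")
```

The connection count (degree) is a common way to size nodes in a graph visualization.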
Force-directed graph

 Displays relationships between
 entities, arranged for maximum
 legibility.

 Lines (“edges”) can be styled to
 characterize the relationship
 between two people (“nodes”).
Radial graph or
chord diagram
 First used to
 visualize
 relationships
 between genes.

 Easily
 overwhelms
 most people.
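The pairings of data type and visualization style covered in this section can be collected into a small lookup table. This is only a sketch: the `suggest_styles` helper is hypothetical, and the lists are starting points drawn from the slides above, not rules.

```python
# Visualization styles per data type, as listed in this section.
STYLES_BY_DATA_TYPE = {
    "time series": ["line chart", "bar chart"],
    "geospatial": ["proportional symbol map", "choropleth map", "cartogram",
                   "heat map", "flow map"],
    "topical": ["word cloud", "table", "bar chart"],
    "tree": ["treemap", "radial tree graph"],
    "graph": ["force-directed graph", "chord diagram"],
}


def suggest_styles(data_type):
    """Return the candidate styles for a data type (case-insensitive)."""
    key = data_type.strip().lower()
    if key not in STYLES_BY_DATA_TYPE:
        known = ", ".join(STYLES_BY_DATA_TYPE)
        raise KeyError(f"unknown data type {data_type!r}; expected one of: {known}")
    return STYLES_BY_DATA_TYPE[key]


print(suggest_styles("Geospatial"))
```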
For more reading

 Whitepaper from Tableau: https://www.tableau.com/learn/whitepapers/
 which-chart-or-graph-is-right-for-you

 Hubspot’s take: https://blog.hubspot.com/marketing/types-of-graphs-for-
 data-visualization

 Don’t be afraid of non-academic sources.
Putting visualization into action for you
Let’s talk.

 What’s the current state of visualization work at your library?

 What roadblocks exist?

 What data sources do you have?

 How about your data quality?

 What tools can you use for visualization?
Tools
     Data cleaning

         Excel / Google Sheets

         OpenRefine

     Visualization

         Google Data Studio

                Example: Population / Internet / Mobile Use

         Tableau

                Example: Who wrote the Beatles’ top hits?

         Qlik

                Example: OSCARS
References & Resources
 Free index of Stephen Few articles by topic here.

 Show Me the Numbers: Designing Tables and Graphs to Enlighten (Stephen
 Few)

 Storytelling with Data: A Data Visualization Guide for Business Professionals
 (Cole Nussbaumer Knaflic)

 Effective Data Storytelling: How to Drive Change with Data, Narrative and
 Visuals (Brent Dykes)
Q&A
Thanks for your time today!

https://www.linkedin.com/in/joseph-ryan-0017882/