Let Your Data Speak For You - The Power of Data Visualization RIPL 2021 / Joe Ryan
Welcome! Here’s our agenda Introductions Definitions of and motivations for visualization Some best practices Evaluating visualizations A brief history of visualization How do I know what visualization style to use? Resources
Introductions Please introduce yourself to your colleagues via Zoom chat. • Name / workplace • What motivated you to join this session today?
A selective bio (former) Digital Projects Librarian, NC State University Libraries NC Architects (former) Digital Humanities Research Associate, ITS Research Computing Center, UNC Chapel Hill (former) Head, Center for Statistics and Visualization, University of Denver Votester
Defining information visualization
“The use of computer-supported, interactive, visual representation of abstract data to amplify cognition.” Card, S. K., Mackinlay, J., and Shneiderman, B. (1999) Readings in information visualization: Using vision to think. San Francisco: Morgan Kaufmann.
Why visualize information?
Motivation 1: Artistic expression Data-powered Often no scale or data source Can use same tools as other styles
Motivation 2: Analysis / Sense-making Patterns? Outliers? Relationships? Possible outcome: hypothesis formation
Motivation 3: Communication The ideal: a self-describing visualization Provide easy-to-understand visual cues to guide viewer to essential message
What every communication needs Well-defined audience Well-defined message Well-defined understanding of your audience’s context in receiving your message
Build a mission statement “My visualization will help {my audience} understand {a specific message}.”
Your mission statement tells you: Visualization style choice How well does my audience know the subject area? How visually literate is my audience? Context: what kind and how much does your audience need? Label / Title: how much jargon does your audience know?
Visualization best practices
Colors in visualization Very powerful. What does the “stoplight palette” tell you? Red/Yellow/Green? Colors can carry many other meanings and are quick to draw the eye. Is your use of color needed? Is the color helping you highlight something critical? Be aware of colorblindness Keep your background a neutral color to avoid confusion
Visualization wisdom X/Y charts: have zero on the y axis if possible. Trends: use line charts to show behavior over time. Specific values: tables, bar charts, or lists of values. Test your visualization with your audience. Revise if you find confusion. You may have to take things away more often than add things.
Only show data relevant to your message, in good and sufficient context.
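The "zero on the y axis" advice can be sketched as a small helper that picks axis limits for a chart. This is a minimal illustration, not part of the original deck: the function name `y_axis_limits` and the 5% padding default are assumptions, and the result would be passed to whatever charting tool you use.

```python
def y_axis_limits(values, padding=0.05):
    """Return (low, high) y-axis limits that always include zero.

    `padding` is a hypothetical default: the fraction of the axis span
    added beyond the data so the largest bar or point is not clipped.
    """
    low = min(0, min(values))
    high = max(0, max(values))
    if low == high:
        # Only happens when every value is zero; show a unit axis.
        return 0.0, 1.0
    span = high - low
    # Pad only the side that holds data, so the zero baseline stays
    # at the edge of the chart when all values share one sign.
    if low < 0:
        low -= span * padding
    if high > 0:
        high += span * padding
    return low, high
```

For values such as 95, 100, and 105, the helper keeps the axis anchored at zero instead of zooming in on the 95 to 105 range, which would exaggerate small differences.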
Evaluating visualizations
Stephen Few’s Viz Evaluation Criteria 1. Informative Criteria 1. Usefulness: Does it meet your audience’s needs? Do they care? 2. Completeness: Is the data on display complete? Or sliced to distort? 3. Perceptibility: Is the right viz style in use for the dataset? 4. Truthfulness: Is the data source credible and accurate? 5. Intuitiveness: Is the visualization’s message easy to grasp? 2. Emotive Criteria 1. Aesthetics: Is the visualization pleasing to gaze upon? 2. Engagement: Does the visualization draw one in to learn more or distract from the message? https://www.perceptualedge.com/articles/visual_business_intelligence/data_visualization_effectiveness_profile.pdf
Joe’s sample visualization evaluation Usefulness: somewhat useful to know what transportation types are most fuel-efficient Completeness: an unexplained, unjustified selection of transportation types. What about scooters? Perceptibility: Very difficult to compare between values Truthfulness: No data sourcing, no mention of speed of each type Intuitiveness: Not sorted by distance, lines not parallel, lines follow y-, not x-axis Aesthetics: Why all lines converging at a point? Icons not from universal set Engagement: Most consumers do not buy oil by the barrel. What about a gallon? Or 25 gallons? What would I do to improve this visualization?
Live evaluation: volunteer?
A brief history of visualization
Pre-historical visualization Cave paintings, stone tablets, handwriting systems Navigation, city planning, record keeping Image: Babylonian world map, circa 600 BC
Turin Map, 1150 BCE Includes color-coded geological information and mine site data.
Tabula Peutingeriana, original circa 4th-5th Century CE Road map of the Roman Empire, showing roads and cities. Map coverage: Britain to India
Ptolemy’s Map, 2nd Century Latitude and longitude! First appearance of an absolute scale for mapping.
Celestial Bodies Chart, 950 CE Relative position and trajectory over time
Su Song’s Celestial Atlas, 1092 Locations of various celestial bodies Uses techniques unknown in Europe until 16th Century Part of a 1,000 year tradition of star mapping in China
Ramon Llull, Diagrams of Relationships Between Knowledge, 13th-14th Century Diagram of philosophical concepts common to all lives Attempt to gain insights and uncover new truths about life Rotating disks can generate new combinations First hierarchy visualization?
Abraham Ortelius, Theatrum Orbis Terrarum, 1570 First single volume atlas of the world 53 maps with descriptions and supplements
Edmond Halley, Contour Maps, 18th Century Contour lines to connect and delineate differences in atmospheric conditions Contour lines live on in modern weather maps
Joseph Priestley, A Specimen of a Chart of Biography, 1765 Lifetimes of influential people categorized and displayed on a timeline.
Charles de Fourcroy, Tableau Poléometrique, 1782 Demographic comparison of major European cities An inspiration for the development of the treemap
William Playfair, 1786
John Snow, London Cholera Map, 1854 “X” indicated water pump Black dots indicate cholera fatalities Led to decommissioning of Broad Street pump, the source of the infection
Charles Minard, 1869 Color, line width, line direction, longitude as x-axis.
Luigi Perozzo, 3D Swedish Census, 1869 Years horizontal, # of people vertical, age groups for depth Widely used in science and medicine today
Francis Galton, Weather Chart, 1875 First newspaper weather map showing previous day’s weather Wind direction and air pressure similarities
Howard Fisher, Mapping Software, 1960s First general-purpose map software Harvard Laboratory for Computer Graphics and Spatial Analysis
Today’s landscape Many mapping tools General purpose desktop visualization and analytic tools Programming tools Web sandboxes
Choosing effective chart types
Choosing effective chart types Identify the goal of your visualization in terms of its data Compare values? Show the composition of a phenomenon? Show a distribution of values? Show trends? Show relationships between entities? Highlight the location of data elements?
It’s really about the data type The data types in your dataset can point you in encouraging directions. Common data types Time series Geospatial Topical Tree / Network / Graph
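One way to picture "data type points to chart style" is a simple lookup table. The sketch below collects the style names this deck introduces for each data type; the dictionary layout, the `suggest_charts` name, and treating "network" as an alias for "graph" are illustrative assumptions, and the output is a starting point rather than a rule.

```python
# Chart styles grouped by data type, using the names from this deck.
CHART_STYLES = {
    "time series": ["line chart", "bar chart"],
    "geospatial": [
        "proportional symbol map",
        "choropleth map",
        "cartogram",
        "heat map",
        "flow map",
    ],
    "topical": ["word cloud", "table", "bar chart"],
    "tree": ["treemap", "treemap / bar chart hybrid", "radial tree graph"],
    "graph": ["force-directed graph", "radial graph or chord diagram"],
}

# Hypothetical alias so "network" resolves to the graph styles.
ALIASES = {"network": "graph"}


def suggest_charts(data_type):
    """Return candidate chart styles for a data type, or [] if unknown."""
    key = data_type.strip().lower()
    key = ALIASES.get(key, key)
    return list(CHART_STYLES.get(key, []))
```

Calling `suggest_charts("Time series")` returns the line chart and bar chart options; an unrecognized type returns an empty list, which is a cue to look more closely at the data.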
Time series data “A sequence of data points, typically consisting of successive measurements made over a time interval.” — Wikipedia
Time series data Trends: tendency to decrease, increase, or stay the same Variability: change from one time point to following points Rate of change Co-variation: two or more variables change in relation to each other Cycles: repeating variability Exceptions: values that do not follow any emergent pattern
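Several of these characteristics can be checked numerically before any chart is drawn. The sketch below is one rough way to do so using only the Python standard library; the function names, the "average change" definition of trend, and the two-standard-deviation cutoff for exceptions are all assumptions chosen for illustration, not methods from the deck.

```python
from statistics import mean, pstdev


def period_changes(values):
    """Change from each time point to the next (variability, rate of change)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def trend(values):
    """Label the overall tendency from the average point-to-point change."""
    changes = period_changes(values)
    if not changes:
        return "flat"
    average = mean(changes)
    if average > 0:
        return "increasing"
    if average < 0:
        return "decreasing"
    return "flat"


def exceptions(values, threshold=2.0):
    """Indexes of values more than `threshold` standard deviations from the mean."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - center) > threshold * spread]
```

With a hypothetical series of monthly visit counts such as `[120, 125, 130, 128, 400, 135, 140]`, `exceptions` flags the fifth value, which is the kind of point worth annotating or investigating before it goes on a line chart.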
Time series common viz styles Line charts Bar charts
Time series line chart
Time series bar chart
Time series: what not to do
Geospatial data Data that includes physical location of some type as one or more of its attributes
Geospatial data Specific location data Latitude / longitude County / City / State / Nation / Continent / Hemisphere / &c. General location data Distance and direction from some point # of stops on a light rail system Building name / campus name / room number
Geospatial visualization styles Maps! Specifically, thematic maps, which show some arbitrary data overlaid on a geographic map. Even more specifically: Proportional/graduated symbol maps Choropleth maps Cartograms Heat maps Flow maps
Proportional symbol maps Data can be encoded by shape, size, and/or color. Often require aggregation of data. Can quickly overwhelm—use with care.
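When data is encoded by symbol size, one common cartographic convention is to scale the symbol's area, not its radius, to the value, so that a value four times larger looks four times larger. That convention is an assumption added here, not something stated in the deck, and the function name and default maximum radius below are hypothetical.

```python
import math


def symbol_radius(value, max_value, max_radius=20.0):
    """Radius for a circle whose AREA is proportional to `value`.

    `max_radius` (an assumed default, in pixels or points) is the radius
    given to the largest value in the dataset.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area grows with the square of the radius, so take the square root.
    return max_radius * math.sqrt(value / max_value)
```

A value one quarter the size of the maximum gets half the radius, which gives it one quarter of the area.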
Choropleth maps Uses shades of a single color to show relative concentration in a defined area. Can be “classed” or “unclassed” - basically hard defined categories or a continuous color scheme 3-7 classes recommended, more than that and you risk legibility problems
Cartograms Distortions of cartographic areas in response to a measured variable. Two flavors, contiguous and non-contiguous.
Contiguous Cartogram Maintains contact of cartographic regions, resulting in a distorted map view. Low precision, but very eye-catching.
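A "classed" choropleth needs break points that sort each area's value into a color class. The sketch below uses equal-interval breaks, which is only one of several classing methods and is an assumption here; the function names are hypothetical, and the default of 5 classes sits inside the 3-7 range recommended above.

```python
from bisect import bisect_right


def equal_interval_breaks(values, classes=5):
    """Return the interior break points that split the data range evenly.

    The deck recommends 3-7 classes; this sketch only rejects fewer than 2.
    """
    if classes < 2:
        raise ValueError("a classed map needs at least 2 classes")
    low, high = min(values), max(values)
    width = (high - low) / classes
    return [low + width * i for i in range(1, classes)]


def classify(value, breaks):
    """Return the class index (0 = lightest shade) for a value."""
    return bisect_right(breaks, value)
```

Each class index would then map to one shade of the single color. An "unclassed" map skips this step and maps each value directly onto a continuous color scale.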
Heat maps Overlay using semi-concentric, graduated colored shapes to display intensity and location of a phenomenon.
Flow maps Map / flow chart hybrid used to show direction and amount of movement in a network. Minard’s French wine exports in 1864.
Topical data Data that can answer questions like “What?”. Collections of topics, sometimes with time elements. Frequently the output of text mining, and qualitative research studies.
Word clouds
Tables Real data Absolute comparisons Can show all or some values Style cells to show majority vs long tail values Parseable beyond 10 items
Bar charts A simple, clean approach, scalable or groupable as desired.
“But what if the topics relate to each other in some way?” –Some or all of you
Tree data Topical data that can be arranged into a hierarchy.
Treemaps Area used to show relative measure. Can be used for hierarchical and non-hierarchical data. “Share of whole” display.
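The "area shows relative measure" idea can be sketched with the simplest possible layout: slicing one rectangle into vertical strips whose widths match each item's share of the whole. This is a single-level illustration only; real treemap tools typically use more elaborate layouts and handle nested hierarchies. The function name, canvas size, and example categories are assumptions.

```python
def slice_layout(sizes, width=100.0, height=100.0):
    """Split a width x height rectangle into strips proportional to each size.

    Returns a list of (label, x, y, strip_width, strip_height) tuples,
    largest item first.
    """
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("sizes must sum to a positive number")
    rectangles = []
    x = 0.0
    ordered = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    for label, size in ordered:
        strip_width = width * size / total
        rectangles.append((label, x, 0.0, strip_width, height))
        x += strip_width
    return rectangles


# Hypothetical circulation counts by collection.
layout = slice_layout({"Fiction": 50, "Nonfiction": 30, "Media": 20})
```

Because every strip has the same height, each strip's area is the same fraction of the canvas as its value is of the total, which is the "share of whole" reading a treemap is meant to give.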
Treemap / bar chart hybrid Area used to show relative measure. Can be used for hierarchical and non-hierarchical data.
Radial Tree Graph For datasets with a single parent and a reasonable number of children. Be mindful of legibility and parseability in large datasets.
“But I don’t have a hierarchy! And things are related! And I don’t know what to do.” –Some or all of you
Graph data Data that represents complex interrelationships between entities. Pairs of vertices that are connected by edges. Object type and link type may vary. Example: the social graph game!
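Graph data is usually stored as a list of vertex pairs, which most graph tools can read directly. The sketch below builds a neighbor lookup from such pairs and finds the most connected vertex. The names and the small social graph are hypothetical, and the edges are treated as undirected, which is an assumption.

```python
from collections import defaultdict


def build_graph(edges):
    """Turn (vertex, vertex) pairs into a dict of vertex -> set of neighbors."""
    graph = defaultdict(set)
    for a, b in edges:
        # Undirected: record the connection in both directions.
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def most_connected(graph):
    """Return the vertex with the most neighbors."""
    return max(graph, key=lambda vertex: len(graph[vertex]))


# Hypothetical "who knows whom" pairs.
friendships = [("Ana", "Ben"), ("Ana", "Cam"), ("Ben", "Cam"), ("Ana", "Dee")]
social_graph = build_graph(friendships)
```

Counting neighbors like this is a quick sanity check on the data before handing the same edge list to a graph-drawing tool.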
Force-directed graph Displays relationships between entities, arranged for maximum legibility. Lines (“edges”) can be styled to characterize the relationship between two people (“nodes”).
Radial graph or chord diagram First used to visualize relationships between genes. Easily overwhelms most people.
For more reading Whitepaper from Tableau: https://www.tableau.com/learn/whitepapers/which-chart-or-graph-is-right-for-you Hubspot’s take: https://blog.hubspot.com/marketing/types-of-graphs-for-data-visualization Don’t be afraid of non-academic sources.
Putting visualization into action for you
Let’s talk. What’s the current state of visualization work at your library? What roadblocks exist? What data sources do you have? How about your data quality? What tools can you use for visualization?
Tools Data cleaning Excel / Google Sheets OpenRefine Visualization Google Data Studio Example: Population / Internet / Mobile Use Tableau Example: Who wrote the Beatles’ top hits? Qlik Example: OSCARS
References & Resources Free index of Stephen Few articles by topic here. Show Me the Numbers: Designing Tables and Graphs to Enlighten (Stephen Few) Storytelling with Data: A Data Visualization Guide for Business Professionals (Cole Nussbaumer Knaflic) Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals (Brent Dykes)
Q&A Thanks for your time today! https://www.linkedin.com/in/joseph-ryan-0017882/