Jonathan Schwabish is Here to Help You Bring Your Data to Life with Advanced Data Visualizations
Ever wonder what chart horizons lie beyond the basic bar, pie, or line graph? Ever heard of bee swarms, waffles, or dumbbell charts? These are not terms commonly associated with the data visualization space, but if you really want to take your graph and chart-making skills to the next level, it will benefit you greatly to become more familiar with them.
Jonathan Schwabish returns to the show to give us a tour of advanced chart options that can work well for business data storytelling and a few exotic graphs that are simply a fascinating and novel way to view data. From choropleth maps to slope charts, tune in to hear it all!
He is an economist at the Urban Institute in Washington, DC. In addition to his research on programs that support low-income communities, he is a writer, teacher, and creator of policy-relevant data visualizations. He is considered a leading voice for clarity and accessibility in how researchers communicate their analyses.
Jon is also the prolific author of Better Presentations, Elevate the Debate, and Data Visualization in Excel, and today's episode samples charts from his most excellent read, Better Data Visualizations.
Jon will open your eyes to a whole new world of possibilities in the advanced data visualization space!
In This Episode, You’ll Learn…
- The pros and cons of Excel, Tableau, Datawrapper, Flourish, Power BI, and R for data visualizations.
- Why a bee swarm chart is one of Jon’s favorites.
- The advantages of a waffle chart over a pie chart.
- When using a dumbbell dot plot is recommended.
- Examples of use cases for Voronoi diagrams and choropleth maps.
- The importance of asking for feedback on a graph or chart you have created.
- How to overcome your dislike of scatterplots and why the slope chart is Jon’s go-to.
People, Blogs, and Resources Mentioned
- Bee swarm chart
- Waffle chart
- Voronoi diagram
- Dumbbell dot plot
- Choropleth map
- Slope chart
- Tableau
- Datawrapper
- Flourish
- Power BI
- R
- Better Presentations
- Elevate the Debate
- Better Data Visualizations
- Data Visualization in Excel
- My free 30-second online assessment to find out and overcome the #1 silent killer of your data presentation success
How to Connect with Jon Schwabish:
- Jon’s LinkedIn and Twitter profiles
- PolicyViz
- Urban Institute
Where Lea is Speaking Next:
I'd love to meet you, in-person or online! Here are the data storytelling, analytics, digital marketing conferences and events I'll be speaking at:
Thanks for Listening!
Thanks so much for joining me. Have some feedback you’d like to share, or a question? Leave a note in the comments below, and we’ll get back to you!
Now, I'm going to ask two favors from you:
- If you're digging the show, don’t forget to hit the Follow buttons on iTunes or Spotify to never miss an episode.
- If you liked what you heard, I would love if you could leave me a rating or review in Apple Podcasts. Ratings & reviews are extremely appreciated and very important in the rankings algorithm. The more ratings, the better chance of fellow practitioners getting to hear this helpful information!
And finally, always remember: viz responsibly, my friends.
Namaste,
[0:00:01.4] LP: Hello, hello, Lea Pica here. Today’s guest is a data visualization pro who is here to take us through some super-advanced charts and when to use them. Stay tuned to find out who is taking us to school on the Present Beyond Measure Show, episode 84.
[0:00:16.9] ANNOUNCER: Welcome to the Present Beyond Measure Show, a podcast at the intersection of analytics, data visualization, and presentation awesomeness. You’ll learn the best tips, tools, and techniques for creating analytics, visualizations, and presentations that inspire data-driven decisions and move you forward. If you’re ready to get your insights understood and acted upon, you’re in the right place. And now your host, Lea Pica.
[0:00:44.6] LP: Hello, hello, and welcome to the 84th episode of the Present Beyond Measure Show. The only podcast at the intersection of presentation, data visualization, storytelling, and analytics. This is the place to be if you're ready to make maximum impact and create credibility through thoughtfully presented insights and ideas.
Today’s interview is loaded with tips on advanced charts you may have never heard of and graphs that you’ll love to use from a true data visualization expert and esteemed author. So be sure to stay tuned in.
Now, as usual, I’m excited for today’s guest because his contributions to the field of effective data visualization, communication, and presentation skills are unrivaled and he’s super funny to boot. Let’s dive in.
[INTERVIEW]
[0:01:45.3] LP: Hello and welcome. Today’s guest is an economist at the Urban Institute in Washington DC. In addition to his research on programs that support low-income communities, he’s a writer, teacher, and creator of policy-relevant data visualizations and he’s considered a leading voice for clarity and accessibility in how researchers communicate their analyses.
He's also the author of amazing books including Better Presentations, Better Data Visualizations, Elevate the Debate, and coming soon, Data Visualization in Excel. I have the Better Data Visualizations right here, one of my favorites, and his contribution to the space is nearly unmatched so I’d love for you to help me welcome today’s guest, Jon Schwabish. Hello.
[0:02:36.0] JS: Hi Lea, thanks, good to see you again, it’s been a while.
[0:02:39.0] LP: I know, it took a while to get this going.
[0:02:40.5] JS: It took a while.
[0:02:42.3] LP: We had a few detours.
[0:02:43.1] JS: Yeah-yeah-yeah.
[0:02:44.1] LP: But it’s great to have you back, as long as you keep publishing new books. I’m just going to keep having you return.
[0:02:50.3] JS: Yeah, I don’t know, this is the fourth one, it could be the final, we’ll see. Someone will drag me somewhere and I go kicking and screaming and then I’ll say, “Okay, yeah, it’ll be fun to write.”
[0:02:59.9] LP: Right. Next is your data memoir, right?
[0:03:05.3] JS: Step one, “When I was four, I made a bar chart.”
[0:03:07.7] LP: Right, exactly. Well, your experience is vast in the space just helping researchers but also spilling over as a major player in the data visualization and presentation space. I’ve been using your information as a digital marketer and data analyst for many, many years.
So I want to jump right in and talk really tech stuff because that’s what the people want. So what I thought I would ask is the first question that I always get in my workshops: what tools do you use for your visualizations? Talk about the tools.
[0:03:45.5] JS: Yeah, tools, always the big question. It’s like, “Oh, you’ve shown me all these graphs and all these good techniques but how do I actually make the thing?” Yeah, I hear this question a lot too. So I mean, there are a lot of tools out there and my view on tools is that they are just tools. You know, you hear a lot of people say, “Oh, you should never use this tool or that tool or that tool because they’re garbage or they don’t do this, they don’t do that.”
My view is, they are just tools. So for example, I spent a lot of my time in Microsoft. I use Excel for probably three-quarters of my visualization, and when I hear people say, “Excel doesn’t make good graphs.” I’m like, “Well, first off, I am making the graph, Excel isn’t doing anything,” you know, “Until the terminators take over, I’m still in charge,” right? But would I use Excel to create an interactive feature on the New York Times website? No, that’s not what it’s good for, right?
But my main toolkit is Excel, particularly, obviously, for static stuff. A lot of the work I do at the Urban Institute, we work in the Microsoft suite for a lot of things for our writing and for our presentation. So Excel makes sense for a lot of that work. I also use Tableau and I’ve been using Tableau a lot more over the last year or so. I’m just trying to get better at it and for no particular reason other than I just kind of like the challenge.
There are some weird things in Tableau. For folks who ever look at my Twitter feed every once in a while, I’ll have some question like, “Why is Tableau doing this weird thing?” And what’s great about Tableau, maybe even more so than a lot of these other tools, is that the community around Tableau is so helpful. It’s extraordinary. You can ask the question on Twitter and you’ll get an answer immediately. You do that for Excel, it’s not the same.
[0:05:19.1] LP: No one wants to raise their hand and be like, “I love Excel.”
[0:05:22.0] JS: I love Excel, right? “Hey, look up, over here.”
[0:05:24.5] LP: Yet we all use it.
[0:05:26.0] JS: Again, I see what Tableau is good for and what it may be not good for, it’s just a tool. A lot of people ask me, “Do I use Power BI?” I personally don’t use Power BI because I spend most of my time on a Mac and Power BI is not on a Mac yet. So once Microsoft and Apple can get over whatever and it comes out, I’ll probably use Power BI. I used it early on when I was able to run Parallels more on my computer and be able to have both operating systems.
I do like it because it taps into the Microsoft systems so much more. I mean, it’s just natively into Microsoft so, if you’re an Excel person and you want to make dashboards, Power BI is great. I will say that I just don’t think that the Power BI dashboards, as a lot of what I’ve seen and people do amazing stuff with it, but sort of the general stuff, I don’t think it looks as glossy and is as polished as Tableau. But I don’t know if that matters.
I always think about like, if you think of all the dashboards that are created in any tool, what portion of them are these bespoke, custom, awesome looking things but really don’t help you do what a dashboard should. The goal is to explore the data and the rest of those that are sort of your more standard dashboards. How many of them are not public and people are just using them in their day-to-day work and sitting around the office?
And for those, who cares if it’s shiny and glossy and looks polished? The point is, for you and me Lea, to dive into data so like, who cares how, right? So anyway, back to your core question. So I use Excel, Tableau, I use a lot of Datawrapper and we’ve been using more of Datawrapper at Urban now. Datawrapper is, for those who don’t know, a browser-based data visualization tool. It has a very robust opportunity to use it that’s free. So you don’t have to pay for it, you can do a lot with just a free version. Basically, you can use every visualization they have in their library for free. The only thing you can’t do is build in your custom default templates in the tool but fine, okay.
It’s a browser-based tool so it has its own downsides. So you wouldn’t want to put social security numbers into Datawrapper, right? And it’s a limited library and at least for me, I can’t really hack into the HTML code but I do like it. I do like that tool. I also use Flourish, which is a similar kind of tool and it’s more focused on animations. I do like Flourish for some sort of more different types of graphs. So I’ll use Flourish when I want to make a bee swarm chart or some other charts that are not necessarily in that sort of standard graph type.
Okay, so I’ve mentioned Excel, Tableau, Datawrapper, Flourish, and the last one I use is R. So I use the R programming language primarily when I’m making maps and when I’m doing small multiples. So small multiples of course, we have these smaller multiple charts. We’re not packing everything into one graph and in R it’s just so easy to make small multiples. It’s one extra line of code, you do facet_wrap, and bam, your line chart that had 10 lines in it is now 10 separate graphs.
[0:08:28.8] LP: Interesting, wow.
[0:08:30.6] JS: So I will say personally that I am a mediocre R programmer but I also can’t – I mean, I have not learned, I’ll be positive here, I have not learned how to do the statistical part of R, so I don’t run regressions or clean data. I use tools that I’ve been using for a long time in my career, Stata and SAS to do the data cleaning and analyzing and then I’ll package it up and bring it over to R to do the data vis.
So again, I use a lot of different tools and the book that you just held up, the Better Data Visualizations book, I use all five of those tools, plus a bunch more. Like you know, I mean –
[0:09:06.4] LP: Yeah, I saw one in there, I was like, “There’s no way.”
[0:09:12.4] JS: Yeah, and I would say most of the charts in that book can be created in Excel. The ones where you end up in trouble are maps, Excel is just not great at maps, and a lot of stuff that’s sort of curvy or swirly because Excel is just not very good at curves.
[0:09:31.4] LP: It’s very angular.
[0:09:31.8] JS: Yeah, exactly right. It’s very angular. So you can work in lines and you need lines of sort of a non-trivial length and you can kind of do almost anything in Excel but if you think about the border of your country or your state that’s jagged or irregular, Excel is not going to be good at that because a curve is infinitesimally small lines stacked together and Excel is just not really good at that. So –
[0:09:55.5] LP: That’s interesting.
[0:09:56.4] JS: Yeah, but for most of the other stuff, it’s great. So I just use a combination of tools and it really does vary on what I need to do and where it’s going to live and does it need to be responsive or does an image work? You know, just a straight-up image? And do I need people to explore the data so then I’m in an interactive world? Am I just telling a story, just making a point? Yeah, so I think there are a lot of tools out there that people should explore and should use because there is no tool that rules them all, right? There’s no Lord of the Rings.
[0:10:30.5] LP: Right.
[0:10:30.9] JS: Now, other people would disagree. Lots of people get into their little camps of, “My tool’s the best,” and I’m just not there. So come back to me in a couple of years and maybe I’ll be into the Tableau world so deep that I’ll be like, “You should use Tableau,” but I don’t think I’ll be there.
[0:10:45.7] LP: You know, what I really appreciate about your answer is it depends, which for me, is the answer. Ideally, for any of the questions that I always get like, “What’s your favorite tool? What’s your favorite chart? What’s your favorite platform? What’s the right thing?” and I’m like, “No right, no wrong.”
[0:11:01.2] JS: Yeah, no right, no wrong.
[0:11:02.0] LP: You have to look at the specific situation, the environment it’s being consumed in, like you said, “Is it interactive?” That’s going to change. I know what you mean where with Tableau, I’ve been experimenting for quite a long time, trying to do an analysis of something called the Bechdel test, which is a test criteria to measure whether something Hollywood puts out, like a movie or TV show, meets the criteria for being women empowered. Not even women-centric but equality.
[0:11:33.3] JS: Yeah, right, equality, yeah.
[0:11:35.7] LP: And I found a really interesting use for Tableau which one day I’m going to release it, I’m excited, where I did a timeline of the pass-fail rating of different studios over time, starting from the thirties when the data begins and it was able to show me a very curvy, like you said, line that expanded in width or shortened depending on their percentage success rates. So you could see wider areas of the line are good years, thinner areas not so good, depending on how many movies were released that year. So you just have to look at the various strengths of each.
[0:12:17.9] JS: The way I think of these different tools and some of them have a philosophy behind them, right? So Excel, for example, it works in lines, bars, and circles and they need to be a non-trivial size, right? And then it works in a 2D X-Y space. That’s sort of the underlying philosophy and you could sort of describe a philosophy in lots of different ways.
Like, R for example, works in a layering philosophy. Let’s say you wanted to make a world map where the countries are colored by different, you know, they’re colored from a light blue to a dark blue, and then for whatever reason, you want to add bubbles on top of it, circles of different sizes. That’s literally just one extra line of code because it’s just taking the map, layering on the colors, and then layering on the bubbles. So the ggplot2 package in R is really layering.
And then Tableau, there are two ways at least that I think about it. One is sort of what it calls measures and dimensions. So it’s things you group and then things you count and then the other piece I think is the biggest barrier for people just learning Tableau is that it generally, not always but generally likes its data to be in a long or a tall format as opposed to wide and if you come into data or data visualization, you’d probably start with Excel or Google Sheets and we tend to work in kind of a wide format.
So let’s say you have countries and we’re going to have two variables for countries. We’re going to have GDP and we’ll have life expectancy. So in a wide data set, you would have in your first column, all of your countries. Let’s say, there are 230 some odd countries down that first column. In the second column, you’d have GDP, some number, and then in the third column, you’d have life expectancy, some number. And that, for most of us, is just instinctual or it’s just, you know, “Okay, I can put those into two different graphs and good to go.”
In Tableau, generally speaking, it prefers to take that data and not have it as three columns in the way that I just described it but longer, so that you would have three columns but they’d be different. The first column would be the countries and they would repeat. So you would have, instead of 200 rows, you’d have 400 rows. The second column would be the name of the data field, so you’d have 200 rows with the name GDP and 200 rows with the word life expectancy, and then the third column would be the data value. So all the way down.
And that way, when you bring it into Tableau, you have the thing that groups it which is the two categories in your countries, and then the data value, and once you get that more database structure as opposed to a spreadsheet structure, Tableau starts to fall into place a little bit more. So each of these tools has their own kind of underlying philosophy and I think once you sort of get and understand the philosophy and then like the challenge of course is like, make it instinctual as you’re working with the tool, then you can really start to build some things and become more fluent in pushing the boundaries of what you can do with it.
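As a rough illustration of the wide-versus-long reshaping Jon describes, here is a minimal Python sketch. The country names, numbers, and the `wide_to_long` helper are invented for this example and are not from the episode or from Tableau itself.

```python
# Wide layout: one row per country, one column per measure.
# All figures are made-up placeholders, not real statistics.
wide_rows = [
    {"country": "Country A", "gdp": 100.0, "life_expectancy": 70.0},
    {"country": "Country B", "gdp": 250.0, "life_expectancy": 80.0},
]


def wide_to_long(rows, id_field, value_fields):
    """Return one record per (id, measure) pair: the long or tall layout."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_rows.append({
                id_field: row[id_field],   # the country repeats on every row
                "measure": field,          # the name of the data field
                "value": row[field],       # the data value itself
            })
    return long_rows


long_rows = wide_to_long(wide_rows, "country", ["gdp", "life_expectancy"])
for record in long_rows:
    print(record)
```

Two countries with two measures become four rows, just as 200 countries become 400 in Jon's description. For anyone working in pandas, `DataFrame.melt` performs the same kind of reshaping.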
Like you said Lea, having a line that curves and then changing the width of it in Tableau, it’s not really that difficult, once you understand how it works and the different markers and that sort of thing. So yeah, it’s just interesting. I mean, again, they each have their weirdnesses, right? But I don’t think any one tool is inherently better than another tool, although many listening will say, “No, my tool is best.”
[0:15:35.1] LP: Pitchfork.
[0:15:35.1] JS: Yeah, right, exactly, yeah.
[0:15:36.5] LP: No, I like being the Switzerland of –
[0:15:40.1] JS: Yeah, and I’m fine with it. I’m fine with it. And then, your listeners work in different places and work with different types of people and for me, my full-time job is in the nonprofit sector so my avatar of the people that I’m generally working with is a data person in a nonprofit organization that has six people or 12 people and that data person, generally in my experience, has been thrust into that position because they’re the person that has some scale or some affinity or some skill there but they’re sort of thrust into this position.
And that nonprofit of six people, they might not have money to buy Tableau at USD1,500 a license, right? Or they don’t have big data, so they don’t really need, you know, or they’re collecting data on a quarterly basis and that’s it and it’s pretty limited so, if that’s your case – I mean, they all have Excel, they all have Microsoft so maybe you don’t need these other things. And maybe they’re always printing out their briefing books and so they don’t need stuff online.
As you said, it depends and it doesn’t just depend on your skillset and your affinity for using a tool. It depends on your audience and what they can do and part of what I try to do is just empower groups that I work with to say, “I can show you how to make this dashboard but I want to help you be able to make this dashboard next time.” So maybe it’s not the best business strategy.
[0:17:06.3] LP: Right.
[0:17:07.1] JS: But like, “How can you do it with your data next time?” Or at least update this dashboard, so we’re going to do a little bit of training at the very least to show you how these tools work.
[0:17:15.9] LP: Absolutely. Yeah, no, that makes sense. So, one of the things I loved about Better Data Visualizations, your book, was how many charts I had not heard of and I thought we would dedicate, you know, we spend a lot of time talking on the show about the basic ones that are best for more explanatory executive decision making but I want to give the people what they want and talk a little bit about some of the more obscure items that you have in here and I want to see how many we can cover. So the first one you mentioned, what the heck is a bee swarm?
[0:17:53.2] JS: I knew you were going to come up with that one first. It’s one of my favorites. Yeah, so what’s a bee swarm chart? So it is essentially a bunch of dots. They tend to, when you plot them out, it’s all the dots in your data, when you plot them out, generally because lots of distributions have some sort of mass in the middle, they tend to look like a swarm of bees, which is where the name comes from.
But you can think of it like, for anyone who has never seen one before, you could think of it like, instead of showing a bar chart where you’re showing, say, the mean or the average or the median of some value, you actually show all of your data, and that way, you can see, “Oh yeah, the mean of this variable is 50,” but you can see the distribution isn’t from 49 to 51. You've got a distribution that goes from one to 90 or something like that. And like every graph type, they work in some instances and not in others.
So for example, if I had a hundred dots, a bee swarm chart might make a lot of sense. If I had a hundred thousand dots, you’re just going to see like –
[0:18:53.1] LP: Mess, a magic eye.
[0:18:54.6] JS: Right, exactly. So it’s not really going to work. So I really like the bee swarm because you get to show more in your data. You actually get to show the variation more clearly than say, a bar chart or something.
[0:19:06.9] LP: A bar.
[0:19:07.5] JS: And, I think it helps engage people a little bit more. I don’t have any real research to back that up but I just feel like if I see a bee swarm of the 50 states in the US, I live in Virginia so I can go look for Virginia and I think people will engage more with graphs when they can kind of see themselves in the data and you don’t necessarily get that in a bar chart. It’s like, “Okay, there’s the bar, I don’t know where I am but okay.” So I’m a big fan of the bee swarm chart. You know, it’s one of my new favorites.
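To make the "all the dots in your data" idea concrete, here is one very simple way a bee swarm layout could be computed in Python. It is only a sketch: the binning rule and the `beeswarm_offsets` name are assumptions for illustration, and real charting tools such as Flourish use their own, more refined placement methods.

```python
from collections import defaultdict


def beeswarm_offsets(values, bin_width=1.0):
    """Return (value, offset) pairs; dots sharing a bin fan out sideways."""
    counts = defaultdict(int)
    points = []
    for value in sorted(values):
        bin_index = int(value // bin_width)
        n = counts[bin_index]  # dots already placed in this bin
        # Alternate sides around the center line: 0, +1, -1, +2, -2, ...
        offset = (n + 1) // 2 if n % 2 else -(n // 2)
        counts[bin_index] += 1
        points.append((value, offset))
    return points


# Invented sample: a mass in the middle with a point at each extreme.
sample = [1, 5, 5, 5, 5, 9]
for value, offset in beeswarm_offsets(sample):
    print(value, offset)
```

Each value keeps its true position on the data axis, and the offset only spreads overlapping dots apart, which is why a mass in the middle of a distribution bulges out like a swarm.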
[0:19:34.6] LP: That’s so cool. I’ll try to have links to all of these but you know, I’m looking at some of these and I’m thinking about the questions. I’m always thinking, if I’m going to use a chart, I want to know the question I’m trying to answer and that determines the chart, not the other way around. A bar is kind of like this totality. It’s a view of totality, it’s just, “This state is this value.”
But in looking at some of these swarms, I think you can ask deeper questions of, “Okay, I see where the general value is, however, there’s a lot of deviation here versus this state, maybe that’s a good place to look to see why it’s so much wider.”
[0:20:13.8] JS: Right and if you think about, because you mentioned earlier the business case, imagine if you have multiple retail outlets. So you have 50 retail outlets in your company and average revenue in some quarters was USD50,000. Well, that might mask the fact that some of them had USD100,000 and some of them had USD10,000 and the means miss that.
I mean, that’s sort of a core tenet of data visualization, like look at your data because that helps you see the variation. And you know, means and medians and even variances sometimes mask those interesting outliers or different findings but the chart type itself, the bee swarm really helps you see that and especially see those outliers.
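The retail example can be checked with a few lines of Python. The revenue figures below are invented so that the average comes out to USD50,000, matching the scenario Jon describes; they are not real data.

```python
from statistics import mean

# Hypothetical quarterly revenue (USD) for 50 outlets, invented for illustration.
revenues = [10_000] * 20 + [30_000] * 10 + [100_000] * 20

average = mean(revenues)
lowest, highest = min(revenues), max(revenues)

print(f"outlets: {len(revenues)}")
print(f"mean:    {average:,.0f}")
print(f"range:   {lowest:,} to {highest:,}")

# Count how many outlets actually sit at the mean.
at_mean = sum(1 for r in revenues if r == average)
print(f"outlets exactly at the mean: {at_mean}")
```

In this made-up data no outlet earns the average at all, so a single bar at USD50,000 would describe none of them, while a plot of all 50 dots would show the two clusters right away.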
[0:20:55.3] LP: And I think it could even be such an interesting storytelling scenario if you're presenting live and you’re prepared to walk through what a bee swarm is supposed to say.
[0:21:03.6] JS: Absolutely.
[0:21:05.2] LP: I love to start charts or stories with more aggregate views, kind of start them off with something they expect, something simple but then reveal that, “This is what happens when we look at the distribution of this,” and, “Look at what we’re seeing here in these places.” Well, that’s so neat. Now, we love yummy, edible charts like pies and donuts. Tell me about the waffle.
[0:21:29.7] JS: Oh, the waffle, yeah.
[0:21:32.1] LP: Breakfast food chart.
[0:21:33.8] JS: Breakfast food chart, yeah. So I will say that I define the waffle chart maybe a little bit differently than others.
[0:21:41.0] LP: Okay.
[0:21:41.8] JS: So think about what’s generally called a unit chart. So a unit chart might just have squares or circles and they’re just kind of arranged in some grid or stack or something like that. For me, I define a waffle chart as kind of a subset of that, which is, it’s literally a 10 by 10 grid.
[0:21:57.5] LP: Oh, okay.
[0:21:57.6] JS: So that each unit, each object, each shape is a percentage point. So I kind of define it fairly specifically and –
[0:22:05.2] LP: Only 10 by 10.
[0:22:06.4] JS: Only 10 by 10. So now, does it really matter what we call this thing? I don’t know.
[0:22:11.2] LP: It’s Jon’s way or the highway.
[0:22:12.1] JS: Right, right, exactly, yeah. So if you want to call it a unit chart, you can go ahead, be happy. I like the waffle chart as one of the alternatives to a pie chart. I think it is, in a lot of ways, more engaging just because the pie chart’s been around forever and we’ve seen it a million times, which is good and bad, right? I like that you can do a little bit more annotation and labeling on it. You have a little bit more flexibility in how you're going to do that and you can sort of show groups within groups.
So let’s say you have three groups and they’re 20%, 50%, and 30%, you can even show in that 20% group, you know, maybe there are 10 squares you want to highlight. So add a little border around those 10. So they’re literally just 10 by 10 grids. I use them as one of my sort of standard alternatives to the pie chart, and not that pie charts are bad, like there’s that whole side.
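Jon's 10 by 10 definition is easy to sketch in Python as plain text. The `waffle_grid` function and the letter symbols are made up for this illustration; the 20%, 50%, and 30% split is the one from his example.

```python
def waffle_grid(shares, symbols):
    """Build a 10x10 grid in which each cell stands for one percentage point."""
    if sum(shares) != 100:
        raise ValueError("shares must be whole percentages that sum to 100")
    cells = []
    for share, symbol in zip(shares, symbols):
        cells.extend([symbol] * share)  # one cell per percentage point
    # Cut the 100 cells into 10 rows of 10.
    return ["".join(cells[row * 10:(row + 1) * 10]) for row in range(10)]


grid = waffle_grid([20, 50, 30], ["A", "B", "C"])
print("\n".join(grid))
```

Because every cell is exactly one percentage point, highlighting a subgroup (say, 10 of the 20 "A" cells) is just a matter of styling those cells differently.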
[0:23:02.9] LP: What?
[0:23:04.5] JS: What? I have no idea. You know, there’s a whole side, like, we could, you know, we will need a much longer show to talk about that. Not that pie charts are bad but in some cases, pie charts just kind of look old just because people have used them for so long, and sometimes, like to your point from earlier on the “it depends” part, sometimes engagement is just a goal. You just need to get people over to your website, over to your –
[0:23:26.5] LP: Appeal is a thing.
[0:23:28.0] JS: Appeal is a thing. Right, exactly.
[0:23:29.5] LP: It’s a part of aesthetics.
[0:23:29.8] JS: Right, exactly, and maybe it is just to show, “Look how much of the total this part takes up,” and that’s important to show, right?
[0:23:38.9] LP: Right.
[0:23:39.4] JS: And so you’re not asking people to say, “Is this like 78% or 79%?” You just want to see, “Oh, it’s a lot.” Maybe just engaging people is the goal. I just like that chart as an alternative and it’s easy to make, right? Because it’s just squares. And it’s named after a waffle. I mean, anything where you can get these good breakfast foods.
[0:23:58.8] LP: I know, exactly.
[0:24:00.2] JS: Pie is a breakfast food, for those who are wondering. Pie is definitely a breakfast food.
[0:24:03.4] LP: Pie is definitely a breakfast food. Now, I understand. Even with the pie, you know, being a circle where we’re not the best tuned to determine area with circles. We’re not good at comparing them and I think this is a great alternative to understand what the composition is of a specific piece in comparison to the rest and probably even be a little better at understanding it.
All right, I love this one, I’m so glad you mentioned it. I wish more people would learn to use it, it’s the dumbbell dot plot.
[0:24:37.4] JS: Oh, yeah. So how can we explain this to folks?
[0:24:42.4] LP: It does need a bit of training.
[0:24:44.5] JS: It does. So people call it different things, dumbbells, dot plot, there are other words for it. I mean, you could think of it like two dots on a line just sitting next to each other with generally a line or an arrow that connects them. The way I approach this is the way many – no, I am going to say most, I think this is going to be true, I think the way most people would plot the following type of data is going to be using a paired bar chart.
So imagine you have two observations for multiple groups, so that might be countries, states, gender, whatever, right? So you’ve got value A and value B.
[0:25:24.2] LP: When you say paired, do you mean clustered bar?
[0:25:27.0] JS: Yeah, I mean bars next to each other.
[0:25:28.7] LP: Right, okay.
[0:25:29.4] JS: So paired or clustered, yeah.
[0:25:30.3] LP: I was going to ask you about that, so this is great.
[0:25:33.4] JS: Yeah, so these bars are sitting next to each other. So you’ve got the first value for the United States, the second value for the United States, space, first value for Canada, second value for Canada, and so on and so on. My instinct is that that’s the way most people would plot those data and the challenge there with that chart type is A, there’s a lot of ink on the page. So we got a lot of stuff that’s hard to add annotations, it’s just kind of heavy, sometimes kind of heavy but more importantly, we’re asking people to make a lot of comparisons simultaneously, right?
You gotta compare the level of the bars by their length or height, the difference between the two, and then those same comparisons across the group. So you gotta do within and across comparisons and it’s kind of hard to do.
So dumbbell dot or dumbbell chart or dot plot often makes that task a lot easier because you’ve placed these dots on the same row for each, in this case, country and you can see both the relative differences and then you can see the gap between them. It’s a good approach and I would just say for folks who are really interested in the tools piece, that chart is also pretty easy to make because you just have to recognize that it’s just a scatter plot, right?
[0:26:42.5] LP: Ah, okay.
[0:26:43.2] JS: Because the X dimension is your value and then the Y dimension is just some row holder, just a placeholder that you are putting in on a single row if you sort of have the dots next to each other if that’s your image of it. So again, back to our earlier conversation about philosophy, what can your tools do? How do they, man, I don’t want to use the word think because I’ll get into ChatGPT.
[0:27:04.5] LP: Oh boy.
[0:27:06.1] JS: Yeah, I know, right? What’s the mechanism that the tool works in? If you think about that sort of 2D XY space, you just need data in two dimensions and it doesn’t matter what the dimensions are, right? It doesn’t matter if it is a real number and an integer or a real value and something you’ve made up, as long as you have two dimensions, the tool is going to plot it and so that’s all you really need.
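Here is a small Python sketch of the "it's just a scatter plot" idea: the X coordinate carries the real value and the Y coordinate is only a row placeholder. The group names, values, and the `dumbbell_points` helper are hypothetical.

```python
# Hypothetical first/second values for three groups, invented for illustration.
data = {"Group A": (40, 55), "Group B": (62, 48), "Group C": (30, 30)}


def dumbbell_points(data):
    """Return scatter points and connecting segments for a dumbbell plot."""
    points, segments = [], []
    for row, (label, (first, second)) in enumerate(data.items()):
        # x is the real value; y is just the row index acting as a placeholder.
        points.append((first, row, label, "first"))
        points.append((second, row, label, "second"))
        # The line (or arrow) joining the two dots on the same row.
        segments.append(((first, row), (second, row)))
    return points, segments


points, segments = dumbbell_points(data)
for point in points:
    print(point)
```

Any tool that can draw a scatter plot and a line between two coordinates could render these points, which is the sense in which the chart is easy to build.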
[0:27:27.3] LP: That’s great. Yeah, the way I think of it is a list of categories is if it were a bar but I have two values for each category and I loved it for doing things like pre-post analysis of pages or yes-no survey data or demographics, male-female, things like that.
[0:27:46.6] JS: Yeah, which is a really good point because it just demonstrates that it can be used for so many different types of data. 2020 and 2023, male and female, yes and no. I mean, you can kind of use it for any data, which is a great aspect to it. I will say that my personal approach is if I use a dot plot for change over time, I generally add an arrow. So the line that connects the dots, I’ll do as an arrow.
[0:28:08.9] LP: Oh, okay. That’s a good tip.
[0:28:10.6] JS: If it’s like yes-no, I’ll just do a line because it is not a direction, right?
[0:28:15.9] LP: Oh, it’s a great idea.
[0:28:16.9] JS: Yeah.
[0:28:17.2] LP: I like that, it’s clever. All right, so we can check that one out, a dumbbell dot plot. All right, this one has a fun name and I actually came across this the first time during the last election. It’s a Voronoi if I am pronouncing that correctly.
[0:28:32.3] JS: Oh yeah, a Voronoi chart. Yeah, these are tricky. All right, so how are we going to explain this to folks who have never seen one before? Well, let me describe it maybe the way I describe it in the book.
[0:28:42.7] LP: Okay.
[0:28:43.4] JS: This might be a good way to do it. So imagine a city and imagine that there are 20 fire stations in that city around different areas of the city. What you can do with that city is you can look at each of those firehouses and you can split the city up into mini areas. So you just kind of cut up your city so that each one of those mini areas you just created has one firehouse in it.
If you create the Voronoi diagram correctly, the fire station in each one of those little areas is closer to every point in its section than any other firehouse is.
[0:29:21.4] LP: Okay.
[0:29:22.3] JS: So what it is often used for in urban planning and in the ecology literature is that you could say, “Okay, if there is a fire at 4th and Main, fire station A should go there, not fire station B, because they are closer.” That’s sort of like the classic example. The way I’ve seen them used now is more like an alternative to a pie chart because if you think about it, you take that city, just split it into multiple groups and you can sort of get this part-to-whole.
It’s just a different way. I mean, I don’t know if we are really good at discerning the quantities from that but that is how I see them used more often now. If you go deeper into it, like the one that I saw that was really cool and I remember that is not my field but on fire prevention and forests, like firefighting in forests and how they would like place different teams in the forest to fight the fire.
[0:30:23.5] LP: Oh, interesting.
[0:30:24.5] JS: Because they would like, “Okay if we position this team here, they’re going to go to this edge of the fire and this team over here goes to this edge of the fire.” If you just draw the geography, draw the map or the diagram correctly, you can figure out who should go where. So yeah, it’s a really interesting kind of chart type.
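The fire station description boils down to a nearest-site rule, which can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular tool draws the polygons: the station coordinates are invented, distance is straight-line, and the names `stations`, `nearest_station`, and `voronoi_cells` are hypothetical.

```python
import math

# Hypothetical fire stations on a flat city grid (coordinates are made up).
stations = {"A": (2.0, 3.0), "B": (8.0, 1.0), "C": (5.0, 7.0)}


def nearest_station(point, sites):
    """Return the name of the site closest to `point` (straight-line distance)."""
    return min(sites, key=lambda name: math.dist(point, sites[name]))


def voronoi_cells(sites, width, height):
    """Label every integer grid point with its nearest site.

    Grouping the labels gives a coarse, rasterized picture of the Voronoi cells.
    """
    cells = {name: [] for name in sites}
    for x in range(width):
        for y in range(height):
            cells[nearest_station((x, y), sites)].append((x, y))
    return cells


fire_location = (3.0, 4.0)  # stand-in for "4th and Main"
print("Send station:", nearest_station(fire_location, stations))

cells = voronoi_cells(stations, 10, 10)
print({name: len(grid_points) for name, grid_points in cells.items()})
```

Every grid point lands in exactly one cell, and the relative cell sizes are what the part-to-whole reading relies on.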
[0:30:39.1] LP: That helps me understand a bit better. The context I saw it in was during the 2020 election, maybe it was FiveThirtyEight, I’m not sure, it was almost like a decision tree that showed what would happen as each state was called.
[0:30:54.4] JS: Yes.
[0:30:55.7] LP: And all of the outcomes of the election on every scenario, every combination of every state going a particular way and then they would kind of fill in the state that would be just called, and then the remaining states were filtering a tap.
[0:31:11.8] JS: Were filtering, yeah.
[0:31:12.7] LP: I thought it was so fascinating. I was there for hours looking for certain outcomes.
[0:31:18.2] JS: Yeah, yeah.
[0:31:18.9] LP: And rooting for certain states.
[0:31:21.4] JS: Yeah, it’s a good chart. It’s harder to make obviously because you need these irregular polygons, which is harder to make. So I’ll give you one more that I think is really cool and hopefully, some of your listeners know of John Snow, not the Game of Thrones Jon Snow but the John Snow cholera map. So real quick, John Snow in the mid-1850s made this map of this area of London and tracked deaths from a cholera epidemic and it is a fairly famous example of early epidemiology.
Now, people call it the Snow Map but there is a second version that he created shortly after his first version. Well, let me backtrack for one second. So the issue that he uncovered was that there were different water pumps around this area of the city and he figured out that the water was contaminated and that was his big contribution to epidemiology. But what’s really interesting is that his later map actually drew this area around this particular region of the original map surrounding this one water pump that had broken and was infected with sewage.
So you can take what people call that map and divide it into a Voronoi diagram because each water pump becomes one of those positions, like one of those fire stations that we talked about earlier. So you could see people were closer to that one pump than other pumps as you draw this out. So what people call the Snow Map is actually the Snow Voronoi diagram, which is just kind of a cool dataviz history background kind of thing, yeah.
[0:32:57.3] LP: Easter egg?
[0:32:58.1] JS: Easter egg, yeah. Yeah, it’s an Easter egg, yeah.
[0:32:59.9] LP: That’s great. So obviously, it may not be the most practical for most cases but it’s fun I think for people to just see data visualized in different ways and tap the different ways we are able to understand the information in different forms. So we’ll get a little more practical, we’ll talk about a kind of chart that’s used a lot in everyday business and maybe has a more accurate alternative, which is the geographic map, I think cartograph.
[0:33:29.4] JS: Well, so the standard map would be like a choropleth map.
[0:33:32.9] LP: Oh, that’s what I meant, choropleth.
[0:33:34.1] JS: I am looking at a bunch of them on my screen right now.
[0:33:36.1] LP: All these words.
[0:33:36.9] JS: Yeah, I know, right? So yeah, if you see a map where the states or whatever are filled in with a color, that’s called the choropleth map. The challenge with the choropleth map is that the size of the geography doesn’t always correspond to the importance of the data value. So think about the map of the world, Russia is a huge country, like tremendously huge but pick any data set, the value for Russia may not be particularly important and so –
[0:34:10.4] LP: Right.
[0:34:11.1] JS: Median income is a great one, right? Luxembourg has the highest median income in the world but it’s a super, super, super tiny country and so you can’t even see it and so I spend most of a chapter in the book talking about all these other different approaches where the idea is, let’s resize the geographies so that they’re closer to the data value which has this effect of distorting the data so that Luxembourg no longer looks like Luxembourg because we made it bigger and Russia no longer looks like Russia because we’ve made it smaller. So tradeoffs, tensions, it depends, back to the kind of theme of our discussion, right?
[0:34:50.1] LP: Right. I mean, you could use a bar worldwide to just show that but you’re also going to lose the sense of spatial location, right? Are there questions or conclusions you can come to because of the spatial location? So one of the things I saw that I really liked were the hexagon or tile versions. So tell us about those.
[0:35:13.2] JS: So one way that you can make a map that does this tradeoff, you know, the tradeoff is that people can find themselves on a map but the geography may not match. One of the tradeoffs is to create what’s called a tile grid map or a hexagon or hex-grid map, where basically you take all of the units, all the geographies in your geography and make them the same shape, be them squares or be them hexagons.
I’m no cartographer. I’m sure cartographers would debate, and I’m sure they can talk forever about whether hexagons are a better fit than squares. When you do that, you have to start making some decisions about where you’re going to place the squares because now it’s arbitrary. So what you often see, and the US is a good example, most of the tile-grid maps I see, Florida is like off to the side and just below Georgia but we know that they are stacked on top of each other in real life.
But if you want to get that sort of curve to look like the US, because Florida kind of branches out there to the east, most people kind of push it over to the side there. So it is arbitrary how you structure it because it is an arbitrary shape but one of the advantages of that is you can start adding more data to your map. So if you want to add little line charts to each state, you can do that because now Texas and Rhode Island are the same size or the same shape, so you can sort of add more data, you can play around a little bit.
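As a rough sketch of what a tile grid layout is underneath, assuming an invented six-state excerpt (the row and column positions below are hypothetical, not taken from any published tile map): each geography gets one equally sized cell, and where to put each cell is a judgment call, such as pushing Florida off to the side.

```python
# Hypothetical excerpt of a tile grid layout: state -> (row, column).
# Positions are arbitrary choices, which is exactly the tradeoff described above.
layout = {
    "TN": (0, 1), "NC": (0, 2),
    "AL": (1, 0), "GA": (1, 1), "SC": (1, 2),
    "FL": (2, 2),  # pushed to the side rather than stacked directly under GA
}


def render(tile_layout):
    """Draw the layout as text; every state occupies one same-sized cell."""
    positions = list(tile_layout.values())
    if len(set(positions)) != len(positions):
        raise ValueError("two states share a tile")
    rows = max(r for r, _ in positions) + 1
    cols = max(c for _, c in positions) + 1
    grid = [["  "] * cols for _ in range(rows)]
    for state, (r, c) in tile_layout.items():
        grid[r][c] = state
    return "\n".join(" ".join(row).rstrip() for row in grid)


print(render(layout))
```

Because every cell is the same size, each one could hold the same small chart, which is the "add more data" advantage mentioned above.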
[0:36:37.4] LP: And you can use color or color intensity as the measure because now, the area, the size is not working with the color to create additional meaning.
[0:36:50.2] JS: I think it’s just a useful way to think about geography in a different way but I think it’s also worth just noting that people love maps. They love to see maps, they love to read maps but that doesn’t mean that every dataset that is geography should be a map. I think that’s just important to realize that just because it’s geographic data, why are you using a map?
And Lea, to your point earlier, are you telling a geographic story? “Oh, look at this thing that’s happening in the southern part of the country?” That’s an interesting geographic story. But if you are just saying, “I want you to compare these different values across this geography or this country or this world,” maybe a map isn’t the right way because if you look at a map where you’re putting countries into five bins, well now, in this one bin the US and the UK are the same color but maybe their values are very different and you can’t see that in the map.
So there are all these tradeoffs and it’s not that one is right and one is wrong, it is just a tradeoff, and back to “it depends,” which I hate giving that answer because people want an answer but it really does depend.
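The binning point can be made concrete with a small sketch. The values, the cut points, and the name `bin_index` are all assumptions chosen for illustration; they are not real statistics for any country.

```python
import bisect

# Hypothetical values and bin edges, chosen only to illustrate the point.
values = {"US": 61.0, "UK": 79.0, "Country X": 81.0}
bin_edges = [20, 40, 60, 80]  # four cut points give five bins, numbered 0 to 4


def bin_index(value, edges):
    """Return which bin a value falls in (0 is the lowest bin)."""
    return bisect.bisect_right(edges, value)


bins = {country: bin_index(value, bin_edges) for country, value in values.items()}
for country, value in values.items():
    print(f"{country}: value {value}, bin {bins[country]}")
```

In this made-up case the US and the UK differ by 18 yet share a bin and therefore a color, while the UK and Country X differ by only 2 and get different colors.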
[0:37:59.8] LP: Right, test and see.
[0:38:01.6] JS: Yeah, test and see. Try. I think that part of seeing is getting the feedback from folks. So you’re testing, you’re seeing, ask people what they thought, like, “Is this clear?” I think a lot of us make the thing, we put it out, we move on to the next task in our jobs and we don’t have time or patience or money to go in and ask people like, “Did that work for you?” but I think that is how you get better.
“Give me some feedback. Did I mess this up? Is this good, is it clear?” I think that is just a big part of the data visualizer’s toolkit. Data visualizer, I don’t know if that’s a thing like we got to come up with a word.
[0:38:41.0] LP: We just made it a thing.
[0:38:41.5] JS: We’ve gotta come up with a name for what we do, so yeah.
[0:38:44.9] LP: I like vizier, that’s what I like.
[0:38:47.0] JS: Vizier, yeah. Well then, we have to get a fancy hat.
[0:38:49.3] LP: Oh, yeah, you’re right. It’s too much. All right, so this one’s well-known but at least eight people told me how much they hate this chart type this week. So why do people hate scatter plots? What’s going on there and can they be redeemed?
[0:39:08.4] JS: Yeah, I don’t know. I think because they’re not instinctively intuitive but it is important to know that they’re not intuitive not because well we, as human beings, are unable to read them but because we just haven’t learned how to read them.
[0:39:20.9] LP: Right.
[0:39:21.8] JS: Human beings aren’t born knowing how to read a bar chart. We have to learn how to read a bar chart.
[0:39:25.8] LP: Right, that’s true.
[0:39:26.6] JS: I think the way to think about this is if you’re creating scatter plots and you’ve run into this problem where your managers or colleagues or audience like doesn’t get it, instead of having one label for each axis, so let’s do GDP again, say you have your country, your bubbles are countries, and on the horizontal axis is GDP. Instead of just placing GDP and hopefully everybody listening knows that that’s gross domestic product but I felt like I had to add that parenthetical.
[0:39:54.0] LP: I thought you said ChatGPT.
[0:39:55.2] JS: Right, right.
[0:39:56.0] LP: Totally.
[0:39:56.7] JS: Right. So you have GDP on that X-axis label, don’t be afraid to add maybe two other labels that are maybe to the right you say, “GDP is going up,” or, “GDP is higher,” and then maybe a little arrow, and then on the other side, “GDP is lower,” or, “GDP is declining” with an arrow that way. So you add this annotation layer or these labels that help explain to people how to read the graph instead of saying, “Maybe you’ve seen this graph before, maybe you haven’t, here it is, good luck, go figure it out,” you really provide more annotation, more instruction right within the graph.
[0:40:32.7] LP: I agree. I think we don’t train our audience enough in charts and annotate in a way to guide them through even step-by-step to do that. All right, so maybe it’s really not the actual charts. It is about the execution.
[0:40:46.4] JS: I think it is about us. I think this holds for a lot of things, you know, writing and presentations and all sorts of things. We forget that we have been working with our data or with our content for weeks or months or whatever and we forget that people haven’t seen it before and we just expect them to get it right away and more often than not, we need to hold their hand a little bit, right?
So writing is especially that way. I mean, I find a lot of people who write, especially in the academic world, they write as if I know what they’re talking about.
[0:41:15.5] LP: No clue, no clue.
[0:41:17.9] JS: Yeah.
[0:41:18.2] LP: All right, so our last chart question before the next segment.
[0:41:20.8] JS: Okay.
[0:41:21.1] LP: Is there a non-standard chart in your book that you would just love to see people learn how to use more, that you think is useful?
[0:41:28.7] JS: Ooh, that’s a good one. I mean, we’ve talked about a few of them, like I think the dot plot is one. The other one that I like that I use a lot that is related to the dot plot is a slope chart.
[0:41:39.1] LP: Oh, how did I miss that one?
[0:41:41.4] JS: Yeah, that’s kind of my go-to now. It kind of tries to address the same challenge as the dot plot and at its core is just a line chart but with two points rather than multiple ones. That’s really all that is at the end of the day but it is using the same kind of scenario as a dot plot where you have data for Canada for two years, data for the US for two years and you want to be able to make multiple comparisons quickly and simultaneously.
I think the slope chart is a really nice way to do that and it also can often be more compact than a paired bar chart or clustered bar chart. So I think the slope chart would be the other one that I – I do see a lot of people using it but I think that’s the one that I would like to see in particular to replace that paired bar chart.
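A slope chart's data can be sketched as a set of two-point lines. This assumes invented numbers and hypothetical helper names (`slope_segments`, `crossing_pairs`); the crossing check is an extra illustration of the kind of comparison the chart makes visible, not something described in the conversation.

```python
from itertools import combinations

# Hypothetical two-period data (illustrative numbers only).
data = {
    "Canada": (8.0, 9.5),
    "United States": (10.0, 14.0),
    "Mexico": (11.0, 6.0),
}


def slope_segments(values_by_category):
    """Each category becomes a two-point line: x is the period (0 or 1), y is the value."""
    return {
        category: [(0, start), (1, end)]
        for category, (start, end) in values_by_category.items()
    }


def crossing_pairs(values_by_category):
    """Find pairs whose order flips between periods, meaning their lines cross."""
    crossings = []
    for (a, (a0, a1)), (b, (b0, b1)) in combinations(values_by_category.items(), 2):
        if (a0 - b0) * (a1 - b1) < 0:
            crossings.append((a, b))
    return crossings


segments = slope_segments(data)
crossings = crossing_pairs(data)
for category, ((_, start), (_, end)) in segments.items():
    print(f"{category}: {start} to {end} (change {end - start:+.1f})")
print("Lines that cross:", crossings)
```

Levels, changes, and rank flips all come from the same two columns of numbers, which is why the chart can stand in for a paired bar chart.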
[0:42:27.6] LP: I’m so glad you mentioned it because I have a blog post about it on my blog, shameless plug.
[0:42:31.7] JS: Okay, shameless plug. Look at that.
[0:42:33.0] LP: And I agree, again –
[0:42:33.8] JS: And we didn’t even plan that.
[0:42:34.6] LP: I know.
[0:42:35.2] JS: That was totally organic, yeah.
[0:42:36.9] LP: Just complete neuro inter-reception there.
[0:42:38.6] JS: Mind-meld, yeah, right.
[0:42:39.9] LP: I love it also again for like pre-post but what I think it really lends itself well for is like building a story like a slide build over a series and you can put a certain number of the lines there that maybe people expect but then really highlight and distinguish a line.
[0:42:57.6] JS: Yeah, one step at a time, yeah, absolutely.
[0:42:59.2] LP: Right.
[0:42:59.8] JS: Yeah.
[0:43:00.1] LP: Cool stuff.
[BREAK]
[0:43:07.6] LP: All right, so we have entered our final wild-card question. Think very hard here and imagine this very plausible, realistic scenario: you are umpiring your son’s little league playoffs when suddenly, you trip and fall into a vortex that pulls you back to the moment you’re about to deliver your first presentation. Do you remember what you are presenting about and what advice would you give yourself?
[0:43:33.9] JS: Ooh, that is a good one. Okay, so a few caveats. I wouldn’t umpire my son’s game because that would be a conflict of interest.
[0:43:41.1] LP: Okay, makes sense.
[0:43:42.1] JS: So that’s the big, big caveat. First presentation, professional presentation, so not like when I was a kid?
[0:43:49.9] LP: No.
[0:43:50.7] JS: One of my first presentations was when I was doing grad school, my job market talk, which is you write your dissertation and you have to sort of plan this one talk that you are going to shop around. That’s your topic. And I gave it to the department, I did my graduate work at Syracuse University, and I did the talk and these were the days, Lea, not all of your listeners are going to be able to relate to this, but these were the days, remember these days where you had the plastic sheets and you’d put them on the projector.
[0:44:20.8] LP: The transparency in the overhead projector?
[0:44:22.4] JS: Yeah.
[0:44:22.7] LP: No, I don’t know what that is.
[0:44:25.5] JS: Oh my goodness. So for those who don’t know that this was a thing, back in the day, you didn’t have PowerPoint, you didn’t have projectors, you had to print your slides on these plastic sheets in a special printer where it would always smell like burning plastic and then you put these sheets on this projector that was like this big lamp and it would project up and then there would be a mirror and it will project onto the thing.
So anyway, that was the day and I had a laser pointer, which I rarely use anymore but I kind of lost it. I remember losing it on my body somewhere and I couldn’t find it and it turns out, I had maybe dropped it in my back pocket but the whole time I was trying to find it, I was stammering through the whole thing. I was like, “Um, so yeah, um…” and then afterward, my adviser was like, “Yeah, you need to cut back on the ah’s and um’s.”
I was like, “I just couldn’t find my laser pointer, that’s all it was, I was really prepared,” so I don’t know what the lesson learned there is, yeah.
[0:45:23.4] LP: Well, this is really relevant to today with our laser pointers.
[0:45:26.4] JS: So that was pretty bad, yeah.
[0:45:29.8] LP: Well, I just appreciate the trip down memory lane on that one. That method for sure is highly designed with neuroscience in mind, no question. And that fan, that droning fan.
[0:45:43.1] JS: Oh, the fan that would just burn and it would just get so hot and –
[0:45:45.8] LP: Just start to doze off and –
[0:45:47.6] JS: Oh, I remember I had a professor earlier in graduate school. His slides were blank transparencies and he would just write instead of a blackboard, he would just write and it’s like, “Wow, this is a thing.”
[0:46:00.6] LP: So trailblazing, wow.
[0:46:01.8] JS: Right? At the time, who knew? Yeah.
[0:46:04.9] LP: Well, that’s a good one. I’ll take away, just have your pointer, have your clicker, and then don’t let it slip back there.
[0:46:11.1] JS: Don’t lose it. Don’t lose it, tape your pockets closed if you need to. Don’t lose your clicker. It’s a good one.
[0:46:18.2] LP: I need a second. Oh, well, I cannot believe how fast this went by. That means we had a great time once again. So unfortunately our time has run out but please let the listeners know where they can keep up with you.
[0:46:33.0] JS: Yeah, absolutely. So my preferred social media is Twitter, so you can find me @jschwabish. I’m also on Instagram but it’s mostly just fun signs that I see outside and take pictures of them but mostly Twitter.
[0:46:44.6] LP: Oh, I’m following that.
[0:46:45.6] JS: Yeah.
[0:46:45.7] LP: It’s like my favorite thing.
[0:46:47.0] JS: Right? And then my website, of course, policyviz.com, and those are the main places they can get a hold of me and if they want to see some of my research, they can go to the Urban Institute, which is at urban.org, and they can find all the great stuff that we’re doing every day.
[0:47:01.6] LP: Very cool. Well again, we talked about Better Data Visualizations, which you absolutely must have in your bookshelf and all of the charts, books, links, and everything we’ve talked about will be on the show notes page for this episode. Jon, it’s always a pleasure. You didn’t disappoint and just keep writing books, you’ll keep coming back.
[0:47:22.5] JS: Okay.
[0:47:23.0] LP: We’ll keep jamming.
[0:47:23.8] JS: Thanks Lea, I appreciate it. Now, I have this whole new task on my list like, “Write new books.”
[0:47:28.7] LP: It’s the only reason, right? Well, thanks again.
[0:47:32.1] JS: Thank you.
[0:47:32.9] LP: I look forward to when our paths cross again.
[END OF INTERVIEW]
[0:47:43.0] LP: All right, I hope you enjoyed that interview. Man, I could have kept asking about super-secret chart types forever and you know, while I recommend keeping it basic for senior executive and decision-maker audiences and presentations, hopefully, these new exciting advanced chart types will whet your appetite for exploring novel ways of seeing numbers in interesting forms.
So to catch all of the links to the resources mentioned in this episode, visit the show notes page at leapica.com/084, and if you’d like to connect, don’t be shy and reach out to me on LinkedIn or Twitter and be sure to send a connection invite with a note mentioning the show. I love to meet my listeners and I respond to every message.
I’ll leave you with a little bit of presentation inspiration by our guest, Jon Schwabish. I located it in his final thoughts for his awesome book, Better Data Visualizations. I just really found it to be such a wonderful way of wrapping up this amazing work. He says, “I first became interested in data visualization after seeing much of my and my colleagues’ work go unnoticed and unused. I did not come to the field with a degree in design or computer science or data science and because I did it, I believe you can too. In fact, anyone can effectively communicate their data by thinking critically about their own work and the needs of their audience, readers, and users.”
I really can’t improve on that. It really says it all. You don’t have to be a rocket scientist to deliver data effectively. You just have to know where to go to learn the tools, the skills, and the mindsets and luckily, you are in the right place. That’s all for today. Stay well and namaste.
[END]