Data Visualization

Most of us are used to consuming information in the form of tables, graphs and charts. In recent years, there has been increasing demand from customers, stakeholders and management for information in a visual form. Being “data visualization literate” is fast becoming a skill as basic and indispensable as being able to use a word processor or Excel.

So what are some of the rules of thumb for creating a good dashboard or a set of charts?

Before going into that, let’s cover a few facts:

Fact 1: Data Types. Data can be of different types: numbers, names, geospatial, etc. The most widely accepted categorization of data types is: nominal (categories, item names, etc.), ordinal (anything that can be ordered or compared, for example credit ratings such as AA, AAA, etc.) and quantitative (intervals, lengths, numerical values, ratios, etc.). (Stevens, 1946)

Fact 2: Ways of Visualising Data. There are many ways you can represent data: the position of points with respect to the x and y axes, colors, sizes, shapes, etc. The most comprehensive list was compiled by Bertin in 1967 and is now taken as the de facto best summary of the possible ways to visualize data: position, size, value, texture, color, orientation, shape.

Fact 3: Perceptual Accuracy. How accurately our eyes perceive information depends partly on the visual method used to encode it. The position of points with respect to the x and y axes is the most accurate encoding: we can easily tell if two dots are even slightly apart from each other or off a straight line. Lengths are also quite accurate: we can usually look at two lines and say with confidence that “these two lines have the same length” or “this line is 2/3 of that line”. Angles and slopes are perceived less accurately, and areas and volumes can be very deceiving. It has been shown that when comparing the areas or volumes of two shapes, people tend to underestimate the difference, i.e. they say things like “this circle is 5 times bigger than that circle” when in reality the first circle is 5*5=25 times bigger. We are good at comparing linear data, less good at comparing areas and even worse at comparing volumes. The least perceptually accurate encodings are colors, textures and density. These can be very subjective, especially if you take into account color blindness and other nuances.
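
To make the circle example concrete, here is a minimal Python sketch. The function names (circle_area, radius_for_value) are made up for illustration and do not come from any charting library. It checks that a circle with 5 times the radius has 25 times the area, and shows one common way to avoid the trap: size the circle so that its area, not its radius, is proportional to the value.

```python
import math


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


def radius_for_value(value, base_value=1.0, base_radius=1.0):
    """Radius that makes the circle's AREA proportional to `value`.

    A circle drawn for `base_value` gets `base_radius`; other values are
    scaled by the square root of the ratio so that areas stay proportional.
    """
    return base_radius * math.sqrt(value / base_value)


# Scaling the radius by 5 scales the area by 5 * 5 = 25.
area_ratio = circle_area(5.0) / circle_area(1.0)

# To show a value that is 25 times larger, the radius only needs to grow 5x.
radius_for_25 = radius_for_value(25.0)

print(round(area_ratio, 6), radius_for_25)
```

If the radius were scaled linearly with the value instead, a value 5 times larger would be drawn with 25 times the area, which exaggerates the difference on paper even before the viewer's own misjudgment comes into play.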

Fact 4: Expressiveness and Effectiveness. In addition to perceptual accuracy, two other factors are important when evaluating which method to use to represent data:

  • Expressiveness (show ALL data and ONLY the data) and

  • Effectiveness (use the most “viewer friendly” method of displaying the data).

The most expressive and effective way to represent a particular data set depends on which category the data belongs to: nominal, ordinal or quantitative. For example, when displaying an upward trend in numbers, a line graph (or a bar graph) is sufficient. There is no need for fancy box plots that show medians and quantiles (we are not asking for this extra information), just as there is no need for a pie chart that may convey the same information but in an unnecessarily confusing way. On the other hand, when showing a percentage split of internet users by state, a pie chart or a geo map will work well, whereas a line graph (which implies some sort of continuous trend) is neither as expressive nor as effective.

How To Build a Good Chart / Dashboard.

So now, let’s summarize the above facts. We have three categories of data (Fact 1), seven ways of representing data (Fact 2), with some of these ways being perceptually more accurate or less accurate (Fact 3) and with all of these ways offering varying degrees of expressiveness and effectiveness depending on which data category you apply them to (Fact 4).

Based on this knowledge, the best way to build an impactful data visualization would be as follows:

  1. List all your data fields (e.g. time, age, $, distance, state, etc.),

  2. Rank them in a descending order – from the most important field that you want to be able to display in your visualization down to the least important one,

  3. Starting from the most important field, pick the data visualization/representation method that is the most perceptually accurate as per Fact 3 (so that users can straightaway get an accurate picture of this important piece of data) and the most expressive and effective,

  4. Go to the second most important field and pick, from the remaining data visualization/representation methods, the next best method in terms of perceptual accuracy, expressiveness and effectiveness,

  5. Continue to work your way down.
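
The steps above amount to a greedy assignment, which can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the ordering of Bertin's seven methods in CHANNELS (only loosely based on Fact 3), the data types each method is marked as suitable for, and the example fields. Treat the table as a placeholder to be replaced with your own judgment, not as an established ranking.

```python
# ASSUMPTION: visual methods ordered from most to least perceptually accurate,
# each with the data types it is taken to suit. Both the order and the
# suitability sets are illustrative placeholders.
CHANNELS = [
    ("position", {"nominal", "ordinal", "quantitative"}),
    ("size", {"ordinal", "quantitative"}),
    ("orientation", {"ordinal", "quantitative"}),
    ("value", {"ordinal", "quantitative"}),
    ("color", {"nominal"}),
    ("texture", {"nominal"}),
    ("shape", {"nominal"}),
]


def assign_channels(fields_by_importance):
    """Greedily give each field the best remaining suitable method.

    `fields_by_importance` is a list of (field_name, data_type) pairs,
    already ranked from most to least important (steps 1 and 2).
    """
    remaining = list(CHANNELS)
    assignment = {}
    for name, data_type in fields_by_importance:
        for i, (channel, suitable_types) in enumerate(remaining):
            if data_type in suitable_types:
                assignment[name] = channel
                del remaining[i]  # each method is used at most once
                break
        else:
            raise ValueError(f"no remaining method suits field {name!r}")
    return assignment


# Hypothetical fields, ranked from most to least important.
fields = [("time", "quantitative"), ("revenue", "quantitative"), ("state", "nominal")]
assignment = assign_channels(fields)
print(assignment)
# {'time': 'position', 'revenue': 'size', 'state': 'color'}
```

The greedy order is what keeps the most important field on the most accurate method; less important fields simply take whatever suitable method is left.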

This way you end up with a chart that has a lot of information squeezed into it: lines, colors, shapes and textures, each representing a different field. At the same time, the chart is not “overcrowded”. Your most important piece of information is represented through the most accurate, expressive and effective method and thus jumps out at readers straightaway. Your information of medium importance is still conveyed relatively accurately, expressively and effectively. Your least important information is still there; it might not be expressed very accurately and it may take the user a few seconds to zoom in on it, but that is not too crucial.
