When Is Data Visualization Misleading?

Believe it or not, the majority of our decisions are based on graphs, ratios, tables, and other statistical visualizations. But what if those resources are wrong? There are many examples of bad decisions based on misleading charts, on topics ranging from the number of abortions and cancer screenings to student graduation rates and environmental change. It is sometimes hard to tell whether a chart explains the data well, and it is even harder when it comes to big data.

[Figure: Untitled 1.jpg]

The first and most basic rule for any accurate visualization is correct scaling and labeling of the axes. Beyond that, check that the data is plotted correctly, especially when more than one scale is shown on the Y-axis. When time is shown on the X-axis, consider what period of time should be included in the graph. Are a couple of months a good indicator of a company's stock price? Take a look at the picture above: why are these graphs misleading? You can check Stephanie Glen's video for more examples of misleading graphs of these types.
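
To make the scaling point concrete, here is a minimal sketch of how much a truncated Y-axis can exaggerate a difference between two bars. The function name `apparent_ratio` and the two values are hypothetical, chosen only for illustration; they do not come from the graphs pictured above.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the Y-axis starts at `baseline` instead of zero."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical values, for illustration only.
before, after = 50.0, 55.0

true_ratio = apparent_ratio(before, after)              # Y-axis starts at 0
truncated_ratio = apparent_ratio(before, after, 45.0)   # Y-axis starts at 45

print(f"ratio with axis from 0:  {true_ratio:.2f}")
print(f"ratio with axis from 45: {truncated_ratio:.2f}")
```

With these assumed numbers, a 10% increase is drawn as a bar twice as tall once the axis starts at 45. Comparing the two ratios is a quick check on whether an axis choice distorts the message.
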
[Figure: Capture2.JPG]

Choosing the best trend line for the data is essential. This picture shows the misunderstanding that comes from choosing an inappropriate trend line: the linear trend line shows a decrease in the quantity on the Y-axis, while the polynomial trend line shows an increase. Function approximation can help find the best trend for data with high fluctuation. Photo Credit
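
The disagreement between a linear and a polynomial trend line can be reproduced with a small least-squares fit. This is only a sketch under stated assumptions: the series below is made up (it is not the data in the picture), and `polyfit` and `slope_at` are hypothetical helper names written in plain Python rather than taken from any library.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit.
    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def slope_at(coeffs, x):
    """Derivative of the fitted polynomial at x."""
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# Hypothetical series that falls and then recovers (illustration only).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [10, 6, 3, 2, 3, 5, 8]

linear = polyfit(xs, ys, 1)
quadratic = polyfit(xs, ys, 2)

print("linear trend slope:", round(linear[1], 3))
print("quadratic trend slope at last point:", round(slope_at(quadratic, xs[-1]), 3))
```

On this assumed series the straight line slopes downward overall, while the quadratic fit is rising at the most recent point, so the two trend lines tell opposite stories about the same data.
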

[Figure: Untitled4.png, pictures a and b]

Ratios and dimensionless numbers often give better insight in data visualization. David McCandless explains this well in his TED talk. Picture a shows the top 5 countries by total number of soldiers, while picture b shows the top 5 countries by the ratio of soldiers to population (number of soldiers / population of the country). Which country has the biggest army: China or North Korea?
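
The effect of switching from totals to ratios can be sketched in a few lines. The country names and figures below are hypothetical placeholders, not real statistics and not the numbers from the talk; `rank_by` is an illustrative helper.

```python
# Hypothetical figures for illustration only; not real country data.
countries = {
    "Country A": {"soldiers": 2_000_000, "population": 1_000_000_000},
    "Country B": {"soldiers": 1_000_000, "population": 25_000_000},
    "Country C": {"soldiers": 1_500_000, "population": 300_000_000},
}


def rank_by(data, key):
    """Return names sorted from highest to lowest by key(record)."""
    return sorted(data, key=lambda name: key(data[name]), reverse=True)


by_total = rank_by(countries, lambda c: c["soldiers"])
by_ratio = rank_by(countries, lambda c: c["soldiers"] / c["population"])

print("ranked by total soldiers:      ", by_total)
print("ranked by soldiers per person: ", by_ratio)
```

In this made-up example the country with the largest army in absolute terms drops to last place once the count is divided by population, which is the same reversal the two pictures illustrate.
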

As I mentioned above, it gets even harder to be sure of a visualization's accuracy when it comes to big data. The source of information in big data visualization is critical. In other words, a big data visualization needs to cover all 4 V's of big data (commonly listed as volume, velocity, variety, and veracity) to make sure we have considered the whole data. Unlike data sampling methods, which use a subset of the data within a population, the beauty of big data comes from using the whole data source. What are other misleading factors in big data visualization?

Author: Amin Sabzehzar

MBA student, Mechanical Engineer, University of Nevada, Reno
