Our lives involve more and more data, whether directly or indirectly. The following routine would have sounded odd a decade ago but is common today: you wake up and sip a cup of coffee while scrolling through a social media feed, confirm a dinner with friends on an instant messaging platform, then book a ride to the office through a mobile app. All of this can happen before you even walk out of your door in the morning, and millions of other people may be going through the same steps every hour. This is just an example from one industry, technology. Companies in many other industries, such as retail, manufacturing, hospitality and transportation, record and analyze their customers’ data as well. And even if you do not work for a company, you can use visualization for personal purposes, such as summarizing your expenditure to identify spending trends or finding out what types of food you have eaten over the years (given, of course, that you have kept records in the first place).
Now imagine yourself looking at that data. You would probably ask, ‘How do I gain more understanding of the data on hand?’ If going straight to the raw data is your answer, you will likely waste time and energy, because it is very difficult to make sense of raw data by reading it line by line. Data visualization is a much more efficient alternative. The term refers to turning data into visuals, which offer a more understandable format that can be used to communicate with diverse audiences. In this article, we introduce a range of visualization choices for you to choose from.
Pause. Think. Select.
To design your visuals, it is important to know the audience and the objectives of your presentation, and it is worth taking the time to work out an honest answer. Do you want a conversation starter that leads to a critical discussion among your colleagues? Do you want to justify your conclusion and win approval from the board of directors? Or do you want to show your personal food consumption to your doctor so that they can better decide which course of treatment to prescribe? Different objectives and audiences require different storyboards, visuals and terms to achieve the expected results. Your manager might not be interested in details and would be pleased with a high-level summary. In contrast, operational staff might prefer to drill down into the details of your presentation and spend more time discussing your findings. Separate sessions and different visualizations for these two groups of stakeholders may be required to maximize the effectiveness of your visuals. If that is not possible, deliberately selecting appropriate visuals will still increase your chances of reaching your goals.
Your visual, your choice.
Visualizing data is like finding the pieces of a jigsaw puzzle and connecting them into a picture. Place one piece in the wrong position and you will have a hard time figuring out what the picture looks like. If that happens with your visualization, your visuals will not be able to do their job as a tool for decision making, let alone support the execution of a strategic move, and your effort will be gone with the wind. To avoid that, consider the following choices, along with their examples, to help you decide which one suits your situation. Note that our examples use Facebook page data from August 2020 for three food delivery companies in Thailand: GrabTH, LINE Man and Gojek.
1. Simple Text - “Just one or two figures to share? Give this a go!”
Simple text minimizes the processing steps for your audience, as it shows right away what you want to emphasize. A figure or percentage can be big and bold, with a brief description nearby. This method is typically used in dashboard design so that users can easily see whether there is a problem in their business, e.g. negative growth. The user can then drill down further to find the root causes of such problems.
Example: Average engagement percentage of three Facebook pages (see figure 1). This lets your audience sense the overall growth of the three companies, namely GrabTH, LINE Man and Gojek. This example is likely to work if your audience already has industry knowledge or a benchmark in mind. Otherwise, the number will trigger questions and leave them wondering how they should interpret it.
Figure 1. A simple text showing an average daily growth
2. Table and heatmap - “Multiple measurements or different audiences involved? Try this!”
A table can be used when you want to communicate with audiences whose interests are diverse. You can put multiple measurements or categories into the table, and your audience can navigate to the rows and columns relevant to them. While the titles of your rows and columns are categorical data, the cells can be either categorical or numerical. However, bear in mind that a table can steal your audience’s attention during a live presentation, because they have to concentrate on the content while reading through it. Eventually, they will lose focus on the speaker.
A heatmap is similar to a table, with the addition of visual cues, typically color, that indicate the magnitude of the numbers. These cues help your audience quickly identify where to focus.
Example: Number of posts on GrabTH, LINE Man and Gojek by time of day in August 2020, as of 24 September 2020 (figure 2 is the table and figure 3 is the heatmap). You may notice that, despite the identical content, your focus is led by the colors of the heatmap, and you know right away that Monday morning has the highest number of posts. This can be mapped against the engagement percentage at the corresponding time and the content of the posts to analyze the effectiveness of the timing.
Figure 2. A table showing the number of posts on GrabTH, LINE Man and Gojek in August 2020
Figure 3. A heatmap showing the number of posts on GrabTH, LINE Man and Gojek in August 2020
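The counting behind a table or heatmap like the ones above can be sketched in a few lines of Python. This is a minimal sketch, not the method used to produce the figures: the timestamps are made up for illustration, and the time-of-day buckets and shade characters are our own assumptions.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical post timestamps (illustration only, not the real page data).
posts = [
    "2020-08-03 09:15", "2020-08-03 10:40", "2020-08-03 11:05",
    "2020-08-04 18:30", "2020-08-05 09:00", "2020-08-07 19:45",
]

def time_of_day(hour):
    """Assumed buckets: morning 6-11, afternoon 12-17, evening otherwise."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "evening"

def count_posts(timestamps):
    """Count posts per (weekday, time-of-day) cell, i.e. the table's cells."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        counts[(WEEKDAYS[dt.weekday()], time_of_day(dt.hour))] += 1
    return counts

def shade(value, max_value, levels=" .:#"):
    """Map a count to a shade character; a denser character means more posts."""
    if max_value == 0:
        return levels[0]
    return levels[round(value / max_value * (len(levels) - 1))]

counts = count_posts(posts)
busiest = max(counts, key=counts.get)
print(busiest, counts[busiest], shade(counts[busiest], counts[busiest]))
```

The table is simply the `counts` mapping laid out in rows and columns; the heatmap adds the `shade` step, which is what draws the eye to the busiest cell.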
3.1 Scatter plot - “Want to show the relationship of two numerical variables? Check this out!”
Suppose you have two sets of quantitative data and want to illustrate how they are related regardless of time, i.e. how one variable changes as the other changes. A scatter plot can portray that. It comprises two axes, and each data point, or coordinate, is represented by a dot. The pattern in which the dots are dispersed indicates how the two variables are correlated.
Example: Number of likes and comments on each post of GrabTH, LINE Man and Gojek in August 2020. From the graph, you may assume that there is no relation between the number of likes and the number of comments. Nonetheless, you might consider using statistical methods to test whether that assumption holds.
Figure 4. A scatter plot showing the number of likes and comments on each post on the GrabTH, LINE Man and Gojek Facebook pages in August 2020
Figure 5. A zoomed-in view of the scatter plot in figure 4
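One common statistical check for a scatter plot is the Pearson correlation coefficient, which ranges from -1 to 1 and measures how close the dots are to a straight line. The sketch below is only one possible method, and the like and comment counts are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical likes and comments per post (illustration only).
likes = [120, 340, 90, 560, 210]
comments = [15, 8, 30, 12, 22]
print(round(pearson_r(likes, comments), 2))
```

A value near 0 suggests no linear relation, but it does not rule out other kinds of relation, so it is worth reading the coefficient together with the plot.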
3.2 Lines
3.2.1 Line graph - “A growth over time to show? Line graph is here for you.”
A line graph can be used to show continuous data, especially data over dates or times. The horizontal axis is intuitive for people to interpret as time moving continuously forward, while the vertical axis shows a measurement of your choice. A steeper slope reflects faster growth, and a flatter slope reflects slower growth. The graph shows both the overall trend and the movement within the period. You may use it to compare the performance of one attribute against its past or to compare the performance of different attributes over the same period.
Example: Percentage increase in page members of each company since 1 August 2020. Looking at this graph can help the audience identify where further investigation is required, e.g. during a less steep period.
Figure 6. A line graph showing the percentage increase in Facebook page members of GrabTH, LINE Man and Gojek during August 2020
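Plotting percentage increase rather than raw member counts puts pages of very different sizes on a comparable scale. A minimal sketch of that calculation follows; the member counts are hypothetical, and the first value of each series is assumed to be the 1 August baseline.

```python
def percent_increase_since_start(series):
    """Return each value's percentage change relative to the first value."""
    if not series:
        return []
    base = series[0]
    if base == 0:
        raise ValueError("baseline must be non-zero")
    return [(value - base) / base * 100 for value in series]

# Hypothetical daily member counts for two pages of very different size.
page_a = [1000, 1010, 1030, 1030]
page_b = [200, 210, 212, 220]

growth_a = percent_increase_since_start(page_a)
growth_b = percent_increase_since_start(page_b)
print([round(g, 1) for g in growth_a])
print([round(g, 1) for g in growth_b])
```

In this made-up data, page B gains fewer members in absolute terms but grows faster in percentage terms, which is exactly the comparison a percentage-based line graph brings out.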
3.3 Bars
3.3.1 Vertical bar and horizontal bar - “Looking for something to show many categorical data and its frequency? Why not try these bars!”
If you are asked what types of graphs you usually see, bars are surely among your answers. Because they are so common, your audience feels familiar with them, which reduces processing time. With bars, you can compare different categories across one, two or more series. Nevertheless, be careful when you have many categories or series to compare. Too many bars can overwhelm your audience, and your key message will be lost.
Example: Total number of comments on each Facebook page in August 2020. The pages with the most and the fewest comments can be quickly identified by the height or length of the bars. Yet if some bars have almost the same height, you may need to add data labels to help your audience read the numbers.
Figure 7. A vertical bar chart showing the total number of comments in August 2020 for GrabTH, LINE Man and Gojek
Figure 8. A horizontal bar chart showing the total number of comments in August 2020 for GrabTH, LINE Man and Gojek
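Two of the points above, sorting the bars and labeling near-equal bars, can be illustrated with a small text-based bar chart. This is a sketch only: the page names and totals are hypothetical, and a real chart would normally be drawn with a plotting tool rather than with characters.

```python
def text_bar_chart(totals, width=30):
    """Render a horizontal bar chart as text lines, largest category first."""
    if not totals:
        return []
    largest = max(totals.values())
    name_width = max(len(name) for name in totals)
    lines = []
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for name, value in ordered:
        # Bar length is proportional to the value; the label gives the exact number.
        bar = "#" * round(value / largest * width) if largest else ""
        lines.append(f"{name.ljust(name_width)} | {bar} {value}")
    return lines

# Hypothetical comment totals (illustration only).
totals = {"Page A": 4200, "Page B": 4350, "Page C": 1800}
lines = text_bar_chart(totals)
for line in lines:
    print(line)
```

Pages A and B end up with bars of almost the same length, so the numeric label at the end of each bar is what lets the reader tell them apart.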
3.3.2 Stacked vertical bar and stacked horizontal bar - “Can I show categories with their components? Of course!”
These two charts are rotations of each other and are similar to the bars mentioned earlier. The difference is that the components of each category are shown within its bar. Stacked bars can be used with either absolute values or percentages. One thing to keep in mind when adopting stacked bars is the difficulty of comparing components across categories, especially if each component contributes about the same proportion. In that case, you may want to reconsider whether the information is really worth presenting.
Example: Comparison of the different reactions on the Facebook pages of the three companies in August 2020. It can be seen that ‘Like’ is the most common reaction across the three Facebook pages. You might also notice how much ‘Like’ contributes to the overall reactions. Such a significantly high contribution can reduce the visibility of the other categories.
Figure 9. A 100% stacked vertical bar chart showing the percentage of reactions on the Facebook pages of GrabTH, LINE Man and Gojek in August 2020
Figure 10. A 100% stacked horizontal bar chart showing the percentage of reactions on the Facebook pages of GrabTH, LINE Man and Gojek in August 2020
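A 100% stacked bar is built by converting each component’s count into its share of the bar’s total. The following is a minimal sketch with made-up reaction counts; it also shows how one dominant component squeezes the rest.

```python
def to_percent_shares(counts):
    """Convert raw counts per component into percentages that sum to 100."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute shares of an empty total")
    return {name: value / total * 100 for name, value in counts.items()}

# Hypothetical reaction counts for one page (illustration only).
reactions = {"Like": 900, "Love": 60, "Haha": 25, "Wow": 15}
shares = to_percent_shares(reactions)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

With these numbers, ‘Like’ takes up 90% of the bar and the remaining reactions share the last 10%, which is why they become hard to see.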
3.3.3 Waterfall - “Illustrate changes happening between beginning and ending value”
This chart can be used to show the incremental changes in a value that are caused by various factors. Each bar in a waterfall chart shows the magnitude and direction of a change, starting from where the previous bar ended. Typical cases where you might find this chart are a change in a company’s profit (affected by revenue and costs) or a change in a department’s headcount.
Example: Income statement of Grab Taxi (Thailand). The height of each bar shows the impact of that element, while the colors indicate direction. The last grey bar is the final amount. For an income statement, the final amount can be a net profit or loss. Note that this example does not include income tax expense because the information was not available. Hence, the final amount shown is the loss before income tax expense.
Figure 11. A waterfall chart showing each element of the 2019 income statement of Grab Taxi (Thailand)
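The position of each waterfall bar follows from a running total: every bar starts where the previous one ended. The sketch below computes the bottom, height and direction of each bar. The labels and amounts are hypothetical and are not Grab Taxi (Thailand)’s actual figures.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, height, direction) for each bar of a waterfall chart.

    `changes` is a list of (label, delta) pairs applied in order to `start`.
    """
    bars = [("Start", min(0, start), abs(start), "total")]
    running = start
    for label, delta in changes:
        new_running = running + delta
        direction = "increase" if delta >= 0 else "decrease"
        # The bar spans from the previous running total to the new one.
        bars.append((label, min(running, new_running), abs(delta), direction))
        running = new_running
    bars.append(("End", min(0, running), abs(running), "total"))
    return bars

# Hypothetical amounts (illustration only).
bars = waterfall_bars(100, [("Cost of sales", -60),
                            ("Other income", 10),
                            ("Expenses", -70)])
for bar in bars:
    print(bar)
```

Here the running total goes from 100 to 40, up to 50, and then down to -20, so the final bar sits below zero and represents a loss.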
4. Area
4.1 Square area - “Let this be your choice when you compare data with distinguishable size or frequency.”
As its name suggests, this chart uses area to represent the magnitude of the data. However, if each category has about the same magnitude, your audience will have a hard time judging the impact of each square, which makes your visualization less captivating.
Example: Percentage of reactions on the GrabTH, LINE Man and Gojek Facebook pages in August 2020. Judging by the size of the squares, ‘Love’ is clearly the most clicked reaction across the three Facebook pages, while ‘Sad’ has the smallest square and is therefore the least clicked reaction.
Figure 12. A square area chart showing the percentage of reactions on the GrabTH, LINE Man and Gojek Facebook pages in August 2020
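Because the audience reads area rather than side length, each square’s side should scale with the square root of its value. The sketch below shows the calculation; the values and the `largest_side` parameter are assumptions for illustration, and values are assumed to be non-negative.

```python
from math import sqrt

def square_sides(values, largest_side=10.0):
    """Return a side length per category so that square AREA is proportional to value."""
    largest = max(values.values())
    if largest <= 0:
        raise ValueError("need at least one positive value")
    return {name: largest_side * sqrt(value / largest)
            for name, value in values.items()}

# Hypothetical percentages (illustration only).
shares = {"Love": 40, "Haha": 10}
sides = square_sides(shares)
print(sides)
```

With these numbers, ‘Love’ is four times ‘Haha’, so its side is only twice as long, while its area is four times as large. Scaling the side directly with the value would exaggerate the difference.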
More options to be used with caution
1. Pie chart
The segments of a pie chart are the main element showing the magnitude of each category. This feature is a double-edged sword: while a pie chart is intuitive to interpret, it is difficult to compare segments precisely based on angles and areas. The issue is especially prominent when two or more categories are about the same size. Bars, by contrast, can be sorted in ascending or descending order, which spares your audience from sorting the data in their heads. You may substitute a bar chart for a pie chart, as it serves the same purpose more intuitively.
2. Donut chart
The same problem appears in a donut chart. This time, instead of angles and areas, viewers have to use arc length to guess how big the proportion of each category is. This is not as straightforward as other cues, e.g. color, height or length. Consider replacing a donut chart with a bar chart and see whether your original key message is still preserved.
3. 3D chart
Featuring 3D in your graph might add a futuristic look to your visuals, but it comes with a trade-off. The additional components of 3D, such as shadows and gridlines, increase the processing steps for viewers. Altogether, 3D drowns out the key message of the graph, and eventually the graph fails to convey the important information to viewers. When it comes to data visualization, simplicity is best.
4. Secondary y-axis
It is understandable that sometimes an all-in-one graph is what you aim for. However, if you plan to achieve that by introducing a secondary y-axis, you might want to reconsider. Two y-axes can affect the interpretation of your graph, since you are showing two different units or two different axis intervals on a single graph. In addition, viewers are likely to assume that the two series are somehow related and might even look for a correlation between them. Alternatively, you may present the different units on completely separate graphs to ensure that the key points of both are illustrated without distortion.
Example: In figure 13, the shares of ‘Haha’ and ‘Wow’ are 17% and 13%, respectively. The way the pie chart and the donut chart portray them makes the difference quite hard to see. In figure 14, combining the number of comments and the number of likes in the same chart requires a secondary axis because the magnitudes of the two data sets differ significantly. This distorts the message, as the audience might think that the number of comments in August is higher than the number of likes, which is not true.
Figure 13. A pie chart in 3D and a donut chart showing the percentage of reactions on the GrabTH, LINE Man and Gojek Facebook pages in August 2020
Figure 14. A bar chart and line chart with a secondary axis showing the number of comments and likes on the GrabFood Facebook page during April - August 2020
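The distortion from a secondary axis can be shown with simple arithmetic: the height at which a value is drawn depends on the maximum of its own axis. The numbers and axis limits below are hypothetical and are not taken from figure 14.

```python
def drawn_height(value, axis_max):
    """Fraction of the plot height a value occupies on an axis from 0 to axis_max."""
    if axis_max <= 0:
        raise ValueError("axis_max must be positive")
    return value / axis_max

# Hypothetical values: likes far outnumber comments (illustration only).
likes, comments = 5000, 400

# Assumed axis limits: each series has its own independently scaled axis.
likes_axis_max, comments_axis_max = 10000, 500

likes_height = drawn_height(likes, likes_axis_max)
comments_height = drawn_height(comments, comments_axis_max)
print(likes_height, comments_height)

# On a single shared axis, the true ordering is restored.
shared_axis_max = 10000
print(drawn_height(likes, shared_axis_max), drawn_height(comments, shared_axis_max))
```

With two axes, the comments series is drawn higher than the likes series even though there are far fewer comments. On a shared axis the ordering is correct, but the smaller series is nearly flat, which is why separate graphs are often the cleaner option.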
Conclusion
We cannot deny that our lives are increasingly surrounded by data. Those who can grasp and utilize existing data will take their companies or their personal lives to the next level; as the saying goes, data is power. Data visualization is a crucial method for understanding and analyzing data. Before choosing which visuals to go with, you should first understand your audience and the information you want to present. Then you can set clear objectives and design your presentation accordingly.
Knaflic, C. N. (2015). Storytelling with data: A data visualization guide for business professionals. Hoboken, NJ: John Wiley & Sons.
(2020). กรมพัฒนาธุรกิจการค้า : Department of Business Development. Retrieved September 28, 2020, from https://www.dbd.go.th/
GmbH, U. (2020). Analyze and improve fan pages - Fanpage Karma. Retrieved September 28, 2020, from https://www.fanpagekarma.com/