I am starting a new Tableau training project using publicly available datasets from Kaggle and data.world. The basic purpose of the training is to "practice the knowledge I have obtained through study materials and dashboard analysis experience".
Let me quickly explain why I decided to start this project. I have learned a fair amount about data-viz from articles, blogs and books, but I realized that I don't have much experience actually creating dashboards. That includes defining the purpose of the dashboard, who will use it, who will make decisions with it, what to visualize, and what decisions it could support.
It's time to challenge myself.
The dashboard is already done. You can see it on my Tableau Public page.
1. What data we will use for this training
2. Data Visualization Framework
Before we start, let's see what data we will use. This time we use Trending YouTube Video Statistics. For details, please visit the website; I recommend downloading the data and looking at it yourself.
In short, the data is about trending videos. On YouTube, at the top left, there is a "Trending" entry marked with a fire icon.
Next, let's set the purpose of the data-viz. Let's say that we are on the data analytics team at a company. One day, the marketing team sends us a message asking for help making their YouTube ads strategy more efficient.
To keep the situation simple, and because I'm not very familiar with the YouTube Ads system, let's say that we have to decide what kind of videos to put our ads on. In other words, we have to find good videos to advertise on.
There is another aspect as well. The marketing team is also looking for an influencer who will introduce our product to their viewers.
So, as the big picture, let's say that what we have to do is basically
1. to find a good video category to put ads on: one that many people watch and that is relevant to our product.
2. to find a good influencer to ask to introduce our product.
Before we start working with Tableau, I would like to build a framework for Data Visualization/Dashboard Creation.
I am a big fan of "CRISP-DM, the CRoss Industry Standard Process for Data Mining" model. In case you are not familiar with it, let's review it here.
(CRISP-DM: Copyright on KDnuggets)
As you can see, the first step is Business Understanding, and then, or at the same time, we do Data Understanding.
I believe that we have to define some critical components at this stage. I came up with the framework below, so let me introduce it.
I set four components that we should define before we start visualizing: Task Definition, Stakeholder/User Definition, Potential Decisions and KPI Definition. Let me explain what they are and why I believe they are important.
First of all, I believe that in a data visualization process we should not visualize data just because we want to see the data. That kind of looking belongs in the data exploration/data understanding process. I believe we need to think about the purpose of the visualization we are working on. What is the problem? What do we want to know? What do we want to do? Let's call these the "Task". To make the dashboard more meaningful and relevant to the people who will use or see it, I believe we should consider the Task first.
Then, we need to consider who the stakeholder is. In this case the stakeholder could be a marketing director. The question then is: what data would they want to see? This is an important question for better storytelling. Think about a presentation. You may have heard that a good presenter always considers the audience to make the story more relevant. The essence of this process is the same. Let's make the story we tell with our data more relevant.
The third component is defining the big picture of what to visualize. To achieve the Task, we also consider what approach we can take with the data we have. We should finish data understanding before defining this component.
The last step is to define what numbers we are going to show in order to satisfy 1. Task and 3. What to Identify. In other words, this is about deciding how we measure the answers to the third component. For example, to identify which category we should look at, we can use the total views and the number of videos, and let those numbers define what the best category is.
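To make the category KPIs concrete, here is a minimal Python sketch, separate from the Tableau work itself. The sample rows are made up for illustration, and the field names (`video_id`, `category`, `views`) and the helper names `category_kpis` and `rank_categories` are my own assumptions, not necessarily the dataset's actual columns. It also assumes that a video can appear on more than one trending day, so it keeps only the highest view count per video before summing.

```python
from collections import defaultdict

# Hypothetical sample rows: values are made up for illustration, and the
# field names are assumptions rather than the dataset's real schema.
rows = [
    {"video_id": "a1", "category": "Music", "views": 1_000_000},
    {"video_id": "a2", "category": "Music", "views": 3_000_000},
    {"video_id": "b1", "category": "Gaming", "views": 500_000},
    {"video_id": "c1", "category": "Sports", "views": 2_000_000},
    # Same video as the first row, seen again on a later trending day.
    {"video_id": "a1", "category": "Music", "views": 1_500_000},
]


def category_kpis(rows):
    """Return {category: {"total_views": int, "video_count": int}}."""
    # Keep the row with the highest view count per video, so a video
    # that trends on several days is counted only once.
    best = {}
    for row in rows:
        key = row["video_id"]
        if key not in best or row["views"] > best[key]["views"]:
            best[key] = row

    kpis = defaultdict(lambda: {"total_views": 0, "video_count": 0})
    for row in best.values():
        kpis[row["category"]]["total_views"] += row["views"]
        kpis[row["category"]]["video_count"] += 1
    return dict(kpis)


def rank_categories(kpis):
    """Sort categories by total views, largest first."""
    return sorted(kpis.items(), key=lambda kv: kv[1]["total_views"], reverse=True)


if __name__ == "__main__":
    for name, kpi in rank_categories(category_kpis(rows)):
        print(name, kpi["total_views"], kpi["video_count"])
```

Ranking by total views alone is just one possible choice. Looking at the number of videos next to it helps to tell a category with a few huge hits from one with many moderately popular videos.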
So this is the framework I made for better data visualization. As you may see, it follows CRISP-DM, and I hope I have explained how important business understanding is.
I feel that if we get lost about what to visualize, it may be because we did not start by defining the Task: what the visualization is for.
In the next post, we will start creating our dashboard.
I hope you enjoyed my framework. See you then.