(Edited 9/27: some of the data prep should be done in Tableau instead. Details are at the end of this post.)
Starting today, I'd like to write a Tableau Dashboard Training Series. This is the first post in the series. The pictures above show the dashboards I'm aiming for, and a reference for this series is here:
As the first part, I will talk about data preparation in this post. Starting with the next post, I'll walk through the data visualization step by step.
This data prep part is for some of my friends who want to study R. I'll skip the details, so if you have any questions, please let me know by Gmail (email@example.com), Facebook, or whatever works for you.
By the way, thank you for visiting my website. Visit data is being collected with Google Analytics.
My future work is to predict some Google Analytics metrics with R and to visualize the data with Tableau. I appreciate your help with this study.
The data used for this training is the well-known "Global Superstore" data. In this training, I'd like to set the following goals:
1. Create an executive dashboard that gives us an overview.
2. The dashboard shows customer details based on sales and profit.
3. It helps us answer questions about region, product, category, and so on. It should help us spot when something is wrong. The dashboard should be informative.
To create a dashboard that satisfies these goals, we have to do some data preparation. The steps are:
1. Extract the US market.
2. Calculate shipping days and a shipping status to see whether shipping is going okay.
3. Calculate sales and profit per customer.
4. Calculate profit margin.
I extract the US market for the sake of simplicity. In terms of customer segmentation, however, I believe it is better to segment customers based on geography. Steps 2 to 4 are calculations for business understanding.
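The four steps above can be sketched in base R roughly as follows. This is only a minimal sketch on a tiny made-up data frame, not the code used for the dashboard: the column names (`Market`, `Customer.ID`, `Order.Date`, `Ship.Date`, `Sales`, `Profit`), the sample values, and the 4-day threshold for the shipping status are all assumptions for illustration.

```r
# Toy stand-in for the Global Superstore data.
# Column names and values are assumed, not taken from the real file.
orders <- data.frame(
  Order.ID    = c("A1", "A2", "A3", "A4"),
  Market      = c("US", "US", "EU", "US"),
  Customer.ID = c("C1", "C2", "C3", "C1"),
  Order.Date  = as.Date(c("2014-01-03", "2014-01-05", "2014-01-06", "2014-02-01")),
  Ship.Date   = as.Date(c("2014-01-05", "2014-01-12", "2014-01-08", "2014-02-04")),
  Sales       = c(100, 200, 300, 50),
  Profit      = c(20, -10, 60, 10),
  stringsAsFactors = FALSE
)

# 1. Extract the US market.
us <- orders[orders$Market == "US", ]

# 2. Shipping days and a status flag.
#    The 4-day threshold is hypothetical; pick whatever fits the business rule.
us$Shipping.Days <- as.integer(difftime(us$Ship.Date, us$Order.Date, units = "days"))
us$Shipping.Status <- ifelse(us$Shipping.Days > 4, "Late", "OK")

# 3. Total sales and profit per customer, attached to every order row.
us$Customer.Sales  <- ave(us$Sales,  us$Customer.ID, FUN = sum)
us$Customer.Profit <- ave(us$Profit, us$Customer.ID, FUN = sum)

# 4. Profit margin of each order row.
us$Profit.Margin <- us$Profit / us$Sales

print(us)
```

`ave()` keeps one row per order and repeats the customer total on each of that customer's rows, which is convenient when the table is later loaded into Tableau as-is.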
Here is my code. The visualization starts in the next post.
Again, if you have any questions on this code, please let me know.
(9/27: the additional note starts here)
While making the dashboard in Tableau, I realized that some data prep (more precisely, some calculations) should be done in Tableau.
For example, if I want to calculate Profit Margin by State, the formula should look like this.
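Written out, and assuming that profit margin for a state means total profit divided by total sales over all orders in that state (the indices $s$ for a state and $i$ for an order row are notation introduced here), the formula would be:

```latex
\[
\text{Profit Margin}_{s}
  = \frac{\sum_{i \in s} \text{Profit}_{i}}{\sum_{i \in s} \text{Sales}_{i}}
\]
```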
The R code for this calculation can look like this.
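As a minimal base R sketch of the by-state calculation, assuming columns named `State`, `Sales` and `Profit` and using made-up rows rather than the real data:

```r
# Toy data; column names and values are assumed for illustration.
us <- data.frame(
  State  = c("California", "California", "Texas", "Texas"),
  City   = c("Los Angeles", "San Diego", "Austin", "Austin"),
  Sales  = c(100, 300, 200, 50),
  Profit = c(30, 10, -20, 10),
  stringsAsFactors = FALSE
)

# Sum sales and profit within each state, then take the ratio of the sums.
by_state <- aggregate(cbind(Sales, Profit) ~ State, data = us, FUN = sum)
by_state$Profit.Margin <- by_state$Profit / by_state$Sales

print(by_state)
```

The ratio is taken after summing, so the result is one margin per state rather than one per order row.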
However, if I want to calculate this by city, I have to create another column, and so on. This generates many columns and takes time, which is not good.
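To make the repetition concrete, here is a small sketch with a hypothetical helper, `margin_by()`, that computes the margin at one grouping level. The helper name, the column names, and the sample rows are all assumptions. Even with a helper, every new level (state, city, and so on) needs its own call and its own result to carry into Tableau.

```r
# Toy data; column names and values are assumed for illustration.
us <- data.frame(
  State  = c("California", "California", "Texas", "Texas"),
  City   = c("Los Angeles", "San Diego", "Austin", "Austin"),
  Sales  = c(100, 300, 200, 50),
  Profit = c(30, 10, -20, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: profit margin at a single grouping level.
margin_by <- function(data, level) {
  sums <- aggregate(data[c("Sales", "Profit")], by = data[level], FUN = sum)
  sums$Profit.Margin <- sums$Profit / sums$Sales
  sums
}

# One call, and one extra table, per level of detail.
by_state <- margin_by(us, "State")
by_city  <- margin_by(us, "City")

print(by_state)
print(by_city)
```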
Instead of calculating this in R, let's use a Tableau calculated field.
In conclusion, data prep is critical for the entire visualization process, but some calculations are better done in Tableau, because Tableau redoes the calculation automatically at each level (state, city, and so on).