In this post, I will talk about time-series decomposition. According to this website and the author's book, "Time Series Decomposition is to decompose a time series into trend, seasonal, cyclical and irregular components" and "The trend component stands for long term trend, the seasonal component is seasonal variation, the cyclical component is repeated but non-periodic fluctuations, and the residuals are irregular component".
(After we learn time-series analysis, we will run this R code in Tableau. These posts are preparation for that, so we will go back to Tableau later.)
In the last post I talked about data prep, but here is my code again. I will use Sales, Discount and Profit from it.
Let's import the data.
For example, let's use the Sales variable. To decompose the data, we have to define the frequency. In this post we aggregate the daily data into weekly data, and Sales clearly follows an annual cycle, so the frequency is 52 (one year has about 52 weeks). Frequency means how many data points fall within one period (in this case, 1 year).
According to Stack Overflow, if we use an xts object, we need to take an additional step, shown on the 7th line.
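A minimal sketch of the weekly aggregation and the frequency setting, using base R only. The daily sales here are synthetic stand-ins (an assumption), and the names `daily`, `weekly` and `sales_ts` are hypothetical, not taken from my data-prep code.

```r
# Synthetic daily sales standing in for the real data (hypothetical values).
set.seed(1)
dates <- seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by = "day")
daily <- data.frame(
  date  = dates,
  sales = 100 + 20 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365) +
    rnorm(length(dates), sd = 5)
)

# Sum daily sales into 7-day bins counted from the first date.
daily$week <- as.integer(daily$date - min(daily$date)) %/% 7
weekly <- aggregate(sales ~ week, data = daily, FUN = sum)

# frequency = 52: about 52 weekly observations per yearly cycle.
sales_ts <- ts(weekly$sales, start = c(2012, 1), frequency = 52)
```

Because a year is slightly longer than 52 weeks, the cycle drifts a little over time; for a few years of data that is usually a tolerable approximation.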
Then, let's decompose the data.
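As a rough sketch of what the decomposition does, here is base R's `decompose()` applied to a synthetic weekly series (trend plus yearly cycle plus noise). I am assuming an additive model; `stl()` would be another option. The series `x` is made up for illustration.

```r
# Synthetic weekly series: linear trend + yearly cycle + noise (illustrative only).
set.seed(42)
idx <- seq_len(52 * 4)
x <- ts(200 + 0.5 * idx + 30 * sin(2 * pi * idx / 52) + rnorm(length(idx), sd = 5),
        start = c(2012, 1), frequency = 52)

# Classical additive decomposition: x = trend + seasonal + random.
dec <- decompose(x, type = "additive")

# plot(dec) draws the observed, trend, seasonal and random panels.

# Adding the components back together recovers the series
# (the trend is NA for the first and last half-year because of the moving average).
recomposed <- dec$trend + dec$seasonal + dec$random
```

`dec$figure` holds the 52 estimated weekly seasonal effects, which is handy when you want to compare the seasonal shape of two variables.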
The decomposition shows that:
1. Sales has strong seasonality.
2. In the long term, the total amount of sales is increasing.
3. From fall to winter, the residuals tend to grow. This may mean that some other effect is at work in that period.
Then, let's do modeling and prediction. I will use an ARIMA model, but I will skip the details (I haven't studied them yet; a reference is here).
The blue line shows the prediction. However, this prediction is clearly not as useful as I expected, simply because Sales is affected by many other factors, for example advertising or the discount rate.
By the way, the picture above (left) shows Monthly Sales data with a prediction. Based on the similarity, I assume that the dashboard uses single-variable time-series analysis.
In the future, let's talk about multivariable time-series analysis. I am studying it, but it will take time before I can write about it. This is my homework.
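A hedged sketch of the modeling step with base R's `arima()` and `predict()`. The model order below is an arbitrary choice for illustration, not a tuned model, and the series is synthetic; packages such as forecast can pick the order automatically, but I stick to the stats package here.

```r
# Synthetic weekly series with a yearly cycle (illustrative only).
set.seed(123)
idx <- seq_len(52 * 4)
x <- ts(200 + 30 * sin(2 * pi * idx / 52) + rnorm(length(idx), sd = 5),
        start = c(2012, 1), frequency = 52)

# Seasonal ARIMA: AR(1) on the series after one seasonal difference (lag 52).
# The order is an assumption chosen to keep the example small.
fit <- arima(x, order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 0), period = 52),
             method = "ML")

# Forecast one year (52 weeks) ahead with approximate 95% intervals.
fc <- predict(fit, n.ahead = 52)
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

# ts.plot(x, fc$pred, lower, upper, col = c("black", "blue", "grey", "grey"))
```

A single-variable model like this only sees the past values of the series itself, so it cannot react to anything outside it.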
Additional notes. Analysis of Profit.
Analysis of Discount (Mean of Discount).
It is interesting that:
1. Profit has been decreasing since early 2014, even though Sales is increasing and the Discount rate is decreasing. Why?
2. Either Discount Rate does not behave like time-series data, or the ARIMA model fails to make a prediction for it.
3. Discount has strong seasonality. Maybe there is a big sales event, like Amazon's?
The picture above shows the seasonal components of Sales and Discount. It seems that Sales and Discount are negatively correlated. How to analyse correlation in time-series data is also my homework.
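One possible starting point for that homework is the cross-correlation function `ccf()` in base R, applied to the two seasonal components. This is only a sketch on synthetic series that I built to move in opposite directions, so the negative value is there by construction; the names `sales` and `discount` are stand-ins for the real columns.

```r
# Two synthetic weekly series with opposite yearly cycles (illustrative only).
set.seed(7)
idx <- seq_len(52 * 4)
sales    <- ts(200 + 30 * sin(2 * pi * idx / 52) + rnorm(length(idx), sd = 5),
               frequency = 52)
discount <- ts(0.2 - 0.05 * sin(2 * pi * idx / 52) + rnorm(length(idx), sd = 0.01),
               frequency = 52)

# Extract the seasonal components as plain numeric vectors.
s_sales <- as.numeric(decompose(sales)$seasonal)
s_disc  <- as.numeric(decompose(discount)$seasonal)

# Cross-correlation at lags -26..26 weeks; lag 0 is the same-week correlation.
cc <- ccf(s_sales, s_disc, lag.max = 26, plot = FALSE)
r0 <- cc$acf[cc$lag == 0]
```

Two smooth seasonal curves will almost always look strongly correlated at some lag, so this says little about cause and effect; it is a first look, not a conclusion.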
We analysed the global superstore data in a different way. I hope you enjoyed this analysis. I enjoyed it.