Initial Preparation of the Data for Time Series Analysis

By: Mayukh, Jigsaw Academy Faculty

Following up on my earlier article, Introduction to Time Series Analysis, where I discussed the different components of a Time Series, this post explains why it is important to understand how those components relate to one another before we deal with each of them separately.

In classical Time Series analysis, it is commonly assumed that the components are related multiplicatively.

If ‘Y’ denotes the value of the variable at time ‘t’ and ‘T’, ‘C’, ‘S’ and ‘I’ represent trend, cyclical fluctuations, seasonal variations and irregular variations respectively, then:

Y = T * C * S * I

 

Alternatively, an additive relationship among the components may be assumed:

Y = T + C + S + I

The form of these models shouldn’t mislead anyone into thinking that all four components are always present in a Time Series. A Time Series may well have fewer than four components.

It is also important to know that the components are not necessarily independent of one another.
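The two model forms can be illustrated with a short Python sketch. The function names and the numeric values below are hypothetical and chosen only for illustration. The sketch also assumes that, in the multiplicative form, C, S and I are expressed as ratios around 1, so that a missing component can be represented by 1 (and by 0 in the additive form).

```python
import math


def multiplicative(t, c, s, i):
    """Combine the components as Y = T * C * S * I."""
    return t * c * s * i


def additive(t, c, s, i):
    """Combine the components as Y = T + C + S + I."""
    return t + c + s + i


# Hypothetical component values for a single time point.
trend = 200.0
cyclical, seasonal, irregular = 1.05, 0.90, 1.02

y_mult = multiplicative(trend, cyclical, seasonal, irregular)

# Taking logarithms turns the multiplicative form into an additive one:
# log Y = log T + log C + log S + log I
log_sum = additive(
    math.log(trend), math.log(cyclical), math.log(seasonal), math.log(irregular)
)
print(math.isclose(math.log(y_mult), log_sum))

# A series without cyclical, seasonal or irregular components:
# use the neutral value 1 in the multiplicative form and 0 in the additive form.
print(multiplicative(trend, 1.0, 1.0, 1.0))
print(additive(trend, 0.0, 0.0, 0.0))
```

The logarithm step is one reason the choice between the two forms is often a matter of convenience: a series that looks multiplicative can be analysed additively after a log transformation, provided all values are positive.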

Editing the data: 

Before analysing the data to isolate and measure the components separately, we must make sure that the observations are comparable over time. To achieve this, we need to edit the data with the following factors in mind:

  • Change of population :

     In a market study, suppose a reputed company wants to assess the sales of a commodity by comparing data from the last fifteen or twenty years. A direct comparison will not give a correct picture, because over such a long period the population changes considerably through births, deaths and migration. The number of users of the commodity should therefore be expressed per thousand of population for each year in a particular locality. The total population in each year can be estimated by taking the average growth into account, and census reports can be relied upon for this purpose.

 

  • Change of price :

    For sales data, a Time Series analysis can be somewhat erratic if we consider only the sales value, since prices generally change very frequently. To avoid this effect, we may work with the number of units sold rather than their monetary value. The total sales value should therefore be converted into units sold and, where more than one commodity is involved, a price index needs to be used.

 

  • Calendar variation :

    For monthly data, since the months are not of equal length, it is better to convert the data to a daily basis: each monthly figure should be divided by the number of days in the month to which it relates. The daily figures can then be converted to weekly figures simply by multiplying them by seven.

  • Change of other factors :

    Apart from the factors discussed above, there may be other factors that prevent a meaningful analysis. For example, tax rates in different sectors may change during the span of the Time Series, so an analysis of the revenue collected by the Government must be adjusted accordingly.
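The population, price and calendar adjustments described above can be sketched in Python roughly as follows. All function names and sample figures are hypothetical. Two assumptions are made that the article does not spell out: "average growth" is read here as a constant annual growth rate between two census counts, and the price index is taken to use a base value of 100.

```python
import calendar


def estimate_population(census_start, census_end, years_between, year_offset):
    """Estimate the population `year_offset` years after the first census.

    Assumption: constant average annual growth rate between the two counts.
    """
    rate = (census_end / census_start) ** (1.0 / years_between) - 1.0
    return census_start * (1.0 + rate) ** year_offset


def users_per_thousand(users, population):
    """Population adjustment: users of the commodity per thousand inhabitants."""
    return 1000.0 * users / population


def deflate(sales_value, price_index, base=100.0):
    """Price adjustment: express a sales value at base-period prices.

    Assumption: the price index equals `base` in the base period.
    """
    return sales_value * base / price_index


def weekly_equivalent(monthly_total, year, month):
    """Calendar adjustment: monthly total -> daily average -> weekly figure."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_total / days_in_month * 7


# Hypothetical figures, for illustration only.
population = estimate_population(100_000, 121_000, years_between=2, year_offset=1)
print(users_per_thousand(550, population))
print(deflate(1250.0, price_index=125.0))
print(weekly_equivalent(310, 2023, 1))   # January has 31 days
print(weekly_equivalent(280, 2023, 2))   # February 2023 has 28 days
```

In this example the two monthly totals differ (310 and 280), yet their weekly equivalents are the same, which is exactly the kind of spurious variation the calendar adjustment is meant to remove.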

Once these edits are done, the data is ready for component-wise analysis and for the subsequent forecasting based on those analyses.


 

 
