7 Important Time Series Datasets For Machine Learning

Introduction

Time series datasets pose a particular kind of machine learning problem: a numerical or categorical value has to be predicted, but the rows of data are ordered by time. When you are getting started with time series forecasting, one challenge is finding good-quality standard datasets to practise on. In this article, we discuss 7 standard time series datasets that you can use to get started and practise time series forecasting with machine learning. The article covers:

  • 4 univariate time series datasets.
  • 3 multivariate time series datasets.
  • Web pages for searching and downloading additional datasets.

List of datasets

The sections below cover the following categories and datasets:

  1. Univariate Time Series Datasets
  2. Shampoo Sales Dataset
     • Minimum Daily Temperatures Dataset
  3. Monthly Sunspot Dataset
  4. Daily Female Births Dataset
  5. Multivariate Time Series Datasets
  6. EEG Eye State Dataset
  7. Occupancy Detection Dataset
  8. Ozone Level Detection Dataset

1. Univariate Time Series Datasets

Time series datasets that contain only one variable are called univariate datasets. These datasets are an excellent starting point because:

  • They are simple and easy to understand.
  • You can quickly plot them in Excel or your preferred plotting tool.
  • You can easily plot the forecasts against the expected results.
  • You can quickly try and evaluate a range of conventional and modern methods.
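One simple conventional method to try first on a univariate series is a persistence forecast, where each prediction is just the previous observed value. The following is a minimal sketch of that idea with walk-forward evaluation. The function names are hypothetical and the sample values are made up for illustration; they are not taken from any of the datasets in this article.

```python
from math import sqrt


def persistence_walk_forward(series, n_test):
    """Forecast each of the last n_test points with the value observed just before it."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    split = len(series) - n_test
    history = list(series[:split])
    predictions = []
    for actual in series[split:]:
        predictions.append(history[-1])  # naive forecast: repeat the last known value
        history.append(actual)           # only then reveal the true value
    return predictions


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Made-up values for illustration only.
series = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
predictions = persistence_walk_forward(series, n_test=3)
error = rmse(series[-3:], predictions)
print(predictions, round(error, 3))
```

Any more sophisticated model should be able to beat this error on the same test points, which makes it a useful reference when comparing methods.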

There are several sources of time series datasets, such as the “Time Series Data Library” created by Prof. Rob Hyndman at Monash University, Australia.

Below are four univariate time series datasets that you can download, drawn from a variety of fields such as sales, meteorology, physics and demography.
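Once a univariate dataset has been downloaded, it typically needs to be read into (timestamp, value) pairs in time order. The sketch below assumes a two-column CSV with a header row and dates written as `YYYY-MM`; that layout, the function name and the sample rows are all assumptions for illustration, so check the actual file you download and adjust `date_format` accordingly.

```python
import csv
import io
from datetime import datetime


def load_univariate_csv(handle, date_format="%Y-%m"):
    """Read (date, value) rows from a two-column CSV and return them sorted by date."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    observations = []
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        date_text, value_text = row
        observations.append((datetime.strptime(date_text, date_format), float(value_text)))
    observations.sort(key=lambda item: item[0])  # enforce time order
    return observations


# Made-up rows standing in for a downloaded file (assumed layout, not real data).
sample = io.StringIO('"Month","Sales"\n"2001-01",100.5\n"2001-03",90.0\n"2001-02",120.25\n')
observations = load_univariate_csv(sample)
for when, value in observations:
    print(when.strftime("%Y-%m"), value)
```

For a file on disk, the same function can be given a handle opened with `open(path, newline="")`, where `path` is wherever you saved the download.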

2. Shampoo Sales Dataset

This dataset describes the monthly number of shampoo sales over a period of 3 years.

The units are a sales count and there are 36 observations. The original dataset is credited to Makridakis, Wheelwright and Hyndman (1998).

  • Minimum Daily Temperatures Dataset

This dataset describes the minimum daily temperatures in the city of Melbourne, Australia, over 10 years (1981-1990).

The units are degrees Celsius and there are 3,650 observations. The source of the data is credited to the Australian Bureau of Meteorology.

3. Monthly Sunspot Dataset

This dataset describes a monthly count of observed sunspots for just over 230 years (1749-1983).

The units are a count and there are 2,820 observations. The source of the dataset is credited to Andrews & Herzberg (1985).
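A quick sanity check after loading a monthly dataset is to confirm that the number of rows matches the stated date range. The sketch below works through that arithmetic for the figures quoted in this article; the helper name is hypothetical.

```python
def months_in_span(first_year, last_year):
    """Number of monthly observations when every month from first_year to last_year is present."""
    return (last_year - first_year + 1) * 12


sunspot_years = 1983 - 1749 + 1                # both end years included
sunspot_months = months_in_span(1749, 1983)
shampoo_months = 3 * 12                        # 3 years of monthly sales

print(sunspot_years)   # 235
print(sunspot_months)  # 2820
print(shampoo_months)  # 36
```

If the row count of a loaded file differs from this kind of expected value, the file probably has missing periods or extra header and footer lines.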

4. Daily Female Births Dataset

This dataset describes the number of daily female births in California in 1959.

The units are a count and there are 365 observations. The source of the dataset is credited to Newton (1988).

5. Multivariate Time Series Datasets

Generally, multivariate datasets are the sweet spot for machine learning approaches. The UCI Machine Learning Repository is a major source of multivariate time series datasets. At the time of writing, it offers 63 time series datasets that you can download and work with free of charge.

Below are three suggested multivariate time series datasets from the fields of meteorology, medicine and monitoring.

6. EEG Eye State Dataset 

This dataset describes EEG data for an individual and whether their eyes were open or closed. The objective of the problem is to predict whether the eyes are open or closed given EEG data alone. This is a classification predictive modelling problem, with a total of 14,980 observations and 15 input variables. A class value of ‘1’ indicates the eye-closed state and ‘0’ the eye-open state. The data is ordered by time, and the observations were recorded over a period of 117 seconds.
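The observation count and the recording duration together give a rough idea of how densely the signal was sampled. The sketch below assumes the observations are evenly spaced over the full 117 seconds, which the description above does not state, so treat the result as an estimate.

```python
n_observations = 14_980
duration_seconds = 117

# Assumption: samples are evenly spaced across the whole recording.
rate_hz = n_observations / duration_seconds
milliseconds_per_sample = 1000 * duration_seconds / n_observations

print(round(rate_hz, 1))                  # roughly 128 samples per second
print(round(milliseconds_per_sample, 2))  # roughly 7.81 ms between samples
```

At this density, consecutive rows are very similar to each other, which is worth keeping in mind when deciding how to split the data for training and testing.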

7. Occupancy detection dataset

This dataset describes measurements of a room, and the aim is to predict whether or not the room is occupied.

There are 20,560 one-minute observations taken over a period of a few weeks. This is a classification prediction problem. There are 7 attributes, including various light and climate properties of the room.

The source of the data is credited to Luis Candanedo from UMONS. The data is provided in three files that suggest the splits that may be used for training and testing a model.
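When a time-ordered dataset does not come with predefined files, the split should still respect time: train on earlier rows and test on later ones, without shuffling. The following is a minimal sketch of such a chronological split. The function name, the 75/25 ratio and the toy rows are assumptions for illustration and do not reproduce the split used in the provided files.

```python
def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows into an earlier training part and a later test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]  # no shuffling, so the time order is preserved


# Toy rows of (minute_index, occupied_flag); made-up values for illustration only.
rows = [(minute, 1 if minute >= 4 else 0) for minute in range(8)]
train, test = chronological_split(rows)

print(len(train), len(test))
print(train[-1], test[0])
```

Every row in the test part is later than every row in the training part, so the model is never evaluated on data that precedes what it was trained on.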

8. Ozone Level Detection Dataset

This dataset describes 6 years of ground-level ozone measurements, and the aim is to predict whether or not a given day is an ‘ozone day.’

The dataset has 2,536 observations and 73 attributes. This is a classification prediction problem; the last attribute gives the class as “1” for an ozone day and “0” for a normal day.

Two versions of the data are available: an eight-hour peak set and a one-hour peak set. The one-hour peak set is the suggested starting point.
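Because the class is stored in the last attribute, each row has to be separated into input features and a label before modelling. The sketch below shows one way to do that and to count how often each class occurs. The function names are hypothetical, and the three short rows are made up; real rows have 73 attributes and may need extra cleaning that is not shown here.

```python
def split_features_and_label(row):
    """Treat the last attribute as the class label and everything before it as features."""
    *features, label = row
    return features, int(float(label))  # accepts "1", "1.0" or 1


def class_counts(labels):
    """Count occurrences of the normal-day (0) and ozone-day (1) classes."""
    counts = {0: 0, 1: 0}
    for label in labels:
        counts[label] += 1
    return counts


# Made-up rows for illustration only: three features followed by the class.
rows = [
    [0.8, 1.2, 30.5, "0"],
    [2.1, 3.4, 35.0, "1"],
    [1.0, 1.1, 28.0, "0"],
]
pairs = [split_features_and_label(row) for row in rows]
labels = [label for _, label in pairs]
counts = class_counts(labels)

print(pairs[0][0])
print(counts)
```

Checking the class counts first shows how balanced the two classes are, which helps when choosing an evaluation metric for the classifier.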

Conclusion

In this article, we have covered a set of standard time series datasets that you can use to get started and practise time series forecasting with machine learning methods.

There is no right or wrong way to learn AI and ML technologies; the more you explore, the better! These resources can be the starting point for your journey into Artificial Intelligence and Machine Learning. Does pursuing AI and ML interest you? If you want to step into the world of emerging tech, you can accelerate your career with the Machine Learning and AI courses by Jigsaw Academy.
