6 Amazing Sources of Practice Data Sets

Analytics theories and methods? Check.

Grasp of concepts? Check.

Practice data sets? Go ahead and check this off your list too!

The best way to put your newly acquired skills to the test is to experiment with publicly available data sets. Besides being a great way to vet your analytics skills, this could be your first foray into the messy world of ‘real’ data.

Don’t know where to find publicly available data sets? Fret not, because we have done the work for you. Below is a list of excellent data sources that you can happily cut your teeth on!

1. Kaggle:

Kaggle is the home for everything data science-related. Its forum discussions cover Kaggle competitions, data science troubleshooting, fun data sets, and a range of machine learning, big data and data science topics. It also has an excellent jobs board!

Firms often rely on Kaggle’s frequent analytics competitions to find talent for their businesses. It is one of the few business-oriented platforms that you can take advantage of!

2. United States Census Bureau:

The US Census Bureau conducts a full census once every 10 years. On the Bureau’s website, click on the ‘Topics’ tab at the top and you will see sub-tabs titled ‘population’, ‘economy’, ‘demographics’, ‘income’ and more. Each of these tabs will lead you to an abundance of data to play with.

What do the demographics data tell you about the relationship between poverty and race in the world’s only remaining superpower? What can you learn about immigration patterns over the past 50 years in different regions of the country? Are there any noticeable differences in the per capita income from the previous census? You can set yourself the task of figuring out answers to these and many other intriguing questions.

3. India Census:

India’s 2011 Census Report contains population data disaggregated by age, gender, income, and housing. Sink your teeth into data that only the world’s largest democracy, and one of its most populous countries, can afford you!

How has the sex ratio of the country changed? How does each state compare on the literacy front? Have there been any changes in rural fertility rates over the years? Now you can find the answers to such questions on your own!
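As a starting point for the sex ratio question, here is a minimal sketch in Python. It assumes the usual Indian census convention of expressing the sex ratio as females per 1,000 males. The region names and head counts are placeholders invented purely for illustration, not census figures, and the `sex_ratio` helper is our own hypothetical function; swap in the real state-wise numbers once you have downloaded them.

```python
def sex_ratio(females, males):
    """Return the number of females per 1,000 males."""
    if males <= 0:
        raise ValueError("males must be a positive count")
    return 1000 * females / males


# Placeholder counts, invented for illustration only (NOT census data).
counts = {
    "Region A": {"females": 480, "males": 500},
    "Region B": {"females": 510, "males": 500},
    "Region C": {"females": 450, "males": 500},
}

# Compute the ratio for every region, then rank from highest to lowest.
ratios = {name: sex_ratio(c["females"], c["males"]) for name, c in counts.items()}
ranked = sorted(ratios, key=ratios.get, reverse=True)

for name in ranked:
    print(f"{name}: {ratios[name]:.0f} females per 1,000 males")
```

Run the same calculation on figures from two different census years and you can see how the ratio has shifted for each state.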

4. Airline Transit Info:

Data Expo
Bureau of Transportation Statistics

Ever been stranded at the airport wondering when your flight is actually going to take off? How often have your flights been delayed? You can get detailed reports on this and other parameters at the above links. Now you have something to do the next time you are stuck in an airport lobby!
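To answer the "how often are flights delayed?" question yourself, a rough sketch like the one below could work. It uses only the Python standard library. The column names `carrier` and `arr_delay`, the 15-minute cut-off, the `NA` marker for missing values and the `delay_rate_by_carrier` helper are all assumptions made for this example, and the rows are made up; check the headers of the file you actually download and adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only (NOT real flight records).
# arr_delay is the arrival delay in minutes; "NA" marks a missing value.
SAMPLE_CSV = """carrier,arr_delay
AA,5
AA,30
AA,NA
BB,-3
BB,0
BB,45
BB,10
"""


def delay_rate_by_carrier(csv_file, threshold_minutes=15):
    """Return {carrier: share of flights arriving more than threshold_minutes late}."""
    totals = defaultdict(int)
    delayed = defaultdict(int)
    for row in csv.DictReader(csv_file):
        raw = row["arr_delay"].strip()
        if raw in ("", "NA"):
            continue  # skip flights with no recorded arrival delay
        totals[row["carrier"]] += 1
        if float(raw) > threshold_minutes:
            delayed[row["carrier"]] += 1
    return {carrier: delayed[carrier] / totals[carrier] for carrier in totals}


rates = delay_rate_by_carrier(io.StringIO(SAMPLE_CSV))
for carrier, rate in sorted(rates.items()):
    print(f"{carrier}: {rate:.0%} of flights arrived more than 15 minutes late")
```

To try it on a real download, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open("flights.csv", newline="")`, where `flights.csv` is whatever name you saved the data under.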

5. World Bank:

The World Bank works to eradicate extreme poverty and promote income growth, and in this endeavour, it maintains an amazing repository of data across a variety of indicators that are a joy to work with. You can access data by country and by topics such as the economy, education, healthcare, trade and development. Boy, talk about choice!

6. UC Irvine Machine Learning Repository:

UC Irvine is a prestigious American university whose Center for Machine Learning and Intelligent Systems maintains a repository of multivariate, univariate, text, spatial, and domain-theory data sets that are freely available to the general public.

These are just some of the many interesting data banks out there waiting to be explored. Need more? You can find what you’re looking for at the following links. Go crazy!

  1. Quora: https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
  2. KDnuggets: https://www.kdnuggets.com/2011/02/free-public-datasets.html
  3. Georgia Tech FODAVA: https://fodava.gatech.edu/visual-data-analytics-data-sets
  4. YouTube Data: https://netsg.cs.sfu.ca/youtubedata/
  5. Mode Analytics: https://blog.modeanalytics.com/five-public-dataset/
  6. Stack Exchange: https://stats.stackexchange.com/questions/7/locating-freely-available-data-samples
  7. Abbott Analytics: https://www.abbottanalytics.com/data-mining-resources-sets.php
  8. BigML blog: https://blog.bigml.com/2013/02/28/data-data-data-thousands-of-public-data-sources/
  9. Reddit: https://r-dir.com/reference/datasets.html

Want to learn more about analytics and data science? Head to our courses page to find out how you can get the best analytics training available!
