Analytics, Business Analytics, Predictive modelling, Advanced Analytics, Big Data Analytics, Data Mining, Knowledge Discovery, Artificial Intelligence, Machine learning, Business Intelligence, OLAP, Reporting, Data warehousing, Statistics
There are many terms that get thrown around in the field of analytics. This article is an attempt to list the subtle differences or similarities between the common terms.
Analytics – Analytics can simply be defined as the process of breaking a problem into simpler parts and using inferences based on data to drive decisions. Analytics is not a tool or a technology; rather it is a way of thinking and acting.
Analytics has widespread applications in spheres as diverse as science, astronomy, genetics, financial services, telecom, retail, marketing, sports, gaming and health care.
Business analytics – This term refers to the application of analytics specifically in the sphere of business. It includes subsets like –
Industries which rely extensively on analytics include –
Predictive Analytics – Predictive analytics is one of the most popular analytics terms. It is used to estimate the likelihood that an event will occur, or to identify future patterns, based on data. Remember that it does not tell us whether an event will happen; it only assigns probabilities to future events or patterns.
The term emphasizes the predictive nature of analytics (as opposed to, say, the retrospective nature of tools like OLAP). This is one of those terms coined by salespeople and marketers to add glamour to any business. “Predictive analytics” sounds fancier than just plain “analytics”. In practice, predictive analytics is rarely used in isolation from descriptive analytics.
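To make "assigns probabilities" concrete, here is a minimal sketch of a predictive model. It is only an illustration: the data is made up, the choice of logistic regression is an assumption (the article does not name a technique), and the names `fit_logistic` and `churn_probability` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # predicted probability minus outcome
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up history: months since last purchase, and whether the customer churned.
months_inactive = [1, 2, 3, 4, 5, 6, 7, 8]
churned = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(months_inactive, churned)


def churn_probability(months: float) -> float:
    """Return a probability, not a yes/no verdict."""
    return sigmoid(w * months + b)


for m in (2, 5, 8):
    print(f"{m} months inactive -> churn probability {churn_probability(m):.2f}")
```

The output is a probability for each customer, which a business can then rank or threshold; the model never claims that a particular customer will definitely churn.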
Descriptive analytics – Descriptive analytics refers to a set of techniques used to describe, explore or profile any kind of data. Any kind of reporting usually involves descriptive analytics. Data exploration and data preparation are essential ingredients of predictive modelling, and both rely heavily on descriptive analytics.
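A minimal sketch of descriptive analytics using only Python's standard library. The sales figures are invented for illustration, and the choice of summary measures is an assumption rather than a fixed recipe.

```python
import statistics

# Made-up daily sales figures; one day (410) is an obvious outlier.
daily_sales = [120, 135, 128, 410, 131, 126, 140]

summary = {
    "count": len(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "stdev": statistics.stdev(daily_sales),  # sample standard deviation
    "min": min(daily_sales),
    "max": max(daily_sales),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Comparing the mean with the median is a typical profiling step: a large gap between the two hints at outliers that should be examined before any modelling starts.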
Inquisitive analytics – Whereas descriptive analytics is used for data presentation and exploration, inquisitive analytics answers questions such as why, what, how and what if. Ex: "Why have sales dropped in Q4?" is a question on which inquisitive analysis of the data can be based.
Advanced analytics – Like “Predictive analytics”, “Advanced analytics” too is a marketing-driven term. “Advanced” adds a little more punch, a little more glamour to “Analytics” and is preferred by marketers.
Big data analytics – When analytics is performed on data sets with huge volume, variety and velocity, it can be termed big data analytics. The annual volume of data is expected to grow from 8 zettabytes (one zettabyte is a trillion gigabytes) in 2015 to 35 zettabytes in 2020.
Growing data sizes inevitably require advanced technology like Hadoop and MapReduce to store and process large chunks of data. In addition, a large variety of data (structured and unstructured) is flowing in at a very rapid pace. This requires not only advanced technology but also advanced analytical platforms. To summarize, large amounts of data, together with the technology and the analytics platforms needed to get insights out of such data, can be called big data analytics.
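The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps only; it does not use Hadoop or any real MapReduce API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Two tiny made-up "documents"; on a real cluster these would be
# large files split across many machines.
documents = ["big data needs big tools", "data tools for big data"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can be spread across many machines, which is what makes the pattern suitable for very large data sets.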
Data Mining – Data mining is the term most often used interchangeably with “Analytics”. It is an older term that was more popular in the nineties and the early 2000s. However, data mining began to be confused with OLAP, and that led to a drive to use more descriptive terms like “Predictive analytics”.
According to Google Trends, “Analytics” overtook “Data mining” in popularity at some point in 2005 and is about 5 times more popular now. Incidentally, Coimbatore is one of the only cities in the world where “Data mining” is still more popular than “Analytics”.
Data Science – Data science and data analytics are mostly used interchangeably. However, a data scientist is sometimes expected to possess greater mathematical and statistical sophistication than a data analyst. A data scientist is expected to be well versed in linear algebra, calculus and machine learning, and should be able to navigate the nitty-gritty details of mathematics and statistics with ease.
Artificial Intelligence – During the early stages of computing, there were a lot of comparisons between computing and the human learning process, and this is reflected in the terminology.
The term “Artificial intelligence” was popular in the very early stages of computing and analytics (in the 70s and 80s) but is now almost obsolete.
Machine learning – Machine learning involves using statistical methods to create algorithms that learn from data. It replaces explicit programming, which can become cumbersome with large amounts of data, too inflexible to adapt to changing solution requirements, and sometimes hard to read.
It is mostly concerned with algorithms that can be a black box to interpret, but good models can give highly accurate results compared to conventional statistical methods. Visualization, domain knowledge and the like are not included when we speak about machine learning. Neural networks and support vector machines are among the terms generally associated with machine learning algorithms.
Algorithm – Usually refers to a mathematical formula that is produced as output by the tools. The formula summarizes the model.
Ex: Amazon's recommendation algorithm gives a formula that can recommend the next best buy.
Machine Learning – Similar to “Artificial intelligence” this term too has lost its popularity in the recent past to terms like “Analytics” and its derivatives.
OLAP – Online analytical processing refers to descriptive analytic techniques of slicing and dicing the data to understand it better and discover patterns and insights. The term is derived from another term, “OLTP” (online transaction processing), which comes from the data warehousing world.
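"Slicing and dicing" can be illustrated with a small sketch in plain Python. The sales records are made up, and `slice_rows` and `roll_up` are hypothetical helper names; real OLAP tools work on pre-built cubes rather than lists of dictionaries.

```python
from collections import defaultdict

# Made-up sales facts with three dimensions (region, product, quarter)
# and one measure (amount).
sales = [
    {"region": "North", "product": "Phone", "quarter": "Q1", "amount": 100},
    {"region": "North", "product": "Laptop", "quarter": "Q1", "amount": 250},
    {"region": "South", "product": "Phone", "quarter": "Q1", "amount": 80},
    {"region": "North", "product": "Phone", "quarter": "Q2", "amount": 120},
    {"region": "South", "product": "Laptop", "quarter": "Q2", "amount": 200},
    {"region": "South", "product": "Phone", "quarter": "Q2", "amount": 90},
]


def slice_rows(rows, **fixed):
    """Slice: keep only rows where the given dimensions have the given values."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


def roll_up(rows, *dims):
    """Aggregate the 'amount' measure over the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["amount"]
    return dict(totals)


by_region = roll_up(sales, "region")
q1_by_product = roll_up(slice_rows(sales, quarter="Q1"), "product")

print(by_region)       # total sales per region
print(q1_by_product)   # Q1 sales per product
```

Each question is answered by fixing some dimensions and aggregating over the rest, which is why OLAP is described as retrospective: it summarizes what has already happened.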
Reporting – The term “Reporting” is perhaps the most unglamorous of all terms in the world of analytics. Yet it is also one of the most widely used practices within the field. All businesses use reporting to aid decision making. While it is not “Advanced analytics” or even “Predictive analytics”, effective reporting requires a lot of skill and a good understanding of the data as well as the domain.
Data warehousing – Ok, this may actually be considered more unglamorous than even “Reporting”. Data warehousing is the process of managing a database and involves extraction, transformation and loading (ETL) of data. Data warehousing precedes analytics. The data managed in a data warehouse is usually taken out and used for business analytics.
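The extract, transform and load steps can be sketched with Python's standard `csv` and `sqlite3` modules. This is a toy illustration under stated assumptions: the CSV content, the `orders` table and the cleaning rules are all made up, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Made-up raw export from a source system, with messy names and a missing amount.
RAW_CSV = """order_id,customer,amount,order_date
1, alice ,19.99,2015-07-01
2,BOB,5.00,2015-07-01
3,alice,,2015-07-02
4,Carol,42.50,2015-07-03
"""


def extract(text):
    """Extract: read raw rows from the source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, normalize names and types."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete records
        clean.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            row["order_date"],
        ))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# Once loaded, the data can be taken out and used for analytics.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The final query shows the point made above: warehousing precedes analytics, and the cleaned, consistently typed table is what reports and models are built on.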
Statistics – Statistics is the study of the collection, organization, and interpretation of data. Data mining does not replace traditional statistical techniques. Rather, it is an extension of statistical methods that is in part the result of a major change in the statistics community. The development of most statistical techniques was, until recently, based on elegant theory and analytical methods that worked quite well on the modest amounts of data being analyzed. The increased power of computers and their lower cost, coupled with the need to analyze enormous data sets with millions of rows, have allowed the development of new techniques based on a brute-force exploration of possible solutions.
Analytics platform – Software that provides the computation required to carry out statistical methods, descriptive and inquisitive queries, machine learning, visualization and big data processing (the last of which involves hardware as well as software).
Ex: SAS, R, Tableau, Hadoop etc.
Clickstream analytics / Web analytics – Analysis of the imprints users leave on the web.
Ex: Number of clicks, probability of a purchase based on how often a particular word is searched, etc.
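Both examples can be sketched on a tiny, made-up event log. The event format and the helper name `purchase_rate_among_searchers` are assumptions for illustration; real clickstream data would come from web server logs or a tracking tool.

```python
from collections import Counter

# Made-up clickstream: (user, event type, value).
events = [
    ("u1", "search", "laptop"),
    ("u1", "click", "/laptops/123"),
    ("u1", "purchase", "/laptops/123"),
    ("u2", "search", "laptop"),
    ("u2", "click", "/laptops/456"),
    ("u3", "search", "phone"),
    ("u3", "click", "/phones/9"),
    ("u3", "click", "/phones/12"),
    ("u4", "search", "laptop"),
]

# Number of clicks per user.
clicks_per_user = Counter(user for user, kind, _ in events if kind == "click")


def purchase_rate_among_searchers(events, term):
    """Share of users who searched for `term` and also made a purchase.

    Simplification: event order is ignored, so a purchase made before
    the search would still be counted.
    """
    searchers = {u for u, kind, value in events if kind == "search" and value == term}
    buyers = {u for u, kind, _ in events if kind == "purchase"}
    if not searchers:
        return 0.0
    return len(searchers & buyers) / len(searchers)


print(clicks_per_user)
print(f"{purchase_rate_among_searchers(events, 'laptop'):.2f}")
```

The second number is an observed frequency, which is the simplest possible estimate of the probability to buy given a search; a production system would use far more data and a proper model.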
Text analytics – Usually refers to analysing unstructured (not tabulated) data in the form of continuous text.
Ex: Facebook data analysis, Twitter analysis, etc.
Location analytics – With advanced GPS and location data now available, location analytics has become quite popular.
Ex: Offers based on customer location, insurance risk calculations based on proximity to hazards.
Sports analytics – Analysis of sports data using analytical tools and methods. Performance as well as revenue data can be subjected to analytical procedures to achieve better results.
Edited: July 2015 by Bharadwaj Aldur