The 5V of Big Data: A Basic Guide

Ajay Ohri

Introduction

Data became the digital gold once the view took hold that every piece of data is important to the business in the final scheme of things. This led to the study of data within the business ecosystem, and a school of thought emerged that proposed capturing all the data flowing through a business. Along came Big Data architecture, which proposed systems that capture, store, and analyse these massive amounts of data. In this article, we will learn about the 5 V's of Big Data.

Big Data is exactly what the name suggests: it deals with massive amounts of data passing through the business at great speed and in many different forms. The whole body of data that a business generates, together with the tools to capture, store, and analyze it, can be called Big Data. Big Data exhibits a few characteristics that cover challenges such as capturing, analyzing, storing, and visualizing data that arrives at various speeds and in a variety of formats.

5V of Big Data

Of the many characteristics of Big Data, the proverbial 5 V's characterize its nature best. In this section, we shall look at each of those V's in greater detail.

  1. Volume
  2. Velocity
  3. Variety
  4. Veracity
  5. Value

1. Volume

The most characteristic property of Big Data, Volume, highlights the amount of data that passes through the business day in and day out, and how each data item needs to be captured to make holistic sense of the business and derive value from it. Data here refers to anything that can be captured: structured, unstructured, or semi-structured, arriving in batches or in real time. Without huge volume, it would not be fair to call the data ecosystem Big Data, even if it captures every aspect of the business. The whole premise of value in Big Data rests on this first V, Volume.

To give a small sense of how big the data should be to classify as Big Data, consider this: the year 2016 saw global mobile traffic of 6.2 exabytes per month, and it was estimated that by 2020 this figure would easily top 40,000 exabytes.

2. Velocity

Velocity in Big Data refers to the crucial ability to capture data arriving at any speed. In today's world, data is almost stateless: it enters and leaves the business ecosystem at great speed. Big Data systems are equipped to ingest data at the rate at which it arrives; if ingestion cannot keep pace with the incoming stream, backlogs build up and ultimately choke the system. Big Data systems are therefore designed to handle a massive and continuous flow of data, and methods like sampling help deal with velocity issues.
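The sampling idea mentioned above can be sketched with reservoir sampling, a classic technique for keeping a fixed-size, uniformly random sample from a stream whose total length is unknown in advance. This is an illustrative sketch, not part of any specific Big Data product; the function name and parameters are our own.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a reservoir slot with probability k/(i+1),
            # which keeps every item equally likely to survive.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Sample 5 events from a simulated stream of one million events.
sample = reservoir_sample(range(1_000_000), k=5, seed=42)
print(sample)
```

Because the reservoir never grows beyond `k` items, memory stays constant no matter how fast or how long the stream runs, which is exactly why sampling is useful under high velocity.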

As an example of the velocity that a big data system has to endure, more than 3.5 billion searches per day are made through the Google search engine. With an ever-increasing number of active accounts on Facebook, the number of likes, updates, shares, and comments coming into Facebook increases by 22% every year.

3. Variety

It is characteristic of Big Data to capture anything and everything of value in the business ecosystem. This includes data with no immediate value, which can nonetheless be processed further with advanced tools to gain insights and build intelligence into the system. Apart from the structured data a business is used to, there are unstructured data buckets such as images, videos, audio, flat files, email bodies, log files, and more.

These contain data that can be mined with advanced tools. A Big Data system is designed to capture the unstructured and semi-structured data passing through the business in a timely and efficient manner. This also means that, apart from storing heterogeneous data, the Big Data system should hook onto these various data sources efficiently without compromising on speed.

4. Veracity

With the volume, variety, and velocity that Big Data allows for, models built on the data will not be of true value without this characteristic. Veracity is the trustworthiness of the data at its source and the quality of the data derived after processing. The system should allow mitigation against data biases, abnormalities or inconsistencies, volatility, and duplication, among other factors.
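The kinds of veracity checks described above can be sketched as a small cleaning step that drops incomplete, implausible, and duplicate records. The field names and rules here are hypothetical examples of such checks, not a prescribed standard.

```python
def clean_records(records):
    """Drop records that fail basic quality checks, plus exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Check 1: required fields must be present and non-empty.
        if not rec.get("id") or rec.get("amount") is None:
            continue
        # Check 2: values must be plausible (no negative amounts here).
        if rec["amount"] < 0:
            continue
        # Check 3: drop duplicates by business key.
        key = (rec["id"], rec["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

raw = [
    {"id": "a1", "amount": 10.0},
    {"id": "a1", "amount": 10.0},   # duplicate
    {"id": "",   "amount": 5.0},    # missing id
    {"id": "b2", "amount": -3.0},   # implausible value
    {"id": "c3", "amount": 7.5},
]
print(clean_records(raw))  # [{'id': 'a1', 'amount': 10.0}, {'id': 'c3', 'amount': 7.5}]
```

In a real pipeline these rules would be far richer (schema validation, outlier detection, source reconciliation), but the principle is the same: veracity is enforced by explicit, auditable checks between ingestion and analysis.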

5. Value

The most important V as far as the business goes is Value. If the Big Data system cannot derive value from the whole exercise in a reasonable amount of time, it is not a worthwhile exercise for the business. Big Data should, in theory, deliver value; how big or small that value turns out to be is for the analytics and research teams to think over, design, build, and deliver.

Value is one of the first properties discussed in business, and a certain degree of value will be projected at the outset of a Big Data project. Big Data helps build the infrastructure on which Machine Learning and Artificial Intelligence can be based. Businesses that start with Big Data today can tomorrow transition easily to Machine Learning and Artificial Intelligence to augment their decision-making processes.

Conclusion

The 5 V's of Big Data initially comprised the three core V's of Volume, Velocity, and Variety. The other two V's, Veracity and Value, were added later with the evolution and growing prevalence of Big Data across industries. All five V's are critical to understanding Big Data architecture.

Big Data analysts are at the vanguard of the journey towards an ever more data-centric world. As powerful intellectual resources, they are people companies go the extra mile to hire and retain. You too can come on board and take this journey with our Big Data Specialization course.
