Data is still sexy and data skills are still in high demand, yet the true role of a Data Scientist remains ambiguous and is often confused with that of a Data Analyst or Data Engineer. Many people interested in a career in Data Science don’t fully understand what the role entails or which skills they need to excel in it. They are also unsure whether they have the aptitude to build a career as a Data Scientist in the first place. This article aims to address these and other questions about a career in Data Science. Let’s get started!
To begin with, let’s understand the term ‘Data Scientist’. If you search for a definition, you will find several interesting ones, but by far the most honest is this:
“A Data Scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”
If we broaden this definition we can say that a Data Scientist is someone who:
According to a report by the International Data Corporation (IDC), the Big Data and Business Analytics market will grow to $203 billion by 2020. The banking industry is expected to be a big driver of this increase in spending, while IT and business services will lead most of the technology investment. Other industries, such as telecommunications, insurance, transportation, and utilities, will also increase their spending, further spurring growth.
IDC also estimates that the worldwide IoT market will grow to $1.7 trillion in 2020. Devices, connectivity, and IT services will likely make up two-thirds of the IoT market in 2020, with devices (modules/sensors) alone representing more than 30 percent of the total.
As for the gap between talent supply and demand, IDC predicts a need for 181,000 people with deep analytical skills by 2018 in the US alone, and five times that number of positions requiring data management and interpretation capabilities.
Data Scientist is not only the sexiest job of the 21st century, but also a high-paying one, both in India and worldwide. In fact, it is one of the top-paying jobs out there right now!
On average, a Data Scientist proficient in R can bag a job paying Rs. 10.40 LPA (lakhs per annum), compared with Rs. 10.12 LPA for Python and Rs. 9.54 LPA for SAS. The best pay goes to someone who can work with all three tools, taking home a cool Rs. 12.91 LPA.
The pay for a data scientist is higher than the average for other domains across all experience ranges, starting at Rs. 6.4 LPA and exceeding Rs. 30 LPA at the top end of the spectrum.
All in all, this is very encouraging news for data scientists!
Before we go on to discuss the technical and business skills one needs to develop as a Data Scientist, let’s first look at some basic innate traits that the majority of successful Data Scientists possess. If you are looking at a career in Data Science and find that you can say yes to many of the ones listed below, then you are good to go. If that’s not the case, don’t despair. You can still embark on the Data Science journey, but you will have to go the extra mile and commit to developing the skills you can:
Though a data scientist can technically come from any stream of education, there is a clear preference for those with a degree in science, statistics, or mathematics. If you are looking at a long-lived career in technology, a bachelor’s degree in something computing-related is worth it. These graduates have a definite advantage, as a good part of Data Science is about numbers and programming skills, and a solid foundation in computer science, math, modelling, and statistics will make the journey easier. However, let me also point out that many people from other disciplines have gone on to become successful Data Scientists.
Let’s now come to the industry-relevant skills and tools that a Data Scientist needs to develop to succeed in the analytics industry today.
1. An integrated analytical skill set (statistical skills, algorithms, machine learning, and mathematics):
It is essential for a Data Scientist to have expertise in diverse analytical tools, as many analytics companies today use a combination of popular data analytics tools and technologies. At the very core, a data scientist needs to be able to understand numbers and use analytics tools to piece together data and discover potential patterns and correlations through statistics. The mandate is clear: if you can’t use the tools, you can’t analyze the data. And to use those tools effectively, a data scientist must understand correlation, multivariate regression, and the other statistical aspects of modeling that sit underneath them.
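To make the two statistical ideas named above a little more concrete, here is a minimal sketch in plain Python. It assumes a tiny made-up dataset and uses hypothetical helper names (`pearson`, `solve`, `fit_linear`) that are not part of any particular analytics tool. It computes a Pearson correlation and fits a regression with more than one predictor through the normal equations. In practice you would reach for R, SAS, or a Python library instead, but the arithmetic underneath looks roughly like this:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(rows, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ... (normal equations)."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Made-up illustrative data: y is built as exactly 1 + 2*x1 + 3*x2
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_linear(rows, ys)               # intercept, then one slope per predictor
r = pearson([x1 for x1, _ in rows], ys)     # how strongly x1 alone tracks y
```

Because the data here is generated from a known formula, the fitted coefficients should recover the intercept and the two slopes (up to floating-point error), which is a handy sanity check before trusting the same steps on real data.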
Though SAS is still one of the more popular data languages, data scientists today increasingly find themselves working on projects that use multiple tools. Recruiters therefore look for people with expertise in a combination of tools, such as SAS, R, and Hadoop. As we all know, R is one of the great success stories of open-source software. It is free and can do pretty much everything SAS can do. Hadoop, for its part, is an open-source programming framework that allows data to be spread over large clusters of commodity servers and processed in parallel. By using R and Hadoop together, organizations can derive useful insights from their data more easily and more economically.
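The idea of spreading data across machines and processing it in parallel can be sketched on a single computer. The example below is only an illustration of the split, map, and reduce pattern; it does not use Hadoop or its API. The function names and the sample log lines are made up, and threads stand in for the separate servers of a real cluster:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Split the records into at most n_chunks roughly equal slices."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(lines):
    """'Map' step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def word_count(records, n_workers=3):
    """Count words by processing chunks in parallel, then merging the results."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # 'Reduce' step: merge the per-chunk counts into one total
    return reduce(lambda a, b: a + b, partials, Counter())


# Made-up sample records
logs = [
    "error disk full",
    "warning disk slow",
    "error network down",
    "info disk ok",
]
totals = word_count(logs)
```

Each chunk is counted without looking at any other chunk, which is what makes the work easy to distribute. Only the small per-chunk summaries need to be brought together at the end.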
2. Programming expertise:
A data scientist needs to have some level of programming expertise. Even if you don’t have a computer science degree, you need to be comfortable designing and programming in a variety of languages including Java, Python, C++ or C#. You need to be able to determine the right software packages or modules to run, modify them or even design and develop new computational techniques to solve business problems (e.g., machine learning, natural language processing, graph/social network analysis, neural nets, and simulation modelling).
3. Visualization skills:
One of the core functions of a Data Analyst is to explore data visually, and then communicate findings and insights using interesting and innovative visualization tools. As a Data Scientist, your main objective is to bring insights to management so that they can make better business decisions. What use are excellent data mining and modeling tools if the results of an analysis are poorly visualized? It is thus imperative that data scientists are adept at the art of visual storytelling and can creatively and persuasively communicate the stories their data tell.
Though these are the core skills a data scientist must work at developing, it is also useful to:
Let’s now get down to the practicalities. You have done your due diligence and you are now ready to formally set off down the Data Science path. Here is what you need to do:
How to Prepare for an Analytics Interview?
Now that you are all skilled up and ready to take on the world of analytics, here are some interview tips from Priti Sawant, a staffing expert.
These books give you the tools you need to get started with data, from basic statistics to machine learning and new ways to think about visualization. And if you’re already experienced with data, the Starter Kit will push you further. The package includes 13 titles on R, data analysis, Python, machine learning, and visualization. You could also purchase a single book, depending on your needs:
One of the most influential personalities in this domain, Eric Siegel talks about the power and perils of prediction in this entertaining book, which includes case studies from across the globe. Written for the general reader, the book explains the basics of predictive modeling in layman’s terms.
Perfect for new data scientists, Predictive Analytics offers tangible and easy-to-understand insights into the complex world of data analysis. Read this book to find out how institutions are increasingly predicting human behavior – whether you’re going to click, buy, lie, or die, as the title suggests. Predictive Analytics also shares the “why” and the “how” of behavior prediction – highlighting the many ways in which predictive analysis is able to improve healthcare, fight crime and boost sales – all through the careful analysis of big data.
Political forecaster Nate Silver won many accolades for accurately predicting the result in every single state in the 2012 US election. In this book, he reveals how one can develop better foresight in an uncertain world. From the stock market to the poker table, from earthquakes to the economy, he takes us on an enthralling insider’s tour of the high-stakes world of forecasting, showing how we can use information in a smarter way amid the noise of data and make better predictions in our own lives. Without accurate methods, the sheer abundance of data can make predictions go bad, especially when confronted with the limits of human cognition. Read ‘The Signal and the Noise’ to find out how forecasters are able to overcome biases and unpredictability to uncover accurate, meaningful predictions in a vast sea of noisy data.