Ernst & Young (EY) Data Analyst Interview Questions to Prepare in 2022-23


One of the professions in highest demand worldwide is that of the skilled Data Analyst. Because demand is so high and the supply of people who can do this job well is so limited, Data Analysts command high wages and excellent benefits, even at the entry level. 

Jobs for data analysts are available in a wide range of businesses and sectors. Any business that collects and analyzes data needs data analysts. Using data to make investment decisions, target customers, evaluate risks, or allocate capital are among the top responsibilities in data analysis. 

The EY interview process includes a number of stages designed to evaluate the applicant’s professional talents and find the ideal candidate for the data analyst position. Data analysts sift through masses of data to find patterns, predict outcomes, and glean knowledge that will aid their employers in making more informed business decisions. 

EY Data Analyst Interview Questions 

Getting ready for an EY Data Analyst interview? The interview covers ten to twelve different question themes. As you prepare: 

  • Understand the qualifications required for EY Data Analyst positions. 
  • Learn about the EY data analyst interview process. 
  • Practice interview questions for EY Data Analyst. 

Let’s see the interview questions and the EY Data Analyst interview summary. 

1. When creating a predictive method to forecast client churn, how will you manage the quality assurance process? 

Data analysts need the business owners’ input and a collaborative setting to operationalize analytics. There should be an effective, efficient, and repeatable procedure for developing and deploying predictive models. Without the business owner’s feedback, the model risks being a one-and-done exercise. 

The ideal response to this question is that you would first divide the data into three separate sets: training, testing, and validation. You would then train and tune the model on the first two sets to reduce bias, and present the validation-set results to the business owner. The feedback from the client or business owner will indicate how accurately your model forecasts customer churn and produces the intended results. 
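
As a quick illustration, the three-way split described above can be sketched in Python. The 60/20/20 ratios and the helper name are illustrative choices, not a prescribed EY process:

```python
import random

def split_dataset(records, train=0.6, test=0.2, seed=42):
    """Shuffle records and split them into training, testing, and
    validation sets. Illustrative helper; ratios are assumptions."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train)
    n_test = int(n * test)
    return (shuffled[:n_train],                       # training set
            shuffled[n_train:n_train + n_test],       # testing set
            shuffled[n_train + n_test:])              # validation set

train_set, test_set, val_set = split_dataset(list(range(100)))
```

The validation set stays untouched during model development, so the results you show the business owner reflect performance on data the model has never seen.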

2. How are primary and foreign keys different from one another? 

Some of the main distinctions between a primary key and a foreign key are listed below: 

  • A table’s primary key uniquely identifies each record, while a foreign key is a column (or set of columns) in one table that references the primary key of another table. 
  • The primary key ensures that every record can be identified uniquely. A foreign key links data across two tables. 
  • A table can have only one primary key, whereas it can have multiple foreign keys. 
  • A primary key carries both unique and not-null constraints. Foreign key values can be duplicated. 
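
These distinctions can be demonstrated with Python’s built-in sqlite3 module; the customers/orders schema below is a hypothetical example:

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("""CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,  -- unique and not null by definition
    name        TEXT NOT NULL)""")
conn.execute("""CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id))""")

conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO orders VALUES (10, 1)")
conn.execute("INSERT INTO orders VALUES (11, 1)")  # duplicate FK value: allowed
fk_error = None
try:
    conn.execute("INSERT INTO orders VALUES (12, 99)")  # 99 has no matching PK
except sqlite3.IntegrityError as exc:
    fk_error = exc  # foreign key constraint rejects the orphan row
```

The same foreign-key value appears twice without complaint, but a value with no matching primary key is rejected.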

3. What is the distinction between data mining and data profiling? 

Data profiling focuses on examining certain data qualities, which provides useful information on data parameters such as data type, frequency, and duration, as well as their discrete values and value ranges. It gathers data, analyzes it, and performs quality checks on it to evaluate the source data’s structure and quality. 

As its name implies, the data profiling process examines the data from the provided source and helps determine precisely what content is included in the data set. Data mining, on the other hand, goes deeper: it prepares statistics and interpretations of the data and identifies patterns by recognizing associations across datasets. 

On the other hand, data mining seeks to find uncommon records, examine data clusters, and discover sequences. Data mining searches through the prebuilt database for preexisting patterns and correlations in order to maximize its usefulness. Results from data mining are produced using sophisticated algorithms and computer-driven approaches. 

4. Describe the K-means algorithm. 

The K-means partitioning technique divides objects into K groups. The method assumes that clusters are roughly spherical, that data points are oriented around each cluster’s center, and that cluster variances are similar to one another. It computes the centroids under the assumption that the number of clusters, K, is already known. Identifying the different types of groupings supports the business’s presumptions. K-means is useful for a variety of reasons, including its ability to handle big data sets and its easy adaptability to new examples. 
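
A minimal, illustrative one-dimensional K-means sketch in pure Python; a real project would typically use a library implementation such as scikit-learn:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Tiny 1-D K-means sketch (illustrative, not production code)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assume K is known up front
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], k=2)
```

With two well-separated groups of points, the two centroids settle near the center of each group.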

5. What qualities/skills do you believe a data analyst should have to succeed in this position? 

For a data analyst to succeed, problem-solving and analytical thinking are two essential abilities. To make the information gathered understandable, one needs to be adept at data formatting. Technical competence is also very important and should not be overlooked. You might also discuss additional qualifications that the interviewer seeks in an ideal applicant, based on the job description. 

6. A fresh data analytics project has been given to you. What are the first steps you’ll take and the procedures you’ll use? 

The interviewer asks this question to learn how you tackle a data problem and what thought process keeps you organized. You can begin your response by stating that you would first identify and outline the problem’s purpose so that there is clear guidance for what needs to be done. The next step would be data exploration, which is crucial when dealing with a new dataset and lets you become familiar with the data in full. You would then prepare the data for modeling, which entails addressing missing values, locating outliers, and validating the data. After validating the data, you would begin data modeling in an effort to uncover any insightful nuggets. As the last phase, the model would be put into practice and its output results monitored. 

This is a general explanation of how data analysis works. However, depending on the specifics of your data problem and the tools available, the answer may differ slightly. 
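
As a tiny illustration of the data-preparation step above, here is a sketch that fills missing values with the column mean. The helper name and the mean-imputation strategy are illustrative assumptions; real pipelines weigh several imputation options:

```python
def prepare_column(values):
    """Replace missing entries (None) with the column mean.
    Illustrative cleaning step; assumes at least one value is present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

cleaned = prepare_column([10.0, None, 14.0, None, 12.0])
```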

7. Describe “Clustering.” List the characteristics of clustering algorithms. 

The clustering approach categorizes data into groups: a clustering technique takes unlabeled items and groups them into classes. These cluster groups have the following characteristics: 

  • Hierarchical or flat 
  • Soft and hard 
  • Iterative 
  • Disjunctive 

In short, clustering places related categories of things into a single group; the data points grouped together are related to one another by one or more characteristics. 

8. When is time series analysis used? 

A technique called time series analysis is used to examine a set of data points gathered over time. When performing a time series analysis, analysts must capture data points at regular intervals rather than occasionally or arbitrarily. The goal of time series analysis is to find patterns in the way that variables change over time. Time series analysis needs many data points for its conclusions to be valid and consistent: large data sets represent the underlying process better and make it simpler to find and remove noise. They also help ensure that any patterns or trends found are not simply outliers but instead reflect genuine seasonal variation. 

A time series analysis is performed to find the underlying factors affecting the trends in the data. The observed data points can also be utilized to do predictive analysis to determine the likelihood of upcoming events. 
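
As a small illustration, a trailing moving average is one simple way to smooth noise in a regularly sampled series and expose the underlying trend. The sales figures below are made up:

```python
def moving_average(series, window):
    """Trailing moving average over a regularly sampled series
    (illustrative smoothing step, not a full time series analysis)."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

sales = [100, 102, 98, 120, 118, 122, 119]   # hypothetical weekly sales
smoothed = moving_average(sales, window=3)
```

Each smoothed point averages the current observation with the two before it, damping one-off spikes while preserving the overall upward shift.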

9. Explain Outlier. 

This question appears in nearly every data analyst interview guide. Data analysts frequently use the term “outlier” to describe a value in a sample that appears to differ significantly from the norm. Outlier values deviate markedly from the rest of the data set; they might be smaller or bigger, but they sit apart from the primary data points. Outliers may exist for a variety of causes, including measurement errors. 
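
One common rule of thumb, sketched below in Python, flags values that fall more than 1.5 times the interquartile range (IQR) beyond the quartiles. This is only one of several reasonable definitions of an outlier:

```python
def iqr_outliers(values):
    """Flag values outside 1.5 * IQR of the quartiles (illustrative sketch,
    using linear interpolation between sorted values for the quantiles)."""
    s = sorted(values)

    def quantile(q):
        pos = q * (len(s) - 1)
        lo = int(pos)
        frac = pos - lo
        return s[lo] + (s[min(lo + 1, len(s) - 1)] - s[lo]) * frac

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

outliers = iqr_outliers([10, 12, 11, 13, 12, 95])
```

Here the value 95 sits far outside the band around the quartiles and is flagged, while the tightly grouped values are kept.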

10. What is object-oriented modeling? 

A method known as object-oriented modeling (OOM) applies OOP concepts at various phases of software development. In this method, we use models to think of solutions in terms of real-world problems. The strategy’s primary goal is to close the semantic gap between the system and the real world. The entity is frequently tested before development begins, and the approach is useful for coordinating with clients. It can also be used to simplify projects in order to boost their performance. 

11. How will you classify unstructured data to find important customer trends? 

You can respond to this question by saying that you would first speak with the business stakeholders to determine the purpose of classifying this data. You would then map the data, develop an algorithm, mine the data, and visualize it. The model would be modified in accordance with new data samples and its accuracy assessed, and this would be done iteratively. To ensure that you create an enrichment model that leads to actionable results, you would work in phases while taking stakeholder comments into account. 

12. What is a transparent database management system? 

A transparent DBMS is one that distributes data while concealing its physical organization from users. 

13. What is a waterfall chart in Excel, and when would you use one? 

A waterfall chart in Excel is a particular kind of column chart used to show how an initial value changes through a sequence of positive and negative adjustments to arrive at a final value. Only the first and last columns in a standard waterfall chart represent totals; the intermediate columns display the positive and negative changes from one period to the next and appear to float. Since the floating columns resemble a bridge connecting the endpoints, waterfall charts are sometimes known as “bridge charts.” 
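
The floating columns can be understood by computing each column’s baseline from a running total, as in this illustrative sketch (Excel does this for you; the numbers are made up):

```python
def waterfall_segments(start, changes):
    """Compute (bottom, height) for each floating column of a waterfall
    chart, given a starting total and a list of signed changes."""
    segments = []
    running = start
    for change in changes:
        # A positive change rises from the running total; a negative
        # change hangs below it, so the bottom is the smaller endpoint.
        bottom = min(running, running + change)
        segments.append((bottom, abs(change)))
        running += change
    return segments, running

segments, final = waterfall_segments(100, [30, -20, 10])
```

Starting at 100, the changes +30, -20, and +10 produce floating columns whose stacked endpoints bridge the gap to the final total of 120.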

14. What is the purpose of DML? 

Data manipulation language (DML) is the group of SQL statements used expressly to alter the data in a database. DML makes it possible to process and modify data using statements such as SELECT, INSERT, UPDATE, and DELETE. 
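
The core DML statements can be demonstrated against an in-memory SQLite database using Python’s sqlite3 module; the table and values are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")

# INSERT: add rows.
conn.execute(
    "INSERT INTO staff (name, salary) VALUES ('Ada', 70000), ('Grace', 80000)")
# UPDATE: modify existing rows.
conn.execute("UPDATE staff SET salary = salary * 1.1 WHERE name = 'Ada'")
# DELETE: remove rows.
conn.execute("DELETE FROM staff WHERE name = 'Grace'")
# SELECT: read the result back.
rows = conn.execute("SELECT name, salary FROM staff").fetchall()
```

After the three modifications, only Ada remains, with her salary raised by 10%.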

15. How can a Data Analyst highlight cells containing negative values in an Excel file? 

This is the last question in our data analyst interview questions and answers. A data analyst can use conditional formatting to draw attention to an Excel sheet’s cells with negative values. The steps for conditional formatting are as follows: 

  • Select the cells with negative values first. 
  • Now select the Conditional Formatting option from the Home tab. 
  • Then go to Highlight Cells Rules and select the Less Than option. 
  • The last step requires you to provide “0” as the value in the dialog box. 


Although interviews might be nerve-wracking, if you have done your research and have practiced with mock interviews, you should be alright. You’ll ace the interview if you respond to the questions with assurance. If you’re interested in acquiring more knowledge and certification in this field, consider UNext Jigsaw’s Data Science certification courses. 
