If you are a Data Scientist, you’re well aware of the numerous SQL statements, Excel formulas, functions, and algorithms in your profession. While you have undoubtedly mastered the ones you use often, sometimes you need to jump into a project that demands different applications or unfamiliar tools in your programming language of choice.
This is a curated list of Data Science cheat sheets. These resources will make your work easier and help you become a better Data Scientist. Read on to find quick references for Python, SQL, Machine Learning, seaborn and more.
Machine Learning is changing our society, and Data Scientists are propelling that transformation. Machine Learning powers automated systems, Facebook algorithms, and search engine results. However, a significant amount of programming goes into constructing the Machine Learning models that customers interact with daily. It all starts with massive datasets and a lot of creative code.
This quick Machine Learning algorithms cheat sheet will be invaluable for Data Scientists who specialize in Machine Learning and for analysts who are preparing to enter this booming domain.
Supervised learning algorithms learn a mapping from inputs to outputs on labeled historical data and use it to make predictions on unseen data. Supervised learning models can be either regression models, which predict a continuous variable, or classification models, which predict a binary or multiclass variable.
Here we cover two types of supervised learning models: linear models and tree-based models.
Linear models
The output of a linear model is a linear combination of its input features. In this part, we will discuss the most used linear models in machine learning:
Algorithm  Description  Applications 
Linear Regression  An approach for modeling a linear relationship between inputs and a continuous numeric output variable.

Logistic Regression  An algorithm that models a linear relationship between inputs and the log-odds of a binary categorical output (1 or 0).

Ridge Regression  A member of the regression family that penalizes features with low predictive power by shrinking their coefficients toward zero. It can be used for classification and regression.

Lasso Regression  A member of the regression family that penalizes features with low predictive power by shrinking their coefficients all the way to zero. It can be used for classification and regression.
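To make the difference between the two penalties concrete, here is a minimal sketch in plain Python for a single centered feature. The toy data, the penalty strengths, and the helper names (ols_slope, ridge_slope, lasso_slope) are made up for illustration, and the one-feature closed forms below are a simplification; in a real project you would more likely use a library implementation.

```python
# Hypothetical toy data: one input feature and a numeric output.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Center both variables so the intercept drops out of the problem.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
xc = [xi - mean_x for xi in x]
yc = [yi - mean_y for yi in y]

s_xx = sum(v * v for v in xc)               # sum of squared centered inputs
s_xy = sum(a * b for a, b in zip(xc, yc))   # input/output co-variation


def ols_slope():
    """Plain linear regression: minimizes 0.5 * sum((y - b*x)^2)."""
    return s_xy / s_xx


def ridge_slope(lam):
    """Ridge: adds 0.5 * lam * b^2, which shrinks b toward zero."""
    return s_xy / (s_xx + lam)


def lasso_slope(lam):
    """Lasso: adds lam * |b|; soft-thresholding can set b to exactly zero."""
    shrunk = max(abs(s_xy) - lam, 0.0)
    return (shrunk if s_xy >= 0 else -shrunk) / s_xx


print("OLS slope:  ", ols_slope())
print("Ridge slope:", ridge_slope(10.0))
print("Lasso slope:", lasso_slope(5.0), "and with a large penalty:", lasso_slope(25.0))
```

With these assumed numbers, the ridge slope is smaller than the unpenalized slope but never reaches zero, while a large enough lasso penalty removes the feature entirely.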

Tree-based models
To make predictions, tree-based models build decision trees that apply a set of “if-then” rules. In this part, we will go through some of the most often used tree-based models in machine learning.
Algorithm  Description  Applications 
Decision Tree  To create predictions, Decision Tree models apply decision rules to features. It can be used for classification and regression.

Random Forests  A form of ensemble learning that combines the output of several decision trees.

Gradient Boosting Regression  Uses boosting to build a predictive model from an ensemble of weak learners.

XGBoost  A gradient boosting implementation that is efficient and flexible. It can be used for both classification and regression problems.

LightGBM Regressor  A gradient boosting framework that is intended to be more efficient than existing approaches.
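The “if-then” idea behind these models can be sketched with the smallest possible tree: a decision stump that learns a single threshold on one feature. This is only an illustration in plain Python; the data (hours studied and a pass/fail label) and the function names fit_stump and predict_stump are hypothetical, and real decision trees choose splits with impurity measures over many features.

```python
def fit_stump(xs, ys):
    """Find the threshold on one feature that misclassifies the fewest points."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for threshold in candidates:
        # Try both ways of labeling the two sides of the split.
        for left_label, right_label in ((0, 1), (1, 0)):
            errors = sum(
                (left_label if x <= threshold else right_label) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return threshold, left_label, right_label


def predict_stump(stump, x):
    """Apply the learned if-then rule to a new value."""
    threshold, left_label, right_label = stump
    if x <= threshold:
        return left_label
    return right_label


# Hypothetical data: hours studied and whether the student passed (1) or not (0).
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

stump = fit_stump(hours, passed)
print("Learned rule: if hours <=", stump[0], "then", stump[1], "else", stump[2])
print("Prediction for 2 hours:", predict_stump(stump, 2))
print("Prediction for 10 hours:", predict_stump(stump, 10))
```

Ensemble methods such as Random Forests and Gradient Boosting combine many trees of this kind, each deeper than a single stump.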

Unsupervised learning is concerned with identifying general patterns in unlabeled data. Such segmentation is general and can be applied to a wide range of objects. Clustering methods learn how to group similar data points together, while association algorithms discover rules about which items tend to occur together, based on predefined criteria.
Algorithm  Description  Applications 
K-Means  The most widely used approach; it derives K clusters based on Euclidean distances.

Hierarchical Clustering  A bottom-up methodology in which each data point starts as its own cluster, and the two nearest clusters are repeatedly merged.

Gaussian Mixture Models  A probabilistic approach for representing normally distributed clusters in a dataset.
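As a rough illustration of how K-Means uses Euclidean distances, here is a minimal sketch in plain Python. The points, the starting centroids, and the fixed iteration count are assumptions chosen to keep the example deterministic; library implementations add smarter initialization and convergence checks.

```python
import math


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point by Euclidean distance."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # New centroid = coordinate-wise mean; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
initial_centroids = [(1, 1), (8, 8)]  # assumed starting guesses, K = 2

centroids, labels = kmeans(points, initial_centroids)
print("Labels:", labels)
print("Centroids:", centroids)
```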

Algorithm  Description  Applications 
Apriori Algorithm  A rule-based technique that identifies the frequent itemsets in a given dataset using prior knowledge of frequent itemset properties.
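The “prior knowledge” Apriori relies on is that every subset of a frequent itemset must itself be frequent, so larger candidates are built only from smaller frequent itemsets. The sketch below shows this level-by-level search in plain Python; the shopping baskets and the support threshold are invented for illustration, and rule generation (confidence, lift) is left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Build (k+1)-item candidates from frequent k-itemsets, and prune any
        # candidate that has an infrequent k-item subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

frequent = apriori(baskets, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```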

SQL
Data Scientists worldwide use SQL to organize data into tables and work with different datasets. SQL is often used to extract the data needed for a specific study, after which Python and its many specialized modules handle the more challenging analysis.
As a Data Scientist, you will utilize the following SQL commands and functions:
Basic SQL Cheat Sheet
Keyword  Description
SELECT  States which columns to query.
FROM  Declares which table or view to select from.
WHERE  Filters rows by a condition.
=  Compares a column value to a given value.
LIKE  Used in the WHERE clause to match a specific pattern in a column.
GROUP BY  Groups rows that share the same values.
HAVING  Specifies that only groups whose aggregate values meet the given conditions should be returned.
INNER JOIN  Returns the rows that have matching records in both tables.
LEFT JOIN  Returns all rows from the left table with the matching rows from the right table.
RIGHT JOIN  Returns all rows from the right table with the matching rows from the left table.
FULL OUTER JOIN  Returns all rows from both tables, matched where possible.
Function  Description
COUNT  Returns the number of rows.
SUM  Returns the sum of the values.
AVG  Returns the average of the values.
MIN  Returns the smallest value in the group.
MAX  Returns the largest value in the group.
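Since the queries are usually run from Python, here is a small sketch that tries a few of these keywords and functions with Python's built-in sqlite3 module. The tables students and grades and their contents are hypothetical. RIGHT JOIN and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grades (student_id INTEGER, subject TEXT, score REAL);
    INSERT INTO students VALUES (1, 'Alex'), (2, 'Blake'), (3, 'Casey');
    INSERT INTO grades VALUES (1, 'math', 90), (1, 'physics', 80), (2, 'math', 70);
""")

# INNER JOIN: only students that have at least one grade.
inner_rows = conn.execute("""
    SELECT s.name, g.subject
    FROM students AS s
    INNER JOIN grades AS g ON g.student_id = s.id
    ORDER BY s.name, g.subject
""").fetchall()

# LEFT JOIN + aggregates: every student, even those without grades.
left_rows = conn.execute("""
    SELECT s.name, COUNT(g.score), AVG(g.score)
    FROM students AS s
    LEFT JOIN grades AS g ON g.student_id = s.id
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

# HAVING: keep only groups whose aggregate meets the condition.
having_rows = conn.execute("""
    SELECT s.name, COUNT(*)
    FROM students AS s
    INNER JOIN grades AS g ON g.student_id = s.id
    GROUP BY s.name
    HAVING COUNT(*) > 1
""").fetchall()

conn.close()

print(inner_rows)
print(left_rows)   # Casey appears with a count of 0 and an average of None
print(having_rows)
```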
SQL  Description
SELECT student FROM class  Select the data in column student from a table named class.
SELECT * FROM class  Select all rows and columns from table class.
SELECT student FROM class WHERE student = 'Alex'  Select the data in column student from table class where student = 'Alex'.
SELECT student FROM class ORDER BY student [ASC|DESC]  Select the data in column student from table class, sorted by student (ascending by default, or descending with DESC).
SELECT student FROM class ORDER BY student LIMIT n OFFSET offset  Select the data in column student from table class, skip the first offset rows, and return the next n rows.
SELECT student, aggregate(subject) FROM class GROUP BY student  Select the data in column student from table class, group the rows by student, and apply an aggregate function to subject.
SELECT student, aggregate(subject) FROM class GROUP BY student HAVING condition  Select the data in column student from table class, group the rows by student with an aggregate function, and filter the groups using the HAVING condition.
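The query patterns above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the rows in the class table are invented, COUNT stands in for the generic aggregate function, and the HAVING threshold is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (student TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO class (student, subject) VALUES (?, ?)",
    [
        ("Alex", "math"), ("Alex", "physics"),
        ("Blake", "math"),
        ("Casey", "art"), ("Casey", "math"), ("Casey", "music"),
    ],
)

# WHERE: filter rows by a condition (the value is passed as a parameter).
alex_rows = conn.execute(
    "SELECT student FROM class WHERE student = ?", ("Alex",)
).fetchall()

# ORDER BY ... LIMIT n OFFSET offset: skip 1 row, return the next 2.
page_rows = conn.execute(
    "SELECT student FROM class ORDER BY student LIMIT 2 OFFSET 1"
).fetchall()

# GROUP BY with an aggregate function.
grouped_rows = conn.execute(
    "SELECT student, COUNT(subject) FROM class GROUP BY student ORDER BY student"
).fetchall()

# GROUP BY ... HAVING: keep only students with at least two subjects.
having_rows = conn.execute(
    "SELECT student, COUNT(subject) FROM class "
    "GROUP BY student HAVING COUNT(subject) >= 2 ORDER BY student"
).fetchall()

conn.close()

print(alex_rows)
print(page_rows)
print(grouped_rows)
print(having_rows)
```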
SQL  Description
INSERT INTO class(column_list) VALUES (value_list)  Insert a row into table class.
INSERT INTO class(column_list) VALUES (value_list), (value_list), …  Insert several rows into table class.
INSERT INTO class(column_list) SELECT column_list FROM subject  Insert rows from table subject into table class.
UPDATE class SET student = new_value  Set a new value in column student of table class for all rows.
UPDATE class SET student = new_value, father_name = new_value WHERE condition  Update the values in columns student and father_name of table class for the rows that meet the condition.
DELETE FROM class  Delete all rows from table class.
DELETE FROM class WHERE condition  Delete all rows from table class that meet a certain condition.
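Here is a short sketch of the same data-modification statements, again using Python's built-in sqlite3 module. The column layout of class and subject and all the names in the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (student TEXT, father_name TEXT)")
conn.execute("CREATE TABLE subject (student TEXT, father_name TEXT)")

# INSERT a single row.
conn.execute(
    "INSERT INTO class (student, father_name) VALUES (?, ?)", ("Alex", "Sam")
)

# INSERT several rows in one call.
conn.executemany(
    "INSERT INTO class (student, father_name) VALUES (?, ?)",
    [("Blake", "Lee"), ("Casey", "Pat")],
)

# INSERT ... SELECT: copy rows from another table.
conn.execute("INSERT INTO subject (student, father_name) VALUES ('Drew', 'Kim')")
conn.execute(
    "INSERT INTO class (student, father_name) "
    "SELECT student, father_name FROM subject"
)

# UPDATE only the rows that meet the condition.
conn.execute(
    "UPDATE class SET father_name = ? WHERE student = ?", ("Jordan", "Alex")
)

# DELETE only the rows that meet the condition.
conn.execute("DELETE FROM class WHERE student = ?", ("Blake",))

rows = conn.execute(
    "SELECT student, father_name FROM class ORDER BY student"
).fetchall()
conn.close()

print(rows)
```

Leaving out the WHERE clause in UPDATE or DELETE affects every row in the table, so it is worth double-checking the condition before running either statement on real data.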
Data Science is a demanding discipline that requires solid mathematics. Depending on your field of study, you may need to use calculus, linear algebra, and statistics regularly. To progress in the discipline, Data Scientists must understand these ideas thoroughly and know how they apply in various contexts.
Math cheat sheets are tools that let Data Science students and experts quickly find a certain equation or double-check their work.
Even for competent Data Scientists, many of these equations might get hazy if not used daily. This is your quick-reference basic linear algebra Data Science cheat sheet, containing basic terminology that Data Scientists might need.
TERM  NOTATION 
vector  denoted by a lowercase letter v with an arrow above
scalar  any real number, e.g. 2, 1, ⅓ or π
matrix  A, denoted by a capital letter; an m × n matrix
m × n  m rows by n columns
basis vectors  denoted by the letters i, j and k with a ^ (hat) above
mapping  T: R^m → R^n, a transformation from m-dimensional to n-dimensional space
determinant  scalar; the (signed) area or volume spanned by the matrix's column vectors
cross product  vector perpendicular to the plane of two vectors in three dimensions
dot product  scalar; measures how much one vector points in the direction of another
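A few of these terms are easier to remember once you compute them. The sketch below implements the dot product, the cross product, a 2 × 2 determinant, and a matrix acting as a mapping from R^3 to R^2 in plain Python; the function names and the sample vectors are chosen only for illustration.

```python
def dot(u, v):
    """Dot product: a scalar, the sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3-D vectors: a vector perpendicular to both."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply(matrix, v):
    """Apply an m x n matrix to an n-vector, giving an m-vector."""
    return tuple(dot(row, v) for row in matrix)


print(dot((1, 2, 3), (4, 5, 6)))                 # 32
print(cross((1, 0, 0), (0, 1, 0)))               # (0, 0, 1)
print(det2([[3, 1], [1, 2]]))                    # 5
print(apply([[1, 0, 2], [0, 1, 1]], (1, 2, 3)))  # (7, 5)
```

The cross product of the first two basis vectors gives the third, and its dot product with either input is zero, which matches the "perpendicular" entry in the table.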
Data Science Resources
If you’re just starting your career in Data Science or are still studying to become a Data Scientist, you need to brush up on essential terminology and Excel functions. This cheat sheet lists important shortcuts, commands, and pasteable formulas that will save you time.
Function  Shortcut 
Add Current Date  ctrl+; 
Add Current Time  shift+ctrl+; 
Edit Cell Comment  shift+F2
Show Active Cell  ctrl+backspace 
Add Column  alt+I+C
Add Row  alt+I+R
Fill Down  ctrl+D 
Fill Right  ctrl+R 
Save Workbook  shift+alt+F2 
Add Chart  Alt+F1 
Move to Last  ctrl+END 
Formulas require a cell reference. How you define the cell reference affects how the formula is applied and how it behaves when copied from one cell to another.
Relative Cell Reference  =A2+B2 
Absolute Cell Reference  =$A$1
Function  Syntax  Description 
DATE  DATE(year, month, day)  Returns a date given the year, month, and day.
DATEDIF  DATEDIF(start_date, end_date, unit)  Calculates the time between two dates in the given unit.
DAY  DAY(serial_number)  Returns the day of the month of a date (an integer from 1 to 31).
EDATE  EDATE(start_date, months)  Adds a number of months to a start date.
EOMONTH  EOMONTH(start_date, months)  Like EDATE, but returns the last day of the resulting month.
NOW  NOW()  Returns the serial number of the current date and time.
TODAY  TODAY()  Returns the serial number of the current date.
YEAR  YEAR(serial_number)  Returns the year of a date as an integer.
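If you move between Excel and Python, it can help to see rough Python counterparts of these date functions. The helpers edate and eomonth below are hypothetical names written for this sketch with the standard datetime and calendar modules; they are meant to mirror the behavior described in the table, not to reproduce every Excel edge case.

```python
import calendar
from datetime import date


def edate(start, months):
    """Shift a date by a number of months, clamping the day to the month's length."""
    month_index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def eomonth(start, months):
    """Last day of the month that lies the given number of months from start."""
    month_index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    return date(year, month, calendar.monthrange(year, month)[1])


d = date(2024, 1, 31)                             # like DATE(2024, 1, 31)
print(d.day, d.year)                              # like DAY(...) and YEAR(...)
print(edate(d, 1))                                # like EDATE(start_date, 1)
print(eomonth(date(2023, 11, 10), 2))             # like EOMONTH(start_date, 2)
print((date(2024, 3, 1) - date(2024, 1, 1)).days) # like DATEDIF(start, end, "D")
print(date.today())                               # like TODAY()
```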
The cheat sheets recommended in this article are a narrowed-down list of the best. They will keep you covered in your projects and help you brush up on your skills.
It’s critical to keep up with innovations in this fast-changing digital industry, no matter where you are on your Data Science journey. Every aspect of your profession is subject to change and progress over time. Data analysis programming languages, tools, and procedures keep improving and becoming more robust. That is one of the things that makes this profession so appealing.
Learning is a never-ending process. So, continue learning and advance professionally. Enroll in the latest online programs and webinars on big data, deep learning, Machine Learning, or Artificial Intelligence if you want to dive further into a specific field of Data Science.