A common problem with machine learning models is that, while being trained, they learn the noise patterns (irrelevant variation) in the training data and then fail to predict unseen test data accurately. Such a model suffers from overfitting: because it learns the training data's specific noise along with its genuine patterns, it fails to generalize to unseen data. Hence it needs regularization.
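As an illustration (a minimal sketch on synthetic data; the dataset, model degrees, and seed are all assumptions, not from this article), a very flexible model can drive training error down while test error balloons:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (40, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 40)  # signal + noise
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

errors = {}
for degree in (1, 15):
    # higher degree = more flexible model, more capacity to memorize noise
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    errors[degree] = (mean_squared_error(y_tr, model.predict(X_tr)),
                      mean_squared_error(y_te, model.predict(X_te)))
# the degree-15 fit pushes training error far below the degree-1 fit,
# but it typically generalizes worse to the held-out test split
```

The complex model wins on the data it has seen; regularization is what keeps that flexibility from hurting performance on data it has not.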
To regularize, in plain terms, is to make something acceptable or regular. In machine learning, regularization is a process that shrinks the model's coefficients toward zero. In other words, regularization methods discourage the model from learning an overly complex, flexible function, and thereby prevent overfitting.
The basic idea of regularization in machine learning is to add a complexity term to the loss function, so that more complex models incur larger loss values and are penalized accordingly.
Take the simple linear regression relationship below:

Y ≈ W_0 + W_1 X_1 + W_2 X_2 + ⋯ + W_P X_P
Here, Y is the value (or learned relation) to be predicted, X_1 through X_P are the features that determine it, W_1 through W_P are the weights of those features, and W_0 is the bias.
To fit this regression model, one also needs a loss function, defined over the weights and bias, to optimize in order to predict Y. Linear regression uses the residual sum of squares (RSS) as its loss function, denoted by

RSS = ∑_(j=1)^m (Y_j − W_0 − ∑_(i=1)^n W_i X_ji)²

which is also called the objective of linear regression without regularization.
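The RSS above can be computed directly. Here is a minimal NumPy sketch with a toy design matrix (all the numbers are illustrative, not from the article):

```python
import numpy as np

# toy data: m = 3 samples, n = 2 features -- purely illustrative values
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5]])
Y = np.array([5.0, 4.0, 8.0])
W = np.array([1.0, 1.0])   # weights W_1 .. W_n
W0 = 1.0                   # bias W_0

residuals = Y - (W0 + X @ W)      # Y_j - W_0 - sum_i W_i * X_ji
rss = np.sum(residuals ** 2)      # residual sum of squares
print(rss)                        # 7.5
```

The predictions are [4.0, 3.5, 5.5], the residuals [1.0, 0.5, 2.5], and squaring and summing them gives RSS = 7.5.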
The algorithm uses this loss function to learn from the training dataset by adjusting the coefficients, or weights. Given a noisy dataset, however, it overfits: the estimated coefficients do not generalize to unseen data, which is where regularization comes in. Regularization penalizes the larger coefficients and pushes the learned estimates toward zero.
The two main regularization techniques in machine learning are Lasso and Ridge regression, which differ in how they penalize the coefficients: the L1 and the L2 penalty, respectively.
The L1 (Lasso) technique modifies the RSS by adding a shrinkage quantity, or penalty, equal to the sum of the absolute values of the coefficients. The estimated coefficients then minimize the modified loss function below:

∑_(j=1)^m (Y_j − W_0 − ∑_(i=1)^n W_i X_ji)² + α ∑_(i=1)^n |W_i| = RSS + α ∑_(i=1)^n |W_i|
Lasso regression thus penalizes the absolute values of the coefficients, which is how it differs from ridge regression: its penalty is based on the weights' absolute values rather than their squares. The optimization algorithm now inflicts a penalty on high coefficients via what is called the L1 norm. The value of α plays the same role as the ridge regression tuning parameter: it is a trade-off parameter balancing the RSS against the magnitude of the coefficients.
If α = 0, we have simple linear regression. If α = ∞, the lasso regression coefficients are zero. For 0 < α < ∞, the coefficients lie between zero and those of simple linear regression. Lasso appears very similar to ridge regression, but let's look at both techniques from a different perspective. In ridge regression, the sum of squares of the coefficients, or weights, is constrained to be at most s; for a two-parameter model, this constraint is W_1² + W_2² ≤ s.
Hence, the ridge regression coefficients are the point with the smallest loss function among all points lying within the circle given by that equation. In lasso regression, the sum of the moduli of the weights is constrained to be at most s, giving the equation |W_1| + |W_2| ≤ s.
The lasso regression coefficients are then the point with the smallest loss function among all points lying within the diamond given by that equation.
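The geometric picture above explains lasso's best-known property: the diamond's corners sit on the axes, so the solution often lands there and zeroes some coefficients out entirely. This can be seen with scikit-learn's Lasso estimator; the sketch below uses synthetic data (the data, the α value of 0.5, and the seed are all assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
# only the first two features carry signal; the other three are pure noise
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 100)

ols = LinearRegression().fit(X, y)   # alpha = 0: plain least squares
lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty with tuning parameter alpha

print(ols.coef_.round(2))    # all five coefficients nonzero
print(lasso.coef_.round(2))  # the L1 penalty zeroes out weak coefficients
```

Unlike ordinary least squares, which assigns small but nonzero weights to the noise features, the lasso fit sets some coefficients to exactly zero and shrinks the rest, acting as a built-in feature selector.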
The L2 (Ridge) technique modifies the RSS by adding a shrinkage quantity, or penalty, given by the square of the magnitude of the coefficients, yielding the modified loss function below:

∑_(j=1)^m (Y_j − W_0 − ∑_(i=1)^n W_i X_ji)² + α ∑_(i=1)^n W_i² = RSS + α ∑_(i=1)^n W_i²
Here, α is the tuning parameter that scales the shrinkage quantity; its value decides how heavily the model is penalized.
If α = 0, the penalty is zero, and we have simple linear regression. If α = ∞, the ridge regression coefficients are zero, since the modified loss function effectively ignores the core RSS term and minimizes only the squares of the coefficients, driving them to zero. If 0 < α < ∞, the coefficients lie between zero and those of simple linear regression. Selecting the value of α is therefore critical. This ridge penalty is called the L2 norm.
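The effect of increasing α can be seen with scikit-learn's Ridge estimator. In this synthetic sketch (the data and the α values are assumptions), the norm of the weight vector shrinks as α grows, without any coefficient ever reaching exactly zero:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(0, 0.1, 100)

norms = {}
for alpha in (0.1, 1.0, 100.0):
    r = Ridge(alpha=alpha).fit(X, y)      # L2 penalty alpha * sum(W_i^2)
    norms[alpha] = np.linalg.norm(r.coef_)
    print(alpha, r.coef_.round(3))
# larger alpha shrinks the weight vector toward (but never exactly to) zero
```

This contrasts with lasso: the L2 penalty shrinks all coefficients smoothly, while the L1 penalty can snap individual coefficients to exactly zero.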
To improve model performance, overfitting is prevented by imparting L1 or L2 regularization to the regression model, as in the two techniques discussed above.
There are no right or wrong ways of learning AI and ML technologies: the more, the better! These valuable resources can be the starting point for your journey into Artificial Intelligence and Machine Learning. Does pursuing AI and ML interest you? If you want to step into the world of emerging tech, you can accelerate your career with these Machine Learning and AI Courses by Jigsaw Academy.