Recommendation System – Created by Machine Learning
Recommendation engines are a subclass of machine learning systems that rank or rate products or people. Broadly defined, a recommender system is a system that predicts the rating a user would give to a specific item.
A recommendation system copes with an abundance of data by filtering out the most relevant information, based on the data a user provides and on other signals that reflect the user’s choices and interests. It estimates how compatible a user and an item are, drawing on similarities between users and between items, and makes recommendations accordingly.
These systems benefit both users and the services that deploy them, and they improve the quality of decision-making.
What Are Recommender Systems?
Many businesses struggle to define what a good recommendation is, which is a challenge in its own right. Once “good” has been defined, that definition can be used to assess the performance of the recommender you built. The quality of a recommendation can be measured with a variety of techniques that gauge coverage and accuracy. Accuracy is the proportion of correct recommendations out of all possible recommendations, while coverage is the percentage of items in the search space for which the system can make recommendations.
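As a rough illustration of these two measures, the sketch below computes them for a toy catalog. The function names and the sample data are hypothetical, and accuracy is simplified here to the share of recommended items that the user actually liked.

```python
def accuracy(recommended, relevant):
    """Share of recommended items that the user actually liked."""
    if not recommended:
        return 0.0
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(recommended)


def coverage(recommendations_per_user, catalog):
    """Share of catalog items that appear in at least one recommendation list."""
    recommended_items = set()
    for items in recommendations_per_user.values():
        recommended_items.update(items)
    return len(recommended_items & set(catalog)) / len(catalog)


# Hypothetical data: a five-item catalog and two users.
catalog = ["a", "b", "c", "d", "e"]
recs = {"u1": ["a", "b"], "u2": ["b", "c"]}
liked = {"u1": {"a"}, "u2": {"b", "c"}}

print(accuracy(recs["u1"], liked["u1"]))  # 0.5
print(coverage(recs, catalog))            # 0.6
```

A recommender can score well on one measure and poorly on the other, which is why the two are usually reported together.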
How a recommendation is evaluated depends solely on the dataset and the approach used to generate it. Recommender systems share many conceptual similarities with classification and regression modeling problems. Ideally, you would observe how real users respond to recommendations and track metrics around those users in order to improve the recommender; in practice, this is quite difficult to do. Common statistical accuracy measures for a recommender include RMSD (root mean square deviation) and MAE (mean absolute error), typically estimated with k-fold cross-validation.
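The measures named above can be sketched in a few lines of NumPy. The helper names are hypothetical, and the model being scored is only a global-mean baseline chosen to keep the example short, not a real recommender.

```python
import numpy as np


def rmsd(actual, predicted):
    """Root mean square deviation between actual and predicted ratings."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))


def k_fold_scores(ratings, k=5, seed=0):
    """Score a global-mean baseline with k-fold cross-validation.

    Each fold is held out once; the baseline predicts the mean rating of
    the remaining folds for every held-out rating.
    """
    ratings = np.asarray(ratings, float)
    indices = np.random.default_rng(seed).permutation(len(ratings))
    folds = np.array_split(indices, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        prediction = np.full(len(test_idx), ratings[train_idx].mean())
        scores.append((rmsd(ratings[test_idx], prediction),
                       mae(ratings[test_idx], prediction)))
    return scores


print(rmsd([4, 2, 5], [3, 2, 3]))  # about 1.29
print(mae([4, 2, 5], [3, 2, 3]))   # 1.0
```

RMSD penalizes large errors more heavily than MAE, so it is never smaller than MAE on the same predictions.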
According to recent research, 91% of consumers are more inclined to choose brands that offer exclusive customer experiences.
Collaborative Filtering System
Collaborative filtering techniques make recommendations based primarily on past interactions between users and the target items. The input to a collaborative filtering system is therefore all of the historical user-item interaction data. This information is usually stored in a matrix whose rows represent users and whose columns represent items.
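A minimal sketch of such a matrix, built with pandas from a made-up interaction log (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical interaction log: one row per (user, item, rating) event.
interactions = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": ["burger", "salad", "burger", "pizza", "pizza"],
    "rating": [5, 2, 4, 5, 3],
})

# Rows are users, columns are items; NaN marks pairs with no interaction.
user_item = interactions.pivot_table(index="user_id", columns="item_id", values="rating")
print(user_item)
```

Most cells in a real matrix of this kind are empty, since each user interacts with only a small part of the catalog.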
Memory-based and model-based approaches are the two main categories of collaborative filtering methods.
Memory-based recommender systems are often known as neighborhood-based collaborative filtering systems. Ratings for user-item pairs are predicted directly from the historical ratings of their neighbors. Memory-based methods can be further divided into user-based and item-based collaborative filtering. User-based filtering assumes that like-minded users yield strong and similar recommendations. Item-based filtering recommends items according to the similarity between items, computed from the ratings users have given them.
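The user-based variant can be sketched as a similarity-weighted average. The rating matrix below is made up, the function names are hypothetical, and treating unrated items as 0 inside the cosine similarity is a simplification rather than a recommended practice.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 3.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
])


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum, weight_total = 0.0, 0.0
    for other in range(ratings.shape[0]):
        if other == user or ratings[other, item] == 0:
            continue  # skip the target user and users who have not rated the item
        sim = cosine_similarity(ratings[user], ratings[other])
        weighted_sum += sim * ratings[other, item]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else 0.0


# User 0 has not rated item 2; users 1 and 2 have (3 and 5 respectively).
print(predict_user_based(ratings, user=0, item=2))
```

Because user 0 rates items much like user 1 does, the prediction lands closer to user 1's rating of 3 than to user 2's rating of 5.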
Model-based collaborative filtering approaches are predictive models built with machine learning. Features associated with the dataset are parameterized as the model’s inputs, and the parameters are learned by solving an optimization problem. Examples of model-based approaches include decision trees, rule-based methods, and latent factor models.
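As one concrete instance of a latent factor model, the sketch below factorizes a small rating matrix with stochastic gradient descent. The function name `factorize`, the hyperparameters, and the toy ratings are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 3.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
])


def factorize(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=2000, seed=0):
    """Fit user and item factor matrices to the observed (non-zero) ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors
    observed = np.argwhere(ratings > 0)
    for _ in range(epochs):
        for u, i in observed:
            err = ratings[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()  # keep the old user factors for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q


P, Q = factorize(ratings)
predicted = P @ Q.T
# The model also produces a score for the cell user 0 never rated.
print(predicted[0, 2])
```

The optimization problem here is the regularized squared error over the observed ratings; the learned factors then fill in the unobserved cells.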
Advantages of Collaborative Filtering System
The key benefits of collaborative filtering models are their ease of implementation and their high level of coverage. They are also advantageous because they require no knowledge of item content and can capture subtle characteristics (particularly true of latent factor models).
Disadvantages of Collaborative Filtering System
The major disadvantage of this approach is that it is poorly suited to recommending new items, because no user has interacted with them yet. This is known as the cold start problem. Memory-based techniques are also known to perform poorly on highly sparse datasets.
Example of a collaborative filtering algorithm:
Consider, for example, a matrix mapping users to their favorite lunch items, where all users are Americans who love cheeseburgers.
import graphlab
import pandas as pd

# Load up the data with pandas
r_cols = ['user_id', 'food_item', 'rating']
train_data_df = pd.read_csv('train_data.csv', sep='\t', names=r_cols)
test_data_df = pd.read_csv('test_data.csv', sep='\t', names=r_cols)

# Convert the pandas dataframes to GraphLab SFrames
train_data = graphlab.SFrame(train_data_df)
test_data = graphlab.SFrame(test_data_df)

# Train the model (keyword arguments assumed here; the column names follow r_cols)
collab_filter_model = graphlab.item_similarity_recommender.create(train_data,
                                                                  user_id='user_id',
                                                                  item_id='food_item',
                                                                  target='rating')

# Make recommendations (k assumed to be the number of items per user)
which_user_ids = [1, 2, 3, 4]
how_many_recommendations = 5
item_recommendation = collab_filter_model.recommend(users=which_user_ids,
                                                    k=how_many_recommendations)
Content-based Recommendation System
The content-based filtering (CBF) algorithm makes suggestions based on specific item attributes by identifying similarities. These systems build data profiles based on description data, which may include user or item characteristics. The built profiles are then utilized to suggest products comparable to those the user has previously enjoyed, purchased, watched, or listened to.
A content-based system alleviates the cold start problem for newly released products, because the keywords supplied for each product already give the system plenty of information about its features.
Content-based systems have drawbacks of their own, however. First, on the largest systems, the tagging process entails a tremendous workload. Second, cold starts remain a problem for new customers, because very little historical information is available about them. Finally, the algorithms may behave in a conservative, risk-averse manner, recommending categories of goods and content that a customer has already purchased while avoiding novel, possibly intriguing items.
Example of the content-based system:
Product Feed for Amazon
import pandas as pd
import numpy as np
from numpy import dot
from numpy.linalg import norm

def normalize(data):
    '''
    This function will normalize the input data to be between 0 and 1
    params:
        data (List) : The list of values you want to normalize
    returns:
        The input data normalized between 0 and 1
    '''
    min_val = min(data)
    if min_val < 0:
        # shift the values so that the smallest one becomes 0
        data = [x + abs(min_val) for x in data]
    max_val = max(data)
    return [x / max_val for x in data]

def ohe(df, enc_col):
    '''
    This function will one hot encode the specified column and add it back
    onto the input dataframe
    params:
        df (DataFrame) : The dataframe you wish for the results to be appended to
        enc_col (String) : The column you want to one hot encode
    returns:
        The OHE columns added onto the input dataframe
    '''
    # dtype=float keeps the encoded columns numeric for the similarity step
    ohe_df = pd.get_dummies(df[enc_col], dtype=float)
    ohe_df.reset_index(drop=True, inplace=True)
    return pd.concat([df, ohe_df], axis=1)

class CBRecommend():
    def __init__(self, df):
        self.df = df

    def cosine_sim(self, v1, v2):
        '''
        This function will calculate the cosine similarity between two vectors
        '''
        return dot(v1, v2) / (norm(v1) * norm(v2))

    def recommend(self, book_id, n_rec):
        '''
        params:
            book_id (string): The index label of the book to find matches for
            n_rec (int): amount of rec user wants
        returns:
            The n_rec rows most similar to book_id, with a 'sim' column
        '''
        # leave out a 'sim' column left over from an earlier call
        features = self.df.drop(columns='sim', errors='ignore')
        # calculate similarity of input book_id vector w.r.t all other vectors
        inputVec = features.loc[book_id].values
        self.df['sim'] = features.apply(lambda x: self.cosine_sim(inputVec, x.values), axis=1)
        # returns top n user specified books
        return self.df.nlargest(columns='sim', n=n_rec)

if __name__ == '__main__':
    PATH = '../data/data.csv'
    # import data
    df = pd.read_csv(PATH)
    # normalize the num_pages, ratings, price columns
    df['num_pages_norm'] = normalize(df['num_pages'].values)
    df['book_rating_norm'] = normalize(df['book_rating'].values)
    df['book_price_norm'] = normalize(df['book_price'].values)
    # OHE on publish_year, genre and language
    df = ohe(df = df, enc_col = 'publish_year')
    df = ohe(df = df, enc_col = 'book_genre')
    df = ohe(df = df, enc_col = 'text_lang')
    # drop redundant columns
    cols = ['publish_year', 'book_genre', 'num_pages', 'book_rating', 'book_price', 'text_lang']
    df.drop(columns = cols, inplace = True)
    df.set_index('book_id', inplace = True)
    # ran on a sample as an example
    t = df.copy()
    cbr = CBRecommend(df = t)
    # recommend books similar to the first book in the index
    print(cbr.recommend(book_id = t.index[0], n_rec = 5))
Personalized Video Ranker
The Personalized Video Ranker (PVR) algorithm was developed because understanding consumer preferences is so important for OTT (Over-the-Top) services. PVR selects the top matches for each user individually from the entire catalog.
A Netflix homepage, for example, typically displays 40 rows, with each row containing about 70 items. These numbers vary with the device used to view the content.
Candidate Generation Network
The candidate generation network uses deep neural networks to examine each user’s history, including likes, comments, and the most popular digital content, among other signals, in order to forecast what the user is likely to prefer next. It is combined with a ranking network, which draws on richer information about each piece of content to score the recommendations.
YouTube, one of the leading providers of personalized digital media, uses a candidate generation network to keep its users engaged.
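The two-stage idea of candidate generation followed by ranking can be sketched with plain NumPy. This is not YouTube's implementation: the embeddings are random stand-ins for what a trained network would produce, and the `item_freshness` feature and its weight are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 1000, 16

# Stand-ins for learned embeddings; a real system would obtain these from a trained network.
item_embeddings = rng.normal(size=(n_items, dim))
user_embedding = rng.normal(size=dim)
# Hypothetical extra per-item feature that only the ranking stage looks at.
item_freshness = rng.uniform(size=n_items)


def generate_candidates(user_vec, item_vecs, n_candidates=50):
    """Stage 1: cheap dot-product retrieval over the whole catalog."""
    scores = item_vecs @ user_vec
    return np.argsort(scores)[::-1][:n_candidates]


def rank(candidates, user_vec, item_vecs, freshness, top_k=5):
    """Stage 2: rescore only the candidates with a richer, hand-made score."""
    scores = item_vecs[candidates] @ user_vec + 2.0 * freshness[candidates]
    order = np.argsort(scores)[::-1][:top_k]
    return candidates[order]


candidates = generate_candidates(user_embedding, item_embeddings)
top_items = rank(candidates, user_embedding, item_embeddings, item_freshness)
print(top_items)
```

Splitting the work this way lets the expensive scoring run on a few dozen candidates instead of the full catalog.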
Knowledge-based Recommender systems
Knowledge-based engines are among the earliest known recommender systems. They draw on large and varied datasets, and they interpret the intents, context, and entities of a user request in order to match it against knowledge stored digitally in a company’s backend. This type of recommendation system extracts domain knowledge from the business and applies it through “if-this-then-that” rules.
Financial services, digital cameras, and travel locations are examples of item domains pertinent for knowledge-based recommender systems.
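A knowledge-based recommender for the digital camera domain might look like the following sketch. The catalog, the rule thresholds, and the function name are all made up for illustration.

```python
# Hypothetical catalog; names and specifications are invented.
CATALOG = [
    {"name": "Compact A", "price": 400, "weight_g": 300, "weather_sealed": False},
    {"name": "Mirrorless B", "price": 1200, "weight_g": 650, "weather_sealed": True},
    {"name": "DSLR C", "price": 900, "weight_g": 1100, "weather_sealed": True},
]


def recommend_camera(catalog, budget, use_case):
    """Apply explicit if-this-then-that rules to filter the catalog."""
    matches = [c for c in catalog if c["price"] <= budget]
    if use_case == "travel":
        # If the user travels, then keep only light cameras.
        matches = [c for c in matches if c["weight_g"] <= 700]
    elif use_case == "outdoor":
        # If the user shoots outdoors, then require weather sealing.
        matches = [c for c in matches if c["weather_sealed"]]
    return [c["name"] for c in matches]


print(recommend_camera(CATALOG, budget=1000, use_case="outdoor"))  # ['DSLR C']
print(recommend_camera(CATALOG, budget=1500, use_case="travel"))   # ['Compact A', 'Mirrorless B']
```

No interaction history is needed here, which is why this style suits items that people buy rarely.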
An intriguing trend in machine learning recommendation systems is the use of reinforcement learning: the system offers clients recommendations, tracks their reactions, and adjusts accordingly. This makes it possible to maintain a balance between discovering new content and exploiting what is already known to work. The best feature of using reinforcement learning to build recommender systems is that the algorithm can not only suggest the content users are likely to find most useful, but also open new vistas by offering some random recommendations.
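One simple way to mix useful and random recommendations is an epsilon-greedy strategy, sketched below as a simulation. The click probabilities, the function name, and the choice of epsilon-greedy itself are assumptions for illustration; real systems may use quite different reinforcement learning methods.

```python
import random


def epsilon_greedy(click_probability, n_rounds=10000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy recommendation over a fixed set of items.

    click_probability holds the (hidden) chance that a user clicks each item.
    """
    rng = random.Random(seed)
    n_items = len(click_probability)
    shows = [0] * n_items
    clicks = [0] * n_items
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            item = rng.randrange(n_items)  # explore: random recommendation
        else:
            # exploit: item with the best observed click rate so far
            item = max(range(n_items),
                       key=lambda i: clicks[i] / shows[i] if shows[i] else 0.0)
        shows[item] += 1
        if rng.random() < click_probability[item]:
            clicks[item] += 1
    return shows, clicks


shows, clicks = epsilon_greedy([0.05, 0.10, 0.30])
print(shows)
```

Raising `epsilon` produces more random recommendations and faster discovery, at the cost of showing the best-known item less often.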