Recommender System (1) — Overview & Matrix Factorization technique
This article gives an overview of recommender systems. In the following articles, I will go through individual papers and write down what I learned about recommender systems with ML. Starting in March, I will be in charge of development related to personalization and recommendation as an intern on the NAVER maps team, and I plan to publish this series as I study in advance.
We already experience recommendation systems through various services in our daily lives, often without noticing. On the home tab of YouTube or Netflix, in online shopping malls, or in online dating apps like Tinder, many platforms first show the items I’m likely to be interested in. A recommendation system is important because it lets users quickly pick what they want within a limited amount of time when they are surrounded by an overwhelming amount of content and information. Tech companies such as YouTube, Amazon, and Netflix invest in these recommendation systems because good personalized recommendations provide another level of user experience.
Anyone who uses these recommendation systems has probably wondered about this (YouTube in particular keeps recommending videos I might be interested in, so I get drawn into them without realizing it): how does the system figure out a user’s preferences? These systems likely use machine learning or deep learning, which have become popular recently, but in fact various recommendation techniques existed even before ML was widely used.
The recommendation system begins by analyzing the targets for which you want to get results. You can analyze the items you want to provide, or you can analyze users who receive recommendations.
- items : products, movies, events, articles
- users : service users, readers, customers
Collaborative Filtering vs Content-based Filtering
Recommendation systems are basically divided into two strategies: collaborative filtering and content-based filtering. More recently, other methods have appeared, such as multi-criteria, hybrid, and session-based recommender systems, but these two kinds of filtering remain the most basic approaches.
Content-based Filtering (CBF)
The content-based filtering approach creates a profile for each user or product to characterize its nature. It makes recommendations based on items the user liked in the past.
user profile : the type of item this user likes
item profile : set of discrete attributes and features
If the user prefers a specific item, the system recommends items similar to it. Because it does not compare users with each other and uses only item characteristics, most early recommendation systems with few users are said to have used content-based recommenders.
However, profiles can be difficult to build because they often require external information that is unavailable or hard to collect. The alternative is collaborative filtering.
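The idea of recommending items similar to one the user liked can be sketched with cosine similarity over item profiles. This is only a minimal illustration: the item names, the three attribute columns, and the helper names `cosine_similarity` and `recommend_similar` are all made up for this example.

```python
import numpy as np

# Hypothetical item profiles: rows are items, columns are attributes
# (here assumed to be genre flags: action, comedy, romance).
item_names = ["Movie A", "Movie B", "Movie C", "Movie D"]
item_profiles = np.array([
    [1.0, 0.0, 0.0],   # Movie A: action
    [1.0, 1.0, 0.0],   # Movie B: action + comedy
    [0.0, 1.0, 1.0],   # Movie C: comedy + romance
    [0.0, 0.0, 1.0],   # Movie D: romance
])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recommend_similar(liked_index, profiles, top_k=2):
    """Rank the other items by similarity to the item the user liked."""
    scores = [
        (j, cosine_similarity(profiles[liked_index], profiles[j]))
        for j in range(len(profiles))
        if j != liked_index
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# The user liked Movie A (index 0); list the most similar items.
for j, score in recommend_similar(0, item_profiles):
    print(item_names[j], round(score, 3))
```

Note that no other user's data is involved: the ranking depends only on the item attributes, which is why this approach works even with very few users.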
Collaborative Filtering (CF)
Collaborative filtering analyzes relationships between users and interdependencies among products to identify new user-item associations.
The CF method does not require creating explicit profiles and relies solely on past user behavior.
The two primary areas of CF are the neighborhood methods and latent factor models.
Neighborhood Methods (NM)
Focus on calculating the relationship between items or users
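As a rough sketch of an item-oriented neighborhood method, the example below computes cosine similarity between item columns of a small ratings matrix and predicts a missing rating as a similarity-weighted average. The ratings are invented, and treating 0 as "not rated" inside the cosine is a simplifying assumption rather than a recommended practice.

```python
import numpy as np

# Hypothetical ratings matrix: rows are users, columns are items, 0 = not rated.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def item_similarity(r):
    """Cosine similarity between the item columns of the ratings matrix."""
    norms = np.linalg.norm(r, axis=0)
    norms[norms == 0] = 1.0              # avoid division by zero for unrated items
    normalized = r / norms
    return normalized.T @ normalized

def predict(r, sim, user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    rated = np.flatnonzero(r[user])      # indices of items this user has rated
    rated = rated[rated != item]
    weights = sim[item, rated]
    if weights.sum() == 0:
        return 0.0
    return float(weights @ r[user, rated] / weights.sum())

sim = item_similarity(ratings)
print(predict(ratings, sim, user=0, item=2))
```

User 0 rated item 3 poorly, and item 3 is the closest neighbor of item 2, so the predicted rating for item 2 is pulled toward the low end.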
Latent Factor Models (LF)
The classification of the Recommender System is as shown in the diagram below.
Matrix Factorization Techniques (MF)
The most successful realizations of latent factor models are based on matrix factorization. MF characterizes both users and items by factor vectors inferred from rating patterns. When a user’s factors and an item’s factors correspond strongly, the item is recommended. MF is reported to be superior to classic nearest-neighbor techniques for recommendation, and it offers scalability, high accuracy, and flexibility.
In order for the recommendation system to make a recommendation, there must be input data by default. There are two main types of input (feedback).
Explicit feedback : feedback intentionally provided by users.
- like/dislike buttons
- star ratings (e.g., Netflix)
Implicit feedback : feedback inferred indirectly from user behavior and interactions.
- user viewed an item or item’s details
- user added the items to the watchlist or cart
- user purchased an item
- user has read an article to the end
Usually, if only explicit feedback is used, the resulting user-item matrix tends to be sparse, because any single user is likely to have rated only a small fraction of the items. One strength of MF is that it can incorporate additional information such as implicit feedback. When explicit feedback is scarce, implicit feedback indirectly reflects the user’s opinion, and it usually yields a densely filled matrix.
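To make the sparsity point concrete, here is a small sketch that builds one matrix from explicit ratings and another from implicit events, then compares how many cells are filled. The logs, the event names, and the weights in `event_weight` are all hypothetical values chosen for illustration.

```python
import numpy as np

n_users, n_items = 4, 5

# Hypothetical logs: (user, item, stars) and (user, item, event).
explicit = [(0, 0, 5), (1, 2, 3), (3, 4, 4)]
implicit = [(0, 1, "view"), (0, 0, "purchase"), (1, 2, "view"),
            (1, 3, "cart"), (2, 0, "view"), (2, 4, "purchase"), (3, 1, "view")]

# Assumed event weights; a real system would tune these.
event_weight = {"view": 1.0, "cart": 2.0, "purchase": 3.0}

# Explicit matrix: one cell per star rating.
ratings = np.zeros((n_users, n_items))
for u, i, stars in explicit:
    ratings[u, i] = stars

# Implicit matrix: accumulate a weight for every observed interaction.
interactions = np.zeros((n_users, n_items))
for u, i, event in implicit:
    interactions[u, i] += event_weight[event]

def density(m):
    """Fraction of user-item cells that hold any signal."""
    return np.count_nonzero(m) / m.size

print("explicit density:", density(ratings))
print("implicit density:", density(interactions))
```

In this toy log, user 2 never rated anything but still appears in the implicit matrix, which is the kind of extra signal the paragraph above refers to.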
The Basic Matrix Factorization Model
(Briefly introduce the paper)
Matrix factorization models map both users and items to a joint latent factor space of dimensionality f, such that user-item interactions are modeled as inner products in that space.
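Each item i is associated with a vector $q_i$ and each user u with a vector $p_u$ in that space, and user u’s rating of item i is approximated by the inner product $\hat{r}_{ui} = q_i^T p_u$.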
The system learns the model by fitting the previously observed ratings. However, the goal is to generalize those previous ratings in a way that predicts future, unknown ratings.
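A minimal sketch of this fitting step, assuming stochastic gradient descent with L2 regularization as described in the cited paper, might look like the following. The function names `train_mf` and `rmse`, the hyperparameter values (`f`, `lr`, `reg`, `epochs`), and the toy ratings are assumptions made for illustration, not values from the paper.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, f=2, lr=0.01, reg=0.02,
             epochs=500, seed=0):
    """Fit user factors P and item factors Q by SGD on observed ratings only."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, f))   # user vectors p_u
    Q = rng.normal(scale=0.1, size=(n_items, f))   # item vectors q_i
    for _ in range(epochs):
        # Visit the observed ratings in a fresh random order each epoch.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - Q[i] @ P[u]                  # error of the inner-product prediction
            p_old = P[u].copy()                    # keep old p_u for the q_i update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the given (user, item, rating) triples."""
    errors = [r - Q[i] @ P[u] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))

# Hypothetical observed ratings as (user, item, rating) triples.
observed = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 1, 5),
            (2, 2, 5), (2, 3, 4), (3, 2, 4), (3, 3, 5),
            (0, 3, 1), (2, 0, 1)]

P, Q = train_mf(observed, n_users=4, n_items=4)
print("train RMSE:", rmse(observed, P, Q))
print("predicted rating for user 1, item 3:", Q[3] @ P[1])
```

The last line predicts a rating that was never observed, which is the generalization the model is actually meant for. The regularization term `reg` is there to keep the factors from simply memorizing the observed ratings.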
Adding Input Sources
The cold start problem is the difficulty of making recommendations when not enough data has been collected. For a newly signed-up user, accurate recommendations are not possible because there is no information about what the user likes or dislikes. In a service based on rating data, the same problem occurs for existing users who never leave a rating.
That’s why behavior information is needed.
This is what I personally find most interesting while reading this paper.
The model described above is a static (time-independent) model. But let’s think about our daily life. As we grow up, our tastes, including our taste in movies, change over time. Likewise, the perception and popularity of products constantly change with trends. Thus, the system should account for temporal effects that reflect the dynamic, time-drifting nature of user-item interactions.
Y. Koren, R. Bell and C. Volinsky, “Matrix Factorization Techniques for Recommender Systems”, 2009