Evaluation
Evaluating recommender systems is an essential step in determining their efficacy and impact. These systems are designed to deliver personalized recommendations that improve user engagement and satisfaction, and without proper evaluation it is impossible to know whether a system is achieving its intended objectives. We evaluated both our Matrix Factorization and Autoencoder-based Collaborative Filtering models; we first present the evaluation of the Autoencoder-based model, followed by Matrix Factorization.
Evaluation of recommender systems involves assessing performance against specific metrics such as accuracy, coverage, diversity, and novelty. These metrics gauge a system's ability to predict user preferences and make relevant recommendations, and evaluation also surfaces biases or limitations in the system, pointing to areas for improvement. By evaluating recommender systems, businesses can make informed decisions about how to optimize them and ultimately improve the user experience.
Autoencoder-based Collaborative Filtering
After we generate the predicted ratings matrix, using the user similarity matrix as weights on the original ratings matrix, we calculate the root mean squared error (RMSE) on a held-out test set. We implemented the following API:
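The exact listing is not reproduced here; below is a minimal sketch consistent with the description that follows. The method name `evaluate` and the arguments `test_ratings` and `n_train_users` are assumed for illustration; `self.user_ratings`, `mask`, `masked_predicted_ratings`, and `masked_actual_ratings` come from the description.

```python
import numpy as np

def evaluate(self, test_ratings, n_train_users):
    """RMSE on the held-out test set, using only non-zero (observed) ratings.

    Sketch of an assumed method on the trained model class: `test_ratings`
    is the (n_test_users x n_items) matrix of actual test ratings, and
    `n_train_users` marks where the training rows end.
    """
    # Consider only entries the test users actually rated.
    mask = test_ratings > 0
    # Predicted ratings for the test users: slice self.user_ratings from
    # the end of the training set to the end of the matrix (rows = users).
    predicted = self.user_ratings[n_train_users:, :]
    masked_predicted_ratings = predicted[mask]
    masked_actual_ratings = test_ratings[mask]
    # RMSE = sqrt(mean((predicted - actual) ** 2))
    return np.sqrt(np.mean((masked_predicted_ratings - masked_actual_ratings) ** 2))
```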
Here, we only take the non-zero ratings for evaluation. The function uses the trained model to generate predicted ratings for the test set: it slices the self.user_ratings array from the end of the training set to the end of the matrix along the rows (users), keeping only the entries where mask is True, which gives masked_predicted_ratings. Finally, it computes the RMSE between masked_predicted_ratings and masked_actual_ratings, the predicted and actual ratings for the test set. Here is the graph of RMSE obtained after training the model for different numbers of epochs:
We see that the RMSE settles at around 1.6, meaning that, on average, the model's predicted rating is off by 1.6 units from the actual rating. This is better than the RMSE of 2.2 that we obtained from Matrix Factorization.
Matrix Factorization
We used the root mean squared error (RMSE) metric to measure the accuracy of our MF model, obtaining an RMSE of 2.2 after training for 200 epochs. Here is the graph of MF performance:
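For reference, here is a minimal sketch of how an MF model's RMSE can be computed from its learned factors. The factor matrices `P` (user factors) and `Q` (item factors) are assumed names, not taken from the original code; the original implementation may differ.

```python
import numpy as np

def mf_rmse(P, Q, ratings):
    """RMSE of a matrix-factorization model on the observed entries.

    Sketch only: P (n_users x k) and Q (n_items x k) are assumed to be
    the learned user and item factor matrices.
    """
    predicted = P @ Q.T    # reconstruct the full predicted ratings matrix
    mask = ratings > 0     # evaluate only on observed (non-zero) ratings
    errors = predicted[mask] - ratings[mask]
    return np.sqrt(np.mean(errors ** 2))
```

Tracking this value at the end of each epoch produces the training curve shown in the graph above.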