
alex's book recommender app

If you can't see the app, please give the server some time to load it up.


25th January 2021

Background

As a data scientist, I spend a lot of time wrangling data, building machine learning models and extracting key insights from my results. However, I had never made a user-friendly application out of my machine learning models before. I wanted to make something fun, easy and somewhat useful, so I decided to learn how to build Flask applications that would let me repurpose my recommender systems skills for a specific user experience.

Requirements for this project:


Technical Details

I've written about recommender systems in another post, so you can learn more about them here. Also, the code for this project can be found here.

I've chosen to build an item-based collaborative filtering (CF) recommender (reasons explained below) using data from Goodreads containing 10,000 books and 6 million ratings. The book titles that can be entered are therefore limited to those 10,000 books. Not every book has an image associated with it, which is another limitation of the data. If my data had included book metadata (e.g. genre, number of pages, etc.), I could have tried building a content-based recommender.
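
To make the item-based idea concrete, here is a minimal sketch of how an item-item similarity matrix can be computed from ratings. It assumes cosine similarity over raw ratings and uses a tiny made-up dataset; the actual project may use a different similarity measure or library, and the names `item_vectors`, `cosine` and `similarity_matrix` are purely illustrative.

```python
from collections import defaultdict
from math import sqrt

# Toy (user, book, rating) triples, invented for illustration only.
ratings = [
    ("u1", "Book A", 5), ("u1", "Book B", 4),
    ("u2", "Book A", 4), ("u2", "Book B", 5), ("u2", "Book C", 1),
    ("u3", "Book B", 2), ("u3", "Book C", 5),
]

def item_vectors(ratings):
    """Group ratings by book: {book: {user: rating}}."""
    vectors = defaultdict(dict)
    for user, book, rating in ratings:
        vectors[book][user] = rating
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)

def similarity_matrix(ratings):
    """Nested dict {book: {other_book: similarity}} for every pair of books."""
    vectors = item_vectors(ratings)
    books = sorted(vectors)
    return {
        x: {y: cosine(vectors[x], vectors[y]) for y in books if y != x}
        for x in books
    }

sim = similarity_matrix(ratings)
```

In this toy data, "Book A" and "Book B" are rated similarly by the same users, so they end up closer to each other than "Book A" and "Book C". With 10,000 books the same idea produces a 10,000 x 10,000 matrix, which is why its size becomes a concern.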

The similarity matrix generated is around 800MB in size, so I've stored it in an SQL database, which allows the backend to retrieve the necessary data and produce results on demand. It takes about 15-30 seconds to run everything (depending on your internet speed).
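
As a rough illustration of the storage idea, the sketch below keeps similarities in a SQL table and fetches only the rows for one book. It uses Python's built-in `sqlite3` with an in-memory database; the table name `similarity`, its columns and the helper `neighbours` are assumptions made for this example, and the real app may use a different database and schema.

```python
import sqlite3

# In-memory database for the example; a real app would connect to a file or server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE similarity ("
    "book TEXT, other TEXT, score REAL, PRIMARY KEY (book, other))"
)

# Made-up similarity scores in long format: one row per ordered pair of books.
rows = [
    ("Book A", "Book B", 0.93), ("Book A", "Book C", 0.12),
    ("Book B", "Book A", 0.93), ("Book B", "Book C", 0.41),
    ("Book C", "Book A", 0.12), ("Book C", "Book B", 0.41),
]
conn.executemany("INSERT INTO similarity VALUES (?, ?, ?)", rows)
conn.commit()

def neighbours(conn, book, limit=10):
    """Return the most similar books to `book` as (title, score) pairs."""
    cur = conn.execute(
        "SELECT other, score FROM similarity "
        "WHERE book = ? ORDER BY score DESC LIMIT ?",
        (book, limit),
    )
    return cur.fetchall()
```

Calling `neighbours(conn, "Book A", limit=1)` reads a single row instead of the whole matrix, which is the point of keeping the data in a database rather than loading 800MB into memory per request.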

The recommendations will be heavily skewed towards novels and stories due to the nature of the dataset from Goodreads. Of course, if I had built this out of the entire Goodreads database, there would be a much larger selection of books to choose from and to recommend. Nevertheless, I hope you'll get some good recommendations.


Thought Process

As you know, recommender systems have a wide range of applications (e.g. recommending videos, songs, shopping items, etc.) for different platforms. This application is a little different from the typical recommender in that it gives recommendations on the fly based on user inputs. Usually a company will store all the ratings of its users in a database and continually save new ratings, which can be used to update its recommender systems later. These are often updated overnight, so there is no need for realtime recommendations and no disruption to the user experience.

However, in my case, my application has to provide realtime recommendations, as it would be unreasonable to ask users to sign up, enter hundreds of ratings and wait a day or so for their recommendations to appear. So I chose to build an item-based CF, as it allows me to store a fixed similarity matrix, unlike a user-based CF, which requires a new similarity matrix to be generated each time a new user enters ratings.
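
Here is a minimal sketch of why a fixed item-item matrix suits realtime use: a brand-new user's handful of ratings can be scored against precomputed similarities without retraining anything. The similarity-weighted average below is one common way to do this and is an assumption on my part, as are the toy numbers and the function name `recommend`.

```python
# Precomputed item-item similarities (made-up values for illustration).
sim = {
    "Book A": {"Book B": 0.9, "Book C": 0.1, "Book D": 0.8},
    "Book B": {"Book A": 0.9, "Book C": 0.2, "Book D": 0.3},
    "Book C": {"Book A": 0.1, "Book B": 0.2, "Book D": 0.7},
    "Book D": {"Book A": 0.8, "Book B": 0.3, "Book C": 0.7},
}

def recommend(user_ratings, sim, top_n=5):
    """Rank unrated books by a similarity-weighted average of the user's ratings."""
    scores = {}
    weights = {}
    for book, rating in user_ratings.items():
        for other, s in sim.get(book, {}).items():
            # Skip books the user has already rated and non-positive similarities.
            if other in user_ratings or s <= 0:
                continue
            scores[other] = scores.get(other, 0.0) + s * rating
            weights[other] = weights.get(other, 0.0) + s
    predicted = {b: scores[b] / weights[b] for b in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# A new visitor who loved Book A and disliked Book C.
picks = recommend({"Book A": 5, "Book C": 1}, sim)
```

Here "Book B" ranks above "Book D" because it is close to the book the visitor liked and far from the one they disliked. Nothing about the matrix changes when a new visitor arrives, whereas a user-based approach would need to compare the new user against every existing user first.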


Closing Comments

Recommender systems are tricky to build, as they are heavily dependent on the kind of data you have. For example, it doesn't matter how good your algorithm is: if you don't have enough data, or if the ratings in your data are heavily skewed towards one book, the results will inevitably be bad. It's also very difficult to measure the accuracy of your models. Metrics like hit-rate, diversity, churn etc. can be helpful, but whether you've been recommended a good book or not is subjective. Building a good recommender seems to be more of an art than a science sometimes!
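
For reference, hit-rate is usually measured with a leave-one-out setup: hide one book a user liked, generate a top-N list from their remaining ratings, and count a hit if the hidden book shows up. The sketch below assumes that setup; `hit_rate`, the stand-in recommender `always_popular` and the test cases are all hypothetical.

```python
def hit_rate(held_out, recommend_fn, top_n=10):
    """Fraction of cases where the hidden book appears in the top-N list.

    held_out: list of (remaining_ratings, hidden_book) pairs.
    recommend_fn: callable taking (remaining_ratings, top_n) and returning titles.
    """
    if not held_out:
        return 0.0
    hits = 0
    for remaining, hidden in held_out:
        if hidden in recommend_fn(remaining, top_n):
            hits += 1
    return hits / len(held_out)

def always_popular(remaining, top_n):
    """Stand-in recommender: suggest popular books the user has not rated."""
    popular = ["Book A", "Book B", "Book C"]
    return [b for b in popular if b not in remaining][:top_n]

cases = [
    ({"Book A": 5}, "Book B"),  # hidden book is in the suggestions
    ({"Book B": 4}, "Book D"),  # hidden book is never suggested
]
rate = hit_rate(cases, always_popular, top_n=2)
```

A number like this is easy to compute but only says whether the model can recover books people already rated, not whether a new suggestion is actually a good read.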

This was a super interesting project for me. I learnt a lot of new things and am now much more familiar with recommender systems. It was also a good opportunity to combine my data science and web development skills, giving me a better understanding of how machine learning models can be transformed into user-friendly applications. I look forward to building some more apps in the future!
