Graph-Powered Machine Learning has already introduced us to content-based recommendations and collaborative filtering. These are the two most used approaches to providing recommendations. However, they both need information about the users to do so. What if you do not have user information? That’s where session-based recommendations come in.
What will you learn from this blog?
- What are session-based recommendations, and when to use them.
- What is important to keep in mind when building session-based recommendations.
- How to optimise k-NNs to provide session-based recommendations faster.
When should you use session-based recommendations?
Session-based recommendations can provide recommendations even without information about the users. But when is that? These days it’s hard to imagine that a site doesn’t know who its users are. Websites collect information about your location, preferred language, and more. This information is collected to tailor the experience for you. But it’s not enough to identify users and create user profiles. Identification techniques like cookies can’t create such user profiles either. They are not reliable enough, and they also come with many privacy concerns.
At the end of the day, you don’t have usable information about the users if they don’t log in. This is often the case for booking sites (travel, holidays, accommodation, and so on). The users find a site, browse around, view several items, and log in only when they are ready to book, if they found something interesting.
Session-based recommendations use anonymous user interactions to provide recommendations. First, the interactions are grouped into sessions – interactions within a specific timeframe. Recommendations are then provided based on user activity in the active session. You can use different approaches to provide recommendations, including nearest neighbour-based (k-NN) approaches. (We have also used these in collaborative filtering.) These approaches are simple and provide high-quality recommendations.
To provide session-based recommendations, you need to:
- Group user interactions into sessions.
- Convert the session and item information into a graph.
- Compute similarities and store them in a k-nearest neighbour (k-NN) network.
- Provide recommendations using the current session and item information.
Modelling session data in a graph
When modelling the session data, you need to make sure that:
- You model the sequence of the user interactions. This increases the recommendation quality by letting you focus on the latest interactions. Time decay does this by assigning lower values to older interactions. Relevance windows do it by keeping only the x latest interactions. Both keep the recommendations relevant, and a relevance window also saves space.
- You can link item IDs and item metadata easily. Users interact with items. Being able to identify these items allows you to provide content-based recommendations.
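Time decay and relevance windows can be combined in a few lines. The sketch below is only an illustration: the function name `recency_weights`, the `(item_id, timestamp)` event shape, and the exponential half-life formula are assumptions, not something prescribed by the book.

```python
from typing import List, Tuple


def recency_weights(
    events: List[Tuple[str, float]],  # (item_id, timestamp in seconds); assumed shape
    now: float,
    window: int = 3,                  # relevance window: keep the x latest events
    half_life: float = 600.0,         # assumed decay: weight halves every 10 minutes
) -> List[Tuple[str, float]]:
    """Keep the `window` latest events and give older ones a lower weight."""
    latest = sorted(events, key=lambda e: e[1])[-window:]
    return [(item, 0.5 ** ((now - ts) / half_life)) for item, ts in latest]


events = [("a", 0.0), ("b", 600.0), ("c", 900.0), ("d", 1200.0)]
for item, weight in recency_weights(events, now=1200.0):
    print(item, round(weight, 3))  # "a" falls outside the window; "d" gets weight 1.0
```

A shorter half-life makes the recommendations react faster to what the user just clicked, at the cost of forgetting earlier interest sooner.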
Here is an example graph model of session data.
Each session is connected to a user. In our graph, sessions have a start and an end time/date. You can also collect more information about sessions, such as location or device used.
The sessions contain events (user interactions such as clicks, views, and searches). The NEXT relationship connecting the events models the sequential order. As mentioned above, a relevance window is used to model only the x latest events. This addresses speed and space concerns and helps ensure recommendation quality.
Each event – user interaction – links to different items. Item metadata and features are modelled as nodes. You can use this information to provide content-based recommendations.
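To make the model more concrete, here is a minimal sketch that turns one session into graph edges. Only the NEXT relationship comes from the model described above. The `CONTAINS` and `RELATES_TO` relationship names, the node naming scheme, and the function `session_to_edges` are hypothetical and chosen just for this illustration. In a real project you would write these edges to your graph database instead of a Python list.

```python
from typing import List, Optional, Tuple

Edge = Tuple[str, str, str]  # (source node, relationship type, target node)


def session_to_edges(session_id: str, item_ids: List[str], window: int = 5) -> List[Edge]:
    """Model the latest `window` events (window >= 1) of a session as edges."""
    edges: List[Edge] = []
    previous: Optional[str] = None
    for position, item_id in enumerate(item_ids[-window:]):
        event = f"{session_id}:event{position}"
        edges.append((session_id, "CONTAINS", event))           # hypothetical name
        edges.append((event, "RELATES_TO", f"item:{item_id}"))  # hypothetical name
        if previous is not None:
            edges.append((previous, "NEXT", event))             # sequential order
        previous = event
    return edges


for edge in session_to_edges("s1", ["a", "b", "c"], window=2):
    print(edge)
```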
Before users can book their restaurant, vacation, or flight, they must log in. After they do so, the anonymous user and all their interactions are linked to a real user. This means it is still possible to see a user’s interactions from before they logged in.
The session ends with a booking event or after a certain threshold. The threshold can be a particular time of inactivity or a certain number of clicks. Using such a threshold has several benefits:
- It ensures the relevance of the data stored.
- It increases the quality of the recommendations provided.
- It reduces the amount of data stored in your database.
- It allows you to track which sessions ended with a booking/sale and which with the user leaving.
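Here is one way such thresholds could be applied when grouping raw interactions into sessions. This is a sketch under assumptions: the `(visitor_id, timestamp, item_id)` event shape, the function name `split_into_sessions`, and the default values (30 minutes of inactivity, 50 events) are made up for the example. The `visitor_id` stands for whatever anonymous token your site assigns to a browser.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, float, str]  # (visitor_id, timestamp in seconds, item_id); assumed shape


def split_into_sessions(
    events: List[Event],
    max_gap: float = 1800.0,  # assumed inactivity threshold
    max_events: int = 50,     # assumed click-count threshold
) -> List[List[Event]]:
    """Start a new session on a new visitor, a long pause, or a full session."""
    sessions: List[List[Event]] = []
    last_visitor: Optional[str] = None
    last_ts = 0.0
    for event in sorted(events, key=lambda e: (e[0], e[1])):
        visitor, ts, _ = event
        if (
            visitor != last_visitor
            or ts - last_ts > max_gap
            or len(sessions[-1]) >= max_events
        ):
            sessions.append([])
        sessions[-1].append(event)
        last_visitor, last_ts = visitor, ts
    return sessions


clicks = [("v1", 0.0, "a"), ("v1", 100.0, "b"), ("v1", 5000.0, "c"), ("v2", 50.0, "a")]
for session in split_into_sessions(clicks):
    print([item for _, _, item in session])
```

A booking event could close a session in the same way, by adding one more condition to the check.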
Providing session-based recommendations with nearest neighbour approaches
You can take many approaches to provide session-based recommendations. Item- or session-based k-NNs are simple and provide quality recommendations.
Item-based k-NNs
Item-based k-NNs consider only the last element (item) in a session. They find the items most similar to it and recommend those to the user. The similarity is computed based on the co-occurrence of items in other sessions.
These k-NNs work the same as the ones in collaborative filtering. You first represent the items as vectors and then apply a function to compute similarity. The top k similar items are the resulting recommendations.
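The sketch below shows the idea on toy data. It assumes each item is represented by the set of sessions it occurs in (a binary vector) and uses cosine similarity as the similarity function. Both are choices made for this example, as are the function names.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cosine(a: Set[str], b: Set[str]) -> float:
    """Cosine similarity between two binary vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def similar_items(
    last_item: str, sessions: Dict[str, List[str]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k items that co-occur most with the last item of a session."""
    occurs_in: Dict[str, Set[str]] = defaultdict(set)
    for session_id, items in sessions.items():
        for item in items:
            occurs_in[item].add(session_id)
    target = occurs_in.get(last_item, set())
    scored = [
        (other, cosine(target, session_ids))
        for other, session_ids in occurs_in.items()
        if other != last_item
    ]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:k]


sessions = {"s1": ["a", "b"], "s2": ["a", "b", "c"], "s3": ["a", "c"], "s4": ["d"]}
print(similar_items("b", sessions))  # "a" ranks first, then "c"; "d" never co-occurs
```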
Computing similarity for each item pair takes too much time, especially if you’re aiming for near-real-time recommendations. To speed up the process, item similarities can be precomputed and stored in the training data. These should be updated regularly, for example every x hours or clicks.
Another way of speeding up the process is to consider only a subset of the data. This requires finding item pairs that are likely to be similar. Algorithms such as locality-sensitive hashing or nearest-neighbour search can help you do this. Once you find the pairs that are likely to be similar, you compute similarities only for these item pairs.
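As an illustration of the locality-sensitive hashing idea, here is a small MinHash sketch with banding. It is one possible variant, not the specific algorithm used in the book. The function names, the use of SHA-256 as a seeded hash, and the band and row counts are all assumptions. Items whose session sets are alike tend to share a bucket in at least one band, and only those pairs are passed on to the exact similarity computation.

```python
import hashlib
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple


def _seeded_hash(seed: int, value: str) -> int:
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(members: Set[str], num_hashes: int) -> Tuple[int, ...]:
    """One minimum per seeded hash function; similar sets share many minima."""
    return tuple(
        min(_seeded_hash(seed, member) for member in members)
        for seed in range(num_hashes)
    )


def candidate_pairs(
    item_sessions: Dict[str, Set[str]], bands: int = 4, rows: int = 2
) -> Set[Tuple[str, str]]:
    """Return item pairs that land in the same bucket in at least one band."""
    buckets: Dict[Tuple[int, Tuple[int, ...]], Set[str]] = defaultdict(set)
    for item, session_ids in item_sessions.items():
        if not session_ids:
            continue
        signature = minhash_signature(session_ids, bands * rows)
        for band in range(bands):
            key = (band, signature[band * rows:(band + 1) * rows])
            buckets[key].add(item)
    pairs: Set[Tuple[str, str]] = set()
    for items in buckets.values():
        pairs.update(combinations(sorted(items), 2))
    return pairs


item_sessions = {"a": {"s1", "s2", "s3"}, "b": {"s1", "s2", "s3"}, "c": {"s9"}}
print(candidate_pairs(item_sessions))  # always contains ("a", "b"): identical sets
```

More rows per band make the filter stricter, and more bands make it more forgiving, so the two values let you trade speed against missed pairs.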
Session-based k-NNs
Session-based k-NNs are more accurate than item-based ones. They look at the entire session (or the last x interactions), and compare it to other sessions. The items from similar sessions are then recommended to the users.
With this approach, you first need to compute the top k similar sessions. Then, you compute a score for each item from those sessions that does not yet appear in the current session. The top-scored items are the user’s recommendations.
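The two steps can be sketched as follows. The example assumes cosine similarity between the item sets of two sessions, and it scores an item as the sum of the similarities of the neighbour sessions that contain it. Both are common choices, but they are assumptions here, and so are the function names.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cosine(a: Set[str], b: Set[str]) -> float:
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(
    current: List[str],
    past_sessions: Dict[str, List[str]],
    k: int = 2,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    current_items = set(current)
    # Step 1: find the top k sessions most similar to the current one.
    similarities = [
        (session_id, cosine(current_items, set(items)))
        for session_id, items in past_sessions.items()
    ]
    neighbours = sorted(
        (pair for pair in similarities if pair[1] > 0),
        key=lambda pair: (-pair[1], pair[0]),
    )[:k]
    # Step 2: score every item the current session has not seen yet.
    scores: Dict[str, float] = defaultdict(float)
    for session_id, similarity in neighbours:
        for item in set(past_sessions[session_id]) - current_items:
            scores[item] += similarity
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]


past = {"s1": ["a", "b", "c"], "s2": ["a", "b", "d"], "s3": ["a", "e"], "s4": ["x", "y"]}
print(recommend(["a", "b"], past, k=3))  # "c" and "d" outrank "e"
```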
You can probably tell that this would also take a lot of time. But don’t worry, you can optimise and speed up the process.
Again, it’s possible to precompute and update the similarities between sessions. Furthermore, we can filter out all the sessions that don’t have the items from the current session. While this process helps, it still takes time and requires a lot of memory.
An alternative optimisation takes advantage of the graph way of storing data. This process finds the possible neighbours and uses a sample of these to compute a k-NN. The sample can be random or composed of the most recent sessions. The k-NN is computed in real time, but the graph is used to sample and filter the items to compute scores. Algorithms compute scores for the items in the k-NN and provide recommendations. This optimisation is faster and needs less memory than precomputing the similarities between all sessions.
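In a graph database, finding the possible neighbours is a short traversal from the current session’s items to the sessions that contain them. The sketch below imitates that traversal with an in-memory index so you can see the sampling step in isolation. The class `SessionIndex` and its methods are hypothetical, and the sample is made of the most recent sessions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class SessionIndex:
    """Toy stand-in for the item-to-session relationships stored in the graph."""

    def __init__(self) -> None:
        self.sessions_with: Dict[str, Set[str]] = defaultdict(set)
        self.last_seen: Dict[str, float] = {}

    def add(self, session_id: str, items: Iterable[str], timestamp: float) -> None:
        for item in items:
            self.sessions_with[item].add(session_id)
        self.last_seen[session_id] = timestamp

    def possible_neighbours(self, current_items: Iterable[str], sample_size: int) -> List[str]:
        """Sessions sharing at least one item, sampled by recency."""
        candidates: Set[str] = set()
        for item in current_items:
            candidates |= self.sessions_with.get(item, set())
        by_recency = sorted(candidates, key=lambda s: (-self.last_seen[s], s))
        return by_recency[:sample_size]


index = SessionIndex()
index.add("s1", ["a", "b"], timestamp=10.0)
index.add("s2", ["a", "c"], timestamp=20.0)
index.add("s3", ["d"], timestamp=30.0)
index.add("s4", ["b"], timestamp=40.0)
print(index.possible_neighbours(["a", "b"], sample_size=2))  # ['s4', 's2']
```

Similarities and item scores then only need to be computed for this small sample instead of for every stored session.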
Conclusion
Session-based recommendations use (anonymous) user interactions to provide recommendations. Thus they can provide recommendations even without information about the users. This is often the case for booking sites, where users don’t log in until the end of their session.
Graphs allow you to model the session data, and compute a k-nearest neighbour network. You can use item- or session-based k-NN to provide session-based recommendations. Session-based k-NN is more accurate. Optimisations are needed to speed up the process of computing the k-NN networks. The best optimisation includes finding possible neighbours and computing scores only for these.
Once again, graphs provide an excellent solution for providing recommendations. The main advantages of the graph model for session-based recommendations are:
- The events chain can be easily represented in a graph model.
- Time decay and relevance windows allow you to focus on the more recent events.
- Additional information such as item metadata can be added to your graph.
- Graphs provide the proper indexing structure for speeding up the recommendation process.
Alessandro Negro, the author of Graph-Powered Machine Learning, is already working on the next book. This book, Knowledge Graphs Applied, is a collaboration between him and our talented scientists and engineers Vlasta Kus, Giuseppe Futia and Fabio Montagna. Intrigued? You can now get the MEAP version! Visit our detailed collaborative filtering glossary page for more information.