
If you’ve ever bought something on Amazon or other large online retailers, you’ll have noticed the ‘similar products’ that the site recommends to you after you’ve made your purchase. Sometimes they’re not the best suggestion, but in my experience most of the time they hit the mark.
This is an area of machine learning called recommender systems.
How do recommender systems work? In the case of online retailers, the standard approach is to fill out huge matrices and work out the relationships between different products. You can then see which products normally go together in the same basket, and make recommendations accordingly. This is called collaborative filtering and it works mainly because most products have been purchased thousands or millions of times, allowing us to spot the patterns.

Fast Data Science - London
A table of products and how often they are bought together. This is the basis of collaborative filtering, which is one recommender systems algorithm.
Now imagine you run a dating website. Let’s simplify and say your site only caters for male-female pairings. How do you recommend a female to a male user who’s just registered?
This is when things get tricky. There are many users, new users are registering all the time, and most users have made few contact requests.
In this case we can work with what we do have:
Recommender systems on text
One approach which I like to use is a deep learning approach called vector embeddings, which goes like this:

A dating recommender system might assign all users to a location in vector space, and match users who appear close together.
Of course the tricky bit is how to go from a profile text and image, to a vector. This is something that Convolutional Neural Networks (CNNs) are very good at.
Vector embeddings can be useful for making recommendations in other industries too:

A house sale website recommender system might need to make use of text and image data
There are off the shelf recommender systems that you can use for online retail or movie recommendations. But for text or image based recommendations really you need a custom solution, and this is extremely complex to build.
I have set up Fast Data Science Ltd to provide consulting services in this area after 10 years’ experience working with machine learning on natural language data. If you have lots of text or image data and you’d like to build a custom recommender system I’d love to hear from you. Please contact me here or write a comment below.
Dive into the world of Natural Language Processing! Explore cutting-edge NLP roles that match your skills and passions.
Explore NLP JobsGuest post by Alex Nikic In the past few years, Generative AI technology has advanced rapidly, and businesses are increasingly adopting it for a variety of tasks. While GenAI excels at tasks such as document summarisation, question answering, and content generation, it lacks the ability to provide reliable forecasts for future events. GenAI models are not designed for forecasting, and along with the tendancy to hallucinate information, the output of these models should not be trusted when planning key business decisions. For more details, a previous article on our blog explores in-depth the trade-offs of GenAI vs Traditional Machine Learning approaches.

After this ruling, will tech companies move all model training to data centres that they consider “copyright safe”? Will we see a new equivalent of a “tax haven” for training AI models on copyrighted content? An “AI haven”? This article is not legal advice.

This new video explains natural language processing: what it is, how it works, and what can it do for your organisation. Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that focuses on giving computers the ability to understand human language, combining disciplines like linguistics, computer science, and engineering.
What we can do for you