How well can you predict an individual customer's spending habits?

· Thomas Wood

Why do we need to predict customer spend?

You may have read my previous post about customer churn prediction. A similar problem, just as important as predicting lost customers, is predicting customers’ daily expenditure.

Let me give you an example: you work for a large retailer which has a loyalty card scheme. You’d like to predict for a given customer how much they are likely to spend over the next week.

In this case you would normally expect some clear patterns:

  • customers buy more on Mondays than on Saturdays (weekly cycle)
  • there might be a monthly cycle and a yearly cycle
  • Christmas, Easter and bank holidays might drive an explosion in demand

However there are a few problems when you get down to customer level:

  • some customers may have visited your shop only once
  • some have visited hundreds of times
  • a customer might not enter the shop for a few months but then come back (dormant customer)

Looking at all customers’ spend together

What this means is that, if you look at all customers’ expenditures (or averaged over a region), you will probably see some recognisable weekly, monthly and seasonal patterns:

Graph of average customer spend across all customers of a business

Individual customer’s spend

However for a single customer it’s hard to make out any recognisable pattern among all the noise. The weekly and yearly trends were only apparent when we averaged over all customers.

Graph of an individual customer's spend by day. This is much noisier than the graph averaged across all customers.

So how can you go about predicting the future expenditure of a given customer the next time they enter the shop?

This problem is quite interesting as there are at least two very different approaches to solving it, from two different traditional disciplines:

  • Predictive modelling (from the field of machine learning) - focusing on an individual customer
  • Time series analysis (from the field of statistics) - focusing on groups of customers

This means that depending on whether you hire somebody with a machine learning background, or somebody with a statistics background, you may get two contradictory answers.

In this post I’ll talk only about the predictive modelling approach.

If you are interested in predicting the first graph, which is averages for groups of customers, you might want to look into my next post on time series analysis.

Predictive model: individual customer spend

The simplest way would be to use a predictive modelling machine learning approach. For example you could use Linear Regression. If you are unfamiliar with how to do this I recommend Andrew Ng’s Coursera course.

You would provide as input to your Regression model:

  • Last purchase value (if available)
  • Second last purchase value (if available)
  • Third last purchase value (if available)

The output you want it to predict is:

  • The next purchase value

This will predict the next purchase with some accuracy. After all, the biggest factor in predicting what someone will buy is what they bought in the past.
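
As a rough illustration of the regression set-up described above, here is a minimal sketch that assumes NumPy is available. The function names (`lag_features`, `fit_linear`, `predict_next`) and the purchase values are made up for this example. It fits ordinary least squares with an intercept, and it simply skips purchases that do not yet have three predecessors.

```python
import numpy as np

def lag_features(purchases, n_lags=3):
    """Turn one customer's purchase values (oldest first) into training rows.

    Each row holds the n_lags purchases before a target purchase, most recent
    first. Purchases without n_lags predecessors are skipped in this sketch.
    """
    X, y = [], []
    for i in range(n_lags, len(purchases)):
        X.append(purchases[i - n_lags:i][::-1])
        y.append(purchases[i])
    return np.array(X, dtype=float), np.array(y, dtype=float)

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (weights, intercept)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def predict_next(weights, intercept, last_purchases):
    """Predict the next purchase from the most recent purchases (newest first)."""
    return float(np.dot(weights, last_purchases) + intercept)

# Hypothetical purchase values for one customer, oldest first.
history = [12.5, 30.0, 18.2, 25.1, 22.0, 40.3, 19.9, 27.4, 31.0, 24.6]
X, y = lag_features(history)
weights, intercept = fit_linear(X, y)
print(predict_next(weights, intercept, history[-3:][::-1]))
```

In a real project you would probably pool the rows from many customers into one training set rather than fit a separate model on each customer's short history.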

However, I’m sure you can easily think of some cases where this will break down. For example:

  • A customer with no past purchases
  • Over Christmas if purchases tend to be bigger

You can improve the performance of the Predictive Model approach by making it a little more sophisticated:

  • Add more input features to the Regression model such as “day of week”, “day of year”, “isChristmasSeason” etc.
  • Switch to a Polynomial Regression Model, or Random Forest Regression. This will allow your model to become more powerful if the relationships between your inputs and outputs are not entirely linear, although it comes with a risk of your predictions going crazy (like predicting huge numbers) if you are not careful!
  • Make different models 
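
The calendar features mentioned in the list above could be derived from the purchase date along these lines. This is only a sketch using the Python standard library. The function names are my own, and treating the whole of December as the Christmas season is an assumption you would want to adjust for your business.

```python
from datetime import date

def calendar_features(d):
    """Derive simple calendar inputs from a purchase date."""
    return {
        "day_of_week": d.weekday(),                 # 0 = Monday ... 6 = Sunday
        "day_of_year": d.timetuple().tm_yday,       # 1 ... 365 (366 in leap years)
        "is_christmas_season": int(d.month == 12),  # assumption: all of December
    }

def one_hot_day_of_week(day_of_week):
    """Encode the weekday as seven 0/1 columns instead of one number."""
    return [int(day_of_week == k) for k in range(7)]

features = calendar_features(date(2024, 12, 25))
print(features)
print(one_hot_day_of_week(features["day_of_week"]))
```

The one-hot encoding matters mostly for linear models, which would otherwise treat the weekday number as a quantity (as if Sunday were "six times" Tuesday). Tree-based models such as Random Forest can usually cope with the raw number.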

Customer spend in practice

If you have a prediction problem in retail, or would like some help with another problem in data science, please contact me.

In practice it can be tricky to get the data on each customer that I described above. You need to extract certain statistics on customers’ past purchases from your database to train your model, and also query your database in real time to run a prediction. If you would like more ideas on how to do this please check out my video on the topic or this Python tutorial on Github, or read more about AI in business on our blog.
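
To give a flavour of the extraction step, here is a small sketch that pulls each customer's most recent purchase values out of a flat list of transactions. The record layout (customer ID, date, amount) and the function name are assumptions for illustration only; your own database will look different, and in production this would more likely be a database query than in-memory Python.

```python
from collections import defaultdict
from datetime import date

def last_purchases(transactions, n=3):
    """Return each customer's n most recent purchase amounts, newest first.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    Customers with fewer than n purchases are padded with None, matching
    the "if available" inputs described earlier.
    """
    by_customer = defaultdict(list)
    # Sort by date so that each customer's amounts end up oldest first.
    for customer_id, _when, amount in sorted(transactions, key=lambda t: t[1]):
        by_customer[customer_id].append(amount)

    features = {}
    for customer_id, amounts in by_customer.items():
        recent = amounts[-n:][::-1]
        features[customer_id] = recent + [None] * (n - len(recent))
    return features

# Hypothetical loyalty-card transactions.
transactions = [
    ("a", date(2024, 1, 5), 10.0),
    ("b", date(2024, 1, 6), 7.5),
    ("a", date(2024, 1, 2), 4.0),
    ("a", date(2024, 2, 1), 12.0),
    ("a", date(2024, 3, 1), 9.0),
]
print(last_purchases(transactions))
```

How you fill in the missing values (a zero, an average, or a separate "no history" flag) is a modelling decision, which is why this sketch leaves them as None.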
