Data science

Predicting customer churn

· Thomas Wood
How can an AI model predict customer churn? Who will stay with your business and who will switch to a competitor? It’s easy to make a basic customer churn model with Python.

What is customer churn?

One question faced by many companies in competitive markets is: why are our customers leaving us? What drives them to switch to a competitor? This loss of customers is called ‘customer churn’, and we can model it with machine learning.

Imagine you run a utility company. You know this about each of your customers:

  • When they signed the first contract
  • How much power they use on weekdays, weekends, etc.
  • Size of household
  • Zip code / Postcode

For millions of customers you also know whether they stayed with your company, or switched to a different provider.

Utility companies often use customer churn models, as customers frequently switch electricity and gas providers.

Why model customer churn?

Ideally you’d like to identify the people who are likely to switch their supply, before they do so! Then you can offer them promotions or loyalty rewards to convince them to stay.

How customer churn prediction works

How can you go about modelling customer churn at your organisation?

If you have a data scientist or statistician at your company, they can probably run an analysis and produce a detailed report, telling you that high-consumption customers in demographic X or Y are highly likely to switch supply.

It’s nice to have this report and it probably has some pretty graphs. But what I want to know is, for each of the 2 million customers in my database, what is the probability that the customer will churn?

If you build a machine learning model you can get this information. For example, customer 34534231 is 79% likely to switch to a competitor in the next month.

Customer churn model in Python

Surprisingly, building a basic customer churn model like this is quite simple. I like to use Scikit-learn for predicting customer churn: it is an easy-to-use machine learning library for Python. It’s possible to knock up a program in a day which will connect to your database and give you this probability for any customer. That makes Python an efficient way to get started with customer churn prediction.
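As a rough illustration of the per-customer probability described above, here is a minimal sketch using Scikit-learn’s `predict_proba`. It assumes a synthetic table in place of a real customer database; the feature names (`tenure_years`, `weekday_kwh`, `household_size`) and the rule used to generate the churn labels are invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a customer table (all columns are illustrative).
n = 500
tenure_years = rng.uniform(0, 10, n)
weekday_kwh = rng.uniform(5, 40, n)
household_size = rng.integers(1, 6, n)
X = np.column_stack([tenure_years, weekday_kwh, household_size])

# Made-up labelling rule: short-tenure, high-usage customers churn more often.
churn_score = 0.4 * weekday_kwh / 40 + 0.5 * (1 - tenure_years / 10)
y = (churn_score + rng.normal(0, 0.1, n) > 0.5).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# predict_proba returns one row per customer: [P(stay), P(churn)].
new_customers = np.array([
    [0.5, 38.0, 4],  # recent sign-up, high consumption
    [9.0, 8.0, 2],   # long-standing, low consumption
])
churn_probability = model.predict_proba(new_customers)[:, 1]
for p in churn_probability:
    print(f"Churn probability: {p:.0%}")
```

In a real project the synthetic arrays would be replaced by features queried from your own database, and the model would be checked on held-out customers before anyone acts on its probabilities.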

One problem you’ll encounter is that customer data is very heterogeneous. For example, the postcode or zip code is a categorical variable, while power consumption is a continuous number. For this kind of problem, I have found the most suitable algorithms to be Support Vector Machines, Random Forests, and Gradient Boosted models, all of which are in Scikit-learn. I also have a trick of augmenting location data with demographic data for that location (such as average credit score or income level per postcode), which improves the accuracy of the prediction.
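One way to handle this mix of categorical and continuous columns, together with the postcode augmentation, might look like the sketch below. It is not necessarily the exact approach used in practice: the tables, column names and income figures are hypothetical, and one-hot encoding is just one possible treatment of postcodes.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy customer table; column names and values are invented for illustration.
customers = pd.DataFrame({
    "postcode": ["SW1", "SW1", "N1", "N1", "E2", "E2", "SW1", "N1"],
    "weekday_kwh": [12.0, 35.5, 8.2, 30.1, 15.3, 33.0, 9.9, 28.7],
    "household_size": [2, 5, 1, 4, 2, 5, 1, 3],
    "churned": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Hypothetical per-postcode demographics used to augment the location data.
demographics = pd.DataFrame({
    "postcode": ["SW1", "N1", "E2"],
    "avg_income": [52000, 41000, 38000],
})

# A left join keeps every customer and attaches the demographics of their area.
data = customers.merge(demographics, on="postcode", how="left")

categorical = ["postcode"]
numeric = ["weekday_kwh", "household_size", "avg_income"]

# One-hot encode the categorical column, pass numeric columns through unchanged.
preprocess = ColumnTransformer(
    [
        ("postcode", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("numeric", "passthrough", numeric),
    ],
    sparse_threshold=0,  # always produce a dense array
)

pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", GradientBoostingClassifier(random_state=0)),
])
pipeline.fit(data[categorical + numeric], data["churned"])

churn_probability = pipeline.predict_proba(data[categorical + numeric])[:, 1]
print(churn_probability.round(2))
```

Tree-based models such as this one do not need the numeric columns to be scaled. A Support Vector Machine generally would, so a scaler such as `StandardScaler` could replace the `"passthrough"` step in that case.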

If you are interested in the details of how to build a customer churn model in Python, you can follow our article on customer spend prediction, which is an analogous problem. The process for customer churn prediction is the same as for customer spend, except that you are building a classification model (churn is TRUE or FALSE), rather than a regression model (customer spend is a scalar value). We also have a video about customer spend prediction and a Python tutorial on customer spend prediction on GitHub.
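To make the difference between the two problem types concrete, here is a small sketch that trains a regressor for spend and a classifier for churn on the same features. The data is random and the target rules are invented; it is not taken from the customer spend tutorial mentioned above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 3))  # same invented features for both tasks

# Customer spend is a scalar value, so it calls for a regression model.
spend = 100 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 5, 200)

# Churn is TRUE or FALSE, so it calls for a classification model.
churned = X[:, 2] + rng.normal(0, 0.1, 200) > 0.5

spend_model = GradientBoostingRegressor(random_state=0).fit(X, spend)
churn_model = GradientBoostingClassifier(random_state=0).fit(X, churned)

sample = X[:5]
predicted_spend = spend_model.predict(sample)                # continuous values
predicted_churn = churn_model.predict(sample)                # True / False labels
churn_probability = churn_model.predict_proba(sample)[:, 1]  # P(churn)

print(predicted_spend.round(1))
print(predicted_churn)
print(churn_probability.round(2))
```

The data preparation and training steps are shared; only the estimator and the type of the target column change.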

The same goes for employee churn analysis, which aims to understand the factors that influence employee attrition and retention, and to identify the employees who are likely to leave the company in the near future. Various techniques can be used for employee churn analysis, such as descriptive statistics, exploratory data analysis, data visualization, hypothesis testing, and machine learning. One of the most popular approaches is to use machine learning algorithms to build predictive models that classify employees as churners or non-churners based on their characteristics.

If customer churn is an issue for your business and you’d like to anticipate it before it happens, I’d love to hear from you! Get in touch via the contact form to find out more.
