Descriptive and predictive analytics

What is predictive analytics?

Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.
Geoffrey Moore, author and consultant

Predictive analytics is the art of analysing past data and trends in order to make predictions for the future. It allows you to anticipate future customer behaviour and adjust your business strategy accordingly.

At Fast Data Science we have used predictive analytics for a number of industries, focussing primarily on healthcare and pharmaceuticals.

For example, you may have a retail business and need to predict how much all customers will spend next month, or how much a given customer will spend over a given year. Both of these problems involve using statistics to look into the future. Of course, we can never get the prediction exactly right, but machine learning allows us to choose the prediction with the lowest expected deviation from the true value, or the highest likelihood of being correct.
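
As a minimal sketch of what "lowest deviation from the true value" can mean in practice, the Python example below fits a straight line to past monthly spend by ordinary least squares, which chooses the line with the smallest sum of squared errors, and then extrapolates one month ahead. The spend figures and the function name fit_line are illustrative assumptions, not real client data or a description of a production model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical total customer spend (in thousands) for months 1 to 6.
months = [1, 2, 3, 4, 5, 6]
spend = [102.0, 108.0, 121.0, 129.0, 141.0, 149.0]

intercept, slope = fit_line(months, spend)
next_month_forecast = intercept + slope * 7  # extrapolate to month 7
print(round(next_month_forecast, 1))
```

A straight line is the simplest possible choice here; seasonal or per-customer forecasts would normally call for richer models.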

At Fast Data Science, while investigating staff attrition in the National Health Service (NHS), we built a predictive model which allowed us to predict the probability of any given employee leaving the organisation over the next year.
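
A model of this kind outputs a probability rather than a yes/no answer. The sketch below shows the idea with a one-feature logistic regression fitted by gradient descent in plain Python. The single feature (years of service), the records, and the names fit_logistic and probability_of_leaving are hypothetical; this illustrates the general technique and is not the model built for the NHS.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.05, epochs=10000):
    """Fit p = sigmoid(b0 + b1 * x) by batch gradient descent on the log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of log loss w.r.t. the logit
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Hypothetical records: years of service, and whether the person left (1) or stayed (0).
years_of_service = [1, 2, 2, 3, 4, 6, 7, 8, 9, 10]
left_within_year = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

b0, b1 = fit_logistic(years_of_service, left_within_year)


def probability_of_leaving(years):
    return sigmoid(b0 + b1 * years)


print(round(probability_of_leaving(1), 2), round(probability_of_leaving(10), 2))
```

On this toy data the fitted probability falls as length of service rises, simply because the invented leavers are mostly short-serving staff.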

Descriptive analytics: an example in healthcare

Descriptive analytics involves using data science to analyse events and infer processes and causality. For example, at Fast Data Science we applied AI explainability techniques to our predictive model to identify the features which increase the likelihood of employee attrition in the National Health Service. We found that certain factors drove staff attrition (such as length of service, age, and employment and assessment history), and we produced a report for NHS management describing our observations.
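
One widely used explainability technique is permutation importance: scramble one feature at a time and measure how much the model's accuracy falls. A minimal sketch follows. The toy model, feature names and records are hypothetical, and to keep the result reproducible the sketch rotates each column through every possible offset instead of shuffling at random. It illustrates the general idea only and is not a description of the method used in the NHS project.

```python
def toy_model(row):
    """Hypothetical stand-in for a trained attrition model.

    Predicts 'leaves' (1) when length of service is under 3 years; ignores age.
    """
    return 1 if row["years_of_service"] < 3 else 0


def accuracy(rows, labels):
    return sum(toy_model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, feature):
    """Average drop in accuracy when one feature's column is rotated by 1..n-1 places."""
    baseline = accuracy(rows, labels)
    column = [r[feature] for r in rows]
    drops = []
    for shift in range(1, len(rows)):
        rotated = column[shift:] + column[:shift]
        permuted = [{**r, feature: v} for r, v in zip(rows, rotated)]
        drops.append(baseline - accuracy(permuted, labels))
    return sum(drops) / len(drops)


# Hypothetical staff records and outcomes (1 = left within the year).
rows = [
    {"years_of_service": 1, "age": 25},
    {"years_of_service": 2, "age": 41},
    {"years_of_service": 5, "age": 33},
    {"years_of_service": 8, "age": 52},
    {"years_of_service": 1, "age": 47},
    {"years_of_service": 10, "age": 29},
]
labels = [1, 1, 0, 0, 1, 0]

importances = {
    feature: permutation_importance(rows, labels, feature)
    for feature in ("years_of_service", "age")
}
print(importances)
```

Because the toy model never looks at age, scrambling that column leaves accuracy unchanged, whereas scrambling length of service degrades it; this is the signal that flags a feature as a driver of the prediction.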

Prescriptive analytics: NHS example continued

Prescriptive analytics is the science of using the insights from a data science analysis to drive business decisions. For example, in addition to describing the causes of staff attrition in the NHS, our report for management also made some recommendations for actions that can be taken at organisation level, which should improve staff retention. Prescriptive analytics in particular requires a good understanding of business processes and borders on management consulting.

Predictive analytics in healthcare

Much of traditional healthcare and medicine has relied on predictive analytics for some time. Physicians use their expertise to anticipate the progression of a disease, or the response to a treatment. The recent surge of Medtech and health tech has meant that software tools have increased the predictive power at physicians’ disposal.

In healthcare, predictive analytics uses historical data to make predictions about the future. This allows us to personalise care for each individual. The medical record, history, demographics and other data on a patient can be used to predict in a statistically robust way the likely health status of that person in the future.

Natural language processing and deep learning are especially important for predictive analytics in healthcare, because a large amount of health data is unstructured and not organised into the neat tabular format required by traditional machine learning algorithms.

Predictive analytics technologies

Techniques and technologies that we have worked with include:

  • Regression and classification models (linear/polynomial/logistic regression, autoregressive models, etc.)
  • Time series models (ARMA, ARIMA, GARCH, exponential moving average, etc.)
  • Spark MLlib
  • Weka
  • R
  • Python
  • Databricks
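
As a small illustration of the time series family listed above, here is a sketch of an exponential moving average in plain Python. Each smoothed value is a weighted blend of the newest observation and the previous smoothed value, and in simple exponential smoothing the last smoothed value serves as the one-step-ahead forecast. The visit counts, the smoothing factor alpha = 0.5 and the function name are assumptions chosen for illustration.

```python
def exponential_moving_average(values, alpha):
    """Return s_t = alpha * x_t + (1 - alpha) * s_{t-1}, starting from s_0 = x_0."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical weekly website visits.
visits = [100.0, 120.0, 110.0, 130.0]

smoothed = exponential_moving_average(visits, alpha=0.5)
forecast_next_week = smoothed[-1]  # last smoothed level used as the next forecast
print(smoothed, forecast_next_week)
```

A larger alpha reacts faster to recent changes, while a smaller alpha gives a steadier but slower-moving forecast.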

If you have a prediction task in your business we would be very interested to hear from you.

Predictive analytics in pharmaceuticals

There are a large number of impactful use cases of predictive analytics in the pharma industry.

Investigators running clinical trials often benefit from a predictive analytics tool which can learn from past trials to predict events in a future trial. A particular challenge is determining the number of participants in a trial. If this number is too low, then the trial may not be able to detect a significant effect and the money spent on it will be wasted. However, trial costs scale with the number of participants, and ethical guidelines such as the Declaration of Helsinki recommend using the smallest viable number of participants. Because participants who drop out reduce the number who complete the trial, researchers must recruit enough people to allow for expected dropout without over-recruiting. For this reason, a model which can predict the dropout rate of participants gives researchers valuable information at the trial design phase.
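
To see why a dropout prediction matters at design time, consider a common rule-of-thumb adjustment: divide the number of completers the statistical analysis needs by the expected retention rate (one minus the dropout rate). The sketch below applies that arithmetic. The function name and all figures are hypothetical, and real sample size planning rests on a full power calculation rather than this adjustment alone.

```python
import math


def participants_to_enrol(required_completers, predicted_dropout_rate):
    """Inflate the required number of completers to allow for predicted dropout."""
    if not 0 <= predicted_dropout_rate < 1:
        raise ValueError("predicted_dropout_rate must be in the range [0, 1)")
    return math.ceil(required_completers / (1 - predicted_dropout_rate))


# Hypothetical trial: 150 completers needed, 25% of participants predicted to drop out.
enrol_at_25_percent = participants_to_enrol(150, 0.25)
# The same trial if no dropout were expected.
enrol_at_0_percent = participants_to_enrol(150, 0.0)
print(enrol_at_25_percent, enrol_at_0_percent)
```

The gap between the two figures is the extra recruitment that the dropout estimate justifies, so an inaccurate estimate translates directly into an underpowered or oversized trial.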

Predictive analytics can also help pharmaceutical companies anticipate where demand for drugs will arise in different geographical locations.

Natural language processing (NLP) is a powerful tool in the pharmaceutical industry, simply because of the huge amount of unstructured text data, from patient records to trial protocols and scientific literature. NLP predictive analytics models can be used to better anticipate patient outcomes during a trial. Fast Data Science has developed a predictive analytics NLP model for the German pharmaceutical company Boehringer Ingelheim which predicts clinical trial complexity from clinical trial protocols.

Predictive analytics projects by Fast Data Science

We have built predictive models to forecast website traffic, purchases, online signups, customer spend at both individual-customer and country-wide level, NHS doctor attrition and trainee grade results, and more. Our projects have primarily been in healthcare and pharmaceuticals, but we work across industries.