Machine Learning Consultant

What is machine learning? Ask a machine learning consultant

Machine learning is the study of computer algorithms that improve automatically through experience. The term “machine learning” covers a range of AI systems which can “learn” from past data, process new data, and make predictions, classifications, and decisions. Machine learning is a sub-field of AI.

Machine learning algorithms build a model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision. It is hard to write a robust set of rules to process incoming emails (e.g. if the email contains the word “accounts”, forward to Accounts), but machine learning would let you collect a set of emails in all the categories that you’re interested in, and train a model to learn the patterns.

Machine learning and AI Venn diagram

A simple example of when you might want to use machine learning is the following: you have been tasked with predicting the loading time of trucks for a grocery business. The last three trucks had 10, 15 and 12 crates and took 8 minutes, 16 minutes, and 13 minutes to load respectively.

Number of crates on truck    Loading time in minutes
10                           8
15                           16
12                           13

The machine learning solution to this problem would be to fit a line of best fit to the data, which is a simple linear regression model.

Scatter graph illustrating the process of fitting a linear regression model on vehicle loading times. It has axes: number of crates and time/minutes. To calculate the time for a hypothetical new truck with 14 crates, we can read off the line of best fit

Now when the next truck comes along and it has 14 crates, you can predict a likely loading time of 14.92 minutes.
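The prediction above can be reproduced with a few lines of Python. This is a minimal sketch of ordinary least squares on the three trucks in the example; the function names `fit_line` and `predict_minutes` are our own illustrative choices, and a real project would more likely use a library such as Scikit-Learn.

```python
# Ordinary least squares fit of loading time against crate count,
# using the three trucks from the example above.
crates = [10, 15, 12]
minutes = [8, 16, 13]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(crates, minutes)


def predict_minutes(n_crates):
    """Read the predicted loading time off the fitted line."""
    return slope * n_crates + intercept


print(round(predict_minutes(14), 2))  # 14.92
```

With only three data points the line is very sensitive to each observation, which is one reason real projects collect far more data before trusting a prediction.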

A machine learning consultant working in a real business will be dealing with much more complex relationships between numbers (for example, you may have different vehicle types, times of day, and crates of frozen goods vs fresh goods vs FMCG), and you would usually split your data into a training set and an unseen test set for validating your assumptions. Fast Data Science has built and deployed this kind of model in consulting engagements for retailers.

Nearly all recent breakthroughs in machine learning involve neural networks: models made up of layers of connected nodes, like in the illustration below. Networks with many hidden layers are called deep neural networks, and this field is called deep learning.

As machine learning consultants, we often train neural networks. This is a 3-layer neural network with 4 input nodes, 6 hidden nodes, and 2 output nodes
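To make the illustration concrete, here is a rough sketch of a forward pass through a network of the same shape (4 input nodes, 6 hidden nodes, 2 output nodes). The tanh and softmax activations, the random untrained weights, and the example input values are all assumptions made for illustration; the illustration itself does not specify them.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

N_INPUT, N_HIDDEN, N_OUTPUT = 4, 6, 2  # layer sizes from the illustration


def random_layer(n_in, n_out):
    """Create a weight matrix (n_out rows of n_in weights) and a zero bias vector."""
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Weighted sum of the inputs plus a bias, for each node in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


hidden_w, hidden_b = random_layer(N_INPUT, N_HIDDEN)
output_w, output_b = random_layer(N_HIDDEN, N_OUTPUT)


def forward(inputs):
    """Pass 4 input values through the hidden layer to 2 output probabilities."""
    hidden = [math.tanh(h) for h in dense(inputs, hidden_w, hidden_b)]
    return softmax(dense(hidden, output_w, output_b))


probabilities = forward([0.5, -1.2, 3.0, 0.1])
print(probabilities)
```

Training the network means adjusting the weights so that these output probabilities match known examples; the sketch only shows how an input flows through the layers.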

Generative AI is a specific application of deep learning where machine learning models are used not just to predict things but also to create things. This could include creative outputs like text and images, which could previously only be produced by humans.

Machine Learning Consultants at Fast Data Science

Fast Data Science is a London, UK-based machine learning consultancy firm offering bespoke machine learning consultant services to organisations across industries. Our main area of focus is natural language processing (NLP).

The manager, Thomas Wood, studied for a Masters in Computer Speech, Text and Internet Technology at Cambridge University in 2008, and since then he has worked exclusively as a machine learning consultant, mostly in NLP.

In 2018, Thomas Wood founded Fast Data Science to deliver machine learning consultant services, focusing on NLP. We have built NLP pipelines from scratch, and worked on natural language dialogue systems, document classifiers and text-based recommender systems. For these tasks we have used both traditional machine learning techniques and state-of-the-art approaches such as neural networks and generative AI. We normally use Python but we are open to using any of your preferred technologies or cloud providers.

What are some tasks that you can do with machine learning?

Machine learning can be used for tasks like:

  • Predicting revenue, customer churn, customer lifetime value, or other numeric values by training a model on a large customer database of past transactions, such as a CRM system.
  • Natural language dialogue systems: creating systems which understand human text or speech
  • Text analysis: pulling out structured data from documents
  • Topic analysis and clustering: for example, gaining qualitative insights into large volumes of survey responses or factory error reports that you have collected.
  • Recommender systems: recommend a product to a user based on what similar users have bought in the past.
  • Document classification: documents can be triaged to categories based on the data that has already been seen.
  • Document anonymisation

Fast Data Science’s machine learning consultant work is focused on natural language processing. Check our case studies for more information.

Need a machine learning consultant?

Contact Fast Data Science to talk to an expert machine learning consultant.

How can machine learning handle unstructured data?

Today many companies, particularly in industries such as healthcare, pharmaceuticals, legal, and insurance, have large amounts of unstructured data. This is typically text data, which may be held as paper documents that have not yet been scanned, PDFs, HTML, or any other file type.

Unstructured data can be difficult to deal with but can contain a goldmine of information.

One way to process unstructured text documents with machine learning is to first convert them from PDF or Word into a manageable plain text format (e.g. using Apache Tika), then pass the text through an NLP pipeline which splits it into tokens (words) and eventually converts it to numbers, possibly using a word vector embedding model. Once a document has been turned into a numeric format, it can be used to train a document classifier, or to pull information out. Generative AI can also help to answer advanced, nuanced questions about a document.
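The tokenising and numeric conversion steps can be sketched as follows. This minimal example assumes the text has already been extracted to plain text, and it swaps in simple bag-of-words counts in place of a word vector embedding model; the function names and the two sample sentences are hypothetical.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenise(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorise(text, vocabulary):
    """Count how often each vocabulary token occurs in the text."""
    counts = Counter(tokenise(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical plain-text documents, standing in for extracted PDF content.
documents = [
    "Please forward the invoice to Accounts.",
    "The invoice for the new contract is attached.",
]
vocabulary = build_vocabulary(documents)
vectors = [vectorise(doc, vocabulary) for doc in documents]

for doc, vector in zip(documents, vectors):
    print(vector, doc)
```

Each document becomes a fixed-length list of numbers, which is the form a document classifier needs as input.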

When documents are large, we recommend using traditional machine learning models such as Naive Bayes to identify the relevant part of a PDF before passing anything to generative AI.
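As a rough illustration of this filtering step, the sketch below trains a small multinomial Naive Bayes classifier to label pages as "financial" or "other" and keeps only the financial ones. The class, the labels, and the tiny training set are all hypothetical; in practice you would train on many labelled pages and probably use a library implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenise(text):
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenise(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_score(self, text, label):
        """Unnormalised log probability of the label given the text."""
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for word in tokenise(text):
            if word not in self.vocabulary:
                continue  # ignore words never seen in training
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(self.vocabulary)))
        return score

    def predict(self, text):
        return max(self.class_counts, key=lambda label: self.log_score(text, label))


# Hypothetical labelled snippets, standing in for pages of past reports.
train_texts = [
    "revenue and profit before tax for the year",
    "earnings per share and profit after tax",
    "the board of directors met four times",
    "our people and our values drive the culture",
]
train_labels = ["financial", "financial", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)

pages = [
    "a message from the chair about our values",
    "profit before tax rose while revenue was flat",
]
relevant = [page for page in pages if model.predict(page) == "financial"]
print(relevant)
```

Only the pages in `relevant` would then be sent on to the slower and more expensive generative AI step.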

Recent advances in generative AI allow us to request a structured output from a model, for example through the OpenAI API. In essence, if your PDF contains details like company earnings for different years before and after tax, you can ask the generative AI to return a table of that information in a fixed, structured format. Without generative AI, it can be very hard to pull out the correct number in context from a document, especially if that document contains multiple similar numbers in tables and you don’t have control over the production of the document (e.g. although company financial reports follow a broadly similar pattern, each company uses its own distinctive template).
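Whatever model or API you use, it is worth checking that the returned structure really matches the format you asked for. The sketch below makes no API call: the field names, the `rows` wrapper, and the example response text are all made up for illustration, and it only shows how a JSON response could be validated against a fixed format.

```python
import json

# Hypothetical target format: one row per financial year.
EXPECTED_FIELDS = {
    "year": int,
    "earnings_before_tax": float,
    "earnings_after_tax": float,
}


def parse_earnings_table(raw_json):
    """Parse a model response and check every row has the expected fields and types."""
    rows = json.loads(raw_json)["rows"]
    for row in rows:
        for field, expected_type in EXPECTED_FIELDS.items():
            if field not in row:
                raise ValueError(f"missing field: {field}")
            value = row[field]
            # Accept whole numbers where a float is expected (JSON has one number type).
            if expected_type is float and isinstance(value, int) and not isinstance(value, bool):
                value = float(value)
            if not isinstance(value, expected_type) or isinstance(value, bool):
                raise ValueError(f"wrong type for {field}: {value!r}")
            row[field] = value
    return rows


# Made-up response text standing in for what a generative model might return.
example_response = """
{"rows": [
  {"year": 2022, "earnings_before_tax": 12.5, "earnings_after_tax": 9.8},
  {"year": 2023, "earnings_before_tax": 14, "earnings_after_tax": 11.1}
]}
"""

table = parse_earnings_table(example_response)
for row in table:
    print(row["year"], row["earnings_before_tax"], row["earnings_after_tax"])
```

Rejecting malformed responses at this point keeps bad values out of any downstream database or report.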

Before attempting to solve a problem with generative AI, we recommend trying simpler traditional machine learning models first, such as linear regression or Naive Bayes, and quantifying their performance. You can then measure whether the generative AI improved your model’s accuracy and by how much.
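A comparison like this can be as simple as scoring both systems on the same held-out test set. In the sketch below the labels and both sets of predictions are invented purely to show the calculation.

```python
def accuracy(predictions, gold_labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold labels must be the same length")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Hypothetical outputs of two systems on the same held-out test set.
gold_labels = ["invoice", "contract", "invoice", "letter", "contract"]
baseline_preds = ["invoice", "contract", "letter", "letter", "invoice"]
generative_preds = ["invoice", "contract", "invoice", "letter", "invoice"]

baseline_accuracy = accuracy(baseline_preds, gold_labels)
generative_accuracy = accuracy(generative_preds, gold_labels)
improvement = generative_accuracy - baseline_accuracy

print(f"baseline:    {baseline_accuracy:.0%}")
print(f"generative:  {generative_accuracy:.0%}")
print(f"improvement: {improvement:+.0%}")
```

On a test set this small the difference would not be meaningful; the point is that both systems are measured against the same unseen data.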

For image files, computer vision models like convolutional neural networks (CNNs) are excellent for classifying images. If you have a small number of image classes and you want to classify incoming images as e.g. car vs motorbike, it’s quite straightforward to train an image classifier to handle the task. For more complex tasks, generative AI APIs may be useful, if speed and privacy constraints allow.

Fast Data Science’s machine learning consulting offering focuses on extracting value from organisations’ unstructured datasets using ML and NLP.

A machine learning consultant’s portfolio

Below you can see some of the case studies that the machine learning consultants at Fast Data Science have completed:

Applications of machine learning in healthcare

Natural Language Processing applications in healthcare

Machine learning is being increasingly adopted across the healthcare sector. This technology is sometimes called healthtech or MedTech.

Machine learning consultants can use AI to compare and detect changes in clinical reports, extract clinical concepts such as MeSH terms from electronic medical records, and develop human-to-machine natural language dialogue systems to improve the healthcare experience.

We have consulted on a number of machine learning projects in healthcare, including:

Machine learning technologies used

The machine learning consultants at Fast Data Science use mainly NLP models, including:

  • Bag of words, tf*idf, cosine similarity
  • NLP pipelines, lemmatisation, parsers, chunkers
  • Deep neural networks
  • Clustering: Latent Dirichlet Allocation
    • This is useful for extracting topics from a set of unstructured documents, for example legal documents, survey responses, factory error reports, etc.
  • Search engines and search term recommenders
  • Google Natural Language, AWS, Microsoft Azure
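The first item in the list above, tf*idf with cosine similarity, can be sketched in plain Python. This is a minimal version that assumes an unsmoothed weighting of term frequency times log(N / document frequency); libraries such as Scikit-Learn use slightly different formulas, and the three example documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenise(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    token_lists = [tokenise(doc) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        vectors.append({term: count * math.log(n_docs / doc_freq[term])
                        for term, count in term_freq.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: two factory error reports and one survey response.
documents = [
    "the pump seal failed and leaked oil",
    "oil leaked from the pump seal",
    "the survey respondents asked for longer opening hours",
]
vectors = tfidf_vectors(documents)

print(cosine_similarity(vectors[0], vectors[1]))  # the two error reports
print(cosine_similarity(vectors[0], vectors[2]))  # error report vs survey response
```

Because "the" appears in every document its weight is zero, so the two error reports score as similar while the survey response shares nothing informative with them.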

Topic detection is an NLP and machine learning technique that allows you to discover common themes in a set of unstructured documents.

Machine Learning consultants and Python and R

In our consulting work we are open to using any programming languages or frameworks, and we will happily adapt to your tech stack. We have worked with:

  • Python NLTK
  • OpenAI API
  • Scikit-Learn
  • TensorFlow
  • Keras
  • R

We have even developed several publicly available open source NLP libraries.

Machine learning consultant’s past projects

We have consulted for major household names, and our machine learning projects include:

  • a spoken dialogue system to control a smart home
  • an unsupervised text analysis program to analyse text descriptions of manufacturing defects (Boehringer Ingelheim)
  • a model to classify jobseekers’ CVs into industries and salary bands (CV-Library)
  • analysis of survey responses (White Ribbon Alliance)

Fast Data Science is a leading machine learning consultancy that helps businesses transform their operations through advanced AI and data-driven solutions.

Our machine learning consulting services deliver end-to-end support, from strategy to deployment, leveraging machine learning to unlock insights from complex datasets. We begin with workshops to pinpoint high-ROI projects, followed by data exploration and custom model development. Our machine learning consultants, skilled in NLP and predictive analytics, have built tools like the Clinical Trial Risk Tool for pharmaceutical clients, optimising risk assessment.

As your machine learning consultants, we ensure seamless integration using cloud platforms of your choice, like Azure or AWS, addressing challenges like scalability and data privacy.

Fast Data Science was founded in 2016 by Thomas Wood, who holds a Cambridge Masters in NLP. Our team of machine learning consultants provides jargon-free, actionable results that support decision-making across industries such as healthcare and legal. Our machine learning consultant services include training and visualisations for easy adoption, driving measurable impact.

What we can do for you

Transform Unstructured Data into Actionable Insights

Contact us