Your Guide to Natural Language Processing (NLP)

A word cloud generated from this article.

With the rise of artificial intelligence, automation is becoming a part of everyday life. Natural Language Processing (NLP) has proven to be a key part of this breakthrough. Natural Language Processing bridges the gap between computers, AI, and computational linguistics.

A simple sentence spoken by humans consists of different tones, words, meanings, and values. Expert AI systems are able to leverage these hidden structures and meanings to understand human behaviour. However, we often need a very close and detailed assessment to conclude what meaning might and might not be correct. When we have a large amount of text data, it can become impossible to read quickly.

Raw text data in English or other languages is an example of unstructured data. This kind of data does not fit into a relational database and is hard to interpret with computer programs. Natural Language Processing is a sub-field of AI which deals with how computers interpret, comprehend, and manipulate human language.

Different approaches to Natural Language Processing

NLP is much more than speech and text analysis. Depending on what needs to be done, there can be different approaches. There are three primary approaches:

Statistical Approach: The statistical approach to natural language processing depends on finding patterns in large volumes of text. By recognising these trends, the system can develop its own understanding of human language. Some of the more cutting edge examples of statistical NLP include deep learning and neural networks.

Symbolic Approach: The symbolic approach towards NLP is more about human-developed rules. A programmer writes a set of grammar rules to define how the system should behave.

Connectionist Approach: The third approach is a combination of statistical and symbolic approaches. We start with a symbolic approach and strengthen it with statistical rules.

Now that you know the various approaches used in NLP, let’s see how language is interpreted in a way that a machine can understand.

How is Language Interpreted?

Language interpretation can be divided into multiple levels. Each level allows the machine to extract information at a higher level of complexity.

Morphological Level: Within a word, a structure called the morpheme is the smallest unit of meaning. The word ‘unlockable’ is made of three morphemes: un+lock+able. Similarly, ‘happily’ is made of two: happy+ly. Morphological analysis involves identifying the morphemes within a word in order to get at the meaning.

A Chinese word, an English word, and a Turkish word. Some languages, such as Mandarin, have one or two morphemes per word, and others, such as Turkish, can have many morphemes per word. English is somewhere in the middle. The example shown of ‘unlockable’ can be analysed as either un+lockable or unlock+able, which illustrates the inherent ambiguity of many of the analyses we run in NLP.
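
As a rough illustration of morphological analysis, the sketch below splits a word into morphemes by looking them up in a small inventory. The inventory `MORPHEMES` and the function `segment` are made up for this example. A real analyser would also need to handle spelling changes (such as happy becoming happi in ‘happily’) and the bracketing ambiguity described above.

```python
# Hypothetical morpheme inventory, for illustration only.
MORPHEMES = {"un", "lock", "able", "re", "read"}


def segment(word: str) -> list[list[str]]:
    """Return every way of splitting `word` into known morphemes."""
    if not word:
        return [[]]  # one valid segmentation of the empty string: no morphemes
    results = []
    for i in range(1, len(word) + 1):
        head, tail = word[:i], word[i:]
        if head in MORPHEMES:
            # Keep this head and try to segment whatever is left over.
            results.extend([head] + rest for rest in segment(tail))
    return results


print(segment("unlockable"))  # [['un', 'lock', 'able']]
```

This flat split finds the three morphemes but says nothing about whether the word means "not lockable" or "able to be unlocked".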

Lexical Level: The next level of analysis involves looking at words as a whole.

Syntactic Level: Taking a step forward, syntactic analysis focuses on the structure of a sentence: how words interact with each other.

A parse tree is one way we can represent the syntax of a sentence.

Semantic Level: Semantic analysis deals with how we can convert the sentence structure into an interpretation of meaning.

Discourse Level: Here we are dealing with the connection between two sentences. This is incredibly tricky: if somebody says ‘he’, how do we know who they are referring to from a past sentence, if multiple persons were mentioned?

Now that we have a clear understanding of Natural Language Processing, here are some common examples of NLP:

Applications of NLP

Social Media Monitoring

Out-of-control social media can be damaging for a brand.

One of the best examples of Natural Language Processing is social media monitoring. Negative publicity is not good for a brand and a good way to know what your customers think is by keeping an eye on social media.

Platforms like Buffer and Hootsuite use NLP technology to track comments and posts about a brand. NLP helps alert companies when a negative tweet or mention goes live so that they can address a customer service problem before it becomes a disaster.

Sentiment analysis

While standard social media monitoring deals with written texts, with sentiment analysis techniques we can take a deeper look at the emotions of the user.

The user’s choice of words gives a hint as to how the user was feeling when they wrote the post. For example, if they use words such as happy, good, and praise, then it indicates a positive feeling. However, sentiment analysis is far from straightforward: it can be thrown by sarcasm, double entendres, and complex sentence structure, so a good sentiment analysis algorithm should take sentence structure into account.
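
A minimal sketch of the word-counting idea is shown below. The two word lists and the function `sentiment_score` are assumptions made for illustration. A real system would use a much larger lexicon or a trained model.

```python
import re

# Hypothetical mini-lexicons; a real system would use a larger, curated list.
POSITIVE = {"happy", "good", "praise", "great", "love"}
NEGATIVE = {"sad", "bad", "terrible", "hate", "awful"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


print(sentiment_score("I am happy with the good service"))  # 2
print(sentiment_score("Oh great, another terrible delay"))  # 0
```

The second sentence is sarcastic and clearly negative, but the word counts cancel out. This is the kind of case where sentence structure has to be taken into account.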

Companies often use sentiment analysis to observe customers’ reactions towards their brands whenever something new is implemented.

Text analysis

Simple texts can hold deep meanings and can point towards multiple subcategories. Mentions of locations, dates, people, and companies can provide valuable data. The most powerful models are often very industry-specific and developed by companies with large amounts of data in their domain. A machine learning model can be trained to predict the salary of a job from its description, or the risk level of a house or marine vessel from a safety inspection report.

One cool application is forensic stylometry, which is the science of determining the author of a document based on the writing style. I’ve trained a simple forensic stylometry model to identify which of three famous authors is likely to have written a text. You can try it out here.

Forensic stylometry is an NLP technique that allows us to play detective to identify the author of a ghostwritten novel, anonymous letter, or ransom note. Image of Sherlock Holmes is in the public domain.

Healthcare and NLP

Natural Language Processing has also been helping with disease diagnosis, care delivery, and bringing down the overall cost of healthcare. NLP can help doctors to analyse electronic health records, and even begin to predict disease progression based on the large amount of text data detailing an individual’s medical history.

Amazon Comprehend Medical is using NLP to gather data on disease conditions, medication, and outcomes from clinical trials. Such ventures can help in the early detection of disease. Right now, it is being used for several health conditions including cardiovascular disease, schizophrenia, and even anxiety.

Cognitive Assistant

IBM has recently developed a cognitive assistant that acts like a personal search engine. It knows detailed information about a person and then when prompted provides the information to the user. This is a positive step toward helping people with memory problems.

We have seen how helpful Natural Language Processing can be. How does it work in detail? Here we give a brief look into the algorithms.

Some basic approaches to NLP

Bag of Words

Bag of Words is the simplest model in NLP. When we run a bag of words analysis, we disregard the word order, grammar, and semantics. We simply count all the words in a document and feed these numbers into a machine learning algorithm.

For example, if we are building a model to classify news articles into either sport or finance, we might calculate the bag of words score for two articles as follows:

Word | Article A | Article B
Example bag of words score for two articles which we want to assign to sport or finance.

Looking at the above example, it should be easy to identify which article belongs to which category.

The main drawback of the bag of words method is that we are throwing away a lot of useful information which is contained in the word order. For this reason, bag of words is not widely used in production systems.
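
A bag of words can be sketched in a few lines of Python. The two article snippets and the function name `bag_of_words` are invented for illustration only.

```python
import re
from collections import Counter


def bag_of_words(document: str) -> Counter:
    """Count how often each lower-cased word occurs, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", document.lower()))


# Two made-up snippets, used only for illustration.
article_a = "The striker scored a goal and the team won the match"
article_b = "The bank raised interest rates and the market fell"

bow_a = bag_of_words(article_a)
bow_b = bag_of_words(article_b)
print(bow_a["the"], bow_a["goal"], bow_b["goal"])  # 3 1 0

# Word order is discarded, so these two sentences look identical to the model.
print(bag_of_words("dog bites man") == bag_of_words("man bites dog"))  # True
```

The last line shows the drawback in miniature: two sentences with opposite meanings produce exactly the same counts.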


Tokenisation

Tokenisation is often the first stage of an NLP model. A document is split up into pieces to make them easier to handle. Often, each word is a token, but this is not always the case, and tokenisation has to know not to separate phone numbers, email addresses, and the like.

For example, below are the tokens for the example sentence “When will you leave for England?”
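
A minimal regex-based tokeniser for this sentence might look like the sketch below. The function name `tokenise` is an assumption. This naive pattern would also break up email addresses and phone numbers, which a production tokeniser must avoid.

```python
import re


def tokenise(text: str) -> list[str]:
    """Split text into word tokens and single punctuation marks."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


print(tokenise("When will you leave for England?"))
# ['When', 'will', 'you', 'leave', 'for', 'England', '?']
```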


This tokenisation example seems simple because the sentence could be split on spaces. However, not all languages use the same rules to divide words. For many East Asian languages such as Chinese, tokenisation is very difficult because no spaces are used between words, and it’s hard to find where one word ends and the next word starts. German can also be difficult to tokenise because of compound words which can be written separately or together depending on their function in a sentence.

Tokenisation also struggles with multi-word expressions such as “New York”. “New” and “York” each have their own meaning, so treating them as separate tokens can be misleading. For this reason, tokenisation is often followed by a stage called chunking where we re-join multi-word expressions that were split by a tokeniser.
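
One simple way to re-join multi-word expressions is to look up adjacent token pairs in a list of known expressions. The list `MULTIWORD` and the function `rejoin_multiword` below are hypothetical, and the sketch only handles two-word expressions.

```python
# Hypothetical list of two-word expressions that should stay together.
MULTIWORD = {("New", "York"), ("United", "Kingdom")}


def rejoin_multiword(tokens: list[str]) -> list[str]:
    """Merge adjacent token pairs that form a known multi-word expression."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MULTIWORD:
            merged.append(" ".join(pair))
            i += 2  # skip both halves of the expression
        else:
            merged.append(tokens[i])
            i += 1
    return merged


print(rejoin_multiword(["I", "flew", "to", "New", "York", "yesterday"]))
# ['I', 'flew', 'to', 'New York', 'yesterday']
```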

Tokenisation can be unsuitable for dealing with text domains that contain parentheses, hyphens, and other punctuation marks. Removing these details jumbles the terms. To solve these problems, the next methods shown below are used in combination with tokenisation.

Stop Word Removal

After tokenisation, it’s common to discard stop words, which are pronouns, prepositions, and common articles such as ‘to’ and ‘the’. This is because they often contain no useful information for our purposes and can safely be removed. However, stop word lists should be chosen carefully, as a list that works for one purpose or industry may not be correct for another.

To make sure that no important information is excluded in the process, typically a human operator creates the list of stop words.
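
Stop word removal is essentially a filter over the token list. The stop word list below is a tiny, made-up example, and in practice it would be chosen for the task at hand.

```python
# A tiny, hypothetical stop word list; real lists are tuned per task.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "for", "you", "will", "when"}


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop every token whose lower-cased form is in the stop word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


print(remove_stop_words(["When", "will", "you", "leave", "for", "England", "?"]))
# ['leave', 'England', '?']
```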


Stemming

Stemming is the process of removing affixes, both prefixes and suffixes, from words.

Suffixes appear at the end of the word. Examples of suffixes are “-able”, “-acy”, “-en”, “-ful”. Words like “wonderfully” are converted to “wonderful”.

Prefixes appear in front of a word. Some of the common examples of prefixes are “hyper-”, “anti-”, “dis-”, “tri-”, “re-”, and “uni-”.

To perform stemming, a common list of affixes is created, and they are removed programmatically from the words in the input. Stemming should be used with caution as it may change the meaning of the actual word. However, stemmers are easy to use and can be edited very quickly. A common stemmer used in English and other languages is the Porter Stemmer.
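
The sketch below is a deliberately naive suffix stripper, not the Porter Stemmer. The suffix list and the function `naive_stem` are assumptions for illustration, and prefixes are not handled.

```python
# Hypothetical suffix list; longer suffixes are tried before shorter ones.
SUFFIXES = ["ing", "ly", "ed", "s"]


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print(naive_stem("wonderfully"))  # wonderful
print(naive_stem("walking"))      # walk
print(naive_stem("news"))         # new
print(naive_stem("ate"))          # ate
```

The third example shows why stemming needs caution: "news" becomes "new", which is a different word. The last example shows that irregular forms such as "ate" are left untouched.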


Lemmatisation

Lemmatisation has a similar goal to stemming: the different forms of a word are converted to a single base form. The difference is that lemmatisation relies on a dictionary list. So “ate”, “eating”, and “eaten” are all mapped to “eat” based on the dictionary, while a stemming algorithm would not be able to handle this example.

Lemmatisation algorithms ideally need to know the context of a word in a sentence, as the correct base form could depend on whether the word was used as a noun or a verb, for example. Furthermore, word sense disambiguation may be necessary in order to distinguish identical words with different base forms.
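
A dictionary-based lemmatiser can be sketched as a lookup keyed on the word and its part of speech. The table `LEMMAS` and the tag names are made up for this example. A real lemmatiser would use a full dictionary and get the part of speech from a tagger.

```python
# Hypothetical lookup table keyed by (word, part of speech).
LEMMAS = {
    ("ate", "VERB"): "eat",
    ("eating", "VERB"): "eat",
    ("eaten", "VERB"): "eat",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
}


def lemmatise(word: str, pos: str) -> str:
    """Return the dictionary base form, or the word itself if it is unknown."""
    return LEMMAS.get((word.lower(), pos), word)


print(lemmatise("ate", "VERB"))     # eat
print(lemmatise("leaves", "VERB"))  # leave
print(lemmatise("leaves", "NOUN"))  # leaf
```

The word "leaves" has two different base forms here, which is why the part of speech is part of the lookup key.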

Neural networks

For many decades, researchers tried to process natural language text by writing ever more complicated series of rules. The problem with the rule-based approach is that the grammar of English and of other languages is idiosyncratic and does not conform to any fixed set of rules.

For example, for one recent project in the pharma space, I tried to write a series of rules to extract the number of participants from a clinical trial protocol. I found examples like

We recruited 450 participants

The number of participants was N=231

The initial intention was to recruit 45 subjects, however due to dropouts the final number was 38

You can see how difficult it would be to write a set of instructions for a computer on where to find the correct number.
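
To make the difficulty concrete, here is a sketch of what a rule-based extractor could look like. The two patterns and the function `extract_participants` are written for this illustration only and are not the rules from the original project.

```python
import re
from typing import Optional

# Hand-written patterns, one per phrasing seen so far (illustrative only).
PATTERNS = [
    re.compile(r"recruited (\d+) participants"),
    re.compile(r"N\s*=\s*(\d+)"),
]


def extract_participants(sentence: str) -> Optional[int]:
    """Return the first number matched by a known pattern, else None."""
    for pattern in PATTERNS:
        match = pattern.search(sentence)
        if match:
            return int(match.group(1))
    return None


examples = [
    "We recruited 450 participants",
    "The number of participants was N=231",
    "The initial intention was to recruit 45 subjects, however due to "
    "dropouts the final number was 38",
]
for sentence in examples:
    print(extract_participants(sentence))
# prints 450, then 231, then None
```

The first two rules work, but the third sentence needs yet another rule, and that rule would have to prefer 38 over 45. Every new phrasing adds to the list.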

So with this kind of problem, it’s often more sensible to let the computer do the heavy lifting. If you have several thousand documents, and you know the true number of participants in each of these documents (perhaps the information is available in an external database), then a neural network can learn to find the patterns itself, and recognise the number of subjects in a new unseen document.

I believe that this is the way forward, and some of the traditional NLP techniques will be used less and less, as computing power becomes more widely available and the science advances.

Common neural networks used for NLP include LSTMs, Convolutional Neural Networks, and Transformers.


The field of natural language processing has been moving forward in the last few decades and has opened some meaningful ways to an advanced and better world. While there are still challenges in decoding different languages and dialects used around the world, the technology continues to improve at a rapid pace.

NLP has already found applications in finding healthcare solutions and helping companies meet their customers’ expectations. We can expect to see natural language processing affecting our lives in many more unexpected ways in the future.

Machine Learning Consulting – What it is and how businesses can benefit

With all this buzz constantly hovering around big data, AI, and especially machine learning (ML), small businesses and enterprises alike are not only becoming more aware of what it is but also getting increasingly curious about the applications, and particularly, the benefits of machine learning consultancy.

Many among us have most likely heard of ML in some form or the other – but don’t quite know what it actually is, the business problems it can solve or the tremendous value it can add to a business.

In short, ML is a data analysis process which leverages specific ML algorithms to learn iteratively from existing data, which in turn, helps computers discover hidden insights without actually having to rely on specific programming for it. But that’s oversimplifying things somewhat.

What is Machine Learning?

Essentially, ML refers to the study of specific algorithms and models which computers use to perform certain tasks – without having to explicitly rely on a set of programming instructions or code. It’s actually a discipline of artificial intelligence (AI), for reasons that will become clear later in the article.

ML can predict the desired system output through its experience of processing data, and it does so without any previous knowledge of the system’s behavioural model. The algorithms are unique in the sense that they simulate learning capabilities similar to our own learning patterns. This helps the system automatically improve over time and yield more accurate output based on new system inputs.

Any input or information received externally is processed by the system internally to create ‘knowledge’, which is used to improve its performance and efficiency over time to deliver more accurate output, based on new inputs.

ML and machine learning consultancy have actually been around for quite a while. In fact, it’s right under our noses – every time we use Google, that is.

Let’s take the search engine giant’s query search mechanism, for instance. Every time a user keys in a search query, it not only has a purpose behind it but a rather diverse context as well; the text the user enters doesn’t exactly shed light on precisely what kind of information is required. This is why Google is required to “understand” and identify the web pages which are the closest to what the user is searching for.

Google uses machine learning to identify the most relevant results for a user, given their personal search history.

This “knowledge” that Google displays for the user can come from a variety of sources or even factors, all of which contribute toward yielding the correct and relevant results. So let’s call them parameters for now. The parameters include first and foremost, the search query text of course, the user’s web browsing history, the subject matter and URL structure of pages that may be relevant, the frequency of similar queries, the browsing habits of other users who have requested similar content through very similar search words – and a complete string of parameters and factors known only to Google.

Now, at any given moment, Google is required to handle search requests in the millions, with users expecting the search engine to deliver highly accurate results. So how does Google do it?

Given the sheer scale at which Google must perform around the clock, it’s quite impossible to have these search queries processed manually. Therefore, Google relies on machine learning and automation, as well as natural language processing to understand each user’s requirements and search query demands – and then proceeds to rank the most relevant results.

Google is just one such example.

Amazon, Microsoft Azure and Google have each launched a cloud machine learning platform, and since then both machine learning consulting and AI consulting have become critical for businesses in nearly every vertical. Surprisingly though, most of us have already encountered ML without knowing it. The Google example above is just one of the ways we unknowingly experience it every day.

Screenshot of Microsoft Azure Machine Learning Studio, a cloud platform that allows you to train machine learning models without writing code

Email spam detection and face-tagging on Facebook are two more examples: Gmail recognizes the words or patterns that mark a message as spam and filters it out, while Facebook automatically tags uploaded images using face recognition techniques.

And this brings us to the business benefits of machine learning consulting and AI consulting, which are huge, to say the least.

How do businesses benefit from Machine Learning Consulting?

To some people, terms like AI consulting, machine learning consulting or natural language processing probably sound like they were pulled right out of a futuristic movie.

Arthur Samuel, an early machine learning researcher at IBM.

However, the technology’s origins actually date back to the 1950s. It was the American Arthur Samuel, an IBM researcher, who developed one of the very first machine learning programs, which could play checkers, a popular board game.

By the 1990s, machine learning was recognized as a distinct branch of AI and has since produced impressive use cases in nearly every sector. Today, business adoption of machine learning is primarily fueled by improvements in computer processing technologies.

Proactive businesses are already applying computation-heavy ML algorithms to large data sets with significantly lower processing time. As a result, their cost of data storage has reduced, allowing them to access fairly large chunks of data, within which hidden patterns of profitable business knowledge can often be discovered using machine learning technologies.

Some ML algorithms are available as open-source software. In addition, cloud computing allows businesses of all scales to use ML to deliver much improved services to end users, without first having to invest heavily in the required infrastructure.

Machine learning is already reaching maturity, and there are many unique ways businesses can capitalise on and benefit from this technology:

Eliminate manual tasks

AI can be seen as the next wave of the industrial revolution, replacing repetitive tasks and allowing humans to be repurposed to what they do best.

For the majority of the 20th century, industrial automation made use of machines to reduce manual tasks which were both repetitive and predictable.

However, industrial automation largely remained ineffective in terms of replacing manual operations – which required many considerations toward a number of variable parameters, internal system changes and external factors – all of which were highly unpredictable in nature.

The introduction of ML technologies helped fill this gap through predictive models which were applied to data points changing in real time, delivering improved decision-making support and executing task automation accordingly.

Over the last few decades, machine learning applications have evolved far beyond just industrial automation. In fact, they support anything from software-based business services to B2B consumers and end users within the business.

Real-time decision making

On a daily basis, businesses must rely on highly accurate information in order to make key decisions at any given time. We’re living in a highly connected and digitalised world today, which means that extracting the desired information from Big Data would be nearly impossible without bringing some kind of AI consulting or machine learning consulting into the mix.

Machine learning enables businesses to transform huge data sets into knowledgeable and actionable intelligence – which they can use in a number of ways – e.g. to improve the user experience, gain better insights into core issues or specific steps they could take to beat their competitors.

Therefore, this invaluable information can be integrated with daily business processes as well as operational activities to respond readily to business circumstances and changing market demands. Organisations that are already taking advantage of ML are usually the first ones to set a benchmark for their competitors to follow and continue to maintain that competitive edge in real-time.

Reduced operational overheads

Companies can save money on call centres, or divert customers from call centres via AI, or even use AI to improve operations of existing call centres.

Let’s shift our attention to a critical component of any successful business: quality customer support and service. Businesses with a large consumer base often struggle a lot to keep up with their consumers’ demands, and tend to fail at delivering the customer support those consumers have come to expect.

In many cases, they end up hiring large customer support teams which must be properly trained, not to mention the connectivity infrastructure costs for communicating with those customers in an efficient and timely manner.

With ML, however, businesses can leverage chatbots and automated response systems, which would allow them to quickly identify any number of issues, and automatically guide customers to the right solution without any manual input from a customer service rep – thus, saving costs and delivering a highly responsive and to-the-point customer service experience at the end of the day. Nothing irritates a valued customer more than having to wait in long queues or getting a solution that isn’t applicable to them.

Enhanced business models and services

While large and well-established enterprises thrive and dominate by owning a certain chunk of the market share, many businesses must gain a competitive edge by remaining profitable in other domains. Such is the case with SMEs, who gain market dominance by introducing innovative products and services, or newer, more effective business models, for example.

Airbnb is among a handful of companies who have leveraged ML technologies to better realise their unique business model. Machine learning has enabled them to pretty much guarantee highly accurate search results, along with a customer service experience that everyone raves about.

The same can actually apply to companies of all scales and sectors, considering the vast use cases there are when it comes to machine learning consulting. We’ll be discussing some of those use cases later down the article.

Improved security and network performance

Unfortunately, when network intrusions, cybersecurity threats or other similar anomalies occur, businesses rarely have a ‘reaction window’ beforehand. It all happens in real-time and businesses must proactively contain security threats before they escalate into a full-scale attack which can compromise sensitive data or core services.

Machine learning algorithms have the ability to monitor network performance for any security threats and anomalies in real time – and in a way that proactive measures can be automatically taken to mitigate those threats.
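
As a very simplified illustration of anomaly detection, the sketch below flags traffic readings that are far from a historical baseline. The function `find_anomalies`, the threshold of three standard deviations, and the traffic figures are all assumptions. Production systems use far richer models than a single z-score.

```python
from statistics import mean, stdev


def find_anomalies(
    history: list[float], new_values: list[float], threshold: float = 3.0
) -> list[float]:
    """Flag values more than `threshold` standard deviations from the mean."""
    mu = mean(history)
    sigma = stdev(history)
    return [v for v in new_values if abs(v - mu) > threshold * sigma]


# Made-up requests-per-minute figures, for illustration only.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
print(find_anomalies(baseline, [101, 250, 96]))  # [250]
```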

ML algorithms also have the ability to train themselves, much like the human mind, and this allows businesses to automatically scale and improve their cybersecurity over time, adapting to changes on an ongoing basis and replacing manual threat research and analysis with security insights that are specific to your business’s network.

ML has enabled many new-generation cybersecurity providers to come up with newer technologies in order to help their clients swiftly and effectively eliminate threats before they develop into full-blown cyber-attacks.

Simplified product marketing and more accurate sales forecasts

ML can help enterprises in a number of ways to promote their products in a more cost-efficient manner and make far more accurate sales forecasts. There are major sales and marketing advantages to be had including:

  • Since ML can virtually consume an unlimited amount of Big Data, this can be used to persistently review and modify sales and marketing strategies based on specific consumer behavioral patterns. Once a specific model has been ‘trained’ or ‘learned’, the ML algorithm will then be able to identify the correct variables. As a result, you’ll have access to more focused data feeds without the need to analyse lengthy and complicated integrations.
  • The rate at which machine learning can consume data and identify what is relevant means that your sales and marketing team can take the right actions at the right time. For instance, specific machine learning algorithms will automatically select the best subsequent offer to forward to your customers. Consequently, your customers will receive the offer at just the right time, as opposed to you having to invest the time to plan and make that offer visible to those customers at a specific time.
  • And, through ML you can analyse any data related to past customer behaviours or outcomes, and interpret them in a meaningful, profitable way. Based on this new and varying data, you will be able to make far better projections around future customer behaviours.

Product recommendations

Product recommendations as well as upselling and cross-selling are naturally a critical component of any sales and marketing strategy. Your ML algorithm can be designed to analyse product purchase history of a consumer and based on that data, identify specific products from your existing inventory that they may be interested in.

The algorithm will identify hidden purchase patterns within that product purchase history and then group similar products into clusters. This process is referred to as unsupervised learning, which is a particular kind of ML algorithm. A model like this will enable you to make vastly improved product recommendations to your customers, thereby motivating them more to purchase a specific product. Therefore, unsupervised learning can help create an excellent product recommendation model.
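
The sketch below uses simple co-purchase counting rather than a full clustering algorithm, as a lightweight stand-in for the idea of finding products that are bought together. The purchase data and the function `recommend` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up purchase histories: one set of products per customer.
baskets = [
    {"tent", "sleeping bag", "torch"},
    {"tent", "sleeping bag"},
    {"tent", "sleeping bag", "stove"},
    {"novel", "bookmark"},
]

# Count how often each pair of products is bought by the same customer.
co_purchases = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_purchases[(a, b)] += 1
        co_purchases[(b, a)] += 1


def recommend(product: str, top_n: int = 2) -> list[str]:
    """Return the products most often bought together with `product`."""
    scores = Counter({b: n for (a, b), n in co_purchases.items() if a == product})
    return [item for item, _ in scores.most_common(top_n)]


print(recommend("tent", top_n=1))  # ['sleeping bag']
print(recommend("novel"))          # ['bookmark']
```

No labels are needed here: the patterns come purely from which products appear together, which is what makes the approach unsupervised.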

More accurate medical diagnosis and predictions

Machine learning can accelerate and improve medical diagnoses. It does not replace clinicians but serves as an extra tool for them to use.

Machine learning consulting in medical diagnosis has helped numerous healthcare organisations to improve patient health and cut overall healthcare costs with the help of superior diagnostic tools and more effective treatment plans.

ML has helped to identify high-risk patients, make more accurate diagnoses, recommend suitable medical treatments, and predict re-admissions. These are mostly based on the available datasets of patient records as well as the symptoms they exhibited. With more accurate diagnoses and improved medicinal recommendations, patients can recover faster without the need for any extraneous medications. Therefore, ML can be leveraged by the healthcare sector to improve patient health at a faster rate while keeping costs down.

Machine learning consulting & AI Consulting – Use cases and applications

Pharmaceutical industry

There’s massive potential for AI consulting to transform the pharmaceutical sector and introduce better cost savings along each stage of the business. Much like AI in the healthcare industry, uptake of machine learning, AI and natural language processing in the pharmaceutical sector has only just begun, yet many pharma companies are already seeing major returns on their upfront investment.

In fact, here’s a quick read on the way natural language processing, AI consulting and machine learning consulting have changed the face of business in the pharmaceutical sector.

Market personalisation

The more effort you put into understanding your customer, the better you can serve their needs, and of course, the more revenue you will generate. That’s what market personalisation is essentially all about.

Perhaps you’ve already had this experience where you visited a store online, looked at a product and initially decided not to buy it – but then saw digital ads for that exact product while browsing around some other website a few days later. This kind of market personalisation is just a generic example of how machine learning can be leveraged to better showcase your products and sell more units.

Companies can even personalise which emails their customers receive, which coupons or direct mailings they see, the offers they get, or the products that are displayed on their screen as “recommended” as they browse the web. All of these are designed to lead your audience more reliably towards making a purchase.

Data and personal security

Malware is a major problem even in this day and age. In 2014 alone, one company reported that it detected 325,000 malware files every single day. However, another company specialising in AI consulting and machine learning consulting said that every piece of new malware typically has nearly the same code as its predecessors: in reality, only 2-10% of the files change from one iteration to the next.

The learning model they came up with had no problem with the 2-10% variations and could predict which files were, in fact, malware with almost pinpoint accuracy. In other scenarios, ML algorithms can be used to analyse patterns in how data may be accessed in the cloud, and immediately report anomalies which could lead to security breaches.

If you recently attended a big public event or flew to another destination, then you probably waited in long screening lines. However, machine learning is now proving that it can be a great deal of help in terms of eliminating false alarms and spotting things that manual screeners may miss in screenings at concerts, airports, stadiums, etc. This can significantly improve the screening process and ensure more safety at major public events.

Online search

As we discussed at the beginning of the article, probably the most famous use case of machine learning is Google. Each time you key in a search, a machine learning algorithm observes how you respond to the results. So, for example, if you click the topmost result and stay on that same web page, it assumes that you got the information you needed and that the search was pretty much a success.

But, on the other hand, if you click through to page no. 2 of SERPs (search engine result pages), or type a new search term altogether without clicking any of the links on page no.1 or otherwise, then the algorithm assumes the search was not a success. As a result, it learns from its mistakes and delivers a much more streamlined search result in future.


Machine Learning and Natural Language Processing can be used to read and analyse legal contracts.

Natural language processing is now being used in all kinds of exciting applications across multiple disciplines. ML algorithms coupled with NLP can take the place of customer service agents and route customers more efficiently towards the answers they seek – with chatbots being a prime example.

In fact, it’s even being used to turn confusing legal terms in contracts into plain language, and to help attorneys quickly sift through large volumes of case-related data and other legal information in order to prepare for an upcoming case.

What does the future of Machine Learning Consulting hold?

Owing to current adoption trends, machine learning consulting will grow by addressing a number of key issues, including:

Improved ML infrastructure and processes

As machine learning continues to mature and evolve as a programming paradigm, better processes, improved GPUs and AI chips, as well as more automation will make ML a lot easier and faster to use.

More talent for ML

Most ML consultancies are analysing their workforce to identify those who can work with data science. A background in math, statistics or programming usually suffices for those looking to work as data scientists and ML specialists, after a relatively quick brush-up course.

More creativity with data

Advances in natural language processing mean that finding the right data in a haystack is now more straightforward, with areas within AI research such as data synthesis offering more readily available technical solutions.

Be it natural language processing and text analysis, cloud machine learning consulting or conversion rate optimisation with AI consulting, Fast Data Science is ready to serve as your ML/AI partner for now and beyond.