NLP Researcher

NLP Researcher Explains:

Natural language processing (NLP) is a subfield of linguistics that sits at the crossroads of computer science, artificial intelligence and engineering. It has been an active field of research since the 1950s. In 1950, Alan Turing published an article titled “Computing Machinery and Intelligence”, which proposed what is now called the Turing test as a criterion of intelligence. Passing the test is a task that involves the automated interpretation and generation of natural language (e.g. “We are searching in the database”). At that time, NLP was not yet seen as a field of research in its own right, distinct from artificial intelligence.

Natural Language Processing Research

Our main research focus is natural language processing (NLP). The manager, Thomas Wood, studied for a Masters in Computer Speech, Text and Internet Technology at Cambridge University in 2008, and conducted his NLP research project on pleonastic pronouns. Since then he has worked exclusively in machine learning, mostly in NLP. In 2018 he founded Fast Data Science to deliver data science consultancy and research, focusing on NLP. We have built NLP pipelines from scratch, and worked on natural language dialogue systems, document classifiers and text-based recommender systems. For these tasks we have used both traditional machine learning techniques and state-of-the-art approaches such as neural networks. We normally use Python for our NLP research.

Areas of research within NLP

Examples of natural language processing research areas include:

  • Natural language understanding
  • Natural language dialogue systems
  • Text analysis
  • Topic analysis – clustering
  • Document classification
  • Document-based recommender systems
  • Unstructured data analysis
  • Document anonymisation

Fast Data Science - London

Need a business solution?

NLP, ML and data science leader since 2016 - get in touch for an NLP consulting session.

NLP and unstructured data

Today many companies, particularly in industries such as healthcare, pharmaceuticals, legal and insurance, hold large amounts of unstructured data. This is typically data in text format, which may even be scanned documents, PDFs, HTML, or any other file type.

Unstructured data is very difficult to deal with but can contain a goldmine of information. Fast Data Science specialises in extracting value from organisations’ unstructured datasets. If you have a large document set in your organisation, consider hiring a company of NLP researchers such as Fast Data Science.

Natural Language Processing applications in healthcare

AI and natural language processing are being increasingly adopted across the healthcare sector.

Healthtech and MedTech are hot areas of NLP research. NLP researchers are using NLP to compare and detect changes in clinical reports, extract clinical concepts such as MeSH terms from electronic medical records, and develop human-to-machine natural language dialogue systems to improve the healthcare experience. These NLP research breakthroughs are beginning to impact the sector.

We have worked on a number of NLP research projects in healthcare.

Natural Language Processing research at Fast Data Science

We do a lot of natural language processing with Python. We have used many NLP models and architectures in our research, including:

  • Bag of words, tf*idf, cosine similarity
  • NLP pipelines, lemmatisation, parsers, chunkers
  • Deep neural networks
  • Clustering: Latent Dirichlet Allocation
    • This is useful for extracting topics from a set of unstructured documents, for example legal documents, survey responses, factory error reports, etc.
  • Search engines and search term recommenders
  • Google Natural Language, AWS, Microsoft Azure

Topic detection is an NLP technique that allows you to discover common themes in a set of unstructured documents.
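
As an illustration of the first item in the list above (bag of words, tf*idf, cosine similarity), here is a minimal sketch in plain Python. It is not taken from any Fast Data Science project: the function names, the smoothed idf formula and the three sample documents are all assumptions chosen for the example, and a real pipeline would normally add lemmatisation and stop-word handling.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Weight = term frequency * smoothed inverse document frequency.
    The smoothing (+1 terms) is an assumption made for this sketch.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # bag of words for this document
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, invented for this example.
docs = [
    "valve seal cracked during pressure test",
    "cracked seal found on valve after pressure test",
    "survey respondents asked for shorter waiting times",
]
vecs = tfidf_vectors(docs)
sim_reports = cosine_similarity(vecs[0], vecs[1])    # two defect reports
sim_unrelated = cosine_similarity(vecs[0], vecs[2])  # defect report vs survey
print(sim_reports, sim_unrelated)
```

The two defect reports share several terms and so score higher than the unrelated pair, which shares none. The same pairwise score can feed a simple document-based recommender or a first-pass clustering step.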

Natural Language Processing in Python and R

We work with the following programming languages and frameworks:

  • TensorFlow
  • Keras
  • Python NLTK
  • R

Examples of past Natural Language Processing projects

NLP projects we have worked on for major household names include:

  • a spoken dialogue system to control a smart home
  • an unsupervised text analysis program to analyse text descriptions of manufacturing defects (Boehringer Ingelheim)
  • a model to classify jobseekers’ CVs into industries and salary bands (CV-Library)
  • analysis of survey responses (White Ribbon Alliance)

What we can do for you

Transform Unstructured Data into Actionable Insights

Contact us