Fast Data Science updates Drug Named Entity Recognition to 2.0.0

· Thomas Wood
Fast Data Science updates Drug Named Entity Recognition to 2.0.0

Fast Data Science updates Drug Named Entity Recognition Python library

We’re excited to announce a major update to our popular Drug Named Entity Recognition (NER) Python library! This new version (v2.0.0) brings several improvements to make finding drug information in text (named entity recognition) even easier and more accurate.

Here’s what’s new in Drug Named Entity Recognition v2.0.0:

  • Fuzzy matching: Say goodbye to typos! Now, the library can find drugs even if they are misspelled in your text. This is perfect for handling user input or text with potential errors.
  • Improved performance: The library now operates more efficiently.
  • Customisable drug list: You can now add your own drug synonyms or entirely new drugs to the library’s recognition capabilities. This allows you to tailor the library to your specific needs and domain.
  • Bug fixes: A number of bugs have been squashed to ensure a smoother user experience.
  • Molecular structures: The library can return the atomic structure of a drug if the data is available.
  • Lightweight and easy to use: The library remains a user-friendly tool that integrates seamlessly with other NLP libraries.

Get started today!

You can find the project on PyPI and on Github. It’s fully open source with MIT License.

You can install the Python library by typing in the command line:

pip install drug-named-entity-recognition

You can also try the library in your browser on Fast Data Science.

Drug Named Entity Recognition is also available as a Google Sheets plugin

Natural language processing

Want to learn more?

Liked what you’ve just read? Get in touch for an NLP consulting session.
Google Sheets logo

We have a no-code solution where you can use the library directly from Google Sheets!

You can install the plugin in Google Sheets here.

Drug name recogniser

Worked code examples

Molecular structures

from drug_named_entity_recognition.drugs_finder import find_drugs
drugs = find_drugs("i bought some paracetamol".split(" "), is_include_structure=True)

this will return the atomic structure of the drug if that data is available.

>>> print (drugs[0][0]["structure_mol"])
316
  Mrv0541 02231214352D          

 11 11  0  0  0  0            999 V2000
    2.3645   -2.1409    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.7934    1.1591    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    1.1591    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    0.3341    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645   -1.3159    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    1.5716    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    2.3966    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  9  1  0  0  0  0
  2 10  2  0  0  0  0
  3  4  1  0  0  0  0
  3 10  1  0  0  0  0
  4  5  2  0  0  0  0
  4  6  1  0  0  0  0
  5  7  1  0  0  0  0
  6  8  2  0  0  0  0
  7  9  2  0  0  0  0
  8  9  1  0  0  0  0
 10 11  1  0  0  0  0
M  END
DB00316

Fuzzy matching/spelling tolerance

You can get drugs even with spelling mistakes:

drugs = find_drugs("i bought some Monjaro".split(" "), is_include_structure=True, is_fuzzy_match=True)

Add and remove drugs (customise the drugs list)

Now you can modify the drug recogniser’s behaviour if there is a particular drug which it isn’t finding:

To reset the drugs dictionary

from drug_named_entity_recognition.drugs_finder import reset_drugs_data
reset_drugs_data()

To add a synonym

from drug_named_entity_recognition.drugs_finder import add_custom_drug_synonym
add_custom_drug_synonym("potato", "sertraline")

To add a new drug

from drug_named_entity_recognition.drugs_finder import add_custom_new_drug
add_custom_new_drug("potato", {"name": "solanum tuberosum"})

To remove an existing drug

from drug_named_entity_recognition.drugs_finder import remove_drug_synonym
remove_drug_synonym("sertraline")

You may also be interested in these domain specific named entity recognition solutions

Your NLP Career Awaits!

Ready to take the next step in your NLP journey? Connect with top employers seeking talent in natural language processing. Discover your dream job!

Find Your Dream Job

Semantic leakage
Generative ai

Semantic leakage

A person has recently returned from a camping trip and has a fever. Should a doctor diagnose flu or Lyme disease? Would this be any different if they had not mentioned their camping trip? Here’s how LLMs differ from human experts.

Predicting Customer Churn using Machine Learning and AI
Data science consultingAi for business

Predicting Customer Churn using Machine Learning and AI

How can you predict customer churn using machine learning and AI? In an earlier blog post, I introduced the concept of customer churn. Here, I’d like to dive into customer churn prediction in more detail and show how we can easily and simply use AI to predict customer churn.

JICL publication: A generative AI-based legal advice tool for small businesses in distress
Ai in research

JICL publication: A generative AI-based legal advice tool for small businesses in distress

A generative AI-based legal advice tool for small businesses in distress We are pleased to announce the publication of our paper A generative AI-based legal advice tool for small businesses in distress, in collaboration with an interdisciplinary team based in the UK and Hungary.

What we can do for you

Transform Unstructured Data into Actionable Insights

Contact us