Fast Data Science updates Drug Named Entity Recognition to 2.0.0

· Thomas Wood
Fast Data Science updates Drug Named Entity Recognition to 2.0.0

Fast Data Science updates Drug Named Entity Recognition Python library

We’re excited to announce a major update to our popular Drug Named Entity Recognition (NER) Python library! This new version (v2.0.0) brings several improvements to make finding drug information in text (named entity recognition) even easier and more accurate.

Here’s what’s new in Drug Named Entity Recognition v2.0.0:

  • Fuzzy matching: Say goodbye to typos! Now, the library can find drugs even if they are misspelled in your text. This is perfect for handling user input or text with potential errors.
  • Improved performance: The library now operates more efficiently.
  • Customisable drug list: You can now add your own drug synonyms or entirely new drugs to the library’s recognition capabilities. This allows you to tailor the library to your specific needs and domain.
  • Bug fixes: A number of bugs have been squashed to ensure a smoother user experience.
  • Molecular structures: The library can return the atomic structure of a drug if the data is available.
  • Lightweight and easy to use: The library remains a user-friendly tool that integrates seamlessly with other NLP libraries.

Get started today!

You can find the project on PyPI and on Github. It’s fully open source with MIT License.

You can install the Python library by typing in the command line:

pip install drug-named-entity-recognition

You can also try the library in your browser on Fast Data Science.

Drug Named Entity Recognition is also available as a Google Sheets plugin

Natural language processing

Want to learn more?

Liked what you’ve just read? Get in touch for an NLP consulting session.
Google Sheets logo

We have a no-code solution where you can use the library directly from Google Sheets!

You can install the plugin in Google Sheets here.

Drug name recogniser

Worked code examples

Molecular structures

from drug_named_entity_recognition.drugs_finder import find_drugs
drugs = find_drugs("i bought some paracetamol".split(" "), is_include_structure=True)

this will return the atomic structure of the drug if that data is available.

>>> print (drugs[0][0]["structure_mol"])
316
  Mrv0541 02231214352D          

 11 11  0  0  0  0            999 V2000
    2.3645   -2.1409    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.7934    1.1591    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    1.1591    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    0.3341    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645   -1.3159    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    1.5716    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    2.3966    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  9  1  0  0  0  0
  2 10  2  0  0  0  0
  3  4  1  0  0  0  0
  3 10  1  0  0  0  0
  4  5  2  0  0  0  0
  4  6  1  0  0  0  0
  5  7  1  0  0  0  0
  6  8  2  0  0  0  0
  7  9  2  0  0  0  0
  8  9  1  0  0  0  0
 10 11  1  0  0  0  0
M  END
DB00316

Fuzzy matching/spelling tolerance

You can get drugs even with spelling mistakes:

drugs = find_drugs("i bought some Monjaro".split(" "), is_include_structure=True, is_fuzzy_match=True)

Add and remove drugs (customise the drugs list)

Now you can modify the drug recogniser’s behaviour if there is a particular drug which it isn’t finding:

To reset the drugs dictionary

from drug_named_entity_recognition.drugs_finder import reset_drugs_data
reset_drugs_data()

To add a synonym

from drug_named_entity_recognition.drugs_finder import add_custom_drug_synonym
add_custom_drug_synonym("potato", "sertraline")

To add a new drug

from drug_named_entity_recognition.drugs_finder import add_custom_new_drug
add_custom_new_drug("potato", {"name": "solanum tuberosum"})

To remove an existing drug

from drug_named_entity_recognition.drugs_finder import remove_drug_synonym
remove_drug_synonym("sertraline")

You may also be interested in these domain specific named entity recognition solutions

Unlock Your Future in NLP!

Dive into the world of Natural Language Processing! Explore cutting-edge NLP roles that match your skills and passions.

Explore NLP Jobs

Launching Harmony Meta
Ai in research

Launching Harmony Meta

We are excited to introduce the new Harmony Meta platform, which we have developed over the past year. Harmony Meta connects many of the existing study catalogues and registers.

The Ethics of AI in Healthcare: Opportunities and Risks
Generative ai

The Ethics of AI in Healthcare: Opportunities and Risks

Guest post by Jay Dugad Artificial intelligence has become one of the most talked-about forces shaping modern healthcare. Machines detecting disease, systems predicting patient deterioration, and algorithms recommending personalised treatments all once sounded like science fiction but now sit inside hospitals, research labs, and GP practices across the world.

How can you use large language models and stay HIPAA or GDPR compliant?
Generative ai

How can you use large language models and stay HIPAA or GDPR compliant?

If you are developing an application that needs to interpret free-text medical notes, you might be interested in getting the best possible performance by using OpenAI, Gemini, Claude, or another large language model. But to do that, you would need to send sensitive data, such as personal healthcare data, into the third party LLM. Is this allowed?

What we can do for you

Transform Unstructured Data into Actionable Insights

Contact us