Fast Data Science updates Drug Named Entity Recognition to 2.0.0

· Thomas Wood

Fast Data Science updates Drug Named Entity Recognition Python library

We’re excited to announce a major update to our popular Drug Named Entity Recognition (NER) Python library! This new version (v2.0.0) brings several improvements to make finding drug information in text (named entity recognition) even easier and more accurate.

Here’s what’s new in Drug Named Entity Recognition v2.0.0:

  • Fuzzy matching: Say goodbye to typos! Now, the library can find drugs even if they are misspelled in your text. This is perfect for handling user input or text with potential errors.
  • Improved performance: The library now operates more efficiently.
  • Customisable drug list: You can now add your own drug synonyms or entirely new drugs to the library’s recognition capabilities. This allows you to tailor the library to your specific needs and domain.
  • Bug fixes: A number of bugs have been squashed to ensure a smoother user experience.
  • Molecular structures: The library can return the atomic structure of a drug if the data is available.
  • Lightweight and easy to use: The library remains a user-friendly tool that integrates seamlessly with other NLP libraries.

Get started today!

You can find the project on PyPI and on GitHub. It is fully open source under the MIT License.

You can install the Python library by typing in the command line:

pip install drug-named-entity-recognition

You can also try the library in your browser on Fast Data Science.

Drug Named Entity Recognition is also available as a Google Sheets plugin

We have a no-code solution where you can use the library directly from Google Sheets!

You can install the plugin in Google Sheets here.

Worked code examples

Molecular structures

from drug_named_entity_recognition.drugs_finder import find_drugs
drugs = find_drugs("i bought some paracetamol".split(" "), is_include_structure=True)

This returns the atomic structure of the drug, as a MOL-format string under the "structure_mol" key, if that data is available.

>>> print(drugs[0][0]["structure_mol"])
316
  Mrv0541 02231214352D          

 11 11  0  0  0  0            999 V2000
    2.3645   -2.1409    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.7934    1.1591    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    1.1591    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645    0.3341    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.0784    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6500   -0.9034    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3645   -1.3159    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    1.5716    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0790    2.3966    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  9  1  0  0  0  0
  2 10  2  0  0  0  0
  3  4  1  0  0  0  0
  3 10  1  0  0  0  0
  4  5  2  0  0  0  0
  4  6  1  0  0  0  0
  5  7  1  0  0  0  0
  6  8  2  0  0  0  0
  7  9  2  0  0  0  0
  8  9  1  0  0  0  0
 10 11  1  0  0  0  0
M  END
DB00316
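The string printed above is a V2000 MOL block, so it can be read with a few lines of standard-library Python. The sketch below is only an illustration: the function name parse_molfile and the two-atom SAMPLE_MOL are made up for this example and are not part of the library. It assumes the fixed-width V2000 layout, with a counts line ending in "V2000" followed by the atom lines and then the bond lines.

```python
from collections import Counter


def parse_molfile(mol_text):
    """Parse the atom and bond blocks of a V2000 MOL string (illustrative helper)."""
    lines = mol_text.splitlines()
    # Find the counts line by its version tag instead of assuming a fixed header length.
    counts_idx = next(i for i, line in enumerate(lines) if line.rstrip().endswith("V2000"))
    counts = lines[counts_idx]
    # The first two fixed-width (3-character) fields are the atom and bond counts.
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])

    atoms = []
    for line in lines[counts_idx + 1 : counts_idx + 1 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    first_bond = counts_idx + 1 + n_atoms
    for line in lines[first_bond : first_bond + n_bonds]:
        # Bond lines are fixed-width too: first atom, second atom, bond order.
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


# Made-up two-atom block, used only to demonstrate the parser.
SAMPLE_MOL = "\n".join([
    "sample",
    "  hypothetical header",
    "",
    "  2  1  0  0  0  0            999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.2000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "M  END",
])

atoms, bonds = parse_molfile(SAMPLE_MOL)
element_counts = Counter(symbol for symbol, _, _, _ in atoms)
print(element_counts, bonds)
```

Calling parse_molfile(drugs[0][0]["structure_mol"]) on the paracetamol result should give the 11 atoms and 11 bonds declared on the counts line of the listing above (hydrogens are implicit). For anything beyond a quick inspection, a cheminformatics toolkit such as RDKit (Chem.MolFromMolBlock) is likely a better choice than hand-rolled parsing.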

Fuzzy matching/spelling tolerance

The library can find drugs even when they are misspelled. Here "Monjaro" is a misspelling of "Mounjaro":

drugs = find_drugs("i bought some Monjaro".split(" "), is_include_structure=True, is_fuzzy_match=True)
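To give a feel for what spelling-tolerant lookup involves, here is a minimal sketch using Python's standard difflib module. This is not the library's own matching algorithm, which the announcement does not describe. The vocabulary KNOWN_DRUGS, the function closest_drug and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical mini vocabulary; the real library ships a far larger drug dictionary.
KNOWN_DRUGS = ["mounjaro", "paracetamol", "sertraline"]


def closest_drug(token, vocabulary=KNOWN_DRUGS, cutoff=0.8):
    """Return the vocabulary entry most similar to `token`, or None if none is close enough."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


tokens = "i bought some Monjaro".split(" ")
# Keep (token index, matched drug) pairs for tokens that resemble a known drug.
found = []
for i, token in enumerate(tokens):
    match = closest_drug(token)
    if match is not None:
        found.append((i, match))
print(found)
```

The cutoff is the main trade-off in any scheme like this: a lower value tolerates worse typos but starts matching ordinary words to drug names.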

Add and remove drugs (customise the drugs list)

If the recogniser is not finding a particular drug, you can now modify its drugs dictionary:

To reset the drugs dictionary

from drug_named_entity_recognition.drugs_finder import reset_drugs_data
reset_drugs_data()

To add a synonym

from drug_named_entity_recognition.drugs_finder import add_custom_drug_synonym
add_custom_drug_synonym("potato", "sertraline")

To add a new drug

from drug_named_entity_recognition.drugs_finder import add_custom_new_drug
add_custom_new_drug("potato", {"name": "solanum tuberosum"})

To remove an existing drug

from drug_named_entity_recognition.drugs_finder import remove_drug_synonym
remove_drug_synonym("sertraline")
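The four operations above are easiest to picture as edits to a lookup table from surface forms to drug records. The toy model below is a sketch of that idea only. It does not reflect how the library stores its data, and every name in it (_DEFAULT_DRUGS, add_synonym, add_new_drug, remove_synonym, reset, lookup) is hypothetical.

```python
import copy

# Hypothetical built-in dictionary: lowercase surface form -> drug record.
_DEFAULT_DRUGS = {
    "sertraline": {"name": "Sertraline"},
    "paracetamol": {"name": "Acetaminophen"},
}
_drugs = copy.deepcopy(_DEFAULT_DRUGS)


def reset():
    """Discard all customisations and restore the built-in entries."""
    _drugs.clear()
    _drugs.update(copy.deepcopy(_DEFAULT_DRUGS))


def add_synonym(synonym, canonical):
    """Make `synonym` resolve to the same record as an existing entry."""
    _drugs[synonym.lower()] = _drugs[canonical.lower()]


def add_new_drug(name, record):
    """Register a brand-new entry with its own record."""
    _drugs[name.lower()] = record


def remove_synonym(synonym):
    """Stop recognising a surface form; unknown forms are ignored."""
    _drugs.pop(synonym.lower(), None)


def lookup(tokens):
    """Return (record, token index) for every token found in the table."""
    return [(_drugs[t.lower()], i) for i, t in enumerate(tokens) if t.lower() in _drugs]


add_synonym("potato", "sertraline")
after_add = lookup("i ate a potato".split(" "))
remove_synonym("sertraline")
after_remove = lookup(["sertraline"])
reset()
after_reset = lookup(["sertraline", "potato"])
print(after_add, after_remove, after_reset)
```

In this model, resetting also drops the custom "potato" synonym, which is why a reset is a convenient way to return to a known state between experiments.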
