We’re excited to announce a major update to our popular Drug Named Entity Recognition (NER) Python library! This new version (v2.0.0) brings several improvements to make finding drug information in text (named entity recognition) even easier and more accurate.
You can find the project on PyPI and on Github. It’s fully open source with MIT License.
You can install the Python library by typing in the command line:
pip install drug-named-entity-recognition
You can also try the library in your browser on Fast Data Science.
Natural language processing
We have a no-code solution where you can use the library directly from Google Sheets!
You can install the plugin in Google Sheets here.
from drug_named_entity_recognition.drugs_finder import find_drugs
drugs = find_drugs("i bought some paracetamol".split(" "), is_include_structure=True)
this will return the atomic structure of the drug if that data is available.
>>> print (drugs[0][0]["structure_mol"])
316
Mrv0541 02231214352D
11 11 0 0 0 0 999 V2000
2.3645 -2.1409 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.7934 1.1591 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.3645 1.1591 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
2.3645 0.3341 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.0790 -0.0784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.6500 -0.0784 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.0790 -0.9034 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.6500 -0.9034 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.3645 -1.3159 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.0790 1.5716 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.0790 2.3966 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 9 1 0 0 0 0
2 10 2 0 0 0 0
3 4 1 0 0 0 0
3 10 1 0 0 0 0
4 5 2 0 0 0 0
4 6 1 0 0 0 0
5 7 1 0 0 0 0
6 8 2 0 0 0 0
7 9 2 0 0 0 0
8 9 1 0 0 0 0
10 11 1 0 0 0 0
M END
DB00316
You can get drugs even with spelling mistakes:
drugs = find_drugs("i bought some Monjaro".split(" "), is_include_structure=True, is_fuzzy_match=True)
Now you can modify the drug recogniser’s behaviour if there is a particular drug which it isn’t finding:
To reset the drugs dictionary
from drug_named_entity_recognition.drugs_finder import reset_drugs_data
reset_drugs_data()
To add a synonym
from drug_named_entity_recognition.drugs_finder import add_custom_drug_synonym
add_custom_drug_synonym("potato", "sertraline")
To add a new drug
from drug_named_entity_recognition.drugs_finder import add_custom_new_drug
add_custom_new_drug("potato", {"name": "solanum tuberosum"})
To remove an existing drug
from drug_named_entity_recognition.drugs_finder import remove_drug_synonym
remove_drug_synonym("sertraline")
Dive into the world of Natural Language Processing! Explore cutting-edge NLP roles that match your skills and passions.
Explore NLP JobsWhat is generative AI consulting? We have been taking on data science engagements for a number of years. Our main focus has always been textual data, so we have an arsenal of traditional natural language processing techniques to tackle any problem a client could throw at us.
Listen to the new episode of the Clinical Trial Files podcast, where Karin Avila, Taymeyah Al-Toubah and Thomas Wood of Fast Data Science chat about AI and NLP in pharma, the Clinical Trial Risk Tool, what impact AI can make in clinical trials. This episode commemorates Alan Turing’s 113rd birthday on 23 June 2025.
Fast Data Science at will be presenting at the 4th Annual Conference on the Intersection of Corporate Law and Technology at Nottingham Trent University Join Thomas Wood of Fast Data Science, Marton Ribary and Eugenio Vaccari for their presentation “A Generative AI-Based Legal Advice Tool for Small Businesses in Distress” at the 4th Annual Conference on the Intersection of Corporate Law and Technology at Nottingham Trent University
What we can do for you