We have open-sourced a Python library called Medical Named Entity Recognition for finding medical conditions and diseases in a string and returning their MeSH codes. For example, it can recognise a condition name such as “dementia”. Two NLP tasks are involved: named entity recognition (finding medical conditions in text) and named entity linking (mapping the diseases to IDs).
The library is intended for data mining, text mining and other applications of AI in pharma.
Medical Named Entity Recognition finds only the English names of these conditions. Names in other languages are not supported.
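To make the recognition and linking steps concrete, here is a minimal sketch of a dictionary-based approach. It is not the library's implementation or API: the function name find_conditions, the toy dictionary and the identifiers in it are all made up for illustration, and the identifiers are placeholders rather than real MeSH codes.

```python
import re

# Toy dictionary: lower-cased condition name -> identifier.
# The identifiers are made-up placeholders, NOT real MeSH codes.
TOY_DICTIONARY = {
    "dementia": "EXAMPLE:0001",
    "cystic fibrosis": "EXAMPLE:0002",
    "asthma": "EXAMPLE:0003",
}

# Longest dictionary entry, measured in words.
MAX_NGRAM = max(len(name.split()) for name in TOY_DICTIONARY)


def find_conditions(text):
    """Return (matched text, identifier, first token index, last token index)."""
    tokens = re.findall(r"[A-Za-z]+", text)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first so multi-word names win.
        for n in range(min(MAX_NGRAM, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # Recognition and linking happen in one lookup:
            # a dictionary hit both finds the name and yields its ID.
            identifier = TOY_DICTIONARY.get(candidate.lower())
            if identifier is not None:
                matches.append((candidate, identifier, i, i + n - 1))
                i += n
                break
        else:
            # No phrase starting at this token is in the dictionary.
            i += 1
    return matches


print(find_conditions("She has Cystic Fibrosis and early dementia"))
```

Matching is case-insensitive and works on whole tokens, so a short name will not fire inside a longer unrelated word. A real dictionary would also need synonyms and spelling variants, which this sketch leaves out.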
You can install the Python library by typing the following at the command line:
pip install medical-named-entity-recognition
The source code is on GitHub and the project is on PyPI.
If your NER problem is common across industries and likely to have been seen before, there may be an off-the-shelf NER tool for your purposes, such as our Drug Named Entity Recognition Python library or the Country Named Entity Recognition Python library.
Dictionary-based named entity recognition is not always the solution. Sometimes the total set of entities is open and can’t be listed (e.g. personal names), in which case a bespoke trained NER model may be the answer. For tasks like finding email addresses or phone numbers, regular expressions (simple rules) are often sufficient.
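As an illustration of the rule-based case, the sketch below finds email addresses and phone numbers with Python's built-in re module. The function name find_contacts and both patterns are assumptions chosen for this example; they are deliberately loose and will miss or over-match some real-world formats.

```python
import re

# Deliberately simple patterns: fine for a demo, not full validators.
# Email: a local part, "@", then a domain ending in a dot and 2+ letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Phone: optional "+", then digits that may be separated by spaces or hyphens.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s-]{7,}\d")


def find_contacts(text):
    """Return the email addresses and phone numbers found in text."""
    return {
        "emails": EMAIL_PATTERN.findall(text),
        "phones": PHONE_PATTERN.findall(text),
    }


print(find_contacts("Contact jane.doe@example.com or call +44 20 7946 0958."))
```

No dictionary or training data is needed here because the entities follow a predictable surface pattern, which is what makes simple rules a reasonable first choice for this kind of task.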
What we can do for you