Planning data science projects is tricky, and NLP projects can be particularly problematic. Drawing on our past experience, we have built an interactive tool that you can use to estimate task durations and dependencies for an NLP project.
It generates a Gantt chart for your project from the inputs you give it.
Input the parameters of your Natural Language Processing project
Project and organisation level
What is the goal of the project?
Is the client a large organisation with a complex process for procurement, purchase orders, approvals, etc.?
Does the project need to be signed off by a separate executive level in the organisation, or in another organisation?
Who will use the model?
Data
Is the text data multilingual?
Does the text data need to be extracted from PDFs or similar?
Do we need to manually annotate data?
Is the text data sensitive?
Must the data remain on the client's servers?
Is there a risk of AI bias in the data or the model?
Task
Do we need to classify data into more than 10 classes?
Do we need to extract multiple values from text, such as finding percentages, dosages, addresses, names?
Does a gold standard of model performance exist? For example, do human annotators achieve 85% accuracy?
Deliverables
Must a front-end application be developed?
Must the model be deployed and integrated into the existing technology stack?
Does the model need to be retrained regularly?
Do we need to make an explainable AI model?
View your NLP project’s Gantt chart
Timeline: months 1 to 20, with the following tasks:
NDAs
ethics and privacy management
request access to data and systems
kick off meeting
explore data
label data
define metrics for success
develop baseline model
develop a series of models in a leaderboard
select best model
develop front end
deploy model
QA
user testing
handover
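To make the idea concrete, here is a minimal sketch of how answers to a few of the questions above might be turned into a rough schedule. It is not the logic behind the interactive tool: the task names come from the chart, but every duration, the `Answers` fields, the doubling rule for large organisations, and the assumption that tasks run strictly one after another are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass

# Task names are taken from the chart above.
# Every duration (in weeks) is a made-up placeholder, not an estimate from the tool.
BASE_WEEKS = {
    "NDAs": 2,
    "ethics and privacy management": 2,
    "request access to data and systems": 2,
    "kick off meeting": 1,
    "explore data": 3,
    "label data": 4,
    "define metrics for success": 1,
    "develop baseline model": 3,
    "develop a series of models in a leaderboard": 6,
    "select best model": 1,
    "develop front end": 4,
    "deploy model": 3,
    "QA": 2,
    "user testing": 2,
    "handover": 1,
}


@dataclass
class Answers:
    """Hypothetical subset of the questionnaire, one boolean per question."""
    large_organisation: bool = False
    needs_annotation: bool = False
    needs_front_end: bool = False
    needs_deployment: bool = False


def plan(answers):
    """Return a list of (task, start_week, end_week), assuming tasks run in sequence."""
    weeks = dict(BASE_WEEKS)
    if answers.large_organisation:
        # Assumption: paperwork takes twice as long in a large organisation.
        weeks["NDAs"] *= 2
        weeks["request access to data and systems"] *= 2
    # Drop optional tasks that the answers rule out.
    if not answers.needs_annotation:
        weeks.pop("label data")
    if not answers.needs_front_end:
        weeks.pop("develop front end")
    if not answers.needs_deployment:
        weeks.pop("deploy model")

    schedule = []
    start = 0
    for task, duration in weeks.items():
        schedule.append((task, start, start + duration))
        start += duration
    return schedule


def render(schedule):
    """Draw a plain-text Gantt chart, one '#' per week."""
    width = max(len(task) for task, _, _ in schedule)
    rows = []
    for task, start, end in schedule:
        rows.append(f"{task.ljust(width)} |{' ' * start}{'#' * (end - start)}")
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(plan(Answers(needs_annotation=True, needs_deployment=True))))
```

A real plan would let some tasks overlap (for example, ethics review alongside data access requests) and would replace the placeholder durations with figures from your own past projects.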
Getting your Natural Language Processing project off the ground
Now that you’ve made a draft Gantt chart for your NLP project, you can start getting everything together to launch it. Have a look at this list of things you need to consider when starting a data science project. We also have an overview of the key stages of a data science project.
Data science roadmap planner - use this to identify ‘low-hanging fruit’, that is, data science initiatives that are low in technical effort and high in business impact.
NLP project risk tool - identify the factors that increase the risk of a project not being completed.