Above: video of Thomas Wood presenting Harmony at PyData on 27 March 2024
Update: you can download the slides from the presentation here
Link to the meet up: Meetup.com.
I will present our work on Harmony (harmonydata.ac.uk), a free online AI research tool that uses generative AI and LLMs to help psychologists analyse datasets. It uses Python, Pandas and HuggingFace Sentence Transformers to find similarities between questionnaires.
Psychologists and social scientists often have to match items in different questionnaires, such as “I often feel anxious” and “Feeling nervous, anxious or afraid”.
This is called harmonisation.
Harmonisation is a time-consuming and subjective process. Going through long PDFs of questionnaires and putting the questions into Excel is no fun.
We’ve been working on an open-source Python library and free web tool called Harmony, which uses natural language processing and generative AI models to help researchers harmonise questionnaire items, even in different languages.
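To make the matching idea concrete, here is a minimal sketch of pairing items from two questionnaires by cosine similarity. It is not Harmony's actual code: the function names (`embed`, `cosine`, `best_matches`) and the second example item in each questionnaire are made up for illustration, and the embedding is a toy bag-of-words count vector so that the example runs with the standard library only. A real system would swap `embed` for a sentence embedding model, such as one loaded with Sentence Transformers, which can also score paraphrases that share no words.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption for illustration only; a sentence embedding model would
    return a dense vector here instead.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_matches(items_a, items_b):
    """For each item in items_a, find the most similar item in items_b."""
    vecs_b = [embed(q) for q in items_b]
    matches = []
    for q in items_a:
        v = embed(q)
        scores = [cosine(v, vb) for vb in vecs_b]
        best = max(range(len(items_b)), key=scores.__getitem__)
        matches.append((q, items_b[best], scores[best]))
    return matches


# Hypothetical questionnaires; the first item of A and the second of B
# are the example pair quoted in the text.
questionnaire_a = ["I often feel anxious", "I have trouble sleeping"]
questionnaire_b = [
    "Trouble falling or staying asleep",
    "Feeling nervous, anxious or afraid",
]

for a, b, score in best_matches(questionnaire_a, questionnaire_b):
    print(f"{a!r} -> {b!r} (similarity {score:.2f})")
```

With word counts, the two anxiety items match only because they share the word "anxious". That weakness is the reason to use sentence embeddings in practice.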