
We are excited to introduce the new Harmony Meta platform, which we have developed over the past year. Harmony Meta connects many of the existing study catalogues and registers.
If you want to find longitudinal studies that measured a particular variable, such as “psychosis in adolescence”, but you don’t know the exact wording the researchers used, you can now use the AI search on Harmony Meta.
Try it here: https://harmonydata.ac.uk/search
Other examples of things you can search for
All of these will turn up studies that measured these concepts or close synonyms. If the researchers asked a question about “difficulty reading”, your search for “dyslexia” will find it.
We have 5.5 million variables indexed. As far as I can tell, this covers every large longitudinal study ever run in the UK, including well-known studies such as the Millennium Cohort Study, the 1970 British Cohort Study, and Born in Bradford.
Harmony Meta is built on a vector index: each of the 5.5 million variables is converted to a vector using a large language model, so a search can match on meaning rather than on exact wording.
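To give a feel for how a vector index can match “dyslexia” to “difficulty reading”, here is a minimal sketch in Python. It is not Harmony Meta's actual implementation: the three-dimensional vectors are hand-written toy values rather than the output of a language model, the variable texts other than “difficulty reading” are made up, and the function names `cosine_similarity` and `search` are hypothetical. Cosine similarity is also an assumption here; it is one common way to compare vectors.

```python
import math

# Toy "embeddings", hand-written for illustration only.
# In a real system a language model would produce these vectors,
# and they would have hundreds of dimensions, not three.
VARIABLE_VECTORS = {
    "difficulty reading": [0.9, 0.1, 0.0],
    "hours of sleep per night": [0.0, 0.2, 0.9],
    "hears voices others cannot hear": [0.1, 0.9, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, index, top_k=2):
    """Return the top_k variable texts most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]


# Hypothetical vector for the query "dyslexia" (assumed, not model output).
dyslexia_vector = [0.8, 0.2, 0.1]
results = search(dyslexia_vector, VARIABLE_VECTORS)
print(results)
```

With these toy values, “difficulty reading” ranks first because its vector points in nearly the same direction as the query vector, even though the two phrases share no words. A brute-force loop like this is fine for three variables; an index of millions of vectors would normally use an approximate nearest-neighbour structure instead.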
Investigators on this project were Bettina Moltrecht and Eoin McElroy. Rachel Holland Gomes worked on the UI, John Rogers on the front end, Thomas Wood (Fast Data Science) on the back end, with Jay Dugad working on community management.
Harmony Meta was funded by the Economic and Social Research Council (ESRC). We would like to acknowledge our partners: Population Research UK (PRUK), UCL Centre for Longitudinal Studies, DATAMIND UK, The Alan Turing Institute, National Centre for Research Methods, UK Research and Innovation, Wellcome Trust, National Centre for Social Research, and Social Finance.