A national Land Registry hired us to use NLP to interpret land title deeds, which are written in unstructured legal language.
Land registries worldwide face two related problems: digitising older land titles, and converting the free-text field that describes a land title into structured data. One national Land Registry has structured map data with the coordinates of polygons, but the land title itself is a text document. It records the primary ownership of a plot of land, any secondary ownership, and the rights of the owners and of other parties, such as mineral rights.
The title text takes the form of several sentences of highly unstructured legal language, which leave a footprint of the history of the title over the centuries. For example, a house in a city may have been built on a farm that was parcelled up for development a century ago, so the land title will refer both to the much larger plot of the original farm and to the smaller plot of the existing house.
Fast Data Science - London
The land registry also held a large geospatial dataset storing the coordinates of titles and parts of titles. Our NLP project identified the section of a plain-text land title that refers to its primary ownership, so that parts of the title text could be linked to the corresponding polygons in a map file. We used a variety of machine learning techniques, including deep learning models on Microsoft Azure, to match the text to the primary ownership polygon for new, unseen land titles.
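As a rough illustration of the idea, and not the deep learning models used in the project, the sketch below splits a title into sentences and scores each one with a tiny naive Bayes classifier to pick the sentence most likely to describe primary ownership. The training sentences, the labels `primary` and `other`, and the function names are all hypothetical and were written for this example; they are not real registry data.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-written training sentences (not real registry data).
TRAINING = [
    ("The land shown edged red on the title plan is held by the registered proprietor.", "primary"),
    ("The freehold land edged red on the plan is registered in the name of the owner.", "primary"),
    ("The mines and minerals beneath the land are excepted and reserved to a third party.", "other"),
    ("The land is subject to a right of way in favour of the adjoining farm.", "other"),
]


def tokenize(text):
    """Lower-case the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def primary_probability(sentence, model):
    """Return P(label == 'primary' | sentence) with Laplace smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(sentence):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise the log scores into probabilities in a numerically stable way.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    return weights["primary"] / sum(weights.values())


def find_primary_section(title_text, model):
    """Return (confidence, sentence) for the most primary-looking sentence."""
    sentences = [s.strip() for s in re.split(r"(?<=[.;])\s+", title_text) if s.strip()]
    return max((primary_probability(s, model), s) for s in sentences)


model = train(TRAINING)
title = (
    "The freehold land edged red on the title plan is registered to the proprietor. "
    "The mines and minerals are reserved to a third party."
)
confidence, section = find_primary_section(title, model)
print(f"{confidence:.2f}  {section}")
```

In a real system the selected section would then be linked to the matching polygon, and the confidence score could be shown to the user alongside the result. A model trained on four sentences is only a toy; real titles would need far more labelled data and a stronger model.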
Together with the land registry team, we produced a demo web front end where a user could enter a land title and view the results of the analysis with a confidence score. This proof of concept was enough for the land registry to commission further work analysing their dataset of land titles with natural language processing.
Following the proof of concept, the land registry moved on to further development sprints aimed at extracting richer data from their land titles database. This should deliver a better user experience for conveyancing solicitors who want to match a land title to structured map data, allowing fast search, information retrieval, and statistical analysis.
If you have a large number of unstructured text documents and would like Fast Data Science to assist you in topic discovery, cluster analysis, or another NLP analysis, please get in touch with us.
What we can do for you