The Office of Rail and Road (ORR) is the British national rail regulator, responsible for health and safety on mainline rail, the London Underground, light rail, and trams.
The ORR has a large number of structured data feeds in a standardised format of Location, Date, and Value, recording train performance, weather, maintenance costs, trespasses, and other incidents. There is also a large data lake of unstructured text data. The ORR put out a call for software and AI specialists to help them analyse the incident data logged throughout the rail network, using a user-friendly interface such as a drag-and-drop tool.
The engineers and analysts at ORR are often tasked with constructing causal analyses, but an obstacle to these kinds of analysis is the difficulty of linking together disparate and differently structured data sets. The ORR had a structured database of variables representing delays, weather data, repair costs, maintenance, accidents and other information. They had an existing Power BI solution which enabled them to explore datasets and join them to some degree. However, there was no drag-and-drop solution allowing a non-technical user to experiment with machine learning and AI. This is where Fast Data Science came in.
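To illustrate what linking feeds of this shape involves, here is a minimal sketch in Python. It assumes each feed is a list of (location, date, value) records and joins the feeds on the shared location and date. The feed names, the figures, and the `join_feeds` helper are hypothetical and are not taken from the ORR's data or from the delivered tool.

```python
from collections import defaultdict

# Hypothetical feeds in the Location, Date, Value format. All figures are invented.
delay_minutes = [
    ("York", "2021-01-04", 42.0),
    ("York", "2021-01-05", 7.0),
    ("Leeds", "2021-01-04", 15.0),
]
rainfall_mm = [
    ("York", "2021-01-04", 31.5),
    ("Leeds", "2021-01-04", 12.0),
    ("Leeds", "2021-01-05", 0.0),
]


def join_feeds(feeds):
    """Inner-join named feeds on their shared (location, date) key."""
    rows = defaultdict(dict)
    for name, records in feeds.items():
        for location, date, value in records:
            rows[(location, date)][name] = value
    # Keep only the keys for which every feed supplied a value.
    return {
        key: values
        for key, values in sorted(rows.items())
        if len(values) == len(feeds)
    }


joined = join_feeds({"delay_minutes": delay_minutes, "rainfall_mm": rainfall_mm})
for (location, date), values in joined.items():
    print(location, date, values)
```

Only the location and date pairs present in every feed survive the join, so each remaining row can serve as one training example with a value for every variable.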
The ORR set out a need for a graphical user interface which would allow non-technical stakeholders to explore patterns and relationships within the organisation’s data, beyond what would be possible with the standard Power BI set-up.
We developed an in-browser drag-and-drop tool that allows users to explore datasets graphically and link them together, building machine learning models which are able to predict effects such as flood-related delays as a function of flooding and money spent on drainage. We have also enabled users to harness natural language processing (NLP) to find key phrases and topics which are common in given areas of the country or at certain dates.
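As a rough illustration of how key terms might be surfaced per region, the sketch below scores each word with a simple TF-IDF-style weight, treating all of a region's reports as one document. This is an assumed technique chosen for brevity, not necessarily the NLP method used in the tool. The sample reports and the `key_terms` function are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical (region, free-text incident note) pairs. The text is invented.
reports = [
    ("Wales", "Flooding on the track after heavy rain"),
    ("Wales", "Flooding closed the line near the river"),
    ("Scotland", "Trespass reported near the station"),
    ("Scotland", "Trespass on the line delayed services"),
]


def key_terms(reports, top_n=3):
    """Return the top_n words per region, ranked by a TF-IDF-style score."""
    counts = {}
    for region, text in reports:
        words = re.findall(r"[a-z]+", text.lower())
        counts.setdefault(region, Counter()).update(words)

    n_regions = len(counts)
    result = {}
    for region, counter in counts.items():
        scores = {}
        for word, term_freq in counter.items():
            # Number of regions whose reports mention this word at all.
            doc_freq = sum(1 for other in counts.values() if word in other)
            # Words used in every region score zero.
            scores[word] = term_freq * math.log(n_regions / doc_freq)
        # Highest score first, ties broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[region] = ranked[:top_n]
    return result


print(key_terms(reports, top_n=1))
```

Words that appear in every region, such as "the" or "line" in this sample, receive a score of zero, so the ranking favours vocabulary that is distinctive to one area. The same grouping could be done by date instead of region.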
Our GUI was a first for the ORR: it allowed high-ranking stakeholders to experiment with machine learning using a simple and easy-to-understand graphical interface, and it enabled the ORR to develop ideas about the future potential of machine learning in rail regulation.
Using our tool, it was now possible for a non-data-scientist in the organisation to drag and drop data sets in the UI to predict train delays as a function of weather and repair outgoings. The UI gave users the option of linear regression or random forest models.
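The linear regression option can be sketched in a few lines of plain Python. The example below is a minimal illustration under stated assumptions and not the tool's actual implementation: it fits ordinary least squares through the normal equations, using invented figures for rainfall and repair spend. The names `fit_linear` and `predict` and the column meanings are hypothetical. The random forest option is not shown.

```python
def fit_linear(features, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0, *row] for row in features]  # prepend an intercept column
    n = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes the features are not collinear, so no pivot is zero.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coefficients, row):
    """Apply the fitted intercept and weights to one feature row."""
    intercept, *weights = coefficients
    return intercept + sum(w * x for w, x in zip(weights, row))


# Invented data: (rainfall in mm, repair spend in thousands of pounds).
features = [(0, 10), (10, 10), (20, 4), (5, 2), (15, 8)]
# Delay minutes, constructed as 5 + 2 * rainfall - 0.5 * spend.
delays = [0.0, 20.0, 43.0, 14.0, 31.0]

coefficients = fit_linear(features, delays)
print(coefficients)
print(predict(coefficients, (25, 6)))
```

Because the sample delays were generated from a known linear rule, the fitted coefficients recover that rule up to floating-point error. Real feeds would be noisy, and the fit would then be an approximation to be checked against held-out data.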
This allowed a user to simulate questions such as