Building a machine learning GUI for the Office of Rail and Road

The Office of Rail and Road (ORR) is the British national rail regulator, responsible for health and safety on mainline rail, the London Underground, light rail, and trams.
The ORR has a large number of structured data feeds in a standardised format of Location, Date, and Value, recording train performance, weather, maintenance costs, trespasses, and other incidents. There is a further large data lake of unstructured text data. The ORR put out a call for software and AI specialists to help them analyse the incident data logged throughout the rail network, using a user-friendly interface such as a drag-and-drop tool.
The engineers and analysts at ORR are often tasked with constructing causal analyses such as
  • a trespasser was able to get on the track
  • because high winds had damaged fencing
  • because maintenance on barriers had been cut in that region,

but an obstacle to these kinds of analysis is the difficulty of linking together disparate and differently structured data sets.
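
To make the linking problem concrete, here is a minimal Python sketch of how feeds in the Location, Date, Value format could be lined up on a shared key. The feed names, places, and figures are invented for illustration and are not ORR data, and the `link_feeds` helper is hypothetical rather than a description of how the delivered tool works internally.

```python
from collections import defaultdict

# Hypothetical feeds, each a list of (location, date, value) records.
feeds = {
    "wind_speed_mph": [("Kent", "2021-03-01", 62.0), ("Kent", "2021-03-02", 15.0)],
    "fence_repair_spend_gbp": [("Kent", "2021-03-01", 0.0), ("Essex", "2021-03-01", 1200.0)],
    "trespass_incidents": [("Kent", "2021-03-01", 3.0)],
}


def link_feeds(feeds):
    """Pivot several (location, date, value) feeds into one row per (location, date)."""
    table = defaultdict(dict)
    for name, records in feeds.items():
        for location, date, value in records:
            table[(location, date)][name] = value
    return dict(table)


linked = link_feeds(feeds)
for key in sorted(linked):
    # Each row shows every variable recorded at that place on that day.
    print(key, linked[key])
```

Once the feeds share a row per place and day, the wind, repair spend, and trespass figures can be read side by side. Gaps (such as no trespass count for Essex) show up as missing keys rather than being silently filled.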

Created in 2004

330 employees

Regulates Network Rail

The ORR had a structured database of variables representing delays, weather data, repair costs, maintenance, accidents and other information. They had an existing Power BI solution which enabled them to explore datasets and join them to some degree. However, there was no drag-and-drop solution allowing a non-technical user to experiment with machine learning and AI. This is where Fast Data Science came in.

The ORR set out a need for a graphical user interface which would allow non-technical stakeholders to explore patterns and relationships within the organisation’s data, beyond what would be possible with the standard Power BI set-up.
We developed an in-browser drag-and-drop tool that allows users to explore datasets graphically and link them together, building machine learning models which are able to predict effects such as flood-related delays as a function of flooding and money spent on drainage. We have also enabled users to harness natural language processing (NLP) to find key phrases and topics which are common in given areas of the country or at certain dates.
Our GUI was a first for the ORR: it allowed high-ranking stakeholders to experiment with machine learning through a simple, easy-to-understand graphical interface, and it enabled the ORR to develop ideas about the future potential of machine learning in rail regulation.
Using our tool, it became possible for a non-data-scientist in the organisation to drag and drop data sets in the UI to predict train delays as a function of weather and repair outgoings. The UI gave users the option of linear regression or random forest models.

This allowed a user to explore questions such as

  • what would the delays have been in 2021 if Covid had not happened? (a counterfactual), or
  • if next summer is very hot due to climate change, what delays should we expect to see? (a hypothetical).
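
As a rough illustration of how such what-if questions can be put to a fitted model, the Python sketch below fits an ordinary least-squares linear regression (one of the two model types the UI offered; random forest is not shown) to a handful of invented yearly figures, then predicts delays for a hotter summer than any in the data. The column meanings, the numbers, and the helper names (`solve`, `fit_ols`, `predict`) are assumptions made for this example, not the tool's actual code or ORR data.

```python
# Toy what-if sketch: fit ordinary least squares, then predict for a scenario
# that never occurred. All figures below are invented for illustration.


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return [intercept, coef_1, ...] minimising squared error (normal equations)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Apply the fitted intercept and coefficients to one scenario."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Hypothetical history: (peak summer temperature in C, repair spend in GBP millions)
history = [(20, 10), (25, 12), (30, 8), (22, 15), (28, 20), (18, 5)]
# Hypothetical delay minutes (thousands) observed in the same periods.
delays = [45.0, 54.0, 66.0, 46.5, 56.0, 43.5]

weights = fit_ols(history, delays)

# Hypothetical scenario: a 35 C summer with repair spend held at 10.
hot_summer = predict(weights, (35, 10))
print(round(hot_summer, 2))
```

A counterfactual works the same way: replace the observed inputs for a past year with the values they would have taken had the event not happened, and compare the prediction with the recorded delays. A linear model extrapolates in a straight line, so predictions far outside the observed range should be treated with caution.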