We have developed a bot using natural language processing to demonstrate the power of legal NLP.
Fast Data Science have been working with a team of AI and legal experts from the Department of Law and Criminology at Royal Holloway, University of London and the Department of Computer Science at the University of Surrey to build a chatbot which can answer questions on corporate insolvency in England and Wales.
Using prompt engineering, generative models, and the text of key UK statute law such as the Insolvency Act 1986, important case law from the National Archives, and information on procedures from HMRC’s website, the system triages incoming queries and sends a smart and informative prompt to a generative model.
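The triage-and-prompt step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `Passage` class, the keyword-overlap ranking, the prompt wording, and the placeholder passage texts are all assumptions made for the example, and the call to the generative model itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical label, e.g. "statute (placeholder)"
    text: str    # illustrative paraphrase, not quoted legal text


def tokenize(text):
    """Lower-case the text and split it into a set of words without punctuation."""
    return {w.strip(".,;:?!()'\"").lower() for w in text.split()} - {""}


def rank_passages(query, passages, top_k=2):
    """Return the top_k passages sharing the most words with the query.

    Simple word overlap stands in for whatever triage the real system uses.
    """
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in passages]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(query, passages, top_k=2):
    """Assemble an enriched prompt: selected context first, then the user's question."""
    selected = rank_passages(query, passages, top_k)
    context = "\n".join(f"[{p.source}] {p.text}" for p in selected)
    return (
        "You are answering a question about corporate insolvency "
        "in England and Wales.\n"
        f"Relevant material:\n{context}\n\n"
        f"Question: {query}"
    )


# Placeholder knowledge base; a real one would hold statute, case law and guidance.
KNOWLEDGE_BASE = [
    Passage("statute (placeholder)",
            "A director may be liable for wrongful trading if the company "
            "kept trading with no reasonable prospect of avoiding insolvent liquidation."),
    Passage("guidance (placeholder)",
            "A company that stops trading should cancel its VAT registration."),
]

if __name__ == "__main__":
    print(build_prompt("Can a director be liable for wrongful trading?", KNOWLEDGE_BASE))
```

The resulting string would then be sent to the generative model in place of the raw user query. A production system would likely use a stronger retrieval method than word overlap.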
You can try the insolvency chatbot at this link.
The information provided on this website does not, and is not intended to, constitute legal advice.
We have taken an innovative approach to evaluating the bot’s output, since the output of generative models is typically hard to evaluate. We define a mark scheme by hand and use an LLM to assess the bot’s answers to test questions, marking them as if the bot were sitting a law exam.
We take the insolvency bot’s response and pass it to GPT-4 with an accompanying “criterion” question, such as “Does the lawyer mention that piercing the corporate veil may occur as a result of the director breaching their fiduciary duties towards the company?” If the answer comes back as ‘yes’, ‘maybe’, or a yes with caveats, then points are awarded accordingly.
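As a rough sketch of how such a mark scheme could be tallied: the point weights, the verdict normalisation, and the `ask_grader` callable below are all hypothetical. In practice `ask_grader` would wrap a call to the grading LLM; here it is a stub so that the example runs on its own.

```python
# Hypothetical fraction of a criterion's marks awarded for each verdict.
VERDICT_POINTS = {
    "yes": 1.0,
    "yes with caveats": 0.5,
    "maybe": 0.5,
    "no": 0.0,
}


def normalise_verdict(raw):
    """Map the grader's free-text reply onto one of the known verdicts."""
    text = raw.strip().lower()
    if text.startswith("yes"):
        # A bare "yes" earns full marks; anything longer is treated as caveated.
        return "yes" if text.rstrip(".!") == "yes" else "yes with caveats"
    if text.startswith("maybe"):
        return "maybe"
    return "no"


def mark_answer(answer, criteria, ask_grader):
    """Sum the marks for one answer.

    criteria   -- list of (criterion_question, max_points) pairs
    ask_grader -- callable(answer, criterion_question) returning the grader's reply
    """
    total = 0.0
    for question, max_points in criteria:
        verdict = normalise_verdict(ask_grader(answer, question))
        total += max_points * VERDICT_POINTS[verdict]
    return total


def stub_grader(answer, question):
    """Stand-in for the LLM grader: a crude keyword check on the answer."""
    if "fiduciary" in question:
        return "Yes." if "fiduciary" in answer else "No."
    return "Maybe"


if __name__ == "__main__":
    criteria = [
        ("Does the lawyer mention that piercing the corporate veil may occur as a "
         "result of the director breaching their fiduciary duties towards the company?", 2),
        ("Does the lawyer mention a relevant statute?", 1),  # hypothetical criterion
    ]
    answer = "The veil may be pierced where a director breaches fiduciary duties."
    print(mark_answer(answer, criteria, stub_grader))
```

Passing the grader in as a callable keeps the scoring logic testable without any network access, and lets the same mark scheme be reused across different grading models.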
We have some validation scripts in our GitHub repo at: https://github.com/fastdatascience/evaluate_insolvency
We tried a number of variants of the bot, including versions built around GPT-3.5 Turbo and GPT-4, and tested them head-to-head against the unmodified versions of GPT.
We found that GPT-4 is much slower to respond than GPT-3.5 Turbo, but is considerably more precise in its answers.
Our team on this project has been cross-disciplinary, with members from different universities and industries. You can read their profiles here.
We are preparing a paper on the insolvency bot for submission to JURIX 2023 (the 36th International Conference on Legal Knowledge and Information Systems), to be held at Maastricht University, the Netherlands, on 18-20 December 2023. In the meantime, you can cite the project using the following citation:
Wood, T.A., Vaccari, E., Ribary, M., Orban, M., Krause, P., Insolvency Bot [Computer software], Version 1.0, accessed at https://fastdatascience.com/insolvency (2023)
@unpublished{countrynamedentityrecognition,
  AUTHOR = {Wood, T.A. and Vaccari, E. and Ribary, M. and Orban, M. and Krause, P.},
  TITLE = {Insolvency Bot (Computer software), Version 1.0},
  YEAR = {2023},
  Note = {To appear},
  url = {https://fastdatascience.com/insolvency/}
}