Legal chatbot using natural language processing to answer corporate insolvency questions

June 27, 2023 · Thomas Wood

Insolvency bot, taking into account some statute law and some forms and some case law

We have developed a bot using natural language processing to demonstrate the power of legal AI and legal NLP.

Online demo of the tool

Fast Data Science have been working with a team of AI and legal experts at Royal Holloway University’s Department of Law and Criminology and the University of Surrey’s Department of Computer Science to build a chatbot that can answer questions on corporate insolvency in England and Wales.

The system combines prompt engineering with the text of key UK statute law such as the Insolvency Act 1986, important case law from the National Archives, and information on procedures from HMRC’s website. It triages each incoming query and sends an informative prompt, enriched with this material, to a generative model.
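To make the triage step more concrete, here is a minimal Python sketch of how a query could be routed to the relevant kinds of source material before a prompt is assembled. It is an illustration under stated assumptions, not the bot's actual code: the `Source` class, the `TRIGGERS` keyword sets, the fallback to statute law and the prompt wording are all hypothetical, and the source texts are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Source:
    kind: str   # "statute", "case_law" or "procedure"
    title: str
    text: str


# Hypothetical trigger words; the real triage rules are not described here.
TRIGGERS = {
    "statute": {"act", "section", "statute", "liquidation"},
    "case_law": {"case", "court", "judgment", "precedent"},
    "procedure": {"form", "file", "submit", "hmrc"},
}


def triage(question: str) -> set:
    """Return the source kinds whose trigger words appear in the question."""
    words = {w.strip(".,;:?!").lower() for w in question.split()}
    kinds = {kind for kind, triggers in TRIGGERS.items() if words & triggers}
    return kinds or {"statute"}  # assumed fallback when nothing matches


def build_prompt(question: str, sources: list) -> str:
    """Assemble a prompt that contains only the triaged source material."""
    kinds = triage(question)
    context = "\n\n".join(
        f"[{s.kind}] {s.title}\n{s.text}" for s in sources if s.kind in kinds
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder sources: the excerpts are deliberately left empty.
SOURCES = [
    Source("statute", "Insolvency Act 1986", "<excerpt goes here>"),
    Source("case_law", "National Archives judgment", "<excerpt goes here>"),
    Source("procedure", "HMRC guidance", "<excerpt goes here>"),
]

print(build_prompt("Which form do I submit to HMRC?", SOURCES))
```

Keyword matching is only the simplest possible triage rule. A production system would more plausibly classify the query with a model or with embeddings.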

You can try the insolvency chatbot at this link.

The bot uses Retrieval Augmented Generation (RAG). RAG is a design pattern where we add an information retrieval component to a large language model (LLM), allowing us to add internal knowledge to the LLM’s capabilities.
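The RAG pattern can be sketched in a few lines of Python: rank a small set of passages against the query, then prepend the best matches to the prompt. This is a sketch under assumptions rather than the bot's implementation. Bag-of-words cosine similarity stands in for whatever retrieval method the bot uses, the passages are placeholders, and the function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list, k: int = 2) -> list:
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def augment(query: str, passages: list, k: int = 2) -> str:
    """Build a prompt that places the retrieved passages before the question."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Placeholder passages standing in for statute, case law and procedure text.
PASSAGES = [
    "Placeholder passage about winding up a company by the court",
    "Placeholder passage about filing forms with HMRC",
    "Placeholder passage about the duties of company directors",
]

print(augment("How do I file forms with HMRC?", PASSAGES, k=1))
```

The augmented prompt would then be sent to the LLM. That call is omitted here because it depends on the provider and model used.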

The information provided on this website does not, and is not intended to, constitute legal advice.

Validating the Insolvency Bot

We have used an innovative approach to evaluating the bot’s output, since the output of generative models is typically hard to evaluate. We use a human-defined mark scheme and have an LLM assess the bot’s answers to test questions, marking them as if the bot were taking a law exam.

We take the insolvency bot’s response and pass it to GPT-4 with an accompanying “criterion” question such as “Does the lawyer mention that piercing the corporate veil may occur as a result of the director breaching their fiduciary duties towards the company?” If the answer comes back as ‘yes’, ‘maybe’, or a yes with caveats, points are awarded accordingly.
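The marking loop described above might look roughly like the following Python sketch. The point weights in `POINTS`, the `classify` heuristic and the prompt wording are assumptions made for illustration; the real scripts in the evaluation repository may differ. The grader is passed in as a plain function so that the sketch runs without calling any LLM API.

```python
from typing import Callable

# Assumed weights; the post only says points are awarded "accordingly".
POINTS = {"yes": 1.0, "yes_with_caveats": 0.75, "maybe": 0.5, "no": 0.0}


def classify(reply: str) -> str:
    """Map the grader's free-text reply onto one of the verdicts in POINTS."""
    text = reply.strip().lower()
    if text.startswith("yes"):
        hedged = any(w in text for w in (" but", "however", "although"))
        return "yes_with_caveats" if hedged else "yes"
    if text.startswith("maybe"):
        return "maybe"
    return "no"


def score_answer(
    answer: str, criteria: list, ask_grader: Callable[[str], str]
) -> float:
    """Return the fraction of available marks earned by the answer."""
    if not criteria:
        return 0.0
    earned = 0.0
    for criterion in criteria:
        prompt = (
            f"Lawyer's answer:\n{answer}\n\n"
            f"Question: {criterion}\n"
            "Reply with yes, maybe or no."
        )
        earned += POINTS[classify(ask_grader(prompt))]
    return earned / len(criteria)


# Canned grader replies stand in for calls to the grading model.
canned = iter(["Yes", "Maybe, it is only implied"])
score = score_answer(
    "<bot answer goes here>",
    ["<criterion question 1>", "<criterion question 2>"],
    lambda prompt: next(canned),
)
print(f"Score: {score:.0%}")
```

In a real run, `ask_grader` would wrap a call to the grading model, and each test question would carry its own list of criterion questions from the mark scheme.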

We have some validation scripts in our GitHub repo at: https://github.com/fastdatascience/evaluate_insolvency

We tried a number of variants of the bot, including versions built around GPT-3.5 Turbo and GPT-4, and tested them head-to-head against the unmodified GPT models.

We found that GPT-4 is much slower to respond than GPT-3.5 Turbo, but is considerably more precise in its answers.

The team

Our team on this project has been cross-disciplinary, with members from different universities and industries. You can read their profiles here.

Presentation at JURIX 2023

The Insolvency Bot was presented by Marton Ribary at JURIX 2023 (the 36th International Conference on Legal Knowledge and Information Systems), held at Maastricht University, the Netherlands, on 19 December 2023. At this conference, we were able to connect with a number of fascinating projects which also use AI and LLMs to improve access to justice (A2J), such as Toivonen et al.’s presentation “Beyond Debt: The Intersection of Justice, Financial Wellbeing and AI” and Margaret Hagan’s presentation “Good AI Legal Help, Bad AI Legal Help: Establishing quality standards for responses to people’s legal problem stories”.

Citing the Insolvency Bot, DOIs, and resources

Our paper was published in the JURIX conference proceedings. You can cite the project using the following citation:

Ribary, M., Krause, P., Orban, M., Vaccari, E., Wood, T.A., Prompt Engineering and Provision of Context in Domain Specific Use of GPT, Frontiers in Artificial Intelligence and Applications 379: Legal Knowledge and Information Systems, 2023. https://doi.org/10.3233/FAIA230979

Paper:

Evaluation scripts:

PDF of presentation from JURIX 2023: [Click here to download the slideshow presented at the JURIX 2023 conference](/downloads/insolvency-llm-jurix-2023.pdf).

@software{Ribary_Prompt_Engineering_and_2023,
author = {Ribary, Marton and Krause, Paul and Orban, Miklos and Vaccari, Eugenio and Wood, Thomas Andrew},
doi = {10.3233/FAIA230979},
month = dec,
title = {{Prompt Engineering and Provision of Context in Domain Specific Use of GPT}},
url = {https://fastdatascience.com/insolvency/},
year = {2023}
}

References

Ribary, M., et al. Insolvency Bot: A GPT-based Legal Advice Tool for Small Businesses in Distress [Long Paper Version]. Zenodo, 12 Sept. 2023, doi:10.5281/zenodo.10029735.

Bilgin, O., Fields, L., Laverghetta, A. Jr., Marji, Z., Nighojkar, A., Steinle, S., & Licato, J. (2023). AMHR Lab 2023 COLIEE Competition Approach. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Tenth International Competition on Legal Information Extraction/Entailment (COLIEE 2023) in association with the 19th International Conference on Artificial Intelligence and Law (pp. 77–86).

Bittner, M. (1990). The IRAC Method of Case Study Analysis: A Legal Model for the Social Studies. Social Studies, 81(5), 227–230.

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language models are few-shot learners. ArXiv. https://doi.org/10.48550/arXiv.2005.14165

Celikyilmaz, A., Clark, E., & Gao, J. (2020). Evaluation of text generation: A survey. ArXiv. https://doi.org/10.48550/arXiv.2006.14799

Chalkidis, I., Fergadiotis, M., Malakasiotis, P., Aletras, N., & Androutsopoulos, I. (2020). LEGAL-BERT: The Muppets straight out of Law School. Findings of the Association for Computational Linguistics: EMNLP 2020, 2898–2904. https://doi.org/10.18653/v1/2020.findings-emnlp.261

Chung, H. W., Hou, L., Longpre, S., Zoph, B., Tay, Y., Fedus, W., Li, Y., Wang, X., Dehghani, M., Brahma, S., Webson, A., Gu, S. S., Dai, Z., Suzgun, M., Chen, X., Chowdhery, A., Castro-Ros, A., Pellat, M., Robinson, K., … Wei, J. (2022). Scaling Instruction-Finetuned Language Models. In arXiv. https://doi.org/10.48550/arXiv.2210.11416

Code of Business Crisis and Insolvency 2022 (Italy). (2022). https://www.normattiva.it/uri-res/N2Ls?urn:nir:stato:decreto.legislativo:2019-01-12;14

Companies (Rescue Process for Small and Micro Companies) Act 2021 (Ireland). (2021). https://www.irishstatutebook.ie/eli/2021/act/30/section/3/enacted/en/html

Corporate Insolvency and Governance Act 2020 (UK). (2020). https://www.legislation.gov.uk/ukpga/2020/12/contents/enacted

Corporations Amendment (Corporate Insolvency Reforms) Act 2020 (Cth) (Act) (Australia). (2020). https://www.aph.gov.au/Parliamentary_Business/Bills_Legislation/Bills_Search_Results/Result?bId=r6626_

Debbarma, R., Prawar, P., Chakraborty, A., & Bedathur, S. (2023). IITDLI: Legal Case Retrieval Based on Lexical Models. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Tenth International Competition on Legal Information Extraction/Entailment (COLIEE 2023) in association with the 19th International Conference on Artificial Intelligence and Law (pp. 40–47).

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. https://doi.org/10.18653/v1/N19-1423

European Commission. (2003). Commission Recommendation of 6 May 2003 concerning the definition of micro, small and medium-sized enterprises (Text with EEA relevance) (notified under document number C(2003) 1422) (Techreport 2003/361/EC). http://data.europa.eu/eli/reco/2003/361/oj

European Commission. (2022). Proposal for a Directive of the European Parliament and of the Council harmonising certain aspects of insolvency law (Techreport COM/2022/702). https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52022PC0702

Fujita, M., Kiyota, N., & Kano, Y. (2021). Predicate’s Argument Resolver and Entity Abstraction for Legal Question Answering: KIS teams at COLIEE 2021 shared task. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Eighth International Competition on Legal Information Extraction/Entailment (COLIEE 2021) (pp. 15–24).

Glaese, A., McAleese, N., Trębacz, M., Aslanides, J., Firoiu, V., Ewalds, T., Rauh, M., Weidinger, L., Chadwick, M., Thacker, P., Campbell-Gillingham, L., Uesato, J., Huang, P.-S., Comanescu, R., Yang, F., See, A., Dathathri, S., Greig, R., Chen, C., … Irving, G. (2022). Improving alignment of dialogue agents via targeted human judgements. ArXiv. https://doi.org/10.48550/arXiv.2209.14375

Hardcastle, D., & Scott, D. (2008). Can we evaluate the quality of generated text? In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the 6th Language Resources and Evaluation Conference (pp. 3151–3158).

Hart, H. L. A. ([1961] 2012). The concept of law (Third edition). Oxford University Press.

Hutchinson, G. B. (2021). The Small Companies Rescue Act – false hope for failing companies? Company Law Practice?, 7.

Katz, D. M., Bommarito, M. J., Gao, S., & Arredondo, P. (2023). GPT-4 passes the Bar Exam. SSRN. https://doi.org/10.2139/ssrn.4389233

Kim, M.-Y., Rabelo, J., Goebel, R., Kano, Y., Satoh, K., & Yoshioka, M. (2023). COLIEE 2022 Summary: Methods for Legal Document Retrieval and Entailment. In Y. Takama, K. Yada, K. Satoh, & S. Arai (Eds.), New Frontiers in Artificial Intelligence. JSAI-isAI 2022 (Issue 13859, pp. 51–67). Springer. https://doi.org/10.1007/978-3-031-29168-5_4

Kojima, T., Shane Gu, S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2023). Large Language Models are Zero-Shot Reasoners. ArXiv. https://doi.org/10.48550/arXiv.2205.11916

Lin, S., Hilton, J., & Evans, O. (2021). TruthfulQA: Measuring how models mimic human falsehoods. ArXiv. https://doi.org/10.48550/arXiv.2109.07958

Liu, Yiheng, Han, T., Ma, S., Zhang, J., Yang, Y., Tian, J., He, H., Li, A., He, M., Liu, Z., Wu, Z., Zhu, D., Li, X., Qiang, N., Shen, D., Liu, T., & Ge, B. (2023). Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models. ArXiv. https://doi.org/10.48550/arXiv.2304.01852

Liu, Yiqun, Li, H., Su, W., Wang, C., Wu, Y., & Ai, Q. (2023). THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Tenth International Competition on Legal Information Extraction/Entailment (COLIEE 2023) in association with the 19th International Conference on Artificial Intelligence and Law (pp. 1–6).

Lynch, K. (2023). Is ChatGPT a threat to the creative industries? University of Derby Magazine, 17. https://www.derby.ac.uk/magazine/issue-17/chat-gpt-threat-creative-industries/

Microsoft. (2023). Azure Functions. Computer software. https://azure.microsoft.com/en-gb/products/functions

Orban, M. (2023). Mickey-bot. Computer software.

Mishcon de Reya. (2023). Mishcon de Reya’s exploration of AI technologies featured in the media. Mishcon de Reya. https://www.mishcon.com/news/mishcon-de-reyas-exploration-of-ai-technologies-featured-in-the-media

Mokal, R. J., Davis, R., Madaus, S., Mazzoni, A., Mevorach, I., Romaine, B., Sarra, J. P., & Tirado, I. (2018). Micro, small, and medium enterprise insolvency: A modular approach. Oxford University Press.

National Statistics. (2023). Company Insolvency Statistics: April to June 2023 [Techreport]. https://www.gov.uk/government/statistics/company-insolvency-statistics-april-to-june-2023

Nguyen, M. L., Bui, Q. M., Do, D.-T., Le, N.-K., Nguyen, D.-H., Nguyen, K.-H., & Anh, T. P. N. (2023). JNLP@COLIEE 2023: Data Augmentation and Large Language Model for Legal Case Retrieval and Entailment. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Tenth International Competition on Legal Information Extraction/Entailment (COLIEE 2023) in association with the 19th International Conference on Artificial Intelligence and Law (pp. 17–26).

Norton III, W. L., & Bailey, J. B. (2020). The pros and cons of the Small Business Reorganization Act of 2019. Emory Bankruptcy Developments Journal, 36(2), 383–393. https://scholarlycommons.law.emory.edu/ebdj/vol36/iss2/2

Novaes, L. P., Vianna, D., & da Silva, A. (2023). A Topic-Based Approach for the Legal Case Retrieval Task. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Tenth International Competition on Legal Information Extraction/Entailment (COLIEE 2023) in association with the 19th International Conference on Artificial Intelligence and Law (pp. 27–31).

OpenAI. (2022a). Introducing ChatGPT. Blog post. https://openai.com/blog/chatgpt

OpenAI. (2022b). New and improved embedding model. Blog post. https://openai.com/blog/new-and-improved-embedding-model

OpenAI. (2023). GPT-4 Technical Report. In arXiv [Techreport]. https://doi.org/10.48550/arXiv.2303.08774

Oppenheimer, D. (2023). ChatGPT has arrived – and nothing has changed. Times Higher Education. https://www.timeshighereducation.com/campus/chatgpt-has-arrived-and-nothing-has-changed

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback. ArXiv. https://doi.org/10.48550/arXiv.2203.02155

Rabelo, J., Goebel, R., Kano, Y., Kim, M.-Y., Satoh, K., & Yoshioka, M. (2022). Overview and Discussion of the Competition on Legal Information Extraction/Entailment (COLIEE) 2021. The Review of Socionetwork Strategies, 16, 111–133. https://doi.org/10.1007/s12626-022-00105-z

Rattray, K. (2022). Will ChatGPT replace lawyers? Clio. https://www.clio.com/blog/chat-gpt-lawyers/

Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using siamese BERT-networks. In K. Inui, J. Jiang, V. Ng, & X. Wan (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 3982–3992). Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1410

Schilder, F., Chinnappa, D., Madan, K., Harmouche, J., Vold, A., Bretz, H., & Hudzina, J. (2021). A Pentapus Grapples with Legal Reasoning. In J. Rabelo, R. Goebel, Y. Kano, M.-Y. Kim, K. Satoh, & M. Yoshioka (Eds.), Proceedings of the Eighth International Competition on Legal Information Extraction/Entailment (COLIEE 2021) (pp. 60–68).

Shidiq, M. (2023). The use of artificial intelligence-based ChatGPT and its challenges for the world of education: from the viewpoint of the development of creative writing skills. Proceedings of the International Conference on Education, Society and Humanity, 1(1), 353–357.

Small Business, Enterprise and Employment Act 2015 (UK). (2015). https://www.legislation.gov.uk/ukpga/2015/26/contents/enacted

Small Business Reorganization Act of 2019 (US). (2019). https://www.congress.gov/bill/116th-congress/house-bill/3311

The World Bank. (2017). Report on the Treatment of MSME Insolvency [Techreport]. https://documents1.worldbank.org/curated/en/973331494264489956/pdf/114823-REVISED-PUBLIC-MSME-Insolvency-report-low-res-final.pdf

Thomson Reuters Institute. (2023). ChatGPT and Generative AI within Law Firms: Law firms see potential, eye practical use cases and more knowledge around risks [Techreport]. https://www.thomsonreuters.com/en-us/posts/wp-content/uploads/sites/20/2023/04/2023-Chat-GPT-Generative-AI-in-Law-Firms.pdf

UNCITRAL. (2021). Legislative Recommendations on Insolvency of Micro- and Small Enterprises [Techreport]. https://uncitral.un.org/en/ilmse

Vaccari, E. (2022). A Modular Approach to Restructuring and Insolvency Law: Executory Contracts and Onerous Property in England and Italy. Norton Journal of Bankruptcy Law and Practice, 5.

Vaccari, E., Ehmke, D., & Burigo, F. (2023). MSMEs in Distress: Regulatory Costs and Efficiency Considerations in the Implementation of Preventive Restructuring Mechanisms: An Anglo-German-Italian Perspective. Journal of International and Comparative Law.

Vaccari, E., & Ghio, E. (2022). English corporate insolvency law: A primer. Edward Elgar.

Van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual: (Python Documentation Manual Part 2). CreateSpace.

Wakeling, D. (2023). A&O announces exclusive launch partnership with Harvey. Allen & Overy. https://www.allenovery.com/en-gb/global/news-and-insights/news/ao-announces-exclusive-launch-partnership-with-harvey

Walters, A. (2020). The Small Business Reorganization Act: America’s new tool for SME restructuring for the COVID and post-COVID era. The Company Lawyer, 10, 324–325.

Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-consistency improves Chain of Thought reasoning in Language Models. ArXiv. https://doi.org/10.48550/arXiv.2203.11171

Warzel, C. (2023, February 8). Talking to AI might be the most important job skill of this century. The Atlantic. https://www.theatlantic.com/technology/archive/2023/02/openai-text-models-google-search-engine-bard-chatbot-chatgpt-prompt-writing/672991/

Wood, T. (2023a). Evaluate insolvency. GitHub. https://github.com/fastdatascience/evaluate_insolvency

Wood, T. (2023b). Evaluation script for insolvency bot. Zenodo. https://doi.org/10.5281/zenodo.8292105

Xian, Y., Akata, Z., Sharma, G., Nguyen, Q., Hein, M., & Schiele, B. (2016). Latent embeddings for zero-shot classification. https://doi.org/10.48550/arXiv.1603.08895

Ye, X., & Durrett, G. (2022). The unreliability of explanations in Few-shot Prompting for textual reasoning. ArXiv. https://doi.org/10.48550/arXiv.2205.03401

Yu, F., Quartey, L., & Schilder, F. (2022). Legal prompting: Teaching a Language Model to think like a lawyer. ArXiv. https://doi.org/10.48550/arXiv.2212.01326

Zelikman, E., Wu, Y., Mu, J., & Goodman, N. D. (2022). STaR: Bootstrapping Reasoning with Reasoning. ArXiv. https://doi.org/10.48550/arXiv.2203.14465

Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., & Artzi, Y. (2019). BERTScore: Evaluating text generation with BERT. ArXiv. https://doi.org/10.48550/arXiv.1904.09675
