We have developed a bot using natural language processing to demonstrate the power of legal AI and legal NLP.
Fast Data Science have been working with a team of AI and legal experts at Royal Holloway University’s Department of Law and Criminology and the University of Surrey’s Department of Computer Science to build a chatbot that can answer questions on corporate insolvency in England and Wales.
Using prompt engineering, generative models, and the text of key UK statute law such as the Insolvency Act 1986, important case law from the National Archives, and information on procedures from HMRC’s website, the system triages incoming queries and sends a smart and informative prompt to a generative model.
You can try the insolvency chatbot at this link.
The bot uses Retrieval Augmented Generation (RAG). RAG is a design pattern in which an information retrieval component is added to a large language model (LLM): relevant documents are retrieved for each query and placed in the prompt, so that the LLM can draw on knowledge that was not part of its training data.
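To make the RAG pattern concrete, here is a minimal sketch in Python. It is not the Insolvency Bot's actual code: the three documents are placeholder summaries rather than real legal text, the retrieval step is a simple bag-of-words cosine similarity chosen only to keep the example self-contained, and the names `DOCUMENTS`, `retrieve` and `build_prompt` are hypothetical. The resulting prompt would then be sent to a generative model, which is left out here.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base. These are placeholder summaries for
# illustration only, not real statute, case law or guidance wording.
DOCUMENTS = {
    "statute": "Placeholder summary of a statute section about winding up a company that cannot pay its debts.",
    "case_law": "Placeholder summary of a court judgment about directors duties to creditors.",
    "guidance": "Placeholder summary of guidance about paying tax arrears in instalments.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k (name, text) pairs most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents.items(),
        key=lambda item: cosine(query_vec, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Combine the retrieved context and the user's question into one prompt."""
    context = "\n".join(
        f"[{name}] {text}" for name, text in retrieve(query, documents, top_k)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Can my company pay tax arrears in instalments?", DOCUMENTS))
```

A production system would more likely use embedding-based retrieval over a much larger corpus, but the shape is the same: retrieve, assemble a prompt, then generate.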
The information provided on this website does not, and is not intended to, constitute legal advice.
We have taken an innovative approach to evaluating the bot’s output, because the output of a generative model is typically hard to evaluate. We use a human-defined mark scheme and ask an LLM to assess the bot’s answers to test questions, marking them as if the bot were taking a law exam.
We take the insolvency bot’s response and pass it to GPT-4 with an accompanying “criterion” question such as “Does the lawyer mention that piercing the corporate veil may occur as a result of the director breaching their fiduciary duties towards the company?” If the answer comes back ‘yes’, ‘maybe’, or as a yes with caveats, then points are awarded accordingly.
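The marking procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the project's evaluation script (which lives in the repository linked below): the weights in `VERDICT_WEIGHTS` are assumed, since the text only says points are awarded "accordingly", and `stub_judge` is a hypothetical stand-in for the GPT-4 call so that the example runs without any API access.

```python
import re
from typing import Callable, List, Tuple

# Assumed fraction of a criterion's points earned by each verdict.
# The real mark scheme may weight verdicts differently.
VERDICT_WEIGHTS = {"yes": 1.0, "maybe": 0.5, "no": 0.0}

# Words taken (as an assumption) to signal a "yes with caveats".
CAVEAT_WORDS = {"but", "however", "although"}


def normalise_verdict(raw: str) -> str:
    """Map the judge's free-text reply onto 'yes', 'maybe' or 'no'."""
    words = re.findall(r"[a-z]+", raw.lower())
    if not words:
        return "no"
    if words[0] == "yes":
        # A yes followed by a caveat is treated like a 'maybe' here.
        return "maybe" if CAVEAT_WORDS & set(words) else "yes"
    if words[0] == "maybe":
        return "maybe"
    return "no"


def mark_answer(
    answer: str,
    mark_scheme: List[Tuple[str, float]],
    judge: Callable[[str, str], str],
) -> float:
    """Sum the points earned across every (criterion, max_points) pair."""
    total = 0.0
    for criterion, max_points in mark_scheme:
        verdict = normalise_verdict(judge(answer, criterion))
        total += max_points * VERDICT_WEIGHTS[verdict]
    return total


def stub_judge(answer: str, criterion: str) -> str:
    """Hypothetical stand-in for the LLM judge: a crude keyword check."""
    if "fiduciary" in criterion.lower():
        return "Yes" if "fiduciary" in answer.lower() else "No"
    return "Maybe"


if __name__ == "__main__":
    scheme = [
        ("Does the lawyer mention that the director breached their fiduciary duties?", 2.0),
        ("Does the lawyer mention wrongful trading?", 1.0),
    ]
    print(mark_answer("The director may have breached fiduciary duties.", scheme, stub_judge))
```

Passing the judge in as a callable keeps the scoring logic separate from the model call, so the same mark scheme can be reused across different bot variants or judge models.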
We have some validation scripts in our GitHub repo at: https://github.com/fastdatascience/evaluate_insolvency
We tried a number of variants of the bot, including versions built around GPT-3.5 Turbo and GPT-4, and tested them head-to-head against the unmodified versions of GPT.
We found that GPT-4 is much slower to respond than GPT-3.5 Turbo, but is considerably more precise in its answers.
Our team on this project has been cross-disciplinary, with members from different universities and industries. You can read their profiles here.
The Insolvency Bot was presented by Marton Ribary at JURIX 2023 (the 36th International Conference on Legal Knowledge and Information Systems), held at Maastricht University, the Netherlands, on 19 December 2023. At this conference, we were able to connect with a number of fascinating projects which also use AI and LLMs to improve access to justice (A2J), such as Toivonen et al.’s presentation “Beyond Debt: The Intersection of Justice, Financial Wellbeing and AI”, and Margaret Hagan’s presentation “Good AI Legal Help, Bad AI Legal Help: Establishing quality standards for responses to people’s legal problem stories”.
Our paper was published in the JURIX conference proceedings. You can cite the project using the following citation:
@software{Ribary_Prompt_Engineering_and_2023,
  author = {Ribary, Marton and Krause, Paul and Orban, Miklos and Vaccari, Eugenio and Wood, Thomas Andrew},
  doi = {10.3233/FAIA230979},
  month = dec,
  title = {{Prompt Engineering and Provision of Context in Domain Specific Use of GPT}},
  url = {https://fastdatascience.com/insolvency/},
  year = {2023}
}
Ribary, M., et al. Insolvency Bot: A GPT-based Legal Advice Tool for Small Businesses in Distress [Long Paper Version]. Zenodo, 12 Sept. 2023, doi:10.5281/zenodo.10029735.