
Can AI handle legal questions yet? We have compared the capabilities of older and newer large language models (LLMs) on questions about English and Welsh insolvency law, as a continuation of the Insolvency Bot project.
Thomas Wood asked several LLMs a series of questions on insolvency law, set by insolvency expert Eugenio Vaccari and pitched at roughly undergraduate level. We tested older LLMs such as GPT-3.5 as well as newer entrants such as DeepSeek.
We tried using the LLMs “off the shelf” with no modification (our control), and then, as a comparator, we also tried including relevant English and Welsh case law, statutes, and forms from HMRC in the prompt. For example, instead of asking an LLM
I have X debts and Y happened. Should I close my company?
we can ask the LLM,
The Insolvency Act 1986 Section 123 states that [paragraph]. The Companies Act 2006 Section 456 states that [paragraph]. This Supreme Court ruling is relevant: [ruling]. I have X debts and Y happened. Should I close my company?
In other words, we do the legwork of looking up the relevant information and stuffing it into the prompt, and the LLM just has to do what it’s good at, namely formulating sentences. This technique of retrieving relevant text and adding it to a prompt is called retrieval-augmented generation, or RAG. What’s cool about RAG is that the user doesn’t need to see it.
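The prompt-stuffing step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Insolvency Bot's actual code: the `Passage` class, the `retrieve` and `build_prompt` functions, the hand-picked keywords, and the keyword-overlap ranking are all assumptions made for the example, and the passage texts are left as `[paragraph]` placeholders rather than real statute wording.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    citation: str        # e.g. "The Insolvency Act 1986 Section 123"
    text: str            # placeholder here; a real system would store the actual wording
    keywords: frozenset  # hypothetical hand-picked index terms for this toy example


# Toy knowledge base. The texts are placeholders, not real statute wording.
KNOWLEDGE_BASE = [
    Passage("The Insolvency Act 1986 Section 123", "[paragraph]",
            frozenset({"debts", "company"})),
    Passage("The Companies Act 2006 Section 456", "[paragraph]",
            frozenset({"company", "close"})),
    Passage("Unrelated hypothetical source", "[paragraph]",
            frozenset({"trademark"})),
]


def retrieve(question, knowledge_base, top_k=2):
    """Rank passages by keyword overlap with the question (an assumed, simplistic retriever)."""
    words = {w.strip(".,?!").lower() for w in question.split()}
    scored = [(len(p.keywords & words), p) for p in knowledge_base]
    # Keep only passages that share at least one keyword, best matches first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [p for _, p in ranked[:top_k]]


def build_prompt(question, passages):
    """Prepend the retrieved passages to the user's question."""
    context = " ".join(f"{p.citation} states that {p.text}." for p in passages)
    return f"{context} {question}".strip()


question = "I have X debts and Y happened. Should I close my company?"
augmented_prompt = build_prompt(question, retrieve(question, KNOWLEDGE_BASE))
print(augmented_prompt)
```

The user only ever types the question; the augmented prompt is assembled behind the scenes and sent to the LLM in its place. A production system would likely swap the keyword overlap for a proper search index or embedding-based retrieval.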
We found that, over the last few years, the more advanced LLMs have benefited more from the extra information that we included in the prompt. LLMs are constantly improving in their unmodified form, but you can see clearly in this time series plot that RAG has become more effective over time.
The models released in 2025, such as DeepSeek and the current iteration of Gemini, now outperform the earlier GPT-3.5 by a large margin. This was not surprising. What is unexpected is that the RAG-augmented models have even more of an edge over their non-RAG counterparts than they did one or two years ago.
So we built a RAG system long before Google Gemini or DeepSeek came out, and it has performed far better on those models than on any model we had access to back in 2023 when we developed the system. Any ideas why this could be? Contact us and let us know your thoughts!
The pace of improvement is also astounding. Could we be facing a new Moore’s law in AI?
You can read our original paper (which predates the release of DeepSeek) here:
And you can try the Insolvency Bot here: https://fastdatascience.com/insolvency