This was the second time Linklaters had run its LinksAI benchmark tests, with the original exercise taking place in October 2023.
In the first run, OpenAI’s GPT-2, GPT-3 and GPT-4 were tested alongside Google’s Bard.
The exam has now been expanded to include o1, from OpenAI, and Google’s Gemini 2.0, which was also released at the end of 2024.
It did not involve DeepSeek’s R1 – the apparently low-cost Chinese model that astonished the world last month – or any other non-US AI tool.
The test involved posing the type of questions which would require advice from a “competent mid-level lawyer” with two years’ experience.
The newer models showed a “significant improvement” on their predecessors, Linklaters said, but still performed below the level of a qualified lawyer.
Even the most advanced tools made mistakes, left out important information and invented citations – albeit less often than earlier models.
The tools are “starting to perform at a level where they could assist in legal research”, Linklaters said, giving the examples of providing first drafts or checking answers.
However, it said there were “dangers” in using them if lawyers “don’t already have a good idea of the answer”.
It added that despite the “incredible” progress made in recent years there remained questions about whether that would be replicated in future, or if there were “inherent limitations” in what AI tools could do.
In any case, it said, client relations would always be a key part of what lawyers do, so even future advances in AI tools would not necessarily do away with what it called the “fleshy bits in the delivery of legal services”.