Linklaters tests AI with legal exams to assess reliability

Magic circle firm Linklaters is testing artificial intelligence by giving it law exams to determine whether it can reliably perform legal tasks.

The firm has created the LinksAI English law benchmark, designed to “test the ability of large language models to answer legal questions.” The exams assess AI systems on overall correctness, use of citations, and clarity, with human adjudicators monitoring for errors and fabricated references.

The questions, described as challenging, are set at a level that a lawyer with at least two years’ post-qualification experience would be expected to answer in a specialist field.

Linklaters has been conducting these tests for two years and reports “significant improvements” in AI performance. In 2023, four models – GPT-2, GPT-3, GPT-4 and Bard – were tested, with Bard achieving the highest score of 4.4 out of 10. However, evaluators noted that all models “were often wrong” and produced “fictional” citations.

By 2025, OpenAI o1 led the field with a score of 6.4 out of 10, while Gemini 2.0 followed closely at 6.0. The firm noted that “material increases in the scores for substance and the accuracy of citations” had driven the improvements.

Despite this progress, Linklaters cautioned that AI should not be used to provide legal advice on English law “without expert human supervision”, as such systems are “still not always right and lack nuance”.

However, the firm acknowledged that AI could soon be useful for tasks such as drafting initial documents or cross-checking legal texts, particularly in “well-known areas of law”, if expert oversight is available.
