Harvey Leads In Benchmarking Study Against Human Lawyers

Harvey emerged as the top-performing platform in an independent benchmarking study of legal artificial intelligence tools released Thursday, which also showed how AI's strengths and weaknesses compare with those of a group of human lawyers.

The study, conducted by the AI evaluation platform Vals AI with collaboration from Legaltech Hub, evaluated five leading legal generative AI tools across seven tasks. Vals AI says its auto-evaluation framework produces blind assessments to evaluate the accuracy of AI models.

Vals AI tested the AI tools against a control group of independent lawyers, called the Lawyer Baseline, supplied by Cognia Law, an alternative legal service provider. The results showed that AI does deliver some value in legal work.

"Generative AI has reshaped the legal landscape, but not all tools are created equal," Rayan Krishnan, co-founder of Vals AI, said in a statement. "Our study not only measures performance, but also establishes first-ever standards that legal professionals and developers can rely on to understand the technology's impact, but most importantly, its limitations."

Harvey, the fast-growing legal tech startup that just raised a , put its AI assistant into six of the seven tasks in the study. It earned the top score among the AI tools on five tasks and outperformed the Lawyer Baseline on four. It also tied the Lawyer Baseline in generating a chronology, but stayed out of the EDGAR research task.

This marked the first public benchmarking evaluation of Harvey's AI assistant.

CoCounsel, the AI tool from Thomson Reuters, was the only other tool to receive a top score in the study, which it earned for summarizing documents. Thomson Reuters submitted its product in four of the seven task areas for the study, surpassing the Lawyer Baseline in those four and achieving the highest average score across them.

Vincent AI, the AI assistant from vLex, participated in seven tasks. It performed better than the Lawyer Baseline in document question-answering, document summarization and transcript analysis.

Oliver, the AI assistant from Vecflow, the newest company in the study, opted into five tasks. It outperformed the Lawyer Baseline in document question-answering and document summarization, and was the only AI tool to opt into the EDGAR research category.

Lexis+ AI, the AI platform from LexisNexis, was originally part of the study, but withdrew from all tasks except for legal research. Vals AI plans to release its study on legal research soon.

The Lawyer Baseline topped the AI tools in the tasks of EDGAR research and redlining, which refers to editing contracts.

A consortium of law firms, including Reed Smith LLP, Fisher Phillips, McDermott Will & Emery LLP, Ogletree Deakins Nash Smoak & Stewart PC and four anonymous firms, contributed sample questions and documents for the study.

Vals AI found that AI tools outperformed the human lawyers in easy cases, but fell short in complex and reasoning-intensive tasks.

"These results offer a balanced perspective for the legal community," Langston Nashold, co-founder of Vals AI, said in a statement. "For developers, it's a roadmap to prioritize innovation in underperforming areas. For law firms, it's a guide to making strategic investments in AI that enhance both client service and operational ROI."

--Editing by Adam LoBelia.

Law360 is owned by LexisNexis Legal & Professional, a RELX company.


