Patent AI Insights is the expert resource for AI-powered patent prosecution, maintained by Roger Hahn, USPTO Registered Patent Attorney (Reg. No. 46,376) and founder of ABIGAIL. Topics include Office Action response strategies, prior art analysis, examiner intelligence, claim amendment techniques, and comparisons of AI patent tools.
The Patent AI Transparency Crisis: $100M in Funding, Zero Published Accuracy Data
The patent AI industry has raised over $100 million in venture funding. Not a single company has published reproducible accuracy benchmarks. Here is the vendor-by-vendor evidence.
$100M+ in Funding. Zero Published Benchmarks.
In the last three years, patent AI startups have collectively raised over $100 million in venture capital. Solve Intelligence closed a $55M Series B. Patlytics raised $21M. IPRally secured $35M. PatSnap reached IPO-level scale. These are not small bets -- investors are pricing in the assumption that AI will transform patent prosecution.
But there is a problem that no one in the industry wants to talk about: not one of these companies has published reproducible accuracy benchmarks for their core patent prosecution capabilities.
Not a single hallucination rate. Not a single error rate on claim amendments. Not a single verifiable accuracy metric on Office Action analysis. The entire industry is asking patent attorneys to trust AI with their clients' intellectual property -- and offering nothing but marketing copy as evidence that it works.
Patent attorneys have a duty of candor to the USPTO. They face malpractice liability for every statement in every filing. Yet the tools they are being sold have never been independently evaluated for accuracy. The vendors know their accuracy numbers. They have chosen not to share them.
Vendor-by-Vendor Analysis
We reviewed every major patent AI vendor's public materials -- websites, blog posts, press releases, academic publications, and customer case studies. Here is what we found.
| Vendor | Funding | Founded | Published Benchmarks | Accuracy Claims | What's Verified |
|---|---|---|---|---|---|
| Solve Intelligence | $55M (Series B, Dec 2025) | 2022 | None | "50% more productive" | Nothing. Productivity claim has no published methodology, sample size, or accuracy data. |
| Patlytics | $21M | 2023 | None | "18x customer growth" | Nothing. Customer growth is a sales metric, not an accuracy metric. |
| IPRally | $35M | 2018 | None | Blog post references search metrics | Nothing. The blog post discussed methodology but published no actual data. |
| PatSnap | IPO-level | 2007 | Single metric | "81% X Hit Rate" | One metric for prior art search only. No data on prosecution analysis, claim amendments, or hallucination rates. |
| DeepIP | Undisclosed | 2020 | None | Self-published feature comparisons | Nothing. Feature lists are not accuracy data. |
| Lexis+ AI | $650M acquisition (RELX) | N/A | None from the vendor | General AI assistant for legal research | Stanford study found a 17% hallucination rate on legal citations. |
| Westlaw AI | Thomson Reuters | N/A | None from the vendor | AI-assisted legal research | Stanford study found a 33% hallucination rate on legal citations. |
Data compiled from public sources as of March 2026. If any vendor believes this is inaccurate, we invite them to publish their benchmarks.
The Stanford Hallucination Study
In 2025, researchers at Stanford (Magesh et al.) published what is arguably the most important empirical study on AI accuracy in legal practice. They systematically tested major legal AI tools for hallucinated citations -- cases, statutes, and references that the AI presented as real but that do not exist.
One in three legal citations (33%) generated by Westlaw AI was fabricated. These are not paraphrasing errors or minor inaccuracies -- they are citations to cases and authorities that do not exist.
Roughly one in six citations (17%) generated by Lexis+ AI was fabricated. While better than Westlaw, this is still catastrophic for a tool marketed to practicing attorneys who face sanctions for citing nonexistent authorities.
GPT-4, Claude, and other general-purpose LLMs hallucinated legal citations at rates between 58% and 82%. This establishes the baseline that legal-specific tools are trying to improve on.
The Stanford study matters for patent AI because it demonstrates an uncomfortable truth: even the largest, best-funded legal AI tools hallucinate at alarming rates. And those are the tools that have actually been tested. The patent AI vendors listed above have never been independently tested at all.
If Westlaw and Lexis -- with billions of dollars in resources -- produce hallucination rates of 17-33%, what should we expect from patent AI startups that have never submitted to independent evaluation?
What "50% More Productive" Actually Means
Solve Intelligence's flagship claim is that their tool makes patent attorneys "50% more productive." This claim has been repeated in press releases, investor materials, and marketing content. Let us examine what it actually tells you.
Questions this claim does not answer
- What is the accuracy of the AI's Office Action analysis?
- How often does it hallucinate prior art citations?
- What percentage of suggested claim amendments introduce new matter?
- How was "productivity" measured? Time to completion? Output volume? Quality?
- What was the sample size? How were participants selected?
- Was the study conducted by an independent party or by the vendor?
A tool that produces Office Action responses 50% faster is worthless if 20% of those responses contain hallucinated citations. It is worse than worthless -- it is a malpractice liability accelerator. Speed without accuracy is not productivity. It is risk multiplication.
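To put numbers on "risk multiplication," here is a back-of-the-envelope sketch. Every input is an assumed placeholder except the vendor's 50% productivity claim and the hypothetical 20% hallucination rate from the paragraph above:

```python
# Toy model of speed vs. accuracy. All inputs are illustrative assumptions
# except the vendor's 50% productivity claim and the hypothetical 20%
# hallucination rate discussed above.
responses_per_year = 100      # assumed annual filing volume
hours_per_response = 10       # assumed baseline drafting time
speedup = 0.50                # "50% more productive"
hallucination_rate = 0.20     # hypothetical rate from the text
hours_per_incident = 40       # assumed cost of one fabricated citation
                              # (corrections, sanctions exposure, client fallout)

# "50% more productive" means 1.5x output per hour, so each response
# takes 1/1.5 of the baseline time.
hours_saved = responses_per_year * hours_per_response * (1 - 1 / (1 + speedup))
expected_cleanup = responses_per_year * hallucination_rate * hours_per_incident

print(f"Hours saved per year:            {hours_saved:.0f}")       # ~333
print(f"Expected cleanup hours per year: {expected_cleanup:.0f}")  # 800
```

Under these assumptions the expected cleanup cost dwarfs the time saved. Change the placeholders however you like; the structure of the tradeoff does not change.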
The same logic applies to every vendor on the list. Customer growth, feature comparisons, and productivity claims are not substitutes for accuracy data. The only metric that matters for a patent attorney is: how often does this tool produce incorrect output that could harm my client?
The Buyer's Checklist: 4 Questions for Every Patent AI Vendor
Before you sign a contract with any patent AI vendor, ask these four questions. If they cannot answer all four, you do not have enough information to evaluate the tool's safety for your practice.
Do you publish reproducible benchmarks?
Not marketing metrics. Not customer testimonials. Published accuracy data on a disclosed dataset using a disclosed methodology that an independent party could reproduce. If the answer is no, ask why.
What is your hallucination rate on MPEP and case law citations?
Any tool that generates legal citations must disclose how often those citations are fabricated. The Stanford study showed rates of 17-33% for major legal AI tools. If a vendor does not know their hallucination rate, they have not measured it. If they have measured it and will not share it, draw your own conclusions.
Have you disclosed your failure modes?
Every AI system has failure modes -- categories of input where it performs poorly. A vendor that claims their tool works equally well on all technology areas, all rejection types, and all prosecution scenarios is either lying or has not tested thoroughly enough to discover the failure modes.
Has an independent party evaluated your tool?
Self-reported accuracy is not accuracy. Vendor-selected case studies are not benchmarks. Ask whether any third party -- academic, journalistic, or institutional -- has independently evaluated the tool. If not, you are relying entirely on the vendor's self-assessment.
The Revealed Preference Argument
Economists use the term "revealed preference" to describe the idea that people's actual choices reveal their true beliefs, regardless of what they say. The same principle applies to patent AI vendors and accuracy data.
Consider the following logic:
- If a vendor's tool performed well on independent benchmarks, publishing those results would be a massive competitive advantage.
- Publishing benchmarks is relatively cheap. The datasets exist. The evaluation frameworks exist. A single engineer could run a benchmark suite in a week.
- Every major patent AI vendor has chosen not to publish benchmarks.
- Therefore, the most likely explanation is that they expect the results to be unfavorable.
This is not speculation. This is basic incentive analysis. A vendor sitting on great accuracy numbers would publish them immediately -- it would be the single most effective marketing asset they could produce. The silence is the data.
When a vendor tells you their tool is "accurate" or "reliable" but refuses to publish numbers, they are asking you to trust their marketing department over your own due diligence. For a profession built on evidence and precision, this should be unacceptable.
PatentBench: Open Benchmarks for Patent AI
We built PatentBench because we were tired of the silence. PatentBench is the first open, reproducible benchmark suite for patent prosecution AI. It is designed to do what no vendor has been willing to do: measure accuracy on real patent prosecution tasks with transparent methodology.
- Every evaluation metric, dataset, and scoring rubric is public. Anyone can reproduce our results.
- The benchmark dataset is derived from real USPTO Office Actions and prosecution histories. No synthetic data.
- Benchmarks cover Office Action analysis, rejection classification, claim amendment quality, and citation accuracy.
- Any patent AI tool can be evaluated against the same benchmarks. We challenge every competitor to submit their tool for evaluation.
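To make "reproducible" concrete, here is a minimal sketch of the kind of citation-accuracy check a benchmark like this can run. The file names, JSON schema, and scoring function are illustrative placeholders, not PatentBench's actual harness:

```python
# Minimal sketch of a reproducible citation-accuracy check.
# File names and the JSON schema are illustrative placeholders.
import json

def load_verified_citations(path: str) -> set[str]:
    """Load a fixed, published set of citations known to exist."""
    with open(path) as f:
        return set(json.load(f))

def hallucination_rate(cited: list[str], verified: set[str]) -> float:
    """Fraction of a tool's citations that are not in the verified set."""
    if not cited:
        return 0.0
    fabricated = [c for c in cited if c not in verified]
    return len(fabricated) / len(cited)

if __name__ == "__main__":
    verified = load_verified_citations("verified_citations.json")
    # One record per Office Action response generated by the tool under
    # test, e.g. {"application": "...", "citations": ["In re Fulton, ..."]}
    with open("tool_outputs.json") as f:
        outputs = json.load(f)
    per_response = [hallucination_rate(o["citations"], verified) for o in outputs]
    print(f"Mean per-response hallucination rate: "
          f"{sum(per_response) / len(per_response):.1%}")
```

Because both inputs are fixed, published files, anyone who downloads the dataset gets the same number -- which is the whole point of a reproducible benchmark.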
An Open Challenge to Every Patent AI Vendor
We have published our benchmarks, our methodology, and our results. We challenge Solve Intelligence, Patlytics, IPRally, PatSnap, and every other patent AI vendor to do the same. Run PatentBench against your tool and publish the results. If your tool is as good as you claim, you have nothing to lose.
Stay Updated on Patent AI Accountability
Get notified when we publish new vendor evaluations, benchmark results, and transparency reports.
See What Transparent Patent AI Looks Like
ABIGAIL publishes its accuracy data, discloses its failure modes, and subjects itself to independent evaluation via PatentBench. Try it free and see the difference.