By Michael Borella

The patent world is no stranger to hype cycles.  We have seen blockchain, NFTs, and the Internet of Things all promised as revolutionary technologies.  But generative artificial intelligence (GenAI) feels different.  It is not just a new technology to patent – it is a tool that its proponents contend can do the work of patent practice itself: application drafting, prosecution, and litigation.

Many practitioners are currently being bombarded with offers for software that can allegedly draft claims, prepare office action responses, and predict Patent Trial and Appeal Board (PTAB) outcomes.  However, as anyone who has spent an hour stress-testing a large language model (LLM) knows, these systems possess a baffling duality.  They are simultaneously the smartest person in the room and the most confident liar you have ever met.

To understand why GenAI has not earned its seat at the counsel table, we have to look at where it succeeds, where it fails spectacularly, and why its ultimate lack of personality is a fatal flaw.

When functioning correctly, GenAI acts as a researcher with near-infinite stamina and unparalleled speed.  It can sift through massive, unstructured datasets, such as thousands of pages of discovery or decades of jurisdictional case law, to identify patterns and edge-case precedents that a human might overlook.  By pairing natural language queries with fast synthesis, these tools can reduce a twenty-hour research task to just a few minutes of high-level review.

One might be forgiven for assuming that this efficiency makes the human attorney obsolete.  The reality, however, is that these models are prone to logical failures and factual fabrications that would make even a first-year associate blush.

The Car Wash Problem

We often mistake linguistic fluency for intelligence.  Because an LLM can string together a perfectly grammatical sentence, we assume it understands the physical world.  It does not.

Consider a logic trap that trips up many current models: “The car wash is down the block from me.  Should I walk or drive there?”[1]  To a human, the question is good for a laugh.  If you walk to the car wash, you’ve missed the point.  But an LLM, processing the prompt through the lens of distance and convenience, might weigh the health benefits of a two-minute stroll versus the efficiency of driving, completely oblivious to the fact that the car is the object of the trip, not just the transport.

This is not just an anecdote.  In legal writing, a lack of grounded, real-world logic is problematic.  If an LLM does not understand the physical relationship between a sensor and a circuit board, it cannot draft a meaningful description of any related invention.  It can mimic the structure of a specification, but it has little or no concept of what the invention actually does in 3D space.

Two Halves of the Legal Brain

Legal writing is a unique beast.  It requires a bifurcated approach that I like to think of as the fact-persuasion divide.

On one hand, a legal document needs to keep the facts straight.  In a patent application or a brief submitted to a court, the facts are and must remain the facts.  A citation to Alice Corp. v. CLS Bank must be accurate.  A quote from a patent’s specification must be verbatim.  The interpretation of a claim term must be grounded in the intrinsic record.  There is nearly zero room for creativity here.  If you misquote a prior art reference, you are not being expressive – you’re being sanctioned.[2]

On the other hand, we have persuasion.  Once the facts are established, the attorney’s job is to weave them into a narrative.  How do we characterize the technical problem solved by the invention?  How do we frame the invention to survive an obviousness challenge?  This part of the job can be done in an infinite number of ways.  It requires character, emphasis on supportive facts and law, and a bit of rhetorical soul.

The Hallucination Crisis

By now, most of us have heard of the attorneys who submitted briefs containing fictional case law.  They did not do this out of malice.  They did it because they failed to understand that an LLM is, at its core, a probability engine.  It predicts the next most likely word in a sentence based on a statistical distribution learned from its training data.[3]

In the legal world, a citation filled out non-deterministically can easily be a fake citation.  We have seen judges across the country issuing standing orders requiring attorneys to disclose the use of AI.[4]  With LLMs, hallucination is not a bug.  It is a fundamental part of how these models work.  They are designed to make guesses.  That can be great for writing a poem about a toaster.  But it can be an expensive lesson for an attorney who thinks that cite-checking has suddenly become archaic.
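
To make the point concrete, here is a toy sketch (in Python, using only the standard library) of what “predicting the next most likely word” looks like.  The distributions below are invented for illustration and bear no resemblance to how any production model is actually built, but the core problem is visible: nothing in the sampling step ever checks whether the resulting citation exists.

    import random

    # Invented toy distributions over possible next tokens in a case citation.
    # A real LLM learns billions of parameters that play this role; none of
    # them encode whether the finished citation refers to a real case.
    next_token_probs = {
        "party": {"Jones,": 0.4, "Acme Corp.,": 0.35, "United States,": 0.25},
        "reporter": {"581 U.S.": 0.5, "139 S. Ct.": 0.3, "2019 WL": 0.2},
    }

    def sample(dist):
        """Pick a token according to its probability, not its truth."""
        tokens, weights = zip(*dist.items())
        return random.choices(tokens, weights=weights, k=1)[0]

    party = sample(next_token_probs["party"])
    reporter = sample(next_token_probs["reporter"])
    print(f"Smith v. {party} {reporter} 123 (2019)")
    # Looks like a citation; no case database was ever consulted.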

Averaging of Arguments

Let’s say we eventually solve the hallucination problem.  We still face a secondary, more subtle hurdle: regression to the mean.  LLMs are trained on a massive corpus of text.  Their output can feel like an average of everything in their training data.  This results in writing that is safe, low-risk, and utterly generic – vanilla prose.  By now LLM users are familiar with (and annoyed by) overused buzzwords, predictable summary sentences at the end of paragraphs, and a sycophantic tone.

In legal writing, the best briefs are written with character, flair, and a certain vibrancy.  For example, overcoming a rejection of a patent application at the USPTO may require emphasizing a specific nuance of the prior art that the Examiner is ignoring or downplaying.  A good writer uses sharp, biting logic to point out inconsistencies in the Examiner’s reasoning.  Current AI cannot perform high-level advocacy.  It does not know how or when to pound the table – it only knows how to politely describe what a table is.

The Next Generation

So, where does this leave us?  The next generation of legal AI does not need to be smarter; it needs to be more specialized.  Such a tool should be able to wear two hats.

When asked for the holding of a case, the LLM should not provide a creative summary.  Instead, it should act as a retrieval engine not unlike a traditional legal search tool, pulling the exact text from a verified database.  This mode must be rigid, factual, and incapable of imagining facts.

Once the facts are in hand, the LLM should switch to helping the attorney brainstorm ways to present them.  This mode should be able to adopt a specific voice, perhaps that of the attorney, based on their previous writing.  It should be able to take the factual data and turn it into a persuasive argument with eloquence.
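
For the curious, the two hats can be boiled down to a very small sketch.  The names and the one-entry “database” below are hypothetical placeholders, and the drafting step is a stub rather than a real model call, but the division of labor is the point: the fact mode may only return verified text or admit a gap, while the persuasion mode works exclusively from what the fact mode hands it.

    # Hypothetical sketch of the "two hats" division of labor.
    VERIFIED_HOLDINGS = {
        # Placeholder; a real system would store checked, verbatim text.
        "Alice Corp. v. CLS Bank": "<verbatim holding text from a verified source>",
    }

    def fact_mode(case_name):
        """Retrieval only: return verified text or admit the gap, never guess."""
        return VERIFIED_HOLDINGS.get(case_name, f"No verified entry for {case_name!r}.")

    def drafting_mode(facts, posture):
        """Persuasion: a generative model would rephrase the verified facts in
        the attorney's voice for the given posture; here it is just a stub."""
        return f"[Draft argument for {posture}, built only on: {facts}]"

    facts = fact_mode("Alice Corp. v. CLS Bank")
    print(drafting_mode(facts, "a response to a Section 101 rejection"))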

Conclusion

GenAI is currently in its awkward teenage years.  It is incredibly capable but lacks the judgment to know when to be serious and when to be creative.  For attorneys, the car wash example serves as a reminder that LLMs do not know what they are doing; they only know what they are saying.

Until we have systems that can distinguish between cold, hard facts and the warm, persuasive art of advocacy, having a human in the loop is not just a suggestion – it is a necessity.  Today, at least, we are not being replaced by AI.  Instead, we are tasked with supervising a very talented, very confused digital associate.


[1] A few days ago, a major state-of-the-art model told me to “Walk there and drive back.”  As of the time of this writing, the model no longer produces this result, though the exact language of your prompt may matter.

[2] https://www.mbhb.com/intelligence/snippets/ai-hallucination-in-legal-cases-remain-a-problem/.

[3] One emerging solution to this problem is retrieval-augmented generation (RAG).  Unlike a pure LLM, which relies solely on its internal training weights to generate text, a RAG-enabled system queries a curated, external database of verified legal authority (e.g., Westlaw, LexisNexis, or the USPTO’s own records).  It then provides that specific, ground-truth text to the LLM as a reference.  By constraining the model’s generation to the provided context, RAG significantly reduces the risk of hallucination.  However, practitioners should note that RAG is not a panacea.  While it improves factual retrieval, the LLM’s subsequent interpretation or synthesis of those retrieved facts remains problematic.
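
For readers who want to see the shape of the RAG pattern described in footnote 3, the following is a bare-bones sketch.  The corpus, the keyword-matching “retriever,” and the helper names are all hypothetical placeholders; a real system would query a verified legal database and pass the prompt to a hosted model rather than printing it.

    # Hypothetical, minimal sketch of the retrieval-augmented generation pattern.
    CORPUS = {
        "alice-2014": "<checked excerpt of Alice Corp. v. CLS Bank>",
        "ksr-2007": "<checked excerpt of KSR Int'l Co. v. Teleflex Inc.>",
    }

    def retrieve(query, corpus):
        """Crude keyword match standing in for a real search index."""
        words = [w.lower() for w in query.split()]
        return [text for key, text in corpus.items() if any(w in key for w in words)]

    def build_prompt(question, passages):
        """Constrain the model to the retrieved, verified text."""
        context = "\n".join(passages) or "No supporting authority found."
        return ("Answer using only the passages below; if they do not answer "
                f"the question, say so.\n\nPassages:\n{context}\n\nQuestion: {question}")

    print(build_prompt("What did Alice hold?", retrieve("Alice", CORPUS)))
    # The retrieved text, not the model's imagination, supplies the facts.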

[4] https://patentdocs.org/2023/08/13/judges-issue-standing-orders-regarding-the-use-of-artificial-intelligence/.
