Beyond Words: The Risks of Generative Interpretation
- Jonathan Scher*
- AI, Artificial Intelligence, Large Language Models, Technology
- Postscript

Judges are beginning to use large language models (LLMs) like ChatGPT to interpret legal texts. This Note examines whether they should. Prior studies testing LLMs as legal interpreters have used survey responses as performance benchmarks; I offer the first study comparing LLM interpretations to real-world judicial decisions. Across eight Ninth Circuit cases, I test whether GPT-4 Turbo (a model available through ChatGPT) correctly identifies legal text as ambiguous or unambiguous. ChatGPT’s assessments diverged from the court’s determinations 50% of the time. I then advance a novel argument: judicial reliance on LLMs may constitute improper ex parte communication under current judicial ethics rules.