Vol. 97, No. 5 Archives – Southern California Law Review

The Failed Promise of Treasuries in Financial Regulation

September 2024

Pradeep K. Yadav* & Yesha Yadav†

U.S. government Treasury bonds (“Treasuries”) anchor financial stability. Public regulation mandates that financial firms maintain deep buffers of Treasuries that can be sold for cash in a crisis. In private lending between financial firms—running into trillions of dollars daily—Treasuries are the preferred form of collateral, designed to make debt fully resistant to default.

But this unquestioned reliance on Treasuries in public and private self-regulation has created a financial system that rests on fragile foundations. The first fundamental problem—thus far unnoticed in existing literature—lies in the system-wide tension that is present when both public and private self-regulation depend on the same scarce Treasuries/cash for survival.

This tension plays out in the common system of intermediation that supports both public and private self-regulation. Crucially, financial regulation places its trust in the competencies of twenty-four large financial firms—primary dealers—that uphold both the buying and selling of Treasuries as well as the supply of Treasuries to lending markets for use as collateral. This system of intermediation, however, is far from perfect. As we show, primary dealers confront incurable information gaps when allocating cash and Treasuries between private lending and public trading markets. Further, facing scarcity, primary dealers must choose whether to devote resources to one space over the other. Finally, as for-profit actors, primary dealers have no reason to continue intermediating if the cost-benefit trade-off turns sour. As it stands, for financial regulation to remain resilient, its mechanisms for intermediating Treasuries must also be lucrative.

The second problem lies in the fragmented system of supervision that governs an interconnected public trading and private lending market for Treasuries. Multiple regulators are in charge, but they lack coordination mechanisms, complementary regulatory approaches, and institutional mandates to facilitate cooperation. It follows that regulators have failed to spot shared risks to Treasuries intermediation and to develop mechanisms to correct them.

This Article sets out a three-part solution to better realize the promise of Treasuries for financial regulation: (1) enhancing transparency across trading and lending markets, (2) developing consolidated oversight, and (3) mandating that primary dealers maintain intermediation during crises. With Treasuries anchoring public regulation and trillions in private contracting, their fragility represents a danger that policymakers can ill afford to ignore.

INTRODUCTION

When COVID-19 shocked the financial system in March 2020, the (then) $17 trillion market for Treasuries became one of its most unexpected casualties.¹ As equity and corporate bond markets reeled, investors rushed to sell Treasuries and raise cash to remain solvent.² Their reaction was exactly as expected. Viewed as failure-proof, Treasuries provide the world with its most dependable safe haven. When other markets run into distress, Treasuries are supposed to buffer the fall by ensuring a constant supply of default-free assets and cash for those that sell them.³ Recognizing this fortress-like quality, public regulation and private industry rely systematically on Treasuries as the shield to protect financial markets against panic, collapse, and uncertainty.⁴

Public financial regulation mandates that Treasuries constitute a sizable part of the rainy day safety buffers of any number of regulated financial firms.⁵ The assumption here is that Treasuries can, by dint of quick sales, release cash in a crisis, allowing a firm to pay off its creditors and, in turn, prevent creditors from also going bust themselves.⁶ Using similar logic, the private market for lending between financial firms—running at trillions of dollars daily—also depends on Treasuries as the preferred form of collateral.⁷ By securing debt using Treasuries, lenders can be sure that they will be repaid, either by the borrower as promised, or by selling the Treasuries collateral.⁸ This unquestioned confidence in Treasuries as collateral means that parties do not even need to conduct due diligence on one another, so long as sufficient Treasuries can secure the debt.⁹ Indeed, it is taken for granted that the price of Treasuries will not fall when that of assets, like corporate bonds or equities, crashes. In other words, investors rush to safety during crises by putting capital into Treasuries and maintaining (or increasing) their price.¹⁰

March 2020, however, upended these assumptions. Rather than Treasuries providing reliable trading (or liquidity)—allowing sellers to cash out without distorting prices—investors found themselves unable to transact on reasonable terms.¹¹ Execution costs increased by 50%–500%, and market depth—or the quantity of offers (quotes) available to trade—plunged to 10%–38% of earlier values.¹² Testifying before the Senate Banking Committee in February 2021, the Chair of the Federal Reserve (“the Fed”), Jerome Powell, remarked that the Treasury market did not have “the capacity to handle” the pressure.¹³ Treasuries’ prices became chaotic and fell out of sync with those in related markets.¹⁴ As detailed by Annette Vissing-Jorgensen, this price instability had nothing to do with changes to the country’s economic fundamentals (for example, inflation).¹⁵ Instead, its cause was the rapid deterioration of trading conditions in the Treasury market with large investors rushing in to sell.¹⁶ As such, with equity markets plunging almost 3,000 points daily, the price of Treasuries also dropped precipitously, instead of increasing or staying stable as should have been the case for the world’s premier safe haven.¹⁷

Worryingly, the disruptions in March 2020 were not a one-off event. Rather, as shown by Matthias Fleckenstein and Francis A. Longstaff, market confidence in the capacity of Treasuries to steadfastly provide a safe haven has diminished significantly in recent years. Fleckenstein and Longstaff observe that Treasuries have traded much more cheaply relative to their fair value at key moments in modern financial history, with sizable price discounting observed during the 1997 Asian Financial Crisis, the 2000s, and frequently between 2015 and 2020.¹⁸ Taken together, these repeated performance failures call into question the core assumption made by public and private financial regulation in relying so fundamentally on Treasuries as safe assets: that their default-free nature means that Treasuries are also always perfectly tradable at fair prices.¹⁹ We close this gap in the literature to show that this assumption is simply wrong. Rather, while Treasuries can be regarded as risk-free, the market that trades them is not, diminishing their capacity to act as an anchor for public as well as private industry self-regulation. In this Article, we make two claims to detail: (1) the fragile system of intermediation that underpins Treasuries’ distribution, and (2) the deeply flawed model of market supervision that is ill-matched to contend with the risks created by faulty intermediation.

In our first contribution, we show that there is a fundamental, internal tension within a system in which both public and private financial regulation rely on scarce Treasuries to support economic survival. This tension and interconnection crystallize in a shared system of intermediation that must, at once, manage the buying and selling of Treasuries with the public as well as ensure the constant supply of Treasuries collateral to the private lending market.

For a start, this system of intermediation is remarkably fragile. Opacity, conflict, and complexity are pervasive. Crucially, regulation places trust in the capacity of (currently) twenty-four large banks and investment firms—known as primary dealers—to intermediate Treasuries. Primary dealers are uniquely authorized to purchase Treasuries from the government at auction and then to distribute them widely.²⁰ This role puts primary dealers center stage in the secondary market for buying and selling Treasuries with investors, in which they sell to those that want to buy and buy from those that want to sell. In this way, primary dealers help operationalize the assumption made in public financial regulation that Treasuries can always be liquidated by those needing cash or bought by firms wanting a reliable safe asset—all at fair and stable prices.

Primary dealers also act as critical intermediaries for the approximately five trillion dollars in exposure in the private market for short-term lending²¹—known as the repurchase or repo market—in which Treasuries constitute the preferred form of collateral.²² The repo market allows financial firms with cash to lend it to others that need it.²³ To eliminate default risk, lending is short-term and secured (mostly using Treasuries).²⁴ By ensuring firms can borrow cash whenever they need, the repo market provides a lifeline to financial firms to address everyday funding demands.²⁵ Within the repo market, primary dealers match borrowers with lenders.²⁶ They also act as lenders by using their own cash to serve those looking to borrow.²⁷ Finally, dealers borrow for themselves in the repo market as a way of funding their firm’s everyday operations.²⁸ In intermediating the supply of Treasuries to the repo market, primary dealers help insulate financial firms against default and systemic fallout.

Primary dealers confront steep and pervasive costs when intermediating across both the secondary market for Treasuries as well as the repo market. Information gaps are endemic. This opacity is structurally unavoidable in the repo market. Because Treasuries represent the preferred form of collateral and lending is short-term, due diligence is deemed unnecessary.²⁹ By design, primary dealers lack the tools and incentives to carefully monitor the default risk posed by parties with whom they contract.³⁰ They are also unable to fully gauge, on a market-wide basis, how this risk is building—for example, whether certain counterparties might be growing more indebted, less likely to repay, and whether to continue to lend to them, and on what terms.³¹

To be sure, using Treasuries as collateral should mean that primary dealers and the financial market have nothing to fear from default. But this view glosses over the damaging effect of opacity on intermediation. A system-wide absence of real-time information means that primary dealers are justified in being overly cautious when the prospect of default does arise and in quickly cutting off credit to counterparties across the board on account of not knowing exactly where the problem lies and how widespread it may be. Dealers might demand more Treasuries collateral to match unknown but higher levels of risk—even from borrowers that appear to be safe. In the absence of detailed information, withdrawing intermediation is rational, even advisable, to ensure that primary dealers do not keep lending to any number of defunct firms. After all, there is no rule forcing primary dealers to keep trading.³² From the standpoint of the market and its regulation, however, this kind of preventative action is harmful, chaotic, and liable to amplify distress. Financial firms can end up suddenly unable to meet their daily funding needs, or to roll over past debt, having to quickly find the cash to repay if a dealer calls in a repo loan or makes an existing one more expensive.³³

Opacity also raises doubts about whether Treasuries collateral is even capable of being enforced, that is, traced and sold by a primary dealer to recover the amount owed after default. Because the market lacks real-time reporting and due diligence, a borrower may not actually own the Treasuries collateral it offers up to secure a debt. Rather, collateral can belong to another party that has agreed to let the borrower use it for a time.³⁴ Collateral reuse is commonplace in Treasury-backed repo markets. Complex collateral chains, in which the same Treasury circulates to collateralize multiple loans, have become a feature.³⁵ For example, a Lender takes Treasuries from a Borrower as collateral. The Lender can then use these same Treasuries as collateral to borrow cash for itself. According to Manmohan Singh of the International Monetary Fund, each Treasury security collateralizes around three repo loans.³⁶ Reuse affords gains in efficiency. In good times, prized Treasuries can help release credit for numerous parties. But during crisis and with opacity endemic, doubts are raised about whether the collateral is traceable and capable of being sold.³⁷ Stated bluntly, even though a particular Treasury can be reused multiple times to release credit, it can be sold only once to cover a loss. Those believing they have a right to its proceeds may find that the Treasury no longer exists precisely when they need it the most. Opacity means that dealers and others cannot know in advance how complex their collateral chain will be, and whether their collateral is as protective as regulation readily assumes.³⁸
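The point that a reused Treasury can release credit several times but be sold only once can be made concrete with a stylized sketch. The party names, the $100 loan size, and the worst-case assumption that every loan in the chain needs the collateral at the same moment are hypothetical, chosen only for illustration; haircuts and partial repayment are ignored. The chain length of three follows the reuse estimate cited above.

```python
from dataclasses import dataclass


@dataclass
class RepoLoan:
    lender: str      # party that lends cash and receives the Treasury
    borrower: str    # party that pledges the Treasury and receives cash
    amount: float    # cash lent against the collateral (hypothetical figure)


def credit_supported(loans):
    """Total cash lent against the single, re-pledged Treasury."""
    return sum(loan.amount for loan in loans)


def uncovered_after_sale(loans, collateral_value):
    """Claims left unpaid if the Treasury is sold once and all loans are due."""
    return max(credit_supported(loans) - collateral_value, 0.0)


# Hypothetical chain: each lender re-pledges the same $100 Treasury
# to borrow cash for itself from the next party.
chain = [
    RepoLoan(lender="Lender 1", borrower="Borrower", amount=100.0),
    RepoLoan(lender="Lender 2", borrower="Lender 1", amount=100.0),
    RepoLoan(lender="Lender 3", borrower="Lender 2", amount=100.0),
]

total_credit = credit_supported(chain)            # 300.0
shortfall = uncovered_after_sale(chain, 100.0)    # 200.0
print(f"credit released: {total_credit}, uncovered after one sale: {shortfall}")
```

Under these simplified assumptions, one Treasury supports three times its value in credit, yet a single sale leaves two-thirds of the claims uncovered. Real chains are harder to assess than this sketch suggests because, as the text notes, participants cannot observe the chain in advance.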

Primary dealers also confront opacity in the secondary market for buying and selling Treasuries with investors.³⁹ Home to over $600 billion in average daily turnover in both 2020 and 2021, this market lacks real transparency.⁴⁰ Trades are not reported publicly in real time.⁴¹ Until 2017, the secondary market did not have a comprehensive trade reporting regime capable of delivering insights on a trade-by-trade level.⁴² The regime that is currently in place mandates reporting to regulators only (rather than wider dissemination). Until February 2023, trading statistics were published weekly and in aggregate, after which regulators permitted once-daily reporting to the public (also in aggregate terms). The reporting regime has also had major gaps historically (for example, it has not required hedge funds to report trades).⁴³ The limited availability of comprehensive, real-time disclosure adds to the monitoring costs faced by primary dealers, forcing them to buy and organize trading data privately. This can add delays and inaccuracies to data processing, making it harder to determine how risks are building in real time (for example, predicting large orders, predatory traders, or price dislocations).

This opacity feeds tension within a system of intermediation that must meet the needs of both public and private financial regulation at the same time. That is, actions taken by primary dealers to protect repo market operations for private firms can come at a cost to maintaining trading in the secondary market for the wider public.

Reliance on Treasuries as collateral in repo funding markets means that the availability of these securities for trading in secondary markets can become restricted. The repo market requires trillions of dollars in Treasuries (and cash) to be set apart daily to support private lending and borrowing.⁴⁴ The free-float of Treasuries—or the amount of Treasuries that are circulating freely at a given point in time—is thus reduced by what must be earmarked to support trillions in daily repo operations.⁴⁵ In a crisis, primary dealers must rapidly shore up Treasuries collateral in repo operations to protect financial stability and ensure that sufficient collateral exists to support trillions in exposure between private firms. In cases when the repo market gets securely ring-fenced, secondary markets can become strained as primary dealers have a smaller supply of assets with which to respond to investors wanting to buy and sell Treasuries in a panic.⁴⁶

Opacity contributes to the challenge primary dealers face in ensuring steady intermediation to both repo and secondary markets. A lack of full and real-time information means that primary dealers face constant difficulties in attempting to predict the needs of the repo market—like how much cash and Treasuries are needed on any given day. These demands are hard to predict in any event. As a market for funding the daily life of financial firms, pressure on the repo market can vary wildly depending on any number of factors like seasonality (for example, making payroll), time of day (for example, reduced demand during lunchtimes), and the nature of the firm’s business (for example, banks requiring large amounts of cash).⁴⁷ In March 2020, for example, weekly collateral needs varied by more than $350 billion for positions held by primary dealers.⁴⁸ While the secondary market for Treasuries tends to be more stable, crises can trigger an unexpected spike. For example, total aggregate weekly trading in the turbulent week of March 6, 2020, was around $5.7 trillion. By late July, however, activity volumes had normalized, and the secondary market saw around $3 trillion in weekly aggregate trading volume.⁴⁹

This tension between protecting repo markets and maintaining resilience in secondary trading creates the danger that intermediaries stop performing when the costs of doing so become too high. Regulation does not require dealers to remain trading.⁵⁰ If the cost-benefit trade-off of intermediation becomes overly expensive, intermediaries withdraw. Or, they choose to protect one market over the other, depending on profitability, important client relationships, and keeping a reputational halo.⁵¹ Stated differently, for public and private financial regulation to currently remain credible, Treasuries intermediation must be lucrative business for primary dealers.

When primary dealers face such a choice, they have powerful incentives to resolve the tension in favor of the repo market. The repo market is much larger than the secondary market. To take a more typical week in 2020, the week of July 29, 2020, the average daily trading in the secondary market by primary dealers was $518 billion and their average daily risk exposure was $271 billion.⁵² By comparison, in the repo market, primary dealers had lent out around $1.58 trillion and borrowed $1.81 trillion of Treasuries.⁵³ Taken together, their repo activity measured about six times their average daily secondary market trading volume of Treasuries, and about twelve times their daily average exposure in the Treasuries secondary market.⁵⁴ Given the repo market’s size and repeat client relationships, dealers can make profitable gains by focusing resources in the repo market ahead of the secondary market.⁵⁵ But this private preference comes with a collective price, in which the secondary market can become disrupted and fails to function as a safe haven for investors at large. Taken as a whole, serious pressures on dealer balance sheets can damage Treasury market function. As shown by Darrell Duffie et al., the quality of U.S. Treasury market operations deteriorates markedly when dealers are forced to delve deep into their balance sheet to intermediate trading.⁵⁶
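The size comparison can be checked with simple arithmetic from the figures quoted above. The sketch below assumes that "repo activity" means the sum of Treasuries lent and borrowed; that reading is an interpretation for illustration, not a definition taken from the underlying data source, and the variable names are hypothetical.

```python
# Figures quoted in the text for the week of July 29, 2020 (U.S. dollars).
repo_lent = 1.58e12          # Treasuries lent out in repo by primary dealers
repo_borrowed = 1.81e12      # Treasuries borrowed in repo by primary dealers
daily_trading = 518e9        # average daily secondary market trading
daily_exposure = 271e9       # average daily secondary market risk exposure

# Assumption: combined repo activity is lending plus borrowing.
repo_activity = repo_lent + repo_borrowed

ratio_to_trading = repo_activity / daily_trading
ratio_to_exposure = repo_activity / daily_exposure

print(f"repo activity: ${repo_activity / 1e12:.2f} trillion")
print(f"vs. daily trading: {ratio_to_trading:.1f}x")
print(f"vs. daily exposure: {ratio_to_exposure:.1f}x")
```

On this reading, combined repo activity of roughly $3.39 trillion works out to about 6.5 times daily trading volume and about 12.5 times daily exposure, consistent with the "about six" and "about twelve" multiples stated in the text.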

In our second contribution, we show that regulators are poorly placed to recognize the tension between a system of public and private financial regulation that is so deeply reliant on Treasuries to function.

That regulators have failed to account for the structural interlinkages between repo and secondary markets is not surprising. From the institutional standpoint, secondary trading and repo markets are subject to a patchwork system of fragmented oversight, governed by a rule book that has failed to adapt to changing market design.⁵⁷ The market does not have a lead regulator; oversight of the secondary market is shared by five or more agencies.⁵⁸ The repo market, by contrast, looks largely to the Federal Reserve (“the Fed”) and the Federal Reserve Bank of New York (“NY Fed”) for supervision.⁵⁹ This confusing division of authority breeds gaps and blind spots. Owing to fragmentation and an absence of coordination, regulators lack a coherent picture of the risks that run between repo and secondary markets. Rulemaking is costly given the need to overcome bureaucratic walls and divergences in institutional mandates and approaches between different regulators.⁶⁰

Importantly, each market is regulated in accordance with distinctive methodological approaches. The secondary market for Treasuries trading (broadly speaking) hews to a more capital markets–based approach that focuses on generating smooth trading, price efficiency, and trade reporting to regulators.⁶¹ By contrast, repo markets fall under a more “prudential” framing that protects the systemic soundness of firms and the market. Disclosure and pricing carry far less emphasis than ensuring that firms avoid default and do not sicken one another if one of them collapses.⁶² A prudential model prioritizes collateralization and deep capital buffers, and can come with the (implied) promise of federal protection in case firm failure sets off systemic contagion.⁶³ These differences in regulatory approach complicate rulemaking, monitoring, and coordination challenges already pervasive to the task of overseeing Treasury repo and secondary markets. Regulators cannot fill information gaps because Treasuries collateralization reduces the need to gather and disclose data in real time. Data gathering in the secondary market also remains patchy. Without full information, policymakers cannot know what tools might work best to prevent sudden loss of liquidity and price distortions. Because ex post interventions to stabilize the market are available, regulators may prefer to rely on them rather than to engage in complex, ex ante, administratively costly rulemaking. When the Treasury market failed in March 2020, the Fed stepped in immediately, making around $1.5 trillion in cash and Treasuries available to primary dealers in a bid to revive intermediation.⁶⁴

A final observation on the economic significance of the Treasury market. From the standpoint of political economy, weakness in Treasury market structure is profoundly problematic for the status of U.S. debt as the global risk-free asset that is a lynchpin for financial stability. As Anna Gelpern and Erik Gerding write, the notion of a risk-free asset is one that is legally constructed rather than being intrinsically real.⁶⁵ Default arises as a matter of contractual design.⁶⁶ It is conventionally believed that the United States will pay its debts. However, in theory, it may default.⁶⁷ The most tangible manifestation of Treasuries, their power and prestige, comes from the workings of the market—by investors buying and selling Treasuries, or by using Treasuries as collateral to release economic value. Public oversight and private industry self-regulation reinforce this real-world compact. This collective practice makes failures in Treasury market structure particularly dangerous for the long-term dominance of the United States. With the Treasury’s risk-free status ultimately ephemeral, a disrupted market undermines the most fundamental article of faith about the power of the U.S. economy and its financial system.⁶⁸

In conclusion, to repair the broken promise of Treasuries in financial regulation, this Article proposes a three-part solution for reform. As a starting point, it advocates for systematically greater transparency and reporting, particularly in more opaque repo markets. A richer understanding of how this market works can help regulators and dealers manage their risks, address conflicts, and unravel complexities between the secondary and repo markets. Secondly, the Article seeks to require dealers to maintain intermediation, rather than exit the market at will. Even with information, dealers can still stop intermediating both repo and secondary trading whenever this task becomes too difficult or expensive. As noted earlier, there are no rules keeping key dealers in the market in crisis periods, and so such intermediation can disappear at any moment. To counter this risk, we propose that regulators expressly require key dealers to affirmatively maintain trading and price stability in Treasuries, even in crisis.⁶⁹ We consider this to be necessary in light of the fundamental reliance that financial regulation places on the steadfastness of the intermediation system for Treasuries. Importantly, such a mandate is familiar. For example, trading on the New York Stock Exchange (“NYSE”) was long maintained by dealers that contracted to support trading and price stability during crises.⁷⁰ In addition to being well-worn and familiar, this mandate offers realistic assurance that the Treasury market will always provide liquidity, and, in particular, do so when such liquidity is most needed and when the chances of market failure are greatest.

Finally, we support thoroughgoing reform of the regulatory structure for the repo and secondary market to harness the potential for coordination offered by the Financial Stability Oversight Council (“FSOC”).⁷¹ Created in the wake of the 2008 Financial Crisis, FSOC is, we believe, well placed to coordinate a more streamlined approach to rulemaking and supervision and to holistically view the repo and secondary market for Treasuries as interconnected.⁷²

This Article proceeds as follows. Part I provides an overview of why Treasuries are risk-free and the reliance placed on their risk-free status in public regulation. It details the centrality of primary dealers to intermediation and market function. Part II analyzes the workings of the multitrillion-dollar repo market and the anchoring role of Treasuries in private contracting. Part III develops a novel account of the interconnected risks of intermediation in both repo and secondary markets to show that it is undermined by opacity, conflict, and complexity. Part IV sets out a solution to remedy fragility in Treasury market design. Part V concludes.

I. TREASURIES AND THE FINANCIAL SYSTEM

Beyond funding the affairs of state, the Treasury market represents a foundational pillar of global financial stability. A Treasury bond is perceived to be a default-free security that is capable of being traded easily at fair prices, offering investors an asset that can serve as a safe, cash-like store of value.⁷³ These attributes ensure that Treasuries occupy a central place in regulation as an asset capable of being used by financial institutions to protect themselves and the market from sudden insolvency and systemic collapse. In public regulation, financial firms must keep Treasuries as part of their rainy day reserves. By owning Treasuries, firms hold a security that has predictable cash flows and is assumed to be rapidly tradable to generate cash when they are in trouble. Similarly, Treasuries are critical to private self-regulation in anchoring everyday lending between financial firms. They constitute the preferred form of collateral in the five-trillion-dollar repurchase market, allowing firms to borrow from and lend to one another safely without undertaking prior due diligence.⁷⁴

This Part explains why Treasuries have acquired this stature as the foremost safe asset and haven for global financial stability.⁷⁵ It highlights that the usefulness of Treasuries rests on two main attributes: (1) they are, for all intents and purposes, default-free, meaning that the United States will pay its debts, and (2) they are supposed to be highly tradable (or liquid), capable of being bought and sold in their secondary market with ease, at a fair price, and without trades causing prices to become distorted.⁷⁶ These two attributes, while linked, are distinct from one another. This Part describes the key features of this secondary market and its regulation. Like any other active market, Treasuries trade in an environment that is operationally complex and risky. These risks are amplified by a unique framework of public oversight that is fragmented and lacking in leadership, making rulemaking and supervision subject to coordination costs and delays.⁷⁷

A. Why Treasuries Are Risk-Free

The Treasury market is critical to economic life in the United States and the health of financial markets globally.⁷⁸ Public debt has proven to be a transformative force for the country.⁷⁹ It has allowed Congress to implement major policy initiatives such as public works projects, wars, and efforts to counteract economic misfortunes.⁸⁰ The Treasury market provides policymakers with power to pursue far-reaching goals.⁸¹ Policies do not have to be constrained by present-day taxpayer contributions. Rather, the Treasury can tap into global capital markets to raise money.⁸² Crucially, the economic and political heft of the United States enables investors to have confidence that whatever they lend to the Treasury will be repaid exactly as promised.⁸³ This credibility means that the United States can borrow to fund itself much more cheaply than other countries with weaker economies and political institutions.⁸⁴

Holding a default-free asset can be uniquely advantageous. Investors can be sure that the money they lend to the U.S. government is safe. Importantly, Treasuries provide a counterpoint to a portfolio containing a mix of investments with riskier options like corporate debt or equity. Whereas other assets involve varying cash flows, uncertainties in valuation, periods where they become hard to sell, or lose their value (in case of bankruptcy), Treasuries are not supposed to face any such danger.⁸⁵ Instead, investors believe Treasuries will perform in accordance with their terms, retain value, price, and currency stability. It follows that Treasuries have long been viewed as the safe haven for domestic as well as foreign investors, including other sovereigns looking to invest their reserves.⁸⁶

The key attributes of a Treasury bond—default-free, denominated in the U.S. dollar, designed to be paid out in specific maturities and simple to value—are bolstered by its tradability (for example, its ability to be bought and sold quickly and cheaply without significantly impacting prices).⁸⁷ Official government reports into the Treasury market commonly begin by observing that it constitutes the “deepest and most liquid government securities market in the world.”⁸⁸

This liquidity represents a hallmark without which Treasuries could not attract investors as easily.⁸⁹ Those holding Treasuries would not be able to turn them into cash, while those wishing to add Treasuries to their portfolios would struggle to purchase them. If investors lack liquidity, they will charge the U.S. government more to reflect the cost of keeping a less tradable investment on their books.⁹⁰

B. Making Treasuries Tradable

Through much of the market’s history, regulators have relied on a cohort of international banks and investment banks—designated as primary dealers—to support intermediation in Treasury markets.⁹¹ After the initial issue by the U.S. Treasury (in what is called the “primary” market), the secondary market for day-to-day trading of Treasury securities is divided into two parts: (1) the segment where primary dealers and other dealers transact externally with customers (like foreign governments, or mutual funds) to buy and sell Treasuries, and (2) the interdealer segment where dealers internally transact with one another in order to even out or otherwise manage the inventories they hold of Treasuries. If some dealers want Treasuries but do not have them, while others wish to sell, the interdealer trading market offers a space for Treasury dealers to be able to transact with one another.⁹² Historically, primary dealers have played a critical role in each of these three parts of the market: the primary auction market and the two segments of the secondary market.⁹³

Treasuries at Auction: Primary dealers are expected to use their deep pockets to bid for new Treasury securities when they are issued at government auctions. These bids must be at competitive prices, meaning that primary dealers cannot comply with their obligations by simply making unrealistic and unrealizable bids.⁹⁴ The government places great trust in primary dealers by relying on these firms to act as counterparties at every issue of public debt. In their 2007 study, Michael Fleming et al. show that primary dealers purchased an average of 71% of new issues.⁹⁵

Given regular and heavy demands on their balance sheets, primary dealers are chosen from among those that can demonstrate the financial capacity to purchase, hold, and trade these securities. Though the NY Fed slightly relaxed eligibility conditions in 2016, the firms that perform primary dealer functions come from the ranks of well-regulated financial institutions—mostly banks.⁹⁶ As part of their agreement, primary dealers provide the NY Fed with weekly reports into the activities of the Treasury market.⁹⁷

Secondary Market – Dealer-Client: The secondary market comprises two major segments. In the dealer-client market, investors like foreign governments or mutual funds come to buy or sell Treasuries. This segment is critical for ensuring that Treasuries are widely available and capable of performing their stabilizing, protective function. Trading volume for a recent single day—July 31, 2024—in this dealer-client segment was about $735 billion.⁹⁸

Primary dealers have a major advantage in the dealer-client market. As large and internationally active financial firms, they possess an ample base of clients with which to transact. This network provides the means by which to operationalize the protective function of Treasuries by ensuring they are widely distributable.⁹⁹ Importantly, by dint of their participation in Treasury auctions, primary dealers have sizable inventories of securities that they can transmit.¹⁰⁰ Indeed, when seeking to bid for Treasuries at auction, primary dealers usually collect indications of interest from major clients beforehand, ensuring that they are able to capture and meet demand more precisely.¹⁰¹

The structure of the dealer-client market is based on an over-the-counter design. Clients contact primary dealers (and other dealers) using telephones or computer screens to ask for bids on what they are willing to buy and sell and at what price.¹⁰² It rewards those most capable of developing repeat relationships with leading investors like foreign governments, insurance, or pension funds. By their proximity to Treasury auctions, the ability to anticipate client appetites, and experience, primary dealers generally represent an efficient and informed disseminator of Treasuries across the world.¹⁰³

Secondary Market – Interdealer: The interdealer market represents the other main segment of the secondary market.¹⁰⁴ It helps Treasuries dealers to even out their reserves by selling what they do not need to other dealers or dipping into this market to buy when they face a shortfall. On July 31, 2024, the day chosen above to illustrate recent dealer-client trading, the volume of trading in the interdealer segment was similar in magnitude to that of dealer-client trading, at around $729 billion.¹⁰⁵

The interdealer market fulfills two key functions. One, it helps reduce the risk that primary dealers and others face gluts or scarcity in their inventory of Treasuries. It provides a mechanism whereby the availability of Treasuries can be modulated between dealers to meet their external client demands. Primary dealers can sometimes face sudden, one-sided demand. With COVID-19 triggering panic in March 2020, investors sought en masse to sell Treasuries in order to get their hands on cash.¹⁰⁶ According to Vissing-Jorgensen, sales by foreign investors, mutual funds, and the household sector (including hedge funds) came to around $287 billion, $266 billion, and $194 billion, respectively, in the first quarter of 2020 alone.¹⁰⁷

A market to regulate supply and demand ensures that the dealer-client market is not disrupted because dealers are unable to obtain Treasuries or cash. For example, confronting heavy demand to buy from a sovereign, dealers can go into the interdealer market to supplement thinning reserves. By doing so, they fulfill client demand. The ability to sell excess inventory to other dealers means that dealers face fewer costs in keeping securities on their balance sheets.

Two, the ability to meet client demand smoothly means that the market becomes less vulnerable to sudden spikes or plunges in Treasury prices. When dealers can transact with one another to manage their supply of cash and Treasuries, they can lower costs to clients. The effects of scarcity or oversupply can be managed, helping markets remain more reliable and affordable for investors.¹⁰⁸

It is worth briefly noting that, from the mid-2000s, the interdealer market has undergone a structural shift to transition from an analog space to one that is now largely automated.¹⁰⁹ Its plumbing has transformed from reliance on telephones to the use of fast, artificially intelligent algorithms to drive trading.¹¹⁰ High-frequency trading (“HFT”)—in which Treasuries are being bought and sold in milliseconds or less—dominates, driving around 50–75% of trading volume.¹¹¹ One study showed that the median time between trades in ten-year Treasury notes in 2015 was around ten milliseconds.¹¹² A decade earlier in 2006, trading speed for this note stood at around one hundred times slower, with transactions occurring around one second apart.¹¹³ To be sure, this trend is a secular one. Electronic, automated trading has become the norm in equities and derivatives markets, reflecting growth in computing power, data processing, and artificial intelligence since the mid-2000s.¹¹⁴ Its extensive embrace within the historically staid Treasury market has nevertheless come as something of a surprise.¹¹⁵

High-speed trading has given rise to sources of instability.¹¹⁶ An automated interdealer market has introduced new types of traders with a different profile to primary dealers.¹¹⁷ HFT firms tend to be smaller securities firms that specialize in computerized trading across multiple types of assets, such as equities. Leading HFT firms are not household names, despite driving heavy volumes of trading. Firms like KCG, Spirex, XR Trading, or Jump Trading—while not as well-known as J.P. Morgan or Wells Fargo—have risen to become major suppliers of liquidity in the interdealer market.¹¹⁸

C. Centrality of Treasuries in Public Regulation

Owing to their default-free status and perceived liquidity, Treasuries have become the go-to protective asset in public financial regulation.¹¹⁹ Following the 2008 Financial Crisis, with top financial firms collapsing or needing bailouts, the rapid loss of solvency raised worries about how best to avoid a repeat episode.¹²⁰ The causes of the 2008 Financial Crisis are complex.¹²¹ Reform has sought to address numerous vulnerabilities.¹²² But to bluntly mitigate the catastrophic effects of firms becoming unable to pay out on short-term debts and triggering a domino of failures, buffers of rainy-day and high-quality liquid assets (“HQLA”) now represent a mainstay of financial regulatory design.¹²³

To illustrate, banks are required to maintain a cushion of highly liquid assets that can help them to cover their short-term outflows over a stressed thirty-day period. Importantly, firms should be able to convert these assets into cash within just one day, without these assets losing value. In other words, assets must be highly liquid and their fire sale in stressed circumstances ought not to cause their prices to become distorted. Overall, banks need to keep more than 100% of their expected thirty-day liquidity needs in the form of HQLA.¹²⁴ This liquidity-coverage ratio (“LCR”) represents a post-2008 reform hallmark, designed to ensure that firms do not cause contagious failures by failing to make good on their immediate commitments.¹²⁵ By preserving sufficient reserves of cash and cash-like assets, firms can feel safe in their ability to pay, while others are reassured that they will be repaid. The comfort of reliable liquidity buffers can work to limit the chances of a destructive run, when firms might try and seize cash and other assets in the worry that their counterparties cannot pay.¹²⁶
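The ratio described above can be sketched in a few lines. This is a deliberately simplified illustration: the function names and the bank figures are hypothetical, and the sketch ignores the weightings and caps that the actual rules apply to different asset classes and cash flows. It also assumes that a ratio of exactly 100% satisfies the requirement.

```python
def liquidity_coverage_ratio(hqla: float, net_outflows_30d: float) -> float:
    """Ratio of HQLA to projected net cash outflows over a stressed thirty-day window."""
    if net_outflows_30d <= 0:
        raise ValueError("net_outflows_30d must be positive")
    return hqla / net_outflows_30d


def hqla_shortfall(hqla: float, net_outflows_30d: float,
                   minimum_ratio: float = 1.0) -> float:
    """Extra HQLA needed to reach the minimum ratio (0.0 if already met).

    minimum_ratio=1.0 is an assumption standing in for the 100% threshold.
    """
    return max(0.0, minimum_ratio * net_outflows_30d - hqla)


# Hypothetical banks; figures are illustrative, in billions of dollars.
ratio = liquidity_coverage_ratio(hqla=120.0, net_outflows_30d=100.0)
gap = hqla_shortfall(hqla=90.0, net_outflows_30d=100.0)
print(f"First bank LCR: {ratio:.0%}")
print(f"Second bank HQLA shortfall: {gap:.1f}")
```

A bank below the threshold has two levers: acquire more HQLA (such as Treasuries) or shrink the liabilities that count toward thirty-day outflows.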

Treasuries rank in the highest tier of liquid assets for firms seeking to build their buffers of HQLA alongside pure cash and deposits with the Fed.¹²⁷ While cash and Treasuries are not exactly equivalent (for example, one has to sell a Treasury to generate cash), regulation assumes that they fall within the bandwidth of the same ultrasafe, ultra-stable, and ultra-liquid asset-type that can meet the need for immediate redemptions.¹²⁸ Further, Treasuries are supposed to be bought and sold quickly without causing serious price distortions. Regulation encourages banks to hold unencumbered Treasuries and does not impose any discounting on the value of the Treasuries held.¹²⁹

It is widely accepted that the LCR has had a dramatic impact on how banks fund themselves and on the kinds of business that they undertake. By being mandated to keep this thick buffer of HQLA to cover outflows over a stressed thirty-day period, banks confront a constant demand to maintain (and have access to) a regular supply of Treasuries and cash. In consequence, they have dramatically increased their Treasuries holdings to reflect efforts at compliance.¹³⁰ On the other side, firms have also sought to adapt their business lines to reduce or adjust the size of the liabilities to be covered within the thirty-day window. Services like offering large deposit holdings to clients have incurred a cost in the form of LCR holdings.¹³¹ As banks must develop new business lines, they need to keep one eye on the Treasuries/cash market to maintain constant compliance with LCR requirements.

Beyond banks, the significance of Treasuries as a cash-like, highly liquid buffer of value has led to financial firms raising their holdings across the board. For example, as Nellie Liang and Pat Parkinson note, open-ended mutual funds hold as much as 12% of all Treasuries outstanding, while hedge funds maintain around 9%.¹³² According to Liang and Parkinson, such deep reserves of safe-haven securities in the hands of regulated firms point to a readiness on their part to sell Treasuries en masse in order to raise cash in distress.¹³³

D. Regulating the Secondary Market

This central place for Treasuries gives multiple major regulators an interest in their workings. Perhaps reflecting this quality, the oversight framework for Treasuries divides authority between several regulators, with none having lead status, but all having a stake in supervision.¹³⁴

Rather than having one lead regulator, as the Securities and Exchange Commission (“SEC”) is for equities, oversight of Treasuries is divided among at least five major agencies: the Fed, the NY Fed, the U.S. Treasury, the SEC, the Commodity Futures Trading Commission (“CFTC”), and the Financial Industry Regulatory Authority (“FINRA”).¹³⁵ The Fed supervises the banks, the SEC and FINRA oversee the securities firms, while the Treasury and NY Fed ensure surveillance over the auction process. The NY Fed also exercises designation authority for primary dealers, but it is not an official regulator for primary dealers.¹³⁶ The CFTC regulates derivatives markets that are linked to Treasuries trading—notably, Treasury futures.¹³⁷ Overlapping authority is commonplace in U.S. administrative law. As Jody Freeman and Jim Rossi outline, regulators sharing authority bring unique expertise. But they also face impediments, such as barriers to information sharing and the need for coordination in the completion of everyday oversight.¹³⁸

Likely in view of their risk-free status, the usual bevy of rules that apply to traders in major markets either do not exist for Treasuries—or do so weakly. This latitude is born out of the broad exemptions that Treasuries enjoy under the Securities Act of 1933 and the Securities Exchange Act of 1934.¹³⁹ Regulators themselves often lack a concrete picture about which rules apply to Treasuries and how they should be implemented.¹⁴⁰ Commentators observe that out of the thousands of FINRA rules for equity broker-dealers, around forty-six apply to broker-dealers in Treasury markets.¹⁴¹

Interestingly, the most visible divergence from classic securities markets regulation (for example, equity and corporate bonds) lies in the area of reporting and information dissemination. Treasuries have historically lacked a trade-by-trade reporting regime.¹⁴² Since 2017, FINRA-regulated securities firms have been required to report their trades. Banks must do so as well.¹⁴³ Despite this reform, however, the 2017 regime has long left serious gaps. Those that did not traditionally fall within the category of a broker, dealer, or bank were not required to report to FINRA. For example, a number of HFT securities firms have typically not reported directly, and neither have hedge funds.¹⁴⁴ In February 2024, the SEC passed a series of measures requiring major liquidity providers—regardless of their institutional category—to register with the SEC and to report their trades. The functional aim of such rulemaking arguably lies in broadening the regulatory perimeter to require reporting by and cast a spotlight on those undertaking major Treasuries-related business (for example, high-speed traders and hedge funds). But the SEC’s actions were met with heavy resistance and a court challenge to their validity.¹⁴⁵ At least until the SEC’s new regulations are implemented, data in relation to non-reporting firms must be captured indirectly—for example, when a non-reporting trader transacts with one that falls under the general 2017 mandate, or data is supplied by a trading platform.¹⁴⁶ Further, public reporting of Treasuries data is fairly light and recent. The public has had access to Treasuries secondary market trading data only since March 2020. This data has generally presented aggregate totals for weekly trading in different kinds of Treasury securities, moving only to daily aggregate reporting beginning in February 2023.¹⁴⁷

***

In summary, the Treasury market represents a critical pillar of the U.S. economy and financial system. Treasuries are viewed as risk-free. This label, however, conflates two aspects of Treasuries: (1) the likelihood of repayment and (2) their trading in secondary markets. With respect to the former, it is widely accepted that the United States will not default. However, in relation to the latter, the system of trading for Treasuries does present real risks. It reflects a design choice that places primary dealers as centerpieces in intermediation. Crucially, Treasuries fall under a system of oversight that is fragmented and without a lead authority. With reporting data only recently becoming available, and lacking comprehensive coverage, regulators face coordination and information costs when seeking to understand the riskiness of this market and its intermediation.

II. TREASURIES AND PRIVATE INDUSTRY SELF-REGULATION

In addition to playing an anchoring role in public regulation, Treasuries have become the lynchpin for safeguarding the six-trillion-dollar market for short-term credit between financial firms. Default-free and highly liquid, Treasuries are the choicest type of collateral, capable of being sold rapidly when a borrower cannot pay. In the repurchase market, any number of financial institutions borrow and lend cash/securities to one another to meet their daily funding needs—with Treasuries being the preferred type of collateral. Particularly after the 2008 Financial Crisis, firms have come to depend on Treasuries as collateral in the repo market, with around four trillion dollars dependent on Treasuries to secure debt. Because repo debt is short-term and collateralized using Treasuries, it is “informationally insensitive,” meaning, so safe that due diligence is unnecessary.¹⁴⁸

This Part describes how Treasuries have become essential to maintaining the system of financial self-regulation that drives over four trillion dollars in daily lending between financial firms. Just as in the secondary market, primary dealers are the major intermediaries in repo operations, performing a variety of functions as lenders, borrowers, and connectors that match clients. In highlighting the centrality of primary dealer intermediation, this Part describes why this task can be challenging and costly in repo markets, with the risk that primary dealers can quickly withdraw and temporarily restrict credit when their job becomes too difficult.¹⁴⁹ These challenges are not easily addressed through the regulatory framework. Unlike the secondary market for Treasuries, the repo market is regulated under a more prudential format, focused on maintaining the safety and soundness of participating firms and transactions. With a far lower emphasis on disclosure, the repo market constitutes an opaque environment by design, obscuring an understanding of its workings and its interlinkages with the Treasuries secondary market.

A. Private Credit in Financial Markets

The repo market—offering short-term, secured credit—represents a solution to the funding needs of large financial firms. A basic repo transaction works as follows. A Lender (a firm with cash) offers to loan money to a Borrower (a firm needing cash).¹⁵⁰ As with any loan, this transaction carries the risk that the Borrower might default. To limit the risk of losses arising from default, the market (1) ensures that the loan is collateralized and (2) keeps the maturity of the loan short (usually overnight, but it can sometimes extend to a month).¹⁵¹ The repo market thus represents a market for secured loans. If the Borrower cannot pay back the cash, the Lender can simply sell the securities and recover their money. The collateral that a Borrower provides is in the form of securities, like Treasuries, corporate bonds, mortgage-backed securities, or stock.¹⁵² Treasuries, unsurprisingly, constitute the most prized form of collateral. In terms of pricing, the value of the collateral exceeds the size of the loan by an amount called the “haircut.” The riskier the collateral, the larger the haircut.¹⁵³ Treasuries usually come with a haircut of around 2% for overnight borrowing.¹⁵⁴ A Borrower looking for a $100 overnight cash loan will need to provide the Lender with $102 in Treasuries.¹⁵⁵
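The haircut arithmetic in this example can be written out as a short sketch. The function name is hypothetical, and the convention used here (collateral equals the loan grossed up by the haircut) is an assumption inferred from the $100 and $102 figures in the text; market participants also quote haircuts as a discount to collateral value, which gives slightly different numbers.

```python
def required_collateral(loan: float, haircut: float) -> float:
    """Collateral value a Borrower must post for a given cash loan.

    Assumes collateral = loan * (1 + haircut), the convention implied
    by the $100 loan / $102 collateral example in the text.
    """
    if loan < 0 or haircut < 0:
        raise ValueError("loan and haircut must be non-negative")
    return loan * (1 + haircut)


# The text's example: a $100 overnight loan against Treasuries at a 2% haircut.
print(f"Collateral required: ${required_collateral(100.0, 0.02):.2f}")
```

Riskier collateral simply enters with a larger `haircut` argument, so the Borrower must post more securities for the same amount of cash.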

As noted above, the maturity of loans is short, usually overnight. But loans are routinely rolled over.¹⁵⁶ This means that the loan is refreshed as it comes due: the Borrower can keep using the cash, while the Lender keeps the collateral. The short maturity structure gives Lenders the ability to control their exposure. If the Lender senses trouble, it can ask for the loan to be paid back, or it can choose to ask for more collateral to reflect any additional risk posed by the transaction. If the Borrower cannot repay, the Lender can sell the Treasuries. Danger arises for the market if the Lender feels unable to continue lending or rolling over existing loans, forcing a number of its borrowers into distress where they must pay up or find additional securities to keep borrowed cash.¹⁵⁷

The repo market is mainly divided in two: (1) the bilateral market and (2) the tri-party repo market. Further, repo transactions come in two distinct types: (1) repo trades and (2) reverse repo trades.

In the bilateral repo market, parties transact directly with one another and privately organize their own risk management. By contrast, in the tri-party repo market, the administration and settlement of trades are handled by a third-party firm that intercedes between parties to manage the risk of ensuring the trade executes.¹⁵⁸

The classification of whether a repo transaction represents a repo or reverse repo depends on whether the dealer is borrowing or lending cash in the transaction.¹⁵⁹ In a repo, the dealer is borrowing cash (and providing collateral). In a reverse repo, the dealer is lending cash (and receiving collateral).¹⁶⁰ Figure 1.A shows the size of the bilateral repo market over the period 2013–2023.¹⁶¹ There are two points to note: (1) the daily size of the bilateral repo market, comprising both repo and reverse repos, has varied from $3.9 trillion to $5.1 trillion over the last ten years and (2) on average, around 75% of the collateral in the bilateral repo segment comprises Treasuries, meaning that Treasury securities valued at about $3 trillion to $4 trillion remain locked up as “passive” collateral to prevent default, thereby reducing the “float” readily available with dealers for secondary market trading.¹⁶² Figure 1.B shows the size of the tri-party repo market over the past decade.¹⁶³ For example, in December 2021, the daily size of the tri-party repo market stood at around $3.7 trillion, around $2.5 trillion of which was backed by Treasuries.¹⁶⁴ Given the higher risk of direct trading, the bilateral repo market uses Treasuries as collateral in a much larger proportion of its transactions.
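The proportions quoted above can be recomputed directly from the stated figures. The sketch below uses only numbers given in the text; the function and variable names are hypothetical.

```python
def treasury_backed(total_trillions: float, treasury_share: float) -> float:
    """Dollar value (in trillions) of a repo segment collateralized by Treasuries."""
    return total_trillions * treasury_share


# Bilateral segment: $3.9 to $5.1 trillion daily, about 75% Treasury collateral.
bilateral_low = treasury_backed(3.9, 0.75)
bilateral_high = treasury_backed(5.1, 0.75)

# Tri-party segment, December 2021: $2.5 trillion of $3.7 trillion.
tri_party_share = 2.5 / 3.7

print(f"Bilateral Treasuries collateral: ${bilateral_low:.2f}T to ${bilateral_high:.2f}T")
print(f"Tri-party Treasury share: {tri_party_share:.1%}")
```

On these figures, the bilateral range works out to a little under $3 trillion at the low end and a little under $4 trillion at the high end, and the tri-party Treasury share comes to roughly two-thirds.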

The feature that gives the repurchase (repo) market its name describes the legal arrangements that underlie how a typical repo works. The transaction is effectively structured as a sale and repurchase of securities even though it is, for all intents and purposes, a loan. The Borrower (seeking cash) sells its securities to the Lender (for the cash) with the promise to buy them back at a pre-agreed price the next day (or whenever the deal matures). The pre-agreed repurchase price represents the amount of the loan alongside an additional slice of compensation.¹⁶⁵ Because the Lender legally owns the securities, it can sell them if the Borrower defaults. The Bankruptcy Code specifically allows repo Lenders to sell collateralized securities even if the Borrower files for bankruptcy. Ordinarily, without such protection, Lenders would be stopped from doing so by the Code’s automatic stay on enforcement actions.¹⁶⁶
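The relationship between the cash lent and the pre-agreed repurchase price can be sketched as follows. The text does not specify how the "additional slice of compensation" is computed; this sketch assumes simple interest on a 360-day year, a common money-market convention, and the function name and sample rate are hypothetical.

```python
def repurchase_price(cash_lent: float, annual_rate: float,
                     days: int = 1, day_count: int = 360) -> float:
    """Price at which the Borrower buys its securities back.

    Assumes simple interest accrued over `days` on a `day_count`-day year.
    """
    if days < 0 or day_count <= 0:
        raise ValueError("days must be non-negative and day_count positive")
    return cash_lent * (1 + annual_rate * days / day_count)


# Illustrative overnight repo: $100 of cash at a 2% annualized rate.
price = repurchase_price(100.0, 0.02)
print(f"Repurchase price: ${price:.4f}")
print(f"Lender's compensation: ${price - 100.0:.4f}")
```

Rolling the loan over each day repeats this calculation, which is why small changes in the overnight rate matter to firms that fund themselves continuously in this market.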

Scholars observe that repo markets tend to function as a “bank-like” system for financial firms.¹⁶⁷ Those that have cash can earn money by lending it and taking collateral. In turn, those that need cash but have securities can offer their securities as collateral in return for access to cash. Repo markets reflect the reality that financial firms are unique. They cannot park millions and billions of dollars in a bank account. Those with cash want this money to generate some return, rather than having it languish unproductively. On the other side, some firms’ asset base focuses on holding securities rather than cash (for example, if they invest heavily in securities as underwriters).¹⁶⁸ The repo market unlocks value by allowing those that need cash to borrow it on a short-term and secured basis, while permitting those with cash to be able to lend it out and make money from this transaction.¹⁶⁹

Secondly and relatedly, repo markets enable financial firms to use leverage if their business model and balance sheet can support it. For example, suppose a firm has 100 Treasuries, each valued at $100. The typical haircut for Treasuries is 2%. The firm can borrow $9,800 against this collateral. It can then use these funds to buy 98 Treasuries, following which it can again use the 98 Treasuries as collateral to borrow roughly $9,600. It can then use these funds to buy roughly 96 Treasuries and so on, potentially achieving up to 50 times leverage, the theoretical limit of one divided by the 2% haircut.
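The leverage spiral in this example is a geometric series, which a short sketch makes concrete. The function names are hypothetical. The sketch follows the convention used in this paragraph, in which the loan equals collateral value times one minus the haircut, and it ignores interest, transaction costs, and price changes. Without rounding, the second loan is $9,604, which the text rounds to $9,600.

```python
def leveraged_position(initial_value: float, haircut: float, rounds: int) -> float:
    """Total Treasuries held after pledging and re-buying `rounds` times.

    Each round borrows (1 - haircut) times the value of the latest
    purchase and spends the proceeds on more Treasuries.
    """
    if not 0 < haircut < 1:
        raise ValueError("haircut must be between 0 and 1")
    total = 0.0
    tranche = initial_value
    for _ in range(rounds + 1):  # round 0 is the initial holding
        total += tranche
        tranche *= 1 - haircut
    return total


def max_leverage(haircut: float) -> float:
    """Limit of the series 1 + (1-h) + (1-h)^2 + ..., which equals 1 / h."""
    if not 0 < haircut < 1:
        raise ValueError("haircut must be between 0 and 1")
    return 1 / haircut


start = 100 * 100.0  # 100 Treasuries at $100 each
for n in (0, 1, 2, 50, 500):
    held = leveraged_position(start, 0.02, n)
    print(f"after {n:>3} rounds: ${held:,.0f} ({held / start:.1f}x)")
print(f"theoretical ceiling: {max_leverage(0.02):.0f}x")
```

The ceiling is approached only after many rounds, so the 50 times figure is best read as an upper bound rather than a typical outcome.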

Equally, the same Treasuries may be used in multiple transactions. Repo markets permit dealers to take Treasuries belonging to a client and to use these as collateral for their own purposes in the repo market.¹⁷⁰ A hedge fund with 100 Treasuries can entrust a Dealer with their safekeeping. Rather than simply let these assets be unproductive in an account, the Dealer and hedge fund agree that the Dealer can use the Treasuries for the Dealer’s private credit needs. In return, the Dealer can offer the hedge fund a line of credit on cheaper terms than what might otherwise have been possible. Because the Dealer can control the Treasuries, it can use them as collateral to borrow funds from a Lender. The Lender, too, can take these same Treasuries and reuse them for more borrowing.¹⁷¹ According to Infante et al., by dint of contractual agreements, primary dealers are generally permitted to reuse the vast majority of the Treasury collateral that they hold for clients. Studies suggest that as much as 85% of all Treasuries collateral may be subject to reuse.¹⁷²

Reusing collateral has a number of important benefits for both dealers and their clients. It means that dealers can provide cheaper intermediation across the board. Rather than having to buy Treasuries and spend cash to do so, a dealer can request permission from its client to borrow their securities. In turn, the client also receives cheaper services.¹⁷³ Reuse also means that the market unlocks maximum value from a Treasury. Rather than only be used once in one trade, reuse can extract economic gains when Treasuries are used across multiple transactions.¹⁷⁴ So long as parties can repay, reuse can help lower the costs of intermediation and accessing repo credit affordably.¹⁷⁵

But reusing collateral can also be dangerous.¹⁷⁶ While profitable in good times, collateral reuse can amplify distress in crisis. It undermines the assumption that repo markets provide fully secured lending. If the Hedge Fund client demands its Treasuries back, the Dealer faces a problem—as do others along the collateral chain. The Dealer must immediately source the Treasuries to return to the Hedge Fund.¹⁷⁷ If the Dealer has used Treasuries to borrow cash, it must find cash to pay its Lender back and recover the Treasuries. Where the Dealer has failed (like Lehman Brothers), the Hedge Fund can find itself caught up in long legal proceedings to recover the assets.¹⁷⁸

The fragility of collateral chains was made clear around September 16, 2019, when rates to borrow cash in the repo market spiked, climbing to almost 10% from about 2% in the week prior.¹⁷⁹ In seeking to understand why, a number of commentators pointed to concerns about the quality of the collateralization—and whether collateral chains of reused Treasuries could be counted on as watertight. Manmohan Singh of the International Monetary Fund estimated that, in the 2018 Treasury repo market, around three separate actors believed that they were entitled to the very same Treasury security.¹⁸⁰ This imputed a reuse rate of 2.2. That is, in addition to the actual owner (for example, the Hedge Fund above), 2.2 further firms considered themselves entitled to the Treasury collateral.¹⁸¹ Owing to uncertainties about whether lending was really fully collateralized, primary dealers and others became wary of parting with cash, despite the promise of an almost 10% interest rate on offer that day.¹⁸² As this episode makes clear, even though a Treasury can be reused multiple times as collateral, it can only be sold once to cover exposure. In an informationally opaque environment, in which parties do not know if they might be the one caught without viable collateral, it makes sense for primary dealers to stop intermediation and withdraw from the market.

B. Intermediation in the Repo Market

Primary dealers are the key intermediaries in the Treasuries-backed repo market. According to Copeland et al., primary dealers appear to intermediate around 80% of the bilateral repo market.¹⁸³ In addition to serving the financing needs of clients, primary dealers also use the repo markets to secure funding for themselves.¹⁸⁴

As intermediaries, primary dealers are tasked with fulfilling a number of functions in the repo market. At the most basic level, they match borrowers with lenders. When one client has cash (for example, a mutual fund), while another needs it (for example, a bank), the primary dealer connects both parties and facilitates the repo transaction (for a price). A more engaged role involves the primary dealer acting as one side of the repo trade for a client, either as borrower or as lender. When doing so, the primary dealer deploys the power of its balance sheet to take risk directly on its books.¹⁸⁵

Primary dealers have unique advantages when it comes to intermediating the Treasuries-backed repo market. Beyond access to government auctions, they also possess positional dominance. For one, as connected nodes in financial markets, with access to networks of global clients, primary dealers represent trusted repositories for client cash and securities. In return for offering services and expertise, dealers acquire access to vast amounts of client securities and cash that can be reused for repo operations. Because the repo market enables collateral reuse, dealers can subsidize the costs of intermediation.

In addition, this central position affords primary dealers some informational advantages. They are well-placed to identify, connect, and transact with counterparties. Primary dealers are likely to have knowledge about which kinds of firms tend to hold sufficient cash (for example, mutual funds) to lend, and who has enough securities to be able to borrow. Repeat relationships can help build trust, deepen knowledge about the client’s financials, and allow the primary dealer to more precisely price the terms of repo debt. Connections with multiple clients can permit primary dealers to fulfill orders for larger volumes of Treasuries/cash in which a dealer can tap and pool assets across clients. Crucially, this ability to tap into a sprawling network of resources can strengthen the financial system because its key intermediaries are positioned to supply liquidity in an elastic way.¹⁸⁶

Finally, primary dealers are active throughout financial markets and supply liquidity in a variety of assets like equities and corporate bonds.¹⁸⁷ This broad-based participation in capital markets puts primary dealers in a strong position to use their experience and expertise to better predict demand for repo funding. For example, by being active suppliers of liquidity to the corporate bond market, primary dealers are likely to have a detailed understanding of who the key buyers of bond issues are likely to be (for example, insurance firms or mutual funds). This can provide special insight into possible future pockets of demand for Treasuries/cash collateral in which investors might need short-term cash to purchase a sizable volume of bonds.

C. Regulating the Repo Market

Despite its short-term and collateralized nature, the repo market is criticized for its structural instability and the profound risk that it poses for the financial system.¹⁸⁸ Scholars argue that the repo market suffers from a similar vulnerability to banks: the chance that it suffers a run in which lenders are frightened enough to recall their short-term debt en masse. According to Gary Gorton and Andrew Metrick, a major catalyst for the 2008 Financial Crisis came from a run in one part of the repo market, making it impossible or expensive for financial firms to continue funding themselves.¹⁸⁹ In Gorton and Metrick’s study, the collateral underlying the repo loans was largely comprised of mortgage-backed securities that plunged in value.¹⁹⁰ Even where underlying securities were not as risky, the fear that they could be and that repo borrowers would be unable to repay ramped up what lenders were charging or caused them to call in their loans.¹⁹¹

Unlike banks, the repo market lacks preventative structural safeguards, like deposit insurance, that could mitigate the risk of a run.¹⁹² Instead, after 2008, it largely relies on Treasuries collateralization to assure parties that they will be repaid.¹⁹³

This focus on collateralization reflects the prudential approach that characterizes the regulation of the repo market. Broadly, a slew of capital rules for financial institutions take account of a primary dealer’s repo participation to calculate how much capital it needs.¹⁹⁴

As detailed in Part I, banks must maintain a supply of highly liquid assets that can help them to remain solvent in the event of a sudden cash drain. The mandate that major financial institutions buffer themselves up with a thick reserve of highly liquid assets reflects the lessons learned in 2008 that underscored the potential for a liquidity crunch, as exemplified by the repo market’s failure.¹⁹⁵

However, their obvious utility and importance notwithstanding, these liquidity rules have also been blamed by some commentators for amplifying the instability in repo operations. In September 2019, when cash in the repo market seemed to run dry, no bank came forward to take advantage of what would have been a lucrative opportunity to lend (with a rate of 10% on offer). According to some commentators, post-2008 liquidity rules meant that banks did not wish to lend cash because they preferred to maintain high cash reserves and meet their compliance requirements. Importantly, the Fed pays interest on the cash reserves that it holds in its accounts for banks. If these interest payments are sufficiently high, banks might hesitate before using the cash for repo lending.¹⁹⁶ As Joshua Younger et al. observe, large banks were not short of cash in mid-September 2019.¹⁹⁷ They held around $700 billion in cash reserves—far in excess of what was required of them under law.¹⁹⁸ On paper at least, they should have been able to direct some of these funds into the repo market and ease the costs of lending. This suggests that dealers preferred to prioritize keeping a thick liquidity buffer, accruing interest in their account with the Fed and waiting out the uncertainty, rather than actively deploying their balance sheet to alleviate it.¹⁹⁹

In addition to liquidity ratios, primary dealer banks can also become subject to a “capital surcharge” over and above the basic capital buffer that banks maintain—resulting in banks seeking out ways to avoid the full force of paying this extra cost. Under post-2008 rules, the surcharge kicks in when a bank is deemed to be large and systemic.²⁰⁰ The greater the size and interconnectedness of a dealer bank, the greater the likelihood that it faces a higher charge.²⁰¹ From the standpoint of policy, this systemic surcharge serves the purpose of ensuring the financial markets are better girded against the possibility that a large bank fails because this bank should have a deeper buffer from which to cover its losses. However, as an unintended consequence, it can motivate dealer banks to suddenly reduce the depth of their repo intermediation in order to avoid becoming sufficiently large and interconnected to become subject to a higher capital charge.²⁰²

Industry analysts report that large, systemically important banks routinely seek out ways to show a reduced footprint when it comes time for regulators to assign scores for the purposes of the surcharge.²⁰³ Under this regime, banks have incentives to reduce their systemic activities to just below the threshold at which a higher charge would apply. Because activities in the repo market constitute one signal of a bank’s interconnectedness, heavily limiting the depth of its involvement can help a bank to reduce the capital surcharge it faces.²⁰⁴ Further, owing to the short-term nature of repo lending, dealers can precisely time their retraction to quickly close out repo transactions before being rated.²⁰⁵ This kind of behavior, while perhaps rational for any single dealer, clearly poses problems for the market as a whole. Repo markets can experience a broad fall in the intensity of intermediation around times when dealers are to be assessed for a surcharge.²⁰⁶ Even if the market can predict that such an eventuality will occur, episodic illiquidity can still create periods of fragility in which firms struggle to get the repo loan they need at an acceptable price. Indeed, even the fact of anticipating such liquidity-draining milestones can constitute a self-fulfilling prophecy. Firms may be less willing to come forward with their cash and Treasuries to trade assuming likely frictions in the market. Whenever dealers decide to scale down their intermediation, even if temporarily, it can introduce a broad slowdown in the flow of credit across the financial system.

In summary, private self-regulation in financial markets looks to Treasuries to protect firms and the system against default. Private lending between firms in the repo market relies on primary dealers for intermediation. Despite being collateralized using Treasuries, however, the repo market suffers from built-in risks. It is opaque by design. Reuse of Treasuries collateral creates a source of instability in crisis. Short-term financing can dry up quickly. Dealers are free to withdraw or reduce intermediation. That being said, with around four trillion dollars in daily lending backstopped by Treasuries, the financial system is entrenched in its belief that Treasuries constitute the protective safe asset to anchor private industry self-regulation.

III. THE FALSE PROMISE OF TREASURIES

This Part argues that systematic reliance on Treasuries in public and private financial regulation is internally in tension. First, we show that the secondary market and the repo market are inextricably linked and interdependent such that loss of function in one market can affect the other. Second, primary dealer intermediation, underpinning both markets, is subject to a slew of costs and problems. Opacity is pervasive. This prevents primary dealers from gaining a full picture of the risks. This increases the challenge facing primary dealers to navigate the internal conflict between the repo and the secondary market operations. With trillions of dollars in Treasuries and cash collateral locked in to support the repo market, the secondary market for trading Treasuries can face a shortfall, especially during crises. In seeking to resolve this tension, primary dealers are likely to favor intermediating in the market in which they will gain the most, economically and reputationally. Alternatively, if neither market provides lucrative gains, primary dealers will rationally have every reason to withdraw intermediation altogether.

Additionally, this Part observes that the regulatory system is ill-placed to recognize the risks of an interconnected repo and secondary market for Treasuries. The Treasury market is overseen by a panoply of agencies. Their approaches to oversight diverge. Cooperation costs are built into a fragmented system of oversight.²⁰⁷ Regulatory incapacity increases the dangers of risky intermediation for Treasury market fragility and casts further doubt on the ability of Treasuries to function as safe assets in public and private financial regulation.

A. Opacity, Information Costs, and Monitoring

Opacity in the repo market and the secondary market adds systematic information costs to primary dealer intermediation. A lack of full and real-time information limits monitoring and makes it harder for primary dealers to anticipate demand on their balance sheets. If the costs of opacity become too much, primary dealers may limit or withdraw intermediation to one or both markets.²⁰⁸

Primary dealers face uncertainties from a number of sources in intermediating both Treasury secondary trading and repo markets. First, both markets are home to a diverse set of users whose needs for cash and securities can vary unexpectedly. The repo market exemplifies this vulnerability.

Taking data from the periodic reports that primary dealers provide to the NY Fed, it becomes clear that primary dealers’ exposure to Treasuries is much greater in the repo markets than in the Treasury secondary market.²⁰⁹ Figure 2 shows the average daily primary dealer exposure to outstanding repos and reverse repos over the period from 2016 to 2023.²¹⁰ Similarly, Figures 3.A and 3.B show, respectively, the average daily primary dealer inventory exposure in the secondary market, and the average daily trading volume in the secondary market.²¹¹ The data underlying the charts show that the average daily collateral exposure of primary dealers in the Treasury-backed repo market averaged $1.91 trillion in 2020 and $1.84 trillion in 2021. In contrast, the average net primary dealer exposure to the Treasuries’ secondary market totaled only around $244 billion in 2020 and $159 billion in 2021. Even the average daily trading volume of primary dealers in the Treasury secondary market, which includes both buying and selling by primary dealers, was only around $500 billion in both 2020 and 2021.

Importantly, the repo market also exerts large unpredictable demands on dealer balance sheets. By contrast, the secondary market is steadier. The data underlying Figure 2 show that the average daily collateral exposure of primary dealers in the Treasury-backed repo market varies wildly and unpredictably from one week to the next. In 2020, this week-to-week difference varied from under $1 billion to $385 billion during the year and averaged around $72 billion. The average daily collateral used by primary dealers for the Treasury-backed repo market also continued to vary substantially from one week to the next in 2021, with weekly changes exceeding $100 billion about 20% of the time. Further, these position changes in the repo market are not correlated from week to week, reflecting constantly shifting market-wide needs.²¹²
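
For readers who wish to see how week-to-week statistics of this kind can be computed, the following is a minimal sketch only. The weekly series is hypothetical placeholder data, not the NY Fed data underlying Figure 2, and the use of a lag-1 autocorrelation of weekly changes is an assumption about one reasonable way to test whether changes are correlated from week to week; the source does not specify its method.

```python
from statistics import mean

# HYPOTHETICAL weekly averages of daily repo collateral exposure (billions of USD).
# Placeholder values for illustration only; NOT the data behind Figure 2.
weekly_exposure = [1850, 1920, 1790, 1905, 1880, 2010, 1895]

# Change from each week to the next, and its absolute size.
changes = [later - earlier for earlier, later in zip(weekly_exposure, weekly_exposure[1:])]
abs_changes = [abs(c) for c in changes]

# Average size of a weekly change, and share of weeks with a change above $100 billion.
avg_abs_change = mean(abs_changes)
share_over_100 = sum(c > 100 for c in abs_changes) / len(abs_changes)


def lag1_autocorr(xs):
    """Lag-1 autocorrelation: values near zero suggest one week's change
    says little about the next week's change."""
    m = mean(xs)
    num = sum((x - m) * (y - m) for x, y in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


print(f"Average absolute weekly change: ${avg_abs_change:.1f} billion")
print(f"Share of weekly changes above $100 billion: {share_over_100:.0%}")
print(f"Lag-1 autocorrelation of weekly changes: {lag1_autocorr(changes):.2f}")
```

With a real weekly series substituted for the placeholder list, the same three quantities correspond to the average weekly variation, the frequency of large swings, and the degree of week-to-week correlation discussed above.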

By contrast, the secondary market is much less dramatic. Using data underlying Figure 3.A, it saw average weekly variation in primary dealer exposures of only around $17 billion in both 2020 and 2021, with maximum weekly variation of only $46 billion. These figures reflect a consistent pattern over time. Figure 4 shows the sizable extent by which changes in primary dealer exposures to the repo market dominate changes in primary dealer exposures to the Treasury secondary market between 2016 and 2023.²¹³

Relatedly, primary dealers are also subject to the vagaries of how other dealers use Treasuries and cash in repo intermediation. Primary dealers do not intermediate all bilateral repo transactions. Copeland et al. estimate that around 20% of transactions are not intermediated by primary dealers.²¹⁴ While primary dealers occupy an outsize and influential position, their perch only allows them a partial view. Consequently, primary dealers confront blind spots about how these other firms use collateral and cash. This can create further difficulties for primary dealers in seeking to estimate the Treasuries and cash required to sustain repo operations as well as Treasury secondary markets.²¹⁵ For example, hedge funds have emerged as active and intriguing players in repo markets. According to one 2021 study, hedge funds doubled their exposure to Treasuries to $2.4 trillion between 2018 and 2020, holding around $1.4 trillion in Treasuries in mid-2019, compared to the largest banks that held only around $524 billion at that time.²¹⁶ In addition to using the repo market to borrow to fund themselves, hedge funds have also taken over some of the trading and liquidity supplying functions traditionally performed by major dealers.²¹⁷

Opacity attaching both to the repo market and to hedge funds has precluded a clear understanding of what kinds of risks hedge funds pose for dealers. For example, hedge funds are well-known for taking on debt to pursue trading strategies.²¹⁸ Hedge fund participation raises the danger that leveraged funds become a credit risk for dealers that fund them. In addition, owing to their smaller balance sheets, hedge funds may need to sell Treasuries and also pull back quickly from providing liquidity to the rest of the market.²¹⁹ The March 2020 crisis has shone a spotlight on the potentially destabilizing role of hedge funds in Treasuries.²²⁰ It also highlights the pervasive danger for dealers from corners of the market they are less familiar with yet whose activities can nevertheless dramatically impact their own.

Second, primary dealers face difficulties from their participation as intermediaries in capital markets more broadly. Crucially, primary dealers are active liquidity suppliers to risky assets.²²¹ Figure 3.A shows that about one-third of net primary dealer exposure has been in risky non-Treasury securities. Figure 3.B shows that primary dealer trading volume in non-Treasuries is comparable to that in Treasuries, such that risks arising out of trading in non-Treasury assets are similar in volume to those arising from Treasuries.

Further, Figure 2 shows that the exposure of primary dealers to non-Treasuries collateral in repos and reverse repos is about 20% to 25%. Non-Treasury collateral generally comprises mortgage and asset-backed securities.²²² As Gary Gorton and Andrew Metrick observe, these securities pose especially high risks where parties rush to liquidate their loans in fear of the collateral losing value quickly.²²³ This non-Treasury segment can heighten the risk of a repo run and force primary dealers to limit their intermediation across the board, including in Treasuries repo and secondary markets.

Third, information costs are exacerbated by difficulties in the ability of primary dealers to understand whether the Treasuries collateral they hold is actually viable. In other words, can this collateral be traced and readily sold for cash? Lengthy collateral chains, formed by reusing Treasuries multiple times, muddy understanding of whether this collateral is available in a default. Information sources are scant. Primary dealers do not have to report data on their use of client collateral in their weekly disclosures to the NY Fed.²²⁴ This means that they do not need to tell the NY Fed about how they use collateral that clients entrust to them in safekeeping. This leaves firms (and regulators) to guesstimate exposures.²²⁵

The perceived safety of the Treasuries-backed repo—and the difficulty of mapping out collateral chains—can act as a disincentive for primary dealers and others to invest in information gathering. Even if they do go to the trouble of mapping out the risk, the constantly evolving nature of repo market exposures means that this map of underlying collateral chains can change quickly.

Information and monitoring costs in the repo and secondary market for Treasuries contribute to imperfections in intermediation.²²⁶ Higher costs in understanding and pricing risk can result in intermediation becoming more selective, expensive, and governed by private interests at a cost to the public good. In response to these costs, primary dealers may temporarily cut off credit in repo, including to one another. Without the ability to fund themselves using repo operations (for example, to borrow cash), they may also withdraw from the secondary market and stop buying and selling Treasuries to investors. Critically, incomplete transparency in repo and the secondary markets can shield primary dealers that withdraw intermediation. They may be quicker to leave the market whenever they see fit owing to the impossibility of being publicly identified as the dealer that stopped supplying liquidity.

B. Conflict Between Public and Private Regulation

Primary dealers also face a conflict when intermediating cash and Treasuries between the repo and secondary markets, particularly during crisis. Supporting the health of one market (for example, maintaining systemic stability in repo) can come at the expense of the other (providing liquidity to the secondary market).

This task of accomplishing successful intermediation across both markets is complicated by a number of factors. Primary dealers must decide how they allocate the “free float” of Treasuries available to them. They must examine the volume of Treasuries that is freely available and decide how much of this float should be allocated between the secondary and repo markets.

This calculus reveals the depth of the internal conflict at play between the secondary and repo market. Critically, the free float of Treasuries is reduced by the large amounts of cash and securities that are “locked-in” in the repo market.²²⁷ This means that large volumes of this float become passively captured as collateral in the repo market.

Specifically, during 2020, the daily average Treasuries transaction volume in the secondary market was around $603 billion. Compared to this, the dollar volume of primary dealer Treasury collateral in repos during 2020 was $1.9 trillion, and the dollar volume of such collateral in reverse repos was $1.7 trillion. Taken together, without reuse, Treasuries valued at a daily average of about $3.6 trillion were captured as passive collateral in the repo contracts of primary dealers in just the bilateral repo market. This is about twelve times the average exposure of primary dealers, and six times their daily Treasury trading volume in the secondary market. This inference is not specific to 2020. Rather as seen in Figures 2, 3.A, and 3.B, these trends are persistent. High amounts of captured Treasuries float—necessary to repo operations—drastically reduce the volume of Treasuries that are available for trades in the secondary market.²²⁸
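
The arithmetic behind the captured-collateral comparison can be checked with a short sketch. It uses only the 2020 daily-average figures stated in the paragraph above; the variable names are illustrative. The multiple of average dealer exposure is not recomputed here because the exposure base used for that comparison is not restated in this paragraph.

```python
# 2020 daily averages as stated in the text, in billions of USD.
repo_collateral = 1_900          # Treasury collateral in primary dealer repos
reverse_repo_collateral = 1_700  # Treasury collateral in primary dealer reverse repos
secondary_daily_volume = 603     # daily average secondary market transaction volume

# Collateral captured in repo contracts, ignoring reuse.
captured = repo_collateral + reverse_repo_collateral

# How many days of secondary market trading the captured collateral represents.
volume_multiple = captured / secondary_daily_volume

print(f"Captured collateral: ${captured / 1000:.1f} trillion")
print(f"Multiple of daily secondary market volume: {volume_multiple:.1f}x")
```

On these inputs the captured collateral sums to about $3.6 trillion, roughly six times the daily secondary market volume, which matches the comparison drawn in the text.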

These restrictions create difficulties for intermediation by primary dealers. Captured repo collateral imposes rigidity on primary dealers that reduces how fully they can supply liquidity to the secondary market during crisis. The secondary market can experience sudden and unexpected pressure from investors to perform. For example, total aggregate trading in the Treasury secondary market in the weeks of March 6, 2020, and March 13, 2020, came to around $5.7 and $4.9 trillion, respectively, an especially turbulent two weeks during which the Treasury market essentially stalled.²²⁹ By contrast, as secondary trading activity normalized, it started seeing approximately half the volume by summer 2020.²³⁰

Trillions in passive repo collateral create a source of fragility for the secondary market, generating logistical and financial difficulties for primary dealers: (1) dealers might have to constantly warehouse a reserve of Treasuries to support Treasury secondary trading during periods of high demand or (2) they can risk having to buy Treasuries during a crisis in order to meet demand. The first option creates costs because dealers have to allocate capital for buying and maintaining Treasuries supply, limiting profits from intermediation. In the second case, they run the risk that Treasuries become costlier to source, potentially resulting in reduced margins for their business. A third option always remains on the table: to reduce intermediation and avoid the need to source expensive Treasuries during difficult periods. This trade-off requires primary dealers to balance their private interests with public ones. When Treasuries’ float is limited and demand in secondary trading exceeds existing reserves, the incentives of dealers to remain committed to Treasuries trading diminish.

The coexistence of the secondary and repo markets—both intermediated by primary dealers—thus reveals real structural tensions. The growth of repo lending, demanding higher volumes of Treasuries float, can result in a corresponding decrease in securities available to lubricate the secondary market. This trade-off creates a complex problem for policy. If repo lending represents a desirable and efficient form of funding for financial institutions, ensuring it is done safely is of paramount concern. At the same time, a growing reliance on repo operations for financial institutions results in fragilities for the secondary market and the regulation that depends on it.

Moreover, as intermediaries for both markets, primary dealers have incentives to prioritize the needs of one market over another depending on private preferences. This favoring of one market over another can be motivated by a number of reasons. For example, primary dealers may see larger profits, lower risk, and reputational gains from ensuring the continuity of repo lending than from supplying expensive liquidity to the secondary market. Repo markets are far larger and primary dealers earn fees for matching counterparties.²³¹ It is one in which primary dealers dominate and have repeat relationships with major clients. Moreover, dealers are themselves beholden to the repo market for their own financing needs. For dealers, then, there is a lot to gain from seeing a growth in the size of the repo financing market—even if this means periodic retreats from the more competitive secondary market for Treasuries in which primary dealers have lost ground to high-speed electronic traders.²³²

The consequences of how primary dealers navigate this trade-off are a critical matter for public policy. If primary dealers are motivated to step away from intermediating in secondary markets, their actions call into question the view that Treasuries trade in a market that is deeply and constantly liquid. Rather, it points to one that is chronically vulnerable to the private preferences of its key intermediaries who are unlikely to continue offering resilient tradability at a cost to themselves. More broadly, for public and private financial regulation to remain credible, its intermediation must be able to deliver continued lucrative profit to its major dealers.

C. A Unique Configuration of Unknown Unknowns

As detailed above, primary dealers navigate a tricky and costly task in intermediating across both the repo and secondary market for Treasuries. Opacity limits the ability of a dealer to build a real-time understanding of the activity in the repo market external to that dealer—most importantly, where Treasuries collateral is located, and whether it can be captured and sold in an emergency. Motivation to monitor is low. The secondary market’s limited historic reporting also reduces sight of pockets of disruptive trading—and the arrival of greater competition between primary dealers and newer high-speed automated traders lowers the attractiveness of the secondary market as a place to do vibrant business. Importantly, intermediation represents a source of conflict. Dealers are caught between preserving trillions in passive collateral in the repo market to maintain its safety and soundness—and using cash and Treasuries to supply liquidity to the secondary market. Especially if collateral reuse results in uncertainty about the quality of collateral, primary dealers have every incentive to ring-fence the Treasuries and cash they have for repo operations, even if demand in secondary markets is spiking. Critically, despite dependence on the services of primary dealers to preserve intermediation, they can withdraw from both spaces whenever costs and uncertainties become too high.²³³

This combination of risks represents an unprecedented set of problems for intermediation in the repo and secondary markets. It leaves dealers and policymakers facing many unknowns.

First, as shown in Part I, the repo market has grown its reliance on Treasuries collateral sharply following the 2008 Financial Crisis as a way to privately regulate short-term credit between firms. Whereas an earlier era looked to a variety of riskier assets like mortgage-backed securities, the last decade has observed a marked shift in the direction of Treasuries as favored collateral.²³⁴ With Treasuries now collateralizing around $4 trillion of repo debt, concerns about locking-in and ring-fencing collateral carry special salience given the potential for catastrophic damage to financial stability if this collateral reserve becomes unstable.

Second, post-Crisis regulation also puts special weight on Treasuries as a mandatory asset for capital buffers. As discussed in Part I, Treasuries are particularly important for post-Crisis public regulation, with financial institutions required to maintain deep buffers of high-quality liquid assets. This signal reliance by public regulation raises the stakes for primary dealers to ensure the secondary market is supplied with trading opportunities for those firms that need to liquidate their Treasuries in a financial crunch or to buy Treasuries when a safe asset is needed. As detailed by Vissing-Jorgensen, mutual funds sold more Treasuries in 2020 than they did after the 2008 Financial Crisis, owing to the thicker reserves of Treasuries they held coming into 2020.²³⁵

On the other side, as detailed in Part II, post-Crisis reforms also impose these liquidity requirements and capital surcharges on primary dealers themselves and necessitate compliance with regulations that protect these firms from becoming too big to fail. According to some commentators and scholars, these rules can also make primary dealers more hesitant to supply liquidity to the repo and the secondary market, for fear of depleting reserves of cash and Treasuries and falling out of compliance. Importantly, scholars also note that a stricter compliance environment following post-Crisis reforms has reduced primary dealer motivation to support intermediation and opened the door for other firms to enter the fray. Although this is a broader trend across debt markets, commentators note that non–primary dealers (for example, hedge funds) have stepped into the breach to supply liquidity more actively.²³⁶ If this trend continues, primary dealers could face more information gaps arising from the activities of those that are new to the market as well as greater competition that diminishes their profits from intermediation.

Third, as outlined in Part II, the secondary market for Treasuries has experienced radical shifts, as primary dealers have lost ground to high-speed traders in the interdealer segment. According to one study of the major interdealer trading platform, BrokerTec, eight out of the top ten traders on the venue came from the ranks of HFT firms, rather than primary dealers. Primary dealers have continued to dominate the dealer-client segment. But this rapid waning of their professional power shows that the secondary market has become more crowded with new entrants, and the incentives of primary dealers to take on costs to keep trading are quickly becoming weaker in the face of competition.²³⁷

Putting these factors together, Treasuries intermediation has newly evolved into an especially complex, contradictory, and costly prospect for primary dealers, policymakers, and regulation. It is also foundational to financial markets and their stability post-2008. This coming together of regulatory need and operational complexity in managing intermediation requires that regulators be equipped to spot and address the risks at the heart of a straining Treasury market structure. As argued below, this is far from the case given the highly fragmented state of current regulatory design.

D. A Breakdown in Regulation

The regulatory framework to oversee the repo and secondary market for Treasuries is ill-equipped to respond to the vulnerabilities underlying their market structure. Supervisory approaches for repo and Treasuries markets are divided between a “securities” model on the one hand (for secondary trading) and a prudential one on the other (for repo). This leaves regulators unable to develop a consolidated approach to oversight that recognizes the interdependence between the repo financing market that relies on Treasuries collateral and secondary trading that needs Treasuries to be capable of being bought and sold to realize their value quickly, cheaply, and at fair prices.

The regulatory framework for secondary trading in Treasuries is institutionally fragmented without any overarching coordination mechanism to guide rulemaking and supervision.²³⁸ As detailed in Part I, unlike equities markets that, for example, fall primarily within the jurisdiction of a single regulator (the SEC), Treasuries lack a single lead overseer. Oversight is shared between at least five major bodies: the U.S. Treasury, NY Fed, the Fed, the SEC, and CFTC. FINRA also oversees securities broker-dealers and is instrumental in data collection from reporting firms after 2017.²³⁹

Fragmentation raises serious concerns in the context of an interconnected, internally conflicted repo and secondary trading market.²⁴⁰ First, information sharing becomes hobbled by institutional barriers and bureaucratic divergences in how information is collected and analyzed.²⁴¹ Consider the so-called “Flash Rally” in the Treasury secondary market. On October 15, 2014, the Treasury secondary market experienced around thirty minutes of aberrant, anomalous trading at the start of the trading day, characterized by prices surging to some of their highest historic levels for inexplicable reasons.²⁴² Eventually, prices reverted to normal but not without first sending capital markets into chaos and confusion.²⁴³ Regulators undertook a thoroughgoing, yearlong joint investigation into the event’s possible causes and implications.²⁴⁴ While the final report did not unearth any smoking gun, the investigation itself was illuminating. During the inquest, commentators singled out the legal and logistical difficulties experienced by different agencies in collecting and sharing information with one another.²⁴⁵ The CFTC, for example, required time to conclude an information-sharing agreement in order to forward its data to other regulators.²⁴⁶ The report further revealed that regulators lacked information to such a degree that they were shockingly unaware of major transformations underway in the Treasury market, specifically, the shift to high-speed, automated trading from a primary-dealer dominated, more analog interdealer market.²⁴⁷

In other words, fragmentation points to serious institutional challenges for regulators seeking to understand the interconnected machinations of the repo and secondary markets.²⁴⁸ Agencies may not feel comfortable or be permitted to share information on those they supervise.²⁴⁹ The 2017 trade reporting regime, instituted in the wake of the Flash Rally, requires that covered securities firms report their data to FINRA.²⁵⁰ Banks, on the other hand, have to provide reports of their trading to banking regulators, suggestive of the especially sensitive nature of bank exposures.²⁵¹ To harmonize the process, FINRA and the Fed have engaged in a yearslong dialogue on coordination of data collection, under which FINRA could acquire bank-reported data as an agent of the Fed.²⁵²

Different regulatory regimes mean that regulators each have varying amounts of information on those they supervise.²⁵³ Whereas primary dealer banks and broker-dealers are overseen by various dedicated banking and securities regulators, like the Fed, the SEC, and FINRA, other major participants like hedge funds are regulated on a much looser basis.²⁵⁴ By design, hedge funds fall under a lighter touch, more opaque regulatory regime with fewer disclosures.²⁵⁵ To be sure, post-2008 rulemaking does extend the regulatory perimeter to cover their activities more fully than before. Notably, bigger hedge funds must provide disclosures on various types of exposures in financial markets in a bid to help regulators map out their systemic footprint.²⁵⁶ But their overall regulatory scrutiny is generally far less intense than that faced by banks, broker-dealers, or mutual funds.²⁵⁷ Importantly, within the Treasuries market, hedge funds have generally fallen outside of the reporting obligation for their secondary market trades because they do not fall under the category of broker-dealers or banks.²⁵⁸ This is expected to change, at least for the most active hedge fund Treasuries traders, as the SEC’s new registration and reporting rules take effect. Given the enormity of their exposure to Treasuries and the potential scale and impact of their activities, their historic exclusion from reporting has left regulators without a valuable and essential repository of data. In addition to hedge funds, many high-speed automated traders also do not qualify as FINRA broker-dealers, though again, this may change for more active players as the SEC rulemaking takes effect.²⁵⁹ The erstwhile regulatory regime has nevertheless allowed many such firms to avoid direct reporting of trades in the interdealer market.

It is worth highlighting that even though the SEC has taken steps to bridge these gaps by passing new rules to encompass hedge funds and HFT firms within a registration and reporting regime, the rules’ chances of future success appear uncertain in view of industry resistance to reforms and potentially drawn-out court challenges.²⁶⁰ A 2019 supplement to the reporting regime requires Treasuries trading platforms to identify traders in records.²⁶¹ As such, regulators can get information on particular securities firms should they need it.²⁶² However, reporting to regulators is not direct, and they must absorb costs to get data on an ex post basis.²⁶³

Gaps in information and roadblocks to cooperation have limited the ability of regulators to share insights on the major risks to Treasury secondary and repo markets. And fragmentation in regulatory design and pockets of opacity are essentially fatal to the enterprise of constructing a picture of the vulnerabilities affecting intermediation and developing ex ante constraints to control the risks.²⁶⁴

In addition to fragmentation, Treasury repo and secondary markets also operate under systems of oversight that diverge in their methodological approaches. As detailed in Part II, the Fed and the NY Fed represent the prudential end of the regulatory spectrum. Focused on safety and soundness, a prudential approach treats the preservation of systemic stability as its critical mission. This can mean less emphasis on disclosure and transparency, for example, and more on ensuring that markets remain insulated from the risk of sudden runs and default on credit.²⁶⁵ By contrast, the SEC and FINRA represent quintessential securities markets regulators, offering deft expertise in building efficient and transparent trading markets and protecting investors.²⁶⁶ Instead of financial stability as the core guiding mission, securities market regulators nurture trading markets, underpinned by the dissemination of information, efficient price formation, and capital allocation.²⁶⁷ To be clear, these are generalizations. The SEC, for example, also focuses on financial stability (e.g., by regulating the stability of money market funds that constitute an essential part of the repo market).²⁶⁸ The Fed regularly engages with securities markets to ensure that the infrastructure, such as exchanges, is protected against collapse.²⁶⁹ These generalizations, however, aid in understanding key differences between the purposes and approaches of regulators tasked with overseeing secondary and repo markets for Treasuries.

This divergence can explain why regulators have failed to connect the shared risks facing Treasury repo and secondary markets, and to oversee both in a more consolidated way. Neither the prudential nor the securities-based model neatly fits the secondary or the repo market. For a start, the interdealer secondary market—a fairly classic securities market with heavy and liquid daily turnover—holds enormous systemic implications for the economy. If this market stops working, as it did in March 2020, a swath of economic actors cannot meet critical prudential needs. Concretely, reliance by public regulation on Treasuries’ liquidity (e.g., HQLA) ties the proper functioning of the interdealer market to the prudential survival of any number of financial firms and the larger system.

Yet the regulatory methods used to oversee interdealer Treasuries trading fit neither a prudential nor a capital markets paradigm and leave risks exposed. Trade-by-trade reporting is of recent vintage (2017)—and only for regulators. Public reporting is limited—with data released only in aggregate form. This reticence to widely disclose potentially sensitive Treasuries trades recognizes the systemic quality of the market. However, other regulatory aspects undermine this focus on curbing systemic risks. Perhaps most importantly, lightly regulated actors are afforded ample latitude to trade in secondary markets without having to report their activities. Hedge funds, especially, are a case in point. But high-speed securities trading firms are another. Now firmly dominant in the interdealer market, such high-speed trading firms have not fallen within the regulatory regime for broker-dealers. Therefore, they have not been subject to reporting rules (but see above for anticipated changes in response to new regulatory measures).²⁷⁰ Crucially, they have typically also been able to skirt other measures designed to address prudential risks—notably, capital requirements on broker-dealer firms that require safekeeping of rainy-day assets.²⁷¹ This leads to a possibility that highly influential traders have been transacting with only a thin base of capital, making them sensitive to losses and liable to exit rather than continue supplying liquidity especially during crisis.²⁷²

Similarly, the regulatory strategy for overseeing repos fails to account for the complex dynamic between Treasuries repo and the secondary market. As detailed in Part II, repo markets are overseen through a decidedly prudential lens. Capital buffers help safeguard against runs and collapse.²⁷³ Collateral plays a pivotal role in reducing default risk.²⁷⁴ Because of this collateral and the fear of runs, real-time detailed disclosure is limited.²⁷⁵ Yet despite this focus on safety and soundness, the workings of the repo market fail to account for the role of the secondary market in maintaining the repo market’s smooth workings.²⁷⁶ This interconnection exists for a number of reasons: (1) if secondary markets experience illiquidity, Treasuries’ prices can become unstable and distorted, impacting the viability of Treasuries as collateral; (2) repo lenders that wish to liquidate Treasuries will find themselves unable to do so in an illiquid secondary market; and (3) if primary dealers cannot buy and sell Treasuries in secondary markets, they may lack the ability to source cash and securities to fulfill repo lending. As seen in March 2020, for example, firms selling Treasuries en masse caused secondary trading to stall and badly disrupted securities prices.²⁷⁷ With the value of Treasuries directly tied to the viability of firm liquidity buffers, a lack of attention to the securities market undermines the functioning of the prudential one.

In summary, this Part shows that public and private regulation’s reliance on Treasuries is subject to a number of failures arising from a flawed system of intermediation. We show that the Treasury-backed repo and secondary trading markets are connected by a common intermediary: the primary dealer. In entrusting maintenance of the trading and repo markets to primary dealers, public and private regulation has failed to account for a number of costs that mean liquidity in both markets becomes tenuous. Primary dealers incur information and monitoring costs, navigate conflict between the needs of the repo versus the secondary market, and attend to their own private business preferences. These challenges are particularly dangerous owing to the needs of a financial regulatory system that puts Treasuries at the center both in public oversight and private self-regulation. In its second contribution, this Part argues that the regulatory framework for the repo and secondary markets is fragmented, inadequate, and insufficiently adaptive to provide consolidated supervision of a connected set of markets. The result is a Treasury market that is relied on for its resilience, but one whose foundations are poorly understood and subject to rapid erosion.

IV. PATHWAYS TO STABILITY

This Article shows that the Treasury market suffers from fragilities in intermediation that make it unstable and unreliable, casting doubt on the assumption used by regulation to place Treasuries at the center of financial stability. To begin remedying the structural deficiencies identified in this Article, we outline three proposals. Our focus lies in enabling public and private actors to strengthen the quality of liquidity and improve their understanding of the market’s risks ex ante.²⁷⁸ We recognize that if the Treasury market fails—as it did in March 2020 and in September 2019—it will be a near-certain recipient of ex post federal emergency assistance. Indeed, in January 2022, regulators announced the creation of a permanent standing facility to lend securities and cash to repo market participants when the need arises.²⁷⁹ Our focus is on taking first steps to develop strong ex ante mechanisms to improve information flows, enhance liquidity, and ensure that primary dealers are well supervised even within a highly fragmented regulatory framework. We suggest (1) developing greater transparency and information sharing in repo and Treasuries trading markets, (2) encouraging major liquidity suppliers—both primary dealers and key HFT traders—to invest in maintaining the liquidity of the market, even in times of distress, and (3) bringing greater consolidation and coordination to the regulatory framework, and requiring regulators to link supervision of the Treasury secondary markets and the Treasuries-backed repo markets more systematically.²⁸⁰

A. Transparency and Prudential Safety

As detailed in Part III, information gaps are endemic within the repo and secondary markets for participants as well as regulators. These gaps obscure an understanding of how repo and secondary trading intersect and what risks are created by dint of this connection. Reform must begin by developing reliable mechanisms for improving transparency and information flows as a first step toward empowering regulators and market participants.

Information gaps are deeply embedded throughout the Treasury market, in both the secondary trading and repo markets. Reforms in 2017 and 2019 have brought reporting to secondary markets. But it is limited by significant gaps in coverage (for example, excluding hedge funds and high-speed securities firms). Public, real-time transparency is restricted. Repo markets, opaque by nature, lack systematic, up-to-date reporting.²⁸¹ This allows private parties to avoid thorough due diligence. But it is far from obvious that it is protective in all cases. Collateral reuse creates opaque chains that instill a potentially false sense of confidence in which multiple parties all count on owning a single security. Further, opacity has costs even if transparency also comes with downsides. Lacking information and not knowing with which firms the problems lie, market participants may overreact during crises.²⁸² Opacity also constrains how flexibly primary dealers manage inventories and respond to the behavior of a variety of clients as well as unknown dealers that are also active in supplying liquidity.

As a first matter, we propose increasing the information and scope of reporting available to regulators and market participants.²⁸³ This applies to both the repo and the secondary market. For the repo market, this represents a paradigm shift in approach. However, we believe that it offers a much-needed lever for those in the market to take steps to assess supply and demand of Treasuries/cash more precisely. It also lowers the cost of public surveillance. Regulators remain stymied in their ability to capture real-time data on repo exposures, particularly for bilateral exposures. As Victoria Baklanova writes, supervisors are left to the grind of painstaking and patchy data collection practices that require them to piece together information from weekly or quarterly mandatory disclosures, on-the-ground examinations, or informal reporting by financial firms.²⁸⁴ Such data collection is costly, quickly out-of-date, and imposes analytical costs on account of its lack of standardization and comprehensive coverage.²⁸⁵ To be sure, since late 2019, this situation has been improving. Regulators have ramped up data collection in cleared segments of the repo market. In May 2024, the Office of Financial Research approved a new rule to enhance reporting and data collection in the bilateral repo market.²⁸⁶ But market-wide, real-time data gathering remains elusive for now, and the outcome of future efforts remains uncertain.²⁸⁷

Reporting in this market has a number of benefits. It requires dealers to develop mechanisms to record their repo trades on a real-time, granular basis ex ante, to track the collateral that attaches to a particular repo, and to determine whether collateral attaching to it might be subject to reuse. Reporting can help create systematization in relation to capturing exposures and discipline about understanding the robustness of the collateral. In addition, it can create incentives for primary dealers to be more diligent with respect to understanding how collateral is sourced, whether it might be subject to reuse, how many times, and what the risks of such reuse might be during a period of distress. Importantly, we believe that such information ought to be shared regularly between regulators and the primary dealers (at least) as a group. Key intermediaries ought to develop mechanisms whereby they circulate insights about their repo exposures to one another on a regular basis with the goal of understanding collective exposures, the robustness of collateralization, and the potential market availability of cash and securities in case of need. This allows market participants to share emerging concerns and prepare for problems, and it allows regulators to be ready to deal with any fallout.

Invariably, there will be pushback on a proposal to create transparency in prudential spaces. It goes against the grain of conventional wisdom in regulating prudential risks.²⁸⁸ But, despite attachment to the status quo, regulators have begun to soften their stance on keeping utmost secrecy in banking and prudential areas. For example, in banking, regulators are now increasingly revealing some of the results of bank stress tests.²⁸⁹ Even in the repo market, some public reporting has emerged for its cleared segments.²⁹⁰ While far from full transparency, this easing of traditional fetters against disclosure in banking can hint at potential openness to real-time reporting and information sharing. In addition, regulators and market participants might also balk at the cost of enabling Treasury market transparency given the interconnected complexities of the repo market and its daily size. There is also the ever-present concern that too much disclosure could result in triggering the exact externalities that everyone seeks to avoid—a run that results in a catastrophic drain on the market’s liquidity and forces regulators to have to step in and stop the bleed.

Nevertheless, such concerns are not insurmountable, and while downsides exist, the costs embedded in the status quo are also high. Importantly, the repo market is not hermetically sealed. It does allow for some pockets of reporting (albeit not to the public). In particular, the tri-party repo market—which relies on a formal system of clearing and settlement—allows for greater reporting, collateral tracking, and unraveling of the complexity inherent in trades.²⁹¹ Stated differently, the market is amenable to systematization if parties so choose. In addition, the costs of recording trades, and tracking and reporting collateral, should not be prohibitively daunting. Primary dealers and others already perform risk management as individual firms, though not on a standardized basis.²⁹² Notably, regulators routinely look to dealers’ self-reporting of their activities as a means of gaining insights about the market. Surveys are commonplace as part of public efforts to study the ins and outs of the bilateral marketplace from those that inhabit it most closely.²⁹³ Moreover, a real-time data repository for the bilateral repo market would save both market participants and regulators from having to perform expensive data collection, analysis, and extrapolation of the possible state of the market on a given day. Rather than guesstimates, parties could rely on a more standard and reliable reserve of information from which to understand an already complex market.

Perhaps most importantly, the centrality of Treasuries to stability means that opacity presents an incalculably high cost in which the market suffers on account of being poorly understood and inadequately protected. As made clear in March 2020, the dislocation in the market cast a pall of doubt among market participants about the resilience of Treasuries during a global crisis.²⁹⁴ Seen from this perspective, failure to understand the market and its dynamics carries not just financial costs, but also implies larger damage from the standpoint of political economy. Finally, to avoid the potential for sudden runs (transparency, it should be noted, may also avoid runs if dealers and others better understand their exposures), data circulation around the market may be staggered and delayed.²⁹⁵ For example, the repository would provide data to a closed loop of recipients (potentially the major dealers) and do so with a delay (perhaps circulating information at intervals during the day, or perhaps at the end of each day). In other words, while transparency and reporting may appear daunting at first glance, there are ways of structuring it that can allow for some aggregation and promote a careful, calibrated approach to information consumption, collation, and analysis.

B. Increasing Resilience in Intermediation

A lack of information can fuel a race to the exit by intermediaries, resulting in liquidity draining quickly and causing distress for investors as well as firms needing to fund themselves. More reporting can provide clarity to dealers when it comes to pricing their risks. But it leaves open the possibility that they exit the market at even small signs of trouble. To ensure dealer engagement in maintaining Treasury market resilience, we suggest exploring tools to incentivize market makers to assume an affirmatively active role during periods of crisis—especially in the secondary market. As highlighted by the events of March 2020, the secondary market can face enormous strain during crisis as investors rush in to transact in Treasuries. Resilience here—when the market does not buckle under stress—helps ensure that Treasuries can perform their regulatory role as safety buffers. For the secondary market, such a duty would cover both primary dealers and high-speed securities firms. Both types of dealers are vulnerable to exiting the market rapidly, causing a decline in available liquidity and distortion in prices.²⁹⁶

An affirmative duty on key dealers to remain trading can help to build more certainty around liquidity provision in the Treasury market. To be sure, regulators face a trade-off in introducing such affirmative duties. Imposing higher transaction costs on traders can discourage them from entering the market or encourage them to pass the costs of liquidity onto investors that use the Treasury market. On the other hand, affirmative duties can be beneficial, especially given the importance of securing liquidity for Treasuries. If dealers can simply leave depending on their own preferences, they will be likely to exit exactly when the Treasury market is most necessary—during a crisis. Investors may come to regard the perception of its fail-safe liquidity as illusory, primed to dry up whenever danger strikes.

Historically, regulators have not required primary dealers to keep the market going in a crisis—perhaps assuming that they would do so anyway. Given their long-assumed dominance, perhaps this assumption could be justified. But it cannot hold now. As detailed in this Article, the Treasury market as a whole is facing new pressures, created by heavy dependence by public and private regulation on its services. In addition, the arrival of high-speed trading firms as well as nontraditional types of repo intermediary (for example, hedge funds) add to the pressures facing primary dealers and can contribute to how they navigate private decisions about continuing to provide liquidity.

An affirmative duty to maintain market liquidity can offer greater certainty that dealers make a real effort to stay, rather than simply exiting the market. Such a duty would place an affirmative obligation on traders to remain, particularly during crises.²⁹⁷

A duty to remain trading—applied to the most active dealers (primary dealers and high-speed securities firms alike)—can help to preserve price continuity and assure investors of resilient liquidity. Most importantly, it can motivate those charged with staying to play a role in monitoring and safeguarding operations through ex ante private oversight. Those that must trade, under a new duty, in all conditions ultimately bear the costs of any fallout from disruptive trading strategies or other traders that are creating outsize risks for others. This duty ought to prompt dealers to pay attention to the quality, sophistication, and reliability of their trading behavior. In this way, a duty to remain changes the trade-off governing the behavior of large Treasuries traders. Faced with the prospect of bearing potentially heavy losses if the market goes awry, taking risks starts to look more costly. Currently, easy exit and light-touch regulation have made taking careless or deliberate risks a low-cost option.²⁹⁸

To be clear, an affirmative duty to remain is not an unlimited one, forcing firms to comply to the point of making themselves insolvent, fighting enormous fires in Treasuries at the cost of their own existence. For example, some nontraditional dealers like HFT traders tend to be smaller and less capitalized.²⁹⁹ They cannot be expected to deplete their entire balance sheet to remain trading. Bigger bank dealers will have a more intensive obligation owing to their capacity to remain trading longer. That being said, a duty to remain is also likely to require otherwise thinly capitalized firms to develop deeper capital buffers in readiness. Those subject to a duty will be the most active traders. Ensuring that they are better buffered provides assurance that those charged with maintaining Treasuries intermediation have the capacity to do so.

Practically, firms are likely to resist such a duty. If they have to pay large sums to selflessly protect the market, they will rationally demand a large ex ante price from the U.S. Treasury for their commitment. And regulators might consider how they ought to compensate traders that become subject to this duty. For example, one option might be to afford them special access to information on Treasuries trading order flow and repo operations as a way to help them to calibrate their risk. As above, this can lower the costs of monitoring and also help dealers to modulate their supplies of Treasuries and cash for secondary trading.

Importantly, this idea is not new to markets—earlier eras once demanded that a designated group of traders withstand losses to protect markets in times of stress. Those that earned the designation also enjoyed certain privileges and status as a result.³⁰⁰ While such a duty may not be critical in other markets in which liquidity provision is voluntary, introducing it for Treasuries is more than justified given the crucial importance of ensuring trading continuity in Treasuries in crises. During a crisis, affirmative liquidity provision would clearly provide greater assurance of resilience. Such a duty also forces dealers to more fully confront the responsibility that comes with transacting in securities whose workings possess near existential significance for the global economy, further helping to strengthen market integrity.

C. Fixing the Breaks in Regulation

Developing better information flows and ensuring the market’s resilience must also be accompanied by reform at the level of public oversight.³⁰¹ This Article points to the need to develop a coordinated and hybrid approach to oversight for the Treasury market that is capable of overcoming tension between different regulatory philosophies (prudential versus market based). Remedying the ill effects of institutional fragmentation is necessary as a condition precedent to more fully understanding how the market works, identifying the risks and producing a set of rules that can mitigate structural vulnerabilities at the intersection of the repo and secondary markets.

As argued in this Article, the need for coordination between regulators takes on urgency in light of the interlinkages connecting Treasury repo and secondary markets. With a common set of intermediaries—the primary dealers—Treasuries-backed repo and the secondary markets depend on one another for each to be able to fulfill its respective mission. A patchwork system of oversight makes little sense within an ecosystem in which the trading of high-speed securities firms impacts the liquidity of collateral propping up the four-trillion-dollar Treasuries-backed repo market; or where the enormous collateral and cash needs of the repo market put the resiliency of the Treasury trading markets in jeopardy. Rules to simply govern one or the other market by itself are not enough. Rather, this Article makes clear that the Treasury market exists as a whole, underpinned by the trading and funding mechanisms working together to deliver what is universally recognized as the lynchpin of the world’s financial order.

A near-term fix to the problem of regulatory fragmentation and ad hocism lies in expressly making the FSOC a coordinating supervisory agency for Treasury and repo markets.³⁰² Created by post-2008 rulemaking, the FSOC is designed to create a layer of consolidation over the patchwork of U.S. financial regulators. With the 2008 Financial Crisis showcasing systemic interconnections in financial markets, the FSOC’s creation offers an administrative response to the risks of agencies working just on single parts of an otherwise entangled system.³⁰³ Requiring the FSOC to bring multiple regulators together provides a way to ensure that regulators share information, develop a plan for reform, scrutinize and debate their own supervisory methodologies, and arrive at a mode of overseeing Treasuries that recognizes the linkages between trading and repo markets. As shown in this Article, this means developing a more hybrid regulatory strategy that is capable of moving beyond blunt prudential versus securities market approaches.

Introduction of the FSOC as a coordinating regulator for Treasuries is only a first step toward creating a governance model for public oversight. Even with the FSOC, agencies may struggle to work together. They may fail to share data or coordinate. Opacity may hamper attempts to understand how risks move between repo and Treasuries trading. In the absence of a strong system of supervision, the Treasury market may well just be left to depend on the Fed’s ex post interventions in a crisis. But disruptions to U.S. Treasury secondary and Treasury-backed repo markets (for example, in March 2020) show that the current fragmentation and disorganization among regulators are untenable and harmful. Coordination through the FSOC begins a process of deeper institutional reform.³⁰⁴ Further, systematized transparency (for example, through disclosure and reporting) offers a way to help bridge the difficulties faced in developing a hybrid approach to overseeing the interlinkages between Treasuries-backed repo and Treasuries trading markets. With all regulators able to share in data from both repo markets and Treasuries, understanding interdependencies should become practicable. Information—in addition to saving collection costs and bridging institutional hurdles to communication—can foster collective focus on a connected marketplace. This approach of co-opting banking and securities regulators and ensuring greater coordination through the FSOC offers a way out of the bifurcated approaches that treat repo and trading markets as basically distinct and subject to separate modes of scrutiny.

This Part proposes a three-part solution to place Treasury markets on a stronger footing to better withstand the weight this market carries for the financial system and the economy. It proposes first developing stronger information flows to increase reporting and transparency, affording primary dealers greater ease in monitoring exposures as well as giving regulators a clearer idea about the market’s structural weaknesses in real time. In addition, an affirmative obligation on major dealers to remain trading creates confidence in the resiliency of liquidity across the marketplace, especially during crises. Finally, we advocate for regulatory oversight that can bridge fragmentation and offer a more consolidated, coordinated system of supervision. The FSOC provides a convening authority. But, looking forward, fixing the fractures in regulation would help to ensure that the Treasury market’s overseers are well positioned to match the realities of its critical importance to financial market stability.

CONCLUSION

Financial stability rests on a central idea that Treasuries represent a bulwark against distress, representing the foremost risk-free asset anywhere on the globe. Free of default risk and trading in a market with supposedly plentiful liquidity, Treasuries anchor public and private regulation, which assume that Treasuries will protect firms and markets from collapse. In this Article, we show why this assumption is incorrect. While Treasuries themselves are viewed as risk-free, the market that distributes them is not. It is pervasively subject to flawed intermediation. Importantly, the demands of public and private industry regulation are internally in conflict, crystallizing the harms of faulty intermediation. Despite their importance, these risks in the secondary and repo markets remain undertheorized and poorly understood, leaving Treasuries perpetually at risk of failing to perform their protective role. Without real reform, the first steps toward which we outline here, we worry that Treasuries cannot live up to their reputation, undermining their promise for regulation as the anchor of financial system stability.

APPENDIX: FIGURES

Figure 1.A. Bilateral Repo Market Collateral Outstanding ($ Trillions)

Sources: SIFMA Rsch., supra note 26 (providing figures for the bilateral repo and reverse repo markets through 2021); Sec. Indus. & Fin. Mkts. Assoc., supra note 21 (providing data through July 2024).

Figure 1.B. Tri-Party Repo Market Collateral Outstanding ($ Billions)

Source: Fed. Rsrv. Bank of N.Y., supra note 26 (select “Total” and “US Treasuries excluding Strips”).

Figure 2. Primary Dealer Repos and Reverse Repos Outstanding ($ Billions)

Source: Fed. Rsrv. Bank of N.Y., supra note 209 (providing raw data on trades and positions of primary dealers).

Figure 3.A. Secondary Market Trading of Primary Dealers: Daily Inventory Risk Exposure ($ Billions)

Source: Fed. Rsrv. Bank of N.Y., supra note 209 (providing raw data on trades and positions of primary dealers).

Figure 3.B. Secondary Market Trading of Primary Dealers: Daily Trading Volume ($ Billions)

Source: Fed. Rsrv. Bank of N.Y., supra note 209 (providing raw data on trades and positions of primary dealers).

Figure 4. Daily Changes in Primary Dealer Treasury Holdings: Repo Market vs. Treasury Secondary Market ($ Billions)

Source: Fed. Rsrv. Bank of N.Y., supra note 209 (providing raw data on trades and positions of primary dealers).

97 S. Cal. L. Rev. 1349

Pradeep K. Yadav*

* W. Ross Johnston Chair and Professor of Finance, University of Oklahoma.

Yesha Yadav†

† Professor of Law and Milton R. Underwood Chair, Associate Dean & Robert Belton Director of Diversity, Equity and Community, Vanderbilt Law School. We are deeply grateful for thoughtful comments, insights, and conversations. We thank Dan Awrey, Jonathan Brogaard, Peter Conti-Brown, Chris Brummer, Nakita Cuttino, Anna Gelpern, Evan Gerhard, Patricia McCoy, Mitu Gulati, Rory van Loo, Michael Kang, Lev Menand, Robert Rasmussen, Morgan Ricks, Paolo Saguato, Nadav Shoked, Danny Sokol, Kumar Venkataraman, Jialan Wang, Mark Weidemeier and participants at the Wharton School of Business Conference on Financial Regulation, the Vanderbilt Law School Annual Conference on Central Banking and Financial Regulation and faculty workshops at the University of Illinois Business School, Northwestern Law School, and USC Gould School of Law. Runzu Wang and Doris Zhou at the University of Oklahoma provided excellent research assistance.

War and Coercion

September 2024

Marcela Prieto Rudolphy*

Compelled service in hostile forces is prohibited by International Humanitarian Law. In the context of an international armed conflict, it is a war crime to compel prisoners of war (“POWs”) or other protected persons to serve in the forces of a hostile power and to compel participation in military operations against the person’s own country or forces. However, conscription—or compelled service in military forces—of a state’s own citizens is not prohibited under international law. In fact, conscription, some aspects of which are regulated by International Human Rights Law, is generally legitimate.

This asymmetry—whereby compelling protected persons to fight or serve in the forces of a hostile power is a war crime, but compelling one’s own citizens is not—has puzzling implications. Take the example of Russia’s invasion of Ukraine. It is a war crime for Ukraine to compel Russian POWs to fight on behalf of Ukraine, even though Ukraine is fighting a lawful war of self-defense. Yet, it is not a war crime for Russia to compel its own citizens to fight, even though Russia is fighting an unlawful war of aggression.

Can we make moral sense of this asymmetric regime regarding compelled service in armed forces? Is the regime morally coherent? In order to make moral sense of the regime, two arguments must succeed. First, we must argue that it matters greatly whether individuals are compelled to fight in hostile forces or in the armed forces of their own state. Second, we must argue that the nature of the war they are compelled to serve in—whether the war is legal or illegal—does not matter at all.

This Article argues that the second argument cannot but fail. It is possible, however, to argue that compelled service in hostile forces is morally wrong and often morally worse than compelled service in the armed forces of one’s own state. It is morally worse because it is morally worse to harm those who are vulnerable and defenseless, like those who have fallen into the hands of a party to the conflict. And it is morally wrong because noncitizens lack duties to fight on behalf of other states. However, what makes compelled service in hostile forces morally wrong also makes conscription morally wrong. That is, what is wrong about compelled service in hostile forces is also present in the state’s conscription of its own citizens.

This Article thus argues that the current regime concerning compelled service in armed forces is, in fact, morally incoherent. To render the regime morally coherent, international law should (1) appropriately distinguish between conscription to serve in legal wars and conscription to serve in illegal wars, and (2) generally prohibit compelled service in armed forces.

INTRODUCTION

Russia has been conscripting men from occupied Crimea to serve in Russian armed forces for several years. During the ninth conscription campaign, which ended in June 2019, at least 3,300 men from Crimea had been enlisted, bringing the total of forcibly conscripted men to at least 18,000.¹

Conscription in occupied Crimea was still ongoing in 2019.² By 2022, The Guardian reported that men in the Donbas region were being forcibly conscripted to serve in the armed forces of the self-declared Donetsk People’s Republic and Luhansk People’s Republic.³ At the same time, Russia has been conscripting its own citizens to fight in Ukraine.⁴

In times of war, conscription of individuals in occupied territory and the state’s conscription of its own citizens share an important feature. They both involve a severe restriction on the freedom of individuals who are called on to fight, and possibly to kill and die, on behalf of the state. However, although they share this important feature, international law treats them differently. In fact, one might say that there is an asymmetry in how international law treats compelled service (or conscription) to serve in armed forces. Conscription of protected persons to serve in hostile forces, when done by an occupying or detaining power, is a war crime under International Humanitarian Law (“IHL”).⁵ Conscription of the state’s own citizens to fight a war is, however, not only not a war crime, but is also recognized by international law as the state’s prerogative.⁶

Perhaps surprisingly, this asymmetry has received almost no attention in international legal scholarship. Perhaps even more surprisingly, the crime of compelled service in hostile forces and the ethics of conscription have also received very little attention, both in international legal scholarship and political theory. Yet the asymmetry regarding compelled service in armed forces (or the asymmetry regarding conscription) present in international law has some puzzling implications.

Take the example of Russia’s invasion of Ukraine. Russia’s conscription of its own citizens to fight its unlawful war of aggression is permitted by international law.⁷ By contrast, it is a war crime for Ukraine to compel Russian prisoners of war (“POWs”) to fight on behalf of Ukraine, even though Ukraine is fighting a lawful war of self-defense. Even more so, if Russian citizens voluntarily decided to join Ukrainian armed forces to fight against Russian aggression—and some of them have⁸—they are not protected by international law and could be prosecuted by Russia. In fact, in September 2022, Russia toughened penalties for voluntary surrender to enemy forces, desertion, and refusal to fight, with sentences of up to ten years in prison.⁹

International law thus distinguishes between compelled service in hostile forces and compelled service in the armed forces of one’s own state—prohibiting the former but allowing the latter—but fails to distinguish between compelled service in legal wars and compelled service in illegal wars. The distinction drawn by international law suggests that there is a normatively relevant difference between compelled service in hostile forces and compelled service by one’s own country—a difference significant enough to permit the latter and make the former a war crime. And it also suggests that there is no normatively relevant difference based on whether the wars one is forced to fight are legal or illegal.

This asymmetry demands a justification. We ought to try to make sense—moral sense—of the international legal regime on compelled service in armed forces. This is not the same as attempting to explain why the regime is the way it is. That explanation might be historical in character if, for example, states could agree regarding the prohibition on compelled service in hostile forces but could not agree—or did not want to agree—regarding the state’s conscription of its own citizens or regarding the relevance of whether the wars individuals are compelled to serve in are legal or illegal. And that historical explanation might provide a moral justification for the adoption of the prohibition on compelled service in hostile forces.¹⁰ For example, if we think compelled service in armed forces is always wrong and should be prohibited, but states could only agree to prohibit compelled service in hostile forces, we might argue that it is better to prohibit one morally wrong behavior than to prohibit nothing at all.

However, the question this Article is concerned with is not a question about the historical explanation of the current international regime on conscription, nor a question about whether we can justify its adoption. It is a question about its content. Can we make moral sense of this asymmetric regime regarding compelled service in armed forces during times of war? Is the regime morally coherent?

This Article thus brings moral and political philosophy to bear on international law.¹¹ It is concerned with the relationship between international law and morality and, in particular, with the question of how law ought to be if it wishes to be morally coherent. By moral coherence, I am referring to the idea that a legal regime (like the regime on conscription) should be complete; that is, it should equally prohibit behaviors that are similarly wrongful instead of failing to prohibit things that are as, or more, wrongful than the behaviors it already prohibits.¹² And a legal regime should also “make moral sense,” that is, it should be morally intelligible, in the sense that it does not fail to take into account important moral reasons in favor of and against prohibiting certain behaviors.

In order to make moral sense of the international legal regime on conscription, two arguments must succeed. First, we must argue that it matters greatly whether individuals are compelled to fight in hostile forces or in the armed forces of their own state. Second, we must argue that the nature of the war that individuals are compelled to serve in—whether the war is legal or illegal—does not matter at all.

These are difficult arguments to make. This Article will argue that the second argument is impossible to make; it cannot but fail. Whether individuals are forced to fight legal or illegal wars is morally significant and should be accounted for. The first argument is more plausible: it is possible to show that compelled service in hostile forces is often morally worse than compelled service in the armed forces of one’s own state and that compelled service in hostile forces itself is morally wrong. It is morally worse because it is morally worse to harm those who are vulnerable and defenseless, like those who have fallen into the hands of a party to the conflict, and because it is morally worse to be coerced to fight against those we care about. And it is morally wrong because noncitizens lack duties to fight on behalf of other states.

However, these arguments fail to support the current regime. This is so because the fact that compelled service in hostile forces is morally worse than the state’s conscription of its own citizens cannot show, on its own, that the latter is morally permissible. The fact that something is morally worse than something else says nothing about whether what is morally better is permitted. Thus, showing that compelled service in hostile forces is often morally worse than the state’s conscription of its own citizens cannot explain why compelled service in hostile forces is prohibited, but the state’s conscription of its own citizens is allowed. And the second argument, which shows that compelled service in hostile forces is morally wrong, also fails to explain why conscription of a state’s own citizens is morally permissible. This is so because citizens often lack duties toward their own states to kill and die on its behalf. That is, what is wrong about compelled service in hostile forces is also present in the state’s conscription of its own citizens.

This Article then makes three claims, which are related but logically independent of each other. First, it is morally wrong to conscript individuals to fight wars of aggression, regardless of whether the citizen’s state or hostile forces do so. Second, compelled service in hostile forces is often morally worse than the state’s conscription of its own citizens and is also morally wrong. Third, conscription by the state, even to fight legal wars, is also often morally wrong. This Article thus concludes that the current regime concerning compelled service in armed forces is, in fact, morally incoherent.¹³ It is morally incoherent because it is incomplete and cannot be made sense of.¹⁴ It is incomplete because it fails to criminalize or prohibit conduct (the state’s conscription of its own citizens) that is similarly wrong when compared to what it already criminalizes (compelled service in hostile forces). And it cannot be made sense of because it fails to distinguish between legal and illegal wars, so that states are allowed to compel their own citizens to fight illegal wars.

The fact that the regime on compelled service is morally incoherent does not mean that the compelled service in hostile forces prohibition (“CSHF prohibition”) lacks value or is morally misguided. The CSHF prohibition, this Article will argue, protects fundamental rights and interests. However, it is incomplete. It is that incompleteness that makes the current regime incoherent. To render the regime morally coherent, international law should (1) appropriately distinguish between conscription to serve in legal wars and conscription to serve in illegal wars, and (2) generally prohibit compelled service in war. The latter might seem entirely utopian. It might also make it harder for states to fight wars—possibly even lawful ones. But lawful (and just) wars are the exception, and conscription to fight in war imposes a severe restriction on individuals’ freedom. Even if international law never comes to prohibit the state’s conscription of its own citizens, there are powerful moral reasons for doing so. The fact that such a prohibition is utopian is not a (moral) reason against it.¹⁵

The remainder of this Article proceeds as follows. Part I gives a brief overview of the international regime on compelled service in armed forces, distinguishing between compelled service in hostile forces and the state’s conscription of its own citizens. Part II argues that the regime’s failure to distinguish between conscription to serve in legal wars and conscription to serve in illegal wars cannot be defended. At the very least, international law should prohibit states from conscripting their own citizens to fight wars of aggression, and the illegal nature of the war should be an additional aspect of the crime of compelled service in hostile forces. Part III tries to make sense of the fact that international law considers compelled service in hostile forces to be significantly worse than the state’s conscription of its own citizens. It argues that although there is a plausible case for why compelled service in hostile forces is more wrongful, or morally worse, than the state’s conscription of its own citizens, this argument cannot support the latter’s permissibility. And some of the arguments that show why compelled service in hostile forces is morally wrong put into question the permissibility of conscription generally. Part IV discusses what it would take for the international legal regime on conscription to be morally coherent. It argues that (1) international law should make it a war crime for states to conscript their own citizens to fight in illegal wars; (2) the unlawfulness of the war in compelled service in hostile forces should be an additional aspect of the crime; and (3) there is a pro tanto reason for international law to generally prohibit conscription to fight in war.

I. THE INTERNATIONAL LEGAL REGIME ON CONSCRIPTION

A. Compelled Service in Hostile Forces

Compelled service in hostile forces was not always prohibited and was, in fact, a common practice across different cultures.¹⁶ POWs—that is, those combatants who had been captured, due to surrender or injury, by enemy powers—were thought to have forfeited their lives by surrender or capture, and, in practice, they were often required to join the forces of their captors.¹⁷

The CSHF prohibition was introduced into treaty law with the Hague Regulations of 1899. Article 44 of the Hague Convention of 1899 prohibits the compulsion of the population of occupied territory to take part in military operations against their own country.¹⁸ The Hague Convention of 1907, Article 23(h), prohibits compelling the nationals of the hostile party to “take part in the operations of war directed against their own country, even if they were in the belligerent’s service before the commencement of war.”¹⁹ This provision is limited to nationals of the hostile party.²⁰

Later, the Geneva Conventions also included the CSHF prohibition. Under the Fourth Geneva Convention, it is a grave breach for an occupying power to compel protected persons to serve in its armed or auxiliary forces.²¹ Protected persons are “those who at a given moment and in any manner whatsoever, find themselves, in case of a conflict or occupation, in the hands of a Party to the conflict or Occupying Power of which they are not nationals.”²² And Article 147 states that it is a grave breach to compel “a protected person to serve in the forces of a hostile Power,”²³ which is not limited to situations of occupation, but applies generally in the context of international armed conflicts (“IACs”).²⁴ The Third Geneva Convention, relative to the “Treatment of Prisoners of War,” provides that compelling a POW or a protected person to serve in the forces of a hostile power is a grave breach of the conventions.²⁵ Note, however, that while compelled service in hostile forces is a war crime, enlistment that is the result of pressure or propaganda is a violation of the Fourth Geneva Convention, but not a war crime.²⁶

The CSHF prohibition is also contained in the statutes of the International Criminal Tribunal for the Former Yugoslavia (“ICTY”) and the International Criminal Court (“ICC”).

The statute of the ICTY expressly included “compelling a prisoner of war or a civilian to serve in the forces of a hostile power” as part of the grave breaches of the 1949 Geneva Conventions under its jurisdiction.²⁷ However, there were no convictions that relied exclusively on a violation of this provision, and no indictments for compelling POWs or civilians to serve in the forces of a hostile power.²⁸

The Rome Statute for the ICC also includes the prohibition in question. It distinguishes four categories of war crimes²⁹: (1) grave breaches of the Geneva Conventions in the context of IACs;³⁰ (2) other serious violations of IHL contained in the Hague Conventions, Additional Protocol I to the Geneva Conventions, and the 1925 Geneva Gas Protocol;³¹ (3) serious violations of Article 3 common to the Geneva Conventions in the context of non-international armed conflicts (“NIACs”);³² and (4) other violations of IHL in the context of NIACs.³³

The selection of war crimes to be ultimately included in the Rome Statute was based on two considerations: first, the norm should be part of customary international law (“CIL”); and second, the violation of the norm should give rise to individual criminal responsibility under CIL.³⁴

The crime of forced service in hostile forces is enshrined in Articles 8(2)(a)(v) and 8(2)(b)(xv) of the Rome Statute. The former states that it is a war crime to compel a POW or other protected persons to serve in the forces of a hostile power in the context of an IAC.³⁵ The latter makes it a war crime to compel participation in military operations against a person’s own country or forces in the context of an IAC.³⁶ The expression “forces” should be given a broad interpretation, and forced service is prohibited not only regarding forces hostile to the individual’s own country, but also regarding allied countries and forces.³⁷

The first provision—Article 8(2)(a)(v)—effectively combined the language of the Geneva Conventions with Article 23 of the 1907 Hague Regulations.³⁸ The second one is based solely on Article 23 of the 1907 Hague Regulations.³⁹

Finally, the CSHF prohibition has now crystallized into CIL, at least in the context of IACs. The International Committee of the Red Cross’s (“ICRC”) statement on CIL includes, in Rule 95, the prohibition on uncompensated or abusive forced labor, and it specifies that compelling persons to serve in the forces of a hostile power is a specific type of forced labor that is prohibited in IACs.⁴⁰

Many countries incorporate similar prohibitions in their military manuals and criminal codes.⁴¹ And, in 2005, the Israeli Supreme Court found that the IDF’s “Early Warning” procedure was at odds with international law, in part because it ran afoul of “a basic principle, which passes as a common thread running through all of the law of belligerent occupation,” consisting of “the prohibition of use of protected residents as part of the war effort of the occupying army.”⁴² The “Early Warning” procedure stipulated that IDF soldiers who wished to arrest a Palestinian suspected of terrorist activity may be assisted by a local Palestinian resident, who would warn the arrestee of possible harm to themselves or those present when the arrest took place.⁴³ The procedure could only be used when it posed no risk to the Palestinian resident, and the latter consented to it,⁴⁴ but the Court found that, given the inequality between the occupying force and the local resident, consent was unlikely to be real.⁴⁵

B. Conscription of the State’s Own Citizens

Although forced labor is prohibited under International Human Rights Law, conscription is not treated as an instance of it.⁴⁶ Conscription—or compelled service in military forces—of a state’s own citizens is not prohibited under international law, except in the case of children, in which case it is a war crime.⁴⁷

Conscription is well accepted in international law regarding a state’s own citizens. Voluntary enlistment and service in foreign forces is also not a violation of international law.⁴⁸ At the Hague Conference in 1907, the German delegation wanted to incorporate an article stating that belligerent parties could not ask neutral persons to render them war services, even if it was voluntary, which was supported by the United States.⁴⁹ The proposal did not succeed. More recently, in 2022, Russia issued a law that facilitates the attainment of Russian citizenship for foreigners who voluntarily enlist in the Russian army for at least a year.⁵⁰

Regarding the conscription of noncitizens with permanent residence, the law is less settled. In fact, the United States drafted permanent residents at least during the Korean and Vietnam Wars. In both, the United States required military service of every male noncitizen admitted to permanent residency and actually residing in the United States for more than a year.⁵¹ Often, the United States conscripted resident foreigners unless they agreed to forfeit future claims to citizenship.⁵²

During the course of the two World Wars, the main Allied belligerents also conscripted nationals of other states, much to the protest of the states in question.⁵³ Opposition was based on the general principles of international law, but more often, it was based on treaties that ensured states would not conscript each other’s nationals.⁵⁴ Sometimes, treaties were signed with the opposite purpose: in the course of World War II, the Allies entered into treaties with the United States to ensure that nationals of the Allies residing in the United States would serve in either the forces of the United States or of their own countries.⁵⁵

Thus, it is fairly clear that conscription of resident noncitizens is not a war crime under international law, even though it might be prohibited on account of bilateral international treaties. But it is unclear whether a prohibition on conscripting resident noncitizens has crystallized in CIL or whether it remains a rule of comity, as suggested in the 1970s by Frank Upham and Charles E. Roh, Jr.⁵⁶

Regarding a state’s own citizens, excepting children, conscription is well accepted by international law, both as a general practice in times of peace and as a practice in times of war. The practice of conscription itself has an old history, and after the two World Wars, it remained the norm in many countries.⁵⁷ With the end of the Cold War, the debate about universal military conscription regained force, and at the beginning of the 1990s, France, the Netherlands, and Belgium abandoned the system of conscription, the universal draft, or both.⁵⁸ In the process of European reintegration, military conscription largely disappeared.⁵⁹ Belfer suggests that this has to do with the fact that security identities in post-Cold War Europe are increasingly forged by cosmopolitan values.⁶⁰

Nonetheless, debates around conscription still occur in the public space, and, in the United States, resurfaced during the wars in Afghanistan and Iraq.⁶¹ Indeed, some commentators worried that reliance on all-volunteer forces (“AVF”) would be insufficient to satisfy the demands of war in those countries.⁶²

The decreasing practice of conscription in many countries has led some to speak of a “crisis of conscription.”⁶³ There have also been increases in figures for conscientious objection in some countries like Italy, Spain, and Germany.⁶⁴ But protest and resistance to the draft or to conscription have been common in different contexts and cultures,⁶⁵ and the discourse of crisis is disputed. Leander and Joenniemi, for example, argue that the landscape of conscription is not homogenous, and the so-called crisis of conscription can take different forms, not all of which involve abolishing conscription.⁶⁶ Further, Leander is skeptical that conscription as a practice is coming to an end.⁶⁷

At least legally, conscription, understood as service in military forces during times of peace or during times of war (in the latter case, the practice might be called “the draft”), is still recognized as the state’s prerogative, following from the state’s right to self-defense and sovereignty.⁶⁸ Conscription by non-state groups is, by contrast, prohibited.⁶⁹

Nonetheless, in recent years, a right to conscientious objection has started to crystallize.⁷⁰ Both the Human Rights Committee and the UN Human Rights Council have recognized a right to conscientious objection to military service, based on the right to freedom of thought, conscience, and religion enshrined in Article 18 of the Universal Declaration of Human Rights and the International Covenant on Civil and Political Rights.⁷¹ The Human Rights Council reiterated the view that there is a right to conscientious objection to military service, even though a number of states still do not recognize it.⁷² And the European Court of Human Rights, which, previous to 2000, did not recognize the right to conscientious objection to military service, has also adopted the view that conscientious objection is an aspect of the right to freedom of thought, conscience, and religion.⁷³

By contrast, selective conscientious objection, which accepts the legitimacy of some military action but objects to particular instances of it, is not recognized as a right under international law.⁷⁴ Still, Amnesty International has adopted cases of selective conscientious objection, which have arisen in places like South Africa and Israel.⁷⁵

Finally, in some cases, the consequences following from objecting or evading conscription can amount to persecution for the purposes of being recognized as a refugee. In its 2014 guidelines on the issue, the Office of the United Nations High Commissioner for Refugees (“UNHCR”), consistent with international law, recognized the rights of states to require citizens to perform military service for military purposes, as well as the rights of states to impose penalties on those who avoid or desert military service, provided that “their desertion or avoidance is not based on valid reasons of conscience” and that the penalties and associated procedures in question comply with international standards.⁷⁶ In the context of refugee status, the UNHCR guideline states that persecution against draft evaders, deserters, or conscientious objectors might occur in certain circumstances, such as if there is a risk of threat to life or freedom or other serious human rights violations.⁷⁷ In those cases, those who selectively object to participating in military service in a conflict contrary to the basic rules of human conduct or in an unlawful conflict and those who object to the means and methods of warfare would be covered, provided that certain circumstances obtain.⁷⁸ And the Court of Justice of the European Union held in 2020 that conscription in a conflict characterized by the repeated and systematic commission of war crimes and crimes against humanity can be assumed to involve the commission of such crimes.⁷⁹ The Court also concluded that there is a strong presumption that the prospect of punishment for refusal to fight in such circumstances would amount to persecution for the purposes of refugee status.⁸⁰

In sum, while compelled service in hostile forces is a war crime, states have the right to demand of their own citizens that they serve in their military forces during times of war and peace. The only exceptions to the scope of that right are the conscription of children and the right to conscientious objection.

In other words, international law draws a significant distinction between compelled service in hostile forces and compelled service in the armed forces of one’s state. And it also fails to distinguish between legal and illegal wars. As a result, the current regime makes it a war crime to compel protected persons to fight in hostile forces, even if the latter are engaged in lawful wars. And it makes it a state’s prerogative to conscript its own citizens to serve in its armed forces, regardless of whether the wars they are forced to fight are legal or illegal.

There is thus a question of whether the international legal regime on conscription can be rendered morally coherent. In order to do so, we would need to successfully claim that (1) the nature of the wars that people are forced to fight is irrelevant and (2) it is much worse to be forced to fight by hostile forces than it is to be forced to fight by one’s own state. Let us take each of these arguments in turn.

II. LEGAL AND ILLEGAL WARS

The international law on conscription fails to distinguish between illegal and legal wars. This suggests that this distinction is normatively irrelevant—that for the purposes of conscription, whether in hostile forces or the armed forces of one’s state, the fact that the war is legal or illegal does not matter at all—and does not alter our moral evaluation of the facts. This claim, however, is highly implausible. Consider the following implications of the regime.

First, under the current regime, compelled service in hostile forces is a war crime. This will be the case regardless of whether those hostile forces are engaged in lawful or unlawful uses of force. That is, under the present regime, it would be equally wrong for Ukraine to compel Russian POWs to fight against Russia as for Russia to compel Ukrainian POWs to fight against Ukraine.

Second, because the state’s prerogative to conscript its own citizens also fails to distinguish between legal and illegal wars, under the present regime, a state that conscripts its own citizens to fight an unlawful war of aggression commits no international crime. In fact, the state’s behavior is arguably not even prohibited by international law. As a result, under the present regime, Russia acts permissibly when it conscripts its own citizens to fight against Ukraine.

Third, because the prerogative to conscript individuals belongs only to the state, conscription by non-state armed groups is prohibited by international law. This implies that if non-state armed groups were operating in Ukraine or Russia and forcing individuals to fight against Russia, they would be acting as wrongfully as non-state armed groups conscripting individuals to fight for Russia.

Finally, the regime also has implications for the treatment of individuals who voluntarily join foreign forces. Under international law, service in foreign forces is not prohibited, but it is also not protected. As a result, individuals who join foreign forces to fight a legal war—as some Russians are doing right now⁸¹—do not violate international law. Yet, international law does not protect them from the sanctions that their states might impose on them, including criminal punishment.

The current regime suggests that the nature of the wars one is forced to fight does not matter at all or does not matter enough to make a significant normative difference. It suggests that what matters is solely the identity of those who coerce individuals to fight: one’s state or hostile states. But this is highly implausible. Suppose the CSHF prohibition is justified, in the sense that it is true that compelling service in hostile forces is wrong. That is, suppose the CSHF prohibition targets a malum in se offense in international criminal law.

The present regime suggests that compelled service in hostile forces that are fighting a legal war is as bad as, or equally wrong as, compelled service in hostile forces fighting a war of aggression. But this cannot be true. Even if we agree that coercion from a third party is wrongful or that a third party lacks the authority to coerce us to do something, it matters what one is coerced to do.

Suppose B, A’s neighbor, is upset that A severely mistreats his dog. B has observed that A often hits his dog, fails to feed him for several days at a time, leaves him outside chained to a wall when it is extremely cold or hot, and so on. Although B has spoken to A multiple times about the issue and has volunteered to adopt the dog, A refuses to alter his behavior or give the dog away. Eventually, B decides to take matters into her own hands. She goes to A’s house brandishing a gun and tells A that if he does not immediately release the dog to her care, she will shoot him. A, afraid for his life, relinquishes the dog.

Now suppose D, C’s neighbor, has a dog-fighting ring. D has observed that C owns a small dog that would be the perfect bait in dog fights. D has spoken to C repeatedly and has offered increasingly higher amounts of money to C so that she sells him the dog. But C does not wish to sell her dog, no matter how much money she is offered. Eventually, D, upset by C’s multiple rejections and her love for the dog, decides to proceed anyway. He shows up at C’s house brandishing a gun and tells her that she must release the dog to his care or give him $6,000 so he can get a similar dog for his fighting ring. C does not want to relinquish her dog and knows that sustaining or contributing to dog-fighting is morally wrong. However, afraid for her life, she gives D the money.

In both examples, B and D have violently coerced their neighbors to do something they did not wish to do, and for that reason, we might say that they have acted wrongly.⁸² But it would be absurd to say that B’s and D’s behavior is, all things considered, equally wrong. B has forced A to do what was morally right, that is, what he should have done anyway: give the dog away. By contrast, D has forced C to commit a wrongful act: give money to a dog-fighting ring. Even if we think that both have acted wrongly in coercing their neighbors to do something, we would say that D’s behavior is, all things considered, worse, morally speaking, than B’s behavior. From the viewpoint of the coerced neighbors, we would also say that C’s situation is worse than A’s: unlike A, who was forced to do what he had a duty to do, C was forced to do something that she not only lacked a duty to do, but was also in fact prohibited from doing (or had a weighty reason not to do).

The same applies to the CSHF prohibition. One could perhaps respond that once protected persons are being compelled into service in hostile forces, the wrongfulness of the act is so high that it is irrelevant whether individuals are compelled to fight legal or illegal wars—that there is no sense in which either of the two is significantly or relevantly morally worse than the other. However, this position is too quick. It might be that there is no point in legally regulating them differently, but, in the example above, it seems that the fact that C has been forced to do something morally wrong is an additional aspect of the crime. It is not normatively insignificant.

It is thus not true that being coerced to fight in an illegal war and being coerced to fight in a legal war are equally bad. This is so because international law itself draws a clear and normatively significant distinction between legal and illegal wars. A legal war is a war fought to uphold the international order. By contrast, an unlawful war or a war of aggression is precisely the opposite.

In fact, aggression is considered by some to be the “crime of all crimes.” It is not only prohibited by the UN Charter, but also constitutes an international crime that entails the individual responsibility of those who plan it and conduct it.⁸³ Accordingly, several international legal scholars have developed a view that explains why we have criminalized aggression and why aggressive war is wrong. Mégret, and later, Mégret and Redaelli, have defended a human rights characterization of the crime: aggressive war constitutes a massive violation of the human rights of citizens of both the victim state and the aggressor state, including combatants.⁸⁴ Dannenbaum has argued something similar. He contends that international law’s “criminalization of aggression is not just a formal prohibition, but also an expression of aggression’s wrongfulness from the international legal point of view.”⁸⁵ The core criminal wrong of the crime of aggression is, according to Dannenbaum, the unjustified killing and human violence it entails⁸⁶: an aggressive war results in unjustified displacement, killing, and harming of thousands of individuals. Finally, more recently, Saira Mohamed has pointed out that international law has no language to name the wrong that is committed by a state such as Russia, which fights an aggressive war by relying on conscription.⁸⁷ She argues that in such circumstance, individuals do not have a duty to fight for their state and, in fact, should be protected against doing so.⁸⁸

Given what makes aggression wrong—unjustified violence against countless individuals—it is impossible to argue that the distinction between legal and illegal wars lacks relevance in the context of conscription.

Perhaps one could make the following argument in defense of the international regime’s failure to consider the nature of the wars that individuals are coerced to fight: whether a war is legal or illegal is not what matters. What matters is whether a war is morally justified—whether a war is just.

Just war theory has long distinguished between just and unjust wars, arguing that the first kind are justified, while the second kind are not.⁸⁹ Generally speaking, a just war is one that has a just cause and meets certain requirements concerning proportionality and necessity and is fought in a just manner (for example, by distinguishing between civilians and combatants).⁹⁰ Self-defense and defense of others (that is, humanitarian interventions) are widely accepted amongst contemporary just war theorists as just causes for war.⁹¹ By contrast, an unjust war is a war that is morally prohibited. It involves inflicting morally unjustified harm and death on countless innocent individuals.

In just war theory, then, the distinction between just and unjust wars is the distinction that matters. Unjust wars are morally prohibited and involve the commission of grievous moral wrongs. By contrast, just wars are morally justified. Thus, one could argue that the distinction between legal and illegal wars is normatively irrelevant; that we should concern ourselves with the distinction, at the level of morality, between just and unjust wars. But, of course, even if this argument is correct, it cannot work as a defense of the international regime on conscription. The latter fails to distinguish at all on the basis of the character or nature of the wars that individuals are conscripted to fight. It fails to distinguish between legal and illegal wars, and it also fails to distinguish between just and unjust wars.

Further, there is some overlap between what makes a war just and what makes a war legal. This overlap is, however, not perfect.⁹² Self-defense is recognized both as a just cause for war and a legal instance of the use of force in international law.⁹³ However, humanitarian interventions, which are widely recognized as a just cause for war, are not clearly legal uses of force under international law, unless they are authorized by the United Nations Security Council (“UNSC”).⁹⁴ Additionally, under the U.N. Charter, the UNSC can authorize the use of force, and it could potentially do so in circumstances in which just cause, necessity, or proportionality are lacking. That is, a legally authorized use of force by the UNSC could be an instance of an unjust war.⁹⁵

The fact that the overlap between the two is imperfect cannot, of course, support the conscription regime’s lack of concern for whether the wars are legal or illegal, just or unjust. It does, however, provide reasons to modify the jus ad bellum—that is, the legal rules on resort to force—to make it more coherent with the distinction between just and unjust wars.⁹⁶ And because the overlap between legal and just wars is not perfect, the implications of the regime on conscription for just and unjust wars merit separate attention. The current regime entails that states are free, under international law, to conscript their own citizens to fight unjust wars, while compelled service in hostile forces remains a war crime, even when the war individuals are conscripted to fight is a just one.

Nonetheless, in the remainder of this Article, I will speak interchangeably of legal/just wars and illegal/unjust wars. Because this Article and its arguments focus on wars of self-defense and wars of aggression, which are examples of legal and just wars and illegal and unjust wars, respectively, it is unnecessary to keep making the distinction. All the arguments I make are applicable to both. But in those areas in which there is no overlap, and we cannot assume that a war is just merely because it is legal, the arguments I make are applicable only to the distinction between just and unjust wars.

In sum, if wars of aggression are deeply morally wrongful, the failure of the international legal regime on conscription to incorporate the distinction between legal and illegal wars renders some aspects of the regime morally incoherent: they cannot be “rendered intelligible” in a moral sense.⁹⁷ This is so because, in at least three instances, the regime fails to criminalize or prohibit conduct that is worse than, or as bad as, compelled service in hostile forces.⁹⁸

First, the regime treats compelled service in hostile forces as equally wrongful regardless of whether the war one is compelled to fight is legal or illegal. Given that a war of aggression is a grave violation of the international order and a clear instance of an unjust war, it should be a worse crime to compel protected persons to fight in hostile forces in pursuit of a war of aggression than to compel them to fight in hostile forces in pursuit of a war of self-defense. It seems, as Ryan suggests, that it would be “an additional aspect of the crime” to compel service in hostile forces in a war of aggression.⁹⁹

Second, the same problem arises with the state’s prerogative to conscript its own citizens. If aggression is the crime of all crimes, how can it be that a state’s conscription of its own citizens to fight an illegal war is permitted on the same terms as its conscription of them to fight a legal war? Given that conscription is already a grave intrusion into one’s personal freedom, nearly equivalent to an obligation to kill and die for the state,¹⁰⁰ it seems that the kind of war citizens are conscripted to fight should be relevant. At the very least, conscription to fight legal wars and conscription to fight illegal wars should not be equally allowed.

This issue has been addressed by some scholars. Tom Dannenbaum and James Pattinson have argued that there should be a right to object to deployment in illegal wars.¹⁰¹ Dannenbaum has also argued that international law should grant refugee status to those who refuse to fight in illegal wars.¹⁰² More recently, he has argued that states have a legal obligation to recognize the refugee status of Russian troops who flee to avoid participating in a war of aggression, including those facing conscription.¹⁰³ He claims that the unlawful nature of the war should be enough to ground refugee status for Russian citizens who desert or flee Russia in order to avoid conscription.¹⁰⁴ Although the crime of aggression does not entail the international criminal responsibility of mid- and low-level soldiers, aggression’s wrongfulness lies in the fact that it causes widespread death and destruction without legal justification.¹⁰⁵ And I have argued in previous work that individuals should have a right to object to deployment in unjust wars, that international law should grant refugee status to those who refuse to fight in unjust wars, and that we should modify the jus ad bellum, too, so that it better conforms to the morally relevant distinction between just and unjust wars.¹⁰⁶

However, even if one might be able to defend these conclusions through a progressive interpretation of international law, as Dannenbaum does,¹⁰⁷ at the moment, a right not to fight in illegal wars is not recognized by international law, states remain free to conscript their own citizens to fight in wars of aggression, and refugee status has not been extended to those who refuse to fight in illegal wars. In fact, Russian citizens who have fled Russia to avoid conscription have been met with varying responses. While Canada recently granted refugee status to a Russian man who had fled his country,¹⁰⁸ Norway has hesitated to do so;¹⁰⁹ Latvia, Lithuania, and Estonia have said they will not offer refuge to fleeing Russians;¹¹⁰ and Poland has begun to turn away Russian citizens at the border.¹¹¹

Third, compare compelled service in hostile forces to fight a legal war—a war crime under IHL—with the state’s conscription of its own citizens to fight an illegal war—permitted under IHL. The current regime suggests that it is significantly worse to be compelled by hostile forces to fight a legal war than it is for the state to conscript its own citizens to fight an illegal war. But this should be, at the very least, controversial. Outside of this context, we do not think that the moral and legal status of actions people are forced to do is irrelevant, and, further, we certainly do not think that being forced to do something illegal and immoral is less bad than being forced to do something legal and morally justified, purely based on the identity of who is coercing us into doing so.

Perhaps the identity of who is coercing individuals is relevant in this context; perhaps the state is specially positioned to demand certain things from its citizens, and I will return to this in Part III. But to defend this aspect of the regime, the claim that must be defended is not only that the identity of who is coercing individuals matters, but also that it matters much more than whether individuals are being coerced to fight legal or illegal wars.

Finally, the failure of the international legal regime on conscription to distinguish between legal and illegal wars does more than miss something that is normatively significant. It is also self-defeating; that is, it is bad at achieving what international law presumably aims to achieve. If one of the goals of international law, or the jus ad bellum, is to achieve or sustain peace,¹¹² a regime that allows states to “generate soldiers for war-making,”¹¹³ regardless of whether they are doing so to uphold the international regime or to breach it, seems likely to generate more and longer wars than a regime that prohibited states from conscripting their citizens to fight in illegal wars.

In sum, the fact that the international regime on conscription does not pay attention to the nature of the wars that individuals are coerced to fight cannot be made sense of, morally speaking. The current regime treats equally things that are morally dissimilar in a relevant way.

International law should thus distinguish between legal and illegal wars. In the context of the state’s conscription of its own citizens, this implies that states should be prohibited from conscripting individuals to fight illegal wars or wars of aggression. In the context of compelled service in hostile forces, this implies that the illegal nature of the war one is compelled to fight should be an additional aspect of the crime.

But there is still a remaining dimension of the regime that demands justification and has not been entirely undermined by the arguments so far. Recall that justifying the present regime on conscription requires the success of two arguments. First, that the distinction between legal and illegal wars does not matter at all. This argument has failed. And second, that compelled service in hostile forces is significantly worse than compelled service by one’s state. This argument is necessary to justify the fact that compelled service in hostile forces, even to fight a legal war, is a war crime while the state’s conscription of its own citizens to fight a legal war remains the state’s prerogative. Engaging with this argument will be the task of Part III.

III. HOSTILE FORCES AND THE STATE

In Part II, I argued that the international legal regime on conscription is morally incoherent in a particular way: it treats equally, either by permitting both or criminalizing both, things that are morally dissimilar. It matters whether the wars that individuals are coerced to fight are just or unjust, legal or illegal.

If we accept the arguments made in Part II and their implications, we should accept that there is a powerful pro tanto reason for international law to prohibit the state’s conscription of its own citizens to fight illegal wars and to make the illegal nature of the war an additional aspect of the crime of compelled service in hostile forces.

However, having accepted these implications, there is a second aspect of the regime whose moral coherence remains to be proved: whether compelled service in hostile forces is both wrong and significantly worse than compelled service in the armed forces of one’s state. This argument is necessary to render morally intelligible the fact that compelled service in hostile forces to fight legal wars is a war crime, while compelled service in the armed forces of one’s state to fight legal wars is the state’s right.

Some of the arguments I will explore will also have implications for compelled service in illegal wars. As I just noted, Part II provides a pro tanto reason for international law to prohibit compelled service in illegal wars. Although I think that Part II provides a conclusive, and not just a pro tanto, reason for international law to prohibit the state’s conscription of its own citizens to fight illegal wars, the case in favor of prohibition does not rest solely on that argument. It can also rely on some of the arguments that follow.

Nonetheless, most of the arguments in this Part attempt to explain why international law treats compelled service in hostile forces to fight legal wars differently from a state’s conscription of its own citizens to fight legal wars. Perhaps there is something particularly wrong about compelling persons who are in custody or in occupied territory to serve in hostile military forces—something wrong that is present in these circumstances but is not present when states conscript their own citizens to fight wars. If this argument exists, then we can explain at least this aspect of the international legal regime on conscription.

The complexity of making moral sense of this aspect of the regime is increased by the fact that compelled service in hostile forces is not merely prohibited by IHL, but is a war crime, that is, a crime that entails individual, and not just state, responsibility. Whatever justification is provided, it must be able to account not only for the CSHF prohibition itself, but also for its criminalization as part of international criminal law. However, there is no unitary theory of war crimes nor a satisfactory definition of them.

War crimes belong to the general category of international crimes. International crimes are one of the few areas of international law in which duties are imposed directly on individuals who might be personally responsible for their conduct. Not every human rights violation is an international crime. Louise Arbour, former UN High Commissioner for Human Rights, explains the distinction in this way: “Human rights law violations are actions and omissions that interfere with the birthright of all human beings—their fundamental freedoms, entitlements and human dignity. Humanitarian crimes are, in essence, crimes that are so heinous that they shock the human conscience.”¹¹⁴

The idea that international crimes are a particular category of very heinous acts, one that shocks the conscience of humankind, is quite common. Some authors, however, have offered a different theoretical account of international crimes.

David Luban, for example, has provided a theory of crimes against humanity, understanding them as an assault on a particular aspect of human beings, namely, our character as political animals.¹¹⁵ Larry May, as we will see below,¹¹⁶ has developed a theory about war crimes that relies on notions of honor and the vulnerability of certain individuals during war.¹¹⁷ As I will argue in Section III.D, May’s theory can explain why compelled service in hostile forces is often worse than the state’s conscription of its own citizens, but it cannot explain the permissibility of conscription.

There is, however, significant uncertainty and confusion about the definition of “war crimes,” and the term is often used in various and sometimes contradictory ways.¹¹⁸ This is probably partly explained by the fact that the concept of grave breaches, and the idea of a war crime itself, is a body of law “whose normative provisions were drawn from different, and somewhat inconsistent, treaty provisions.”¹¹⁹

The most common definition of war crimes is “violations of the laws of war that incur individual criminal responsibility.”¹²⁰ The international criminal tribunals, as well as the ICC, have coalesced around three basic elements of war crimes: (1) an armed conflict; (2) a nexus between the acts of the accused and the armed conflict; and (3) knowledge of the armed conflict.¹²¹

Other accounts of war crimes, like that of Hathaway, Strauch, Walton, and Weinberg, require that the violation of the laws of war is also serious.¹²² It is generally assumed that all grave breaches of the Geneva Conventions are serious, but the seriousness of other violations might depend on the nature of the infraction itself and, perhaps, the manner in which the prohibition is broken.¹²³ This is somewhat ambiguous, both in the statute of the ICTY and in the Rome Statute, in which the Elements of Crimes introduce elements to Article 8 that require the act to be sufficiently serious to justify international condemnation.¹²⁴ There might also be breaches of the laws of armed conflict that do not qualify as serious and meriting international condemnation, but that should be the subject of domestic proceedings.¹²⁵

The most accepted definition of war crimes (violations of IHL that entail individual criminal responsibility) fails to offer a theory of criminalization. That is, it fails to guide criminal tribunals and other organs in determining what a war crime is and lacks a deep underlying justification.¹²⁶ It is not clear at all that this is what the definition is trying to do—perhaps it aims simply to give a positivist account of war crimes. But the truth is that we do not agree on a theory of criminalization, either at the domestic level¹²⁷ or at the international one. The purpose of international criminal law itself is also deeply contested.¹²⁸ This poses some difficulties in terms of the arguments I will offer. It might be plausible to conclude that some things should be forbidden by international law, but something else is presumably required in order for a behavior to constitute a war crime. I will come back to this at the end, but I cannot do anything but leave the issue somewhat open. To do otherwise would require developing a theory of criminalization in the context of IHL.

For now, the main challenge is to explain why compelled service in hostile forces might be morally worse than the state’s conscription of its own citizens. I will address several different but somewhat related arguments: (a) compelled service in hostile forces is irrational; (b) the CSHF prohibition aims to incentivize surrender and disincentivize longer wars; (c) citizens feel loyalty toward their own states and that loyalty should be respected; (d) protected persons are in a particularly vulnerable situation vis-à-vis their captors, which makes it morally worse to compel them into service; and (e) unlike foreigners, citizens have duties toward their own states to fight legal wars.

A. Compelled Service in Hostile Forces Is Irrational

One possible argument why compelled service in hostile forces is significantly different from the state’s conscription of its own citizens is that individuals forced to fight for hostile or enemy forces would likely defect or surrender as soon as possible and, generally speaking, would make for bad fighters.

This might be true in certain circumstances. Some individuals feel strong loyalty toward their states and armies and if compelled to fight might in fact decide to do so badly. However, this difference between compelled service in hostile forces and the state’s conscription of its own citizens cannot explain why compelled service in hostile forces is a war crime. It only shows why it is a bad idea for any given state to compel protected individuals to fight in hostile forces—that is, it shows why it is likely irrational to do so. But mere irrationality does not provide a conclusive argument in favor of the international criminalization of compelled service in hostile forces. There are plenty of things that states do that might be a bad idea, but bad ideas are not enough to “shock the conscience”¹²⁹ of humankind.

Further, conscripted soldiers fighting for their own state might also be bad fighters, particularly in comparison to professional armies. They might be reluctant to kill or engage in combat or lack relevant training. Thus, if the CSHF prohibition were justified because protected persons are likely to make bad fighters, conscription would be put into question for similar reasons. But the fact that individuals might make bad fighters is, in any case, not enough to explain why compelling individuals to fight is wrong or should be prohibited. Again, it only shows why it is a bad idea.

One might object at this stage that whether compelled protected persons make good fighters is irrelevant. What the CSHF prohibition aims to achieve is to provide incentives to surrender. Let us examine this argument.

B. POWs and Incentives to Surrender

Privileged combatants can become POWs and are consequently entitled to certain rights and protections.¹³⁰ POWs are those who have fallen into the power of the enemy and belong to one of the categories specified in Article 4 of Geneva Convention III.¹³¹ Persons hors de combat, that is, persons who are in the power of an adverse party, clearly express an intention to surrender, or have been rendered unconscious or otherwise incapacitated by wounds or sickness,¹³² are also entitled to certain protections against mistreatment.¹³³ Among those protections is the prohibition against compelled service in hostile forces.¹³⁴

One might thus argue that, if combatants knew that after surrendering or becoming wounded or incapacitated they might be compelled by the adverse party to serve in its armed forces, they would be less likely to surrender and more likely to fight until the very end. If everyone did that, this might, in turn, make wars longer.

On this view, the CSHF prohibition is concerned with providing individuals with incentives to surrender. In doing so, it aims to make wars shorter and, perhaps, to reduce suffering as a result. Of course, this rationale would not explain why the CSHF prohibition also applies to those in occupied territory; the rationale reaches only POWs.

The idea that the goal of IHL is to reduce suffering (to make wars more humane) is familiar and is often referred to as “the humanitarian view.”¹³⁵ If this is right, and the incentives work in this way, then we might be able to explain why it makes sense to provide individuals with certain protections when they surrender, such as the prohibition on compelled service in hostile forces.

The humanitarian view has been the object of a variety of objections, both as a justification of the legal equality of combatants¹³⁶ and as a justification of the law on co-belligerency.¹³⁷ In the context of the international legal regime on conscription, the humanitarian view can explain why the CSHF prohibition is a reasonable rule to have, but it cannot explain why compelled service in hostile forces is worse than the state’s conscription of its own citizens. In fact, if the state’s conscription of its own citizens is likely to make wars longer or to increase suffering, there would be at least a pro tanto reason to prohibit the practice.

Note, as well, that the humanitarian account is also likely to provide an additional reason for prohibiting states’ conscription in illegal wars. If we rely on accounts of what people have incentives to do, we should, presumably, reduce incentives to wage and participate in illegal wars.

C. Loyalty

A third argument that might explain why there is something different between compelled service in hostile forces and the state’s conscription of its own citizens is that the former violates the ties of loyalty that bind individuals to their own states, while conscription by one’s own state does not. Compelled service in hostile forces entails fighting against one’s own state and betraying one’s loyalty to it.

The importance of loyalty to one’s state is not unheard of in the context of war. Levinson, for example, discusses the case of Ernst von Weizsaecker, State Secretary of the German Foreign Ministry.¹³⁸ Von Weizsaecker had internally opposed the war, but the issue of why he did not inform the Russian ambassador of Hitler’s plans against Russia in 1941 came up during the Nuremberg trials.¹³⁹ If he had done so, he would have put himself in great danger, would not have changed Hitler’s policy, and would have led to greater German losses:

The prosecution insists, however, that there is criminality in his assertion that he did not desire the defeat of his own country. The answer is: Who does? One may quarrel with, and oppose to the point of violence and assassination, a tyrant whose programs mean the ruin of one’s country. But the time has not yet arrived when any man would view with satisfaction the ruin of his own people and the loss of its young manhood. To apply any other standard of conduct is to set up a test that has never yet been suggested as proper, and which, assuredly, we are not prepared to accept as either wise or good.¹⁴⁰

This argument about loyalty was not just an idiosyncratic belief of the Nuremberg judges. The ICRC database on customary IHL states that the reasoning behind the CSHF rule is “the distressing and dishonourable nature of making persons participate in military operations against their own country—whether or not they are remunerated.”¹⁴¹ Bothe also finds the rationale of the rule in avoiding “bringing a detained person or a person in occupied territory into an unbearable loyalty conflict.”¹⁴² And the ICTY, in interpreting the concept of “civilians” for the purposes of the Fourth Geneva Convention, relied on the idea of “genuine bonds of loyalty and allegiance,” dismissing, however, the requirement of nationality.¹⁴³ In the ICTY, the issue as to who counted as civilians for the purposes of the Fourth Geneva Convention arose because the latter refers to those who find themselves “in the hands of a Party to the conflict or Occupying Power of which they are not nationals.”¹⁴⁴ This was pressing in the case of Bosnian Muslims, who could not be considered protected persons under the Convention because their persecutors were also of Bosnian nationality.¹⁴⁵

In the philosophical literature, arguments from loyalty have been discussed in the context of the crimes of treason and espionage.¹⁴⁶

Arguments that rely on loyalty have, however, limited purchase. Loyalty can be understood as the “practical disposition to persist in an intrinsically valued (though not necessarily valuable) associational attachment, where that involves a potentially costly commitment to secure or at least not to jeopardize the interests or well-being of the object of loyalty.”¹⁴⁷ In institutional settings, “loyalty can simply take the form of commitment and willingness not to undermine institutions which do well by us and, thereby, not to harm our fellow community members.”¹⁴⁸

One of the issues with a loyalty-based argument is that loyalty can often be morally problematic. This is so because individuals might experience loyalty toward all sorts of projects and associations, some of which will have no value at all, or even negative value. This would be the case with individuals who experience loyalty toward, say, racist associations, criminal organizations, and so on.

Perhaps loyalty-based arguments can succeed if we can argue that loyalty toward one’s own state (or nation) is valuable to the point of requiring protection from international law. Yet, although some states perform valuable functions, it might be better if individuals felt stronger loyalty toward other fellow human beings, rather than to their own state. This kind of loyalty also seems more consistent with some of international law’s cosmopolitan aspirations.¹⁴⁹ And relying on this account of loyalty might provide an additional argument as to why compelled service in hostile forces to fight illegal wars should be a war crime. When states do compel such service, they are asking individuals to engage in conduct that is disloyal to states that are engaged in legal wars and to the international legal system.

In any case, even if we concede that betraying one’s loyalty to one’s state is presumptively wrongful, that presumption can be overridden by other considerations—mainly, whether one’s state is engaged in violating the fundamental rights of others.¹⁵⁰ Yet, this is precisely what states engaged in aggressive wars are doing. Recall that we are trying to answer why compelled service in hostile forces to fight a legal war is both wrong and significantly worse than the state’s conscription of its own citizens to fight a legal war. This would require arguing that an individual’s loyalty toward a state fighting a war of aggression should be protected by making compelled service in hostile forces a war crime. However, that individual’s loyalty is surely misguided from the viewpoint of the international legal system itself, and it is, effectively, loyalty toward an actor engaged in serious legal (and moral) wrongdoing.

Perhaps we can argue that the objective value of an individual’s loyalty is irrelevant. It only matters whether the individual in fact experiences ties of loyalty to their own state. If they do, that loyalty should be protected. But this argument is both over- and underinclusive. It is overinclusive because it would require international law to protect individuals’ loyalty toward their own state in a wider range of circumstances, extending far beyond service in hostile forces. And it is underinclusive because it would require international law not to concern itself with instances in which that loyalty is not present. That is, if particular individuals felt no special ties of loyalty toward their own state, or if they had no state, the CSHF prohibition could not be justified in their case; states would do nothing morally wrong if they chose to compel those foreigners to serve in their armed forces. This seems to miss something important.¹⁵¹

It might be the case that many individuals do feel strong ties of loyalty to their own states, even if they are misguided in doing so. This might counsel for excusing their behavior in certain circumstances. But it is difficult to argue that international law should encourage that loyalty when it is misguided by making compelled service in hostile forces a war crime.

This is not to say that loyalty is completely irrelevant in all circumstances. Individuals, for example, might be loyal to members of their own family or their loved ones, or they might have special duties to protect them from harm.¹⁵² It is understandable, from that perspective, why forcing someone to harm their own loved ones would be worse than forcing them to harm a stranger. Since fighting against one’s own country might involve harming one’s loved ones, it is plausible to argue that, in those cases, compelled service in hostile forces is worse than compelled service in the armed forces of one’s state; thus, it might be wrongful, under certain circumstances, for states to compel individuals to harm their loved ones.

However, ties of loyalty and duties of protection in close interpersonal relationships are insufficient to fully account for the CSHF prohibition, for two reasons.

First, this is not the kind of relationship that exists between states and their own citizens, nor is it the kind of relationship that exists between citizens themselves.¹⁵³ Further, interpersonal relationships and ties of loyalty do not necessarily overlap with citizenship in a given state—individuals might have family and loved ones in other countries, and, in the context of civil conflict, in which states can coerce their own citizens to fight, ties of loyalty to the state will be conflicted and, sometimes, nonexistent. Finally, even if we grant that individuals can draw deep personal meaning from their association to their state, their nation, their community, and so on, the strength of those ties pales in comparison to those in close interpersonal relationships.

At this point, one might object that it is unclear why interpersonal relationships should be so different from the kind of relationship that exists between individuals and their communities (however expansively we define the latter). After all, people draw profound meaning from their membership in those communities and often feel bound to act in certain ways on account of that membership. I find this kind of argument unpersuasive, partly because I am committed to a more cosmopolitan view of our moral obligations and the sources that give meaning to our lives. Nonetheless, even those who reject strong versions of cosmopolitanism should be persuaded by the second reason why ties of loyalty and associative duties are insufficient to account for the CSHF prohibition.

This is so because even the ties of loyalty and duties of protection that arise in interpersonal relationships are not impervious to the impartial demands of morality. That is, individuals are not morally allowed to do anything they can to protect their loved ones (or the members of their communities) from harm, nor are they allowed to be partial to the interests of their loved ones in all possible circumstances. In the context of the CSHF prohibition, it can hardly be the case that associative duties and ties of loyalty toward one’s own state and its institutions (which seem categorically different from or, at the very least, weaker than those present in interpersonal relationships) could override moral obligations not to wrongfully harm others. Or, put differently, it cannot be the case that it is morally wrong for individuals to fight a war of self-defense against their own states. If hostile forces are prohibited from coercing individuals into doing so (as I think they are), it cannot be merely because doing so involves making them breach duties of loyalty or protection toward their own states. Those duties will most often be nonexistent or overridden by duties not to wrongfully harm others (as individuals often do when fighting in an illegal war).

In sum, while we can acknowledge that individuals often feel loyalty toward their own states and they also have duties of protection toward their loved ones, this argument only explains why fighting against one’s own state can be worse, from an agent-relative perspective, than fighting against other states. But alone, outside of limited circumstances in which fighting for hostile forces might involve breaching duties of protection toward one’s loved ones, it cannot explain why compelled service in hostile forces is morally prohibited.

D. Mistreatment and Vulnerability

A different way in which we can argue that there is a significant difference between the state’s conscription of its own citizens and compelled service in hostile forces relies on the idea that POWs and persons in occupied territory are in a particularly vulnerable position. This argument is, of course, underinclusive—it can only explain what is wrong with compelled service in the case of POWs and those in occupied territories, but it cannot explain what is wrong with compelling “the nationals of the hostile party to take part in the operations of war directed against their own country.”¹⁵⁴

A vulnerability-based argument has two dimensions. One is more obviously concerned with the incentives at play that explain why POWs or individuals in occupied territory are more likely to be treated in objectionable ways. The second one focuses on what is particularly wrong with harming or coercing vulnerable individuals. Both dimensions of the argument are related. The first one aims to explain why the CSHF prohibition is required to protect certain individuals who are likely to be the victims of objectionable treatment. The second one aims to explain why certain forms of treatment are objectionable.

The first part of the argument worries, then, about the CSHF prohibition’s role in protecting individuals in contexts in which states might have incentives to treat them in objectionable ways. We might say, for example, that if states were allowed to conscript protected persons to fight, they would likely use them in cruel ways, as cannon fodder or diversions, in missions that have little chance of succeeding, and so forth.

Incentives are likely to operate in this way. In fact, a study by Valentino, Huth, and Croco shows that highly democratic states face great pressure to reduce the human costs of war and are thus more likely than other states to employ strategies that minimize military casualties.¹⁵⁵ States, and democratic states in particular, might thus have incentives to protect their own citizens and avoid high casualties, but they are less likely to have similar incentives regarding foreigners, to whom they are not politically accountable. And we have powerful reasons to prevent states from employing individuals in these objectionable ways.

However, this argument alone, although plausible, cannot meaningfully distinguish between the state’s conscription of its own citizens and compelled service in hostile forces. Simply as a matter of history, the CSHF prohibition predates any human rights standards that might have governed how states treated their own soldiers and conscripts, which only developed after World War II.¹⁵⁶

Going beyond this historical point, the argument about incentives cannot explain why compelled service in hostile forces is meaningfully different from the state’s conscription of its own citizens. It might be true that states often have fewer incentives to use their own citizens as cannon fodder, send them into missions certain to fail, and so on. But there is a range of circumstances in which states will, in fact, have few incentives to protect their own citizens. For example, authoritarian states are less responsive to public opinion, and as a result might be more willing to employ their own citizens in these ways. States might also employ their own citizens as cannon fodder as a war tactic in a desperate attempt to avoid defeat, particularly in the case of citizens from oppressed minority groups.

Thus, although this argument can provide a good reason for having the CSHF prohibition, it also provides good reasons for having a prohibition on conscription of the state’s own citizens. If the CSHF prohibition is justified because incentives are likely to make states use individuals in objectionable ways during combat, there would be a reason to enact a similar prohibition in all contexts in which the incentives to use individuals in those objectionable ways exist or to directly prohibit those objectionable ways themselves. Yet, no such prohibitions exist.

Although some standards of treatment have been imposed regarding soldiers and conscripts,¹⁵⁷ there is no restriction on states sending their own citizens on difficult or impossible missions, nor any restriction on using them as cannon fodder.¹⁵⁸ The standard of treatment recognized by the European Court of Human Rights, for example, demands that military service be performed in “conditions compatible with respect for human dignity” and that do “not impose distress or suffering of an intensity exceeding the unavoidable level of hardship inherent in military discipline” and service,¹⁵⁹ thus leaving wide discretion to states in terms of military strategies and planning. In the context of IHL, the debate concerning states’ obligations to their own soldiers also seems currently limited to the state’s obligation to tend to the wounded, which extends to the state’s own soldiers.¹⁶⁰

The argument so far, although insufficient to render the regime coherent, does give a reason in favor of the CSHF prohibition: compelled service in hostile forces might be one instance in which incentives to treat individuals in certain objectionable ways are very often present. Here is where the second dimension of the argument enters the picture. I have just pointed out that states are likely to have incentives to treat foreigners in objectionable ways when they send them into combat. The second dimension of the argument explains, precisely, why it is worse to treat POWs and persons in occupied territory in certain ways, and it relies on the notion of vulnerability, as understood by Larry May and Seth Lazar.

Larry May has developed an account of war crimes that relies on duties of humane treatment, in which war crimes are understood as “crimes against humaneness.”¹⁶¹ They are a violation of the principle that requires soldiers to act humanely, with mercy and compassion.¹⁶² This is a difficult task, May acknowledges, because while soldiers might be required to act humanely toward some, they are still allowed to—or are not prohibited from—killing enemy combatants.¹⁶³ Ultimately, for May, the rules of war are grounded in notions of honor and mercy, as well as the protection of the vulnerable.¹⁶⁴

Regarding confined soldiers (POWs and hors de combat) and prohibitions on their mistreatment, May argues that humane treatment becomes paramount: POWs are confined by one party, and that party has “every reason to want to exert vengeance or retribution on those who have been killing members of one’s armed forces.”¹⁶⁵

It is the asymmetry in power between confined prisoners and the party confining them that partly motivates May’s account. The laws of war, he argues, should counteract the strong possibility of abuse perpetrated by those who have weapons against those who do not.¹⁶⁶ POWs are in a “special moral situation because they are utterly dependent on their captors and are vulnerable in ways that soldiers on the battlefield are not.”¹⁶⁷ Walzer makes a similar point: “Just beyond the state there is a kind of limbo, a strange world this side of the hell of war, whose members are deprived of the relative security of political or social membership.”¹⁶⁸

This concern about vulnerability, understood as defenselessness or the inability to diminish one’s vulnerability to a threat, is echoed by Seth Lazar in defending the prohibition against deliberately targeting civilians.¹⁶⁹ Lazar argues that harming those who are vulnerable or defenseless is particularly bad or worse than harming the nonvulnerable because doing so is exploitative, risky, breaches a duty to protect the especially vulnerable, dominates and disempowers them, and generates unfair distributions of risk on the innocent.¹⁷⁰

Vulnerability is also the object of special protection in domestic criminal codes, in which it can operate as an aggravating circumstance in certain crimes such as homicide, or as giving rise to certain duties of protection toward, say, children.¹⁷¹

Following this line of argument, one might then reason that the vulnerability of POWs and those in occupied territories is what explains the CSHF prohibition. It does so because prisoners and persons in occupied territories are in vulnerable positions such that states have incentives to treat them in objectionable ways, and harming those who are vulnerable is morally wrong and morally worse than harming the nonvulnerable.

It is certainly true that both occupation and custody create a situation of vulnerability. In that sense, it is understandable why states might have duties to provide and care for individuals in custody and under occupation. And it is quite plausible that it is morally worse to harm the vulnerable than the nonvulnerable. If so, then we can explain why compelled service in hostile forces is morally worse than the state’s conscription of its own citizens: to the extent that POWs, hors de combat, and those in occupied territories are vulnerable and defenseless relative to citizens of the state, and to the extent that compelling them into service will entail treating them in objectionable ways, it is worse to compel the former into service than to compel citizens.

This argument, however, faces some difficulties. Recall that we are trying to answer why compelled service in hostile forces to fight a legal war is morally worse than compelled service by one’s own state to do the same. So far, the argument has established that harming those who are vulnerable is worse than harming those who are not, and that states might have incentives to treat POWs and those in occupied territories in objectionable ways (that is, states might have incentives to harm them). There are several problems with this.

First, as mentioned before, states can have similar incentives regarding their own citizens, and yet they are not prohibited from treating their own citizens in such ways. The argument is, in that sense, overinclusive.

Second, the argument on its own says very little about whether conscription is morally prohibited or permissible. This is so because the argument explains only why compelled service in hostile forces is worse than the state’s conscription of its own citizens. Whether compelled service in hostile forces is morally wrong depends on whether it harms those compelled to fight. I have suggested that it does so when they are used in certain objectionable ways, or that it might do so when it forces them to breach associative duties toward their loved ones (under certain circumstances), but I have not argued that it does so in every instance. That is, I have not yet provided an argument as to whether coercion to fight legal wars on behalf of hostile forces always harms those forced to fight.

Put simply, the fact that harming someone who is vulnerable is worse than harming someone who is not does not say anything about whether the latter is morally prohibited or permitted. It only implies that one is worse than the other. That is, the fact that harming a defenseless child is worse than harming an adult does not prove, on its own, that harming the adult is morally permissible. That depends on whether the harm itself is justified.

Applied to the CSHF prohibition, one can argue that it is worse to compel protected persons into service than it is to compel one’s own citizens. That might be the case. But alone, this cannot answer whether compelled service in military forces is allowed in the first place.

Third, the argument that relies on vulnerability also requires that vulnerability be applied in a particular way, so that only those who have fallen under the power of an adverse party to the conflict and those who are in occupied territory are understood as vulnerable. However, not all POWs remain in physical custody during war, and citizens in their own states are also quite vulnerable to coercion on the state’s part (think, for example, of those who are serving criminal sentences). Further, refusing conscription might lead to criminal punishment, and the state has a wide arsenal of enforcement mechanisms at its disposal. If it is vulnerability that is driving the CSHF prohibition, then one might also find vulnerability in the state’s context when compelling individuals to serve in its own forces.

Still, we can argue that it will often be the case that POWs and those in occupied territories will be in a highly vulnerable position, and that states are likely to have incentives to use them in objectionable ways during combat, and so the CSHF prohibition is predicated on that likelihood. In those cases, we can explain why compelled service in hostile forces is wrong, and why it might be morally worse than the state’s conscription of its own citizens. But vulnerability itself might be present in the state’s own citizens. More importantly, this vulnerability-based argument does not say why compulsion to fight a legal war is always equivalent to unjustifiably harming individuals.

E. Citizens and Duties Toward the State

Is coercion to fight a legal war equivalent to unjustified harm? Or is it coercion to do what one already has a moral duty to do? Does it matter whether it is one’s state that coerces one to fight?

What is wrong (or right) about compelled service in legal wars cannot be coercion itself. Such an argument would prove too much: it would immediately make the state’s conscription of its own citizens morally suspect. It would also make coercion, which is a key feature of the state, generally morally wrong. This argument should thus be discarded. Coercion might always require justification, but what makes coercion justified depends on other facts, such as what one is coerced to do, or who has authority to coerce other individuals.

Perhaps the difference between compelled service in hostile forces and the state’s conscription of its own citizens is that citizens, unlike noncitizens, owe certain duties to their own states, and among those duties, there is the duty to fight on behalf of one’s state. If successful, this argument can explain why international law permits states to conscript their own citizens and why compelled service in hostile forces is a war crime. The first is permitted because citizens already have enforceable duties to fight that are owed to the state. The second is prohibited because noncitizens do not owe such duties to other states, and it is wrong for those states (or hostile forces) to coerce individuals to fight when those individuals lack any relevant duties to do so. Forcing them to fight would constitute unjustified harm, in a sense, and it is even worse to harm those who are vulnerable or defenseless.

At this stage, one might argue that, in the case of compelled service in hostile forces, the issue of citizens’ duties toward their own states arises twice. First, because those who are compelled to serve in hostile forces have a duty toward their own states that compelled service in hostile forces causes them to breach. And second, because those who are compelled to serve in hostile forces lack duties toward hostile forces, and compelling them to fight as if they had such duties is morally wrong. However, recall that we are now concerned with the question of why compelled service in hostile forces to fight legal wars is worse than the state’s conscription of its own citizens to fight legal wars. In this case, when states breach the CSHF prohibition, they are doing so in relation to individuals who, if fighting for their own states, would be fighting a war of aggression, which is also an unjust war. When states are engaged in unjust wars, the question of whether their citizens still retain a duty to fight on behalf of their states is highly controversial, and the most plausible answer is that they almost always lack such duties toward the state, at least in cases of egregiously unjust wars.¹⁷² Thus, I will not address whether compelled service in hostile forces also involves making individuals breach duties toward their own states.

The argument based on duties owed to the state thus requires showing that citizens have certain duties toward their own states (or their political communities) that noncitizens lack, and, in particular, that they have duties to fight on behalf of their state that noncitizens lack.¹⁷³ This argument is obviously related to the question of political obligation, that is, the question of whether individuals have a prima facie, context- and content-independent moral duty to obey the law of the jurisdiction they are in.¹⁷⁴ Because conscription is often implemented through the legal regime, refusal to accept conscription will often involve disobeying the relevantly applicable laws.

However, the question of political obligation is somewhat different in character to the question of whether citizens have a duty to fight—to die and kill—for their states. The first difference is that conscription in times of war, unlike other instances of duties imposed by law, is extremely demanding: individuals are likely to face mortal and moral peril when they are called to kill and die on behalf of the state.¹⁷⁵ Fighting in war, unlike complying with most legal rules, is a significant burden. War is risky, both in physical and moral terms,¹⁷⁶ and individuals only have one life to live—their own.

As a result, general arguments as to why individuals have moral duties to obey the law of their own states might not be sufficiently powerful or weighty to explain why individuals might have moral duties to die and kill for their states. This is not implausible: no account of political obligation defends an absolute duty to obey the law, independent of its content, and most accounts are qualified in several respects.¹⁷⁷ Further, some political theorists, like Hobbes and Rousseau, have treated the question of an obligation to kill and die for the state as a separate, distinct issue.¹⁷⁸ Rousseau, for example, held the view that to bear arms on behalf of the state was the ultimate public duty because every individual shares in the moral goods of the community.¹⁷⁹ Hobbes contended that individuals do not have a duty to fight on behalf of the state because once the state demands that citizens die on its behalf, the social contract breaks down: the state and the citizen are at war with each other and they are hence returned to the state of nature.¹⁸⁰ This is so because the end of the state, for Hobbes, is individual life.¹⁸¹ An individual who dies for the state defeats the very purpose of forming that state: the preservation of life.¹⁸² Others interpret this aspect of Hobbes’s theory as a matter of prudence, in the context of which self-preservation was paramount.¹⁸³

The second reason why the question of political obligation is different from the question of conscription is that the first one is a question about whether there is a content- and context-independent duty to obey the law. By contrast, the question regarding conscription is not content-independent: it is a question of whether individuals have duties, owed to their state, to fight in legal wars.

Despite these differences, conscription has received little theoretical attention from political and moral philosophers¹⁸⁴ besides focused attention on the legitimacy of the draft in the 1960s and early 1970s.¹⁸⁵ In fact, in the contemporary ethics of war, the question about conscription has been posed as a question of whether conscription might have an impact on one’s liability to defensive force. In other words, questions about conscription have been reduced so far to the question of whether coercion would make conscripted unjust combatants morally impermissible targets.¹⁸⁶ But the legitimacy of conscription itself has hardly been discussed.¹⁸⁷

This omission is quite puzzling. As things are, states have the capacity to generate “unlimited numbers of soldiers by their coercive practices.”¹⁸⁸ Indeed, during the 19th century, pacifism about war included an argument against conscription, which was considered to be a “type of enslavement at the heart of war” that led to a system of inhumanity; that is, a system where war was seen as taking on a life of its own.¹⁸⁹ Although liberal thinkers like Hobbes, Locke, and others during the 18th and 19th centuries have remarked on the incompatibility of the rights of individuals with the power that the military possesses over its soldiers, very few thinkers have seriously questioned the permissibility of the state’s conscription practices.¹⁹⁰

Because the philosophical treatment of conscription is sparse, the literature on political obligation is a good starting point for assessing arguments in favor of a duty to fight for one’s state, even if there are relevant differences between the two.

Insofar as we can establish that a duty to accept conscription in times of war exists and is owed to one’s state or to one’s political community, we can explain the distinction that exists in international law between compelled service in hostile forces and the state’s conscription of its own citizens. This thought—that the justification of conscription relies on the existence of a duty toward one’s community or one’s state—is familiar. Conscription, as Ryan notes, is often experienced both as naked coercion and as embodying a legitimate social obligation.¹⁹¹

Let us start by discussing how different theories of political obligation have justified the existence of a duty to obey the law.

1. Theories of Political Obligation

In the literature on political obligation, context- and content-independent duties to obey the law have been grounded in consent,¹⁹² fairness,¹⁹³ natural duties of justice,¹⁹⁴ associative duties,¹⁹⁵ democratic authority,¹⁹⁶ and a commitment to law.¹⁹⁷ However, some authors remain skeptical that any such duties exist.¹⁹⁸

Theories based on consent¹⁹⁹ or commitment²⁰⁰ regard political obligations as obligations of commitment.²⁰¹ The idea is that the community has granted authority to the government and chosen to undertake political obligations,²⁰² or that individuals have adopted a commitment to law.²⁰³ These theories tend to be voluntaristic (what matters for establishing a duty to obey the law is whether individuals have, in fact, consented), in which case the arguments discussed pertaining to loyalty will apply. When they are less voluntaristic (what matters is whether people should consent, or whether people in certain idealized circumstances would have consented), the arguments that apply to fairness, associative duties, and natural duties–based theories will often apply to them as well.

Fairness-based theories argue that states provide their citizens with significant benefits that they would otherwise not obtain. In a Hobbesian account of the state, for example, we would argue that citizens benefit from membership in their state because the latter’s existence allows them to exit the state of nature.²⁰⁴ In broader accounts, we might posit that the state allows individuals to access a number of goods that they could not obtain otherwise, such as security, the existence of a legal system, the solution of coordination problems, and so on. Plausibly, noncitizens do not benefit as much as citizens from these arrangements. A fairness-based account of political obligation, then, holds that when a number of individuals “engage in a just, mutually advantageous, cooperative venture according to rules and thus restrain their liberty in ways necessary to yield advantages for all, those who have submitted to these restrictions have a right to similar acquiescence on the part of those who have benefited from their submission.”²⁰⁵ Acceptance of the relevant benefits, even in the absence of express or tacit consent to cooperate, is enough to bind individuals to do their fair share in the group.²⁰⁶ Fairness-based accounts of political obligation explain the existence of the latter by appeal to the idea that individuals benefit from certain arrangements, and those benefits make it the case that they are obligated to participate in their production by following the rules of the scheme of cooperation.

In the context of conscription, Rawls, for example, suggested that the draft could be defended as a fair way of sharing in the burdens of national defense.²⁰⁷ Given that even just and well-ordered societies cannot entirely eliminate the possibility of aggression by another state, they should make sure that the burden of defending one’s country is evenly shared by all members of society over the course of their lives and that there is no avoidable class bias in selection.²⁰⁸ Further, given that conscription is “a drastic interference with basic liberties,” Rawls argued that conscription would be justified only if it is demanded for the defense of liberty itself.²⁰⁹ Rawls did not think this duty was content-independent: he suggested there might be a right or a duty to disobey if the war is unjust or is being fought in an unjust manner.²¹⁰

An alternative way to ground duties of obedience is to argue that citizens have associative duties or membership-dependent reasons to do their share, “as defined by the norms and ideals of the group itself, to help sustain it and contribute to its purposes,” provided that the norms are neither gravely unjust nor irrational, and the group is not corrupt.²¹¹ Scheffler argues that this might, on certain assumptions, provide the basis of an argument in favor of the existence of a duty to obey the law.²¹² Those assumptions are, first, that membership in a political society can be noninstrumentally valuable; second, that the laws of a society are among its norms of individual conduct; and third, that these reasons amount to duties.²¹³ Note that whether membership in the state (or any group) is noninstrumentally valuable depends on the justice (or injustice) of the group in question: at a certain point, the injustice of any given society will erode the value of membership in that society for, as Scheffler observes, part of the value of membership in a political society such as the state is that “it makes it possible to live on just terms with others.”²¹⁴

Accounts based on natural duties provide a different alternative.²¹⁵ They contend that there is a natural duty of justice which requires individuals to support and to comply with just institutions that exist and apply to them, as well as to further just arrangements not yet established (at least when it is not too costly for individuals).²¹⁶

Finally, duties to obey the law might also be based on the law’s democratic provenance.²¹⁷ The democratic authority account obviously has a problem regarding scope: it can only justify conscription in democratic states. Thus, it could not explain why compelled service in hostile forces is worse than the state’s conscription of its own citizens. If successful, it can only explain why compelled service in hostile forces is worse than democratic states’ conscription of their own citizens and why conscription by democratic states is permissible.

All the arguments just discussed aim to explain why individuals might have content- and context-independent duties to obey the law, within certain limits. But duties of obedience can also be grounded instrumentally, that is, on the basis of the ability of the state, the legal system, or the practice of widespread obedience to the law to secure certain good outcomes.²¹⁸ Of course, instrumentalist accounts of duties to obey the law fail to support a context- and content-independent duty to obey the law; that is not what instrumentalist accounts are trying to do.²¹⁹ At most, an instrumentalist account will be able to support duties to obey certain laws, insofar as obeying those laws is a way of securing the outcomes we want to secure.²²⁰ In the case of conscription, the argument would be able to support the duty to accept conscription if doing so helped secure certain outcomes or if not doing so would risk the collapse of, say, one’s state (provided that the state is a valuable institution).

Historically, conscription, both in times of peace and in times of war, has been justified instrumentally on the basis of different outcomes or goals,²²¹ some of which Leander refers to as “myths” regarding conscription.²²² Some of these myths have been empirically debunked, but they tend to fail on their own terms anyway.

One argument in favor of conscription, and against all-volunteer armed forces (“AVF”), is that the latter pose a danger to democracy, while conscription is central for controlling the use of force in society.²²³ There is thus a supposed link between the preservation of democracy and conscription, or, alternatively, all-volunteer armed forces can pose a threat to the political community, democracy, and freedom which conscription-based armies are less likely to pose.

However, historical examples do not support the notion that conscription-based armies are particularly effective in controlling or constraining the use of force by their own state leaders. The general draft allowed Hitler to form a powerful army and conscripts did not stop his plans from within, and both Stalin’s and Mussolini’s armies were composed of conscripts, who also offered no considerable internal resistance.²²⁴ Conscript-based militaries in Chile, Argentina, and Turkey also offered no resistance to military takeovers.²²⁵ It is thus not true that conscription can be a cure for the threat that all-volunteer armies pose, and as Keil notes, the “equation ‘conscription equals democracy’ is badly flawed.”²²⁶ Interestingly, a study conducted in thirty-four European states from 1997 to 2017 found that in countries with conscription-based recruitment, there are higher levels of support for the military.²²⁷

Further, even if it were true that AVFs pose graver threats to democracy than conscript-based forces, this would not, on its own, provide an argument in favor of the permissibility of conscription. It only highlights an aspect that makes conscription better than AVFs. Pattison, for example, has argued that the AVF is the most legitimate way of organizing the military, partly because it does not severely infringe on individuals’ liberty.²²⁸

A second myth is that conscription works to construct a more tightly knit society, both as a source of social mobility and as a source of social integration.²²⁹ This logic does not withstand analysis: in most places, women are not subject to conscription, so whatever social integration or mobility is created clearly excludes roughly half the population.²³⁰ It is also unclear why social integration should stop at the border of one’s own country.²³¹ Further, this argument might work for conscription as a state’s general practice during times of peace, but it does not really support conscription or the draft in times of war. There is no social integration when people are dead.

A third argument usually employed to support conscription is that it helps to form loyal and virtuous citizens because conscription itself works as the “school of the nation.”²³² However, as Leander points out, the idea that the military could and should play a role in forming virtuous citizens is in tension with democratic understandings of what makes a virtuous citizen in most contemporary political thinking.²³³ Even if that were the case²³⁴ (that is, even if conscription were an effective tool in creating “virtuous citizens”), it cannot be plausibly sustained that as a result the state can demand that people kill and die on its behalf. The intrusion on personal freedom and the sacrifice demanded of individuals are far too great in comparison to the pursued goal, which is relatively insignificant. It is also far from clear that the state, at least one committed to some version of political liberalism, can coercively make individuals believe and endorse its own conception of virtuousness.

Thus, an instrumentally justified duty to accept conscription must go beyond these arguments. It must rely on the benefits that the state provides for its citizens, which might be diminished if the war is lost or eliminated entirely if losing the war entails the destruction of the state. In the latter case, one might be facing something like a “supreme emergency.”²³⁵ I will come back to this point later.²³⁶ In the first case, the argument relies on the notion that states fulfill valuable ends and thus have a right to their own survival. Citizens, then, must contribute to the survival of their own state by fighting.

2. The Shortcomings of Theories of Political Obligation

Whichever way one grounds these duties, if successful, individuals might have duties to fight on behalf of their state. If so, then we will have an additional reason why compelled service in hostile forces is worse than conscription, and why conscription to fight legal wars is permitted by international law. However, these theories struggle to support this notion.

First, the theories tend to be overinclusive. If we accept that individuals can owe certain duties to groups or that there are instrumentally justified duties to fight legal wars, then there is no reason to think that those duties would be owed exclusively, or even primarily, to one’s state. Any community, political or otherwise, would be able to generate such duties, provided that individuals have consented to them, the community is reasonably just, individuals benefit from the existence of the community, and so on. There is, thus, a difficult question of demarcation: why is the state the right entity to which individuals owe duties to fight legal wars?

We live in an increasingly interconnected society in which noncitizens can, in fact, benefit considerably from the functioning and existence of other states—and sometimes to a greater extent than the state’s own citizens. Further, individuals might also benefit from the existence of the international legal system, in which case they would be obliged to that system or to all states, not just to their own. In the context of war, if pacifism is false,²³⁷ there is a plausible argument that legal wars, when they are just and aim to uphold the prohibition on aggression, concern and benefit not just citizens of the states involved, but also the international community as a whole. This has implications for the theories of political obligation just discussed.

Take the fairness and natural duty-based accounts. If legal wars generate benefits to everyone, then everyone would have a duty to fight, and that duty would not be owed to one’s state but to the relevant group (those who benefit from legal wars). If everyone has duties to uphold and further just institutions, then everyone would have a duty to fight a legal war, which would not be owed to one’s state, but to all states or the international community.²³⁸ If we take an instrumental account of political obligation, then one would be required to fight anytime it would help uphold the legal regime or the existence of sufficiently valuable states. And, again, it is unclear why that duty would be owed to any particular state or to one’s own.

The problem, then, is to draw a significant line between citizens and noncitizens. If this is right, accounts of political obligation can explain why the state’s conscription of its own citizens is permitted. However, they will struggle to explain why compelled service in hostile forces to fight legal wars is a war crime, or why non-state groups are prohibited from conscripting individuals to fight legal wars. They provide a reason to the contrary. That is, they provide a reason in favor of everyone having duties to fight legal wars: duties that are owed not to one’s own state, but to the international community as a whole or to other relevant groups. If so, compelled service in hostile forces of the kind that does not involve objectionable forms of treatment or the breach of duties toward others would not be morally unjustified. Even though it would be worse than the state’s conscription of its own citizens, it would be permissible. And although prisoners and those in occupied territories would remain vulnerable, coercing them to fight would not, in at least some circumstances, constitute an instance of harming them, given that it would, in fact, constitute the enforcement of a moral duty to fight.

The second problem with theories of political obligation is that they cannot support the notion that many, or even most, individuals have duties to fight legal wars that are owed to all states, their own states, or the international community. This is so because theories of political obligation have been developed under certain ideal assumptions, mainly that institutions and states are reasonably just. In these ideal circumstances—that is, when states are just and treat members equally—these accounts might be able to take off, to a greater or lesser extent.²³⁹ But they struggle greatly in nonideal conditions, which are the present conditions of all states (and of the international community).

Take the fairness-based theory. In most states, individuals do not equally benefit from the state’s existence and the prevailing social arrangements. In fact, many states not only fail to benefit certain sectors of the population, but also actively contribute to their oppression and marginalization. That is, some individuals are not only not benefitted by the state but also are harmed by it. For example, Raff Donelson has argued that Black people and police in the United States are locked in a Hobbesian state of nature; that is, in this respect, the state fails to secure even the most basic conditions of personal security.²⁴⁰

A fairness-based account already struggles in ideal circumstances. It is hard to accept that the mere fact of benefiting from a social practice or institution provides a good argument as to why those who benefit—and who might not have accepted nor wanted the practice, the institution, or the benefit—can be burdened to obey.²⁴¹ This is particularly the case when the benefits are not equally distributed among members or the costs of complying with the obligations are higher than the benefits the person obtains from the social practice.²⁴² The latter is especially relevant in the context of conscription, in which the costs of compliance are very high (risking severe injury and death in addition to putting oneself at risk of committing morally wrongful acts), but the benefits any given person obtains from the state are likely to be significantly smaller in comparison. In nonideal circumstances this is an even more pressing issue, and the argument simply cannot take off: some individuals (often those who are most likely to be conscripted or drafted) do not benefit from the existence of the state, or benefit very little, and thus cannot be obliged to take on the higher burden of fighting to defend it.²⁴³

The same is true in an associative duties-based account: some individuals simply lack any reason to place noninstrumental value on the existence of their state. The state actively makes their lives worse or prevents them from living justly with others. And in a consequentialist or instrumentalist account, there is no reason for individuals to fight to defend a state (or an international legal order) that they might be better off without, or from which they might benefit little, when the burden of fighting is so high. Further, a consequentialist account could not establish that individuals have general duties to fight; it can only establish that, depending on the circumstances, sometimes some individuals will have duties to fight.²⁴⁴ These might seem, as Murphy writes, “banal empirical observation[s],” but “it is through ignoring such banalities that philosophers generate theories which allow them to spread iniquity in the ignorant belief that they are spreading righteousness.”²⁴⁵

Theories of political obligation then struggle in two dimensions. First, in their idealized versions, they might actually ground duties to fight legal wars that are owed to the community of states. If this is true, then there is a pro tanto reason against the CSHF prohibition when protected individuals are compelled to fight in hostile forces engaged in legal wars. This, of course, cannot provide a complete argument against the CSHF prohibition. Compelled service in hostile forces would remain both morally worse than the state’s conscription of its own citizens and morally wrong when involving objectionable treatment. Further, incentives to treat foreigners poorly would still exist and provide a reason in favor of the prohibition.

Second, in nonideal circumstances (which are the circumstances of all present states), theories of political obligation struggle to ground duties to obey the law. If they struggle to even do this, it is even harder to argue that they could support a duty to fight and kill on behalf of the state (or the community of states), given how demanding the duty is. Theories of political obligation are not unresponsive to the demandingness of the burdens associated with compliance with the law. In the case of conscription, the obligation to fight is extremely demanding, which makes it harder to defend. It is also a grave imposition on individuals’ freedom, which also makes it harder to justify.

If this is true, then both compelled service in hostile forces and the state’s conscription of its own citizens are unjust practices. Although the first is often worse than the second, both are morally wrong. This is so because, in nonideal circumstances, individuals often do not have duties, either toward the international community or toward their own state, to fight and kill on its behalf. And because many individuals lack those duties, it is wrong for states to force them to fight, even in legal wars.

3. The Authority of the State to Enforce Duties to Fight Legal Wars

Note that the argument so far does not establish that no one has duties to fight legal wars that are owed to the state or the international community. The argument only establishes that in nonideal circumstances, many individuals lack such duties. However, at least some individuals might have duties to fight legal wars because, for example, they benefit considerably from the existence of their own state, or they have consented to have such duties by, say, joining the army. Others might have duties to fight a legal war based on instrumental reasons, and some individuals might have moral duties to fight that are owed to their own community or group.

Of course, it can be hard for the state to identify who these people are. But if it could do so, and these individuals refused to fight, can the state enforce their duties to fight?

This is a question about the political legitimacy or authority of the state.²⁴⁶ Although related to the question of political obligation, it is not the same. The question of political obligation pertains to whether individuals have a prima facie moral duty to obey the law. The question of political authority, sometimes referred to as the question of political legitimacy, pertains to whether the state is justified in issuing and enforcing binding directives.²⁴⁷ Some think that the two questions can come apart: it is possible that states are justified in issuing and enforcing directives while, at the same time, individuals lack a moral duty to obey them.²⁴⁸

In the context of conscription, this would mean that states might be justified in demanding conscription of citizens, even in cases in which citizens might lack any duties to accept conscription that are owed to the state. If so, one could argue that compelled service in hostile forces is wrong because hostile forces lack authority over individuals to coerce them to fight on their behalf; that is, they lack political authority regarding foreigners, but they do have authority over their own citizens. Coercion over their own citizens is thus legitimate, whether it involves coercion to pay taxes or to fight wars.

The first problem with relying on the legitimacy of the state is that although most political philosophers agree that legitimacy does not require that the state is perfectly just, it does require that the state is reasonably just.²⁴⁹ However, a significant number of states are not even reasonably just. Only 57% of states in 2017 were “democracies of some kind,”²⁵⁰ and there is disagreement about whether certain states that are “democracies of some kind” are sufficiently just to be legitimate.²⁵¹ Thus, this reliance on political legitimacy cannot explain why conscription is allowed under international law. If anything, it should be severely restricted.

Second, political legitimacy aims to justify state coercion generally. But it cannot justify every instance of coercion: in fact, all theories of political legitimacy recognize limits. Conscription to fight in war is exceedingly burdensome. If, as I have argued, some citizens lack duties to accept conscription precisely due to the state’s failure to protect them, benefit them, or make them better off, then surely the state lacks any sort of prerogative, for the same reasons, to enforce directives to kill and die on its behalf. The state’s authority would only be plausible in the case of those very few citizens who have duties toward the state and, perhaps, in the case of those who have duties to fight owed to other members of the community (but not to the state). For example, in other work I have argued that when burdens are imposed unfairly, the group to which the individual belongs is corrupt or unjust, and refusal to fight is costly, individuals can have a pro tanto reason to accept conscription because refusing it would entail deflecting harm onto other, innocent individuals.²⁵² In this case, however, it is obvious that the state cannot enforce that obligation: that obligation is not, in fact, owed to the state, and the state is acting wrongfully when demanding conscription.²⁵³

In the first case, when duties to fight are owed to the state, we might still put into question the state’s legitimacy to enforce those duties, particularly in severely unequal and individualistic (i.e., capitalist) societies. Such societies, Murphy argues, incentivize individualism. In that context, there is something perverse about the state’s enforcement of obligations to fight and die on its behalf when doing so “presuppose[s] a sense of community in a society which is structured to destroy genuine community.”²⁵⁴

Finally, wars, even legal wars, involve causing harm to many innocent people, including civilians. Legal wars are also often fought unjustly, and war is a particularly morally risky enterprise. If, as Parry and Easton have argued, individuals have a presumptive claim against exposure to moral risk, which grounds duties in others (such as the state) not to expose them to moral risk, then states can hardly demand that individuals fight on their behalf.²⁵⁵ And if, as I have argued, the moral nature of what one is coerced to do matters, individuals facing conscription in conflicts that are fought in breach of jus in bello norms could also not be coerced to fight.

Ultimately, even if some citizens have duties to fight, it will often be the case that citizens are similarly placed to noncitizens: both will lack duties to fight that are owed to the state, and the state will lack authority to enforce duties to fight on its behalf. If that is true, then the state’s conscription of its own citizens is not generally permissible. Conscription to fight in wars cannot be justified as a legitimate practice deserving of international protection—at least not until states become more just.

IV. A MORALLY COHERENT INTERNATIONAL REGIME ON CONSCRIPTION

I have argued that the international legal regime on conscription is morally incoherent. In order to justify the regime, two arguments needed to succeed, yet both have failed.

First, the regime’s failure to distinguish between legal and illegal wars cannot be justified. There is no argument for why compelled service in hostile forces to fight a legal war is as bad or as wrong as compelled service in hostile forces to fight a war of aggression. The latter seems significantly worse. Although both cases involve coercion, the latter involves coercion to contribute to an international crime. As a result, the nature of the war one is coerced to fight should be an additional aspect of the crime.

Further, the distinction between legal and illegal wars is also relevant to the state’s conscription of its own citizens. States should be prohibited from conscripting their own citizens to fight wars of aggression. There are also powerful reasons for thinking such conscription should be a war crime; severely infringing individuals’ liberty to coerce them to participate in deeply wrongful acts does seem to “shock the conscience” of humankind. And it certainly seems sufficiently serious for international law to be concerned with it.

Second, although, all else equal, compelled service in hostile forces of protected persons is often morally worse than the state’s conscription of its own citizens, the argument that supports this claim—that it is worse to harm those who are vulnerable and to be forced to fight against those we care about—cannot render the state’s conscription of its own citizens permissible. It can only show that the state’s conscription of its own citizens might be morally better (or less bad) than compelled service in hostile forces. But the fact that the state’s conscription of its own citizens is morally better (or less bad) than compelled service in hostile forces cannot show, in and of itself, that conscription by the state is permissible or justified. In fact, in current, nonideal conditions, citizens will often lack duties to fight that are owed to their own states, and the state will also lack authority to compel its own citizens to fight, even in legal wars. This is what makes both compelled service in hostile forces and conscription to fight wars morally wrong.

There are thus powerful pro tanto reasons for international law to prohibit states from conscripting or drafting individuals to fight wars. There might also be reasons why it should be an international crime, for the same reasons why conscription in the context of illegal wars should be one. Conscription has been described as “tyranny,” and the peculiar horror of modern warfare as a social practice is that states can exercise “ ‘tyrannical power’ ” against their own loyal people as well as their enemies.²⁵⁶ I will not argue in favor of this, but, if that were the case, the war’s illegality should be an additional aspect of the crime of conscription.

In sum, to render the regime morally coherent, the state’s conscription of its own citizens to fight an aggressive war should be a war crime; the illegality of the war should be an additional aspect of the crime of compelled service in hostile forces; and the state’s conscription of its own citizens should be generally prohibited.

However, I have not made an all-things-considered case in favor of the prohibition of conscription at the international level. I have said that the regime is morally incoherent and there are powerful reasons to render it coherent. But perhaps there are other, more powerful reasons why the state’s conscription of its own citizens to fight legal wars should be permitted by international law.

First, one might argue that prohibiting conscription would make it harder to fight both legal and illegal wars. If anything, it would make the former harder than the latter. This is so because if international law prohibited conscription, the states more inclined to comply with that prohibition would be precisely those states more likely to be engaged in legal wars. By contrast, states likely to be engaged in illegal wars would be more likely to breach that prohibition. If that were the case, then international law would create a perverse system in which rogue states would breach the prohibition on conscription while law-abiding states would not, thus allowing the former to gain a significant military advantage over the latter.

I think this worry is overblown. It might be that legal wars would become more expensive to fight than illegal wars if the former relied on professional armies and the latter relied on conscription. However, I do not think the costs of war are reasons weighty enough to permit conscription, which severely infringes on individuals’ liberty. States would remain free to have AVFs, and they would also remain free to ask individuals to volunteer to fight in legal wars in certain circumstances.

Second, some might argue that conscription would be justified in the case of what Walzer calls a “supreme emergency,”²⁵⁷ in which the very existence of the state (or the international community) is at stake. Friedman, who was generally opposed to conscription, suggested that for a major war or “in times of the greatest national emergency,” a strong case in favor of compulsory service can be made.²⁵⁸ This would be the case if winning the war required conscription because, say, not enough individuals would volunteer to fight otherwise. But whatever one thinks about this case, it is not enough, on its own, to show that conscription should thus be generally permitted by international law. It only shows that there might be an exception regarding the (moral) prohibition on conscription. But laws should not be made thinking solely about the exception, especially when they are likely to be abused to commit wrongdoing.

Third, suppose that international law is committed to peace, as discussed before. And suppose, further, that lack of conscription in times of war makes it more likely for state leaders to go to war. This is known as the “chickenhawk syndrome.”²⁵⁹ Basically, the existence of an all-volunteer army, as opposed to one based on the draft or conscription, severs the connection between citizens and the wars that are fought on their behalf.²⁶⁰ Doing so makes war much easier: political leaders are more likely to go to war when war requires no personal sacrifice of their own or of their loved ones.²⁶¹

One might then posit that the prohibition on conscription would be self-defeating: instead of resulting in fewer wars, it would result in more wars, thus making peace more difficult to achieve. It is hard to know what to make of this objection. Conscription of citizens who lack duties to fight involves the violation of individuals’ most important rights at the hands of the state. It is at the very least controversial that we can make trade-offs regarding such rights in order to achieve certain desired outcomes (in this case, fewer wars).²⁶² If anything, the solution to this problem is to introduce other modifications, at the domestic or international level, to make wars harder to fight and that do not involve the violation of individuals’ rights.

Further, it is not clear at all that the inability of states to conscript their own citizens to fight would result in more wars. States are much more likely to fight illegal (and unjust) wars than to fight just ones, and allowing them to conscript individuals only facilitates their wrongdoing.

Finally, one might argue that although it is true that many individuals do not have duties to fight on behalf of the state, at least some individuals do. Thus, because at least some of the time conscription involves forcing people to do what they already have a duty to do, it should not be prohibited by international law. However, it would be impossible for states to determine who has moral duties to fight on their behalf (and in some cases, the state would still lack legitimacy to enforce such duties). As a result, a regime of conscription that could distinguish between those who have duties to fight and those who do not is not only unfeasible, but also very likely to get it wrong and thus violate individuals’ freedom by forcing them to fight. This seems to me a sufficient reason to generally forbid conscription to fight wars: if when states conscript individuals, they are likely to violate their freedom, then we should generally forbid conscription.

Again, the argument so far does not establish that all individuals in all circumstances lack duties to fight legal wars. Some individuals will have such duties in certain circumstances—for example, due to instrumental reasons, or because the state or the international community has benefitted them considerably. The point is only that the state should not be generally allowed to conscript individuals because, in present circumstances, many individuals lack those duties and conscripting them constitutes a grievous violation of their freedom. There are powerful reasons to generally prohibit state practices that are likely to violate the rights of many individuals, particularly when we cannot ensure that the practice is conducted in a way that can appropriately distinguish between individuals who have duties to fight and individuals who do not.

It is also important to note that the fact that there might be a legal prohibition on the state’s conscription of its own citizens does not present an insurmountable obstacle in enforcing moral duties to fight regarding those individuals who have them. In those cases, social pressure to comply with duties to fight might be justified, and states might be able to avail themselves of other mechanisms, short of coercion, to persuade individuals to fight on their behalf.

Thus, I think that the reasons in favor of rendering the international legal regime morally coherent are not just pro tanto reasons, but also conclusive ones, at least if one thinks that individuals’ rights should operate as a constraint on state action. International law should prohibit conscription to fight in war. The state’s conscription of its own citizens to fight aggressive wars should be a war crime. And the illegality of the wars individuals are compelled to fight should be an additional aspect of the crime of conscription.

There are different ways in which these changes could be achieved. For example, they might involve making conscription generally prohibited under international human rights law, with conscription to fight aggressive wars qualifying as a war crime under the jurisdiction of the ICC. It would also be necessary to grant extensive rights to individuals to engage in conscientious objection against military service in war and expand refugee protections to those who are forced to fight wars of aggression. The latter mechanism has the advantage of leaving the assessment of the war’s legality to domestic tribunals, thus bypassing the usual deadlock in the UNSC regarding these matters. I leave somewhat open the details of what a coherent regime would look like in practice, how it should be effectively enforced, and so on.

The idea that international law should generally prohibit conscription in times of war is, perhaps, completely utopian. It is also likely to destabilize other areas of international and domestic law, given how pervasive the idea that we owe special duties to our states is.²⁶³ We might thus think that it is very unlikely that states would ever agree to any such norm prohibiting conscription during times of war, and that we should focus on making it a crime for states to conscript individuals to fight in wars of aggression. This might be true as a matter of what is presently feasible to achieve and how we should set our priorities.

However, the fact that something is utopian does not alter the moral demands.²⁶⁴ If conscription to fight legal wars is often wrong, then it remains wrong—even if we are unable to prohibit it—at least until states or the international community are more successful in achieving justice.

CONCLUSION

This Article started by pointing out an asymmetry in the international legal regime on conscription: while compelled service in hostile armed forces is a war crime, a state’s conscription of its own citizens is the state’s prerogative. In order to defend the content of this regime, two arguments needed to succeed: first, that the distinction between legal and illegal wars is normatively insignificant; second, that compelled service in hostile forces is significantly different from conscription.

Both arguments have failed. The distinction between legal and illegal wars is morally significant. And although compelled service in hostile forces is wrong and is often morally worse than the state’s conscription of its own citizens, the latter is often not morally permissible.

As a result, the international legal regime on conscription is morally incoherent. It is morally incoherent because it is incomplete and cannot be made sense of.²⁶⁵ It is incomplete because it fails to criminalize or prohibit conduct (the state’s conscription of its own citizens) that is similarly wrong to what it already criminalizes (compelled service in hostile forces). Furthermore, it cannot be rendered morally intelligible in two ways. First, what makes compelled service in hostile forces wrong also makes the state’s conscription of its own citizens wrong, yet the latter is permitted. Second, international law fails to distinguish between legal and illegal wars.

None of these implications mean that the CSHF prohibition lacks value or is morally misguided. On the contrary, the CSHF rule prohibits something that is morally wrong in a context in which states are likely to have perverse incentives to use individuals in objectionable ways, and as such, it is a valuable rule. Nonetheless, the international legal regime on conscription allows states to use the lives and bodies of millions of people to fight wars, both lawful and unlawful. Furthermore, it allows them to do so through coercive means. In order to render this regime morally coherent, conscription should be prohibited, and the unlawfulness of the war individuals are coerced to fight should be an additional aspect of the crime of conscription.

After all, individuals have only one life to live, and wars sow destruction, death, and suffering on a massive scale. As Walzer notes, “there has never been a more successful claimant of human life than the state.”²⁶⁶

97 S. Cal. L. Rev. 1289

Marcela Prieto Rudolphy*

* Associate Professor of Law and Philosophy, University of Southern California Gould School of Law, Los Angeles, California, United States of America, mprieto@law.usc.edu. Profesora adjunta extraordinaria, Universidad Adolfo Ibáñez, Escuela de Derecho, Santiago, Chile. I am indebted to Scott Altman, Ángel Díaz, Felipe Jiménez, Mugambi Jouet, Erin Miller, Jeesoo Nam, Jessica Peake, and the attendees of the Southern California International Law Scholars Workshop; the attendees of the Diálogos of Derecho Internacional Conference; Norrinda Brown, Benjamin Zipursky, and the attendees of the Fordham Legal Theory Faculty Workshop; Jaya Ramji-Nogales and the attendees of the ASIL Mid-year Research Forum Meeting; Seana Shiffrin, Lawrence Sager, and the attendees of the UCLA Legal Theory Workshop; and the attendees of the USC Faculty Workshop for their generous feedback. A special thanks to Jessica Murphy for her assistance with research and the editors of the Southern California Law Review for their excellent work.

Housing Gridlock

September 2024

Brian M. Miller*

The housing crisis dominates much of political and economic life, and it is driven in large part by a lack of housing supply. Recognizing this, many commentators have called for the end of single-family zoning. And some jurisdictions have answered the call. But even if that groundswell grows, the housing shortage likely will persist. One underappreciated reason for that persistence is that a private network of restrictions stands ready to pick up where zoning leaves off. Specifically, restrictive covenants abound nationwide, and, just like zoning, they often cap how much housing can be built without regard to local or regional demand. The aggregate result of these covenants is what this Article calls “housing gridlock,” a side effect of dispersed private ownership rights that prevents property from reaching its full potential in service of both the public good and economic productivity. This Article explores the legal, financial, and, above all, psychological forces that brought about gridlock and further its hold today.

This Article then proposes a solution. Drawing primarily on the psychological phenomena and financial incentives that entrench housing gridlock in the first place, it crafts a two-stage procedural and remedial mechanism to break that gridlock. Critically, that mechanism aims to minimize psychic harm to existing property owners while freeing up additional housing supply. It begins with a lump-sum opt-in mechanism for willing covenant beneficiaries located near a proposed housing project, and an action only for damages within a predefined range for any remaining beneficiaries. This Article describes the finer details of this proposal and explores its potential advantages over other options in the preexisting legal toolkit from economic and (again, especially) psychological perspectives. Accordingly, it provides an asset to the multifront struggle for housing availability.

INTRODUCTION

The affordable housing crisis is well-documented: homelessness abounds;¹ those who live in homes spend a larger portion of their income on housing than in the past;² the rate of homeownership among young adults over the past several years is lower than it has been in much of recent memory.³ At a basic level, it is no mystery why this is the case. We have a severe housing shortage. Freddie Mac estimates a shortage of 3.8 million housing units nationwide.⁴ Though a simple model of supply and demand cannot tell the entire story, it can go far. More people competing for fewer homes means sellers or landlords can command higher prices.⁵ Thus, many people are without homes, and those who secure housing pay more to do so.

Unsurprisingly, then, the majority of the population agrees that the U.S. needs more housing.⁶ But that is where the consensus ends. What kind of housing? Where should it be located? How do we get more of it? These questions confound policymakers and prompt fiery reactions from people across the political and socioeconomic spectrums.⁷

One flashpoint in this contentious ordeal concerns zoning. Single-family zoning is perhaps “[a]s American as apple pie,”⁸ but it and other restrictions with similar practical effects suppress the housing supply. They inflate prices and drive suburban sprawl, disproportionately harming racial minorities and lower-wealth Americans.⁹ Indeed, racial animus likely motivated much of the initial push for single-family restrictions.¹⁰ Today, then, single-family restrictions hoard opportunity on behalf of existing homeowners.

Anyone who keeps their ear to the ground in the world of housing policy, urban governance, or even national politics has heard calls to abolish single-family zoning.¹¹ It’s not just chatter: multiple states and cities across the country, including Oregon,¹² California,¹³ Maine,¹⁴ and Minneapolis, Minnesota,¹⁵ have passed laws to break single-family zoning’s hold on housing development. These are welcome developments, but, of course, they alone are insufficient to solve the housing shortage.¹⁶

This Article targets another legal factor that blocks many cities and regions from meeting housing demand: single-family restrictive covenants. Restrictive covenants are private arrangements attached to property deeds limiting how property can be used, for the benefit of certain surrounding properties.¹⁷ The terms often apply for many decades to the burdened properties and automatically renew afterwards.¹⁸ Covenants thereby can “run with the land” and do not merely bind specific parties to an agreement.¹⁹ On the substance, restrictive covenants can limit property use in various ways.²⁰ One of the most prevalent covenant terms is one that restricts the burdened property to hosting one single-family home.²¹ Other common restrictions include minimum square footage, minimum setbacks from lot lines, and height limits.²² In other words, restrictive covenants, where they exist, often place the same lid on housing development through private rights as single-family zoning does through public regulation. And they often contribute de facto to the same racial segregation and wealth inequality that the now unenforceable racially restrictive covenants once pursued de jure.²³

These covenants are incredibly and increasingly common.²⁴ So, even if zoning restrictions were lifted across the entire nation, a sizeable portion of single-family properties would still be frozen indefinitely, regardless of the demand for housing in that region. And not only that, but there is also reason to suspect that even where such restrictive covenants do not yet exist, homeowners may seek to implement them when faced with the fear, post “upzoning,” that they will lose the “neighborhood character” they have grown accustomed to.²⁵ At the end of the day, policymakers could all get on board with unlocking more housing, but an extensive network of private legal apparatuses can still say no. If we care about finding real and lasting solutions to the housing crisis, we must explore freeing up property from longstanding private rights that often entrench inefficiencies and stand in opposition to the public good. At the same time, to avoid significant demoralization costs and psychic harm to property owners resulting from upended expectations in the short term (and to avoid certain political failure), we must find a path forward that seeks to honor owners’ reasonable expectations.

The legal academy has given this problem limited attention. At a high level of generality, many scholars have commented on some of the problems associated with restrictive covenants and other forms of deadhand control over real property.²⁶ Additionally, a few scholars have noted that restrictive covenants could swoop in where single-family zoning leaves off.²⁷ More specifically, Robert Ellickson recently provided a helpful synopsis of the problem of “stale” restrictive covenants of all forms and of some potential paths forward for those burdened by them to free themselves.²⁸

And, just this year, Gerald Korngold explored the likely prominence and harms of single-family covenants in the wake of zoning reform.²⁹ He evaluates whether invalidating such covenants might run afoul of the Takings Clause, and then urges courts to consider such covenants void as contrary to public policy under certain circumstances.³⁰

But the literature has only just begun to explore the nature and the depth of the problem as it relates to the housing crisis. This Article attempts to do so. It makes multiple new contributions to the discussion on the housing crisis and private law. First, it explains how the abundance of restrictive covenants contributes to housing “gridlock,” a scenario in which fragmented and dispersed ownership of private rights prevents productive and socially beneficial land use. Second, this Article employs an underutilized approach to both describe that problem and explore solutions. Building on the work of Stephanie M. Stern and Daphna Lewinsohn-Zamir, who have provided a general primer on the relationship between psychology and property law,³¹ this Article reviews contemporary psychological research on ownership, possession, and subjective attachment. It then applies that research, first, to suggest why restrictive covenants’ hold may be stronger and more damaging than some commentators have suggested and, second, to evaluate strengths and weaknesses of legal and policy tools to escape that hold. One of the primary insights this Article hopes to convey is that housing gridlock from restrictive covenants is a deeply psychological problem, and that any solution must account for the driving psychological phenomena.

This Article proceeds in two parts. Part I explains how restrictive covenants contribute to gridlock. It evaluates housing gridlock—how we got here—from legal, financial, and especially, psychological perspectives. Part II explores how to break out of that gridlock. It surveys potential paths forward from within the common law of property as well as certain regulatory or legislative approaches. Then, relying on common law, economic, and psychological principles, it constructs a new solution. That solution involves a statutory mechanism, with procedural and remedial components, that facilitates the bypassing of covenants when it would be efficient to do so. Finally, it aims to respect the preexisting interests of covenant beneficiaries along the way. In the process of explaining this proposed solution to housing gridlock, this Article explores the advantages—especially psychological—that the solution presents compared to other options.

I. THE TRAGEDY OF THE HOUSING ANTICOMMONS

This Part describes the housing gridlock that restrictive covenants have helped usher in. Section I.A identifies the concept of the tragedy of the “anticommons” and explains why it applies to the modern housing market generally and residential restrictive covenants specifically. Section I.B explores three interrelated social-scientific explanations of how we arrived at housing gridlock—a legal one, a financial one, and a psychological one—and it explains why the psychological explanation is particularly important. This Article thereby draws in part, but not exclusively, on law & behavioral economics to describe the problem, with a special emphasis on underappreciated psychological forces at play.

A. Gridlock as a Feature of the Modern Housing Market

Few concepts are more influential to the foundations of private property law than that of the “tragedy of the commons.”³² The basic idea is that when a resource (such as a freshwater lake that is useful for drinking and recreation) is unowned and subject to open access, each individual with access to it has a personal incentive to use it heavily and not limit themselves.³³ “Every other person might use it up first,” the thinking goes, “so if I don’t use it now I might never get any.” So, even though it would be in everyone’s interests to preserve the common resource for years and generations to come, the resource is instead spent or spoiled rapidly.³⁴

Theorists traditionally have proposed two main solutions to avoid the tragedy of the commons.³⁵ The first is centralized control by a public or governing body, giving authority and power to a single entity to perhaps allow public access to the resource, but limiting such access as necessary to preserve it.³⁶ The second is privatization, allowing private individuals or entities to own all or portions of the resource.³⁷ Under such an arrangement, the owners theoretically have an incentive to use the resource productively without spoiling it.³⁸

But there’s another tragedy that is far less famous. Michael Heller called it “the tragedy of the anticommons.”³⁹ While the tragedy of the commons describes an undesirable consequence of too much open access, the tragedy of the anticommons sits at the other end of the spectrum as an undesirable consequence of, in a sense, too much private ownership.⁴⁰ Here’s the idea: Certain public goods rely on the use of property. And, indeed, some of those goods cannot manifest unless either enough property is pooled together, or a smaller unit of property is utilized or developed to a substantial degree.⁴¹ Think of public transportation corridors, large parks, or stadiums. But if property is divided among too many private owners, or if the rights concerning one unit of property are spread among too many owners, then for those greater public goods to manifest, more people must be willing to relinquish their right to exclude others from using their small slice of the pie.⁴²

A fictional example can make this more concrete. Imagine a swimming pool that is not owned by one person or entity, but instead is owned by hundreds of different people in individual chunks. Each person owns around one square meter and has complete and total rights over their small space. If even a handful of the individual owners decide to fence off their small portion of the pool from anyone else, they could prevent all other people from swimming laps. Because ownership of the pool is fragmented across many parties, the resource—the pool itself—cannot reach its full productive potential.

Although the swimming pool example is farfetched, anticommons and gridlock can calcify under the radar in ways that shape the real world. Michael Heller points to biomedical patents.⁴³ Years ago, a large pharmaceutical company developed what looked to be a promising treatment for Alzheimer’s Disease.⁴⁴ But the treatment never made it out of the lab because it wasn’t financially feasible for the company to pay off all the holders of patents of necessary component technologies.⁴⁵ The U.S. government developed the patent system to incentivize innovation in the world of biomedical technology. And it did. Driven by patents as rewards, we underwent a revolution in biomedical technology in the late 20th century.⁴⁶ But sooner or later it rings true that “there is nothing new under the sun”⁴⁷ and one must build with the materials that already exist. When too many people have the right to exclude others from using those building blocks, building a wall from the bricks can approach impossibility.

Hence the term “gridlock.”⁴⁸ We might have the building blocks to assemble resources that would massively improve public welfare, but whether we reach our goal depends on whether assembling those building blocks is feasible or whether they will instead stick in place, widely dispersed. I argue that we are witnessing a tragedy of the anticommons in housing too. Dispersed ownership rights stifle the production of the housing necessary to meet demand and ensure greater affordability and stability for all.

The most well-documented source of housing gridlock is regulatory: specifically, zoning and related regulations.⁴⁹ Most cities are governed by a zoning code, which limits land use in different geographic areas to specific uses—such as single-family housing only.⁵⁰ And in most cities, a property owner who seeks to use their property in violation of the zoning restrictions may request a “variance”: permission from the local government to, say, build a quadplex or small apartment building in an area otherwise restricted to single-family houses only.⁵¹ Often, a variance request triggers an elaborate comment and review process;⁵² even for the smallest proposed projects, house owners can voice their opposition and demand studies on traffic, environmental impact, water runoff or burden on public utilities.⁵³ Because this process is expensive and time-consuming for the property developer, often the result is either that housing is never built, or the end product contains fewer units at a higher selling price than originally planned.⁵⁴ That being so, gridlock and anticommons can follow from not only too many people owning pieces of what we’d typically think of as “property”—a piece of land, a patent or copyright, for example—but also from too many people or entities having the right and power to stop others’ use of their property, demonstrating the fragmentation of property rights.⁵⁵

However, loosening zoning restrictions will not necessarily suffice to escape gridlock. In many regions in and around metropolitan centers, restrictive covenants could pick up much of the slack. Imagine a developer buys a property and wants to build multifamily housing—a condo building, four townhouses, or even just a duplex. The property isn’t subject to any notable zoning restrictions. But it is within the bounds of a homeowners association (“HOA”). And within the HOA’s Covenants, Conditions, and Restrictions (“CC&Rs”) is a provision stating that all properties may only be used for the operation of one residence.

In this case, however, there is no variance process. The developer could petition the HOA board to allow multifamily housing, but (1) the board, made up of other homeowners in the subdivision, has no interest in losing the single-family, detached-home character of their neighborhood;⁵⁶ and (2) in any event, CC&Rs often can be amended only by a supermajority vote of all property owners within the association,⁵⁷ so if the developer wanted the legal right to build even a duplex, numerous owners in the vicinity would have to agree to the change. The chances of succeeding are slim.

Perhaps that would strike some people as just fine. We are talking about a single-family neighborhood, so maybe the property owner should simply find somewhere else to build the condos, townhouses, or duplexes. And anyway, the restrictive covenants should come as no surprise, right? The developer should have known what it was getting into. But for at least three reasons, restrictive covenants have contributed to and will continue to produce housing gridlock—not only as to an individual piece of property here and there, but also for residential properties and housing supply at local, regional, and national levels.

First, HOAs and single-family covenants are common and even rising in popularity. Single-family homes are the most prevalent form in the United States housing stock, representing around two-thirds of all units.⁵⁸ And around a quarter of all single-family homes in the United States are subject to HOA governance.⁵⁹ That share is poised to increase in the coming years—over 60% of new, single-family homes are subject to HOA governance, and the number is over 80% for new homes constructed in subdivisions.⁶⁰ Moreover, because a single-family restriction is one of the most common covenant terms for HOAs,⁶¹ it is reasonable to assume that the large majority of single-family homes under HOA governance are bound by a restrictive covenant to stay that way. The practical effect of HOAs’ increasing popularity for single-family home developments, and the prevalence of single-family use restrictive covenants, is that it is increasingly difficult to purchase a home that could one day be expanded into multiple units.⁶² And that is so even without taking into account the prevalence of other sorts of covenant terms that in practice allow only for single-family development—like minimum square footage requirements for each residence.⁶³ For this reason, I, like Lee Anne Fennell and James Winokur, am skeptical of arguments that decentralized governance by collections of private covenants could meaningfully provide for an open “market” of governance options for homebuyers.⁶⁴ Increasingly CC&Rs converge on this land-use uniformity and gobble up land as they do so.

Second, some people who own single-family homes either purchase them without recognizing that they are bound by a single-family covenant and the significance of that fact, or eventually come to want freedom from such a covenant when they stand to benefit from its removal.⁶⁵ Most people, we can assume, buy a home primarily because they like the home and its general location. And their awareness of HOA restrictions and amenities may not extend far past knowledge of the monthly or annual fee, as well as the access owners have to common areas, like a swimming pool.⁶⁶ Over time, research shows, homeowners often come to disfavor many restrictions governing their private land use.⁶⁷

Third, an obvious but important fact: land is a scarce resource.⁶⁸ As some land gets developed, the overall share of available land shrinks—especially land well-connected to amenities. Every property and subdivision developed for single-family use accelerates land consumption and constrains the resources available for future housing development.⁶⁹
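The approximate shares cited under the first reason above can be combined in a back-of-the-envelope calculation. The sketch below is illustrative only: it treats the rounded figures from the text (around two-thirds and around a quarter) as if they were exact, and the variable names are hypothetical labels rather than terms drawn from the underlying data sources.

```python
from fractions import Fraction

# Rounded shares taken from the text; treated as exact purely for illustration.
single_family_share_of_units = Fraction(2, 3)  # single-family homes / all housing units
hoa_share_of_single_family = Fraction(1, 4)    # HOA-governed / all single-family homes

# Implied share of ALL housing units that are single-family homes under HOA governance.
hoa_single_family_share_of_units = (
    single_family_share_of_units * hoa_share_of_single_family
)

print(hoa_single_family_share_of_units)                  # 1/6
print(f"{float(hoa_single_family_share_of_units):.1%}")  # 16.7%
```

On these rounded inputs, roughly one in six housing units would be a single-family home under HOA governance. The result is only as precise as the estimates it rests on, and it says nothing about how many of those homes carry a single-family covenant.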

Of course, these phenomena conspire to lock individual land parcels into perpetually hosting one single-family home. But the aggregate effects could be substantial too. We have a housing shortage, but where new housing is being constructed, it’s disproportionately single-family and subject to a covenant keeping it that way.⁷⁰ Such development, being less dense and involving excessive land consumption, contributes to suburban sprawl.⁷¹ Radiating out from city centers, more and more large land agglomerations are “filled” with sparse housing; and because of covenants the gaps are not easily filled in later. That means that new housing necessarily stretches away from population centers rich in jobs, recreation, social services, and other amenities.⁷² This phenomenon disproportionately burdens lower-wealth residents for whom greater transportation costs and longer commuting times are especially difficult to manage.⁷³

And it likely will not improve on its own. Although covenant authors can specify that restrictions expire after a certain number of years, they usually do not do so.⁷⁴ Instead, restrictions often apply unless removed through an onerous process, in effect sticking to the corresponding land indefinitely. So, the more HOAs, the more single-family restrictions, and the more land is consumed by indefinite limitations on housing.

Thus, private rights can prohibit cities and surrounding regions from meeting housing demand.⁷⁵ A society filled with single-family land use may function well for a time. But that time is likely to be short if the regional population is to increase. Unused land is depleted, so would-be developers cannot simply buy another lot and build it up. The land that is already occupied is stuck in a web of private promises keeping it from being redeveloped to accommodate more people. Thus, gridlock is inescapable. Properties that the market would pay top dollar to purchase and convert into multiple housing units cannot budge. Too many people—most often HOA boards and members—have the unilateral right to say no.

B. Why Did We Get Gridlock?

The previous Section explained housing gridlock and how restrictive covenants contribute to it. But there is a more fundamental, or at least chronologically prior, question: Why do we have so many HOAs and restrictive covenants? No one factor or force can explain it all, but this Section discusses three separate, yet related explanations: one legal, another based on financial considerations and incentives, and a third psychological. It explains why the psychological element is particularly critical to grasp.

1. The Legal Backdrop

Non-possessory rights in land date back several centuries.⁷⁶ In fifteenth-century England, much land that up until then had stood open for passage was enclosed, and so free travel became more difficult.⁷⁷ In response, the “easement” was born as a legal right of passage through land under private control.⁷⁸ However, for centuries, those non-possessory interests rarely extended to provide rights allowing people to limit how others used other properties.⁷⁹

Through the 1800s, the Industrial Revolution and corresponding urbanization brought people closer together and also presented a new array of land uses that could have unwelcome spillover effects on neighboring properties.⁸⁰ Non-possessory property interests that could be enforced to limit neighbors’ land uses proliferated.⁸¹ But even then, courts at first would apply such restrictions to the parties to the original agreement only.⁸² That changed in the mid-1800s, when American courts began to enforce these restrictive covenants not only against the parties to the initial agreement, but also against successive owners of the relevant properties.⁸³ The thought at the time was to expand property markets, turning contract-created development rights and limitations into transferable commodities.⁸⁴ Such a framework, proponents thought, would help efficiently allocate land uses to properties where they were most suited.⁸⁵

That history is our inheritance, and today the legal rules surrounding restrictive covenants make it quite simple to lock land uses into their present states for long periods of time. For a covenant to run with the land, one traditionally needs four things: (1) intent for a burden on the property to run to future owners; (2) horizontal privity (typically satisfied if the restrictions were placed when the properties were held in common and at an initial land conveyance or sale); (3) vertical privity (meaning a chain in conveyance of the property all the way back to the owners at the time the burden was created); and (4) a burden that touches and concerns the burdened land (a notoriously fuzzy requirement, which for the purposes of this Article can be understood as a requirement that the burden somehow involves land use as opposed to a non-land-related restriction on owner behavior).⁸⁶

To simplify, this standard generally means that if at the initial sale of a piece of property, the right people aim to restrict how the property can be used into the future, they just have to say so, and future owners will be bound in precisely the same way. If someone owns two adjoining lots and wants to sell one, they can simply agree with the first purchaser that the transferred lot can only be used for one single-family home, unless the owner of the other property gives express permission to do otherwise. If that restriction is recorded with the deed to the sold property, then it can potentially stand for generations, even after all involved properties are sold again and again.

The most prominent source of single-family covenants today is likely HOA CC&Rs. Such restrictions are quite easy to create even though they can burden large collections of properties in one shot. The standard process is that a developer purchases a large piece of land, then subdivides it in accordance with local legal requirements.⁸⁷ In so doing, the developer, as the sole owner, has the right to set conditions on the properties’ future use, and can establish that the properties will all be members of a common-interest community (for example, they may be governed by an HOA and perhaps share in some common amenities).⁸⁸ The developer can then outfit each subdivided piece with a house and all requisite utility infrastructure. The properties at that point are worth far more taken together than the large portion of land the developer initially bought.⁸⁹ So the developer sells each home. If the developer specified in the community’s CC&Rs that each property is limited to single-family use, then each would-be purchaser must simply accept that restriction or look elsewhere.⁹⁰ And because the geographic areas where the developer acquires large chunks of land are often sparsely populated at the time and thus better suited to single-family homes, it is quite common for a developer to choose to include single-family covenants in a subdivision’s founding documents. In the short term (which is what matters to the developer), there is a market for it.⁹¹

Once the developer has sold off all the properties, it no longer retains control over how each property is used. That authority, with the pre-drafted restrictions, typically transfers to the HOA or the community members when the last property is sold, if not before.⁹² However, the restrictions tend to stick. There are many reasons for this, some of which will be explored in more detail below. But from a legal standpoint, the most important factor is that under the standard documentation for these common-interest communities, CC&Rs can only be amended through rather difficult means. It is typical for the restrictions to be amendable only by supermajority vote of the homeowners in the community.⁹³ It is not immediately obvious why in any given case the developer makes it so difficult to change restrictions. As discussed in Section I.B.2, infra, perhaps the developer suspects that buyers will pay more for the assurance that their neighborhood will not change in years to come. Or perhaps it is because such odious provisions are simply features of standard form CC&Rs today.⁹⁴ Either way, an entire new community is left with the restrictions, and it often takes nearly the entire community to be of the same mind to change the restrictions to even a single property.

In sum, the American legal system dove headfirst into commodification of fragmented land interests. In doing so, it made it quite simple for owners to restrict use of land for generations to come and to divide the power to restrict among many parties. And in the modern day, large developers use the same basic legal tools to cement indefinitely the single-family use of potentially hundreds of properties at a time—such that even when the developer is long gone, the law ensures that meaningful change is unlikely.

2. Financial Incentives

The financial considerations surrounding housing gridlock are essential but insufficient to understand the problem. Although they might partially explain why single-family restrictive covenants became so prevalent, they sometimes fail to explain why those covenants persist in certain locations. Basic financial considerations may suggest that restrictive covenants are eventually discarded when they expend their basic utility.⁹⁵ But that often does not play out in practice.

As the preceding Section suggests, financial incentives often encourage developers to mandate single-family use at the beginning of a new neighborhood’s life. For developers with massive capital and hopes of substantial profit, the goal is to construct, and then sell, a lot of homes.⁹⁶ One option would be to buy one or multiple smaller plots of land in an urban center where demand for multi-family housing near preexisting amenities is high. Some developers do take that path. But if other developers have already gobbled up city land for that purpose, or if local zoning provisions prevent a developer from building enough housing units to turn a large enough profit, then they might look elsewhere simply as a matter of necessity. In growing regions, that means looking to the open pastures of suburbs and beyond, where demand for housing is beginning to rise but land is still relatively cheap.⁹⁷ If they cannot build upward, they build outward.

That may partially explain why developers build so many single-family homes in expansive, sprawling subdivisions. But it does not necessarily account for why they restrict all the homes in the subdivision to single-family use. No one forces them to do so.

Although more research could be done as to why developers take that step, there are a couple of related possible explanations. First, and most importantly, there is reason to believe that, at least at the time the developer is first selling off the homes, single-family restrictions do not materially diminish the values of the properties they constrain. Most buyers of newly constructed single-family homes in sparsely populated areas are not interested primarily in an investment opportunity; they are looking for a place to live.⁹⁸ Thus, they perhaps do not care at the outset whether tomorrow or years from now they could be prevented from converting their home into a duplex, or from adding an accessory dwelling unit. Relatedly, buyers are not simply buying a home subject to a restriction; they are buying a home that is surrounded by other homes subject to a restriction. Someone looking for a home in a sprawling development in a sparsely populated region might prefer some assurance that their neighborhood character will remain stagnant.⁹⁹

But there comes a time when land cannot keep up with demand. As the population increases in a region that was once comparatively empty, the public interest demands more homes. As urban centers fail to accommodate the housing needs of prospective residents, more people seek to move into outer ring suburbs.¹⁰⁰ If the land is already substantially filled in these areas by single-family homes on large lots, housing prices will climb.¹⁰¹ Under such circumstances, developers could easily fill multiple housing units per lot.

Thus, once housing demand increases sufficiently, single-family covenants may come to deflate individual property values. As a general rule, per-unit construction costs decrease as the number of units in the structure increases.¹⁰² Further, the more housing units that can be constructed, the more total revenue is up for grabs.¹⁰³ For those reasons, a property that can host twenty, ten, or even two units is typically more valuable than a property that can only host one.¹⁰⁴ That is the basic economic incentive for developers to meet higher housing demand. And although no comprehensive study has been done to measure the effect on a property’s value of lifting a single-family covenant, there is analogous research regarding zoning.

That zoning research seems to point one way: “upzoning” increases property values where housing demand is high. This is especially true for single-family homes. One study found that in Minneapolis, the citywide upzoning initiative increased the value of single-family property by around 3%.¹⁰⁵ Another study found that in Chicago, properties near transit services saw a dramatic increase in value from upzoning—by 15–20%.¹⁰⁶ A study focused on Auckland, New Zealand also saw notable value increases.¹⁰⁷

Why, then, does the gridlock persist? If properties tend to jump up in value when restrictions limiting them to single-family use are lifted, why do single-family restrictive covenants still dominate so much of our single-family housing stock? There may be several reasons, including the high transaction costs that someone who wants to bypass a covenant would incur by seeking waivers from all covenant beneficiaries. But from the perspective of all other parties involved, one of the potential causes is that those who would have to decline to enforce the covenant could in the short term suffer a drop in property value, or at least a drop in the subjective value they place on their property and on living in that neighborhood. In other words, although unlocking growth on a given parcel increases that parcel’s value, surrounding parcels might not similarly benefit, especially if they remain restricted. That is one possible reason why even though upzoning a particular property tends to increase its value, properties that are part of an HOA subject to single-family restrictions are sometimes valued higher than comparable properties not within an HOA.¹⁰⁸ Multiple studies have concluded that constructing new multi-family housing on a particular piece of property tends to depress the rent of nearby properties.¹⁰⁹ This result is no surprise, and is in fact a core goal of upzoning’s proponents.¹¹⁰ Thus, adding to the housing stock tends to reduce, or at least slow the increase of, the price of housing units that would compete against the new housing for occupants. This is perhaps a laudable result for the public at large, but one that is unwelcome to local homeowners centrally concerned with their own property’s value.

Yet, it is unclear whether such an effect holds true when looking at the value of surrounding homes that are not in the arena to compete for multi-family occupants, but that instead simply remain single-family properties. In theory, if enough potential buyers of a single-family home are dissuaded by the presence of multi-family housing nearby, then a home that remains a single-family property while its neighbors turn into multi-family homes might decrease in value. The evidence, however, struggles to show that such an effect plays out in practice. Again, borrowing from the zoning context, a survey of parts of the greater Raleigh, North Carolina area suggested that the upzoning of property had no significant effect, either positive or negative, on the value of neighboring properties.¹¹¹ Admittedly, it can be difficult to isolate the effect of upzoning and determine whether it alone tends to depress the value of neighboring properties. For one, demand for housing is often already quite high in those geographic areas (hence the upzoning decision). Also, upzoning and an increase in multi-family housing units may be accompanied by new amenities like restaurants or stores that can make the neighborhood more attractive to potential buyers.¹¹²

On the whole, then, single-family development with single-family restrictive covenants may make some financial sense at the time the covenants are first established. But as a general rule, whenever housing demand increases sufficiently, single-family covenants suppress the values of the properties subject to them. And although homeowners might assume that allowing other nearby properties to host multi-family development could suppress their own home’s value, the actual evidence may not definitively support that theory. What the evidence does support is that if all owners within an HOA could free up their own properties for multi-family use, then they would likely see their property values rise.¹¹³ But the restrictions persist. Not enough owners within an HOA get on board with the change. Given that this perpetual gridlock is not fully explained by the financial interests of the homeowners, another force must be at play. The next Section turns to it.

3. Underappreciated Psychological Forces

In general, owners of properties subject to a common single-family covenant stand to benefit financially from the lifting of that restriction. The fact that they do not take the plunge leads to a conclusion that is perhaps intuitive: owners who resist upzoning (and for the purposes of this Article, those who would try to enforce a single-family covenant if given the chance) do so not necessarily because they have calculated a quantifiable financial loss they might suffer, but at least in part because at a psychological level they have an aversion to the change in neighborhood character that the entry or proliferation of multi-family housing might bring about. This perceived change in quality of life can be a driving force even if it does not manifest in quantifiable financial harm. Because human psychology is potentially such a critical impetus behind the prevalence and enforcement of restrictive covenants, any solution to the resulting gridlock would benefit from a review of the relevant psychological principles. This Section serves that purpose.

Property ownership is not only a legal status; it has deep personal implications for the owner and potential subsequent owners.¹¹⁴ That being so, the concerns that motivate the retaining, using, and transferring of property are not only financial; they are psychological as well. This Section reviews contemporary psychological research that can help explain why property owners may hold onto possessions and entitlements notwithstanding contrary financial incentives.

People develop psychological attachments to their possessions. From a philosophical standpoint, thinkers from Hegel¹¹⁵ to Margaret Radin¹¹⁶ have argued that owning property is a prerequisite for human freedom, because at a psychological level people begin to associate what they own with who they are. Objects become part of the subject; material things contribute to immaterial “identity.” In other words, people need property to develop a fuller concept of self-personhood.

Psychological research supports this theory to a degree, particularly through two related key concepts: the “Endowment Effect” and the “Mere Ownership Effect.”¹¹⁷ Although the two concepts are not perfectly identical, they refer to a similar phenomenon—essentially, that people tend to place greater value on a thing they own or possess than on the exact same thing if they do not own or possess it.¹¹⁸ This Article refers to the two concepts interchangeably. The classic methodology to measure these phenomena is to construct an experiment in which randomly selected people are assigned to be either a “buyer” or a “seller” of some item, such as a coffee mug.¹¹⁹ The studies find that the “sellers” possessing the item assign it a higher value than do the “buyers” not possessing it.¹²⁰

The Mere Ownership Effect thus reveals that if we own or possess something, by that fact alone we will come to think of it as more valuable. The phenomenon applies to all sorts of “possessions,” even nonphysical ones, like intangible entitlements.¹²¹ And it applies even when a person has not had time to use or become more accustomed to the possession—merely being informed of possession or ownership is enough to trigger the effect.¹²²

Extrapolating to the broader world of property law, the Mere Ownership Effect means that owners of property will tend to value their property above general market value. Thus, they will resist selling an item even if offered the highest value reasonable buyers might pay. The Mere Ownership Effect means, therefore, that the status quo is sticky. Property will tend to stay in the same hands even when buyers are willing to pay the value of the property’s economic utility.

A related psychological principle is that “losses loom larger than gains.”¹²³ Someone who stands to lose a possession is likely to think that they will lose more than another person would think they gain by acquiring that same possession.¹²⁴ And the fear of losing X amount of value is felt more acutely than being denied an opportunity to gain X amount of value.¹²⁵ Some researchers have suggested that this principle of “loss aversion” is what gives rise to the Endowment Effect.¹²⁶ Whether or not that is true, research shows that people are bad predictors of the importance of preference satisfaction.¹²⁷ In other words, people tend to think that having their preferences frustrated will be worse than it actually is. Once someone gets attached to anything—an object, a right, or even an idea—then they tend to overestimate how much they need it, and how bad it will be to lose it.¹²⁸

So, people tend to overinflate the value of anything they begin to conceive of as within their grasp. But perceptions of possession are not the only force that binds people to their things.

The more an owner associates a possession with meaningful memories and social relationships, the greater value the owner is likely to place on the possession.¹²⁹ This principle is intuitive—people subjectively value the sentimental. We value our close relationships with other people, and so we value the items that represent or remind us of those connections. Margaret Radin thus has argued that property law should place greater value on possessions like heirlooms, keepsakes, personal body parts (such as donatable organs), and the family home.¹³⁰ She calls these “personhood” property, possessions that are uniquely tied to the owner’s conception of self-identity (for example, a person who identifies as a “parent” is likely to place greater-than-market value on a picture that their child drew for them).¹³¹ Psychological research backs this up (although, interestingly, perhaps not as strongly for the family home as for some other possessions).¹³² All in all, our possessions take on greater meaning, and thus greater subjective value, when we come to associate them with things that give life deeper meaning, like family and friendship.

Just as people tend to place greater subjective value on possessions that are associated with meaningful social connections, people generate the most subjective value of all from the social connections themselves.¹³³ How many movies resolve by reminding the characters and the viewers that family, friends, and so forth are most important? The trope exists because it resonates at a psychological level.

Take these psychological phenomena and combine them. Humans place a value premium on things they already own or possess. We fear losing something more than we would desire to gain that same thing if we did not already have it. We overestimate how much we will suffer from losing something. And of all the things we can own, we guard most fiercely that which we associate with social and relational ties. If all these phenomena are borne out in the psychological research, another conclusion flows naturally: if we hold a tool that enables us to stop or slow change to our neighborhood, we will value that tool dearly—more so than we would value the opportunity to get that tool in the first place.

At this point the connection to restrictive covenants is coming into focus, and we can begin to answer the big question: Why do people hold onto restrictive covenants, especially those mandating single-family use only, when relinquishing such a right could make the most financial sense and when HOA restrictions are often unpopular among those subject to them?¹³⁴

In part it is because we are wired to see what we have (for example, our neighborhood as it currently stands) as much more valuable than what we would pay to get it in the first place.¹³⁵ Our neighborhood is ours, so ipso facto it is more valuable.¹³⁶ Thus, the people who must consent to the reworking or circumventing of a restrictive covenant are the same people who are psychologically predisposed not to do so, even if offered the equivalent of top market value to relinquish the right. Homeowners who buy a home in an HOA with single-family use restrictions might or might not see the HOA restrictions as a selling point. But once the home, the neighborhood, and the restrictions become theirs, psychological forces cement the status quo.

II. UNLOCKING HOUSING POTENTIAL

What can be done to break the housing gridlock? Legal mechanisms and economic interests lay the groundwork for gridlock by blanketing vast swaths of land in single-family restrictive covenants. And later, when both financial incentives and the public interest dictate that a change might be needed, the entrenched collection of private property rights collaborates with features of human psychology to stifle that change. This Part explores various options for breaking the gridlock covenants wrought. It evaluates each option through economic and psychological lenses and ultimately arrives at a new solution: in areas without single-family zoning but limited by a single-family covenant (or other covenant with a similar effect like one prescribing minimum square footage per unit and setbacks from property lines),¹³⁷ a property owner seeking to build multi-family housing can take advantage of a new procedural and remedial mechanism provided by statute. The mechanism in its first stage would encourage covenant beneficiaries to waive their enforcement rights, and in its second stage would limit remaining beneficiaries to one specific remedy that would allow efficient bypassing of covenants while still giving effect to preexisting legal rights. After this Part scans and evaluates a catalog of other options to avoid single-family covenants, it will explain this proposal in more detail and explain its advantages over other options, especially, but not only, from a psychological perspective.

A. Preexisting Tools for Breaking Gridlock

This Section explores some of the tools already available to private parties, courts, and policymakers for potentially bypassing the gridlock caused by restrictive covenants. It will evaluate the strengths and weaknesses of each from legal, economic, and psychological perspectives. Ultimately, it concludes that the standard toolkit likely will not be enough to accomplish the goal.

1. Private Bargaining’s Limitations

The simplest (not necessarily the easiest) way to avoid a restrictive covenant is to obtain a waiver—to convince the relevant party or parties not to enforce the covenant. Someone who seeks to develop multiple homes on a property could offer money to the property owners who have the power to enforce a single-family covenant.¹³⁸ No doubt this approach could work sometimes. But one of the central points of the preceding Part is that it often will not. Housing gridlock means that too many people have the right to say no. This is especially true in HOAs that include dozens or hundreds of homes. If a restrictive covenant will stand unless the large majority of homeowners vote to get rid of it,¹³⁹ then all it takes is one or a small handful of holdouts who cannot be paid off to let it go. Psychological phenomena tell us that holdouts are likely.¹⁴⁰ So, many homeowners will require well above “market value” to relinquish a covenant, if they could name a price at all.¹⁴¹ Only in cases in which the builder’s expected profit is enormous will the builder successfully buy development rights from all the parties that can stop them.¹⁴² If that were common, housing gridlock might not be a significant problem. But because psychological phenomena suggest that people will likely value and guard their covenant rights, the buyout method is unlikely to yield meaningful results on its own.
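The holdout arithmetic behind this point can be made concrete with a minimal sketch. It rests on strong simplifying assumptions that do not come from this Article or its sources: every owner is assumed to accept a buyout independently and with the same probability, and the function name `prob_supermajority` and all input values are hypothetical.

```python
from math import comb


def prob_supermajority(n_owners: int, votes_needed: int, p_yes: float) -> float:
    """Probability that at least `votes_needed` of `n_owners` agree to lift a covenant.

    Assumption (illustrative only): each owner agrees independently with the
    same probability `p_yes`. Real neighbors influence one another, so this
    is a rough binomial model, not an empirical estimate.
    """
    if not 0 <= votes_needed <= n_owners:
        raise ValueError("votes_needed must be between 0 and n_owners")
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must be a probability between 0 and 1")
    # Sum the binomial probabilities of k = votes_needed, ..., n_owners "yes" votes.
    return sum(
        comb(n_owners, k) * p_yes**k * (1.0 - p_yes) ** (n_owners - k)
        for k in range(votes_needed, n_owners + 1)
    )


if __name__ == "__main__":
    # Hypothetical inputs: a 100-home HOA and three assumed acceptance rates.
    n = 100
    for p in (0.6, 0.7, 0.8):
        three_quarters = prob_supermajority(n, 75, p)
        unanimous = prob_supermajority(n, n, p)
        print(f"p_yes={p:.1f}  75-of-100: {three_quarters:.4f}  unanimous: {unanimous:.2e}")
```

Under these assumptions, the chance of clearing a high voting threshold falls off quickly as the number of owners grows or as the per-owner acceptance rate drops, and a unanimous waiver is the limiting case in which a single refusal defeats the effort. The sketch illustrates only the structure of the holdout problem, not its real-world magnitude.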

2. Broadscale Invalidation

Turning attention to the other end of the spectrum from free market to government control, the state or local government could simply declare single-family covenants invalid or unenforceable. Such an approach is not merely hypothetical. California has done it—at least for certain special circumstances. In 2021, when the state passed a law removing single-family zoning statewide, it also targeted certain covenants. It declared that any restrictive covenants limiting “the number, size, or location of the residences that may be built on the property, or that restrict the number of persons or families who may reside on the property,” are unenforceable against an owner or developer of “affordable” housing.¹⁴³ The language of the provision makes clear that it does not destroy all single-family covenants statewide, but instead only applies to those covenants that would restrict the construction or operation of housing that is subject to certain significant rent restrictions.¹⁴⁴ But in the sphere where the provision applies, its word is final. Any existing single-family covenants are powerless, no matter how longstanding they are or how many other properties purportedly benefit from them.

This total-abrogation approach has some benefits. It can cover many existing covenants all at once, providing numerous developers and landowners advance notice and assurance as to the (lack of) legal force those restrictions carry moving forward. It also, of course, completely bypasses the holdout problem. It does not matter whether all, or even any, property owners benefiting from the covenant can be persuaded to relinquish their hold. And relatedly, it perhaps does not directly cost the government or the builder anything.¹⁴⁵ The philosophy is simple: if the problem is that too many people have a private property right enabling them to say no to development, then remove the right entirely.

But total government abrogation is a blunt instrument. As such, it might bring the hammer down on the housing gridlock caused by single-family covenants while also rattling other interests that we would prefer to leave undisturbed. As described above, the primary harm from the loss of a covenant often is not so much financial as it is psychological.¹⁴⁶ And psychological research shows that being the subject of government coercion can be especially damaging.¹⁴⁷ Take a study regarding the use of eminent domain, which is similar in some ways to government invalidation of a covenant (in that it takes a property right or interest away from a private party). The study posed to its subjects multiple scenarios in which an owner parted with a parcel of land. In two scenarios, the owner agreed to sell it, and in the other, the land was confiscated by eminent domain in exchange for compensation.¹⁴⁸ But in all the scenarios subjects were informed that the owner of the parcel valued it at the same specified market value.¹⁴⁹ Nevertheless, subjects reported that the owner was worse off in the eminent domain scenario.¹⁵⁰

Stern and Lewinsohn-Zamir call this the “coercion premium.”¹⁵¹ It represents the fact that people suffer some sort of harm from being forced to part with property that is separate from and in addition to the value of the property itself. From that premise, Stern and Lewinsohn-Zamir suggest that any use of eminent domain to which the Takings Clause applies should require “just compensation” above the property’s market value.¹⁵² The idea is that if what the owner lost is beyond market value, then just compensation is also more than market value.

This principle carries at least two implications for restrictive covenants. The first concerns the takings issue itself. Courts across the country appear split as to whether a restrictive covenant gives the benefitted properties the sort of interest such that they are entitled to compensation when the government extinguishes that interest through a taking—but many jurisdictions likely would say that it does.¹⁵³ So, any such proposal might run into significant constitutional challenges depending on where it is enacted.¹⁵⁴ In the jurisdictions that might find the extinguishing of a restrictive covenant to be a taking, the government could incur an expense—perhaps a substantial one—to destroy the covenants.¹⁵⁵ And whatever “just compensation” ultimately might be, the coercion premium makes it more likely that landowners would challenge the action in court, thus presenting another cost to the government from litigation.¹⁵⁶

The second implication is that coercively terminating restrictive covenants will in itself inflict some level of psychic harm.¹⁵⁷ The precise amount of harm is perhaps impossible to quantify. But if the question of best policy turns on utilitarian considerations to any degree, the psychic harm must be part of the calculus nonetheless.¹⁵⁸ How much weight and credence it deserves is of course another question. Perhaps local or state policymakers would conclude that the benefit of more housing from terminating numerous single-family covenants is worth the cost of the psychic harm to single-family homeowners.

However that cost-benefit analysis shakes out, policymakers contemplating it should keep in mind two pillars of political participation: voice and exit.¹⁵⁹ These two forces drive much local (and, to an extent, state) policymaking. Voice refers to residents’ ability to give their input, either through detailed communication of some kind or, most significantly, through voting.¹⁶⁰ If enough people are fed up with policymakers’ decisions, they can vote them out. If that happens, then the officials who fill the vacancies might be more amenable to the homeowners’ interests (and therefore might try to undo the undoing of the covenants, leading to an end result that is worse than before—a reestablishment of the conditions that led to gridlock and new government leaders who might be more anti-housing on the whole).

The related force, “exit,” refers to residents’ ability to vote with their feet by moving away from jurisdictions with policies they disfavor and into jurisdictions with policies they favor.¹⁶¹ Exit casts a shadow over local and state government decision-making because when residents exercise it, the jurisdiction loses economic activity and housing demand, and so the tax base and revenue diminish.¹⁶²

Voice and exit are of particular concern when the displeased demographic is homeowners. For one, homeowners tend to be especially active in local politics,¹⁶³ so any new policy or law that disfavors them is sure to garner strong opposition. Also, because homeowners (especially of single-family homes) tend to be wealthier than the average citizen,¹⁶⁴ their exit can be distinctly harmful to localities’ short-term fiscal goals.

But it is possible that in the long run these bogeymen of voice and exit would not prove catastrophic to the jurisdiction seeking to “take” the covenants. As for voice, if lifting single-family covenants creates room for enough new people to move into the jurisdiction early enough, then perhaps some of the strength of an anti-new-housing voting bloc can be diluted by new participants in public life. And as for exit, perhaps the exit of wealthy homeowners who prefer freezing neighborhoods as single-family only would not be that harmful to the jurisdiction if the removing of covenants unlocked trapped property value and increased the number of taxable housing units in the base.¹⁶⁵

Overall, then, if a local or state government chose to address housing gridlock simply by destroying single-family covenants, it would have to be ready for some economic and political backlash. And although it may turn out that destroying the covenants is worth the backlash on the whole, the coercion premium brought about by such actions should give decisionmakers pause and reason to evaluate less heavy-handed legislative options. Robert Ellickson recently surveyed several such options, including statutory provisions limiting the lifespan of covenants to two or three decades.¹⁶⁶ He disfavors such restrictions because, if set on too short a timeline, they may terminate covenants before the covenants have expended their value to those directly benefited.¹⁶⁷ I would also be wary of them because of the flipside of the coin: If set on too long a timeline, they might allow harmful covenants to maintain their hold for too long. In other words, such provisions may sometimes prove beneficial, but they are not narrowly tailored to the sources of the problem.

3. Traditional Common-Law Tools

Although covenants are supported by strong legal backing, the common law provides some principles to escape them in special cases. The two most obviously relevant here are the changed-circumstances doctrine and voiding as contrary to public policy.¹⁶⁸ The changed-circumstances doctrine in some ways comports with the psychological underpinnings of possession and attachment. But both that doctrine and the principle of voiding as contrary to public policy are unlikely to be sufficiently accessible from a legal standpoint to make much difference in the aggregate.

Under the common law, the changed-circumstances doctrine holds that a party owning a property burdened by a restrictive covenant can escape its enforcement if local conditions have changed so that the covenant no longer serves its purpose and no longer benefits the once-benefited property.¹⁶⁹ For example, consider if one property (hosting one house) was in covenant with another (also hosting one house) to never have loud parties, but years later the benefited property is purchased and converted into an industrial site. A court could determine that the purpose of the covenant, presumably to ensure peace and quiet for whoever lived in the home on the benefited parcel, can no longer be carried out because that property no longer hosts a home. The party can go on.

Relying on the changed-circumstances doctrine has some intuitive appeal for voiding single-family covenants. Imagine that six homes on one street were under a covenant for single-family use only. But twenty years after the establishment of the covenant nearly all of the other homes on the street and the surrounding area have been converted to duplexes, quadplexes, or small apartment buildings. If the purpose of the covenant was to do what could be done to preserve the single-family character of the entire surrounding area, and (so the owners thought) to preserve property values, perhaps the covenant can no longer serve its purpose. Psychologically, the feeling of loss the owners might experience by seeing the covenant dissolve would probably be diluted, because the neighborhood had already been changing for years. And if it was once financially beneficial to lock in single-family use, perhaps increased housing demand in the area (signaled by all the new construction) makes it so that the properties would be worth more if not so limited.

But the problem with the changed-circumstances doctrine is that courts employ it only in rare cases.¹⁷⁰ As a general matter, courts hesitate to find changed circumstances unless there is no reasonably conceivable benefit to the covenant.¹⁷¹ In the context of single-family covenants, that would be incredibly hard to show. At least theoretically, ensuring that a neighboring property or properties remains single-family could benefit the surrounding properties’ values, because it suppresses local housing supply. The purpose of the covenant may not be economically efficient, but as long as the court finds a purpose, the covenant likely will stand.¹⁷²

And the fact that such covenants often appear as part of the regulations of an HOA with dozens or more properties makes the changed-circumstances doctrine an even weaker tool. If an HOA with a single-family covenant covers a sufficiently large swath of land, then it necessarily will insulate the properties within it from much neighborhood change. Instead, the entire subdivision or several blocks of the neighborhood remains a single-family enclave, and the areas beyond the HOA’s bounds are the spaces that might change. Of course, proliferation of multi-family housing outside of the bounds of the HOA is more than likely a primary reason why HOA members would want the single-family covenant to persist. The covenant’s purpose is to preserve the area as distinct from its surroundings.

So, although declining to enforce single-family covenants because of changed circumstances makes sense from a psychological standpoint and is probably less likely than other approaches to elicit backlash from psychic harm to homeowners, in practice under current common-law rules it is unlikely to be a powerful tool to break housing gridlock on any meaningful scale.

Declaring such a covenant invalid as contrary to public policy is potentially more powerful, but ultimately suffers from a similar weakness. Under the Restatement approach, a covenant is invalid if it is contrary to public policy.¹⁷³ It is rare that a court concludes a covenant is contrary to public policy.¹⁷⁴ But when it does happen, often it is because a state (or sometimes even the federal government) has made clear through enacted statutes or other written law that what the covenant aims to accomplish is specifically disfavored, or that it stands directly opposed to a goal codified in state law.¹⁷⁵ In the context of restrictive covenants, the public policy exception closely adheres to the unique principle that a court’s enforcement of certain restrictive covenants constitutes “state action.”¹⁷⁶ Thus, when a court is faced with a request to enforce or invalidate such a covenant, it must keep in mind whether doing so would contravene the explicit goals expressed in the state or federal constitution, or any other law such as one passed by a legislative body.

If a court concluded that enforcing a single-family covenant was contrary to public policy, then that would be a powerful tool to break apart some of the housing gridlock. The result of such a decision would completely bypass the covenant and thus free up the land for more productive residential use. But it is unclear what it would take to prove that enforcing a single-family covenant is against public policy, and even that inquiry could vary substantially by state. There are some easy cases. Take California, for example. There, again, the state passed a law declaring that any single-family covenant is unenforceable against a developer or operator of “affordable housing.”¹⁷⁷ Quite plainly, then, there is a public policy against single-family covenants blocking the development of affordable housing. But that is an easy case because the statute already does all the work—the statute by its own enactment guts all such covenants, so the “public policy” exception on the judicial side of the equation is unnecessary.

And there are easy cases on the other end of the spectrum, when enforcing such a covenant would clearly not violate public policy. If the government actor has already zoned large portions of similar land for single-family use only, then it would be hard to credibly assert that the government has any discernible public policy against private agreements that would do the same.

But there are cases between these two extremes that are trickier. Take for instance any of the jurisdictions that have effectively eliminated single-family zoning on the public side of things, but unlike California have said nothing about private covenants.¹⁷⁸ On the one hand, one could argue that a jurisdiction that took the affirmative step of freeing all land from the constraints of government-induced single-family restrictions would likewise oppose private arrangements that constrained property in a similar way. On the other hand, one could also argue that if the government wanted to touch private covenants, it could have said so more explicitly. And in any event, a government could reasonably want to loosen government constraints but remain agnostic on what private entities and private rights accomplish in the same subject area.

In an article released this year, Gerald Korngold specifically argues that the public policy doctrine is a viable method for voiding many single-family covenants.¹⁷⁹ He rightly notes that occasionally courts in some jurisdictions have viewed the doctrine more broadly, applying it even without explicit enacted guidance from the legislature on the issue.¹⁸⁰ I would welcome extending that approach to the issue of single-family covenants. But because it appears that such an approach would mark a stark departure from how courts in many states have approached the public policy doctrine, I think it profitable to seek another solution.

In sum, I believe that the void-as-contrary-to-public-policy approach likely is not a reliable mechanism for breaking gridlock moving forward under the current state of the law. At the very least, in many jurisdictions it would potentially require the jurisdiction to already have eliminated single-family zoning, and so far, that has happened only in a handful of cities and states. And even in those jurisdictions, it is likely that some other specific expression of a policy against restrictive covenants operating to constrain housing supply would be required.

B. A Hybrid Solution

This Section first briefly identifies a potential multi-step solution to escaping housing gridlock while limiting negative externalities and psychic harm to property owners. It then explains each component of the solution in more detail and explores why it presents some advantages over other approaches from legal, economic, and especially psychological standpoints.

1. The Two-Stage Solution to Escape Gridlock

This Section gives an overview of a procedural and remedial solution to housing gridlock that single-family covenants (and covenants with a similar effect) helped bring about. The first stage of the solution involves a new legal mechanism created by statute. It would be available to any property owner subject to such a covenant in an area without single-family zoning. If the property owner (I will call this entity the “builder” for clarity) wants to construct multi-family housing on its property, it can force the owner of each beneficiary property to decide whether to decline to enforce the covenant. The builder would submit to the HOA members a rough plan for the development, such as whether it intends to create a duplex, add an accessory dwelling unit, or build an apartment or condo building, as well as the estimated size of the structure. Along with the rough plan, it would submit a lump-sum monetary offer—the amount the builder is offering to the property owners collectively to not oppose the development. The beneficiary property owners (most often this will include all other members of the HOA) then can individually decide whether to “opt in” to the offer. Those who do so will be entitled to a pro rata share of the total sum.

The second stage of the solution acts as a backdrop. Whether or not a majority of interested property owners opt in to receive a pro rata share of the monetary offer, the construction can go forward. But for those who did not opt in, a legal action for damages will still be available once the structure is completed to a habitable state. They cannot sue to block the development or require it to be torn down if it is already built; but they will be entitled to damages of some amount. There would be no need to determine liability—the builder’s violation of the covenant would be open and undisputed. The action, if it does not result in settlement, will proceed to a trial or hearing on the issue of harm alone. At that proceeding, the property owner would be entitled to a jury determination of damages within statutorily defined ranges, which could rest not only on evidence of the market value of enforcing the covenant in that instance, but also on any sort of harm (financial, psychological, or otherwise) the suing property owner suffered from the new construction.
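
The arithmetic of the two stages can be illustrated with a minimal sketch. It rests on assumptions the proposal itself leaves open: the pro rata share is taken to be an equal division of the lump sum across all beneficiary owners (so an owner's share does not depend on how many others opt in), and the statutory damages range is represented by a hypothetical floor and cap. All names and dollar figures are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    name: str
    opted_in: bool
    # Hypothetical jury finding; relevant only if the owner did not opt in and sues.
    jury_award: float = 0.0


def stage_one_payouts(offer: float, owners: list[Owner]) -> dict[str, float]:
    """Stage one: owners who opt in receive a pro rata share of the offer.

    Assumption: the share is an equal split across ALL beneficiary owners.
    """
    share = offer / len(owners)
    return {o.name: share for o in owners if o.opted_in}


def stage_two_damages(owner: Owner, floor: float, cap: float) -> float:
    """Stage two: owners who did not opt in may recover damages, but only
    within the (hypothetical) statutorily defined range [floor, cap]."""
    if owner.opted_in:
        return 0.0  # opting in forgoes the later damages action
    return min(max(owner.jury_award, floor), cap)


owners = [
    Owner("A", opted_in=True),
    Owner("B", opted_in=True),
    Owner("C", opted_in=False, jury_award=50_000.0),
    Owner("D", opted_in=False, jury_award=1_000.0),
]
payouts = stage_one_payouts(40_000.0, owners)
damages = {o.name: stage_two_damages(o, floor=2_000.0, cap=30_000.0) for o in owners}
```

In this sketch the construction proceeds regardless of who opts in; the only thing that varies is whether a given owner is paid through the offer or through a bounded damages award.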

The following Sections explore each component of the proposed solution in more detail and discuss the comparative advantages over other approaches from legal, economic, and especially psychological perspectives. They address the damages backdrop first since that component is more central to the overall scheme, and then address the initiating offer second.

2. Why Damages?

By now the stickiness of restrictive covenants is evident. They are hard to get rid of without express agreement by most benefited property owners.¹⁸¹ Broad coercive action by governing authorities might achieve the primary goal of breaking gridlock, but in the process it might generate various negative externalities rooted in the psychic harm that research suggests it would cause.¹⁸² And the common-law mechanisms limiting the perseverance of such rights are narrow and apply rarely.¹⁸³ But courts and the common law can contribute another tool: remedies. It is one issue what private rights parties possess; it is another issue which way a court will operationalize the right.¹⁸⁴ At the simplest level, sometimes a court can issue an injunction protecting a right, while other times it can order damages as payment for violating the right.¹⁸⁵

A promising tool for breaking the hold restrictive covenants have on housing supply would be—to use the classifications made famous by Calabresi and Melamed’s foundational work—converting the right provided by such a covenant from a property rule to a liability rule.¹⁸⁶ In other words: enforcing covenants only in such a way that the beneficiary of the covenant can receive payment for its violation, but cannot by force of law keep the burdened property in compliance. In a recent article, Robert Ellickson briefly suggested a damages approach and commended one state, Massachusetts, which has provided for it by statute in some special instances.¹⁸⁷ This Section explores a damages approach for single-family covenant violations from legal, economic, and psychological viewpoints. It then discusses the finer details of how such an approach could be implemented from a practical standpoint.

i. The Legal Landscape

From a legal perspective, damages are the preferred remedy across much of the common law.¹⁸⁸ Even for property rights (which one would rightly assume are often protected by a “property rule”), courts often require the plaintiff seeking to enforce its right to show why equitable relief like an injunction is appropriate. For intellectual property, electronic property, chattel, and sometimes even real property, a plaintiff must show that damages cannot adequately compensate them and that they will suffer irreparable harm without an injunction.¹⁸⁹ And even after all of that, a court might decline to issue an injunction if it finds it inequitable to do so based on the interests both of parties to the lawsuit and of third parties.¹⁹⁰ Thus, at a high level of generality, it comports with common-law remedial principles to presume that damages are a proper remedy when a restrictive covenant is violated.

In practice, however, most jurisdictions will enforce restrictive covenants by injunction.¹⁹¹ A covenant is a property interest and so, the thinking goes, an injunction to protect against its violation is presumptively appropriate.¹⁹² Injunctions are often available simply on a showing that a covenant was violated, without any necessary demonstration of harm by the complaining party.¹⁹³ And indeed, an injunction in some circumstances might even require a property owner to tear down a structure that was erected contrary to the covenant.¹⁹⁴ For restrictive covenants specifically, many courts favor injunctions to enforce them precisely because it is challenging to quantify the harm from violating them.¹⁹⁵ From a historical perspective, this practice follows from a unique turn of events in Anglo-American law that lowered the bar for what a restrictive covenant beneficiary would have to show to obtain equitable relief. Traditionally, covenants could be enforced against successors in interest only when the original covenanting parties were in horizontal privity; but eventually courts placed restrictive covenants under the umbrella of a new form of nonpossessory interest known as an equitable servitude, which more freely allowed for enforcement by injunction.¹⁹⁶

Yet in most jurisdictions courts still ultimately retain remedial discretion, and thus need not enforce a covenant through an injunction if doing so would be inequitable.¹⁹⁷ In Blakeley v. Gorin, a property owner planned to construct a large hotel and apartment building that would connect to the neighboring property at the rear via a large, elevated bridge.¹⁹⁸ But a covenant required property owners to leave a sixteen-foot-wide space behind their buildings at the rear of the property.¹⁹⁹ The restrictive covenant was over a century old and originally served to preserve a cart path.²⁰⁰ Even though the court noted that cart paths are now mostly obsolete, it explained that the covenant still served the valuable purpose of preserving light and air for surrounding properties.²⁰¹ Thus, the court held that the covenant should be enforced.²⁰² However, the court chose damages instead of an injunction to do so.²⁰³ In its view, the harm to surrounding properties from reduced light and air was minimal compared to the benefit to the developer and the public from the more productive use of the land.²⁰⁴

Likewise, sometimes courts specifically conclude that damages are an adequate legal remedy for the violation of a restrictive covenant. In Crossmann Communities, Inc. v. Dean, a builder violated a setback covenant by beginning to construct a house too close to the property boundary line.²⁰⁵ A neighboring property owner within the planned community sued to enjoin the construction.²⁰⁶ The Court of Appeals of Indiana held that although the covenant was enforceable, damages were an adequate remedy because “[a] restrictive covenant constitutes a compensable interest in land.”²⁰⁷

When it comes to single-family covenants, though, courts almost never elect damages instead of an injunction.²⁰⁸ Why? It is not necessarily because they walk through the standard equitable considerations and determine an injunction is necessary. Instead, courts typically note that for violations of such covenants, plaintiffs are not limited to damages they can prove.²⁰⁹ Essentially, the court simply states that an injunction is allowed in the face of a violation and goes on its way.²¹⁰

That approach would be misguided in many cases dealing with proposed or completed multi-family housing development. If restrictive covenants were treated like most other private legal rights, then a plaintiff would have to affirmatively show that an injunction is necessary to safeguard their interests and that such relief is equitable considering all parties affected by it.²¹¹ In many cases, a property owner benefiting from a single-family restrictive covenant may have a very difficult time making such a showing. As described above in Part I, presumably from a homeowner’s perspective, the main reason to cherish a single-family covenant is that it preserves property values, both by suppressing nearby housing supply and by preserving “neighborhood character.” But, again, the evidence is tenuous that removing such a covenant significantly harms surrounding property values.²¹²

Regardless, if there is some measurable harm to property value by terminating the covenant, then damages could conceivably compensate for it because it would be financial loss.²¹³ And if there is no measurable harm to property value, that likely means the harm is not so much financial as psychic. Assuming psychic harm provides the foundation for a cognizable legal interest, it is true that damages might not always perfectly compensate for it. From a psychological standpoint, people prefer in-kind redress over monetary redress, even when the loss they suffered is of a fungible asset.²¹⁴ Extrapolating to the context of psychic harm, it is reasonable to assume that if there is real psychic harm from terminating a covenant, damages might not completely compensate for it. That may be one of the reasons the Restatement (Third) of Property and many courts remark without much explanation that because harm from the violation of a restrictive covenant is hard to quantify, injunctive relief is appropriate.²¹⁵

Still, under standard remedial principles, a court may go on to evaluate whether injunctive relief—even if a better compensator than damages—is appropriate based on a “balance of equities”; that is, whether an injunction makes sense given the harm such relief might cause to both the defendant and the broader public.²¹⁶ Often it does not make sense in light of those considerations. The benefit to the plaintiff homeowner would be avoiding some indeterminate amount of psychic harm. But the harm to the defendant and the public from an injunction (that is, strictly enforcing the covenant) might be more pronounced. The owner of the land limited by the covenant would suffer the financial consequences of only being able to operate a single-family home when multi-family use might be dramatically more profitable. And the public would suffer the consequences of one more instance of constrained housing supply.

The legal landscape thus points two ways here. On the one hand, courts are generally free to opt for damages instead of an injunction when presented with a violation of a covenant. On the other hand, courts in practice rarely do so, especially for single-family covenants. I next move on to the interrelated economic and psychological perspectives on the issue to explore why courts’ overprotectiveness of such covenants with injunctive relief is likely misguided.

ii. Economic Considerations

From an economic perspective, damages are often the best remedy if available. That is because they allow for “efficient breach.”²¹⁷ This concept is most prominent in contract theory, but it applies more broadly. To put it simply, damages place a number value on a legal right or duty. If a party wants to violate such a right or duty, damages set how much they must pay to do so.²¹⁸ And most commonly, damages aim to compensate the injured party for what it lost. So, if a violating party chooses to violate a right, and then pays damages to do so, theoretically that party has determined that its action is worth more to it than the money it had to pay. The end result is that the injured party is no worse off than before, and the violating party is better off—approaching a Pareto-efficient result.²¹⁹

If instead a violating party does not have the opportunity to violate a right and pay damages—indeed, if it cannot permanently violate a right at all because it is prohibited by an injunction—then the law entrenches inefficiency.²²⁰ The violating party cannot pay to get something they value more than the money, even when allowing them to do so theoretically would not ultimately make anyone else worse off.

To make it plain for the context of this Article: in theory a single-family covenant has a specific quantifiable value to a beneficiary of it (in that the owner assumes it enhances their property value).²²¹ And a prospective builder expects a certain financial gain from being able to construct a duplex, quadplex, or other multi-family housing development.²²² If the value the builder expects to gain from the project is greater than the value of the single-family restriction to the neighbors, then an efficient framework would allow the builder to violate the covenant and pay the neighbors what the restriction on that particular property was worth to them. The builder still profits, and the neighbors are no worse off (and that is not to mention the added benefit to the public at large from the increased housing supply).²²³
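
The comparison just described reduces to a simple inequality, which the following sketch works through with made-up numbers. The function name and every dollar figure are hypothetical, and the sketch assumes that damages exactly equal what the restriction was worth to each neighbor, which is the theoretical ideal rather than a description of how any court computes an award.

```python
def efficient_breach(builder_gain: float, neighbor_values: list[float]) -> tuple[bool, float]:
    """Compare the builder's expected gain with the total value of the
    restriction to the neighbors.

    Returns (is_efficient, builder_surplus_after_damages), assuming damages
    equal the sum of the neighbors' values (an idealized assumption).
    """
    total_damages = sum(neighbor_values)
    return builder_gain > total_damages, builder_gain - total_damages


# Hypothetical: the builder expects to gain $300,000 from a quadplex, and each
# of five neighbors values the restriction on that parcel at $20,000.
is_efficient, surplus = efficient_breach(300_000.0, [20_000.0] * 5)

# Hypothetical contrast: the restriction is worth more to the neighbors than
# the project is to the builder, so paying damages would not be worthwhile.
not_worth_it, shortfall = efficient_breach(80_000.0, [20_000.0] * 5)
```

Under a liability rule, the first project goes forward and the second does not, because the builder bears the damages in both cases. Under an injunction, neither goes forward, whatever the numbers.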

As explained in Part I above, although such a result could theoretically be achieved through private bargaining, the logistical difficulties of doing so mean that a “liability rule” might be necessary to reach an optimal result.²²⁴ Under Calabresi and Melamed’s famous and foundational remedial framework, the standard perspective is that a right should be protected by a property rule when transaction costs are low, and a liability rule when transaction costs are high.²²⁵ The reason for this is that when transaction costs are low, such as when only two parties in close proximity are involved, then it is logistically simple for them to negotiate a buyout.²²⁶ But when transaction costs are high, such as when a builder must obtain permission from many parties, then a liability rule bypasses drawn-out negotiations and holdouts and jumps straight to compensation.²²⁷

For single-family covenants, a liability rule is often more appropriate. This is so especially when the population of property owners with the right to say “no” to multi-family development is large and thus the risk is great of holdouts who will accept no reasonable price for a change.²²⁸ Combined with the psychological phenomena leading neighbors to assume that such restrictions are worth more than they in fact are,²²⁹ this means that a lawsuit with damages as the end result is often a necessary tool to reach an efficient arrangement that private bargaining cannot achieve.²³⁰

What about an obvious and important economic objection—that any such limitation on the strength of single-family covenants might in the short run constrain housing supply? The idea is that there must be some profits-focused reason why subdivision developers include the single-family covenant in new CC&Rs.²³¹ And so if such a provision is weaker, the developers might have a harder time selling off homes initially. By extension, they perhaps would have less incentive to develop land in the first place.

Two responses: first, although it is true that homes within HOAs tend to be more valuable than those not in HOAs,²³² the evidence does not indicate that reciprocal single-family covenants are necessarily the primary reason for that. It is more likely that the better services within an HOA compared to those of the surrounding locality are a major selling point, with the complete, broad network of restrictions playing a role alongside.²³³ Indeed, a single-family restriction on a given piece of property likely suppresses that property’s value.²³⁴ So even if it were a selling point that other surrounding houses are covenant-bound to remain single-family, any value bump from that fact could be offset by the fact that the same restriction applies to the purchased property too. Put another way, any “demoralization costs” of limiting the enforcement of restrictive covenants may in part be offset by the “morale benefits” of loosening constraints on individuals’ free use of their properties.²³⁵

Second, the objection fails to distinguish between local market conditions at the time of initial sale of the home and the time of subsequent transfers of the property to new owners. Most likely, by the time an owner of a house restricted to single-family use determines that they want to develop additional housing, years have gone by since the covenant was initially placed.²³⁶ When the initial developer of the subdivision first sold off the homes, it was able to capitalize on any value added immediately from the single-family restrictions (if there was any value added). So, the fact that legal remedies might make some room for market pressure towards multi-family housing down the road is unlikely to significantly affect the initial profitability for the developer from building out and selling off the subdivision of homes restricted to single-family use. The availability of damages down the road unlocks new housing and probably will not stifle initial subdivision development.

iii. Psychological Phenomena at Play

One might argue, though, that the precise problem is that you cannot put a numerical value on a single-family restriction from the point of view of the neighbors.²³⁷ This very Article even suggests that psychological attachment, more than quantifiable financial interest, explains the perseverance of such covenants.²³⁸ But even if it is difficult to put a number on what it means to lose the benefit of a restrictive covenant, psychological phenomena suggest that enforcing a covenant, but ordering damages, is perhaps the best way to pursue the public interest in a way that does minimal harm to landowner expectations.

First, I will address some psychological research that might seem to point away from a monetary remedial scheme. Research shows that people prefer in-kind redress over monetary relief.²³⁹ When they lose something, they are likely to be more satisfied if it is replaced with something similar—even if the thing lost is fungible.²⁴⁰ Relatedly, Stern and Lewinsohn-Zamir argue in favor of property rules and injunctive relief because in their view this principle means that courts systematically undercompensate injured parties.²⁴¹ If that is so, then perhaps one might assume that a property rule enforced by an injunction makes the most sense; if money in exchange for the loss of a right does not seem to make the injured party feel whole, then monetary relief is inadequate and an alternative remedy like an injunction is necessary.

Relatedly, the Mere Ownership Effect may partially explain the existence of property rules in general. If a liability rule assumes that basic infringements on property rights can be rectified through pay, property rules assume that sometimes they cannot.²⁴² A property rule promises that the legal system will protect a property interest even if infringement of that interest does not manifest as quantifiable financial harm (hence the classic standard for injunctive relief, which requires the plaintiff to show that they will suffer irreparable harm without an injunction and that legal remedies like damages do not adequately safeguard their interests).²⁴³ Property rules assume real harm even absent affirmative proof of it.²⁴⁴

To put it in terms of the Mere Ownership Effect: if something becomes “mine,” then for that reason alone it is more valuable to me.²⁴⁵ Therefore, an act that I see as an affront to the thing being “mine” hurts me, even if it causes no measurable damage to the object and even if it does not prevent me from using my property. If my right to exclusive possession is not respected, then I have lost something.²⁴⁶ At the end of the day, then, monetary compensation would fall short because the core of the harm is difficult or impossible to quantify.²⁴⁷ And so, if we care about that non-quantifiable harm, the best way to guard against it would be a property rule expressed through injunctive relief.²⁴⁸

That argument is compelling, but it suffers from a few shortcomings when applied to the issue at hand. First, it would mean mostly surrendering to gridlock. If we assume that, because owners of restrictive covenants would rather have the covenants than money, we must accommodate that preference, then life will go on as it has. In the numerous and increasing swaths of land across our country where single-family homes under covenant dominate, but housing demand is high, we will simply persist indefinitely in the housing shortage. Psychological realities may be quite important, but they do not alone carry all the normative weight. Property owners might report that they prefer something over something else, but whether the law should cater to the preference is another matter entirely. Stern and Lewinsohn-Zamir, despite their favoring of in-kind remedies and property rules, note that just because people might prefer one kind of remedy does not mean the law necessarily should accommodate their desire.²⁴⁹

Indeed, while we can generate crucial insights by understanding psychological phenomena undergirding people’s relationship to their possessions, we likely cannot satisfy short-term preferences of all property owners and correct the legal and market mechanisms that have brought about housing gridlock at the same time. Again, recall the Mere Ownership Effect.²⁵⁰ Because people value their possessions above market rate by the simple fact that they possess them, a feature of human psychology makes efficient transfer of goods less likely.²⁵¹ The Mere Ownership Effect thus entrenches a status-quo bias into people’s relationship to property. And property rules simply bolster that status quo, regardless of whether it is economically efficient or socially beneficial to do so.²⁵²

Next, even from a psychological standpoint, we must take the preference for in-kind redress itself with several grains of salt. This Article explained above how the fear of neighborhood change has such deep psychological roots.²⁵³ But there is another element at play here. It is the fact that people are bad estimators of their future flourishing.²⁵⁴ Take, for example, one of the most serious forms of compulsory lifestyle change: relocation. If people develop a fear of and aversion to changes in their familiar surroundings like their neighborhood, then even more so they assume they will suffer deep psychic harm from being forced to move somewhere else entirely. But the psychological reality is that that is generally not the case. In fact, there is almost no evidence at all that people who must relocate to another home suffer any lasting psychological harm from that event.²⁵⁵ There are some narrow exceptions, such as people living in poverty for whom relocation often means eviction and homelessness.²⁵⁶ But for most people, even though the prospect of change may hurt, they readjust quickly.

That is not to say the initial discomfort does not matter. But when the choice is between an injunction that freezes a housing shortage under force of law and damages that both attempt redress and free up new housing, we have some important questions to ask: If any discomfort from the addition of new housing may wane into the immeasurable given enough time, then why operate under legal rules that, in effect, indefinitely prevent such change? And if psychological research shows that a person would not pay over market value to get a restriction in the first place, then why should we effectively “pay” them far above market value (expressed through unending specific performance)²⁵⁷ when they stand to lose that restriction?

Furthermore, not only do people adjust to new circumstances despite their initial discomfort, they also can be nudged to see their own possessions and entitlements through a different lens. Specifically, the degree of psychic harm a person experiences by losing a legal entitlement can be influenced by how that entitlement was previously framed and presented to them. One study measured this effect among a group of first-year law students.²⁵⁸ Students were divided into two groups and all were given a laptop. The students in one group were told they owned the laptop, which, they were informed, included the right to use, exclude others, and transfer.²⁵⁹ The other group was told that they owned all those same sticks in their bundle of property rights but not specifically that they owned the laptop itself.²⁶⁰ Both groups were then informed that the school was placing certain restrictions on the laptops’ use. Between the two groups, the students who were told they owned the laptop were more likely to report that they “lost” something because of the restrictions compared to the students who were merely told they had a collection of specific rights regarding the laptop.²⁶¹ Researchers later concluded that forewarning property owners of potential restrictions on their free use of property leads those owners to feel less like their rights had been violated when restrictions were later presented.²⁶²

So, if owners of a possession or entitlement are informed about the qualified nature of their right, they are less likely to feel a sense of loss when restrictions manifest later. For single-family covenants, this means that people’s expectations about what entitlements a restrictive covenant gives them potentially can be adjusted. If, in practice, they begin to see that violation of such a covenant entitles them to some form of financial compensation, then over time they may adjust to see their interest in the covenant as a financial one—and not an amorphous, yet deeply cherished, interest that entitles them to block neighborhood change.

On the other side of the coin, that same principle demonstrates why the damages approach has some psychological advantages over other methods of breaking housing gridlock, especially those involving sweeping mandates or broad invalidation of such single-family covenants. A damages regime enables people to relearn the nature of their interests. A wide-reaching statute that terminates or voids all single-family covenants in one shot gives people no time to adjust their expectations about their own entitlements before those entitlements are taken away. But if the covenants stay in place and are simply enforced differently when violations arise, then people perhaps have room to adjust their expectations without feeling like they have been entirely deprived of their rights. Admittedly, property owners might still experience some degree of surprise, such as when the legislative body first enacts an approach like the one this Article urges. Furthermore, the first property owners in a given geographic area to find themselves neighbors to a covenant violation under the new remedies approach would not have had sufficient time or previous examples based on which to adjust their expectations. But for subsequent neighbors, the blindsiding effect could be reduced.

But if damages are the answer, how will they be calculated? It is an important question because, as Stern and Lewinsohn-Zamir point out, the “coercion premium” means that property owners are unlikely to receive compensation that makes them whole, at least in the takings context.²⁶³ First, property owners, so the thinking goes, refuse to sell in a voluntary market transaction because they value their property above market value.²⁶⁴ Second, and relatedly, when the government resorts to a taking, the property owner suffers an additional psychic harm through the coercive practice.²⁶⁵ So, if just compensation is determined by a court’s estimation of fair market value, that compensation will systematically undercut what the property owner feels they lost.²⁶⁶ If all of that is true in the takings context, presumably it could hold true in the damages context too. In both cases, the property owner is given money in the face of their refusal to relinquish a property right.²⁶⁷

Still, the fact that someone subjectively would feel immense loss from having a property right transformed to a liability right does not by itself mean that their preference should determine proper compensation. If it did, then we could hardly hope to escape the holdout problem. A property owner who deeply cherishes the restrictive covenant could be entitled to such a substantial damages award for the loss of the covenant that developers would seldom find it profitable to build and risk the lawsuit. If all that mattered were property owner expectations, that might be fine. But if the public interest factors into the balance of the equities, we need another approach.

iv. Four Features of the Damages Action

To address this problem, I propose a compensation method with four key features.²⁶⁸ The first is that the property owner ought to be allowed to have a jury decide compensation if they so choose. Some jurisdictions already allow for this, even if they are not required to as a matter of constitutional law.²⁶⁹ A jury trial could in theory soften the effect of the coercion premium. Each member of the jury is someone who could own property or reside in a home that is part of a neighborhood with covenants.²⁷⁰ Because each of them could find themselves in the same situation, they could sympathize with the property owner’s subjective plight. Conversely, the jury could be a useful tool to moderate the intensity of the property owner’s preferences. Although the jury could have sympathy for the property owner, the fact that the jury is drawn from a “cross-section” of the community²⁷¹ means that it is less likely to be swayed by a property owner with an unusual attachment to the single-family covenant.²⁷²

The second and related feature of my compensation regime is that the property owner ought to be allowed to present evidence explaining why and to what degree they value their right to prevent the building of the structure at issue above market. In takings cases, most often the measure of just compensation is the market value of what was taken (that is, for the purposes of the issue at hand, the right to enforce the covenant in that particular instance), or the difference in fair market value between the plaintiff’s entire property before and after the taking.²⁷³ But of course in tort actions and elsewhere across the common law, other factors (such as pain and suffering and other psychic harms) can enter the equation to determine what would make an injured party whole.²⁷⁴

The third feature of my compensation regime is that the damages should be based on the harm from the construction of the precise structure or development the covenant-violating landowner builds, not from an extinguishing of the right to enforce the covenant against any other property bound by it.²⁷⁵ In other words, someone who wants to construct a medium or large apartment building may have to pay more in damages than a homeowner who wants to add an accessory dwelling unit behind their house, or convert their house into a duplex. If the central harm from the non-enforcement of a single-family covenant is more rooted in the change that could result from the specific covenant violation, as opposed to the weakening of the covenant in the abstract, then it makes sense for the damages due to reflect that.²⁷⁶

The fourth and final feature of the compensation regime is statutory ranges or caps on damages, if not inconsistent with state law where the regime would be implemented. Specifically, the legislative body should prescribe either a range or a maximum damages award per property owner that considers the nature of the new structure, the nature of the surrounding structures, and the distance from the new structure to the property of the aggrieved owner. In other words, the legislative body could, for example, provide a range of permissible damages awards for duplexes built in single-family neighborhoods, which would be slightly lower in value than the range of permissible awards for quadplexes in the same neighborhood, which would be lower than that for ten- to twenty-unit apartment buildings in the same neighborhood, and so on.²⁷⁷ Prescribing a range instead of specific values enables courts or juries to consider various other factors that might make the new structure more or less harmful to the aggrieved property owner in a given case. And prescribing a range instead of allowing any damage award whatsoever accomplishes at least two things: (1) it gives builders some amount of predictability and thus confidence about whether it will likely be financially feasible for them to move forward and build the non-conforming structure;²⁷⁸ and (2) it guards against the risk that some juries might offer astronomical awards even for minor departures from single-family use restrictions—disproportionately burdening builders of smaller structures who likely have less capital on hand. 
Many states have addressed whether statutory caps on noneconomic compensatory damages are constitutional as a matter of their state’s law.²⁷⁹ In most states such limitations would likely be constitutional,²⁸⁰ but in the minority of states where they would not be, this compensation regime would have to operate without express limits on the compensatory damages that a plaintiff property owner could obtain.²⁸¹

It would be difficult here to declare the specific dollar value ranges appropriate for all possible cases. A legislative body could benefit from thorough input on that matter from builders and property owners. But, importantly, whatever ranges the legislative body chooses must not allow for such significant damages awards that builders would rarely bother moving forward with relatively noninvasive multi-family projects for which there is market demand. To make it just a bit more concrete: in regions where demand for more housing is high but not astronomical, damages awards in the range of several hundred dollars for the nearest neighboring properties may be appropriate for converting one home to a small handful. The value could scale up for larger construction projects and scale down for neighbors located farther from the project. Perhaps three-digit damages would strike many people as surprisingly low (not to mention the two-digit damages potentially in play for more distant neighbors). But the key here is that the damages award would, again, be provided in response to one covenant violation, not for termination of the covenant altogether. The same or a similar amount in damages could be in play the next time around if another neighbor plans to develop housing in violation of the covenant too. And from the builder’s perspective, it is possible that they will have to pay every other property owner within the HOA in some form, either through damages or as part of the initial lump-sum payment. Thus, if individual neighbors are entitled to too high a value in compensation, rarely would builders go through any of this process at all (except in cases of extremely high potential profit from the project).

These four features of the process of calculating damages should go far towards ensuring that property owners’ psychic harm is taken seriously, but not given so much weight that gridlock can persist as it has. And, as the next Section explains in more detail, from a psychological standpoint, they give property owners a voice in the process—a crucial component to minimizing psychic harm throughout the procedural stages themselves.²⁸²

One thorny issue still remains: the proper timing for such a damages action. Many HOA CC&Rs specify time limits on how long construction projects may take; these are often one-year limitations.²⁸³ And HOAs may levy reasonable fines for failure to comply with such a deadline.²⁸⁴ That being so, an action for damages could ripen whenever covenant terms have been violated.²⁸⁵ For an express single-family covenant, that means the plaintiff can sue once a multi-family structure is built out to a habitable standard (that is, when the defendant no longer uses the property as a single-family residence as the covenant requires).²⁸⁶ For other restrictions that have the effect of only allowing for one single-family unit, such as minimum-square-footage requirements, an action could ripen as soon as the new construction has caused a violation. The HOA’s time limits on construction can serve as separate restrictions for which fines could accrue to the HOA daily after the deadline passes. This acts as an incentive for builders not to unduly delay and thus force aggrieved neighbors to delay their compensation. Yet, it is possible that HOAs could limit construction times so stringently that few multi-family projects could go forward.²⁸⁷ To dodge this counterpunch, localities or states may need additional legislative provisions directly targeted at allowing reasonable construction times.

3. Why an Opt-In Mechanism?

There is, however, one more psychological observation that bears on the question of the proper mechanism for escaping a single-family covenant, for which a damages approach alone might not meaningfully account. It is the fact that even property owners who lose an entitlement tend to feel that they have not lost as much if they (1) were given voice in the process,²⁸⁸ and (2) were not singled out for negative treatment.²⁸⁹ To account for the coercion harm and singling-out harm, I suggest a new legal mechanism that would chronologically precede the damages schema described above.

Psychologists have conducted specific research on the psychological effects of takings, and of procedural legal processes more generally. They have found that people often care as much about being treated with dignity and respect during legal processes as they care about the end result.²⁹⁰ Likewise, people are averse to legal processes that single them out for unfavorable treatment, even if it is in the name of the public interest.²⁹¹ That means that people are likely to suffer psychic harm if they are coerced into a situation in which they are disadvantaged for the sake of some public good but others against whom they compare themselves are not.²⁹²

Property owners who expect to be able to limit fellow HOA members to single-family use, but then can only collect market price for the right instead, might feel that their particular interests were steamrolled on behalf of some public good. That does not mean the damages approach is inappropriate, but it does raise the question of whether that particular form of psychic harm could be minimized along the way.

My proposal is one that gives landowners voice, but not veto. The legal mechanism is created by statute and triggered when an owner of a property within an HOA seeks to build in violation of a single-family covenant, and that area is not subject to single-family zoning. The builder can initiate a decision by each member of the HOA. The builder offers a single lump-sum value in exchange for the right to violate the covenant. Each member decides whether to opt in to receive a portion of the sum. The sum is divided among all HOA members who opt in, and in exchange, those members and the HOA itself relinquish the right to challenge the building. For those who do not opt in, the damages approach will apply.²⁹³ If they object to the construction, they can roll the dice and collect damages in court.

This legal mechanism draws inspiration from land assembly districts (“LADs”), though it has some key differences. LADs seek to break another form of property gridlock.²⁹⁴ Sometimes a large development that would span several individual land parcels would generate more economic value or public benefit than the sum of all the individual parcels under their current use.²⁹⁵ But because each individual parcel owner can refuse to sell, a single holdout can sink a socially beneficial development. LADs tackle this problem.²⁹⁶ When a state or locality authorizes an LAD, it gives the neighborhood the power to negotiate a sale of all the land within it—either by majority or supermajority vote, or by the appointment of a board to negotiate the sale.²⁹⁷ The dissenting property owners can opt out, but then they would still lose their land to the project by eminent domain (and in exchange, receive just compensation from the government).²⁹⁸ All other properties effectively receive a pro rata share of the overall sale price.²⁹⁹

My proposal is similar to the LAD mechanism in two important ways. One, it allows for a lower-transaction-cost method of negotiating a selling price than bargaining with each individual entitlement holder. Two, it provides an incentive against holdouts,³⁰⁰ and ultimately can move forward whether there are individual holdouts or not. My proposal differs from the LAD mechanism because whereas with an LAD the owners have the ultimate say by majority vote over whether the land collection is sold at all,³⁰¹ in my proposal for restrictive covenants the development may move forward either way.³⁰² But the property owners can choose whether they prefer compensation through the slice of the developer’s up-front offer or whether they would rather leave their compensation to litigation and jury determination within the predefined damages range.

A psychological benefit of this approach is that it gives the property owners a sense of ownership over whatever result they reach. Those who opt in to the developer’s offer choose to do so and thus escape any acute harm from direct coercion. But even those who do not opt in could to an extent feel that their interests mattered. True, they were not able to stop the development simply by speaking up. But when they ultimately are only entitled to damages in court, that result is one they had some choice in bringing about; they are in the same boat as all other HOA members in that respect.

Of course, this procedural mechanism will interplay with the damages backdrop. If the builder has reason to believe that juries would give more favorable awards to neighboring property owners and strongly compensate for any unquantifiable psychic harm (that is, select higher values within the predefined damages ranges), then the builder might want to avoid such damages actions. They therefore would have an incentive to offer more in the initial lump sum to try to persuade owners to opt in and thus relinquish their rights to a later suit. Conversely, if the builder offers too much in the lump sum, but the neighborhood contains many holdouts, then the builder might end up substantially paying in the lump sum and still face the prospect of numerous damages suits, after which it would pay out again and again. And from the neighboring property owners’ perspectives, the lower the value a jury is likely to award within the damages range, the more likely the neighbors will be encouraged to accept a lump sum monetary offer by the builder up front. But the crucial point is that the two-stage solution, while guaranteeing that a builder can develop multi-family housing if it is determined to do so, also provides neighboring landowners both a degree of ownership over how their interests are credited and a mechanism for targeted compensation. This approach avoids holdouts and loosens housing gridlock in a way that takes property interests and people’s psychological attachments to them seriously.³⁰³

CONCLUSION

The housing crisis is complex. It is partly a legal and financial problem and, as this Article especially emphasizes, partly a psychological one. Even if homeowners have the economic incentive to build or allow for multi-family housing, and even if localities and states lift the “zoning straitjacket,”³⁰⁴ housing supply still could lag. Private restrictive covenants enshrining single-family housing curb housing production in many localities, potentially distressing millions of Americans and disproportionately harming those living in poverty. Psychological phenomena concerning how people develop attachments and perceive change conspire with powerful legal tools to hold these restrictions in place. The solution to this gridlock thus must account for those conspiring factors. This Article charts a possible course: an initial lump-sum offer from a builder with an opt-in mechanism for each HOA member, followed by a carefully prescribed damages action available to all remaining members. If we care deeply about both property owner expectations and providing an adequate supply of housing, this approach shows a way forward.

97 S. Cal. L. Rev. 1233

Brian M. Miller*

* Assistant Professor of Law, Duquesne University Kline School of Law. I would like to thank Molly Brady, Charles Barzun, John Infranca, Peter Danchin, Michele Gilman, Will Moon, Anne-Marie Carstens, Aadhithi Padmanabhan, Alexi Pfeffer-Gillette, Chris Bryant, Matthew Sipe and Chelsea Banister for insightful comments. Thanks also to all the participants in the Maryland Law/Baltimore Law Junior Faculty Workshop for helpful feedback on an earlier draft. And thanks to the participants in the Richmond Junior Faculty Forum for helpful comments as well. I am grateful to Marc LeVan for valuable research assistance and to Tanya Thomas for research assistance at the earliest stages. Finally, thanks to the exceptional editors of the Southern California Law Review. Errors are mine.

Major Questions, Common Sense?

September 2024

Kevin Tobia,* Daniel E. Walters† & Brian Slocum‡

The Major Questions Doctrine (“MQD”) is the newest textualist interpretive canon, and it has driven consequential Supreme Court decisions concerning issues from vaccine mandates to environmental regulation. Yet, the new MQD is a canon in search of legitimization. Critics allege that the MQD displaces the Court’s conventional textual analysis with judicial policymaking. Textualists have now responded that the MQD is a linguistic canon, consistent with textualism. Justice Barrett recently argued in Biden v. Nebraska that the MQD is grounded in ordinary people’s understanding of language and law, and scholarship contends that the MQD reflects ordinary people’s understanding of textual clarity in “high-stakes” situations. Both linguistic arguments rely centrally on “common-sense” examples from everyday situations.

This Article tests whether these examples really are common sense to ordinary Americans. We present empirical studies of the examples offered by advocates of the MQD, and the results challenge the arguments that the MQD is a linguistic canon. Moreover, the interpretive arguments offered to legitimize the MQD as a linguistic canon threaten both textualism and the Supreme Court’s growing anti-administrative project.

INTRODUCTION

The Supreme Court’s most consequential interpretive canon is a new one: the major questions doctrine (“MQD”). The basic idea is as follows: when an agency undertakes a “major” policy action, the statutory authorization must be clear and specific (rather than unclear or general).¹ In several high-profile cases, the Court has used the MQD to strike down agency actions involving vaccine mandates,² environmental regulation,³ and student loan relief.⁴ Given this track record, no wonder critics have argued that the MQD poses an existential threat to the administrative state, since few statutes are likely to provide the requisite clear language, and what constitutes “majorness” is subjective and potentially applicable to a wide range of agency actions.⁵

Despite its undeniable influence, the MQD is undertheorized, and it remains a canon in search of a justification.⁶ Scholars and judges have splintered in their understanding of how the doctrine operates on statutory language.⁷ For instance, one advocate of the canon describes it as a requirement for a “clear and specific statement from Congress if Congress intends to delegate questions of major political or economic significance to agencies.”⁸ Two critics of the MQD have described it similarly as a rule requiring courts “not to discern the plain meaning of a statute using the normal tools of statutory interpretation, but to require explicit and specific congressional authorization for certain [major] agency policies.”⁹ In response, Justice Barrett in Biden v. Nebraska has denied that the MQD requires courts “to depart from the best interpretation of the text,” and claims that the canon is not a clear statement rule and does not require explicit congressional authorization of the “precise agency action under review.”¹⁰ These kinds of disagreements, while perhaps technical, influence how the doctrine is defended and employed, and even implicate its future as an interpretive canon.

So far, efforts to legitimize the doctrine have been unpersuasive. The canon is used primarily by self-identified textualists,¹¹ but critics (textualist and non-textualist alike) have alleged that the MQD is inconsistent with textualism, or even is anti-textualist, because it displaces the ordinary meaning of statutory text in the name of normative values.¹² In fact, the MQD’s rise coincides with a surge of skepticism among textualists and commentators about the validity of substantive canons generally.¹³ The Court’s use of the MQD even prompted Justice Kagan to retract her quip that “we’re all textualists now.”¹⁴ She now notes: “It seems I was wrong. The current Court is textualist only when being so suits it.”¹⁵ These critiques allege that the MQD inappropriately licenses textualists to depart from the best reading of statutory text in the name of values or norms. An ideal response for a textualist favoring the MQD would be some account of how the MQD determines the linguistic meaning of a statute.

Increasingly, textualists are making precisely this “linguistic” move. Some textualists now propose that the MQD is a linguistic interpretive canon, consistent with textualism.¹⁶ On this account, textualists remain committed to the ordinary reader’s understanding of language, with the MQD simply reflecting how ordinary people, exercising basic “common sense,” generally understand the meaning of statutes delegating authority to agencies.¹⁷ On this “linguistic” picture, normative or substantive values are not relevant to the canon or its application, and they certainly do not lead textualists to depart from the best reading of the text. Instead, the MQD is just like any other linguistic canon—it reflects only a generalization about how ordinary people use and understand language in context.¹⁸ This rebranding of the MQD as a linguistic canon has rapidly moved from the pages of law reviews¹⁹ to the Supreme Court.²⁰ There, Justice Barrett recently denied that the MQD is normatively driven and instead argued that it merely reflects ordinary people’s “common-sense” understanding of instructions, including those given by Congress.²¹

In this Article, we evaluate the MQD’s “linguistic turn” and subject its premises to empirical study. We study two key issues: (1) Does the MQD follow from ordinary people’s understanding of language and, more specifically, delegating instructions?; and (2) Do ordinary people interpret more cautiously or narrowly in “high-stakes” situations? The empirical results support answering “no” to both questions. Contrary to the MQD proponents’ contentions, the results indicate that ordinary people do not adjust their judgments of textual clarity according to the stakes of interpretation, and they interpret broad delegations broadly, even in situations in which Justice Barrett claims that “common sense” would dictate narrower interpretations of the scope of authorization.²²

Part I introduces the MQD and the two linguistic arguments that have been offered in defense of the canon. After briefly addressing the defense of the MQD as a substantive canon in Section I.A, we turn in Section I.B to the proposal that ordinary interpretation shifts in “high-stakes” contexts, and that this behavior justifies the MQD as a linguistic canon.²³ The high-stakes argument appeals to an example from analytic philosophy²⁴ and prior legal scholarship²⁵ that suggests that high-stakes contexts diminish ordinary knowledge. Thus, as a famous hypothetical illustrates, you might know that the town bank is open on the weekend when planning to deposit a small check with low stakes. In contrast, in a higher-stakes context (for example, if the check is for ten thousand dollars and must be deposited before Monday to avoid an overdraft), you may decide instead that you do not really know that the bank is open. Legal scholarship proposes that this is how ordinary people understand knowledge: ordinary knowledge is stakes sensitive.²⁶ More importantly for the MQD, an emerging argument builds on this premise to suggest that ordinary understanding of textual clarity is also stakes driven: in high-stakes contexts, a text is less clear.²⁷ As such, in those high-stakes (or “major”) cases, courts should require highly specific language to authorize agency action.

Section I.C introduces Justice Barrett’s separate proposal that ordinary language is context sensitive and anti-literal, and therefore a textualist faithful to the ordinary reader should adopt the MQD as a means to determine the best reading of statutory language.²⁸ Justice Barrett’s argument also appeals to an intuitive example: instructing a babysitter to “have fun with the kids” while handing him a credit card might literally permit the babysitter to take them on an overnight trip to an out-of-town amusement park (after all, doing so would be “fun”). But in context, ordinary people employ “common sense” and understand the literal meaning of the instruction to only permit the most reasonable set of applications of the instruction.²⁹ Ordinary people are therefore non-literalists, understanding general delegations to be more limited in meaning than their terms alone might suggest. As such, the argument goes, the MQD is “consistent with how we communicate conversationally,” making it a valid linguistic canon that reflects an interpretive commitment to ordinary people.³⁰

Justice Barrett’s argument is important and places her as a leader among the Court’s textualists; she is the only textualist advocate of the MQD who has offered a proposal to square the MQD with textualism. At the same time, the linguistic argument in her brief concurring opinion is not entirely clear. As such, we attempt to charitably reconstruct Justice Barrett’s defense as a workable argument—that is, one that derives the MQD conclusion from the babysitter hypothetical premise.

Part I contributes to the literature by explaining these two new arguments for the linguistic MQD in sufficient detail. Unpacking the arguments clarifies each argument’s theoretical challenges and empirical claims. Both arguments employ hypotheticals about how ordinary people interpret language but, significantly, support these hypotheticals with references to academic philosophy or judicial intuition; neither uses empirical evidence.

Parts II and III investigate these empirical claims, both by engaging with the existing empirical literature on high-stakes knowledge (much of it uncited by proponents of the linguistic MQD) and by conducting original survey experiments of both high-stakes interpretation and how ordinary people interpret instructions. Part II considers the claim that ordinary knowledge is stakes sensitive. This claim has been influential in philosophy,³¹ legal scholarship,³² and now the major questions debate.³³ Although philosophers claim knowledge is stakes sensitive, many existing studies report that stakes have little or even no effect on ordinary attributions of knowledge.³⁴ And, to our knowledge, no empirical study bears on the question of whether higher stakes reduce textual clarity (a related but different issue). The critical link in one version of the linguistic MQD argument is therefore entirely untested.

Part III presents studies designed to test the empirical claims of the linguistic MQD arguments. Our studies use the exact two cases offered by proponents of the linguistic MQD—the “bank case” and the “babysitter hypothetical”—to conduct original survey experiments. Overwhelmingly, ordinary people in our studies did not interpret these scenarios consistently with the empirical premises of the linguistic MQD arguments.

Part IV develops three sets of implications that follow from our empirical evidence and the textualist efforts to legitimize the MQD as a linguistic canon. These implications concern the empirical evidence for the linguistic MQD (IV.A), challenges that the linguistic MQD poses for textualism (IV.B), and the relationship between empirical evidence of how ordinary people view delegations and administrative law, including intriguing evidence that people are more concerned about underenforcement of instructions than about overenforcement (IV.C).

In brief, the extant and new empirical findings do not support the linguistic MQD. Specifically, the findings count against the predictions of the two leading linguistic MQD arguments, using the exact cases offered in defense of the linguistic MQD. Of course, we are open to the possibility that study of further examples could weigh against our conclusions. But for interpreters deciding today whether to employ a “linguistic MQD,” there is insufficient empirical support and theoretical clarity to cast the MQD as a valid linguistic canon. Moreover, the results provide stronger support for a new counter-MQD: ordinary people understand general authorizing language as consistent with a broad range of reasonable actions that fall under the text’s meaning. Textualists committed to the “ordinary reader” and “interpretation from the outside” claim to follow those commitments to where they lead—and the current evidence favors an interpretive rule far from the current MQD.³⁵

I. THE MAJOR QUESTIONS DOCTRINE AND THEORIES OF ITS LEGITIMACY

The MQD has sparked a great deal of scholarly effort to specify exactly what the doctrine is and how it fits into traditional categories of interpretive doctrine. In this Part, we survey these efforts, many of which conclude that the MQD is a substantive, or normative, canon.³⁶ These classifications matter because substantive canons are increasingly questioned as being inconsistent with textualism.³⁷ Classifying the MQD as substantive (rather than linguistic) is tantamount to saying it is illegitimate or tenuous, at least on textualist grounds.³⁸ Perhaps not surprisingly, some textualist defenders of the MQD have not fully endorsed the idea that the MQD is a substantive canon.³⁹ In fact, as we discuss below, perhaps the most serious attempt to ground the MQD in interpretive law asserts that the doctrine is instead a linguistic, or semantic, canon.⁴⁰ In theory, at least, this move would legitimize the canon for textualists and everyone else because the doctrine would simply be folded into the relatively uncontroversial search for the ordinary meaning of delegating statutes.⁴¹

This pivot to a linguistic defense raises many questions, very few of which have been answered. After describing how the linguistic defense works, we then highlight theoretical limitations, open questions, and the broader implications of defending the MQD as a linguistic canon.

A. The Canonization of the Major Questions Doctrine

1. Historical Threads of the Major Questions Doctrine

The MQD is not entirely new; it is in the process of “metamorphosis.”⁴² Arguably, the first appearance of something like the MQD was in the plurality opinion in a 1980 case known as the Benzene Case.⁴³ In that case, the Occupational Safety and Health Administration (“OSHA”) was charged with promulgating standards that “most adequately assure[], to the extent feasible, on the basis of the best available evidence, that no employee will suffer material impairment of health or functional capacity even if such employee has regular exposure to the hazard dealt with by such standard for the period of his working life.”⁴⁴ Rather than follow OSHA’s argument that the statute, fairly read, seemed to require it to “impose standards that either guarantee workplaces that are free from any risk of material health impairment, however small, or that come as close as possible to doing so without ruining entire industries,” the plurality opinion held that OSHA had only been delegated authority to regulate “significant” risks.⁴⁵

As Cass Sunstein notes, although the Court invoked the nondelegation doctrine and constitutional avoidance to arrive at this statutory interpretation, it is impossible to square what the Court did with the “(standard) nondelegation doctrine.”⁴⁶ The interpretation offered by OSHA, in addition to doing little violence to the text of the statute, would “sharply cabin” the agency’s discretion.⁴⁷ Sunstein suggests that the plurality opinion in the Benzene Case instead endorsed the novel idea that “without a clear statement from Congress, the Court will not authorize the agency to exercise that degree of (draconian) authority over the private sector.”⁴⁸

It was hardly clear at the time, however, that the Court was creating something called the “major questions doctrine”; in fact, that would not become clear until very recently. Instead, for several decades, the Court intermittently invoked similar, but often distinct, reasoning from the Benzene Case in regulatory cases involving “extraordinary” circumstances, all while leaving the precise theory behind the reasoning unstated. Paradigmatic of these invocations is FDA v. Brown & Williamson Tobacco Corp.⁴⁹ In that case, the Food and Drug Administration (“FDA”) promulgated a rule regulating tobacco products as “drugs” under the Food, Drug, and Cosmetic Act. The Court applied the familiar Chevron two-step analysis and concluded, on the basis of an examination of legislative history, that Congress had unambiguously declined to give the FDA this power.⁵⁰ The Court added another reason for its conclusion, though, stating that “[i]n extraordinary cases . . . there may be reason to hesitate before concluding that Congress has intended . . . an implicit delegation.”⁵¹

As the “implicit delegation” phrase reveals, the Court explicitly couched its consideration of the “majorness” or “extraordinariness” of the power asserted by the FDA as part of the Chevron analysis. Thus, the MQD acted as a “carve-out” or “exception” to the ordinary rule that statutory ambiguities constitute implicit delegations that an agency is given primacy over courts to resolve, so long as it does so reasonably.⁵² Instead, when “extraordinary” questions are presented by the agency’s claim of delegated authority, the Court itself resolves the ambiguity at Chevron step one.⁵³

The Brown & Williamson opinion’s use of proto-MQD logic departed from the apparent logic of the Benzene Case in an important way. The Benzene Case left little room for an agency interpretation to survive once the doctrine was triggered. The only way to prevail was to point to clear statutory authorization that could not be limited by the Court to avoid the major implications of the agency’s interpretation. Sunstein calls this the “strong version” of the MQD.⁵⁴ By contrast, in Brown & Williamson, Sunstein sees a “weak version” that theoretically allowed an agency’s major action so long as the statutory interpretation could be endorsed by a Court engaged in independent (de novo) review without according the agency any deference.⁵⁵

As a practical matter, the weak version of the MQD seemed to win out for a while after Brown & Williamson, and on at least one occasion, an agency did win in a major questions case. In King v. Burwell, the Internal Revenue Service (“IRS”) interpreted the Affordable Care Act to make tax credits available even if an individual purchased health insurance on a federal insurance exchange, despite statutory language that limited tax credits to plans purchased through “an Exchange established by the State.”⁵⁶ As in Brown & Williamson, the Court noted that there “may be reason to hesitate before concluding that Congress has intended such an implicit delegation.”⁵⁷ Unlike in Brown & Williamson, however, the Court concluded that the agency had the power to issue the rule, even on a de novo interpretation of the statute. Although the Court’s interpretation of the statutory language at issue has been criticized,⁵⁸ the important point is that the “weak version” of the MQD—that is, an “exception,” or “carve-out” from Chevron deference—seemed to rule the day. The only open questions were about where, precisely, to locate the major questions exception: at Chevron step zero,⁵⁹ step one,⁶⁰ or step two.⁶¹

2. The Modern Major Questions Doctrine and Its Justification

Enter what Mila Sohoni calls the “major questions quartet.”⁶² If it was unclear exactly which version of the MQD existed before the quartet, the waters have become only murkier afterward. One thing is unmistakably clear though: The Court did not treat the MQD as a mere exception or carve-out from Chevron deference. Instead, it “unhitched the major questions exception from Chevron.”⁶³ In fact, the majority opinion in West Virginia v. EPA,⁶⁴ the leading case in the quartet, did not even mention Chevron in its elaboration or application of the MQD.⁶⁵ Instead, the Court offered an almost entirely new gloss on the doctrine:

“[I]n certain extraordinary cases, both separation of powers principles and a practical understanding of legislative intent make us ‘reluctant to read into ambiguous statutory text’ the delegation claimed to be lurking there. To convince us otherwise, something more than a merely plausible textual basis for the agency action is necessary. The agency instead must point to ‘clear congressional authorization’ for the power it claims.”⁶⁶

For the vast majority of commentators, these words have been taken to suggest that the current Court, post-quartet, thinks of the MQD as a particularly powerful form of substantive canon: a clear statement rule.⁶⁷ On this reading—which seems similar to the implicit use of the doctrine in the Benzene Case—Congress must have spoken with unmistakable clarity in order for agencies to have the “major” power they are claiming to have been delegated. If there is any ambiguity, and even if the agency has a “plausible” basis for concluding that it has the authority under applicable statutes, the agency cannot exercise that power. Some are not convinced the MQD is a clear statement rule and view it as a weaker substantive canon that resolves ambiguity.⁶⁸ Accordingly, when the MQD is applicable, any statutory ambiguities should be resolved against the agency’s assertion of power so as to vindicate “separation of powers principles.”⁶⁹ In any event, a common understanding is that the MQD is driven by a normative commitment to a limited role for administrative agencies in the legal system, and perhaps by a “delegation doctrine” that insists that agencies have no power unless it is affirmatively shown that Congress has granted it to them.⁷⁰

The MQD is inherently controversial as a substantive canon regardless of whether it is a clear statement rule or a tiebreaker canon. Simply by virtue of being a substantive canon, the “new MQD” is in tension with textualism. As Justice Kagan, a self-avowed textualist, puts it, there is some momentum for “toss[ing] [substantive canons] all out.”⁷¹ As she noted in her West Virginia dissent, channeling Karl Llewellyn, “special canons like the ‘major questions doctrine’” function as “get-out-of-text-free cards.”⁷² Recently, Benjamin Eidelson and Matthew Stephenson have exhaustively assessed “leading efforts to square modern textualist theory with substantive canons” and ultimately concluded that “substantive canons are generally just as incompatible with textualists’ jurisprudential commitments as they first appear.”⁷³ The MQD, insofar as it is a substantive canon, would not be spared.⁷⁴

Beyond these generalized concerns with substantive canons, some commentators have questioned whether the MQD satisfies basic expectations about the Court’s recognition and use of substantive canons, even assuming that they can sometimes be legitimate aids to interpretation. Simply put, the Court has not been at all clear about the source of the normative foundation of the MQD.⁷⁵ For Sohoni, formulating the MQD as a kind of constitutional avoidance rule fails because of the “Court’s failure to say anything about nondelegation”—a failure that “creates genuine conceptual uncertainty about what exactly it was doing in these cases.”⁷⁶ The currently prevailing nondelegation test asks merely whether Congress has provided a “reasonably intelligible policy” to guide an agency’s exercise of discretion.⁷⁷ That test would not have provided anywhere close to a “significant risk” of constitutional invalidity in any of the statutes examined in the major questions quartet.⁷⁸ Although Justice Gorsuch in his concurrence in West Virginia v. EPA suggested that the MQD is inspired by the nondelegation doctrine (and probably his preferred version of the nondelegation doctrine, which is not the law currently), the majority pointed more generally to “separation of powers principles.”⁷⁹ Some have inferred that the Supreme Court might be interested in developing constitutional principles demanding affirmative proof of delegation in certain circumstances—and that the MQD reflects this implicit constitutional project⁸⁰—but if so, the Court has not been explicit.

This uncertainty about the connection between constitutional principles and the MQD also seems to doom the MQD under Justice Barrett’s own test for the legitimacy of substantive canons within textualism, under which there must be a reasonably specific constitutional principle to which a constitutionally inspired substantive canon attaches.⁸¹ In other words, if the MQD is a substantive canon, its substance, or normative content, is not clear. Most substantive canons either reflect a broad societal consensus or are tied closely to constitutional law. The MQD at first glance has neither of these attributes.

3. The Modern Major Questions Doctrine’s Linguistic Turn

Perhaps not surprisingly, given the strong pushback that the MQD has received when it is formulated as a substantive canon, defenders of the MQD are increasingly suggesting that the MQD is not a substantive canon at all. Instead, proponents suggest it is a linguistic canon.

This rebranding is not as far-fetched as it might seem at first. “ ‘[L]inguistic’ validity and ‘substantive’ value are properties of canons.”⁸² The standard dichotomy between “linguistic” and “substantive” canons suggests that a canon has at most one property, but it is conceptually possible for a canon to have both.⁸³ There is evidence that some canons that have long been treated as “substantive canons”—such as anti-retroactivity and anti-extraterritoriality—are also consistent with how ordinary people understand rules. For example, when a rule (especially a punitive rule) does not explicitly state whether it applies retroactively, prospectively, or both, people tend to understand it to apply only prospectively.⁸⁴ Insofar as textualism is guided by ordinary understanding of language,⁸⁵ textualists have good reason to consider such “substantive” canons as simultaneously linguistic ones. Even some tough critics of substantive canons like Eidelson and Stephenson show some openness to these arguments: “[T]he textualist’s reasonable reader . . . opens the door to recasting some seemingly substantive canons as simply default inferences that a reasonable reader would draw . . . . The presumption against extraterritoriality is a possible example.”⁸⁶

Could a similar linguistic argument support the MQD? Acknowledging that criticisms of the MQD as a substantive canon “are, to some if not a large extent, warranted,”⁸⁷ Professor Ilan Wurman recently rebranded the MQD as a linguistic canon.⁸⁸ Wurman argues that the MQD could be understood as motivated by a theory of linguistic usage about how interpretive uncertainty should be resolved rather than as importation of substantive or normative values into the interpretive enterprise. He appeals to prior work in philosophy and legal philosophy, which argues that “high-stakes” contexts lead to less knowledge or legal clarity.⁸⁹

Even more recently, Justice Barrett has proposed her own, separate linguistic argument for the MQD’s legitimacy. The Supreme Court has made the major questions quartet a quintet with its decision in Biden v. Nebraska. That case concerned President Biden’s 2022 proposal to forgive $10,000 to $20,000 in student loans for low- to middle-income borrowers. Biden’s Department of Education traced the authority for its emergency loan relief to the HEROES Act, a 2003 law that grants the U.S. Secretary of Education the ability to “waive or modify” provisions related to federal student loans “in connection with a war or other military operation or national emergency.”⁹⁰ After Biden announced his administration’s loan forgiveness program as a response to the COVID-19 national emergency, several states challenged the program. That case reached the Supreme Court and divided the Justices 6–3 along conservative-liberal lines. Chief Justice Roberts’s majority opinion proceeded with traditional textual interpretation, concluding that the government’s student loan relief is not within the statutory meaning of “waive or modify” any provision. But the opinion also referenced the major questions doctrine as an alternative ground for the holding.

Justice Barrett wrote separately to argue that the MQD is not a substantive canon but rather “a tool for discerning—not departing from—the text’s most natural interpretation.”⁹¹ Candidly, and consistently with her prior writings on substantive canons,⁹² Justice Barrett conceded that the substantive canon version of the MQD might be “inconsistent with textualism” and therefore “should give a textualist pause.”⁹³ By grounding the MQD in how ordinary readers apply common sense in reading statutory text, Justice Barrett aims to put the MQD on more solid footing, particularly for textualists.

After the opinion, some suggested that Justice Barrett’s argument “mirrors” Wurman’s.⁹⁴ We disagree: the two arguments both present the MQD as a linguistic canon, but the arguments are distinct. Wurman appeals to high-stakes context and the resolution of interpretive uncertainty, while Barrett appeals to anti-literalism and contextual restriction concerning major actions (with nothing about high stakes). Thus, Wurman’s argument centers on “ambiguity” caused by high stakes, whereas Justice Barrett’s theory is about how ordinary people generally use “common sense” to interpret non-literally (with no mention of “ambiguity”). The next two Sections separately reconstruct Wurman’s (I.B) and Justice Barrett’s (I.C) linguistic arguments in detail and present some theoretical challenges for each.

B. The Major Questions Doctrine as a High-Stakes Linguistic Canon

One important line of work defending the “linguistic” MQD appeals to the philosophical and legal-philosophical literature on stakes and knowledge.⁹⁵ That theoretical literature proposes that knowledge is sensitive to high stakes: it could be true that one knows a proposition in a low-stakes context (for example, the bank is open) but does not know that proposition, given the same evidence, in a high-stakes context.

The legal literature about stakes and interpretation, including the linguistic MQD defense, takes this claim about knowledge to be important. But the relationship between knowledge and legal interpretation is not entirely clear. Roughly, the argument goes as follows: we are less likely to know a proposition when the practical stakes of its truth are raised, and similarly, we are less likely to assess that a text is clear when the practical stakes of its meaning are raised.⁹⁶

The linguistic defense of the MQD is clearly based in part on this philosophical literature about stakes and knowledge. Before interrogating the full argument, however, we must spell it out. Here we attempt to reconstruct the defense.

1. Reconstruction of the “High Stakes” Linguistic Defense of the Major Questions Doctrine

(1) [Empirical Premise 1: Stakes-Sensitive Knowledge]: The ordinary reader’s knowledge is sensitive to high stakes.⁹⁷

(2) [Empirical Premise 2: Stakes-Sensitive Clarity]: The ordinary reader’s understanding of textual clarity is sensitive to high stakes.⁹⁸

(3) [Definition: MQD Case]: In a MQD case, the agency’s statutory powers are defined in linguistic terms that are semantically clear but highly general. The agency is exercising “vast powers” of great economic/political significance and pointing to the statutory language as authorization.⁹⁹

(4) [Premise]: MQD cases involve a high-stakes context.¹⁰⁰

(5) [Textualist Premise]: Judges should interpret statutory language from the perspective of the ordinary reader.

(6) [Minor Conclusion, from 1, 2, 3, 4, 5]: In a MQD case, the text is unclear.

(7) [Premise]: If a text is unclear with respect to authorizing an agency’s action, it does not authorize that action.

(8) [Major Conclusion, from 6, 7]: In a MQD case, the agency’s action is not authorized.

Attempting to construct the argument fully and precisely reveals several interesting features and questions. First, consider the two “Empirical Premises” (1 and 2). It is unclear exactly what function the first Empirical Premise (about knowledge) serves. It is included in the argument above because it features repeatedly and centrally in Wurman’s (and Doerfler’s) scholarship on high stakes, but even if that Premise were false, Premise 2 alone could support the argument.

Why, then, does the “high-stakes” literature emphasize knowledge in addition to textual clarity? Perhaps because there is little data bearing on the truth of Premise 2, but there is a rich, decades-old philosophical literature that seemingly supports Premise 1.¹⁰¹ As such, we understand the legal literature to be using Premise 1 as support for Premise 2: philosophers have concluded that knowledge is stakes sensitive, and this conclusion supports also concluding that textual clarity is stakes sensitive.

In Part III, we investigate the stakes-knowledge-clarity relationship empirically, but here we note some initial skepticism about the inference from knowledge to clarity. Law includes technical language,¹⁰² and as such, many ordinary people do not have direct knowledge of a law’s meaning. Nevertheless, this does not imply that a particular law is unclear, in the sense of being unclear to a legal expert or inherently indeterminate. Recent empirical work supports this point: ordinary readers understand law to include technical legal meanings, and they defer to legal experts to elaborate those meanings.¹⁰³ The mere fact that laypeople do not know the meaning of a law without further inquiry or assistance strikes us as an implausible basis for judges to treat the law as ambiguous or unclear.

Moreover, the “Minor Conclusion” (6) follows only on a very strong interpretation of the phrase “sensitive to high stakes” in Premises (1) and (2). To conclude that “general” statutory language is unclear because of ordinary sensitivity to a high-stakes context, one must interpret (2) to mean that a high-stakes context eliminates clarity.
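
The dependence on the strong reading can be made explicit in a short formal sketch. The following Lean fragment is only an illustrative simplification of the reconstruction above, not a formalization offered by any proponent of the linguistic MQD. The proposition names (HighStakes, Unclear, Authorized) and the hypothesis labels are our own hypothetical shorthand, and the Textualist Premise (5) is assumed in the background rather than modeled. The point is that the Major Conclusion is derivable once Premise 2 is read strongly, as the conditional “high stakes eliminate clarity.”

```lean
-- Hypothetical propositional shorthand for a single MQD case (names are assumptions).
variable (HighStakes Unclear Authorized : Prop)

-- p4       : Premise (4), the case involves a high-stakes context.
-- p2Strong : the strong reading of Premise (2), high stakes eliminate clarity.
-- p7       : Premise (7), an unclear text does not authorize the action.
theorem major_conclusion
    (p4 : HighStakes)
    (p2Strong : HighStakes → Unclear)
    (p7 : Unclear → ¬ Authorized) :
    ¬ Authorized :=
  p7 (p2Strong p4)
```

On a weaker reading of Premise 2, on which high stakes merely reduce clarity to some degree, the hypothesis p2Strong is unavailable, and nothing in the remaining premises supplies the step from a high-stakes context to an unclear text.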

Wurman describes the MQD as limited to “resolving statutory ambiguities.”¹⁰⁴ This is a common way to describe a “tiebreaker” canon. We ultimately find this confusing insofar as Wurman also presents the MQD as a linguistic canon, a rule of thumb that is evidence of linguistic meaning. If “ambiguity” refers to linguistic ambiguity, an applicable “linguistic” canon would render the statute non-ambiguous. For example, in Lockhart the Court faced a linguistic ambiguity.¹⁰⁵ Lockhart was convicted under 18 U.S.C. § 2252(a) and faced a mandatory minimum due to an earlier conviction. The penalty increased if the defendant had a prior conviction “under the laws of any State relating to aggravated sexual abuse, sexual abuse, or abusive sexual conduct involving a minor or ward.”¹⁰⁶ That final modifier (involving a minor or ward) could modify all three noun phrases (aggravated sexual abuse, sexual abuse, and abusive sexual conduct) or just the last (abusive sexual conduct). The series qualifier canon instructs us to apply the modifier to all three noun phrases. The determination that the series qualifier canon applies qua linguistic canon is a decision that the linguistic meaning of the provision is determinate and has a specific meaning, not that it is ambiguous. If ambiguity persists—for example, if there is a competing linguistic canon that counsels in favor of the opposite interpretation—the Court might resolve ambiguity with some non-linguistic consideration, such as the rule of lenity.

Alternatively, perhaps the argument is that the MQD is “linguistic” in the sense that it represents how ordinary people believe that ambiguity should be resolved, and thus how ordinary people would choose to resolve disputes in MQD cases. But that would be an unusual sense of “linguistic.” Existing linguistic canons help determine the linguistic meaning of a provision; they do not enter the interpretive process after that meaning has been concluded to be indeterminate.

This might all seem pedantic, but it highlights a problem with this linguistic defense of the MQD. We have done our best to explain the argument in a clear form, but we are unsure that there is even a workable argument for the “high stakes” linguistic MQD that arrives at the Major Conclusion (8).

Beyond this general issue (that the logic of the argument itself is unclear), several of the premises are open to debate. For example, perhaps some of the Court’s major questions cases do not involve high stakes or sufficiently high stakes (Premise 4).¹⁰⁷ Premise 7 is also controversial: just because a text’s meaning is unclear does not necessarily imply that it should be interpreted against an agency delegation (perhaps instead, it should be interpreted with a presumption of judicial nonintervention).¹⁰⁸

Nevertheless, most of our attention in this Article is on the two Empirical Premises, 1 and 2. Whatever the argument is, it is clear that these two premises are central: the “high-stakes” argument repeatedly appeals to these claims.¹⁰⁹ If these premises—and especially the second premise—are empirically invalid, the entire argument is a nonstarter. Part II of this Article presents evidence bearing on Premise 1, and Part III presents original empirical studies bearing on both Premise 1 and Premise 2. To preview the findings: (1) although academic philosophers have long assumed that higher stakes reduce knowledge, many studies find that stakes have no effect on ordinary people’s knowledge attributions;¹¹⁰ and (2) we find a very small effect of stakes on knowledge (far from sufficient to conclude that “the ordinary reader” is stakes-sensitive about knowledge), and no effect of stakes on linguistic clarity.¹¹¹

C. The Major Questions Doctrine as an Anti-Literal Linguistic Canon

A second argument for the “linguistic” MQD surfaced in summer 2023. Justice Barrett’s concurrence in Biden v. Nebraska proposes that the MQD has a linguistic basis in ordinary people’s anti-literalism and sensitivity to context.

The crux of the argument is an appeal to the predicted reaction of ordinary people to everyday situations, such as Justice Barrett’s “babysitter” hypothetical:

Consider a parent who hires a babysitter to watch her young children over the weekend. As she walks out the door, the parent hands the babysitter her credit card and says: “Make sure the kids have fun.” Emboldened, the babysitter takes the kids on a road trip to an amusement park, where they spend two days on rollercoasters and one night in a hotel. Was the babysitter’s trip consistent with the parent’s instruction? Maybe in a literal sense, because the instruction was open-ended. But was the trip consistent with a reasonable understanding of the parent’s instruction? Highly doubtful. In the normal course, permission to spend money on fun authorizes a babysitter to take children to the local ice cream parlor or movie theater, not on a multiday excursion to an out-of-town amusement park. If a parent were willing to greenlight a trip that big, we would expect much more clarity than a general instruction to “make sure the kids have fun.”¹¹²

Justice Barrett explains that additional context could make a difference, including (1) “maybe the parent left tickets to the amusement park on the counter,” (2) “[p]erhaps the parent showed the babysitter where the suitcases are, in the event that she took the children somewhere overnight,” (3) “maybe the parent mentioned that she had budgeted $2,000 for weekend entertainment,” (4) the “babysitter had taken the children on such trips before,” or (5) “if the babysitter were a grandparent.”¹¹³ Notably, not all of these are additions to the text of the statement. We are sympathetic to this view of non-text-based context, but it is arguably a significant departure from traditional text-focused textualism.¹¹⁴

Moreover, Justice Barrett argues that the babysitter hypothetical illustrates how “we communicate conversationally” and that the MQD merely represents “common sense” in a different context:

In my view, the major questions doctrine grows out of these same commonsense principles of communication. Just as we would expect a parent to give more than a general instruction if she intended to authorize a babysitter-led getaway, we also “expect Congress to speak clearly if it wishes to assign to an agency decisions of vast ‘economic and political significance.’ ” That clarity may come from specific words in the statute, but context can also do the trick. Surrounding circumstances, whether contained within the statutory scheme or external to it, can narrow or broaden the scope of a delegation to an agency.¹¹⁵

This justification coheres with Justice Barrett’s “ordinary speaker” approach to interpretation. In Congressional Insiders and Outsiders, Justice Barrett argues that judges should approach language “from the perspective of an ordinary English speaker—a congressional outsider.”¹¹⁶ This generally requires avoiding insider knowledge about Congress: “What matters to the textualist is how the ordinary English speaker—one unacquainted with the peculiarities of the legislative process—would understand the words of a statute.”¹¹⁷

While Justice Barrett’s babysitter example is intriguing, it is not immediately clear how it supports the MQD. A skeptic might read the babysitter-to-MQD argument as committing a “motte” and “bailey” fallacy, conflating one position that is very easy to defend (the motte) with one much harder to defend (the bailey). It is undeniable that context influences interpretation and it would not be surprising that ordinary people are more confident in delegation of power with additional supporting contextual evidence. If the babysitter had previously taken the children on trips ((4) from above) or the agency had a longstanding practice of developing new programs, that context would often make readers equally or more confident that a text delegating authority to that agent encompasses similar action.

But this observation (that context can lend further support to particular actions taken pursuant to a delegation) does not justify the MQD. Justice Barrett’s key claim about ordinary language is much stronger, something like: ordinary people understand general delegations to X to be limited to only the most reasonable ways to X, absent further textual or contextual support for X. Recall Justice Barrett’s argument about the babysitter’s trip: “But was the trip consistent with a reasonable understanding of the parent’s instruction? Highly doubtful.”¹¹⁸ The central claim in the strong form of Justice Barrett’s argument is not merely that context matters but that absent supporting context, ordinary delegations are limited to the set of most reasonable applications of the instruction.

To appeal to the “motte” claim in support of the “bailey” claim is to trade on an obvious fact about context to support a highly controversial claim about intuitive understanding of delegations. We do not, however, read Justice Barrett to make such a slippery move. There is a more charitable way to read her concurrence (that is, relying on the stronger key claim). This reading relies on an interesting and empirically testable question: When a text delegates to an agent the power to X with general language, do people intuitively understand the delegation to be limited to only the set of the most reasonable/natural ways to X, or do they understand the delegation more broadly (even if not entirely literally)? For example, when a parent instructs a babysitter to “use this credit card to make sure the kids have fun this weekend,” does that authorize only the most reasonable actions (for example, ordering pizza, ordering a movie), or does it also authorize some actions that would be understood as less reasonable (for example, taking the kids to an amusement park)? Similarly, when Congress delegates to an agency, is the agency limited to only the set of most reasonable understandings (absent supporting context), or do people understand delegations to communicate a broader (if not quite literal) authorization?

Justice Barrett’s “linguistic defense” of the MQD leaves some questions open—the quotations above capture the bulk of the defense. Our formal reconstruction of the arguments follows.

1. Reconstruction of the “Anti-Literalism” Defense of the Major Questions Doctrine

(1) [Definition: Ordinary Majorness]: For a given rule, an action is “major” if the ordinary reader understands it, absent additional context, as not among the set of most reasonable ways to follow the rule.¹¹⁹

(2) [Definition: MQD Case]: In an MQD case, the agency’s statutory powers are defined in linguistic terms that are semantically clear but highly general. The agency is exercising “vast powers” of great economic/political significance and pointing to the statutory language as authorization.

(3) [Empirical Premise: MQD Cases Involve Ordinary Majorness]: The ordinary reader takes MQD cases to involve a “major” action (that is, in the MQD cases, the ordinary reader takes the contested action, absent additional context, as not among the most reasonable ways to follow the rule).

(4) [Textualist Premise]: Judges should interpret statutory language from the perspective of the ordinary reader.¹²⁰

(5) [Empirical Premise]: Absent additional context, the ordinary reader understands rules that delegate power to an agent to have significant contextual limitations against all “major” actions; such a rule’s communicative content is limited to authorizing only the set of most reasonable actions.

(6) [Conclusion]: In MQD cases, absent additional context, judges should interpret delegations to exclude all major actions.

II. PHILOSOPHICAL AND EMPIRICAL BACKGROUND

The previous Part introduced the two linguistic MQD arguments, one concerning high stakes and one concerning anti-literalism. This Part provides background from philosophy and empirical studies related to these arguments.

Some of the questions at the heart of the “high-stakes” MQD defense have been long debated by epistemologists (philosophers who specialize in the study of knowledge). More recently, the same questions have been studied empirically by psychologists and experimental philosophers.¹²¹ Much of this work challenges a premise in the high-stakes MQD argument: although philosophers have claimed high stakes impact knowledge, high stakes have (at most) a small effect on ordinary judgments of knowledge. Section II.A reviews this research.

Section II.B provides background related to Justice Barrett’s claims about context and anti-literalism. Context matters in interpretation, and recent research has found that ordinary people understand law in line with anti-literalism, as Justice Barrett notes. However, there is no extant research that supports the stronger empirical premise in the anti-literalism argument.

A. Stakes and Knowledge

1. Philosophical Epistemology of Stakes and Knowledge

For decades, philosophers have evaluated stakes’ impact on knowledge with hypothetical “thought experiments.”¹²² Consider a pair of cases as an example.¹²³ The only differences between cases are highlighted in italics.

(1) Low-Stakes Bank Deposit:

Bob and Jane are considering whether to stop at the bank to deposit a check on a Friday. Nothing turns on whether they deposit the check in the next week. The line is long, and they consider coming back on Saturday. Bob says that he remembers that the bank was open last Saturday, and Jane replies that banks sometimes change their hours. Bob says, “I know the bank will be open tomorrow.”

In this case, many philosophers claim that Bob knows that the bank will be open tomorrow.¹²⁴ Now consider a slight variation on this case.

(2) High-Stakes Bank Deposit:

Bob and Jane are considering whether to stop at the bank to deposit a check on a Friday. It is critical that the check is deposited on one of the next two days. On Sunday, there will be a large debit to Bob’s account, which does not currently have enough funds, and the check is Bob’s only means to cover that expense. The line is long, and they consider coming back on Saturday. Bob says that he remembers that the bank was open last Saturday, and Jane replies that banks sometimes change their hours. Bob says, “I know the bank will be open tomorrow.”

In this case, philosophers say that Bob’s statement is false.¹²⁵ He does not know the bank will be open tomorrow.

The epistemology literature has taken philosophers’ shared reactions to these cases as intuitive data. And philosophers have offered different theories to make sense of that data. These are rich and complicated philosophical debates, which this Article does not have the space to rehearse or explore deeply.¹²⁶ Our principal interest is in how this work has informed recent debates in legal philosophy.

Legal-philosophical scholarship has drawn on this work in epistemology in support of the claim that high-stakes legal interpretation differs from lower-stakes interpretation. Ryan Doerfler suggests that high-stakes contexts influence textual clarity,¹²⁷ and Wurman piggybacks on this premise to argue that stakes sensitivity supports the MQD.¹²⁸ Importantly, these legal applications appeal to “ordinary speakers”¹²⁹ and “ordinary epistemic justification,” especially reactions to the bank cases described above.¹³⁰ A starting premise is that, for ordinary speakers of ordinary language, stakes impact knowledge; this is typically illustrated by the low- and high-stakes bank example.

2. Do Stakes Impact Knowledge? Empirical Perspectives

Despite the pedigree of the stakes-knowledge literature, there is one big problem: many empirical studies report that stakes have no effect on ordinary attributions of knowledge. As Joshua Knobe & Jonathan Schaffer explain, “[l]ooking at this recent evidence, it is easy to come away with the feeling that the whole contextualism debate was founded on a myth. The various sides offered conflicting explanations for a certain pattern of [stakes-sensitive] intuitions, but the empirical evidence suggests that this pattern of intuitions does not exist.”¹³¹

Much of this evidence comes from “experimental philosophy.” Rather than relying on the intuitions of philosophers (some of whom might have a lot at stake in intuitions about contextualism), experimental philosophers examine the understandings of ordinary people. Moreover, they often conduct experiments, which present different participants with different versions of the same scenarios, varying in only one respect (for example, higher stakes). This allows experimenters to draw inferences about whether certain factors (for example, stakes) affect people’s judgments in these cases. Some readers may be familiar with experimental philosophy’s testing of the well-known “trolley dilemma.”¹³² Many have also poured substantial effort into testing the influence of stakes on knowledge, especially in the “bank cases.”

Do stakes affect lay attributions of knowledge? Many studies report no.¹³³ As one important example, consider the study conducted by David Rose and other contributing authors. They gave participants versions of the bank case described at the start of this Section. They collected data from over 3,500 participants across 16 countries. The vast majority of countries show no significant effect, and for the few that show an effect, the size is very small (about a 10% difference in low- versus high-stakes cases). The researchers conclude that, overall, there is “virtually no evidence that stakes affect knowledge attribution.”¹³⁴

Other papers report a complicated pattern for other epistemic notions besides knowledge. For example, Mark Phelan finds no effect of stakes on judgments about how (epistemically) confident someone should be in a between-subjects study, but he finds an effect in a within-subjects study (when the same participant considered matched cases).¹³⁵

Other studies report stakes effects for more complicated (and perhaps controversial) measures of knowledge. As an example, consider Alexander Dinges and Julia Zakkou’s study.¹³⁶ This study instructed participants to consider a scenario in one of three versions. All scenarios began with the following:

Picture yourself in the following scenario:

You and Hannah have been writing a joint paper for an English class. You have agreed to proofread the paper. You’ve carefully proofread the paper 3 times and used a dictionary if necessary. You spotted and corrected a few typos, but you didn’t find any typos in the last round anymore.

You meet up with Hannah to finally submit the paper. Hannah asks whether you think there are no typos in the paper anymore. You respond:

“I know there are no typos anymore.”

At this point, . . .

Then, the scenarios proceeded in either a “neutral,” “stakes,” or “evidence” version. The “stakes” manipulation sought to change the practical significance of the knowledge claim, while the “evidence” manipulation sought to change the evidence base on which the knowledge claim rests.

Neutral: . . . Hannah reveals to you for the first time that she’s always been a big fan of the Backstreet Boys. You’ve never liked the Backstreet Boys, but since you like Hannah, you promise to listen to a few songs she particularly recommends. You doubt that it will change your mind but agree that it doesn’t hurt to give it a try. As you’re about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

Stakes: . . . Hannah reveals to you for the first time that it is extremely important for her to get an A in the English class. Her scholarship depends on it, and she’ll have to leave college if she loses the scholarship. If there is a typo left in the paper, she’s very unlikely to get an A, so it is extremely important to her that there are no typos in the paper. As you’re about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

Evidence: . . . Hannah reveals to you for the first time that she’s secretly read your previous term papers and always spotted lots of typos in them even when you said you had carefully proofread them. She apologizes for not telling you earlier. You are slightly disappointed but forgive her. Hannah is a good friend, and you appreciate that she was honest with you in the end. As you’re about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

All scenarios ended with: “I do” or “I don’t,” asking participants to pick the response they would be more likely to give.

Using this “stand by” question, the researchers found a difference. In the “Neutral” version, 94% of participants stood by their knowledge claim (“I do”); in the “Stakes” version, 76% of participants stood by; and in the “Evidence” version, 42% stood by. The researchers found similar results in a bank case. The Neutral-Stakes difference suggests that stakes can impact knowledge attributions. The Stakes-Evidence difference indicates that other factors (for example, an attributor’s evidence base) also matter and can have a larger effect than stakes. The Neutral-Stakes difference (94% versus 76%) is one of the larger stakes effects reported in the literature.¹³⁷
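The comparison among the three versions can be made concrete with a short calculation. The following is only an illustrative sketch: it takes the three reported stand-by rates as given and computes the percentage-point gaps between versions. The variable and function names are our own and do not come from the underlying study.

```python
# Stand-by rates (percent answering "I do") reported above for each version.
stand_by = {"Neutral": 94, "Stakes": 76, "Evidence": 42}

def gap(a, b):
    """Percentage-point difference in stand-by rates between versions a and b."""
    return stand_by[a] - stand_by[b]

stakes_gap = gap("Neutral", "Stakes")        # drop attributable to higher stakes
evidence_gap = gap("Neutral", "Evidence")    # drop attributable to weaker evidence
between_gap = gap("Stakes", "Evidence")      # how much further evidence moves responses

print(f"Neutral vs. Stakes:   {stakes_gap} points")
print(f"Neutral vs. Evidence: {evidence_gap} points")
print(f"Stakes vs. Evidence:  {between_gap} points")
```

On these figures, the evidence manipulation moves responses roughly three times as far from the neutral baseline as the stakes manipulation does.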

It is not clear if agreement with “standing by” a claim is equivalent to agreement with knowledge of a claim. To “stand by” a claim calls to mind the action associated with the claim (that is, going to the bank today or not). From a cost-benefit perspective, stakes are relevant to action. The rising expected cost of failing to act in light of a possible bank closure or paper typo is relevant to a rational actor’s decision-making. Arguably, some of the observed small impacts of stakes on lay attributions of knowledge could be reflecting lay participants’ actionability judgments: in the high-stakes context, Bob’s knowledge has not changed, but whether he should go to the bank has changed.

Overall, the evidence is mixed concerning whether stakes impact ordinary knowledge attributions. Historically, many philosophers had stakes-sensitive knowledge intuitions, predicted that others would, and developed complex theories about those effects.¹³⁸ Yet, a large number of empirical studies of thousands of ordinary participants, across many languages and cultures, have found no impact of stakes, or only a very small effect, on knowledge.¹³⁹ Very recently, one new study has reignited the debate, finding some support for the impact of stakes on epistemological judgments.¹⁴⁰ Yet, another recent study reports that stakes do not affect judgments about knowledge¹⁴¹ but do affect judgment about action.¹⁴² In total, there is evidence pointing in both directions. Resolving the debate will require further empirical research as well as systematic theorizing of the seemingly conflicting empirical results.

Consequently, it remains far from settled that high stakes reduce knowledge for “the ordinary person.” Most studies have found that stakes do not impact knowledge in this way. And even for the studies that do report an effect, it is small. If 95% of participants evaluate that there is knowledge in a low-stakes case, and 80% evaluate that there is knowledge in a comparable high-stakes case, does this imply that the “ordinary person” has stakes-sensitive knowledge intuitions? Advocates of ordinary stakes sensitivity need to spell out why stakes-sensitivity manifesting in 10–15% of ordinary participants implies that the ordinary reader has stakes-sensitive knowledge.

The claim that high stakes impact knowledge figures prominently in the argument for a high-stakes linguistic MQD.¹⁴³ Extant legal literature has drawn heavily on this claim to support the view that “high-stakes” interpretation differs from lower-stakes interpretation. In doing so, it has drawn primarily from hypotheticals in academic philosophy (the “bank cases”) and intuitions about those hypotheticals offered by academic philosophers. Insofar as the legal literature concerns stakes’ impact on ordinary people’s knowledge attributions,¹⁴⁴ those legal debates would benefit from greater engagement with the large body of recent empirical work summarized in this Subsection.

3. From Philosophy to Legal Philosophy

The previous two Subsections have introduced the debate about stakes and knowledge in epistemology. But it is important to recall that the connection of this debate to legal philosophy requires another step. For example, Doerfler proposes a connection between “clarity” or “plain meaning” of a statute and knowledge about the statute’s meaning: “[T]o say that the meaning of a statute is ‘clear’ or ‘plain’ is, in effect, to say that one knows what that statute means.”¹⁴⁵ The logic appears to be that clarity attributions are a subset of knowledge claims, such that a property demonstrated to affect knowledge claims should transitively affect clarity claims.

Ultimately, this relationship between knowledge and clarity is outside the scope of our Article (the relevant question for the linguistic MQD is stakes’ impact on clarity). However, there are some philosophical questions to raise about the proposed relationship between clarity and knowledge. One, which we described earlier, concerns technical meaning. A layperson might not know what a statute means because it is technical, yet the statute may not be “unclear” to that person in the relevant sense of clarity (that is, ambiguous). As another difference, consider factivity. Philosophers often propose that knowledge is factive: I know p only if p. But it is not obvious that clarity is factive. The meaning of a statute might appear clear (that is, not ambiguous) to an agent while the agent is wrong about the statute’s meaning, and thus the agent lacks knowledge of the statute’s meaning. Such a case would be a counterexample to the claim that an agent knows what a statute means if and only if the meaning of the statute is clear.

Most importantly, the empirical evidence about ordinary attributions of knowledge reviewed here—to the extent that it even does support stakes sensitivity—does not necessarily extend to ordinary determinations of whether statutory text is clear. The studies to date mostly used the bank case, but the bank case presents no rule to which clarity judgments might attach. It might be possible that the clarity of rules is reduced for ordinary people in higher-stakes contexts. Indeed, it is theoretically possible that clarity judgments about textual rules are more sensitive to stakes than knowledge more generally. But it is just as possible that there is a breakage: that is, that clarity claims are not simply a subset of knowledge claims but a special and different kind of knowledge claim. However, as far as we are aware, these are entirely untested empirical hypotheses. Without any empirical evidence specific to clarity claims, it would not be possible to bootstrap ordinary stakes-sensitive clarity from ordinary stakes-sensitive knowledge (moreover, as we have argued, ordinary stakes-sensitive knowledge is also empirically dubious). Part III therefore tests this clarity claim.

B. Context and Anti-Literalism

Justice Barrett’s concurring opinion in Biden v. Nebraska offers a different argument for the MQD as a linguistic canon. For Justice Barrett, the MQD simply reflects “common sense” inferences about how broader context restricts language’s (literal) meaning.¹⁴⁶ Justice Barrett illustrates this with the babysitter example, claiming that ordinary people understand a delegation to a babysitter to have implicit limits (although a babysitter’s attempt to transgress those normal limits might be allowed by a supplemental clear authorization). This, Justice Barrett suggests, is precisely how an ordinary reader would read a statute delegating authority to an agency, and therefore a canon requiring a clear statement from Congress is justified.¹⁴⁷

1. Anti-Literalism and Context in Ordinary Language

Anti-literalism is an important feature of ordinary language. Consider François Recanati’s discussion of the “You are not going to die” example from Kent Bach:

[Imagine] a child crying because of a minor cut and her mother uttering . . . [“you are not going to die”] in response. What is meant is: “You’re not going to die from that cut.” But literally the utterance expresses the proposition that the kid will not die tout court—as if he or she were immortal. The extra element contextually provided (the implicit reference to the cut) does not correspond to anything in the sentence itself; nor is it an unarticulated constituent whose contextual provision is necessary to make the utterance fully propositional.¹⁴⁸

This example helpfully illustrates that we often understand propositions anti-literally, in light of context, and that the relevant context need not come from the statement itself. The very same words, “you’re not going to die,” convey a different meaning when uttered after a child gets a cut than they would in some other context where the literal meaning would be the correct meaning.

The powerful influence of context is not limited to anti-literalism. Extratextual context can also disambiguate. As an example, consider the statement “Do not take drugs and alcohol.” Does this mean “Do not take either one?” Or does it mean “Do not take the two together?” The answer varies across contexts.

If this rule were presented in the context of a substance abuse counseling session, our extratextual knowledge about that session leads us to understand this text [to prohibit each individually]: Don’t take drugs; don’t take alcohol. However, if this rule were presented in the context of a patient’s annual physical, in which the doctor prescribed cholesterol-reducing medications, our extra-textual knowledge about that session encourages [understanding the rule to prohibit the combination].¹⁴⁹

2. Anti-Literalism in Ordinary Understanding of Legal Rules

Justice Barrett’s argument is attractive in its appeal to context and anti-literalism. And Justice Barrett is not the only modern textualist to appeal heavily to anti-literalism; Justices Gorsuch and especially Kavanaugh have also called attention to the perils of overliteral interpretation.¹⁵⁰

For modern textualism, this is a welcome development. Analysis of the (linguistic) meaning of legal rules should attend to context and exceed pure literalism. As one example, consider the linguistic canons. Many linguistic canons reflect intuitive contextual restrictions from literal meaning. “No cars, trucks, or other vehicles may enter the park” might literally prohibit bicycles from the park, as most ordinary people take a bicycle to be a vehicle.¹⁵¹ However, the principle of ejusdem generis instructs interpreters to construe the broad, catchall term “vehicle” in light of the listed items (“cars,” “trucks”).¹⁵² Even if laypeople are not familiar with the name “ejusdem generis,” they intuitively apply this kind of reasoning when analyzing both legal and ordinary rules.¹⁵³

People also apply other types of contextual restrictions from literal meaning. This includes some contextual rules that are not currently recognized by courts as linguistic canons. For example, people understand that universal quantifiers like “any” often do not mean literally any.¹⁵⁴ If this tendency were at least as systematic in ordinary understanding as those underlying conventional linguistic canons (for example, the tendency to restrict catchall terms as ejusdem generis reflects), a textualist committed to the ordinary reader should employ those new canons (for example, the “quantifier domain restriction canon”).

Recent legal scholarship has also asked whether thinking about context and anti-literalism might reveal that some “substantive” canons are also linguistic canons.¹⁵⁵ Some clear statement rules—such as anti-retroactivity and anti-extraterritoriality—could be seen as linguistic canons, based on our understanding of context. Taken literally, many statutes would seem to apply at all times, in all places.¹⁵⁶ But people understand statutes to communicate temporal and geographical restrictions: while there is some division among laypeople, overall, people tend to understand rules to apply only prospectively, and only territorially.¹⁵⁷

Textualists may rhetorically privilege the “ordinary reader” and express support for anti-literalism, but they have not yet adopted many of these suggestions. For instance, no textualist has adopted an anti-literal “quantifier domain restriction canon” or theorized anti-retroactivity as a linguistic canon (although it is a long-standing clear statement rule). These context-sensitive rules are relatively robust and systematic and are supported by empirical evidence. We have reservations about a textualism that ignores such systematic patterns of anti-literalism while also freely adopting “ad hoc” anti-literal arguments related only to particular cases. On this score, Justice Barrett’s concurrence in Biden v. Nebraska is commendable in hypothesizing about a broader contextual principle that generally guides ordinary understandings of delegations (that is, a principle applying across cases, not an ad hoc appeal to context and anti-literalism related only to the authorization of emergency student loan relief). Whether Barrett’s contextual principle is systematic and empirically supported is a separate question.

Anti-literalism and contextual restriction are powerful ideas that accurately reflect language usage, but if textualists have no theory about when one can appeal to them, there is a danger that textualists can freely frame different readings as “literal” and “anti-literal,” choose liberally among them, or simply ignore non-literal meanings when doing so is convenient.¹⁵⁸ The claim that “in context,” a text does not “literally” mean what it says is also a powerful way for motivated interpreters to escape a text’s clear meaning.

Context matters. But if textualists have no theory about what counts as context and when they must appeal to it, ad hoc appeals to context are like “looking out over a crowd and picking out your friends.”¹⁵⁹ Except here, the “friends” are not even limited to preexisting sources; they also include entirely novel hypothetical examples generated by the judge.

3. Contextual Restriction of Delegations?

As Part I argued, the “anti-literalism” argument of the linguistic MQD needs a stronger premise than simply “people sometimes understand language non-literally.” The mere fact that “you are not going to die” has a nonliteral meaning does not justify the MQD.

The premise necessary to the argument involves a new claim about ordinary understanding of delegations. Justice Barrett proposes that there is some MQD-like principle that is part of ordinary people’s common sense, concerning the limited authorization from a general delegating instruction. It is for this reason that she relies on the babysitter hypothetical, an anti-literalism intuition-pump about an ordinary instruction that delegates power to an agent. General delegation language, Justice Barrett posits, has an anti-literal limitation. Unless there is further specific authorization, that general language is understood to be limited to only the most reasonable actions.

This is an interesting and empirically testable proposition: ordinary people understand general delegations to be limited to only the most reasonable actions falling under the language of the delegation. As far as we know, there is no empirical study that has examined this question. We present a new study to do so in Section III.B.

III. NEW EMPIRICAL EVIDENCE

This Part tests key empirical claims at the core of the linguistic arguments for the MQD. In both tests, we seek to reduce our researcher degrees of freedom (that is, eliminate cherry-picking scenarios) by relying on the exact cases that advocates of the linguistic defense offer: the high-stakes “bank case” and the “babysitter hypothetical.”

Section III.A presents a study that tests whether ordinary people’s judgments about knowledge are lowered in high-stakes contexts (using the bank case). It also examines, for the first time, whether people’s understanding of a rule is impacted: Are rules perceived as less clear in high-stakes contexts?

Section III.B presents a study to examine the babysitter case: a parent instructs the babysitter to use a credit card to “make sure the kids have fun.” Do ordinary people understand this instruction to license taking the children on a road trip to an amusement park, or do they understand it to be limited to only more reasonable actions?

Section III.C responds to the primary two objections to the studies that have appeared in print since we first publicized this Article’s empirical findings.

A. Do High Stakes Reduce Knowledge and/or Clarity? The Bank Case

1. General Overview

The first study examined whether (high) stakes reduce ordinary attributions of (1) knowledge and (2) clarity of rules. We randomly assigned participants to either a low-stakes¹⁶⁰ or high-stakes¹⁶¹ condition of the bank case. In each condition, participants read a version of the famous bank case, in which Bob and his wife discuss whether a bank is open on Saturday. Participants answered two types of knowledge questions, drawn from the previous literature.¹⁶² The basic knowledge question asks:

In your personal opinion, when Bob says “I know the bank will be open” is his statement true?

Yes, Bob’s statement is true.

No, Bob’s statement is not true.

Defenders of context sensitivity have argued that this question more accurately tracks debate about contextualism than questions that simply ask participants to rate “knowledge.”¹⁶³ The “strict” knowledge question asks:

In your personal opinion, which of the following sentences better describes Bob’s situation?

Bob knows that the bank will be open on Saturday.

Bob thinks he knows that the bank will be open on Saturday, but he doesn’t actually know it will be open.

Next, the study presented a vignette explaining that Bob’s wife now used her phone to find the bank’s policy on its website. We randomly assigned each participant to one of four types of rules (Clear, Ambiguous 1, Ambiguous 2, or Unclear):

[Clear] The bank is open on Saturdays.
[Ambiguous 1] The bank is closed on Sundays.
[Ambiguous 2] The bank is closed only on Sundays and federal holidays.
[Unclear] The bank is open during regular business hours.

Participants rated whether the rule is clear or unclear concerning whether the bank is open on Saturday:

Now imagine that Bob’s wife uses her phone to search for the bank’s policy. She finds a website for the local bank branch. The website’s text states: “[RULE]” In your personal opinion, is this rule’s meaning clear or unclear concerning whether the bank is open on Saturday?

Clear: The bank is open on Saturday.

Clear: The bank is closed on Saturday.

Unclear.

In sum, we experimentally varied two factors: Stakes (low, high) and Rule Type (Clear, Ambiguous 1, Ambiguous 2, Unclear). This study examines whether Stakes affect lay judgment of knowledge (basic and strict). The study also examines whether Stakes affect lay judgment of a rule’s clarity across hypothesized clear, ambiguous, and unclear rules.
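
The two-factor design summarized above can be sketched as a random assignment of each participant to one of eight cells. The sketch is illustrative only: the helper name `assign_conditions`, the seed, and the use of independent uniform draws are assumptions, not a description of the survey software the study actually used.

```python
import random
from collections import Counter

STAKES = ("low", "high")
RULE_TYPES = ("Clear", "Ambiguous 1", "Ambiguous 2", "Unclear")


def assign_conditions(n_participants, seed=0):
    """Independently draw a Stakes level and a Rule Type for each participant.

    Hypothetical helper: the source does not describe its randomization routine.
    """
    rng = random.Random(seed)
    return [(rng.choice(STAKES), rng.choice(RULE_TYPES))
            for _ in range(n_participants)]


assignments = assign_conditions(501)
cell_counts = Counter(assignments)  # participants per Stakes x Rule Type cell
for cell in sorted(cell_counts):
    print(cell, cell_counts[cell])
```

Because each participant sees exactly one cell, any difference in clarity judgments between cells can be attributed to the manipulated factors rather than to who happened to read which version.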

2. Methodological Details

All study materials, hypotheses, exclusion criteria, and primary analyses were preregistered at Open Science.¹⁶⁴ The study data is also available at the same site. A total of 501 participants were recruited from Prolific.co and compensated $1.00 ($12.00/hr) for a 5-minute task. To be eligible, participants must have completed at least 10 tasks on Prolific, with a 100% approval rating, and they must currently reside in the United States.

Within the study, there were several check questions. First was a simple attention check question, which asked participants to select the answer “purple” in a long list of colors. There was also a manipulation check, clearly labeled as an “attention check”: “Attention check question: According to the story, which of the following statements is correct?” The options were “it is very important that Bob and his wife deposit their money” [correct answer in high-stakes condition] and “it is not very important that Bob and his wife deposit their money” [correct answer in low-stakes condition]. Later in the study, there was a third multiple-choice attention check: “Alex is taller than Sam, and Sam is taller than John. Who is the shortest?” [correct answer = “John”; incorrect answers = “Alex,” “Sam,” “They are all the same height”]. Finally, all participants were asked to complete a CAPTCHA. Participants who answered any one of these questions incorrectly were excluded from the analyses. Thirty-two (out of 501; i.e., 6%) participants were excluded under these criteria.
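
The exclusion rule described above (failing any single check removes a participant) can be expressed compactly. In the sketch below, the field names and the three example records are hypothetical; only the counts in the final lines (501 recruited, 32 excluded) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Responses:
    # Hypothetical field names, one per check described in the text.
    color_check_ok: bool         # selected "purple"
    manipulation_check_ok: bool  # matched the assigned stakes condition
    transitivity_check_ok: bool  # answered "John"
    captcha_ok: bool


def is_included(r: Responses) -> bool:
    """A participant is retained only if every check was passed."""
    return all((r.color_check_ok, r.manipulation_check_ok,
                r.transitivity_check_ok, r.captcha_ok))


sample = [
    Responses(True, True, True, True),
    Responses(True, False, True, True),   # failed the manipulation check
    Responses(True, True, True, False),   # failed the CAPTCHA
]
retained = [r for r in sample if is_included(r)]

# Arithmetic reported in the text: 32 of 501 excluded.
recruited, excluded = 501, 32
analyzed = recruited - excluded
exclusion_rate = excluded / recruited
print(len(retained), analyzed, f"{exclusion_rate:.1%}")
```

The retained count of 469 is the sample size that the Results subsection reports.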

3. Results

A total of 469 participants were included in the data analysis (mean age = 39.58; 50% men, 48% women, 1% non-binary).

A binomial logistic regression revealed an effect of Stakes on knowledge. Participants attributed knowledge less in high-stakes cases (prob. = 0.86, 95% CI = [0.81, 0.90]) than in low-stakes cases (prob = 0.95, 95% CI = [0.91, 0.97]), odds ratio = 0.35, 95% CI = [0.18, 0.70], z = -2.99, p = 0.003.¹⁶⁵

Figure 1. Percentage attributing knowledge (top panel) and strict knowledge (bottom panel), in low- and high-stakes bank cases. In the high-stakes case, knowledge attributions were slightly (about 10 percentage points) lower. Overall, the majority of participants attributed knowledge in low- and high-stakes cases.

A binomial logistic regression revealed an effect of Stakes on strict knowledge. Participants attributed strict knowledge less in high-stakes cases (prob. = 0.66, 95% CI = [0.60, 0.71]) than in low-stakes cases (prob = 0.78, 95% CI = [0.72, 0.83]), odds ratio = 0.55, 95% CI = [0.37, 0.83], z = -2.85, p = 0.004.¹⁶⁶
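
The odds ratios reported for the two knowledge measures relate to the estimated probabilities in a simple way: the odds of attributing knowledge in the high-stakes condition divided by the odds in the low-stakes condition. The sketch below recomputes that ratio from the rounded probabilities given in the text, so its results only approximate the reported 0.35 and 0.55, which were presumably estimated from unrounded model output.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_high, p_low):
    """Odds of the outcome under high stakes relative to low stakes."""
    return odds(p_high) / odds(p_low)


# Rounded probabilities as reported for Study 1.
basic = odds_ratio(0.86, 0.95)
strict = odds_ratio(0.66, 0.78)
print(f"basic knowledge:  {basic:.2f}")
print(f"strict knowledge: {strict:.2f}")
```

An odds ratio below 1 indicates lower odds of attributing knowledge under high stakes, even though the absolute probabilities remain well above one half in both conditions.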

A multinomial logistic regression examined the effect of Stakes (low, high) and Rule Type (Clear, Ambiguous 1, Ambiguous 2, Unclear) on judgment of the bank rule’s clarity (clearly open, clearly closed, unclear). First, consider the effect of Stakes. Comparing clearly open and clearly closed responses, there was no effect of Stakes, z = 0.06, p = 0.956. Comparing clearly closed and unclear responses, there was no effect of Stakes, z = 0.38, p = 0.705. Next, consider the effect of Rule Type. Comparing clearly open and clearly closed responses, there was a significant effect of the clear versus unclear rule, z = -3.07, p = .002. There was no significant effect among the other rule types, |zs| < 0.21, ps > 0.8. Comparing clearly closed and unclear responses, there were no significant rule type effects, |zs| < 0.2, ps > 0.85. Finally, there were no significant Stakes * Rule Type interactions, |zs| < 0.41, ps > 0.68.¹⁶⁷

Figure 2. Percentage attributing a clear meaning (open or closed) or unclarity for four different rules in low- and high-stakes cases. There were large and significant differences among the rules’ perceived meaning: the “Obviously Clear” and “Ambiguous 2” rules were generally understood to mean clearly open; the “Ambiguous 1” rule was understood to be unclear or mean clearly open; and the “Obviously Unclear” rule was unclear. However, there was no impact of high stakes on clarity judgments for any type of rule, whether the rule was clear (for example, Obviously Clear), ambiguous (for example, Ambiguous 1), or unclear (for example, Obviously Unclear).

4. Discussion

The results regarding stakes and knowledge are consistent with the prior literature. Some previous studies have found a small effect of stakes on knowledge in the United States.¹⁶⁸ Here, we find a similar small effect: In the low-stakes bank case, 95% attribute knowledge, but in the high-stakes bank case, this number drops to 86%. The “strict knowledge” measure reflects a similarly sized difference (78% versus 66%).

i. Is Knowledge “Sensitive” to Stakes?

The empirical results clarify the importance of refining this philosophical question: What is it for ordinary knowledge to be “sensitive to stakes”? One (weak) interpretation is that in some circumstances, for some people, stakes affect knowledge attributions. A stronger interpretation is that for most or all people, there are some cases in which knowledge is lost in high-stakes contexts. The strongest interpretation is that in many or most circumstances, high stakes defeat knowledge (for many or most people).

Once we have greater philosophical clarity about what it means to say knowledge is sensitive to stakes, we can analyze those theses in light of the empirical results. The results here straightforwardly provide support for the weak interpretation: the high-stakes manipulation affects (some participants’) attributions of knowledge. But the results do not support the “stronger” or “strongest” interpretations. The vast majority of participants attributed knowledge in low- and high-stakes cases. And even for the “strict knowledge” question, most participants still judged that there was (strict) knowledge in the high-stakes scenario. In other words, for the vast majority of participants, stakes did not impact knowledge.

ii. Do High Stakes Reduce Clarity?

The results provide a more straightforward answer to this question. The high- versus low-stakes manipulation had no impact on whether people understood rules to be clear or unclear. Importantly, we used four types of rules, which varied in their basic level of clarity. With respect to whether the bank is open on Saturday, “the bank is open on Saturdays” is obviously clear; “the bank is closed on Sundays” is ambiguous; “the bank is closed only on Sundays and federal holidays” is ambiguous;¹⁶⁹ and “the bank is open during regular business hours” is unclear. For all of these rules, high stakes did not increase the base level of unclarity.¹⁷⁰

B. Ordinary Understanding of Delegations: The Babysitter Case

The second study examines how ordinary Americans understand delegations in an ordinary context. This Study takes inspiration from Justice Barrett’s recent concurrence in Biden v. Nebraska, which offered a new linguistic defense of the MQD.

1. General Overview

The second study examined Premise 5 from Justice Barrett’s argument, the second empirical premise: When assessing whether an agent has followed or disobeyed a rule granting authority to perform some actions, do ordinary people restrict the rule’s literal meaning to only the set of most reasonable actions (absent additional context)?¹⁷¹ Study 2 examines this question by presenting participants with an ordinary rule granting authority, followed by one of five possible actions. These five actions varied in their anticipated reasonableness, and we examined whether participants evaluated each as following or violating the rule.

As in Study 1, we sought to minimize our researcher degrees of freedom by relying on existing and important test cases that have been offered by advocates of the linguistic MQD. For Study 2, we chose Justice Barrett’s “babysitter” hypothetical, as well as Justice Barrett’s proposed “major” action: a babysitter taking children to an amusement park in response to the instruction “Use this credit card to make sure the kids have fun this weekend.”

We randomly varied the conventional gender of the parent’s name (Patrick or Patricia) and babysitter’s name (Blake or Bridget). This did not affect rule violation judgment. Below is the text of the scenarios with the names Patricia and Blake:

Imagine that Patricia is a parent, who hires Blake as a babysitter to watch Patricia’s young children for two days and one night over the weekend, from Saturday morning to Sunday night. Patricia walks out the door, hands Blake a credit card, and says: “Use this credit card to make sure the kids have fun this weekend.”

Next, the scenario continued in one of five ways:

[MISUSE] Blake only uses the credit card to rent a movie that only he watches; Blake does not use the card to buy anything for the children.

[MINOR] Blake does not use the credit card at all. Blake plays card games with the kids.

[REASONABLE] Blake uses the credit card to buy the children pizza and ice cream and to rent a movie to watch together.

[MAJOR] Blake uses the credit card to buy the children admission to an amusement park and a hotel; Blake takes the children to the park, where they spend two days on rollercoasters and one night in a hotel.

[EXTREME] Blake uses the credit card to hire a professional animal entertainer, who brings a live alligator to the house to entertain the children.

All scenarios concluded with:

The kids have fun over the weekend.

We anticipated that the five scenarios would be seen as varying in their “reasonableness” as a response to the rule “Use this credit card to make sure the kids have fun this weekend,” with the REASONABLE scenario as maximal and the others as less reasonable. As we describe below, this prediction was borne out.

In all of the questions, we randomly varied whether the scenario described the parent’s directive as an “instruction” or “rule.” This also had no effect on rule violation judgment. Below we present the questions using the term “instruction.” After reading the scenario, participants first answered a comprehension question:

Attention check question: According to the story, which of the following statements is correct?

[CORRECT] Patricia’s instruction was “Use this credit card to make sure the kids have fun this weekend.”

Patricia’s instruction was “Do not use this credit card to make sure the kids have fun this weekend.”

Patricia’s instruction was “Use this credit card for anything this weekend.”

Patricia’s instruction was “Do not use this credit card for anything this weekend.”

Next, participants answered the rule violation question:

[Rule Violation] In your personal opinion, which better describes this situation?

Blake followed the instruction.

Blake violated the instruction.

We also measured participants’ judgment of the rule’s literal meaning and purpose.¹⁷² Finally, we measured participants’ evaluation of whether the babysitter’s action was a reasonable response to the instruction:

[Reasonableness] Think about how Blake responded to Patricia’s instruction. In your personal opinion, is this an unreasonable or reasonable way to respond to that instruction?

(completely unreasonable) 1 2 3 4 5 6 7 (completely reasonable)

2. Methodological Details

As with Study 1, all Study 2 materials, hypotheses, exclusion criteria, and primary analyses were preregistered at Open Science.¹⁷³ The study data is also available at the same site. A total of 500 participants were recruited from Prolific.co and compensated $1.00 ($12.00/hr) for a 5-minute task. To be eligible, participants must have completed at least 10 tasks on Prolific, with a 100% approval rating, they must currently reside in the United States, and they must not have taken Study 1. Within the study, there were the same two check questions used as exclusion criteria in Study 1 (attention check and transitivity) and the new comprehension check described in the previous Section. Twenty-four (out of 499; i.e., 4.8%) participants were excluded under these criteria.

3. Results

A total of 475 participants were included in the data analysis (mean age = 37.74; 48% men, 50% women, 2% non-binary).

First, we examined whether the five acts differed in their perceived reasonableness with respect to the rule. A linear regression revealed significant effects of the Action (misuse, minor, reasonable, major, extreme). Compared to ratings for the “reasonable” act (buying pizza and a movie for the kids), ratings for the misuse act (buying a movie for only the babysitter) were significantly lower, β = -1.67, 95% CI = [-1.89, -1.46], p < .001; ratings for the minor act (playing cards rather than purchasing anything) were significantly lower, β = -0.48, 95% CI = [-0.69, -0.27], p < .001; ratings for the major act (purchasing the amusement park trip) were significantly lower, β = -1.03, 95% CI = [-1.24, -0.82], p < .001; and ratings for the extreme act (purchasing the alligator entertainer) were significantly lower, β = -1.77, 95% CI = [-1.98, -1.56], p < .001.¹⁷⁴

Figure 3: Reasonableness Ratings. Ordinary judgments of an action’s reasonableness in the babysitter hypothetical. Higher scores indicate greater reasonableness (1–7 scale).

Next, we examined which of the five acts participants understood as instances of following or disobeying the instruction. A binomial logistic regression revealed effects of Act type on rule violation. For the misuse case, rule following prob. = 0.15, 95% CI = [0.09, 0.24]; for the minor case, rule following prob. = 0.51, 95% CI = [0.41, 0.61];¹⁷⁵ for the reasonable case, rule following prob. = 1.00, 95% CI = [0.00, 1.00];¹⁷⁶ for the major case, rule following prob. = 0.92, 95% CI = [0.84, 0.96];¹⁷⁷ and for the extreme case, rule following prob. = 0.90, 95% CI = [0.82, 0.94].¹⁷⁸

Table 1.
Case	Was the rule violated?	Was the action reasonable (7) or unreasonable (1)?
Reasonable	0%	6.84 (Most reasonable)
Minor	49%	5.83 (Highly reasonable)
Major	8%	4.68 (Reasonable)
Misuse	85%	3.32 (Unreasonable)
Extreme	10%	3.12 (Unreasonable)
Note: Table 1 represents the proportion of participants judging that the action violated the rule and the estimated marginal mean ratings of the action’s reasonableness. Some actions that were not the most reasonable (for example, major, extreme) were seen as largely consistent with the rule; others that were seen as fairly reasonable (for example, minor) were also seen as inconsistent with the rule.
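
The marginal means in Table 1 can be set beside the regression coefficients reported earlier in this subsection. With the reasonable act as the reference category, each raw difference in means is roughly 2.1 times the corresponding reported coefficient, which is what one would expect if the coefficients are expressed in standard-deviation units. That reading is an inference from the published numbers, not something the text states.

```python
# Estimated marginal means from Table 1 (1-7 reasonableness scale).
means = {"reasonable": 6.84, "minor": 5.83, "major": 4.68,
         "misuse": 3.32, "extreme": 3.12}

# Coefficients reported in the text, relative to the reasonable act.
reported_beta = {"minor": -0.48, "major": -1.03,
                 "misuse": -1.67, "extreme": -1.77}

reference = means["reasonable"]
ratios = {}
for act, beta in reported_beta.items():
    raw_diff = means[act] - reference   # difference on the raw 1-7 scale
    ratios[act] = raw_diff / beta       # implied scaling factor
    print(f"{act:8s} raw diff = {raw_diff:+.2f}  "
          f"beta = {beta:+.2f}  ratio = {ratios[act]:.2f}")
```

The near-constant ratio across the four comparisons suggests that the coefficients and the table describe the same ordering of acts on two different scales.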

4. Discussion

This Study aimed to test the empirical claims underlying the “babysitter hypothetical,” an example that has been used to support claims in a linguistic defense of the MQD.

i. Do People Understand Different Actions to Vary in Their Reasonableness as a Response to the Rule “Use This Credit Card to Make Sure the Kids Have Fun This Weekend”?

Yes. People evaluated some actions as highly reasonable, such as buying pizza and a movie for the kids. Other actions appeared less reasonable, like taking the kids to an amusement park or simply playing cards (and not buying anything). Others were even less reasonable, such as hiring an alligator entertainer or using the card to only purchase something for the babysitter. These results are unsurprising, but this variation is essential to test the key claim that the babysitter hypothetical has been offered to demonstrate.

ii. Do People Understand Authorizing Rules to Be Limited to Only the Set of Most Reasonable Actions?

No. Although people evaluate Justice Barrett’s “major” action (taking the kids to an amusement park) as less reasonable than at least one alternative, they nevertheless understand it as consistent with the rule. Moreover, people evaluated the even more extreme example of bringing a live alligator to the house as consistent with the rule.

To be sure, people did rule out some actions as impermissible. In particular, the respondents overwhelmingly said that misuse of the credit card for the babysitter’s benefit rather than that of the children violated the rule. They also divided roughly evenly over the babysitter’s decision to forgo using the credit card at all. We will have more to say about these interesting patterns in Part IV,¹⁷⁹ but for now, the most important thing to note is that two of the less reasonable actions that tested the boundaries of the instruction were nevertheless deemed to be within the parent’s rule.

iii. Why Do People’s Judgments About an Act’s Reasonableness and Rule Violation Differ?

Our survey also included questions about the rule’s literal meaning and the rule’s purposes. First, consider reasonableness judgments in light of the results for purpose and literal meaning. Figure 4 presents the results for the purpose question. On inspection, this pattern of purpose attributions across actions is similar to the pattern of reasonableness ratings (Figure 3): actions seen as more reasonable were also the ones seen as most supportive of the rule’s purposes. The ratings for purpose and reasonableness, r = 0.63, 95% CI = [0.57, 0.68], p < .001, were more highly correlated than the ratings for purpose and literal meaning, r = 0.39, 95% CI = [0.31, 0.47], p < .001.

Next, consider judgments about rule violation. Both literal meaning and purpose were correlated with rule violation judgment, but rule violation was more strongly correlated with literal meaning, r = 0.67, 95% CI = [0.62, 0.72], p < .001, than purpose, r = 0.49, 95% CI = [0.42, 0.56], p < .001.
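
The confidence intervals attached to these correlations are consistent with the standard Fisher z-transformation. The sketch below assumes that method and assumes that all 475 analyzed participants contributed to each correlation; neither assumption is confirmed by the text.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)                      # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


N = 475  # assumed: participants included in the Study 2 analysis
for label, r in [("violation ~ literal meaning", 0.67),
                 ("violation ~ purpose", 0.49)]:
    lo, hi = fisher_ci(r, N)
    print(f"{label}: r = {r:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

Under these assumptions the two intervals do not overlap, which fits the observation that rule violation judgments track literal meaning more closely than purpose.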

Figure 4: Purpose Ratings. Ordinary judgments of whether an action supports (rather than opposes) the rule’s purposes in the babysitter hypothetical.

These analyses are exploratory and further work is required to more fully understand the differences in participants’ judgments about whether an action is reasonable and whether it violates a rule, but the Study here clearly shows a difference in these judgments.¹⁸⁰ The question of whether the rule was violated and the question of whether the action was a reasonable response to the rule are understood differently by ordinary people: These questions are not synonymous. The comparisons to the purpose measure suggest a stronger relationship between reasonableness and purpose than rule violation and purpose.

Textualists concerned with the ordinary meaning of rules would presumably favor the rule violation question over the reasonableness question. Textualists who place significant weight on whether an action was “reasonable” with respect to a rule may be incorporating purposive reasoning, which is not as clearly relevant to ordinary people’s straightforward understanding about whether an act violates a rule.

The results reported here about laypeople’s rule violation judgments are consistent with prior work. Previous studies have found that both text (operationalized as literal meaning) and purpose influence rule violation judgment, but the former has a stronger influence.¹⁸¹ In sum, ordinary people lean towards textualism, but not the “common sense” limitations claim at the heart of the linguistic MQD.

C. Objections

This Section considers the two primary objections that have been raised in print about the results since we first circulated a draft of this Article.

1. Objection 1: Subjects Must Be Sensitive to Stakes

One objection concerns stakes sensitivity. Wurman writes, “In conversation, Ryan Doerfler has pointed out that it does not appear that the participants [in this Article’s Study 1] were asked whether the rule was clear to Bob, as opposed to themselves, and Bob is the one sensitive to stakes in the example.”¹⁸²

It is not clear why this observation constitutes an objection. One version of this objection is that only the judgments of those directly impacted by the stakes are relevant to legal theory, and because our study’s participants are not themselves impacted by the bank’s closure, their judgments about knowledge and clarity are not useful. This objection proves too much. The legal literature theorizing the effects of stakes-on-knowledge and stakes-on-clarity draws heavily on philosophical thought experiments (especially the bank case about Bob). None of these examples involve high stakes for the thought experimenter. The stakes are always for the subject described in the scenario, like Bob. The assumption is that those considering the scenarios can evaluate the significance of stakes (for some other person). If this objection undercuts our experiments, it also undercuts the merit of the original philosophical thought experiments offered to support Wurman’s argument.

A different way to elaborate this observation into an objection is to propose that (1) there is a more subjective relationship between stakes and clarity and (2) that (subjective) sense of clarity is relevant to legal interpretation. For a particular judge, that judge’s determination of clarity depends on the practical stakes to that particular judge. We do not have the space to fully engage with the merits of this theory, but some of its consequences are unusual. Because the subjective practical stakes of a decision may vary between judges, on this subjective view of stakes and clarity, such differences in subjective stakes would appropriately correspond to differing evaluations of clarity. A judge experiencing high practical stakes could deem a text unclear, while a judge experiencing lower practical stakes could deem the same text clear. However, many would think that whether a legal text is clear or unclear (in the sense relevant to legal interpretation) should not vary among judges in this way.¹⁸³ On this highly subjective view, to predict whether a law is correctly identified as “clear” (in the eyes of a particular judge), one must know what practical stakes the (particular) judge faces.

Although we find this an unusual view about what clarity means in current legal interpretation, this objection’s underlying claim is an empirically testable one. As such, we investigate this empirical question as a robustness check: Do stakes affect people’s judgment about whether the rule is clear to Bob?

2. Objection 2: Only Parents’ Views About the Babysitter Hypothetical Count

A recurring objection to our study about the babysitter hypothetical concerns the population surveyed. Both Josh Blackman and Ilan Wurman have suggested that the appropriate audience for Justice Barrett’s hypothetical is parents.¹⁸⁴ Because Justice Barrett’s hypothetical involves a parent, the objection goes, we should look to (only) the views of parents in understanding the meaning of the parent’s instruction.

Our original study did not collect data about participants’ parental status because we see it as irrelevant to the legal theory debate about the ordinary meaning of the parent’s instruction to the babysitter (more on that below). However, even if we had that data in the first study, it is extremely unlikely that filtering by parental status would result in a different bottom-line result, given that the overall results lean so strongly in one direction.¹⁸⁵ Only 8% of participants shared the intuition that the babysitter violated the parent’s instruction, and of the other 92% of participants, it is unlikely that the vast majority (say 90%) did not have children. In the next Section, we present a new study that collects this additional demographic data. The results do not vary depending on parental or babysitter-hiring status.

Our more fundamental responses to this objection are theoretical and appeal to longstanding principles of interpretation. Consider the question of audience: Should the babysitter hypothetical be limited to parents? That is, should textualist interpretation’s “ordinary reader” be limited to only a small subset of ordinary readers?

First, the objection assumes the wrong interpretive perspective. In Justice Barrett’s hypothetical, the parent is the lawgiver and the babysitter is the audience. But the correct textualist focus, according to Justice Barrett, is on interpretation from the “outside[],” not from the “inside[].”¹⁸⁶ Textualists typically view interpretation from the perspective of a “hypothetical reasonable person,” not from the perspective of the lawgiver. Fidelity to the text of the statute, as understood by an ordinary reader, is the best way to remain a faithful agent of Congress (or, as Justice Barrett would have it, as a faithful agent of the people). Thus, even if one specific focus in the babysitter hypothetical were deemed more appropriate, for modern textualists that focus would more likely be that of a babysitter (the instruction’s reader), not a parent (the instruction’s author).¹⁸⁷

Second, textualists do not subdivide the general class of ordinary people that determines ordinary meaning. Instead, the “Supreme Court tends to employ a one-size-fits-all approach to interpretation.”¹⁸⁸ We recognize the importance of interpretive communities and the observation that statutes have audiences.¹⁸⁹ However, the concept of audience is most often used in textualist theory and otherwise to support the use of technical meanings (instead of ordinary meanings) for certain specialized statutes. Otherwise, the same statute might mean different things to the different groups subject to it, a position that Justice Scalia (writing for the Court) condemned.¹⁹⁰ The proposal to find ordinary meaning only in the views of the people most directly implicated by the law is thus a radical departure from modern textualism.

This suggestion (the legal interpretive equivalent of “ask only parents”) also strikes us as unworkable. If an interpreter aimed to limit “ordinary meaning” to the meaning a statute has to the people most directly impacted by it, how do we identify the people in this community? Even in Justice Barrett’s more straightforward babysitter hypothetical (and again setting aside that the babysitter is the audience, not the parent), we could ask: Are the relevant readers all parents, parents who go away for weekends, parents who can also afford babysitters, parents who would be willing to hand a credit card to a babysitter, or parents who would be willing to hand a credit card to a babysitter with limited instructions?

Even if the relevant subcommunity could be identified, it is not clear a judge would be well positioned to identify this narrow subcommunity’s understanding. If the textualist interpretive inquiry shifted from one about ordinary meaning to one about “ordinary meaning for only the audience most directly impacted by this statute,” might judicial intuition be especially unreliable if judges were not part of this latter subcommunity?

This suggestion becomes more bizarre as we shift from the babysitter hypothetical to real legal examples. In Biden v. Nebraska, who is the relevant interpretive community of people: the Department of Education, people with student loans, or some other group? If we take this objection and analogy seriously, it seems we should ask who is the “parent” in Biden v. Nebraska? Presumably, it is Congress. Do Blackman and Wurman suggest that Congress’s views are most relevant in interpretation? If so, this objection offered by Justice Barrett’s defenders, emphasizing a narrow subgroup of people who give or implement this instruction, is inconsistent with Justice Barrett’s broader approach to interpretation, which emphasizes judges as faithful agents of the “people,” not Congress.¹⁹¹

In sum, the objection to “ask parents” about the babysitter hypothetical is not persuasive. Theoretically, the legal-interpretive analogue to “ask parents” is unmotivated, unworkable, and inconsistent with modern textualism. Nevertheless, we address this objection in the next part of this Section, in a replication study that asks for the participants’ parental status. Empirically, the results are no different for participants who are parents or who have hired a babysitter.

3. An Additional Empirical Study

We are not persuaded by the theory underlying these two objections, but we are grateful to those who have raised them. And, setting the theoretical issues aside, it is possible to test these objections empirically. To do so, we conducted one final study.

Five hundred participants were recruited from Prolific to complete Study 1 and Study 2, with a few minor modifications aimed at addressing the objections described previously. A total of 445 participants passed the same attention checks described in Study 1 and Study 2 above and were included in the analysis (mean age = 37.9; 47% men, 51% women, 2% non-binary). The final demographics section also asked whether the participants had children (38% yes, 59% no, 2% prefer not to respond), had hired a babysitter (21% yes, 77% no, 2% prefer not to respond), and had worked as a babysitter (48% yes, 51% no, 1% prefer not to respond).

i. Testing Clarity to Bob (the Agent Sensitive to Stakes)

Participants first read the Study 1 materials concerning Bob and the bank. They were again randomly assigned to high or low stakes and one of the four rule types (Obviously Clear, Ambiguous 1, Ambiguous 2, Obviously Unclear). Participants answered the same questions about knowledge and strict knowledge, as well as a new question that Wurman recommends about clarity to Bob:

[Clarity to Bob] Now imagine that Bob’s wife uses her phone to search for the bank’s policy. She finds a website for the local bank branch. The website’s text states [rule text varying by scenario].

Consider Bob’s perspective on this scenario.

Is this rule’s meaning clear or unclear to Bob concerning whether the bank is open?

Clear: The bank is open on Saturday.

Clear: The bank is not open on Saturday.

Unclear

A multinomial logistic regression examined the effect of Stakes (low, high) and Rule Type (Clear, Ambiguous 1, Ambiguous 2, Unclear) on judgment of the bank rule’s clarity to Bob (clearly open, clearly closed, unclear). First, consider the effect of Stakes. Comparing clearly open and clearly closed responses, there was no effect of Stakes, z = 0.95, p = 0.341. Comparing clearly closed and unclear responses, there was no effect of Stakes, z = 1.26, p = 0.209. There was no significant effect of Rule Type and no significant Stakes * Rule Type interactions.

The results for these questions about whether the rule is clear to Bob also show no effect of stakes. For the Obviously Clear rule, stakes did not affect judgments of clarity to Bob (2% of participants selected unclear in high stakes; 2% in low stakes); for the Ambiguous 1 rule, stakes did not affect judgments of clarity to Bob (29% of participants selected unclear in high stakes; 38% in low stakes); for the Ambiguous 2 rule, stakes did not affect judgments of clarity to Bob (4% of participants selected unclear in high stakes; 12% in low stakes); and for the Obviously Unclear rule, stakes did not affect judgments of clarity to Bob (61% of participants selected unclear in high stakes; 59% in low stakes).

The results for knowledge and strict knowledge were similar to the results found in Study 1. A binomial logistic regression revealed an effect of Stakes on knowledge. Participants attributed knowledge less in high-stakes cases (prob. = 0.85, 95% CI = [0.80, 0.89]) than in low-stakes cases (prob = 0.94, 95% CI = [0.89, 0.96]), odds ratio = 0.38, 95% CI = [0.19, 0.73], z = -2.89, p = 0.004. A binomial logistic regression revealed an effect of Stakes on strict knowledge. Participants attributed strict knowledge less in high-stakes cases (prob. = 0.64, 95% CI = [0.58, 0.70]) than in low-stakes cases (prob = 0.81, 95% CI = [0.75, 0.86]), odds ratio = 0.42, 95% CI = [0.27, 0.65], z = -3.91, p < 0.001.

In sum, one objection to our original Study 1 is that it fails to ask the right question about clarity: it should ask whether the text is clear to Bob, not clear in general. This follow-up study tested that question about clarity to Bob, finding identical results: participants’ judgments about clarity to Bob were not sensitive to high stakes.

ii. Parents Only

The second objection is that we should consider only the views of parents. Consider the results of the same study, replicated, broken out by whether participants are parents and by whether they have hired a babysitter.¹⁹²

Table 2.
Case	Violation: All Participants	Violation: Parents Only	Violation: Hired Babysitter Only	Mean Reasonableness, 1 (unreasonable) to 7 (reasonable): All Participants
Reasonable	0%	0%	0%	6.87
Minor	30%	26%	33%	6.23
Major	8%	7%	10%	4.41
Misuse	81%	79%	84%	3.21
Extreme	21%	18%	25%	3.00
Note: Table 2 represents the proportion of participants judging that the action violated the rule and the estimated marginal mean ratings of the action’s reasonableness.

Comparing across all participants, parent participants, and those who have hired babysitters, the results are essentially identical. Participants generally disagreed that the babysitter violated the rule/instruction by taking the children to an amusement park overnight, and this did not depend on whether those participants were themselves parents or had hired a babysitter.

4. Additional Objections

We have responded to the two major objections leveled against the studies since we made a draft of this Article public. However, there are two other objections that strike us as worth pursuing, but which we do not have the space to fully explore here.

The first is that in our babysitter experiment, we should have asked a different question. As a reminder, we asked “which better describes this situation?”—that the babysitter “followed” the instruction/rule or “disobeyed” the instruction/rule? This strikes us as a straightforward way to capture textualists’ concern: What does the rule mean to the ordinary reader? Wurman suggests that we should have asked other questions, like whether participants agree that the instruction “include[s] authorization” to undertake this action, or whether participants think “ordinary, reasonable interpreters of this parent’s instruction would have interpreted it to include this scenario.”¹⁹³ Wurman does not motivate these suggestions with much theory, and it is not obvious why these phrasings would identify participants’ understanding of the meaning of the rule. For example, recall that there are many theories of interpretation: textualism, purposivism, and consequentialism. It is not obvious that most people think the “reasonable interpreter” is a textualist. Perhaps people think that the “ordinary reasonable interpreter” is a pragmatist. If so, asking people about their views of “the reasonable interpreter” would reliably generate non-textualist judgments.

Nevertheless, in our third study, we also asked these two additional questions: (1) “In your personal opinion, which better describes this situation?”—(a) The parent’s instruction/rule “authorized” the babysitter “to undertake this action”; or (b) The parent’s instruction/rule “did not authorize” the babysitter “to undertake this action”; and (2) “In your personal opinion, which better describes this situation?”—(a) “An ordinary person interpreting” the parent’s instruction/rule “would understand it to allow what” the babysitter did; or (b) “An ordinary person interpreting” the parent’s instruction/rule “would not understand it to allow what” the babysitter did. The results did not differ in the dramatic way that Wurman predicts. For the first authorization question, 85% of participants agreed that the “major” action was authorized. For the second “reasonable interpreter” question, a majority (57%) agreed that this reasonable interpreter would give the textualist response to the major action case: the reader would understand the instruction/rule to allow what the babysitter did.

A final objection states that the parent-babysitter analogy is a poor analogy for the Congress-agency relationship. This objection is sometimes offered as a critique of the MQD, not a defense, and as a reason why we should not indulge a faulty analogy. Less frequently, it is raised as a defense of the MQD—suggesting that the context of a real-world delegation would surely include consideration of constitutional structure.¹⁹⁴ We are exploring the idea that the babysitter hypothetical misses important relevant context to the evaluation of delegations in a future piece, but for now we set it aside. This Article takes the babysitter hypothetical from Biden v. Nebraska on its own terms. For readers who are skeptical that such an analogy provides insight into the meaning of statutory language, our study offers a second line of critique: the analogy does not offer insight into statutory meaning, and even if it did, its assumption about ordinary readers is faulty.

IV. IMPLICATIONS

The recent pivot to a linguistic defense of the MQD is a watershed moment for two fields of law that often intersect: statutory interpretation and administrative law. Through the narrowest lens, the reframing of the MQD as “linguistic” attempts to insulate the nascent MQD from scrutiny as hypocritical anti-textualism, allowing conservative judges to use the doctrine to curb the power of the administrative state without turning in their textualist cards.¹⁹⁵ But the move also resonates much more deeply. If accepted, the connection being drawn between ordinary people and the MQD would move textualism further towards an “outsider” orientation, with implications well beyond the narrow purview of the MQD.¹⁹⁶ Likewise, if accepted, the linguistic defense of the MQD would tend to reinforce trends toward an explicitly “libertarian administrative law,”¹⁹⁷ backing it with the force of supposedly ordinary people’s commonsense understanding of how government should work.

The theoretical critiques and original empirical evidence presented thus far in this Article support skepticism about the arguments to adopt the MQD as linguistic. In this Part, we explain why, and we also reflect on what our evidence says more generally about the fields of statutory interpretation and administrative law.

We start in Section IV.A by discussing how our investigation and findings challenge the conclusion that the MQD is a valid linguistic canon. In light of existing empirical work, our new empirical studies, and our new theoretical analysis and objections, we conclude that the two “linguistic defenses” of the MQD do not have adequate empirical support or theoretical clarity to succeed. Of course, defenders of the MQD might propose new arguments or different evidence, but for now, it is difficult to see on what basis one could employ the MQD as a valid linguistic canon.

Section IV.B explains that Justice Barrett and Wurman’s attempts to establish the MQD as a linguistic canon raise serious challenges to textualism. Justice Barrett’s arguments about “common sense” and “context” are so general that they threaten to undermine textualism’s commitment to enforcing the rule of law by privileging semantic content, even when unexpected applications are at issue. In turn, Wurman’s defense of the MQD necessarily involves a broad conception of “ambiguity.” This broad framing of ambiguity has been criticized by Justices Scalia and Kavanaugh and, like Justice Barrett’s arguments, would result in courts using “ambiguity” as a pretext to avoid the semantic meanings of statutes.

Finally, Section IV.C addresses broader implications for administrative law and regulation. We have reservations about any strategy to ground judicial interpretation in “ordinary people’s” understanding of ordinary examples, especially for a topic as technical as administrative law. Nevertheless, for the sake of argument, we consider where such an “ordinary” approach should take textualist interpreters. Empirical evidence about ordinary understanding of law and language suggests a dramatically different approach than what Justice Barrett suggests for the MQD. Ordinary people understand broad delegations to include a wide range of reasonable actions consistent with the delegation. Moreover, our findings reveal something we did not expect: ordinary people are fairly skeptical that underimplementation of delegated authority is consistent with facially broad delegations. These facts do not support the MQD, but they might support other linguistic canons—many of which have more in common with Chevron than the MQD—and they may counsel some rethinking of administrative law’s indifference to agency inaction.

A. The Major Questions Doctrine Is Not a Valid Linguistic Canon

The most immediate question motivating our studies is whether there is a valid basis for considering the MQD as a linguistic canon of statutory interpretation. As discussed above, canons are traditionally distinguished according to whether they are justified by normative or legal principles (in which case they are substantive) or whether they help determine the linguistic meaning of statutory language (in which case they are linguistic).¹⁹⁸ Although a canon can be both substantive and linguistic,¹⁹⁹ the MQD’s defenders have emphasized the MQD’s supposed linguistic properties because of growing concerns among textualists about both substantive canons generally and the MQD in particular. The existing empirical evidence reviewed in Part II and original empirical studies in Part III suggest this is a false start: the linguistic properties identified by the MQD’s defenders do not find support in the intuitions (or “common sense”) of ordinary people. Consequently, at least in the absence of further empirical studies, the MQD cannot, and should not, be defended as a valid linguistic canon capturing how ordinary readers understand delegating statutes.

1. The Evidence Does Not Support a “High-Stakes” Linguistic Major Questions Doctrine

i. High Stakes and Knowledge

Start with the theory that the MQD is justified on the grounds that, for ordinary people, the stakes of an interpretive dispute impact the text’s clarity.²⁰⁰ This argument begins by appealing to analytic philosophy and legal theory that posits a relationship between stakes and knowledge claims.²⁰¹ The central example is the bank case: when little depends on the bank being open on Saturday, we know that it is open; but, when the stakes of the Saturday deposit are higher, we do not know that it is open.

However, a large empirical literature reports this claim to be false,²⁰² and the entire philosophical literature to be “founded on a myth” about people’s reactions to these cases.²⁰³ Many studies find that high stakes have no effect at all on knowledge. Moreover, most of these studies use the exact case (the bank case) to which defenders of the linguistic MQD appeal.

The comparatively fewer studies that find an effect on knowledge report a small effect. In those studies, high stakes reduce knowledge for around 10% of participants, but not for the vast majority.²⁰⁴ This Article’s new large empirical study (N = 500) finds a similarly small effect on knowledge, only a nine-percentage-point difference between the low- and high-stakes cases.²⁰⁵

Textualists are not always clear about how to construct their “ordinary reader,” but it is difficult to see how even this small difference (95% of people in low stakes agree there is knowledge, and 86% of people in high stakes agree there is knowledge) is sufficient to conclude that “the ordinary reader” has less knowledge in high-stakes contexts. For the vast majority of ordinary participants, high stakes have no impact on knowledge; the foundational premise in the “high-stakes” MQD seems to reflect an unordinary epistemology.
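The size of this gap can be expressed in two ways, and the choice matters for how the finding is described. The short sketch below uses only the two percentages given in the text; the variable names are ours and purely illustrative.

```python
# Share of participants attributing knowledge, as reported in the text.
low_stakes_pct = 95
high_stakes_pct = 86

# Absolute gap, in percentage points.
point_gap = low_stakes_pct - high_stakes_pct

# Relative drop, as a percentage of the low-stakes baseline.
relative_drop = point_gap / low_stakes_pct * 100

print(f"absolute gap: {point_gap} percentage points")
print(f"relative drop: {relative_drop:.1f}% of the low-stakes baseline")
```

The absolute gap is 9 percentage points, and the relative drop is about 9.5% of the low-stakes baseline. On either measure, the large majority of participants attribute knowledge in both conditions.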

ii. High Stakes and Clarity

The “high-stakes” argument for the linguistic MQD uses this (false) premise about knowledge as a theoretical foundation to support a technically distinct, and to date untested, claim that ordinary people follow the same epistemological pattern when making judgments about the clarity of statutory language. Assuming that people do this, the argument concludes that a high-stakes situation can render otherwise clear statutory language unclear.

The recent “high-stakes” legal interpretation literature seems to assume that statutory interpretation essentially involves a kind of knowledge claim, such that high stakes’ impact on knowledge necessarily carries over into the interpretive context.²⁰⁶ Our data (the only data we are aware of on this point) is not consistent with this transitive logic.²⁰⁷ We found that high stakes have a small effect on knowledge, but no effect at all on textual clarity. This finding supports the conclusion that ordinary judgments of knowledge do not rise and fall consistently with ordinary judgments of textual clarity.

More importantly, we find that high stakes have no effect on clarity for texts of varied levels of baseline ambiguity. High stakes did not reduce ordinary people’s sense of clarity for a fairly clear text or even for texts that were initially more ambiguous.²⁰⁸ This finding challenges the more critical premise in the “high-stakes” MQD defense (concerning clarity, not knowledge).

Together, these two problems count against the “high-stakes” linguistic defense of the MQD. High stakes have (at best) a small impact on knowledge and no impact on clarity. We have also noted various other theoretical issues with the “high-stakes” linguistic argument. For example, even if high stakes had the hypothesized effects, it is not clear why reduced knowledge or textual clarity puts more weight on judges’ readings of the statutes or implies anti-agency interpretation rather than putting more weight on agency interpretations of the statutes.²⁰⁹

2. The Evidence Does Not Support an “Anti-Literalist” Linguistic Major Questions Doctrine

i. The Data Do Not Support the Stronger Claim Necessary to the “Anti-Literal” Linguistic Major Questions Doctrine

The previously discussed considerations about anti-literalism²¹⁰ are insufficient to support a strong conclusion about the MQD. Just because people sometimes interpret non-literally and display context sensitivity does not imply that courts should interpret general delegating language to authorize only a small subset of agency actions that fall under the text’s meaning. One could easily agree that (1) delegations should not always be interpreted literally, while also holding that (2) anti-literalism does not lead to the MQD.

In Section III.B, we reconstructed Justice Barrett’s argument in sufficient detail to deliver the MQD conclusion. We understood her key empirical claim to be the following: absent additional context, ordinary people understand rules that grant authority to an agent to have significant contextual limitations against all “major” actions; such a rule’s communicative content is limited to authorizing only the set of most reasonable actions. Here, an action is “major” if readers understand it, absent additional context, as not among the set of most reasonable ways to follow the rule. While this is a much stronger premise than mere anti-literalism, a premise this strong is necessary to conclude that in MQD cases, absent additional context, judges should interpret delegations to exclude all major actions.

Our empirical study tested this claim about ordinary understanding of grants of authority.²¹¹ Here, we again sought to minimize researcher degrees of freedom and chose cases that have been offered by advocates of the linguistic MQD. In Study 2, we examined Justice Barrett’s “babysitter case.” We found that most ordinary people do not take the babysitter’s actions, that is, taking children on a multi-day trip to an amusement park, to be unauthorized by the parent’s instruction to use the parent’s credit card to ensure that the kids have fun over the weekend. To the contrary, 92% of respondents took the babysitter’s actions to be consistent with the rule/instruction. When we looked at a more extreme hypothetical—bringing a zookeeper to the house to entertain the kids with a live alligator—respondents judged the babysitter’s actions less reasonable but virtually just as authorized by the parent’s instruction to “make sure the kids have fun.”

However, our respondents did not simply think anything followed the rule. Fully 85% of them thought that the babysitter’s decision to use the credit card for something other than the children’s entertainment violated the instruction, and 49% believed that it was a violation of the instruction to entertain the children too little.

Importantly, these different actions varied in their perceived reasonableness. Participants agreed that it is more reasonable to respond to the parent’s instruction by buying the kids pizza, and less reasonable to take the kids to an amusement park or hire an animal entertainer. Nevertheless, participants judged that these latter actions—while not part of the most reasonable set of responses—are fully consistent with the rule.

Ultimately, these findings suggest that even if Justice Barrett is right that context matters for interpreting grants of authority to administrative agencies, that fact alone does not justify the strong MQD. To point to “common sense” and “context” may be entirely reasonable for a judge—we will have more to say about this in the next Section—but referring to them does not rule out “major” or less reasonable agency actions, at least in the minds of ordinary readers.

3. Limits of the Evidence, and the Bottom Line

Our two studies test the central examples that have been offered by proponents of the MQD as a linguistic canon. Both of those arguments appeal centrally to claims about how ordinary readers understand language; neither of those claims is supported by the studies conducted here. Of course, this Article’s focus is on the linguistic arguments, not the many other defenses of the MQD.²¹² And concerning the linguistic case, we are open to future arguments and empirical studies: some future revision of a linguistic defense of the MQD could possibly succeed. In this Section, we briefly highlight some of the limits of our studies and the doors they leave open for proponents of the MQD. We also summarize our “bottom line” about the MQD.

i. Substantive Arguments for the Major Questions Doctrine

First, and perhaps most obviously, our studies do not foreclose a substantive basis for the MQD. That is, rather than grounding the doctrine in how text is understood, proponents of the MQD might point to constitutional or normative values that should lead judges to depart from the best reading of statutory language when agencies take major actions. The fact that none of the other Supreme Court justices joined Justice Barrett’s concurrence might suggest that at least five justices are comfortable with the idea that the MQD is solely substantive rather than partly or entirely linguistic.

So far, the Court has not clearly articulated the substantive basis of this canon: for Justice Gorsuch, the source of normative substance appears to be the nondelegation doctrine; for Chief Justice Roberts, the source is general separation of powers principles. But this lack of clarity about where the justices are drawing the MQD’s substantive content from does not mean that the MQD might not eventually come, through an incremental process, to coalesce around some common narrative that would suffice to justify the MQD as a substantive canon alongside the many other substantive canons that our legal system recognizes. Given the growing textualist skepticism of substantive canons, as well as the contestable premises of the nondelegation doctrine and separation of powers, we doubt that such a defense would be uncontroversial,²¹³ but this is a topic that falls outside the scope of this Article.

ii. Linguistic but Non-Ordinary Arguments for the Major Questions Doctrine

Second, our studies focus on linguistic defenses that tie themselves explicitly to appeals to the construct of the “ordinary reader.” While we think this focus is defensible, given the larger textualist commitment to the ordinary reader as the anchor for interpretation,²¹⁴ it is also possible to defend a linguistic MQD on the grounds that it represents some kind of generalization about how Congress likely intends delegating statutes to be interpreted. The move here is to ground the MQD in what Beau Baumann calls the “descriptive case”: that is, an empirical assertion about the ordinary context of delegating statutes and the way Congress operates when it passes delegating statutes.²¹⁵

Indeed, the Court in West Virginia v. EPA said as much when it cited a “practical understanding of legislative intent” as a basis for the MQD;²¹⁶ both Wurman and Justice Barrett nod to this possibility as well.²¹⁷ On Wurman’s account, it makes sense as a linguistic matter to bake this contextual evidence of how Congress treats important questions into our reading of delegating statutes—that is, to interpret ambiguous statutes as not intended to delegate important matters. Justice Barrett’s concurrence in Biden v. Nebraska makes a similar move. After noting that all interpreters seek to “situate[] text in context,” Justice Barrett posits that “[b]ackground legal conventions . . . are part of the statute’s context.”²¹⁸ In a principal-agent relationship, “ ‘the context in which the principal and agent interact,’ including their ‘prior dealings,’ industry ‘customs and usages,’ and the ‘nature of the principal’s business or the principal’s personal situation’ ” help form the background legal conventions that govern delegation.²¹⁹ From there, Justice Barrett argues that we know from the context of how Congress usually delegates to agencies that Congress is “more likely to have focused upon, and answered, major questions, while leaving interstitial matters [for agencies] to answer themselves in the course of a statute’s daily administration.”²²⁰

These kinds of arguments based on the “descriptive case” run into persistent empirical problems—namely, there is ample evidence that Congress often does intend to delegate major questions to agencies through vague language, and only weak and contested evidence that Congress does not so intend.²²¹ These kinds of arguments are also in significant tension with textualism, which generally eschews evidence of legislative intent except insofar as it is “objectified” in statutory language. However, given the evidence presented in this Article, these arguments may still be more promising for proponents of the MQD than a linguistic defense premised on ordinary meaning.

On the whole, then, it does not seem like the doors that are left open by our study are ones that would be attractive to the textualist justices who have given us the MQD. But we cannot deny another possibility: that textualism itself may evolve (or dissolve?) in ways that accommodate the MQD on these other grounds. We turn to that topic in the next Section, but before doing that, we would reiterate that the ordinary-meaning defense of the MQD is, by all appearances, a total dead end. Textualists would be hard-pressed to continue to defend the MQD on this theory of the case and this record of decision.

iii. The Bottom Line

This Section has briefly noted some limitations of the Article. We make no claims about other (non-linguistic) defenses of the MQD. And we are, of course, open to the possibility that some future argument or evidence could rehabilitate the linguistic defense of the MQD.

However, it is important to emphasize that we endorse a firm conclusion about the current state of affairs for the linguistic MQD and textualists’ use of the canon. First, the two extant linguistic defenses of the MQD depend on empirical claims about specific hypotheticals (for example, the bank case) that are not supported by empirical studies of ordinary Americans. Proponents of the linguistic MQD may offer new, more workable arguments, with different thought experiments, or different empirical support. But until then, there is no basis to employ the MQD as a linguistic canon, and there is now significant evidence counting against core claims of the two publicly stated linguistic arguments.

Second, even for textualist judges with no interest in the linguistic defense, the empirical data about ordinary readers counts against the MQD’s consistency with ordinary language. Given ordinary readers’ understanding of language, there is more evidence in favor of treating the MQD as an anti-linguistic canon than a linguistic canon.²²² And as Justice Kagan remarked, judges who appeal to such non-principles over linguistic interpretation are not really textualists.²²³

B. Broader Implications for Modern Textualism

Justice Barrett and Wurman’s arguments have implications for textualism beyond the narrow (but hugely important) issue of whether the MQD is a linguistic canon. Textualism’s claim to distinctiveness centers on a commitment to interpretation according to a text’s linguistic meaning, thereby promoting rule of law values.²²⁴ Textualism thus abjures judicial discretion to depart from that linguistic meaning.²²⁵ As Justice Scalia emphasized, judges should not exercise an unbounded “personal discretion to do justice.”²²⁶ Instead, judges should be restrained even when some results may have been unanticipated by the legislature.²²⁷

Justice Barrett’s expansive view of “context,” “common sense,” and non-literal interpretation expands, but also challenges, these foundations of textualism. Justice Barrett admirably argues for a sophisticated version of textualism that rejects literalism and recognizes implied terms.²²⁸ Even so, existing interpretive canons that recognize implied terms are narrow, and thus do not undermine textualism’s commitment to linguistic meaning.²²⁹ In contrast, Justice Barrett’s “common sense” interpretive canon is unbounded, granting judges considerable discretion to claim that a wide range of actions fall outside of the text’s meaning (or “reasonable meaning”).

Wurman’s arguments also have implications that threaten to expand, if not unravel, textualism. Recall that Wurman, unlike Justice Barrett, frames the MQD as a tiebreaker canon that resolves statutory ambiguity.²³⁰ Wurman is correct that the Court has referenced “ambiguity” in MQD cases. This framing of the MQD, however, requires a broad view of ambiguity that would make its determination even more discretionary, and likely more pretextual.

1. Justice Barrett’s Theory of Non-Literal Interpretation

Justice Barrett’s general appeals to context and non-literal interpretation are consistent with modern textualist scholarship and thinking. Justice Kavanaugh has also repeatedly emphasized the distinction between literal and ordinary meaning and has insisted that courts should avoid overly literalist meanings.²³¹ Similarly, John Manning argues that “the literal or dictionary definitions of words will often fail to account for settled nuances or background conventions that qualify the literal meaning of language and, in particular, of legal language.”²³²

Textualism, though, purports to privilege semantic meaning, thereby giving a relatively limited role to non-literal meanings informed by context and pragmatics. Thus, while Manning endorses some non-literal interpretation, his “background conventions” are narrow ones tied to the “relevant linguistic community” subject to the law, such as common law criminal defenses.²³³ Besides these limited examples, according to Manning, judges “have a duty to enforce clearly worded statutes as written, even if there is reason to believe that the text may not perfectly capture the background aims or purposes that inspired their enactment.”²³⁴ Doing so ensures “Congress’s ability to use semantic meaning to express and record its agreed-upon outcomes.”²³⁵

A coherent textualism would thus recognize a narrow role for implied terms. Crucially, an implied term must be one that would be obvious to the discourse participants, rather than one imposed by the interpreter for other reasons. An implied term must therefore reflect a presupposition about meaning that is warranted in the circumstances.²³⁶

Statutes are often drafted at a high level of generality, and Justice Barrett is correct that readers of those rules understand that sometimes the rules expressed are not meant to be taken literally in all respects. Crucially though, the relevant existing interpretive canons are implicated in narrow circumstances and provide relatively specific rules for limiting literal meaning.²³⁷ Furthermore, empirical evidence supports these narrow rules as linguistic and thus consistent with how ordinary people interpret legal texts.²³⁸

Justice Barrett’s view of implied terms as governed by “common sense” and “context” is similar to Richard Fallon’s approach. Fallon argues that “[o]rdinary principles of conversational interpretation call for us to ascribe a reasonable meaning to prescriptions and other utterances unless something about the context indicates otherwise.”²³⁹ Fallon reasons that “[i]n ordinary conversation, we do not waste time and breath offering elaborations and qualifications of our utterances that ought to be obvious to any reasonable person.”²⁴⁰ Instead, a “reasonable person” understands that “[t]he moral reasonableness of a particular ascribed meaning possesses a distinctive importance.”²⁴¹ Both Fallon and Justice Barrett draw on principles of conversational communication and context, and while Fallon references “reasonable meaning” and Justice Barrett “common sense,” the two are essentially the same idea. In fact, Justice Barrett uses the word “reasonable” in relation to interpretation eleven times in her Biden v. Nebraska opinion (e.g., “reasonable understanding,” “reasonable view,” “reasonable interpreter”).²⁴² Furthermore, her appeal to “common sense” and “reasonable” interpretations has, like Fallon’s view, room for moral and normative beliefs to motivate non-literal interpretations.

The similarities between the interpretive approaches of Justice Barrett and Fallon should be surprising and troubling to textualists. Fallon’s interpretive principle is in furtherance of his decidedly anti-textualist view of interpretation.²⁴³ Justice Barrett’s principle of “common sense,” guided by “context,” is supposedly in furtherance of textualism, but it raises questions that do not have easy textualist answers. Can the principle always defeat the literal meaning of a statute? How can “common sense” even be defined? Even if “common sense” could be defined, do judges share the same “common sense” as ordinary people, or do judges speak with what Eskridge and Nourse refer to as an “upper-class accent”?²⁴⁴ The ability of judges to appeal, with little restraint, to “common sense” and “context” calls to mind Justice Scalia’s fears about non-textualist judging: “personal discretion to do justice” as the judges see fit.²⁴⁵

2. The Anti-Textualist Broad View of Ambiguity

An additional threat to textualism is posed by a broad view of “ambiguity.” Recall that Wurman argues that the MQD is a linguistic canon that resolves statutory ambiguity.²⁴⁶ In support of this claim, Wurman quotes from MQD decisions where the Court argues that the relevant statutes are “ambiguous.”²⁴⁷ This defense of the MQD is unsurprising. Textualism is much more permissive about available arguments and interpretive sources when a provision has been deemed “ambiguous.”

There are two key drawbacks in viewing the MQD as serving a tiebreaking role in resolving ambiguity. First, doing so understates the MQD’s role in the Court’s precedents. The MQD has not merely resolved “ties” between meanings; it has caused the Court to choose meanings it would not otherwise have selected. Second, Wurman’s view requires a definition of ambiguity that should be especially troubling to textualists, and the significance of the issue extends beyond the MQD.

Wurman’s argument raises an essential question: On what basis can a provision be deemed “ambiguous”? Wurman suggests that a provision can be “ambiguous” even when a court can determine the provision’s “best reading.”²⁴⁸ Thus, crucially, the question of ambiguity does not require that a provision be indeterminate. In other words, the semantic meaning of the provision’s terms could be clear (even if broad) but still “ambiguous,” based on non-textual considerations like the novelty and importance of an agency’s actions.

Use of the “ambiguity” label often obscures rather than clarifies linguistic issues. Specifically, it glosses over the distinctive linguistic features of the prototypical statute involved in MQD cases, which is a statute with broad but semantically clear terms. These features—broad but semantically clear—should represent for textualists a prima facie case against the MQD. After all, textualists assert that courts should focus on the semantic meaning of statutes.

Outside of MQD cases, some textualists have recognized the potential dangers associated with a judicial focus on “ambiguity.” Most significantly, Justice Kavanaugh has criticized “ambiguity” as an interpretive doctrine because its identification is standardless and subjective.²⁴⁹ Its discretionary identification and legitimizing power, however, make “ambiguity” an especially attractive interpretive tool for judges. “Ambiguity” is extremely useful because it gives a court cover to interpret a statute narrowly or broadly on the basis of normative concerns. For instance, an explicit announcement of ambiguity allowed the Court in King v. Burwell to “avoid the type of calamitous result that Congress plainly meant to avoid” and gave it justification for “interpret[ing] the Act in a way that” improves health insurance markets and does not destroy them.²⁵⁰

“Ambiguity’s” legitimizing power explains why the Court in MQD (and other) cases is motivated to label a provision as “ambiguous” without much consideration about whether it is applying a coherent definition of ambiguity. It may be activist to interpret a clear statute narrowly because doing so would be in tension with the provision’s linguistic meaning. In contrast, resolving statutory “ambiguity” is necessary to decide the interpretive dispute, and choosing the narrower interpretation does not conflict with the provision’s linguistic meaning. Thus, if a provision is problematically broad, labeling it as “ambiguous” does not require the Court to explicitly reject its literal meaning.

If a provision can be “ambiguous” even when a court can nevertheless determine its “best reading,” “ambiguity” would mean something like “any uncertainty about the meaning of a provision.” But this sort of definition would make ambiguity ubiquitous and is inconsistent with how it is used in Chevron and other tiebreaker canons like the rule of lenity.²⁵¹ If instead “ambiguity” means that a provision must actually be indeterminate, there is no “best reading” of a provision, but merely possible competing meanings.

The question of ambiguity thus hinges on whether “ambiguity” is synonymous with “indeterminacy.” Even if the terms are synonymous, framing the MQD in terms of “ambiguity” should be unappealing to textualists. The MQD would still be a matter of judgment that depends on how one weighs semantic and pragmatic evidence. In other words, a combination of meaning and context makes a provision clear or, conversely, ambiguous. Univocal semantics and univocal pragmatics may uncontroversially result in a clear provision, and multivocal semantics and multivocal pragmatics in an ambiguous provision, but other combinations are contestable and subject to normative resolution via highly discretionary judgments.

The choice is thus between a narrow definition of “ambiguity” that would require the semantic meaning of the statutory text be indeterminate in some way, and a broad definition that would allow even semantically clear language to be deemed “ambiguous” based on non-language concerns like statutory purpose. Justice Scalia argued that the broad view of ambiguity is “judge-empowering” and mocked the idea that “[w]hatever has improbably broad, deeply serious, and apparently unnecessary consequences . . . is ambiguous!”²⁵² A broad definition of ambiguity would allow the label to be used at any time by emphasizing any number of pragmatic considerations, such as the problematically broad semantic meaning of terms or the “novelty” of an agency’s interpretation. If instead, as Justice Scalia argues, pragmatic evidence can only clarify semantically indeterminate text, ambiguity would therefore require indeterminate semantic meaning and be a narrower, less discretionary doctrine.²⁵³

Textualists in MQD cases should be honest about their use of “ambiguity.” If they use the term broadly, they should explain why Justice Scalia’s critique of the broad definition is mistaken. If they instead agree with Justice Scalia, the MQD cases involving clear (but broad) semantic meaning should thus be viewed by textualists as similar to situations not involving ambiguity. In such cases, if the Court wishes to narrow the literal meaning of the language, it should state so explicitly, giving reasons for why such narrowing is consistent with the judicial function.

C. Broader Implications for Administrative Law

This Article has taken textualists’ defenses of the MQD at face value. But some harbor a more realist or critical take on the MQD. Perhaps the MQD is animated by neither constitutional values nor language, but rather by the aim of limiting the administrative state’s power. And perhaps leaving questions about the MQD’s legitimacy unresolved allows strategic ambiguity, which is better for this purpose.²⁵⁴ Some go even further to argue that the justices are engaged in a form of constitutional hardball, seeking to aggrandize themselves vis-à-vis the other branches of government.²⁵⁵ It is certainly difficult to overlook the hostility that many of the justices express toward modern administrative government and the legislative acts that authorized it.²⁵⁶

Yet, turning our attention away from the Supreme Court and toward the broader legal community, our findings about how ordinary people understand delegations of authority have significant implications for administrative law well beyond the MQD. While we acknowledge that there are good reasons to be skeptical about outsourcing questions of administrative law to laypeople, insofar as textualist principles animate the statutory interpretation questions at the heart of administrative law, it is worth asking where ordinary people’s intuitions lead.²⁵⁷ Below, we highlight a couple of takeaways from this exercise. An irony of textualists’ turn to “ordinary people” to support the MQD may be that it actually supports a significantly cabined judicial role in controlling delegation of authority to the administrative state. Far from endorsing a kind of “libertarian administrative law” that treats delegations of authority to administrative agencies with suspicion and seeks almost perfunctorily to narrow them,²⁵⁸ ordinary people appear to take general ordinary delegations to license a range of reasonable actions.

To be sure, we considered ordinary judgments of an ordinary, private delegation (that is, the babysitter), but critics of the administrative state have made that ordinary context relevant by insisting that general principles of private agency and/or ordinary delegations law should inform public law delegation.²⁵⁹ We are also skeptical that there is an easy way to study the “ordinary person’s” view of specific cases. As prior research has shown, interpreters’ values affect their interpretation.²⁶⁰ Asking ordinary people whether the EPA has authority to issue broad climate change regulations under the Clean Air Act is likely to tell us more about people’s values and politics than their understanding of language. Thus, the implications we spell out depend on the validity of this ordinary analogy—the one made by the linguistic MQD’s defenders (recall the “high stakes” appeal to the ordinary bank case and the “common sense” appeal to the ordinary babysitter case).

To start, our study of the babysitter hypothetical revealed a surprising result about what ordinary people would think of the amusement park hypothetical. Taking the children to the amusement park might not be the most reasonable response to the instruction to “use this credit card to make sure the kids have fun this weekend,” but it certainly does not violate it (after all, an amusement park is “fun”). The study also revealed that the vast majority of ordinary people believe that the parent’s instruction extends to the even more unusual action of bringing a live alligator to the house. This surprising finding suggests that people do not limit delegations to only the most reasonable actions or the ones most consistent with the rule’s purpose.

Ordinary readers approached the limits of broad delegations through a textual and purposive lens. Compared with the amusement park, alligator, and movie scenarios, respondents were far more likely to say that the babysitter violated the instruction when the babysitter failed to achieve the purpose of the instruction (as in the case of not using the credit card and potentially shortchanging the children’s fun) and when the babysitter actively undermined it (by using the credit card for the babysitter’s own enjoyment). This finding is difficult to understand unless ordinary readers understand delegations in large part as remedial—that is, as seeking to empower the agent to solve a problem or achieve some goal—rather than exclusively delimiting—that is, as setting out the scope of the agent’s power.²⁶¹

The modern textualist commitment to ordinary people’s understanding as a basis for interpretation²⁶² and linguistic canons²⁶³ opens the door to uncovering a linguistic basis for other canons, including new canons.²⁶⁴ As a hypothetical, imagine if a textualist were to carefully consider evidence about ordinary people’s understanding of delegating language (e.g., in the babysitter case) and attempt to “canonize” those intuitions into administrative law doctrine. The result would probably be a fundamental recalibration of the field—but not in the way the MQD imagines. Were one to follow the evidence, it seems to instead support canonizing a sort of “counter-MQD” that presumes that general delegations should be interpreted broadly (or at least not as restrictively as Justice Barrett’s argument claims), significantly curtailing judicial power to limit Congress’s attempts to empower administrative agencies.

In addition, and relatedly, our findings are in some tension with administrative law’s traditional approach to questions of underimplementation of statutory delegations. A variety of administrative law doctrines insulate agency discretion to decline to enforce the law: for instance, Heckler v. Chaney provides that agency nonenforcement decisions are almost never reviewable by courts,²⁶⁵ and Norton v. Southern Utah Wilderness Alliance makes it impossible for challengers to force agency action unless they can point to a discrete duty (rather than a more general failure to pursue broad policy goals of a statute).²⁶⁶ These doctrines insulate agency underuse of delegated regulatory authority from judicial scrutiny. Yet our findings suggest that ordinary readers may be more troubled by delegated authority’s underuse than by uses that fit with the language but exceed an observer’s sense of reasonableness.²⁶⁷ On the flip side, when agencies do take action pursuant to their delegations, judges often artificially narrow those delegations.²⁶⁸ Canons that might theoretically push in the opposite direction—toward liberally construing “remedial” statutes, for instance—have fallen into disrepute.²⁶⁹ This basic asymmetry in the treatment of delegations to agencies—deep skepticism of exercises of delegated authority coupled with indifference toward failures to exercise delegated authority at all²⁷⁰—may be exactly backwards if ordinary people’s intuitions are to be the guide.

Again, we do not endorse any particular changes to administrative law here. There are many good reasons, such as the institutional constraints under which agencies operate, to disfavor outsourcing administrative law to ordinary people’s linguistic or legal intuitions (whatever those may be).²⁷¹ There are also many countervailing concerns, such as fair notice and due process, that may justify curtailing expansive ordinary readings of delegating statutes.²⁷² But we also believe that for those inclined to remake administrative law through the eyes of the ordinary reader, it is worth grappling with facts rather than judicial hypotheticals about those ordinary readers. People are far more comfortable with broad interpretations of general-language delegations than many textualists have assumed, and they appear to be disproportionately uncomfortable with violations through underuse of delegated authority.

CONCLUSION

The MQD is the most influential interpretive development at the modern Supreme Court.²⁷³ Yet it lacks a compelling theoretical basis and a satisfactory explanation of its consistency with textualism, the interpretive theory held by the MQD’s advocates. The new “linguistic MQD” purports to solve both problems: because the MQD reflects ordinary understanding of language, it is a valid linguistic canon and thus consistent with textualism.

This Article has taken this linguistic defense on its own terms and studied the two central ordinary examples offered by its advocates. We find that ordinary people do not understand language as textualists have assumed. High stakes do not undermine knowledge or impact textual clarity, and people do not understand general delegations to be limited to only the most reasonable set of actions. These results challenge the essential empirical claims at the heart of the arguments for the linguistic MQD. While scholarly debate should continue, judges must take stock of the evidence and decide whether to employ the canon—and whether to do so in the name of linguistics and ordinary people. In our view, there is insufficient empirical support and theoretical clarity to cast the MQD as a valid linguistic canon. Arguably, the linguistic defense is the only viable theory for textualists to consistently employ the MQD. Thus, unless they offer a successful alternative, the results here support the broader conclusion that consistent textualists should not employ the MQD.

97 S. Cal. L. Rev. 1153

Kevin Tobia*

* Associate Professor of Law, Georgetown University Law Center.

Daniel E. Walters†

† Associate Professor of Law, Texas A&M University School of Law.

Brian Slocum‡

‡ Stearns Weaver Miller Professor, Florida State University College of Law. For helpful comments and/or discussion, we thank Cary Coglianese, Anuj Desai, Ryan Doerfler, Rebecca Kysar, Edouard Machery, Ángel Pinillos, Larry Solum, Ilya Somin, Ilan Wurman, and audiences at the Association of American Law Schools Annual Conference, and at Cornell, Georgetown, NYU, and USC law schools. Thanks to the Southern California Law Review for excellent editorial assistance. We also thank Kirsten Worden and Michael Cooper for their research assistance.