How to Lie with (Political) Statistics

On the morning of the 2016 presidential election, the New York Times published its final pre-election blog post on its data analysis vertical, The Upshot. Confidently forecasting an imminent victory for Hillary Clinton, the paper’s modelers projected she had an 85 percent chance of winning. Other forecasts, the article noted, were similarly favorable, ranging from 71 percent in Clinton’s favor (Nate Silver’s FiveThirtyEight) to over 99 percent (the Princeton Election Consortium). The most likely Electoral College outcome was 322 votes for Clinton—just shy of Obama’s performance in 2012.

In reality, of course, Clinton suffered a decisive defeat. That evening, the Times’s infamous election “needle” traversed nearly its entire semicircle in a few hours, reaching over 95 percent likelihood of a Trump victory by 11:15 p.m. on the East Coast. In the battleground states of Michigan, Wisconsin, and Pennsylvania—Clinton’s so-called “Blue Wall,” all of which wound up going for Trump—polling was found to be off by an average of 5 to 6 percentage points, all biased in favor of Clinton.


The errors were so large that the next time around, in the 2020 election, the Times began noting how predictions would differ if they were as off as they had been in 2016. But even this effort didn’t save face. In both the 2020 and 2024 election cycles, models were by many metrics just as wrong as they had been in 2016, or even worse. In 2020, Biden was projected to win the popular vote by 7 to 8 points but wound up with just a 4.4-point margin, marking the largest polling error in forty years; at the state level, the error was the worst since tracking began in 2000. As for last year, polling bias fell but again cut in the same direction: in the run-up to Election Day, Kamala Harris emerged as the slight favorite in both the popular vote and Electoral College. We all know how that turned out.

These much-mocked projections—along with the cycles of disillusionment and defenses they spawned—loom large over recently reignited debates about the place of polling and data in progressive politics. The dispute is one skirmish in a broader struggle over the future of the Democratic Party, coming at a moment when the stakes could hardly be higher for forging a left-liberal popular front against Trump’s authoritarianism.

Roughly speaking, one side touts polling and other electoral data as our best guides to political reality—and thus, it is claimed, to political strategy. These tools are far from perfect, the argument goes, but they are better than alternatives. To those in this camp, hostility to data smacks of impressionism at best (just “vibes”), rank ideology at worst; either way, it’s irrational—a dangerous denial of reality—and progressives must do politics by poll if they actually care about winning. Critics, in turn, insist that it’s the data-mongers who are the ideologues, ignorant of history, lacking in moral backbone, and devoid of political imagination. The influence and professional prestige they wield as “very smart,” well-credentialed, STEM-fluent wonks—steeped in a technocratic worldview, at odds with bottom-up movements—is decried as just a way of dressing up megadonor-approved policies in the guise of neutral expertise, or just a means for media companies to capture audience attention by treating politics like sports.

The shape of this debate today is the product of the rapid rise of the political polling industry in the late twentieth century. From candidate selection to message crafting, audience targeting, and news reporting, polling and data analytics have come to play an outsize role in professional politics and mainstream media, shaping not just how politics is conceived, practiced, and covered but also the contours of liberal common sense. It wasn’t always this way. FDR was the first president to draw on polling to help guide campaign strategy; historians have traced its rise to prominence among Democratic strategists to the 1960s, a trend coinciding with the “hollowing out” of political parties and the explosion of money in politics. The Upshot and Vox both launched in 2014—the same year FiveThirtyEight relaunched on ESPN—clinching the ascendance of a wonkish approach to progressive punditry during Obama’s second term. Today, the orientation is epitomized by data-driven consultancies like David Shor’s Blue Rose Research and election analysis groups like Split Ticket—the latter cofounded by Lakshya Jain, the director of political data at the new self-styled organ of liberalism’s salvation, The Argument.

Much ink has been spilled on these developments—in particular, on whether ubiquitous political data analysis has in fact led to greater political insight and more effective political campaigns. But the sparring can create more confusion than it dispels. How is one supposed to decide between the positivists and the skeptics? Are polling and quantitative models reliable indicators of things like what voters think and how they will behave? Were these methods, and the broader sensibility they reflect, invalidated by what happened in 2016, 2020, and 2024? Without clear, shared standards about when a prediction can be validated as “correct” and what level of accuracy makes for a successful method, the debate endlessly spins on. When skeptics cite embarrassing misses like Clinton v. Trump—or surprise upsets like Zohran Mamdani’s—the data gurus reply that predictions are always probabilistic and polls always have a margin of error, and besides, gut feelings perform even worse. When critics rail against polling’s professional-managerial function or its association with monied interests, quants dismiss these criticisms as bad-faith attacks on their work and reputations that fail to show how their evidence-based models are flawed. (Never mind that the models are usually proprietary, not open to public auditing.)

More insight can be gained by looking under the hood of a particular controversy. And as it happens, a new one has been brewing this year concerning Wins Above Replacement (WAR)—a metric analogous to the baseball statistic of the same name, used by Split Ticket to argue that Democrats should “moderate” to improve their electoral prospects. The WAR wars can seem like a purely technical dispute over methodology. In fact, they illustrate why arguments over “what the data say we should do” can’t ever be settled by the data alone. The chain of reasoning that runs from political data to political prescription depends on value-laden judgments about the game of politics itself: which past patterns will hold in the future and which can be changed, how open we are to experimentation, and how much we are willing to risk if we’re wrong.

These judgments, which define and delimit the scope of what politics can do, are inescapable: no strategist can avoid making them. The problem arises in concealing them, treating them as plain facts rather than open to reasonable dispute.


The basic idea behind Split Ticket’s WAR score is to compare how well a congressional candidate performs in an election with how well they are “expected” to perform based on “structural” factors—incumbency, for example, or past voting patterns in the district. A WAR score of zero is interpreted as meaning the candidate’s performance was perfectly generic: what any “replacement-level” candidate from the same party would have achieved. A higher WAR score is taken to indicate a more impressive performance relative to expectations—and a clue that emulating the candidate could improve someone’s electoral chances.

Consider an example. Split Ticket starts with the candidate’s actual performance, measured relative to the performance of the presidential candidate of the same party. Suppose a Democratic congressional candidate won her race last year by 6 percentage points in a district that went +2 for Harris. The discrepancy—a raw overperformance of +4—is suggestive. But can we conclude on this basis alone that the congressional candidate was a “better” candidate than Harris—say, because of her distinctive messaging style, policy views, or charisma? Not if other structural factors contributed to the result. If the candidate were an incumbent, for example, she might be expected to have a built-in advantage over the presidential candidate of, say, +3 points. Accounting for this “structural” factor, her WAR—the portion of her performance attributable to her individual characteristics, beyond what we’d expect the typical incumbent to get—would be +1 point.
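The arithmetic of this example is simple enough to sketch in a few lines. What follows is a deliberately toy rendering, not Split Ticket’s actual (proprietary) model: the function name, the single flat incumbency adjustment, and the numbers are all illustrative assumptions taken from the hypothetical above.

```python
def war_score(candidate_margin: float, presidential_margin: float,
              structural_adjustment: float) -> float:
    """Toy WAR: a candidate's overperformance beyond structural expectations.

    All inputs are margins in percentage points. The structural adjustment
    stands in for factors like incumbency; a real model would estimate
    several such factors rather than take one as given.
    """
    raw_overperformance = candidate_margin - presidential_margin
    return raw_overperformance - structural_adjustment

# The hypothetical from the text: the candidate wins by 6 points in a
# district Harris carried by 2, with an assumed +3 incumbency advantage.
print(war_score(6.0, 2.0, 3.0))  # prints 1.0
```

A score of zero on this toy measure would mean the candidate did exactly as well as a generic, “replacement-level” candidate of the same party was expected to do.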

For analysts who mine WAR or something like it for political insights, the score is not simply a descriptive measure: a way of capturing and ranking how well candidates perform. It’s considered valuable evidence for inferring what factors help win elections and thus for making political prescriptions. In other words, the aim is to draw conclusions about what kinds of candidates do better than others—and on that basis, to make judgments about political strategy writ large, including which candidates to support with party funding and which platforms and messaging styles to encourage (and discourage).

That’s just what happened in February, when Jain and Harrison Lavelle of Split Ticket published a Washington Post op-ed reporting their findings that ideological moderates—as measured by another numerical index—outperformed progressives in 2024, in the sense of having higher WAR scores. “The sooner partisans and pressure groups reject the seductive notion that America actually wants their specific version of ideological purity, rather than moderation that might compromise principles they value, the sooner they will win more elections and get more of the policy they want,” they concluded. No error bars, no uncertainty: just flat, bald-faced prescription.


Critics—other political data analysts—objected that Split Ticket’s structural adjustments were misguided; alternative approaches yielded a far more attenuated benefit to moderates or even no benefit at all over progressives. Yet an even more fundamental issue with using Split Ticket’s WAR scores to guide electoral strategy went unscrutinized.

The crux of the problem lies in the use of the presidential vote share as an effective baseline for candidate performance. What justifies this assumption? Analysts cite the fact that most voters are reliable party-line voters who pick down-ballot candidates based on the party of the presidential candidate they prefer. But party loyalty is not the only assumption at work here. To take Harris’s or Trump’s vote share to set an anchor on candidate performance is to assume a broader set of facts about elections and voting behavior: not just that voters typically commit to their preferred presidential candidate’s party up and down the ballot but also that the set of voters that shows up to vote is fixed by the presidential line and the presidential line only. Down-ballot candidates might be able to “swing” some subset of these voters away from the presidential baseline, but they cannot influence the participating electorate itself.

In treating the proportion of voters who show up to vote for Harris as an initial approximation of how well a generic Democratic candidate should be expected to do, Split Ticket effectively treats the district’s presidential vote share as a measure of the district’s partisan baseline or general voting tendencies—features of the district, insensitive to what candidates run down-ballot. If the presidential vote share were itself affected by down-ballot candidate races (for instance, if voters showed up to vote specifically for a down-ballot candidate, and then voted for that party’s presidential candidate because they were already at the polls), then it would no longer present a structural baseline.
Rather than isolating candidate quality from structural features independent of the candidate, WAR would instead measure candidate quality relative to a baseline that the candidate’s quality itself influenced. In short, the model treats the presidential vote in a district as an exogenous feature of the district, insensitive to who else (other than the presidential candidates) runs. With this presumption, the game of politics becomes the game of trying to appeal to as many of the people who show up to vote—but not because of you—as you can.

To be sure, WAR enthusiasts think these assumptions about voter behavior are simply true, or at least close enough to the truth to be worth betting huge stakes on. But the issue is not really one of truth or falsity, because the purpose of WAR scores is to provide a normative prescription about electoral strategy: a claim that, given past patterns, we ought to run this candidate over that one. As a result, the claim relies not simply on a view of how voters are likely to behave but on whether this is the right way of setting up the game of politics: as purely a contest to shift the votes of the people who happen to show up, rather than, say, as a play also for them to show up—running candidates who can bring to the polls voters for whom neither Harris nor Trump is a compelling draw.

The point, to be clear, is not that the model does not factor in turnout. It is that the model implicitly treats things like turnout as structural factors, not movable by candidates or by political strategy generally. And though turnout may in fact be difficult to shift, that difficulty alone cannot justify refusing even to consider its relevance to political strategy. The point is general: insofar as models of voting behavior seek to guide political action, they can never be justified by empirical facts alone.
To set up politics as a particular game or field of play is simultaneously to establish the rules for winning that game: to specify the actions that can and cannot be taken and the broader terms of success. This, after all, is the driving idea behind WAR’s distinction between “candidate quality” and “structural factors” in the first place: to parse out what is within the control of a political strategy from what is not. Background assumptions such as those about turnout embedded in the presidential vote share baseline do not just determine which factors should and should not be considered when determining candidate quality; they speak to the very nature of the game that the candidates are supposedly better or worse at. What’s more, the effect of making such strong regularity assumptions—that past social patterns will extend into the future, because the game is set up to ensure they will—is to foreclose the very strategies that might have challenged those patterns. The presumed regularity might then itself be an effect of not trying things differently: not a fundamental background constraint on politics, but a result of the way we do politics. Candidates and campaigns do shape who shows up to vote, and so in turn what a district’s voting tendencies are.
And it is precisely the tendency to lose sight of this fact that leads pundits and pollsters into a deep-seated status quo bias, one that quite literally defines out of consideration strategies that depart from business as usual and through which elections can in fact be won. Such strategies might, for example, aim to alter the composition of standard voting blocs, drawing in new voters and realigning how existing voters vote up and down the ballot. The alternative is to resign ourselves to the idea that the third of eligible voters who did not vote in 2024, nearly ninety million people, should simply be absorbed into the baseline of low expectations for the American electoral system and written off as irrelevant to political strategy.

This is not to say that data and the methods of empirical research are of no use for understanding politics or for devising political strategy. To admit that the social world is too staggeringly complex to draw straightforward causal prescriptions from models like these is not, in itself, to descend into skepticism and nihilism about the prospect of social knowledge. It is simply to recognize a basic difference between the social world of human action and the world of physics. Quantitative methods of prediction and explanation used in the social and natural sciences might appear similar at a formal or technical level—each tries to learn from data, and each might even use similar modeling methods to do so—but they are not at all similar at the level of causal interpretation and normative prescription. Assuming that the sun will rise tomorrow on the basis of past experience, or that Newton’s laws of motion will continue to hold, is not at all like assuming that district voting patterns in the 2028 presidential election will stick to old trends, however “fundamental” those trends may seem to modelers today. The most important difference is that while we can’t change the laws of motion, we can, in principle, change patterns of human behavior.
That presumably is the point of politics, after all.
In claiming the mantle of science, the data-centered approach to politics has drawn ever tighter boundaries around what it takes to speak knowledgeably about politics and so in turn who can be entrusted to direct it. Furthermore, this form of analysis’s overarching emphasis on the structural determinants of human behavior—on all the ways that human behavior is already dictated by conditions outside of our control—is in tension with one of the core tenets of political organizing: that people and things are permeable to change. There is something paradoxical, even self-defeating, about an approach to political strategy that takes so much of our political life as fixed and preordained and thereby minimizes the significance of doing actual politics: organizing people’s understandings of the world in a way that moves them to behave in ways radically different from how they had previously been behaving.
To construct a model of political strategy based primarily on existing empirical patterns of belief and action is to express a fundamental skepticism toward the possibility that these might be changed. And it is, in the end, to treat people as objects rather than agents. The body politic is not to be engaged as composed of human beings whose actions have reasons or are expressive of goals and values. The aim is to discover something like the “physics” of social and political life: what its general tendencies are, what it is attracted to or repulsed by, in order to know how best to handle, steer, and manage its direction. When the task of politics is reduced to identifying the top predictors of voting for Democrats, success is reduced to ensuring behavioral compliance.

The vision of politics—technocratic experts pulling just the right levers to secure just enough votes—is not just sterile and unappealing. It also has real consequences of its own, contributing to the prevailing sense that strategists and politicians don’t really care about what they are saying or doing. The poll-tested candidate looks like a phony, bought and sold by donors and managed by party operatives.

To this, the mainstream positivist will of course say that they are guided by democratic rather than managerial values—that “determining what people truly believe,” as Jain put it in his defense of polling in The Argument, “is a core demand of democratic systems.” It is obviously true that what the public thinks and wants is crucial to democratic politics. But it is only by collapsing “what people truly believe” into the results of polling that satisfying this democratic demand becomes a purely technical exercise for the data analysts. Jain is confident in this equation. Polling, he writes, “tells us what people believe, how they plan to vote, and how they view the world. It’s not the truth, but it can get us much closer to it than anything else can.”

But even he seems ultimately ambivalent about what public opinion is and whether it is stable enough to help guide political strategy. He notes in the conclusion of his article that “in a world of great uncertainty and constant change, many voters are also a maze of contradictions. People are quite malleable and fluid on many different topics, and they can have deeply held values while simultaneously changing the way they vote and think in response to current events.” What might explain this fluidity? What might spur these changes? These are the sorts of observations that might lead one to see public opinion as forged through politics rather than existing stably prior to it.

Many claims about politics cannot be translated into numbers; many others cannot be validated by data, even in theory. To the positivist, moral claims, exercises of imagination, and claims about what is possible but as yet unrealized come with glaring asterisks. New tactics and strategies are by definition unvalidated, and in that sense they are admittedly a risk. But there are risks the other way, too. The more fundamental disagreement is whether the deep problems of American society are too serious to confine ourselves to business as usual—including the tendency to defer to the prescriptions of pollsters whose noisy and imperfect models take so much of our political life as fixed and preordained, even as so much of it is so clearly rapidly changing.

In the end, this is not a dispute that can be resolved by data. Politics cannot be drained out of political strategy, for the basic reason that central questions of this kind are simply not open to purely technical or empirical resolution. Skepticism toward prevailing modeling methods is therefore neither dogmatic nor anti-empirical. Rather, it is a reasonable and considered judgment about the social world—shaped by distinct interpretations of evidence, assessments of risk, senses of urgency, and visions of what is possible and worth trying—which says, on this basis, that the challenges we face can only be met by radical changes to the prevailing political order, even if doing so means pushing past the boundaries of prevailing political reason.

The post How to Lie with (Political) Statistics appeared first on Boston Review.
