Before I go any further,
let's get one thing out of the way. Nate Silver is a genius. He is skilled
not only in manipulating data but also in providing explanations for how he
manipulated data. Silver gets plenty of credit for the first thing and way too
little credit for the second thing.
My little niche in the
world has been carved out by writing about fantasy baseball and applying math to
player valuation. By trade, I've spent the last few years of my career applying
math and statistics to health care analytics. Using statistics properly is
difficult enough. Using statistics well and explaining to
someone how you got there is climbing Mount Everest. Silver's 538
blog not only climbs that mountain, it reaches the summit.
The knee-jerk reaction to
Tuesday’s election results was to praise Silver as the Dean Emeritus of the
Electoral College. Some were amazed that Silver nailed 49 out of the 49 states
called so far, and he will probably get Florida right as well.
While there's no doubt that
Silver did a terrific job, the surprised reaction of the masses came because of
some last-second projections from some far less analytical sources. Conor
Friedersdorf of The Atlantic has a good
piece on this phenomenon. To some on the right it was difficult to
believe that Barack Obama could possibly clear 250 electoral votes, let alone
win an election. In reality, the opposite was the case. All along, Mitt Romney
had a difficult path to the Presidency where the electoral votes were
concerned.
To be fair, there were
voices on the right that either thought Obama was the favorite or that Romney
was only a slight favorite who didn't have a realistic path to 300 electoral
votes. Ted Frank at Point of Law posted a well-thought-out and rational
piece a week before the election explaining his issues with Silver's methodology. While I don't agree with many of Frank's conclusions,
unlike a lot of those on the right, Frank offered a thoughtful rationale that
went beyond "well it can't possibly be correct that Obama is winning by
that much."
The right also seemed to
get stuck on the idea that because Silver's model gave Obama a 91% chance of
winning, Obama was going to win in a landslide. But this wasn't what Silver
was saying at all. Rather, his system was estimating the probability of
an Obama victory at 91%. That nine percent chance for Mitt Romney wasn't
trivial. To use an apt analogy that Bret Sayre of Fake Teams provided
on Twitter,
a home team in baseball with a one-run lead, a runner on third base, and two
outs in the top of the ninth has an 87% chance of winning. The game is close,
but the odds of the road team winning are poor.
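To make the difference between a win probability and a winning margin concrete, here is a minimal sketch. It is not Silver's model: it simply assumes the final margin is normally distributed around an expected lead, and both input numbers below are hypothetical values picked for illustration.

```python
from statistics import NormalDist


def win_probability(expected_margin, margin_sd):
    """Chance that the final margin ends up above zero.

    Assumes (for illustration only) that the final margin is normally
    distributed around expected_margin with standard deviation margin_sd.
    """
    return 1.0 - NormalDist(mu=expected_margin, sigma=margin_sd).cdf(0.0)


# Hypothetical inputs: a 2.5-point expected lead with 1.9 points of uncertainty.
p = win_probability(2.5, 1.9)
print(f"Win probability: {p:.0%}")
```

Under these assumed inputs, a lead of only a couple of points still works out to roughly a nine-in-ten chance of winning. A high probability says the lead is reliable, not that it is large.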
The most significant
problem with the right's analysis is that on a basic level using state polling
data works. Nate Silver isn't the only analyst who uses state
polling data to try to predict the outcome of elections. Election Projection and Electoral Vote both rely heavily on
state projections. Unlike Silver, they don't shake and bake with the numbers. Both
sites aggregate state polls going back about a week. Both sites predicted an
Obama victory, 303-235. The only state both sites got wrong was Florida, which
Silver (probably) got right.
As a fun exercise, I went
back even further than Election Projection and Electoral Vote and calculated a
leader based on a month's worth of polls, using only one poll (the most recent
one) per polling firm. This methodology is generally considered a poor one
because older polls are "stale" and don't take
into account recent events that could tip the scales in one direction or the
other. How does this "neutral" model do compared to Silver's more
rigorous model?
Nate Silver's 538 model versus "neutral" state poll model

State | Raw Polling Average | Silver/538 | Actual
California | Obama +14 | Obama +17.4 | Obama +20.5
Colorado | Obama +1.1 | Obama +2.5 | Obama +4.7
Connecticut | Obama +12.7 | Obama +14.1 | Obama +17.8
Florida | Romney +1.3 | Obama 0 | Obama +0.6
Iowa | Obama +2 | Obama +3.2 | Obama +5.6
Massachusetts | Obama +19.7 | Obama +19.1 | Obama +22.8
Michigan | Obama +3.6 | Obama +7.1 | Obama +8.5
Minnesota | Obama +5.7 | Obama +8.6 | Obama +6.4
Missouri | Romney +10.8 | Romney +8.1 | Romney +9.6
Nevada | Obama +2.6 | Obama +4.5 | Obama +6.6
New Hampshire | Obama +1.8 | Obama +3.5 | Obama +5.8
North Carolina | Romney +2.7 | Romney +1.7 | Romney +2.2
Ohio | Obama +2.8 | Obama +3.6 | Obama +1.9
Pennsylvania | Obama +3.8 | Obama +5.9 | Obama +5.2
Virginia | Obama +0.2 | Obama +2 | Obama +3
Washington | Obama +13.6 | Obama +13.6 | Obama +13.3
Wisconsin | Obama +4.7 | Obama +5.5 | Obama +6.7
National | Obama +1.7 | Obama +2.5 | Obama +2.5
I included every state that
had at least five polls listed at Real Clear Politics in the 30
days leading up to the election.
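For anyone who wants to repeat the exercise, here is a rough sketch of the "neutral" average for a single state. The function name, the tuple layout, and the sample firms and margins are all made up for illustration. I'm also assuming a simple unweighted mean, and that the five-poll minimum counts every poll listed in the window before trimming to one per firm.

```python
from datetime import date, timedelta
from statistics import mean


def raw_polling_average(polls, election_day, window_days=30, min_polls=5):
    """Average the most recent poll from each firm within the window.

    polls: list of (firm, end_date, margin) tuples for one state, where a
    positive margin is an Obama lead and a negative one a Romney lead.
    Returns None when the state has fewer than min_polls in the window.
    """
    cutoff = election_day - timedelta(days=window_days)
    in_window = [p for p in polls if cutoff <= p[1] <= election_day]
    if len(in_window) < min_polls:  # assumed: count before de-duplicating
        return None

    latest = {}  # firm -> (end_date, margin) of that firm's newest poll
    for firm, end_date, margin in in_window:
        if firm not in latest or end_date > latest[firm][0]:
            latest[firm] = (end_date, margin)
    return mean(margin for _, margin in latest.values())


# Hypothetical polls for one state (not real data).
sample = [
    ("Firm A", date(2012, 10, 10), 1.0),
    ("Firm A", date(2012, 11, 1), 3.0),   # newer Firm A poll replaces the older one
    ("Firm B", date(2012, 10, 20), 2.0),
    ("Firm C", date(2012, 10, 25), -1.0),
    ("Firm D", date(2012, 11, 3), 4.0),
    ("Firm E", date(2012, 9, 15), 10.0),  # outside the 30-day window
]
average = raw_polling_average(sample, date(2012, 11, 6))
print(average)
```

Nothing here weights polls by recency, sample size, or pollster quality, which is exactly why this counts as the "neutral" baseline.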
While the right kept
vaguely and not-so-vaguely accusing Silver of "cooking the books" in
the days leading up to the election, Silver's model may not have been
aggressive enough. In 11 of the 15 states where Silver had Obama winning, the
actual margin was even higher than Silver predicted. The raw polling model was
even more tepid on Obama than Silver's model, lagging behind him in 13 of 15
states.
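Those two counts can be rechecked straight from the table. The sketch below encodes each margin as a signed number (positive for Obama, negative for Romney), which is my own convention rather than anything in the source data, and it leaves out the National row because it isn't a state.

```python
# (state, raw polling average, Silver/538, actual); positive = Obama lead
rows = [
    ("California", 14.0, 17.4, 20.5),
    ("Colorado", 1.1, 2.5, 4.7),
    ("Connecticut", 12.7, 14.1, 17.8),
    ("Florida", -1.3, 0.0, 0.6),  # Silver's column lists Florida as "Obama 0"
    ("Iowa", 2.0, 3.2, 5.6),
    ("Massachusetts", 19.7, 19.1, 22.8),
    ("Michigan", 3.6, 7.1, 8.5),
    ("Minnesota", 5.7, 8.6, 6.4),
    ("Missouri", -10.8, -8.1, -9.6),
    ("Nevada", 2.6, 4.5, 6.6),
    ("New Hampshire", 1.8, 3.5, 5.8),
    ("North Carolina", -2.7, -1.7, -2.2),
    ("Ohio", 2.8, 3.6, 1.9),
    ("Pennsylvania", 3.8, 5.9, 5.2),
    ("Virginia", 0.2, 2.0, 3.0),
    ("Washington", 13.6, 13.6, 13.3),
    ("Wisconsin", 4.7, 5.5, 6.7),
]

# States where Silver had Obama ahead (Florida's zero margin counts as Obama).
obama_states = [r for r in rows if r[2] >= 0]

# Actual Obama margin larger than Silver's projection.
actual_beat_silver = sum(1 for _, raw, silver, actual in obama_states if actual > silver)

# Raw polling average lower on Obama than Silver's projection.
raw_behind_silver = sum(1 for _, raw, silver, actual in obama_states if raw < silver)

print(len(obama_states), actual_beat_silver, raw_behind_silver)  # 15 11 13
```

The four states where Obama finished below Silver's number are Minnesota, Ohio, Pennsylvania, and Washington. The raw average was below Silver everywhere except Massachusetts, where it was higher, and Washington, where the two were tied.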
Silver's model worked very
well. But we shouldn't be too surprised. State polling-based models have worked
very well in the last three election cycles. If you used raw polling averages
in 2004 and 2008, you would have correctly predicted the winner in 49 out of 50
states both times, with only Indiana (2008) and Ohio (2004) being off the mark.
If you used a time frame shorter than 30 days in 2004, you would have predicted
a George Bush victory; John Kerry led in only one poll in the last few
days leading up to that election.
The "surprise" of
Silver doing so well with his model has more to do with a lack of belief in
data than in anything that Silver is or isn't doing. While the right's
apoplectic reaction has magnified this perception, Presidential election
coverage in general has always tried to present the state of the race as
something that is difficult to get a feel for and could go in any direction.
While this is theoretically true, we now have three Presidential election
cycles where state polling has offered us an excellent baseline to predict what
will ultimately happen on Election Day.