Garbage Data Makes for Garbage Predictions

Pollsters and campaigns are operating with increasingly unverifiable data.

Here in Washington state, we vote by mail. My ballot was cast last week. So, instead of standing in lines and reporting anecdotal information about local turnout, I’m mostly doing what I do every weekday: homeschooling my boys. I say mostly because during breaks between math and science lessons, I’m making GOTV calls.

This is not the first time I’ve made these types of calls. It’s not even the first state or decade in which I’ve made them. The trends I’m seeing are disheartening. As our society becomes more mobile and more insular, gathering data for campaigns becomes much more difficult. Despite the promises of the Digital Age, we actually know less about the voting habits of our neighbors than we did twenty years ago.

On a list of fifty people in my precinct who, as of Friday evening, had not yet returned their ballots, I spoke to fewer than fifteen. I left messages for another fifteen. The remaining forty percent or so either had the wrong phone number listed or had moved. Phone banking has shown diminishing returns for several years. People don’t like robocalls (No, campaign consultants, they really DON’T, no matter how hard you try to sell them), so they set the phone to “direct to voice mail” or angrily slam it down the minute you introduce yourself.

Campaigns combat the problems with phone banking by knocking on doors and trying to make personal connections with constituents. State Legislator Paul Graves (5th LD) and Dino Rossi, candidate for US Congress in Washington’s 8th Congressional District, have knocked on thousands of doors this cycle. This is more effective than phone calls but has challenges as well. Gated communities, retirement homes, high-rise apartment buildings, and “bedroom communities” are notoriously hard to doorbell. Even in “typical” suburban subdivisions, most families have two working adults. That means you have maybe two hours on weekdays to reach people at home. Add to that the problem of geography in the northern tier of states: the sun sets in Seattle at 4:40 pm these days. Candidates literally run out of daylight before they can knock on doors.

Big data was supposed to help with this. By aggregating participation in caucuses, voting patterns, what magazines you subscribe to, what Facebook pages or ads you respond to, and what organizations you belong to, campaigns were supposed to be able to predict how you would vote. At the level of broad national generalizations, this mostly works. In small, local campaigns, the data falls apart. Some anecdotal examples: After working for a US Senator for four years, I left the workplace to raise my children. Between 2010 and 2012 we lived in four different states. In 2016, when I was a delegate to the Republican National Convention and a GOP District Chairwoman here in western Washington, the GOP data center still had me listed as a “weak Republican”. This classification was based on the fact that I hadn’t participated in Washington political circles and because I did not belong to the NRA or subscribe to Republican publications. No one had bothered to make a personal connection with me prior to 2016, and no one had bothered to update the information following the 2016 caucuses and primaries in Washington.

My husband, by the way, is still listed as a “probable Democrat”. This is hilarious. How does a man born and raised in small-town Kentucky, married to a Republican political activist, get listed as a probable Democrat? He’s in the computer game industry, and the Facebook pages and ads he clicks on, the professional organizations he belongs to, and the fact that no one bothers to actually ask him all point to a more typical Democrat. (I update the GOP data files when I have access, but we moved last year so we’re “new” voters to some systems.)

For national polls and campaigns these are tiny sampling errors. In suburban districts with tight races for federal and state offices, these identification errors, compounded by inaccurate phone numbers and out-of-date voter rolls, can critically hamstring a campaign. How many voters are we not reaching because they aren’t listed as members of our party, or not listed at all?

These challenges aren’t going to get any better. They may, in fact, get worse. When every political consultant and pundit is writing a campaign post-mortem later this week, they’d be wise to devote serious thought to how to return our political campaigns to a local focus. Personal connections win campaigns. Making them is not easy and it’s not cheap, but it makes for a better representational Republic.