Election data Q+A

May 1, 2016

This article appeared in Berkeley Engineer magazine, Spring 2016

Jasjeet Sekhon is the chief scientist at the Fung Institute for Engineering Leadership. He holds faculty appointments in political science and statistics, and he is a senior fellow at the Berkeley Institute for Data Science. Sekhon’s research applies computational and statistical analysis to massive datasets of granular information collected about people. In addition to using that data to study how persuasion works in elections, he also studies the effectiveness of digital advertising and personalized medicine.

Where do data and elections converge?
When and how often you vote is part of the public record. Where you live is part of the census data. With that information alone, politicians can make assumptions about you and how you will vote. The most boring version of this results in targeted mail, which most people have seen. But the use of data is becoming more sophisticated. It’s not just a mailer anymore; now there might be an experiment embedded in it to try to figure out the right messaging a candidate should use. The cost of doing that kind of thing is coming down as data services and infrastructure become less expensive. During the elections of 2012 and 2014, experiments were performed on 20 to 30 million potential voters, and that isn’t even counting the online-only experiments. The largest experiments ever done on people are happening right now.

When did data become such an important part of political campaigning?
U.S. elections, and presidential elections in particular, have been transformed. In 2000, the election was shockingly close. That made people think more about the small margins. In 2008, the Obama campaign had a large analytics team. They ran lots of experiments and randomized different appeals. They were able to collect individual data on people in battleground states. By 2012, the Obama campaign had gotten really good at this and had become a huge machine.

How is the 2016 election shaping up to be different from previous elections?
Our data is good enough that we can make predictions with a high degree of confidence when the elections are close and stable. But in this election, with a wild-card candidate like Donald Trump, the normal predictive models may not be accurate anymore.

What’s next?
Getting people to vote is one thing, but persuasion, meaning changing their minds about candidates or issues, is a much messier problem. How do you know if a campaign ad or a piece of direct mail actually changed a voter’s mind? Probably by 2020, we will have addressable TV, which means TV ads that can be targeted to an individual viewer, much as Internet ads are. That will not only change how elections operate, but it will also be a huge change for advertising. We’ll be able to better understand how effective ads are.

Where else do these lessons apply?
What election data has in common with other forms of human data is that it is very granular and describes the way people behave, whether the subject is an Internet ad, the effectiveness of a new drug or a political campaign. We haven’t been here before, and there’s still a lot we don’t really know.
