Monday, November 13, 2023

Should Social Media Data Replace Opinion Polls—And Voting?


Pity the poor opinion pollsters of today.  Their job has been mightily complicated by the rapidly changing nature of communications media and the soaring costs of paying real people to do real things such as knocking on doors and asking questions.  In an age when even the Census Bureau has mostly abandoned the in-person method of counting the population, opinion pollsters can't afford it either.  For a time, say 1950 to 2000, their job was made easier by the advent of the near-universal telephone.  But the rise of robocalling, the proliferation of mobile phones with caller ID, and the consequent general aversion of nearly everybody to answering a call from an unknown number have made it much harder for opinion poll workers to approach the ideal of their business:  a truly representative sample of the relevant population.


So why not take advantage of the technological advances we have, and use data culled from social media to do opinion polling?  After all, we are told that some social-media and big-tech firms know more about our preferences than we do ourselves.  Out there in the bit void is a profile of everyone who has anything to do with mobile phones, computers, or the Internet—which is almost everyone, period.  And much of that data on people is either publicly available or can be obtained for a price that is a lot less than paying folks to walk around in seventeen carefully selected cities and countrysides knocking on one thousand doors. 


Well, anything a piker like me can think of, you can bet smarter people have thought of as well.  And sure enough, three researchers at the University of Lausanne in Switzerland have not only thought of it, but have collected nearly two hundred papers by other researchers who have also looked into the topic. 


In surveying the literature, Maud Reveilhac, Stephanie Steinmetz, and Davide Morselli apparently did not find anyone who has gone all the way from traditional opinion polling to relying mainly on social-media data (or SMD for short).  That is a bridge too far even now.  But they found many researchers trying to show how SMD can complement traditional survey data, leading to new insights and confirming or disconfirming poll findings.


With regard specifically to political polls, a subject many of the papers focused on, one can imagine a kind of hierarchy, with one's actual vote at the top.  Below that is the opinion a voter might tell a pollster in response to the question, "If the Presidential election were held today, who would you vote for?"  And below that, as far as I know, anyway, are the actions the voter takes on social media—the sites visited, the tweets subscribed to, the comments posted, etc. 


It only stands to reason that there is some correlation among these three classes of activity.  If someone watches hours of Trump speeches and says they are going to vote for Trump, it would be surprising to find that they actually voted for Bernie Sanders as a write-in, for example. 


But there is a time-honored tradition in democracies that the act of voting is somehow sacred and separate from anything else a person happens to do or say.  Because voting is the exercise of a right conferred by the government, in the moment of voting a person is acting in an official capacity.  It is essentially the same kind of act as when a governor or president signs a law, and should be safeguarded and respected in the same way.  A president may have said things that lead you to think he will sign a certain law.  He may even say he'll sign it when it comes to his desk.  But until he actually and consciously signs it, it's not yet a law.


There are laws against bribing executives and judges in order to influence their decisions, and so there are also laws against paying people to vote a certain way.  That is because in a democracy, we expect the judgment of each citizen to be exercised in a conscious and deliberate way.  And bribes or other forms of vote contamination corrupt this process.


Despite the findings of the University of Lausanne researchers that so far, no one has attempted to replace opinion polls wholesale with data garnered from social media or other sources, the danger still exists.  And with the advent of AI and its ability to ferret out correlations in inhumanly large data sets, I can easily imagine a scenario such as the following.


Suppose some hotshot polling organization finds that it can get a consistently high correlation between traditional voting results and "polling" based on a sophisticated use of social media and other Internet-extracted data, extracted in most cases without the explicit knowledge of the people involved.  Right now, that sort of thing is not possible, but it may be achievable in the near future.


Suppose also that for whatever reason, participation in actual voting plummets.  This sounds far-fetched, but already we've seen how one person can singlehandedly cast effective aspersions on the validity of elections that by most historical measures were properly conducted. 


Someone may float the idea that, hey, we have this wonderful polling system that predicts the outcomes of elections so well that people don't even have to vote!  Let's just do it that way—ask the AI system to find out what people want, and then give it to them.


It sounds ridiculous now.  But in 1980, it sounded ridiculous to say that in the near future, soft-drink companies would be bottling ordinary water and selling it to you at a dollar a bottle.  And it sounded ridiculous to say that the U.S. Census Bureau would quit trying to count every last person in the country, and would rely instead on a combination of mailed questionnaires and "samples" collected in person.


So if anybody in the future proposes replacing actual voting with opinion polls that people don't actually have to participate in, I'm here to say we should oppose the idea.  It betrays the notion of democratic voting at its core.  The social scientists can play with social-media data all they want, but there is no substitute for voting, and there never should be.


Sources:  The paper "A systematic literature review of how and whether social media data can complement traditional survey data to study public opinion," by Maud Reveilhac, Stephanie Steinmetz, and Davide Morselli appeared in Multimedia Tools and Applications, vol. 81, pp. 10107-10142, in 2022, and is available online at
