For over a century, Speakers’ Corner in London’s Hyde Park has been filled with strangers shouting from soapboxes. A little more recently, the internet has seemingly made soapbox-shouters of us all. Everyone has a right (thank goodness) to voice her opinion for all to hear. But it can be hard to sift through the different voices to decide who might be worth listening to.
The Brexit vote and the American election have made the question of whose opinion we trust more salient. Like the speakers in Hyde Park, the news stories that circulate online are not always filled with reliable information. Algorithms are doing more than ever before to determine which stories we see; can better algorithms help accurate, reliable stories win more of our attention?
From a computer-science perspective, the question at hand is all about reputations. In (say) a small village, people’s reputations can be developed and communicated purely through personal interactions; online, we need ways to develop reputations among large groups of strangers. In the early 2000s, Sep Kamvar – now a professor at the MIT Media Lab – developed an algorithm called EigenTrust to create a reliable measure of trustworthiness among members of a peer-to-peer sharing network. Peer-to-peer networks made it possible for strangers to share data easily and robustly, but were vulnerable to malicious actors sharing corrupted or inauthentic files. Users needed a way to determine whether a peer on the network was trustworthy before receiving a file from her.
EigenTrust works by cleverly aggregating users’ information about each other’s behaviour. If you share a corrupted file with me, I will rate you poorly. But how much weight should the algorithm put on my opinion? That depends on my own reputation: my opinion will receive more weight if I’ve received positive ratings from many other trusted users. My reputation depends on how many users rated me positively, but also on how good their reputations were; the algorithm makes repeated calculations across the entire network of relationships until a stable reputation score is achieved for every user. The reasoning might sound circular, but the maths works elegantly.
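For readers who like to see the circularity resolved, here is a minimal sketch of that repeated calculation in Python. It is an illustration of the idea rather than Kamvar's actual implementation: the function name `eigentrust`, the toy ratings and the fixed number of rounds are all assumptions made for the example, and ratings are assumed to be non-negative scores.

```python
def eigentrust(local_trust, iterations=100):
    """Sketch of reputation by repeated aggregation (hypothetical helper).

    local_trust[i][j] is the non-negative rating user i gives user j.
    Returns one global score per user; the scores sum to 1.
    """
    n = len(local_trust)

    # Normalise each user's ratings so every user hands out one unit of trust.
    # A user who has rated nobody is assumed to trust everyone equally.
    normalized = []
    for row in local_trust:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else [1.0 / n] * n)

    # Start with everyone equally reputable, then repeatedly recompute each
    # score as the ratings received, weighted by the raters' current scores.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            sum(scores[i] * normalized[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Toy network: users 0 and 1 rate each other highly; user 2 (who, say,
# shared a corrupted file) gets poor ratings from both.
ratings = [
    [0, 4, 1],
    [3, 0, 1],
    [1, 1, 0],
]
scores = eigentrust(ratings)
print([round(s, 3) for s in scores])
```

In this toy network the scores settle after a few dozen rounds, and user 2 ends up with the lowest reputation because the users who rate her poorly are themselves well regarded.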
This approach does have some precedents in the social world. If you’re looking for a new employee you’ll surely put a lot of weight on recommendations from Ann, who you know is trusted by everyone in your field. Conversely, if Ann tells you not to hire someone, her word alone can outweigh many positive recommendations from others. You’re happy to give some weight to recommendations from people Ann recommends; her strong reputation rubs off on those she can vouch for. Meanwhile, you’ll put very little weight on recommendations from Beth, whom you met one day at a conference and don’t know anything about; her recommendation won’t count against a candidate, but it also won’t help them.
EigenTrust is based on this “intuition that you trust your friends’ friends,” says Kamvar, but mixed with a “computation that there’s no analogy for in the social world” because in our non-digital lives we lack the algorithm’s bird’s-eye view of the entire network. A similar method helped Google dethrone Yahoo as the sovereign of search at the turn of the century: Google’s famous PageRank algorithm determines a page’s trustworthiness by the links it receives from other trustworthy pages, which themselves are identified by the links they receive from others. This turned out to give Google a huge advantage in surfacing high-quality pages to promote in its search results.
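The link-counting idea can be sketched in the same spirit. The code below is a simplified, textbook-style version of PageRank, not Google's production system: the function name `pagerank`, the toy web of four pages and the damping value of 0.85 (a commonly quoted default) are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank sketch (hypothetical helper).

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share, whatever its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web: three pages link to "hub", which links back to "a" and "b".
# Nobody links to "c".
web = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Here "hub" comes out on top because everyone links to it, "a" and "b" borrow some of its standing, and "c", which no one links to, is left with only the baseline share.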
Social news sites such as Reddit (the eighth-biggest website, by traffic, in America) seem like natural candidates for EigenTrust-like algorithms. These sites rely on user-voting to determine which stories other visitors will see, and users receive points (“karma”) for submitting popular stories. And while it’s not often described this way, Facebook is in some ways a social news site: which stories show up in our newsfeeds is determined in part by which stories our friends Like, click on and hide. For social news sites, giving more weight to the opinions of more trustworthy users seems like an easy improvement.
The founder of Hacker News, a social news site for coders, originally thought so. In 2007, Paul Graham planned that the site would have “a group of human editors who train the system in what counts as a good story. Each user’s voting power will then be scaled based on whether they vote for good stories or bad ones.” But things turned out differently: it was hard to find users who consistently exhibited good taste. Similarly, “karma is a weaker signal than people would like to believe, and certainly than we believed ten years ago when the idea was more fresh,” says Daniel Gackle, the site’s chief moderator. “Karma turns out to be more a matter of how much time a person spends posting to the site than of how good their posts are. Also, posting popular things (and so getting lots of karma) is not the same as posting good things, at least by Hacker News’ definition of ‘good’.”
This highlights a key challenge for algorithms like EigenTrust and PageRank: they do not really measure trustworthiness in any objective sense, but rather compute a modified measure of aggregate popularity. It’s possible to “seed” these algorithms by using human judgment to declare certain people or pages trustworthy in advance, and that approach might still have potential, but Hacker News’s experience shows that it certainly isn’t easy. Which leads, perhaps, to a sad conclusion: sometimes we get the news we deserve.
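To make the difference between popularity and seeded trust concrete, here is one more hedged sketch. It assumes a particular way of seeding, in which a fixed fraction of every score is redistributed to a hand-picked set of trusted users on each round; the function name `seeded_trust`, the `seed_weight` of 0.2 and the toy network are illustrative assumptions, not a description of any site's real system.

```python
def seeded_trust(ratings, seeds, seed_weight=0.2, iterations=100):
    """Reputation scores anchored to hand-picked seed users (hypothetical helper).

    ratings[i][j] is the non-negative rating user i gives user j.
    seeds is a non-empty set of user indices declared trustworthy in advance.
    """
    n = len(ratings)
    seed_share = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]

    # Normalise each user's ratings; users who rated nobody defer to the seeds.
    normalized = []
    for row in ratings:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else seed_share)

    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Mix the crowd's weighted opinion with a fixed pull towards the seeds.
        scores = [
            (1.0 - seed_weight) * sum(scores[i] * normalized[i][j] for i in range(n))
            + seed_weight * seed_share[j]
            for j in range(n)
        ]
    return scores


# Users 0 and 1 rate each other; users 2, 3 and 4 form a clique that
# rates only its own members.
ratings = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]
unseeded = seeded_trust(ratings, seeds={0}, seed_weight=0.0)
seeded = seeded_trust(ratings, seeds={0})
print("clique share without seeding:", round(sum(unseeded[2:]), 3))
print("clique share with user 0 as seed:", round(sum(seeded[2:]), 3))
```

Without seeding, the clique holds most of the total score simply because it has the most members vouching for one another. With user 0 declared trustworthy in advance, the clique's share dwindles towards zero. Of course, the sketch only moves the hard part: someone still has to choose the seeds well.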