Friday, June 17, 2011

Is Google “Protecting” You From Exposure to Opposing Views?

My latest post at Critical Thinking Applied:

From a recent article in the New York Times:
On the Web, we often see what we like, and like what we see. Whether we know it or not, the Internet creates personalized e-comfort zones for each one of us.

Give a thumbs up to a movie on Netflix or a thumbs down to a song on Pandora, de-friend a bore on Facebook or search for just about anything on Google: all of these actions feed into algorithms that then try to predict what we want or don't want online.

And what's wrong with that?

Plenty, according to Eli Pariser, the author of "The Filter Bubble: What the Internet Is Hiding From You." Personalization on the Web, he says, is becoming so pervasive that we may not even know what we're missing: the views and voices that challenge our own thinking.
I found this very upsetting. I've never been a fan of online personalization—on multiple occasions I've turned off Google's personalized search result function, only to find that it has a Rasputin-esque way of mysteriously turning itself back on—but until now my reasons have been limited to my own curmudgeonly preferences.[1] This, however, is something that could actually have larger implications. More from the article:
With television, people can limit their exposure to dissenting opinions simply by flipping the channel, to, say, Fox from MSNBC. And, of course, viewers are aware they're actively choosing shows. The concern with personalization algorithms is that many consumers don't understand, or may not even be aware of, the filtering methodology.
My original plan for this post was to wrap it up somewhere around here—with the common sense observation that it's not a good thing for our search results to be aimed at reinforcing whatever biases can be gleaned from our search history, especially if the user has no idea this is going on—but then my natural skepticism took over. Wait a minute, I thought, I haven't actually seen any non-anecdotal evidence that personalization of search results extends to a user's political leanings—it's just kind of taken as a given that this is the sort of thing Google would do. This despite the fact that the Times quotes a Google spokesman saying they do no such thing:
"People value getting information from a wide variety of perspectives, so we have algorithms in place designed specifically to limit personalization and promote variety in the results page," said [Jake] Hubert, the Google spokesman. He added that the company looked forward to "carefully reviewing Mr. Pariser's analysis of this important issue."
After trying, and failing, to find an instance of someone else doing an experiment similar to what I had in mind,[2] I put this article on hold, went into my Google account and cleared my search history (I hadn't knowingly changed any settings since the previous time I cleared the search history and set it to not record my searches in the future, but, again—like freakin' Rasputin), and made sure to leave the record-keeping and personalization features turned on.

At that point there were any number of directions I could've gone, but I've been picking on conservatives for a while now, so I decided to switch it up. Over the next seven days I searched Google for information on topics like the failings of capitalism, racism in the Tea Party, healthcare as a human right, the economic benefits of amnesty, the ineffectiveness of abstinence-only education, and the disastrous consequences of the Bush tax cuts. I looked into whether I'm eligible for food stamps. I pretended to care about Alec Baldwin's opinions on things. I tried to determine who was America's greatest president—FDR, Kennedy, or Clinton. I took the opportunity to satisfy my genuine curiosity about Florida's recent "voter suppression" law.[3] Early in the process I did a few searches along the lines of "Republicans are evil" and "Republicans don't care about minorities", until I realized that manipulating Google's personalization algorithm probably requires a little more finesse than simply typing your views (note: those are not my views) into the search box as if you're having a one-sided conversation.

When I got tired of politics, I looked into hybrid cars, the ethics of eating meat, The West Wing DVD sets, the local time in France (there was an unrelated reason for that one, but I figured it fit the theme), post-graduate programs in English literature, Burning Man, John Lennon lyrics, the video for Aerosmith's "Eat the Rich", Portland, Oregon's locally-owned coffee shops, and on and on.

In all, I did more than 200 searches on more than 100 unique topics, and I visited about 250 sites from among the search results, being careful to only go to those that seemed likely to reinforce liberal views. Every search, and every decision to click on a particular result, was geared toward building a search history that (a) reflected a person whose politics, lifestyle, and hobbies were stereotypically liberal to an absurd extreme, and who had no interest whatsoever in challenging that worldview, and (b) was otherwise a somewhat realistic imitation of how people (people like me, at least) actually use search engines.[4]

Having accumulated what seemed like enough data for Google to mess around with, and having done so over what seemed like a long enough period of time for the data to be incorporated into their system, I did some tests. For this part I wanted to use search terms dealing with topics that were (a) not specifically covered in the search history, (b) controversial enough that a non-personalized search would likely yield results supporting multiple points of view, and (c) recent enough that most of the available commentary would be opinion-based (but not so recent that search results would be changing almost in real time). I came up with the following:
  1. "Vaughn Walker"
  2. NPR funding
  3. oil companies "price gouging"
  4. "Rick Scott" "high speed rail"
  5. "budget cuts" entitlements
  6. "debt ceiling"
I logged out of my Google account and searched for each of the above terms—first as a standard web search, then a Google News search—and saved a screenshot of the first page of results. Then I logged back in and repeated the process. I compared the results, and…

[…drumroll…]

I got nothing. I mean, not literally nothing, but I didn't get much. The Google News results were identical for all six searches. The web searches had some differences, but in each case the "personalized" results reflected the same variety of viewpoints as the "neutral" results. The first ten results for searches #4 and #5 were identical. The results for searches #1 and #3 were the same ten sites, but in a slightly different order. The results for search #6 ("debt ceiling") were in a slightly different order, and also differed in one wholly unspectacular way:
  • The 5th "personalized" result was this CNN Money article from May 17, while the 6th "neutral" result was this CNN Money article originally dated January 11, but republished on May 18. The two articles are almost identical.
The results for search #2 (NPR funding) were in a slightly different order, and also differed in two wholly unspectacular ways:
  • The 10th "personalized" result was this US News and World Report article, while the 10th "neutral" result was this Columbia Journalism Review article. When I saw this I re-did the searches and, sure enough, in both cases the other article was among the top two links on the second page of results.
  • The 6th "personalized" result was this Huffington Post article from March 17, while the 8th "neutral" result was this Huffington Post article from April 13.
I saved that one for last because it's the only difference with even a modicum of substance, though we're still talking about two articles from the same website about basically the same topic. Why the personalized results included the article from March 17 and not the one from April 13 is beyond me, though I note that only the former contains a quote by Representative-at-the-time Anthony Weiner, whose name appears in the search history because I wanted to know what Dan Savage had to say about the mess Weiner got himself into.[5]
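For anyone who wants to repeat the comparison without squinting at screenshots, here is a minimal sketch of the same bookkeeping in Python. It assumes you have already copied the first page of results for one search term into two ordered lists, one logged out and one logged in. The function name and the placeholder labels are made up for illustration; this is not anything Google provides, and it is not how I did it (I compared screenshots by eye).

```python
def compare_results(neutral, personalized):
    """Compare two ranked result lists (position 1 = top of the page).

    Both arguments are lists of result identifiers (e.g. URLs) in the
    order they appeared. Hypothetical helper, written for illustration.
    """
    neutral_rank = {url: i + 1 for i, url in enumerate(neutral)}
    personal_rank = {url: i + 1 for i, url in enumerate(personalized)}

    # Results that show up on both pages, in logged-out order.
    shared = [url for url in neutral if url in personal_rank]

    # Shared results whose position differs: url -> (neutral, personalized).
    moved = {
        url: (neutral_rank[url], personal_rank[url])
        for url in shared
        if neutral_rank[url] != personal_rank[url]
    }

    # Results that appear on only one of the two pages.
    only_neutral = [url for url in neutral if url not in personal_rank]
    only_personalized = [url for url in personalized if url not in neutral_rank]

    return {
        "identical": neutral == personalized,
        "moved": moved,
        "only_neutral": only_neutral,
        "only_personalized": only_personalized,
    }


# Placeholder labels, not real search results.
neutral_page = ["site-a", "site-b", "site-c", "site-d"]
personalized_page = ["site-b", "site-a", "site-c", "site-e"]

report = compare_results(neutral_page, personalized_page)
print(report)
```

A "same ten sites, slightly different order" outcome would show up as a non-empty "moved" entry with both "only" lists empty, and a swapped article would show up as one entry in each "only" list.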

And that's about it. My smoking gun is an article that referenced Anthony Weiner vs. another from the same site that didn't. What's to be learned from all this? Nothing, really, but here are some theories, in order of the odds I'd place on a given theory turning out to be the correct one:
  • 1 to 1 – Google's algorithm really is designed to encourage varied results, and does so effectively.
  • 3 to 1 – I didn't generate a search history with the right kind of data. I realized about halfway through, for example, that a fairly straightforward way to personalize search results would be to keep track of a user's favorite websites, so maybe I should have focused a little more on that.
  • 6 to 1 – I didn't generate a search history with enough data to produce meaningful results.
  • 9 to 1 – The field (i.e. the ever-present possibility that my findings are flawed for some reason I haven't thought of).
  • 200 to 1 – Google has developed an ingenious mechanism—based on suspicious activity such as doing the same searches from the same computer while logging on and off—to distinguish regular users from smartass bloggers, and in the latter case to ensure that nothing incriminating is discovered.
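If you want to check whether these made-up odds hang together, here is a rough sketch. It assumes "N to 1" means odds against, so the implied probability is 1/(N+1); that reading is my assumption here, and the short labels in the dictionary are just shorthand for the five theories above.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Convert 'N to 1' odds against into a probability of 1 / (N + 1)."""
    return Fraction(1, odds_against + 1)


# Shorthand labels for the five theories, with the "N" from "N to 1".
odds = {
    "algorithm works as advertised": 1,
    "wrong kind of data": 3,
    "not enough data": 6,
    "the field": 9,
    "anti-blogger conspiracy": 200,
}

probabilities = {name: implied_probability(n) for name, n in odds.items()}
total = sum(probabilities.values())
leftover = 1 - total

for name, p in probabilities.items():
    print(f"{name}: {float(p):.4f}")
print(f"total: {float(total):.4f}, unaccounted for: {float(leftover):.4f}")
```

Under that reading the five theories cover a little under 100% of the possibilities, which is about as much rigor as a set of invented numbers deserves.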
In closing, a tangential thought. I wouldn't have started this project if I didn't think there was a good chance I'd come away with some damning evidence that personalization of search results stifles exposure to opposing viewpoints. When that didn't happen, I was discouraged, and almost couldn't motivate myself to finish the article. But isn't that how science—or whatever you call it when your goal is essentially to reverse-engineer a large corporation's proprietary software—is supposed to work? I formed a hypothesis, tested it, and here are the results. They may be underwhelming, but at least whoever comes across this article will have something to build on.

1. What's especially maddening is that I've made a number of Google searches along the lines of "how to turn off personalization" and "stop tracking search history". The idea is to personalize my online experience, right? And I've made it abundantly clear that I don't want my online experience personalized. So why are my search results not personally tailored to my preference for search results that aren't personally tailored? WHAT ELSE DO I HAVE TO DO?
2. I found a few studies that sort of address the issue, but not really. In this 2010 study, volunteers provided (a) information about their browser settings, Google account settings, and other details that might affect personalized search results, and (b) the results they got for various search terms involving antique lamps. In this 2011 study, researchers set up Google accounts for three dead philosophers—Foucault, Nietzsche, and Kant—and built search histories based on terms from the indices of the philosophers' books. They're worth looking at, if you're into this sort of thing, but in both studies the search terms used to test the effects of personalization were chosen for their neutrality, so the findings don't really tell us anything about why different people might get different results.
3. Really, ACLU? It's a civil rights violation to shorten the early voting period from 14 days to eight (while expanding the permissible daily voting hours)? Florida didn't even have early voting until 2004.
4. For the occasional search that didn't support that narrative, I switched over to Bing, which, at 27, might make me the all-time youngest Bing user.
5. Answer: He's adamantly pro-Weiner. He also supports the Congressman.

5 comments:

  1. that leave a 1:49 ish chance of something else? or are these some sort of confused house odds not meant to add up 100 %? Of course there's probably a non zero probability that I have forgotten how math works.

    -Jordan

  2. You're right, the third one was supposed to be 6-to-1, not 7-to-1. I don't know how to feel about the fact that you managed to find a typo in a set of entirely made-up numbers, but I'm going to go with impressed.

  3. with the exception of a single split infinitive, spectacularly written. alec baldwin is so witty and reactionary right?

  4. What, "being careful to only go to..."? Bah! There's nothing wrong with a split infinitive. (Although, I agree that there's room for improvement. But after trying and failing multiple times to fluidly construct that sentence I was about to completely drive myself crazy, and eventually I had to just give up.)

  5. You should be proud of yourself for publishing evidence that disproved your hypothesis. Not enough people are humble enough to admit when they were wrong, but this shows you think truth is more important than ratifying your political views.

    *Subscribed*
