There's a word cloud about halfway down the post that shows words frequently used in comments on "rumor" posts. Words that are darkly shaded are associated with true rumors, and words that are lightly shaded are associated with false rumors.
I've never been into word clouds as a data visualization tool, but if you do use a word cloud, and you're using color to communicate something relevant about the data, PLEASE do not use a monochrome gray gradient, as this post has done. It's really difficult to tell whether "government" is slightly darker or lighter than "certainly," for instance.
A blue-red gradient would have worked a lot better in this case, in my humble opinion.
I don't like word clouds either (and I'm one of the authors!). You'll notice there is no cloud in the actual paper, just in the blog post.
Do you have any references about human perception of blue-red versus monochrome gradients? Would be interested to have good recommendations for such cases.
I'm not GP, but the problem, in my colorblind opinion, is not the hue of the gradient. The extremes of your gradient are way too close together. Nothing really stands out. At least with blue/red, you can make it one extreme or the other (as opposed to white or black...on a white background).
That said, once they got to purple, I'd barely be able to distinguish them, anyway.
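For what it's worth, here is a minimal sketch of the kind of diverging scale being suggested: two well-separated end colors, with a neutral light gray in the middle instead of purple, so midpoint words fade rather than blend. The specific RGB values, the `diverging_color` name, and the convention that -1 means "false rumor" and +1 means "true rumor" are all assumptions for illustration, not anything taken from the post.

```python
# Hypothetical end colors; any well-separated pair would do.
BLUE = (33, 102, 172)    # strongly associated with false rumors (assumed convention)
GRAY = (200, 200, 200)   # neutral midpoint, still visible on a white background
RED = (178, 24, 43)      # strongly associated with true rumors (assumed convention)


def lerp(a, b, t):
    """Linearly interpolate between integers a and b for t in [0, 1]."""
    return round(a + (b - a) * t)


def diverging_color(score):
    """Map a score in [-1, 1] to a hex color on a blue-gray-red scale."""
    score = max(-1.0, min(1.0, score))  # clamp out-of-range scores
    if score < 0:
        start, end, t = GRAY, BLUE, -score
    else:
        start, end, t = GRAY, RED, score
    r, g, b = (lerp(s, e, t) for s, e in zip(start, end))
    return f"#{r:02x}{g:02x}{b:02x}"


if __name__ == "__main__":
    # Made-up scores, only to show the call.
    for word, score in [("certainly", -0.8), ("government", 0.1), ("source", 0.9)]:
        print(word, diverging_color(score))
```

Whether this is actually easier to read than the gray gradient for a colorblind viewer would still need checking; it only fixes the "extremes too close together" part.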
My first thought when they mentioned it was the optical illusion where the color looks different because of the colors around it[0]. Maybe having distinct colors (red/blue) would help with that.
Regarding the last graph, the observed data looks like it matches a log-normal distribution, not a power law... Is this correct? Not sure how to interpret it, though. How did you compute your estimation?
I think log-normal distributions usually fit this kind of data as well as power laws do. Anyway, such a plot is not the best way to choose between those two models (see http://arxiv.org/abs/0706.1062). We don't actually do any such test in the paper though (or say that it is a power law or log-normal). You might want to check out the paper for more details. https://www.facebook.com/publications/244240069095667/
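To make "choose between those two models" a bit more concrete, here is a rough sketch of fitting both by maximum likelihood and comparing log-likelihoods instead of eyeballing a plot. It is a simplification and not what the paper does (the paper runs no such test): it fixes x_min at the smallest observation rather than estimating it, does not truncate the log-normal at x_min, and skips the significance test on the likelihood ratio that the arXiv reference describes. The function names are made up for this example.

```python
import math
import random


def fit_power_law(data):
    """MLE for a continuous power law p(x) ~ x^(-alpha) on x >= xmin.

    Simplifying assumption: xmin is taken to be min(data).
    Returns (alpha, log_likelihood).
    """
    xmin = min(data)
    logs = [math.log(x / xmin) for x in data]
    n = len(data)
    alpha = 1.0 + n / sum(logs)
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * sum(logs)
    return alpha, loglik


def fit_log_normal(data):
    """MLE for an (untruncated) log-normal. Returns (mu, sigma, log_likelihood)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    sigma = math.sqrt(var)
    loglik = sum(
        -v - math.log(sigma) - 0.5 * math.log(2 * math.pi) - (v - mu) ** 2 / (2 * var)
        for v in logs
    )
    return mu, sigma, loglik


if __name__ == "__main__":
    # Synthetic data only; these are not the cascade sizes from the post.
    random.seed(0)
    sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
    alpha, ll_pl = fit_power_law(sample)
    mu, sigma, ll_ln = fit_log_normal(sample)
    print("power law log-likelihood:", ll_pl)
    print("log-normal log-likelihood:", ll_ln)
```

A higher log-likelihood only says which model fits better on this sample; on real heavy-tailed data the two are often close enough that the difference is not significant, which is the point made above.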
It was hard for me to find again, so I'll leave this here if anyone else is interested. The psychology behind a lot of how rumors form and perpetuate falls under herd behavior, availability cascades and information cascades. Pretty much all of it is based on a group of people deciding that something they all agree on is important, such as a significant threat, or a particularly good outcome. Everything from public policy to stock market prices to internet memes and media scandals seems to be decided this way.
This is really interesting, and I remember reading something similar a while ago and having the same thought as I had today. I wonder how this would look if they segmented the data by education levels.
When I look through my social media news feed (or read BuzzFeed posts on the hilarious dumb things that have been said on the Internet), I see a very big difference in the things that are posted by those who are college-educated vs. those who are not. Certainly this is not a guarantee. I'm sure I've shared stuff that has been fake, and even reputable news agencies make mistakes. But to me the data would be a[nother] compelling argument for better and more access to education--to stop the damn rumors! (My real goal is to put Snopes out of business.)
Very interesting indeed. I think it raises many more questions than it answers about human behavior and the infectiousness of rumors in social media. I'd be particularly interested to see an analysis of whether sharing a false rumor has any effect on the reshare rate of a user's future posts. In other words: is a user's influence or perceived reliability (as measured by the relative rate of reshares of his/her future posts) diminished following the initial share of a false rumor? Reduced reshare rates could be a positive reflection of an increasingly skeptical and better-informed user community. Consistent reshare rates would be...a less optimistic sign.
edit: also, in light of "A Batesian Mimicry Explanation of Business Cycles" https://news.ycombinator.com/item?id=7634628 , could this be a good basis for a bubble investment model?
Seems like it would be easy to follow the identified Snopes links and parse the true or false verdict they give, in order to put a "verified" or "debunked" link (pointing to Snopes) with every share. Now that would be useful. All this data mining is interesting, but it just tells us what we already know (people share lots of rumors that are frequently false). Why not do something about it?
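A rough sketch of the labeling half of that idea, assuming the verdicts have already been collected somehow. The `annotate_share` function, the `verdicts` dictionary, and the example URL and labels are all hypothetical; nothing here fetches or parses Snopes pages, and it is not how Facebook does anything.

```python
import re

# Matches links to snopes.com in free text; stops at whitespace or a closing bracket.
SNOPES_LINK = re.compile(r"https?://(?:www\.)?snopes\.com/[^\s)>\]]+", re.IGNORECASE)


def annotate_share(comment_text, verdicts):
    """Return (url, label) pairs for every Snopes link found in a comment.

    `verdicts` is a hypothetical lookup from URL to a label such as
    "verified" or "debunked"; links missing from it are marked "unverified".
    """
    labels = []
    for url in SNOPES_LINK.findall(comment_text):
        url = url.rstrip(".,;!?")  # drop sentence punctuation glued to the link
        labels.append((url, verdicts.get(url, "unverified")))
    return labels


if __name__ == "__main__":
    # Made-up URL and verdict, for illustration only.
    verdicts = {"http://www.snopes.com/example/rumor-a.asp": "debunked"}
    text = "This one is fake, see http://www.snopes.com/example/rumor-a.asp."
    print(annotate_share(text, verdicts))
```

The hard part is the one this skips: reliably extracting the verdict from each page and deciding which share a given comment's link actually refers to.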
A lot of rumors are the result of poor understanding or misinterpretation. For example, there was a big rumor a year or two back that the US government was buying hundreds of millions of rounds of ammunition, sparking fears of everything from a conspiracy to drive up prices to an imminent imposition of martial law. The basis of the rumor was a government solicitation to bid on ammunition pricing, so that the government could lock in its ammunition purchases for the next few years at a fixed and predictable price, rather than being subject to spot price fluctuations which would make budgeting more difficult and possibly result in higher costs. However, to understand that the government was looking for (essentially) a call option required a basic knowledge of finance and a fairly high level of reading ability to deal with the 'officialese'. It's not surprising that many people misinterpreted a request for a pricing guarantee as a solicitation for actual supply. There was also a misunderstanding of how much ammunition the government actually uses, with people who cheerfully fire off hundreds of rounds during their own practice sessions overlooking the fact that federal employees who carry firearms also have to take part in training and practice sessions where they expend large quantities of ammunition. More here: http://www.snopes.com/politics/guns/ssabullets.asp
Trolls shouldn't need much explanation; a tradition of spreading outlandish or silly rumors is as old as the hills, and for trolls every day is April 1st.
Then you have people with a vested interest in spreading rumors of one sort or another, often for political ends. Negative rumors about 'Obamacare' have been widespread in recent years for obvious reasons; likewise, people on the left often expressed their animus towards the previous Republican administration by making up negative stories reflecting their view of how that administration would behave, or what would motivate them in the case of some otherwise random-seeming fact.
There are a great many people who are not especially concerned with truth, but with shaping people's behavior in order to bring about a certain result - whether that is getting people to buy a product or a book, or to change the price of a financial asset, or to create a more favorable political environment etc.
Sometimes, it is with no malice or untruths intended. It is amazing how a story can get screwed up with the telling even without the exaggerations. I think every English class in the country has done the "tell your neighbor" exercise that results in a messed up phrase.
A lot of myths started with a grain of truth. "He left it in the woods chained to a tree" -> "He went away, and left it in the woods chained to a tree" -> "He went away, I think to war, and left it in the woods chained to a tree" -> etc.
Facebook operates at a different intellect level to HN.
A friend spends a lot of time on Facebook and, whenever she wants to tell me some fact or a joke, I always ask 'is this a Facebook joke/fact?'. Clearly if it is a 'Facebook joke' I won't find it funny, I will just have to suffer boredom, hence I ask first so I can be prepared for some soul-destroying trivial nonsense.
In social networks there is a sort of recommendation from a personal friend that comes with stuff. So, some mis-fact-rumour just has to get past that threshold of convincing-ness; thereafter it can go viral.
Some rumours start as one story that gets misinterpreted and then spreads, and there is nothing that the original source can do about it. We all do it: '640K RAM, enough', 'Al Gore, invented the interwebs' and so on...
I particularly enjoyed the one about the rare calendar. You'd think the act of scratching off a month from last year and replacing it with one from this year would be a Clue(TM) that this is not that rare an event....