Today Stan Nikolov, who just finished his master’s at MIT studying information diffusion in networks, walked us through one particular theoretical model of information diffusion, which tries to predict, based on a network’s structure, the conditions under which an idea stops spreading (from Easley and Kleinberg’s popular book, Networks, Crowds, and Markets). Stan also gathered a huge amount of Twitter data, processed it using Pig scripts, and graphed the results using Gephi. The video lecture below shows you some great visualizations of the spreading behavior of the data!
http://youtu.be/lbCmFZpMNxA
The slides in his Lecture Notes let you see the Pig scripts in more detail.
You can see the videos that Stan created on his blog.
For those who want the details before watching the video, this is a threshold-based model in which each person chooses to do A or B based on what their neighbors did. It is modeled as a coordination game: when two neighbors pick the same thing, they each get a payoff, so a person switches once a large enough fraction of their neighbors has switched. Even though the spread of topics on Twitter is not quite the same kind of coordination game, Stan tells us the threshold model is very popular independent of the game-theoretic justification.
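To make the threshold rule concrete, here is a minimal Python sketch. It assumes the usual two-option payoff setup: two neighbors who both pick A each get payoff `a`, two who both pick B each get `b`, and a mismatch pays nothing, so a node switches to A once at least a fraction q = b / (a + b) of its neighbors have. The function names, the adjacency-dict graph format, and the toy path graph are all made up for illustration; this is not Stan's code.

```python
def threshold(a, b):
    """Fraction of neighbors that must choose A before switching to A pays off.

    Assumes payoff a when both neighbors pick A, b when both pick B, 0 otherwise.
    """
    return b / (a + b)


def run_cascade(neighbors, seeds, q):
    """Spread A from the seed nodes until no further node wants to switch.

    neighbors: dict mapping each node to a list of its neighbors (hypothetical format).
    seeds: nodes that start out with A and never switch back.
    q: adoption threshold, e.g. from threshold(a, b).
    """
    adopters = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in adopters or not nbrs:
                continue
            # Fraction of this node's neighbors that have already adopted A.
            frac = sum(n in adopters for n in nbrs) / len(nbrs)
            if frac >= q:
                adopters.add(node)
                changed = True
    return adopters


# Toy example: a path 1 - 2 - 3 - 4 with node 1 as the only initial adopter.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
full = run_cascade(path, {1}, q=0.5)     # every node ends up adopting
stalled = run_cascade(path, {1}, q=0.6)  # node 2 never reaches the threshold
```

Because adoption only ever grows in this model, the order in which nodes are visited does not change the final set of adopters.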
Easley and Kleinberg ask what it is about the structure of the network that causes something to keep spreading or stop spreading. They prove that sufficiently dense clusters, defined in terms of cluster density, stop a cascade, and that, in fact, such clusters are the only thing that can stop it.
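Here is a small Python sketch of the cluster idea, under the definition I understand the book to use: a set of nodes is a cluster of density p if every node in it has at least a fraction p of its neighbors inside the set, and with adoption threshold q a cascade cannot get into a cluster whose density is greater than 1 - q. The function names, the graph format, and the two-triangle example are hypothetical, chosen just to illustrate the condition.

```python
def cluster_density(neighbors, cluster):
    """Smallest fraction of in-cluster neighbors over all nodes in the cluster.

    neighbors: dict mapping each node to a list of its neighbors (hypothetical format).
    Assumes every node in the cluster has at least one neighbor.
    """
    cluster = set(cluster)
    return min(
        sum(n in cluster for n in neighbors[v]) / len(neighbors[v])
        for v in cluster
    )


def blocks_cascade(neighbors, cluster, q):
    """True if the cluster is dense enough (density > 1 - q) to keep a cascade out."""
    return cluster_density(neighbors, cluster) > 1 - q


# Two triangles, {1, 2, 3} and {4, 5, 6}, joined by the single edge 3 - 4.
two_triangles = {
    1: [2, 3],
    2: [1, 3],
    3: [1, 2, 4],
    4: [3, 5, 6],
    5: [4, 6],
    6: [4, 5],
}

# Node 4 has 2 of its 3 neighbors inside {4, 5, 6}; nodes 5 and 6 have all of theirs.
density = cluster_density(two_triangles, {4, 5, 6})
holds_out = blocks_cascade(two_triangles, {4, 5, 6}, q=0.5)
gives_way = blocks_cascade(two_triangles, {4, 5, 6}, q=0.3)
```

With q = 0.5 the right-hand triangle is dense enough to hold out against a cascade started in the left-hand one; with q = 0.3 it is not, since node 4 needs only one adopting neighbor out of three.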
For his data, Stan collected some trending hashtags (the type that are memes or word games, like #ThingsYouSayToYourBestFriend, not the type that are events, like #debates), and recorded how they spread before each hashtag actually became a trending topic (so that the dominant mode of spreading is from person to person, not from some exogenous event or from the hashtag being on the trending topics list).
Information diffusion in networks is a really difficult topic to work on empirically, so Stan, thank you so much for this terrific work!