Clustering Communities in the Twitter Network

Team Members:
Nikhil Chadda
Anupam Prakash
Achal Soni

Project Summary:
The Twitter social network was modeled as an unbalanced bipartite graph whose two partitions are a small set of influential kernel users and the large communities of people who follow them. Influential kernel users were identified using the TrustRank API from Infochimps. Communities in the ego nets of influential Twitter users were found using a PageRank-based local partitioning algorithm together with heuristics for finding dense clusters. Communities discovered in the ego nets of a random sample of kernel users were validated against rare words extracted from user tweets and were found to correspond to diverse real-world topics.
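
To illustrate what a PageRank-based local partitioning step can look like, here is a minimal sketch. It assumes the common push-style approximate personalized PageRank followed by a conductance sweep cut; the summary does not say which variant the project used, so the function names (approximate_ppr, sweep_cut), the parameters alpha and eps, and the toy graph are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximate_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Push-based approximate personalized PageRank (lazy walk) from `seed`.

    adj: dict mapping node -> set of neighbours (undirected, no self-loops).
    alpha and eps are illustrative defaults, not values from the project.
    """
    p = defaultdict(float)   # approximate PageRank mass
    r = defaultdict(float)   # residual mass still to be distributed
    r[seed] = 1.0
    stack = [seed]
    while stack:
        u = stack.pop()
        deg = len(adj[u])
        if deg == 0 or r[u] < eps * deg:
            continue  # residual too small to be worth pushing
        ru = r[u]
        p[u] += alpha * ru                  # keep an alpha fraction at u
        r[u] = (1 - alpha) * ru / 2         # lazy walk: half stays put
        share = (1 - alpha) * ru / (2 * deg)
        for v in adj[u]:
            r[v] += share                   # the other half spreads out
            if r[v] >= eps * len(adj[v]):
                stack.append(v)
        if r[u] >= eps * deg:
            stack.append(u)
    return dict(p)


def sweep_cut(adj, p):
    """Return the prefix of nodes (ordered by p[u]/deg(u)) with lowest conductance."""
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    order = sorted(p, key=lambda u: p[u] / len(adj[u]), reverse=True)
    in_set, vol, cut = set(), 0, 0
    best_set, best_phi = set(), float("inf")
    for u in order:
        in_set.add(u)
        vol += len(adj[u])
        inside = sum(1 for v in adj[u] if v in in_set)
        cut += len(adj[u]) - 2 * inside     # new boundary edges minus absorbed ones
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break                           # prefix covers the whole graph
        phi = cut / denom
        if phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_set, best_phi


# Toy graph: two 4-cliques joined by the single edge 3-4.
adj = defaultdict(set)
for clique in ([0, 1, 2, 3], [4, 5, 6, 7]):
    for a in clique:
        for b in clique:
            if a != b:
                adj[a].add(b)
adj[3].add(4)
adj[4].add(3)

community, phi = sweep_cut(adj, approximate_ppr(adj, seed=0))
print(sorted(community), round(phi, 4))
```

On this toy graph the sweep recovers the clique containing the seed. In an ego-net setting, the seed would be a follower inside the kernel user's ego net, and the dense-cluster heuristics mentioned above would be applied separately.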