ParaTweet: A Twitter Content Based Recommendation Engine


ParaTweet Application

ParaTweet Final Report

ParaTweet Final Presentation

ParaTweet Presentation in PDF Format

Midterm Project Report

Team Members and Roles

  • Rohit Turumella & Anthony Salgado – backend development, design of the database models, and ensuring a resilient application.
  • Sanketh Katta & Jamie Turley – front-end development of the product and its connection to the backend.

Name of Twitter Project Mentor

  • Shai Haim

Project Goals

  • The goal of ParaTweet is to create a new recommendation engine for Twitter users that suggests whom to follow based on the content they consume. Twitter has a large number of users, and a common complaint is that it is hard to find new accounts similar to the ones a user already follows. Although Twitter has a few features that suggest users to follow and offers curated lists for certain topics, the current approaches center on closing the links in one’s social graph (triadic closure) and on creating static lists for various topics. In conversations with our mentor, we came to realize that the current way Twitter recommends people to follow ignores an important facet of Twitter: one’s user timeline (which contains all the tweets of the people that one follows). Follow and list recommendation engines are extremely valuable to the Twitter ecosystem because they help generate growth, increase user engagement with the service, and reduce churn by keeping users engaged with new content that is relevant to them.
  • ParaTweet generates recommendations based on the content that a user consumes, which allows us to conduct textual analysis and get a more holistic understanding of the type of content that a user prefers. The recommendation engine functions as a web application, currently released to the public, that allows users to enter a Twitter username and get a list of personalized recommendations for users that that person should follow.
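
To make the idea of content-based recommendation concrete, here is a minimal sketch. It is not ParaTweet's actual algorithm, which this page does not specify. It assumes that the text a user consumes and each candidate account's tweets are already available as plain strings, and it swaps in a standard technique (TF-IDF weighting with cosine similarity) for the textual analysis. The names `tokenize`, `tfidf_vectors`, `cosine`, and `recommend` are hypothetical.

```python
import math
import re
from collections import Counter

USER_KEY = "__user__"  # hypothetical reserved key for the consumed text


def tokenize(text):
    """Lowercase the text and keep words, @mentions and #hashtags."""
    return re.findall(r"[#@]?\w+", text.lower())


def tfidf_vectors(docs):
    """Turn {name: text} into {name: sparse TF-IDF vector (dict)}."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())  # each term counted once per document
    vectors = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        vectors[name] = {
            term: (k / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, k in c.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(consumed_text, candidates, top_n=3):
    """Rank candidate accounts by similarity to the consumed content.

    consumed_text: all tweets the user reads, joined into one string.
    candidates: {handle: that account's tweets joined into one string}.
    """
    docs = dict(candidates)
    docs[USER_KEY] = consumed_text
    vectors = tfidf_vectors(docs)
    user_vec = vectors.pop(USER_KEY)
    scored = [(name, cosine(user_vec, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Calling `recommend("reading about neural networks", {"@ml_news": "...", "@chef": "..."})` returns a list of `(handle, score)` pairs, highest score first. A real system would likely also filter out accounts the user already follows and handle retweets, links, and stop words, none of which is attempted here.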

Project Timeline

  • November 9th:
    • Each team member will review documents relevant to the application and discuss their findings with the rest of the group.
  • November 13th:
    • Database schema well-defined.
    • Code in place to retrieve relevant information from Twitter. The database will then be populated by the results of the information retrieval code.
    • First pass at the UX mocked up, along with a simple front-end for verifying results.
    • Algorithm for determining similarity finalized, to be implemented in the second phase.
  • December 3rd:
    • First pass at similarity algorithm and UX implemented.
    • Prioritize and fix bugs in the application; fine-tune the algorithm if results are unsatisfactory.
  • December 3rd-10th (Testing):
    • Documentation for the application.
    • Application thoroughly tested with no obvious bugs. UX refined and finalized after gathering user feedback.

Literature Review

Please see the Midterm Project Report linked above for the complete review.

Measuring influence and social networking potential on Twitter has been discussed in various other papers as well as in numerous blogs and online media. Related scientific work on Twitter includes approaches which measure influence by not only taking followers and interactions into account, but also by analysing topical similarities with the help of a ranking method similar to PageRank [1].

Other approaches define different types of influence on Twitter, namely indegree, retweet and mention influence [2]. This specific paper concluded that each indicator leads to a different ranking of users and that indegree, i.e. the number of followers a user has, reveals little about the actual influence of a user. Retweet influence is strongly content-oriented, whereas a high mention influence suggests a high value of the user’s name.
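
As a rough illustration of the three indicators, the sketch below counts them on toy data. It is a simplification and not the methodology of [2]: the input shapes (a list of follower/followee pairs and a list of tweet dicts with an optional `retweet_of` field) and the function name `influence_measures` are assumptions made for this example, and retweets are deliberately excluded from the mention count.

```python
import re
from collections import Counter


def influence_measures(follows, tweets):
    """Return (indegree, retweets, mentions) as Counters keyed by username.

    follows: list of (follower, followee) pairs.
    tweets: list of dicts with 'author', 'text' and optional 'retweet_of'.
    """
    # Indegree influence: how many followers each user has.
    indegree = Counter(followee for _, followee in follows)

    # Retweet influence: how often a user's tweets are retweeted.
    retweets = Counter(t["retweet_of"] for t in tweets if t.get("retweet_of"))

    # Mention influence: how often a user's name appears in others' tweets.
    # Assumption: retweets are skipped so they are not counted twice.
    mentions = Counter()
    for t in tweets:
        if t.get("retweet_of"):
            continue
        for name in set(re.findall(r"@(\w+)", t["text"])):
            if name != t["author"]:
                mentions[name] += 1
    return indegree, retweets, mentions
```

Sorting each counter with `most_common()` gives one ranking per indicator, which makes it easy to see on real data whether the rankings disagree in the way the paper reports.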


[1] Weng, J.; Lim, E.-P.; Jiang, J.; and He, Q. 2010. TwitterRank: Finding Topic-Sensitive Influential Twitterers. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM).

[2] Cha, M.; Haddadi, H.; Benevenuto, F.; and Gummadi, K. 2010. Measuring User Influence in Twitter: The Million Follower Fallacy. In: Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM).

Accomplishments to Date

  • A running web application that takes in a Twitter handle and delivers recommendations for whom to follow based on the content that user consumes (their friends’ tweets).

Work Allocation

See the Midterm Project Report linked above.