Alpa Jain has great experience teaching from her time as a graduate student at Columbia University, and it shows in the clarity of her descriptions of SVD and other recommendation algorithms in today’s lecture:
When I asked at the end of class how many students have used one of the recommended Twitter links, nearly everyone raised their hands, so Alpa is clearly doing her job.
If you follow the latest developments in the Big Data technology stack, you’ll know that GraphLab is the hottest technology for processing huge graphs quickly. We got to hear the algorithms behind GraphLab 2 even before the OSDI crowd! Check it out:
Perhaps the best news is that there is a new version called GraphChi (for chihuahua) that you can run on your personal computer, so you don’t even need access to EC2 to run it going forward. Slides here.
Learn about weak ties, triadic closures, and personal pagerank, and how they all relate to the Twitter social graph from Aneesh Sharma:
I especially enjoyed his fascinating and clear explanation of the Watts-Strogatz model, its link to Kleinberg’s model, and how together they account for Milgram’s six degrees of separation phenomenon. Thank you, Aneesh! Slides here.
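If you want to see the small-world effect for yourself, here is a minimal sketch of the Watts-Strogatz construction in plain Python. It is not taken from Aneesh's slides; the function names, the graph size (200 nodes, 4 neighbours each), the rewiring probability of 0.1, and the fixed random seed are all assumptions chosen for illustration. The idea is to start from a ring where everyone knows only their nearest neighbours, rewire a small fraction of edges to random targets, and compare the average shortest-path length before and after.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, seed=0):
    """Build a ring lattice of n nodes, each linked to its k nearest
    neighbours (k even), then rewire each lattice edge with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular ring lattice: connect i to the k/2 nodes on each side.
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    # Visit each original lattice edge (i, i + j) once and maybe rewire it.
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < p and old in adj[i]:
                # New endpoint must avoid self-loops and duplicate edges.
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = 0
    pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = watts_strogatz(200, 4, 0.0)
small_world = watts_strogatz(200, 4, 0.1)
print("ring lattice:", average_path_length(lattice))
print("10% rewired: ", average_path_length(small_world))
```

With no rewiring, the average distance grows linearly with the size of the ring. A handful of random long-range shortcuts should cut it sharply, which is the intuition behind short chains of acquaintances in large social graphs.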
Today we learned about an alternative software architecture for processing large data, getting the technical details from Splunk’s VP of Engineering, Stephen Sorkin. Splunk also has a really amazing GUI for analyzing Twitter and other data sources in real time; be sure to watch the last 15 minutes of the video to see the demo:
The great news is that you can download and try out the demo yourself for free! Thanks for a very technical, well-paced lecture, Stephen! And congratulations to Archana Ganapathi, who, as a new mother, couldn’t make it this time!