Word Clouds

If you’re interested in legitimate applications of word clouds for getting the gist of a document, take a look at last year’s blog post titled Using Word Clouds for Topic Modeling Results. It provides, if you will, a visual representation of TF-IDF. The author goes into some interesting detail defending his use of word clouds despite the prevailing criticism. Make sure to read the comments at the bottom, too!
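If TF-IDF is new to you, here is a minimal Python sketch of the idea: a term scores high when it is frequent in one document but rare across the collection, and that score is what would drive the font size in such a word cloud. This is not the code from the blog post. The function name tf_idf, the whitespace tokenizer, the unsmoothed log(N / df) weighting, and the three toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenizer: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
weights = tf_idf(docs)

# Terms of the first document, heaviest first.
top = sorted(weights[0].items(), key=lambda kv: kv[1], reverse=True)
```

In this toy corpus “the” appears in every document, so its weight is zero and it would vanish from the cloud, while a word that is unique to one document ends up largest.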

Parallel Coordinates

Unfortunately, I wasn’t able to show you the Parallel Coordinates example in class. Parallel Coordinates are a great technique for exploring relationships between (seemingly) unrelated dimensions of a multivariate data set.

Take a look at http://exposedata.com/parallel, which was made by Kai Chang, who’s local to the SF Bay Area. Take a second to familiarize yourself with the user interface. You can reorder the parallel axes to explore the relationships between different pairs of adjacent dimensions. You can also filter each axis down to a range of values, which reduces the number of lines drawn.
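To make the mechanics a bit more concrete, here is a small Python sketch of what a parallel coordinates view does with the data: each dimension is scaled onto its own axis, each row becomes one line across the axes, reordering the axes changes which dimensions sit next to each other, and filtering an axis drops rows. This is not how the tool linked above is implemented. The function names normalize, polylines, and brush, and the three car records, are made up for illustration.

```python
def normalize(rows, dims):
    """Scale every dimension to [0, 1] so that all axes share one height."""
    lo = {d: min(r[d] for r in rows) for d in dims}
    hi = {d: max(r[d] for r in rows) for d in dims}
    return [
        {
            # A constant dimension is drawn at mid-height to avoid dividing by zero.
            d: (r[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.5
            for d in dims
        }
        for r in rows
    ]


def polylines(rows, axis_order):
    """One line per row: a list of (axis_index, height) points."""
    scaled = normalize(rows, axis_order)
    return [[(i, s[d]) for i, d in enumerate(axis_order)] for s in scaled]


def brush(rows, dim, low, high):
    """Keep only the rows whose value on `dim` lies in [low, high]."""
    return [r for r in rows if low <= r[dim] <= high]


# Hypothetical records, for illustration only.
cars = [
    {"mpg": 18.0, "cylinders": 8, "weight": 3504},
    {"mpg": 32.0, "cylinders": 4, "weight": 2130},
    {"mpg": 25.0, "cylinders": 4, "weight": 2500},
]

# Putting mpg next to weight: the lines of the heaviest and lightest cars cross.
lines = polylines(cars, ["mpg", "weight"])

# Filtering the cylinders axis to 4 leaves fewer lines to draw.
four_cylinder = brush(cars, "cylinders", 4, 4)
```

Lines that cross between two neighboring axes hint at an inverse relationship, which is why moving axes next to each other is worth trying.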

If you would like more information, here is a good blog post on parallel coordinates. On a related note, Robert Kosara’s blog eagereyes.org is one of the best blogs that takes information visualization seriously without appearing too academic.

Tableau

As promised, here is some information about Tableau:

[unordered_list style=”tick”]

  • Download Tableau 7.0 from: http://www.tableausoftware.com/tft/activation
  • On the landing page at the link above, fill out the form on the right-hand side. Under “Job Title”, select Student; under “Organization”, enter “UC Berkeley School of Information”.
  • License Key: [highlight]Check your email[/highlight]

[/unordered_list]

You need to run Windows in order to use Tableau. If you own a Mac, you can get all the software needed to do this for free:

[unordered_list style=”tick”]

[/unordered_list]

In case you run into issues with your installation, [highlight]don’t hesitate to add a comment to this post[/highlight] so that anyone can help you out.

Lab 1 – Data: Preparations

To make sure we’re not wasting time installing applications during the lecture, please install the following application before you come to the lab:
[unordered_list style=”tick”]

  • OpenRefine (formerly known as Google Refine)
    If you work on a Mac with Mountain Lion and run into problems installing the app, the culprit is most likely the download restriction for apps from unknown developers. To run OpenRefine you’ll have to temporarily disable that protection (as described in this issue ticket).

[/unordered_list]
We’re also going to use the Google Spreadsheet app (part of Google Drive). In the unlikely case you haven’t already signed up for a Google account, please make sure you do so before the lecture.

If you run into any other problems installing the applications, please comment on this post so that others can avoid going through the same issues.

Class Mailing List

If you haven’t already, please sign up for the class mailing list:

I School Students

If you’re an I School student: subscribe by sending an email to Majordomo@ischool.berkeley.edu with the following command in the body of your message:

subscribe i247

Make sure to use your I School email. Alternatively, you can subscribe via your intranet account (https://www.ischool.berkeley.edu/intranet/prefs/lists).

Students From Other Departments

If you’re a student from another department: send an email to Galen (gpanger@ischool) and he will add you to the list.

Starting next week (that is, after the next class session) the mailing list is going to be where we send out reminders and announcements.