Archive for the ‘AJAX’ Category

Working with the Flickr API

Friday, November 12th, 2010

Flickr provides exemplary tools and documentation for their popular API and is also an excellent case study for social classification in the wild, so it’s worth taking a little time to understand their API.

  1. The App Garden is Flickr’s main API documentation page and the best place to start. From here you can “Create an App” (and get an API key), read articles on general topics (the Overview, REST Request Format and JSON Response Format are particularly useful for us), and see an exhaustive (and exhausting) list of methods that the API provides.
  2. Pick a method from that list that sounds like it might work for your purposes. For this example, we’ll look at flickr.tags.getListUserPopular, but any method in the Tags section is likely to be helpful. Skim through the documentation to make sure this looks like the right thing for you.
  3. Test the method using Flickr’s API Explorer tool (link for this method). If you’re logged in to your Flickr account, Flickr will even provide some sample user and photo IDs to fill in as parameters. I like to fill in the White House Flickr account ID (35591378@N03), select the “Do not sign call?” option and then click “Call Method…” to see the actual results in the box below. Flickr also constructs the full request URL for you.
  4. Inspect the results by copying and pasting the URL into a new browser window. View source in your browser to see the structure of the actual response in XML.
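
The request URL that the API Explorer builds in step 3 can also be assembled by hand. This is only a minimal sketch: the endpoint address and the api_key and user_id parameter names are assumptions based on Flickr's REST request format, buildFlickrUrl is a hypothetical helper, and 'YOUR_API_KEY' is a placeholder. Compare the result with the URL the Explorer shows you.

```javascript
// Assumed REST endpoint; confirm against the URL the API Explorer generates.
const FLICKR_REST_ENDPOINT = 'https://api.flickr.com/services/rest/';

// Hypothetical helper: build an unsigned request URL for a Flickr API method.
function buildFlickrUrl(method, apiKey, params) {
  const query = new URLSearchParams({ method: method, api_key: apiKey });
  Object.keys(params || {}).forEach(function (name) {
    query.append(name, params[name]);
  });
  return FLICKR_REST_ENDPOINT + '?' + query.toString();
}

// 'YOUR_API_KEY' is a placeholder; the user ID is the White House account
// mentioned in step 3.
const tagsUrl = buildFlickrUrl('flickr.tags.getListUserPopular', 'YOUR_API_KEY', {
  user_id: '35591378@N03'
});
console.log(tagsUrl);
```

URLSearchParams percent-encodes the @ in the user ID, which is why it appears as %40 in the printed URL.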

Once you have this URL, you have a couple of options. From Python (in a GAE app, say), you can urlfetch this content, use BeautifulSoup to parse the results and then store or analyze this data. Or you can access this URL from the client side using jQuery. If you want to take that approach, there are a couple of additional steps you’ll want to keep in mind.

  1. Specify the JSON response format, which you can find documented here. You’ll want to add &format=json to the end of your URL.
  2. Access the API using JSONP. If you’re accessing the Flickr API from a standalone web page (rather than from a Chrome extension, say), you’ll need to use jQuery’s $.getJSON method with the ?callback=? option. One quirk of the Flickr API: the parameter must be named jsoncallback rather than callback, so use jsoncallback=? instead.
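
The two steps above might look roughly like this in code. Treat it as a sketch under assumptions: toFlickrJsonpUrl and extractTags are hypothetical helpers, and the who.tags.tag layout with a _content field is my guess at how Flickr maps its XML response to JSON, so check it against a real response before relying on it.

```javascript
// Turn a Flickr REST URL into one jQuery can use for JSONP.
// The trailing '=?' must stay unencoded: jQuery replaces that '?' with the
// name of a callback function it generates.
function toFlickrJsonpUrl(restUrl) {
  const separator = restUrl.indexOf('?') === -1 ? '?' : '&';
  return restUrl + separator + 'format=json&jsoncallback=?';
}

// Pull { tag, count } pairs out of a getListUserPopular response.
// Assumed response shape: { stat: 'ok', who: { tags: { tag: [ ... ] } } }.
function extractTags(response) {
  if (!response || response.stat !== 'ok') {
    return [];
  }
  return response.who.tags.tag.map(function (entry) {
    return { tag: entry._content, count: Number(entry.count) };
  });
}

// In the browser, with jQuery loaded, the two pieces fit together like this:
// $.getJSON(toFlickrJsonpUrl(restUrl), function (json) {
//   console.log(extractTags(json));
// });
```

Keeping the parsing in its own function means you can try it on a response pasted from the API Explorer without making a network request.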

Once you’ve made it through all of these steps, you should be able to pull in data from Flickr to use in your language of choice. If you use getListUserPopular, you can construct a graph in Protovis to see the distribution of tags by White House photographers. Please forgive the rudimentary aesthetics of these graphs; I’m still learning Protovis myself.

Linear scale visualization of flickr tags
(The linear scale shows the clear outliers of DC and USA; switching to a log scale makes the rest of the data visible.)
Log scale visualization of flickr tags
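
To see why the log scale helps, here is a small sketch of the two mappings from a tag count to a bar length. The counts are made up for illustration (they are not the White House data), and linearScale and logScale are hypothetical helpers, not code from the repository sample.

```javascript
// Map a count to a bar length in pixels on a linear scale from 0 to max.
function linearScale(count, max, pixels) {
  return (count / max) * pixels;
}

// Same mapping on a log scale from 1 to max, so small counts stay visible.
function logScale(count, max, pixels) {
  return (Math.log(count) / Math.log(max)) * pixels;
}

// Made-up counts for illustration only: two outliers and a long tail.
const counts = [1000, 800, 10, 5, 2];
const max = Math.max.apply(null, counts);

counts.forEach(function (count) {
  console.log(
    count,
    linearScale(count, max, 400).toFixed(1),
    logScale(count, max, 400).toFixed(1)
  );
});
```

On the linear scale a count of 10 gets only 4 of the 400 pixels, while on the log scale it gets about a third of them. Protovis provides its own scale helpers (pv.Scale.linear and pv.Scale.log), so in a real chart you would not normally compute this by hand.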

This (short) code sample is available in the repository, so you can see code for accessing Flickr, parsing the response, and visualizing the data in Protovis. (If you make improvements to the visualization, feel free to commit your updates!) The sample uses my Flickr API key, so if you’re going to use this for anything beyond exploratory testing, please create and use your own key.

Working around the same-origin policy

Sunday, October 10th, 2010

As part of the basic security model of the Web, sites usually can’t make requests to pages on other domains. If they could, then visiting any random site on the Web shortly after logging in to your email could reveal the entire contents of your email to an attacker! In class we briefly mentioned three ways to work around the same-origin policy:

  1. Use a server-side proxy.
  2. Make a JSONP request to a server that supports it.
  3. Make a request from a privileged context (like in a Chrome Extension).

This blog post will cover generic uses of the first two methods. If you’re creating a standalone page for your projects, you’ll either need to use APIs that already support JSONP, or use our provided proxy to call an existing API for you and wrap it in JSONP. Ryan’s walkthrough tutorial uses a version of this technique for posting new bookmarks to Delicious from a standalone page, but here we’ll access any generic JSON API.

The internal details aren’t vitally important, but in case you’re curious: JSONP works by adding a new <script> element to the page, where the contents of that <script> element just happen to be (no, I’m kidding, it’s not the least bit coincidental) a call to a function whose name you chose, with a single argument: the response from the API. To take advantage of this in jQuery (which, as usual, does all the hard work for you), just use the $.getJSON() function and include the special callback parameter '?callback=?' in the URL.
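
If it helps to see those mechanics without a browser, the sketch below imitates both halves. It is an illustration only: wrapJsonp and runJsonp are hypothetical names, the payload is made up, and new Function stands in for what the browser does when it executes a loaded <script> element.

```javascript
// What a JSONP-capable server sends back: a script that calls the function
// whose name the page supplied in the callback parameter.
function wrapJsonp(callbackName, data) {
  return callbackName + '(' + JSON.stringify(data) + ');';
}

// Stand-in for the browser running the loaded <script>: execute the text
// with the callback available under the expected name.
function runJsonp(scriptText, callbackName, callback) {
  const run = new Function(callbackName, scriptText);
  run(callback);
}

let received = null;

// Made-up payload; a real API decides what goes in here.
const scriptText = wrapJsonp('handleResponse', { results: ['example'] });

runJsonp(scriptText, 'handleResponse', function (json) {
  received = json;
});

console.log(scriptText);
console.log(received.results);
```

jQuery’s job is the part this sketch skips: inventing a unique callback name, defining that function on the page, adding the <script> element and cleaning up afterwards.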

The New York Times API, for example, doesn’t support JSONP. To use it, we instead make a JSONP call to a proxy on our own Berkeley servers; that proxy makes the call to the New York Times and then responds using the JSONP callback convention described above. We just need to pass the URL of the New York Times API and all the query parameters as parameters to the proxy.

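// Read the search term from the page's search box.
var query = $('#search').val();

var apiKey = 'yournytimesapikey';
var proxyUrl = 'http://courses.ischool.berkeley.edu/path/to/proxy.php';

// '?callback=?' makes this a JSONP request. The proxy calls the API named
// by "url" and passes the remaining parameters along to it.
$.getJSON(proxyUrl + '?callback=?', {
    'url': 'http://api.nytimes.com/svc/timestags/suggest',
    'query': query,
    'api-key': apiKey
}, function (json) {
    console.log(json.results);
});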

The full sample page is in the iolab10 repository, including the full path to the PHP proxy that we’re running and even a sample NYTimes API key. We’ve also made the PHP code for the proxy available in case you want to inspect it or modify it for your own purposes. (If you install and run the proxy on the same domain that your page runs on, you don’t even need to use JSONP!)
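
For a rough idea of what the proxy does with the parameters it receives, here is a sketch of the forwarding step. It is written in JavaScript for consistency with the sample above, not PHP, and it is not the actual proxy code: buildUpstreamUrl is a hypothetical function, and treating only url and callback as proxy-only parameters is an assumption.

```javascript
// Given the query parameters the proxy received, work out the URL it should
// fetch. 'url' and 'callback' are meant for the proxy itself (an assumption);
// everything else is forwarded to the target API.
function buildUpstreamUrl(received) {
  const forwarded = new URLSearchParams();
  Object.keys(received).forEach(function (name) {
    if (name !== 'url' && name !== 'callback') {
      forwarded.append(name, received[name]);
    }
  });
  const query = forwarded.toString();
  return query ? received.url + '?' + query : received.url;
}

// Parameters as the proxy might see them for the request above; the callback
// name and search term are made up.
const upstream = buildUpstreamUrl({
  url: 'http://api.nytimes.com/svc/timestags/suggest',
  query: 'example',
  'api-key': 'yournytimesapikey',
  callback: 'jsonp123'
});
console.log(upstream);
```

After fetching that URL, the proxy would wrap the JSON it gets back in a call to the requested callback, as described earlier.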

A couple of caveats:

  1. If you’re using JSONP from a content script in a Chrome Extension (or from a Greasemonkey extension, for that matter), you’ll receive an error that a function named json123456789 (or some similar nonsense name) doesn’t exist. This is because jQuery created the callback function in its own sandboxed area, but when the script was inserted into the page it called a function in the original window context. To work around this, cross-domain requests in Chrome Extensions shouldn’t use '?callback=?' and should be made from the background page or a pop-up page with cross-domain permissions declared in the manifest. (For more detail, see the relevant Chrome Extensions documentation.)
  2. When you pass a URL to the proxy as the ?url= parameter, the URL itself shouldn’t include the parameters you’re passing on to the API; those parameters should just be additional parameters to the proxy page. In practice, this means that you should use an ampersand rather than a question mark before the first parameter to the API. For example, call proxy.php?url=http://example.com&param1=value1&param2=value2 rather than proxy.php?url=http://example.com?param1=value1&param2=value2.
  3. There are potential security implications here that we haven’t gone into yet. Requests through a PHP proxy on Berkeley servers may be logged by Berkeley, which you might not want (particularly if you’re passing a key, password or other secret data). And if the address for your (or our) proxy becomes widely known, it could be abused by others for denial-of-service or other malicious purposes.
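
To make the second caveat concrete, here is a sketch of building the proxy request by hand. buildProxyRequest is a hypothetical helper, and percent-encoding the target URL is my own choice (it is what jQuery does when you pass the parameters as an object, as in the sample above); the addresses are the placeholders from the caveat.

```javascript
// Build a proxy request in which the target URL and the API parameters are
// all separate parameters of the proxy page.
function buildProxyRequest(proxyUrl, apiUrl, apiParams) {
  const parts = ['url=' + encodeURIComponent(apiUrl)];
  Object.keys(apiParams).forEach(function (name) {
    parts.push(encodeURIComponent(name) + '=' + encodeURIComponent(apiParams[name]));
  });
  return proxyUrl + '?' + parts.join('&');
}

const proxyRequest = buildProxyRequest('proxy.php', 'http://example.com', {
  param1: 'value1',
  param2: 'value2'
});
console.log(proxyRequest);
```

Because the target URL is encoded, it can never contribute a second question mark to the request, and every API parameter is joined with an ampersand.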