Folksonomy works well with others?

Here is a post about a Library of Congress report that we all might find pretty interesting. To copy from the blog, which copies from the summary:

The following statistics attest to the popularity and impact of the pilot. As of October 23, 2008,
there have been:
• 10.4 million views of the photos on Flickr.
• 79% of the 4,615 photos have been made a “favorite” (i.e., are incorporated into personal
Flickr collections).
• More than 15,000 Flickr members have chosen to make the Library of Congress a
“contact,” creating a photostream of Library images on their own accounts.
• 7,166 comments were left on 2,873 photos by 2,562 unique Flickr accounts.
• 67,176 tags were added by 2,518 unique Flickr accounts.
• 4,548 of the 4,615 photos have at least one community-provided tag.
• Less than 25 instances of user-generated content were removed as inappropriate.
• More than 500 Prints and Photographs Online Catalog (PPOC) records have been
enhanced with new information provided by the Flickr Community.

Kinda cool, no?

Comments off

The Library of Congress releases a report on the success of Flickr Commons

The Library of Congress has released a report discussing the results of their experiment to put a few thousand historical photos on flickr and allow users to add tags, comments, and notes on the photos. They’ve deemed the project a success, gathering lots of additional information about photos including personal stories from commenters’ family histories. The LOC has employees verify user-contributed information such as details on subject or location before adding it to the official description.

The report does mention some concern with the presence of rudeness or snarkiness that results when you open a project to the public: “Notes (annotations left directly on the photos) have some utility, such as pointing out specific persons in a crowd or deciphering the words on a sign or placard. Notes are also a means of adding graffiti-type messages and smart-aleck humor to the images, which is a cause for some concern among Flickr members and Library staff.”

Link: Library of Congress Blog

On an unrelated note, here’s a comic depicting a different method from the one we discussed in class for calculating the impact of a researcher’s work based on their citations:


Lib O’Congress on flickr

THE Library of Congress is posting images to flickr.

The LOC is uploading images to flickr and inviting viewers to add tags. The goal is to share images, to experiment with socially constructed taxonomies, and to start wading among the people of the tubes.

The LOC is following these general guidelines with respect to annotation of the images they post on flickr:
We placed only one tag (“Library of Congress”) and two machine tags on each photo when we loaded them. Any other tags you see were added by the community; we are generally not controlling the content of Flickr tags, notes and comments, but we reserve the right to remove added content for any reason.  

The project has been a success, according to the LOC: many people have participated in annotating the images with comments, tags, and notes. Here’s an example of an image that has been viewed more than 85,000 times and has many annotations from viewers:


Weinberger Needs Statisticians

I’ve always wondered how Weinberger could get meaningful information out of his “huge pile”, and in his interview with Doctorow[1], Weinberger mentioned a way to make use of it: statistical analysis. This is what he said:

“Tags are chaos, and as you get more and more of them, it will get more and more chaotic.  It turns out that when you have a lot of them, the statistical analysis becomes really pretty precise.”

This reminds me of a paper I’ve previously read, “Towards Extracting Flickr Tag Semantics”, written by Yahoo! Research Berkeley and published at WWW2007[2]. The method described in the paper can identify “place tags” and “event tags” among the tags stored in Flickr. For instance, the authors could “detect that the tag Bay Bridge describes a place, and that the tag WWW2007 is an event.” (WWW2007 is a conference held in Canada in 2007.)

How did they do that? The main idea is that a “place tag” like Bay Bridge has a significant spatial pattern, with its uses concentrated within a certain geographic range, while an “event tag” like a conference has a significant temporal pattern, with its uses clustered around a certain time period. So by applying preexisting spatial and temporal statistical methods to the photos' locations and timestamps, computer scientists are able to discover the “semantics” of Flickr tags.
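To make the idea concrete, here is a rough sketch in Python. It is not the method from the paper, just a simplified heuristic of my own: it assumes each use of a tag comes with a timestamp and a latitude/longitude, and it calls a tag a "place" if most uses fall in one grid cell and an "event" if most uses fall in one short time window. The function names, the 0.1-degree cell, the 7-day window, the 0.8 threshold, and the sample data are all made-up assumptions for illustration.

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def temporal_concentration(times, window=timedelta(days=7)):
    """Largest fraction of uses that fit inside any single time window."""
    times = sorted(times)
    best, start = 0, 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(times)


def spatial_concentration(coords, cell_deg=0.1):
    """Largest fraction of uses that fall inside one lat/lon grid cell."""
    cells = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in coords
    )
    return max(cells.values()) / len(coords)


def classify_tag(usages, threshold=0.8):
    """Return a set of labels ("place", "event") for one tag.

    `usages` is a list of (timestamp, lat, lon) tuples (assumed format).
    """
    if not usages:
        return set()
    times = [t for t, _, _ in usages]
    coords = [(lat, lon) for _, lat, lon in usages]
    labels = set()
    if spatial_concentration(coords) >= threshold:
        labels.add("place")
    if temporal_concentration(times) >= threshold:
        labels.add("event")
    return labels


# Made-up sample data: one tag used all year in one spot,
# another used within a few days from scattered locations.
bay_bridge = [
    (datetime(2007, 1, 1) + timedelta(days=30 * i),
     37.81 + 0.002 * i, -122.36 + 0.002 * i)
    for i in range(10)
]
www2007 = [
    (datetime(2007, 5, 8) + timedelta(hours=10 * i),
     10.05 + 5 * i, 20.05 + 5 * i)
    for i in range(10)
]

print(classify_tag(bay_bridge))
print(classify_tag(www2007))
```

A fixed grid and a fixed window are crude choices: a real place can straddle a cell boundary, and events differ a lot in length, which is presumably why the authors relied on proper spatial and temporal statistics instead.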

In all, statistical analysis can help Weinberger make use of the huge amount of information, and it may also serve as a “filter” to deal with information overload problems.



[1] Metacrap and Flickr Tags: An Interview with Cory Doctorow,

[2] Towards Extracting Flickr Tag Semantics,
