The Library of Congress releases a report on the success of Flickr Commons

The Library of Congress has released a report on the results of its experiment in putting a few thousand historical photos on Flickr and allowing users to add tags, comments, and notes. The Library has deemed the project a success: it gathered a lot of additional information about the photos, including personal stories from commenters’ family histories. LOC employees verify user-contributed information, such as details on subject or location, before adding it to the official description.

The report does mention some concern about the rudeness and snarkiness that can result when you open a project to the public: “Notes (annotations left directly on the photos) have some utility, such as pointing out specific persons in a crowd or deciphering the words on a sign or placard. Notes are also a means of adding graffiti-type messages and smart-aleck humor to the images, which is a cause for some concern among Flickr members and Library staff.”

Link: Library of Congress Blog

On an unrelated note, here’s a comic depicting an alternative to the method we discussed in class for calculating the impact of a researcher’s work based on their citations:

Tags and Mashups

While poking around the mashups of (via my ISSD class), I found a neat one: Cloudalicious! For fun, I entered the ischool URL.
The site lists the first 10 tags as: education, information, ischool, berkeley, ucberkeley, research, school, gradschool, technology, and library.

What do you think? Are they good/relevant/correct/useful?

NYTimes TimesTags API

The New York Times has created an API against their “taxonomy and controlled vocabulary used by Times indexers since 1851”. Send their API a word and the NYTimes will send back a list of the most common relevant tags (and whether it’s a Person, Description, Organization or Location).

Why create our own structured vocabulary when highly trained people have been doing it since 1851 and we can borrow theirs?
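
To make the idea concrete, here is a toy, in-memory stand-in for a controlled-vocabulary lookup. It is not the actual TimesTags API: the vocabulary entries, the suggest function, and the simple substring matching are all assumptions made for illustration, and it does not rank tags by how common they are the way the real service is described as doing.

```python
# Toy controlled vocabulary: (tag, type) pairs.
# These entries are invented for illustration, not taken from the NYTimes.
VOCABULARY = [
    ("Library of Congress", "Organization"),
    ("Libraries and Librarians", "Description"),
    ("Lincoln, Abraham", "Person"),
    ("Berkeley (Calif)", "Location"),
    ("Photography", "Description"),
]


def suggest(word, vocabulary=VOCABULARY, limit=10):
    """Return up to `limit` (tag, type) pairs whose tag contains `word`.

    Matching is a case-insensitive substring test, and results come back
    in vocabulary order. Both choices are simplifying assumptions.
    """
    needle = word.lower()
    matches = [(tag, kind) for tag, kind in vocabulary if needle in tag.lower()]
    return matches[:limit]


for tag, kind in suggest("librar"):
    print(f"{tag} ({kind})")
```

The point of the sketch is just the shape of the exchange: a free-text word goes in, and typed terms from someone else’s vocabulary come out.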

Tags and Control

Over the past few weeks there has been an inrush of information on how humans deal with feeling a loss of control, and on whether the internet has a positive or negative effect on our experience of chaotic times. This article from the New York Times postulates that, while people’s tendency is to seek out more and more information in troubled times (a need more than handily satisfied by the internet), doing so actually leads to more anxiety as one tries to keep up with the endless tide. The article follows on the heels of a study that relates the feeling that one’s life is in chaos to the adoption of superstition and conspiracy theories.
So what does this have to do with tagging? In the past 8 years, there’s been a huge increase in the importance of internet news and user-generated content, along with the invention of tagging as we know it today and other Web 2.0 technologies. This explosion of content and information consumption coincides with and is to some extent driven by a society concerned with the ever-changing state of war and economic downturn.
I submit that tagging in particular has become ubiquitous because it allows people to exercise control over their own content, and, even better, other people’s content. So much content now exists that the concept of having personal control, rather than conforming to categories imposed by an outside source, is particularly attractive.
In terms of an externalized benefit to society, I agree that good machine tags are probably more useful, but I think it’s particularly true that human-generated tags are far more psychologically important to individual humans.

on tagging things “web” in delicious

I’m home sick, and too out of it to do my other work, so of course I’m thinking about the tagging discussions in 202 :). As Bob mentioned in one of the last few lectures, his least favorite tags in delicious are things like “web” and “toread,” which he feels are prime examples of Doctorow’s point that “people are stupid” – specifically, too stupid en masse to create useful metadata (Bob, please correct me if I’m misconstruing your viewpoint here).

I’d like to advance a contrarian approach here, which I’ve thought a lot about since this came up last year in 202. I think that the tags “web” and “toread” are the two most interesting tags on delicious, for different reasons.

The tag “web” is interesting precisely because of its evident inanity. The only thing you can tag in delicious is a website URL, which is by definition on the web, so why do people use the tag so often? Here’s my quick take on some reasons one can guess from the things people have tagged “web” in delicious:

  • “web” indicates that the page has the Internet or web development as its subject. There are a lot of these pages on the Internet, and they’re likely to be popular among delicious users.
  • “web” is used in conjunction with other tags to distinguish an online thing from an offline thing – so tagging something “web” and “tv” or “web” and “sketchpad” distinguishes streaming video or a sketchpad flash application from the real-world versions. Again, lots of this online.
  • “web” is the top level of many people’s tagging hierarchies, so it gets used a lot.

That last one needs a little explanation, because tags aren’t hierarchical, right? Well, I’ll put out the theory that they are – but not in any strict sense; rather, in the Roschian sense of linguistic category hierarchies. If I’m tagging a page about a cat (as I so often do), I might tag it first with a “basic-level” category tag (“cat”), a specific word like “kitten” (I think Rosch calls this the “subordinate” level), and an abstract, superordinate category like “animals” (plus, of course, other modifiers like “lol”, “justhanginginthere”, etc). I’m not really thinking about it hierarchically, but I end up putting in hierarchical tags because that’s often how language works. Now if I’m tagging a lot of things, many more things will have the top-level abstraction. If this was actually a rigid hierarchical tree structure, the distribution graph would look like a power law graph – and what do you know, it often does. So the theory here is that “web” is one of the superordinate tags that gets used regularly, and because it applies to so many things it’s disproportionately popular. If I was only picking one tag, the distribution would probably favor the basic-level tag (and, probably since many people only use one tag, arguably basic-level tags like “javascript” are, in fact, as popular as “web”). But the tagging system specifically promotes using as many tags as possible, so I’m likely to pick superordinate and subordinate words as well. The tag “web” by itself may be useless – but as the top level of an implicit hierarchy of Internet-related pages, it makes perfect sense.
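
Here is a small sketch of the counting argument above, assuming a made-up set of bookmarks where each one gets a superordinate, a basic-level, and a subordinate tag. The bookmarks and tag names are hypothetical; the only thing it shows is that the top-level tag piles up uses simply because it applies to the most pages.

```python
from collections import Counter

# Hypothetical bookmarks, each tagged (superordinate, basic-level, subordinate).
BOOKMARKS = [
    ("web", "javascript", "jquery"),
    ("web", "javascript", "closures"),
    ("web", "javascript", "ajax"),
    ("web", "css", "layout"),
    ("web", "css", "typography"),
    ("animals", "cat", "kitten"),
    ("animals", "cat", "lolcat"),
]


def tag_counts(bookmarks):
    """Count how many bookmarks carry each tag."""
    counts = Counter()
    for tags in bookmarks:
        # Use a set so a tag repeated on one bookmark is counted once.
        counts.update(set(tags))
    return counts


counts = tag_counts(BOOKMARKS)
for tag, n in counts.most_common():
    print(f"{n:2d}  {tag}")
```

With this toy data, “web” comes out on top, the basic-level tags sit in the middle, and the subordinate tags each appear once, which is the long-tailed shape the theory predicts. Seven bookmarks obviously prove nothing about delicious itself.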

I find “toread” interesting for a totally different reason. Unlike most other tags on delicious, which could be intended either for the individual or for the community, “toread” is wholly personal – it says, this is something I’m interested in, but not that interested in – maybe something I aspire to read, feel faintly guilty about not reading right now, but definitely will read later. If I get to it. I use the delicious plugin for Firefox, which suggests popular tags by other users for the page I’m tagging, and every time it suggests “toread” I laugh, because it means that so many other nameless people had this complex emotional reaction, usually the same reaction I’m having at that very moment, of aspiring to knowledge they can’t quite find the time for. “Communist Manifesto” from Project Gutenberg? “toread”. Page and Brin’s seminal paper on the PageRank algorithm? “toread”. More, I think, than any other tag, “toread” makes a delicious user feel like they’re part of a community, sharing this funny personal moment with other users.

Sorry, this ended up more like an essay than I intended – I’d be interested in any thoughts.

