My last 202 blog post

The Beer Judge Certification Program (no, really) has developed a set of guidelines (read: categories and vocabulary) for judging beer. They even offer it in a downloadable XML format. The result is (what they feel is) an authoritative vocabulary describing the various qualities of beer within their defined categories. Interestingly, they recognize that brewing styles change, so their vocabulary is descriptive rather than prescriptive and will change over time. The organization also states that they use “experts” to choose the commercial examples (of the types of beer listed) instead of online surveys, to keep “popularity contests” from overwhelming the list.

So, they have taken a more Svenonian approach to BeerML. However, I’ve never heard of these folks before (though I am intrigued and have installed their iPhone app already). I find it interesting that one of my first questions upon reading their site (aside from “how do I get in on this?”) was “What makes them an authority?” I have searched their site and found no association with any governing body. It seems like a bunch of folks trying to develop an authority on their own. Somewhat self-policing. I’d look more, but I need to get back to writing my CMC paper and studying vector modeling. At least now I have vocabulary to use for describing the glass of awesome that is Belgian Witbier.

Controlling vocabularies with paper barcodes

This is something that came up in my GIS (Geographic Information Systems) class – perhaps not a great breakthrough, but as a technique for controlling vocabularies, I thought it was pretty neat.

One common way to develop a GIS database is to take little GPS handhelds out in the field, go up to the features you want to map, record their locations, and input the attributes you care about (for example, locate a tree and input its height and species). As it turns out, there are a couple of important interface issues here: one is that you have to input complex data with the limited interface of a handheld device, and the other is that you often want to use a complex controlled vocabulary for the data you input (e.g. a list of tree species, delineated categories of tree heights, etc.).

Conveniently, some of these handhelds have an integrated barcode reader. So you define your vocabularies, then print them out as a paper catalog of terms, each with a barcode, and when you’re standing next to your tree all you need to do is look up the terms in your paper catalog and scan them with your barcode reader. I thought this was a pretty elegant solution to the problem, and it addresses what I often see as the most important part of producing structured data: finding ways to make it easier for content creators to produce clean data than messy data.
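Here is a minimal sketch of the catalog idea in Python. The barcode payloads, field names, and terms are all invented for illustration, and nothing here talks to a real handheld or barcode library: it just shows how a scanned code can only ever resolve to a term that already exists in the controlled vocabulary.

```python
# Hypothetical catalog: barcode payload -> (field, controlled term).
# The codes and terms are invented for illustration.
CATALOG = {
    "SP001": ("species", "Quercus agrifolia"),
    "SP002": ("species", "Sequoia sempervirens"),
    "HT001": ("height_class", "0-5 m"),
    "HT002": ("height_class", "5-15 m"),
}


def record_scan(record, scanned_code):
    """Return a copy of the feature record with the scanned term added.

    Raises KeyError for codes outside the catalog, so free-text or
    misspelled values can never enter the database.
    """
    field, term = CATALOG[scanned_code]
    updated = dict(record)
    updated[field] = term
    return updated


# A tree located by GPS, then described with two scans from the paper catalog.
tree = {"lat": 37.8719, "lon": -122.2585}
tree = record_scan(tree, "SP001")
tree = record_scan(tree, "HT002")
print(tree)
```

The point of the design is that the person in the field never types a term, so the only way to get a value into the record is to pick one that the vocabulary already defines.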

NYTimes TimesTags API

The New York Times has created an API against their “taxonomy and controlled vocabulary used by Times indexers since 1851”.  Send their API a word and the NYTimes will send back a list of the most common relevant tags (and whether it’s a Person, Description, Organization or Location).  

Why create our own structured vocabulary when highly trained people have been doing it since 1851 and we can borrow theirs?

Tagging with pictures | Tagging the physical world.

At the risk of fanning political flames, this JPG was just sent to me via email. If you move past the humor and politics of the photo, it seems salient to today’s topic of tagging. Specifically, it uses the characteristics we collectively/culturally ascribe to trains of varying types to tag each of the presidential/vice-presidential candidates. It was done visually instead of with words (modern, green, fast, powerful, coal powered, archaic, plastic, child’s toy). Are these “good” tags? I think guys named Nick who went to Amherst (the h is silent) would say yes.

[Image: Election Trains]

After I stopped laughing, this made me wonder whether there was already a system out there for tagging things with pictures. I did not find any with a quick Google search. Just a number of whitepapers.

However, I did find Tonchidot.

While not specifically related to using images to tag other images or ideas, they are developing an iPhone app that adds tags to the images the camera sees in real time. They take community tags and make them mobile in a very compelling way. Want to know what type of flower that is? Tree? The year a building you are looking at was made, and who designed it? Which store at the mall has the thing you want to buy? How many stars the restaurant you are looking at has on Yelp? When the next BART train is arriving at your station? You can also find a lower price for something in a different store, purchase something via the phone, or leave a message for a friend to pick up by walking by a specific place.

Tagging a specific location is also possible. This reminds me of William Gibson’s book Spook Country. One aspect of the storyline was the development of location-based digital art installations. In order to see a specific digitally created piece you needed specially made hardware (an eyeglass digital display) and a computer. You also needed to be in a specific geospatial location. Now, you’ll just need your iPhone.

One of the things an artist in the book said reminds me of the potential of Tonchidot’s technology. Imagine traveling across the country and seeing a whole second landscape that covers, interacts, and integrates with the physical world, offering different things to see, information about what you’re seeing, directions to get there, and prices for goods/services (who would not love to know the cheapest place to get gas?). And of course a whole new opportunity for advertising and spam.

Maybe that’s the problem with spam. No ontological control.

The video is about 18 minutes long and worth watching. There is a particularly interesting practical question around the 14:15 mark.

the vocabulary problem strikes again

I found this funny article about a police officer who was called in to shoo off a “big cat” only to find out that it was actually a male mountain lion weighing 80–90 pounds. In the article itself, the mountain lion was called 3 other names: “kitty cat”, “big cat” and “house cat”, none of which I would probably use to describe a lion. The title comes closer with “cougar”. I find it amazing that a 200-word article can call something 5 different names (counting “mountain lion” and “cougar”)!

Complete article here.

Aliasing system commands in a GUI

The article that we read for Tuesday, “The Vocabulary Problem in Human-System Communication,” showed that you need at least 10 aliased terms for a referent before untrained people can reliably select it.

It’s easy to imagine how this might work in the command-line environment: a system designer picks a command and gives it the arbitrary authoritative term “delete” (for example). People who type “delete” at the command line will access this command. But we can easily toss in some aliases so that people who type “remove,” “trash,” “eliminate,” “wipe”, etc. will be referred to the delete command. This same idea can be applied to a GUI, but what would it look like?

One possibility might be something similar to what you see in OS X 10.5’s help menu. Starting in Leopard you can search for menu names in the help box and the system will visually point to where they are in the menu hierarchy. This way if you know a command is called “Crop” but you can’t remember where it is, the system will show you. Here’s a screenshot:

[Screenshot: GUI implementation of aliases for system commands]

Although the menu search currently only matches literal strings, it’s not hard to imagine it working by matching your search against aliases for commands. You search for “trim” or “cut edges” and the system suggests the crop menu (ignoring for the moment that trim happens to be a separate command in, e.g., Photoshop). Application designers would have to do some simple research to see what aliases would best serve users.
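To make the alias idea concrete, here is a rough Python sketch of what an alias-aware menu search might look like. This is not how Leopard’s help menu works; the command names, menu paths, and alias lists are all made up, and a real application would get its aliases from the kind of user research mentioned above.

```python
# Hypothetical alias table: canonical command -> (menu path, aliases).
# Menu paths and alias lists are invented for illustration.
COMMANDS = {
    "Crop": ("Image > Crop", ["trim", "cut edges", "clip"]),
    "Delete": ("Edit > Delete", ["remove", "trash", "eliminate", "wipe"]),
}


def build_index(commands):
    """Map every lowercase name or alias to its canonical command."""
    index = {}
    for name, (_path, aliases) in commands.items():
        for term in [name, *aliases]:
            index[term.lower()] = name
    return index


def find_menu(query, commands=COMMANDS):
    """Return (command, menu path) for a query, or None if nothing matches."""
    name = build_index(commands).get(query.strip().lower())
    if name is None:
        return None
    return name, commands[name][0]


print(find_menu("cut edges"))  # ('Crop', 'Image > Crop')
print(find_menu("Trash"))      # ('Delete', 'Edit > Delete')
print(find_menu("rotate"))     # None
```

Exact matching on a lowercase alias is the simplest possible choice here; prefix or fuzzy matching would be the obvious next step, at the cost of more ambiguous suggestions.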

There are definitely other ways to implement this idea, but this seems like one simple way to put research into practice.

Capitol Strives to Define “Homeless”

http://www.nytimes.com/2008/09/16/washington/16homeless.html?ref=us

NYTimes, 15 September, 2008

So the heated discussion of choice a few days ago in our nation’s capital was apparently how to define ‘homeless’. For the last 20+ years, ‘homeless’ meant “only people living on the streets or in shelters”. But given the high-and-getting-higher foreclosure and unemployment rates, the Hill is arguing over whether to expand that definition.

New expansions of the existing definition under consideration are:

1) to include the ‘precariously housed’ (living with friends, couch-to-couch, day-to-day hotels, etc)

2) just to include the smaller number of people who have fled due to domestic violence

3) to include “only those forced to move three times in one year or twice in 21 days”

(Obviously we have some variance in specificity here.)

The definition is important because whoever qualifies as ‘homeless’ is eligible for aid, shelter and housing assistance from the Department of Housing and Urban Development.

That said, in a typical DC move, none of the bills says anything about increasing funding.  The current budget ($1.7MM) can’t come close to providing adequate resources for the people who fall under the current definition of ‘homeless’.  So while expanding the definition seemingly demonstrates homeland concern and goodwill, they should be talking about actions/solutions to actually care for these people instead of having a semantic debate.

(And of course it is turning into a Democrat/Republican flame war.  I would paraphrase but you know the drill…)

Two additional thoughts:

  • I think I may have lived couch to couch at some point in my younger years.  That definition might need some fine tuning to avoid dealing in every 22-year-old in the country.
  • I don’t miss DC at all.

RDFa: friend or pita?

RDFa has just become a W3C proposed recommendation.  It’s potentially cool/useful, because it may surmount some semantic barriers to automation that concerned Svenonius.

Similar to microformats, RDFa describes a syntax for embedding semantic meaning inside XHTML. It could be useful for us because most web pages today include (X)HTML (syntax), but don’t have a mechanism to embed clear meanings (semantics) for elements within the page, which might be picked up by search engines or browsers to return more precise and/or relevant results. RDFa doesn’t seem overly complicated, either, which is something that frequently gets screwed up.

How might one of us use this? When you author an XHTML page, point to a very specific vocabulary document on the web (you can borrow someone else’s that has already been created), and then add as many “tuple” statements as you’d like to describe fields within <span></span> tags in terms of that vocabulary.  An RDFa tuple (a “triple” in RDF terms) is composed of a subject, a predicate, and an object, all defined. Examples of RDFa tuples are “Nat [subject] is a [predicate] Person [object]” and “Nat [subject] hates [predicate] homework [object]” ;).  Will RDFa be widely accepted?  I have no idea.  I kind of like the idea of microformats for encapsulating semantic meaning, too.
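For anyone who thinks better in code, here is a small Python sketch of the subject/predicate/object data model behind those statements. To be clear, this is not RDFa syntax and it does not use any RDF library; the `Triple` class and `objects_of` helper are my own illustrative names, and the bare strings stand in for the URIs a real vocabulary document would supply.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str


# The two example statements from the post, as plain strings.
statements = [
    Triple("Nat", "is a", "Person"),
    Triple("Nat", "hates", "homework"),
]


def objects_of(triples, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [
        t.obj
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects_of(statements, "Nat", "hates"))  # ['homework']
```

Once statements have this uniform three-part shape, a search engine or browser can answer questions like “what does Nat hate?” without understanding the surrounding prose, which is the whole appeal.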

http://creativecommons.org/weblog/entry/9240

Related: 3. ORGANIZATION {AND,OR,VS} RETRIEVAL (9/8), 4. XML (9/10), 6. METADATA & METADATA STANDARDS (9/17), 7. CONTROLLED NAMES AND VOCABULARIES (9/22)

Creating A New Word/Category to Get It Just Right…

Contributing to our expanding vocabulary
Times-Herald, Malcolm Donahoo (great name)
http://www.timesheraldonline.com/opinion/ci_10389356

Here’s a meta-post demonstrating the way in which our vocabularies can grow and expand to help us categorize things (when existing options just don’t seem to cut it)…and then can potentially lead to new definitions, categories or mean insults in the mainstream.

To summarize, Donahoo apparently recently wrote a column (which, suspiciously, I can’t seem to find) about Monica Lewinsky and struggled to find a word that could truly encapsulate and convey her Monica Lewinsky-ness…to find a category that could do her justice. In an aha moment, he came up with “pudgemuffin”, which he realized very quickly was not in fact a word, but it was so perfectly fitting to him that he used it anyway, figuring one of the multiple layers of editors would scream at it. No one did, so it went to press. And then, shockingly, no one responded…no comments, no hate emails, nothing. His conclusion was not that no one was reading (shocking), but that the new term/category fit so well that it just went unnoticed.

Just a fun 202 in the news…

NOTE – this reflects the author’s views only.  Sorry if this offends anyone.

Is Perfection No Longer a Category?

If you watched world class gymnasts Nastia Liukin or Shawn Johnson rake in the medals at the Beijing Olympics this summer, you may have noticed a winning beam score of “16.225” or a winning floor exercise score of “15.650.”  So why is the scoring no longer on a 0-10 scale, and what does a “16” even mean?

Though less intuitive to the spectator, these changes to the scoring system have come about in order to better evaluate performance, categorizing it more discretely based on 1) execution and 2) difficulty.  In the past, judging was characterized by deducting execution mistakes from a routine’s “start value”, which was typically a 10.0 if the gymnast fulfilled his/her difficulty requirements.  Thus no deductions = “Perfect 10.”  However, there were distortions inherently built into this old notion of perfection: if two gymnasts both met the 10.0 difficulty threshold, but one gymnast added additional difficulty into her routine, this extra difficulty could not be reflected in the gymnast’s start value, as it was impossible to exceed 10.0.
 
By re-factoring the judging schema, the concept of a maximum difficulty level no longer exists, and daring is aptly rewarded.  Though a gymnast can still receive a perfect execution score, there is no such thing as a perfect routine, because there are always new and more difficult tricks to potentially be incorporated.  In gymnastics, perfection has become a purely relative term, and by redefining the way in which routines are classified, a different type of champion — one who is flawless and fearless — is surfacing.
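The difference between the two schemas can be sketched in a few lines of Python. This is a simplification under stated assumptions: I am assuming the new total is simply an open-ended difficulty value plus an execution score that starts at 10.0 and loses the deductions, and every number below is hypothetical rather than taken from a real routine.

```python
def old_score(difficulty_value, deductions):
    """Old schema: the start value is capped at 10.0, then deductions come off."""
    start_value = min(difficulty_value, 10.0)
    return max(start_value - deductions, 0.0)


def new_score(difficulty_value, deductions):
    """New schema (as assumed here): uncapped difficulty plus an
    execution score that starts at 10.0 and loses the deductions."""
    execution = max(10.0 - deductions, 0.0)
    return difficulty_value + execution


# Two flawless routines under the old schema: the extra difficulty
# of the second one is invisible because of the 10.0 cap.
print(old_score(10.0, 0.0), old_score(10.4, 0.0))

# Two flawless routines under the new schema (hypothetical difficulty
# values): the more daring routine now scores higher.
print(new_score(6.0, 0.0), new_score(6.4, 0.0))
```

Both new-schema routines still earn a perfect execution score, but only the totals are compared, so “perfect” stops being a fixed ceiling and becomes relative to whatever difficulty the field is attempting.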

Article: 
http://www.nytimes.com/2008/08/06/sports/olympics/06scoring.html?_r=2&pagewanted=all&oref=slogin&oref=slogin

Lectures in the Syllabus: 
5. Concepts and Categories
7. Controlled names and Vocabularies
