Following up on Nick’s very valuable info on the ‘NY Times tags API’:

http://open.nytimes.com/2007/10/23/messing-around-with-metadata/

Jacob Harris highlights the importance of metadata in the news industry. And they have been using it since 1851, phew!!

On a different note, the following excerpt (from this article) touches upon the ‘automation vs. manual’ tradeoff discussed in today’s class.

“Still my snarky aside has truth to it: people are ultimately controlling the process. In the beginning, rules for the automatic extraction and tagging are set by an Information Architect. In the end, final approval and correction of suggested metadata is done by various Web producers before publication. Web producers also do the important job of accurately summarizing the story. So, while we have machines to help out the process, it’s still ultimately a human endeavor, largely because automated summarization and classification has its problems.”


RDFa: friend or pita?

RDFa has just become a W3C proposed recommendation.  It’s potentially cool/useful, because it may surmount some semantic barriers to automation that concerned Svenonius.

Similar to microformats, RDFa describes a syntax for embedding semantic meaning inside XHTML. It could be useful for us because most web pages today include (X)HTML (syntax), but don’t have a mechanism to embed clear meanings (semantics) for elements within the page, which might be picked up by search engines or browsers to return more precise and/or relevant results. RDFa doesn’t seem overly complicated, either, which is something that frequently gets screwed up.

How might one of us use this? When you author an XHTML page, point to a specific vocabulary document on the web (you can borrow one that someone else has already created), and then add as many “triple” statements as you’d like to describe fields within <span></span> tags in terms of that vocabulary. An RDFa triple is composed of a subject, a predicate, and an object, each explicitly defined. Examples of RDFa triples are “Nat [subject] is a [predicate] Person [object]” and “Nat [subject] hates [predicate] homework [object]” ;). Will RDFa be widely accepted? I have no idea. I kind of like the idea of microformats for encapsulating semantic meaning, too.
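To make the subject/predicate/object idea concrete, here is a rough sketch in Python. It holds a small XHTML fragment marked up with RDFa-style attributes (about, typeof, property) and pulls the statements back out. FOAF is a real vocabulary, but the "ex" vocabulary URL, the "hates" property, and the TripleSketch class are made up for illustration. The parser handles only this tiny subset of RDFa and is nowhere near a conforming processor.

```python
from html.parser import HTMLParser

# XHTML fragment with RDFa 1.0-style attributes.
# Assumption: the "ex" vocabulary and its "hates" property are hypothetical.
PAGE = """
<div xmlns:foaf="http://xmlns.com/foaf/0.1/"
     xmlns:ex="http://example.org/vocab#"
     about="#nat" typeof="foaf:Person">
  <span property="foaf:name">Nat</span> hates
  <span property="ex:hates">homework</span>.
</div>
"""


class TripleSketch(HTMLParser):
    """Collects (subject, predicate, object) from a tiny subset of RDFa."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None   # set by the nearest "about" attribute seen so far
        self.pending = None   # property whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a:
            self.subject = a["about"]
        if "typeof" in a:
            # "typeof" states that the subject is a member of a class
            self.triples.append((self.subject, "rdf:type", a["typeof"]))
        if "property" in a:
            self.pending = a["property"]

    def handle_data(self, data):
        # The text inside a tag carrying "property" becomes the object
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None


parser = TripleSketch()
parser.feed(PAGE)
for triple in parser.triples:
    print(triple)
```

A real RDFa processor would also expand prefixes such as foaf: into full URIs and resolve "#nat" against the page's address. The sketch skips both to keep the three-part structure visible.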

http://creativecommons.org/weblog/entry/9240

Related: 3. ORGANIZATION {AND,OR,VS} RETRIEVAL (9/8), 4. XML (9/10), 6. METADATA & METADATA STANDARDS (9/17), 7. CONTROLLED NAMES AND VOCABULARIES (9/22)
