RDFa: friend or pita?
RDFa has just become a W3C proposed recommendation. It’s potentially cool/useful, because it may surmount some semantic barriers to automation that concerned Svenonius.
Similar to microformats, RDFa describes a syntax for embedding semantic meaning inside XHTML. It could be useful for us because most web pages today include (X)HTML (syntax) but don’t have a mechanism to embed clear meanings (semantics) for elements within the page, which search engines or browsers might pick up to return more precise and/or relevant results. RDFa doesn’t seem overly complicated, either, and complexity is something that frequently gets screwed up.
How might one of us use this? When you author an XHTML page, point to a very specific vocabulary document on the web (you can borrow someone else’s that has already been created), and then add as many “triple” statements as you’d like to describe fields within <span></span> tags in terms of that vocabulary. An RDFa triple is composed of a subject, a predicate, and an object. Examples of RDFa triples are “Nat [subject] is a [predicate] Person [object]” and “Nat [subject] hates [predicate] homework [object]”. ; ) Will RDFa be widely accepted? I have no idea. I kind of like the idea of microformats for encapsulating semantic meaning, too.
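To make the subject/predicate/object idea a bit more concrete, here is a rough Python sketch that prints what the "Nat" statements might look like as RDFa attributes on XHTML elements. The FOAF namespace is a real, borrowable vocabulary; the `ex:` vocabulary, the `ex:hates` predicate, the `#nat` identifier, and the `rdfa_person` helper are all made up for illustration, so treat this as one plausible shape rather than the canonical markup.

```python
from html import escape

FOAF = "http://xmlns.com/foaf/0.1/"   # borrowed vocabulary (FOAF)
EX = "http://example.org/vocab#"      # hypothetical vocabulary, for illustration only


def rdfa_person(subject_uri, name, dislikes):
    """Return an XHTML fragment carrying three statements about one subject.

    about=     names the subject
    typeof=    says the subject "is a" foaf:Person
    property=  names the predicate; the element text is the object
    """
    return (
        f'<div xmlns:foaf="{FOAF}" xmlns:ex="{EX}" '
        f'about="{escape(subject_uri, quote=True)}" typeof="foaf:Person">'
        f'<span property="foaf:name">{escape(name)}</span> hates '
        f'<span property="ex:hates">{escape(dislikes)}</span>.'
        f'</div>'
    )


markup = rdfa_person("#nat", "Nat", "homework")
print(markup)
```

A person reading the page just sees "Nat hates homework."; a parser that understands RDFa could additionally pull out that `#nat` is a `foaf:Person`, has the name "Nat", and hates "homework".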
http://creativecommons.org/weblog/entry/9240
Related: 3. ORGANIZATION {AND,OR,VS} RETRIEVAL (9/8), 4. XML (9/10), 6. METADATA & METADATA STANDARDS (9/17), 7. CONTROLLED NAMES AND VOCABULARIES (9/22)
Bob Glushko Said,
September 8, 2008 @ 9:21 pm
Gee, what do you think – should we invest in developing a domain-specific vocabulary whose semantics are tuned to a particular domain, or should we try to use a general-purpose language with a simpler syntax but relatively uncontrolled semantics? I look forward to talking about these issues in a few weeks — this is the TRADEOFF in its full glory.
Nathaniel Wharton Said,
September 8, 2008 @ 11:00 pm
hmmm. Yeah, I can see that tradeoff. I think our design choice depends on what we’re building and why, and what constraints we have. We ideally want to avoid over- _or_ under-building. I can think of scenarios where either, neither, both, hybrid, or other techniques could be good design decisions for the problem we’re trying to solve. Investment in semantic markup might be mitigated and facilitated through form-based entry coupled with automation (using a scripting language), for instance when an input form entity could correspond 1:1 with a vocabulary subject. In this case, for instance, you _could_ have a great efficiency in auto-generating semantic markup many times. But of course, it depends! I sure hope we have the specifications right before we go to the trouble! Vocabularies don’t always need to be developed from scratch, either. They could be borrowed. Generally, I would think borrowing vocabulary increases possibilities for re-purposing down the line, beyond just your own currently-conceived application requirements. In 20 years, though, I wonder... will our semantic language be dead, forgotten, and our time “wasted”, like all the time librarians invested in creating all those cards? But I don’t think that was wasted. They were solving the problem at hand with the limited tools available. Guess that’s what we’d be doing, too. : )
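As a rough illustration of the form-plus-automation idea, here is a small Python sketch in which each form field maps 1:1 onto a vocabulary term, so the semantic markup is generated rather than typed by hand. The field names, the `FIELD_TO_PROPERTY` table, and the `form_to_rdfa` helper are assumptions for the example; only `foaf:name` and `foaf:nick` come from a real vocabulary (FOAF), and the enclosing page is assumed to declare the `foaf` prefix.

```python
from html import escape

# Hypothetical 1:1 mapping from form field names to vocabulary terms.
FIELD_TO_PROPERTY = {
    "name": "foaf:name",
    "nick": "foaf:nick",
}


def form_to_rdfa(subject_uri, form):
    """Turn one submitted form (a dict) into an RDFa-annotated fragment.

    Fields with no entry in FIELD_TO_PROPERTY are skipped rather than
    guessed at, so the output only makes statements the vocabulary covers.
    """
    spans = [
        f'<span property="{FIELD_TO_PROPERTY[field]}">{escape(str(value))}</span>'
        for field, value in form.items()
        if field in FIELD_TO_PROPERTY
    ]
    subject = escape(subject_uri, quote=True)
    return f'<div about="{subject}" typeof="foaf:Person">' + " ".join(spans) + "</div>"


# The same function can be reused for every submission.
submissions = [
    ("#nat", {"name": "Nat", "nick": "nw", "favorite_color": "blue"}),
    ("#bob", {"name": "Bob"}),
]
fragments = [form_to_rdfa(uri, form) for uri, form in submissions]
for fragment in fragments:
    print(fragment)
```

The payoff, if the mapping is right, is that the cost of the markup is paid once in the mapping table; the risk is the one above, that a wrong specification gets stamped out many times just as efficiently.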