A Semi-Automated Semantic Web?
I read a paper today that discusses a step covered in today's document engineering class. We learned how the Berkeley Calendar Network team manually harvested and consolidated tables of terms in a huge Excel spreadsheet. The IEEE paper argues for semi-automating this process, for instance by deriving hyponyms (terms in IS-A relationships) and integrating the terms with the WordNet ontology (http://wordnet.princeton.edu).
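To make the idea of semi-automating term consolidation a bit more concrete, here is a minimal Python sketch. It is not the paper's method and does not call WordNet: `TOY_IS_A` is a tiny hand-written taxonomy standing in for a real ontology, and the function names (`ancestors`, `consolidate`) and sample terms are hypothetical. Harvested terms that match the taxonomy get their IS-A chain filled in automatically, and anything unrecognized is set aside for a human to review, which is the "semi" part.

```python
# Toy IS-A taxonomy: child term -> parent term.
# Assumption: a hand-written stand-in for a real ontology such as WordNet.
TOY_IS_A = {
    "recital": "concert",
    "concert": "performance",
    "play": "performance",
    "performance": "event",
    "lecture": "event",
}


def ancestors(term):
    """Return the IS-A chain for a term, starting with the term itself."""
    chain = [term]
    while chain[-1] in TOY_IS_A:
        chain.append(TOY_IS_A[chain[-1]])
    return chain


def consolidate(harvested_terms):
    """Place known terms in the taxonomy; collect unknown ones for manual review."""
    known = set(TOY_IS_A) | set(TOY_IS_A.values())
    placed = {}
    needs_review = []
    for raw in harvested_terms:
        term = raw.strip().lower()  # normalize spreadsheet-style input
        if term in known:
            placed[term] = ancestors(term)[1:]  # broader terms only
        else:
            needs_review.append(term)
    return placed, needs_review


# Hypothetical terms harvested from several calendar sources.
placed, needs_review = consolidate(["Recital", "lecture ", "Potluck"])
print(placed)
print(needs_review)
```

In a real pipeline the lookup table would presumably be replaced by queries against the actual ontology, and the review list would go back to the people who maintain the spreadsheet.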
The buzzword-laden paper goes on to argue for building a working Semantic Web by harnessing the large quantity of structured “Deep Web” data. The Deep Web (content not indexed by conventional search engines) contains over four orders of magnitude more data than the “Surface Web,” and some of that data is structured in databases. The Semantic Web, the authors claim, has been hampered by the difficulty of manually creating large OWL and RDF ontologies, and harvesting the richer potential of the Deep Web points to a possible solution:
Semantic Web + Deep Web-Ontology-aware browser.
Ironically, the paper itself is buried in the Deep Web:
http://ieeexplore.ieee.org/iel5/2/4623205/04623231.pdf?tp=&arnumber=4623231&isnumber=4623205
The engine scours open-access repositories of information such as PubMed Central and the U.S. Patent and Trademark Office Claims, but it also provides access to scholarly journals from Oxford University Press, SAGE, Taylor & Francis, Annual Reviews, Mary Ann Liebert Publications, and more. Together, these billions of pages, currently unindexed by other engines, give you access to content in the areas of Life Sciences, Medicines, Patents, Industry News, and other reference content from expert sources.