<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>INFO 202 Fall 08 Blog &#187; indexing</title>
	<atom:link href="http://blogs.ischool.berkeley.edu/i202f08/tag/indexing/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.ischool.berkeley.edu/i202f08</link>
	<description>I202 course Fall 08</description>
	<lastBuildDate>Wed, 11 Mar 2009 17:29:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>New Research Engine Searches &#8220;Deep Web&#8221;</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 06:40:36 +0000</pubDate>
		<dc:creator>michael</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Deep Web]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=80</guid>
		<description><![CDATA[How much of the World Wide Web is actually indexed&#8230; 27.65 billion pages? Maybe about 0.2% of the total content? The &#8220;Deep Web&#8221; (web documents not immediately accessible by direct hyperlink from public pages) may contain something like 91,000 terabytes of data&#8230; as compared to an estimated 167 terabytes of Surface Web data.
A new service, [...]]]></description>
			<content:encoded><![CDATA[<p>How much of the World Wide Web is actually indexed&#8230; <a href="http://www.worldwidewebsize.com/">27.65 billion pages</a>? Maybe about 0.2% of the total content? The &#8220;<a href="http://en.wikipedia.org/wiki/Deep_web">Deep Web</a>&#8221; (web documents not immediately accessible by direct hyperlink from public pages) may contain something like <a href="http://en.wikipedia.org/wiki/Deep_web#Size">91,000 terabytes</a> of data&#8230; as compared to an estimated <a href="http://en.wikipedia.org/wiki/Deep_web#Size">167 terabytes</a> of Surface Web data.</p>
<p>A new service called <a href="http://www.infovell.com/">Infovell</a> hopes to help users find more of this &#8220;Deep Web&#8221; data. Unlike Google and other Surface Web engines, it won&#8217;t be ad-supported; instead, the service will be subscription-based. Read more on the <a href="http://www.readwriteweb.com/archives/sometimes_google_isnt_enough_when_researching_deep_web.php">ReadWriteWeb</a> blog. Here is an excerpt from the article:</p>
<blockquote><p><img class="alignright" style="border: 0pt none;margin: 5px;float: right" src="http://www.readwriteweb.com/images/infovell1.gif" alt="InfoVell" width="200" height="150" /><em>The engine scours through open-access repositories of information like PubMed Central and the U.S. Patent and Trademark Office Claims, but it also allows access to scholarly journals such as those from Oxford University Press, SAGE, Taylor &amp; Francis, Annual Reviews, Mary Ann Liebert Publications, and more. The culmination of these billions of pages currently unindexed by other engines, gives you access to content in the areas of Life Sciences, Medicines, Patents, Industry News, and other reference content from expert sources.</em></p></blockquote>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Not so cool?</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/#comments</comments>
		<pubDate>Wed, 03 Sep 2008 14:53:38 +0000</pubDate>
		<dc:creator>Neha Kumar</dc:creator>
				<category><![CDATA[Assignment 1]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[relevance]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=47</guid>
		<description><![CDATA[http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/
Stealth search start-up Cuil (pronounced &#8220;cool&#8221;) launched its product on July 29th of this year and was promptly subjected to an angry backlash from its users. The start-up &#8211; founded by three former senior Google employees &#8211; claims to have an index of 120 billion web pages (larger than Google&#8217;s, they say). [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/">http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/</a></p>
<p>Stealth search start-up <a href="http://www.cuil.com">Cuil</a> (pronounced &#8220;cool&#8221;) launched its product on July 29th of this year and was promptly subjected to an angry backlash from its users. The start-up &#8211; founded by three former senior Google employees &#8211; claims to have an index of 120 billion web pages (larger than Google&#8217;s, they say). On the day of its launch, however, Cuil&#8217;s search results were less rewarding than that claim suggested. For example, a search for &#8220;Dog&#8221; returned 280 million hits on Cuil and 498 million on Google. Of course, quantity isn&#8217;t everything, but Google&#8217;s results were also more relevant.</p>
<p>The comparison to Google is only natural, since Google defines what search means to most of us today &#8211; not only relevant search results, but also photos, news articles, video files, etc. that complete the picture. It remains to be seen whether Cuil can offer a &#8216;universal&#8217; search package superior to the one we&#8217;re already used to.</p>
<p>Relevant lectures: 21/22</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
