<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>INFO 202 Fall 08 Blog &#187; search</title>
	<atom:link href="http://blogs.ischool.berkeley.edu/i202f08/tag/search/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.ischool.berkeley.edu/i202f08</link>
	<description>I202 course Fall 08</description>
	<lastBuildDate>Wed, 11 Mar 2009 17:29:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Search, Facet, and Filtering Examples</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/23/search-facet-and-filtering-examples/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/23/search-facet-and-filtering-examples/#comments</comments>
		<pubDate>Mon, 24 Nov 2008 06:30:11 +0000</pubDate>
		<dc:creator>Laura Paajanen</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[facets]]></category>
		<category><![CDATA[filtering]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=296</guid>
		<description><![CDATA[Konigi is a User Experience Design site that features interesting interfaces, with a handful of features on searches, filtering, and faceted navigation.
http://konigi.com/
A couple of sites they&#8217;ve featured:
Kayak.com, my favorite travel search interface
http://konigi.com/interface/kayak-filtering
FanSnap, an event ticket site
http://konigi.com/interface/fansnap-search-results-filtering
Also, Cookstr is a recipe site that has a ton of interesting facets once you search or click a category: [...]]]></description>
			<content:encoded><![CDATA[<p>Konigi is a User Experience Design site that features interesting interfaces, with a handful of features on searches, filtering, and faceted navigation.<br />
<a href="http://konigi.com/">http://konigi.com/</a></p>
<p>A couple of sites they&#8217;ve featured:</p>
<p>Kayak.com, my favorite travel search interface<br />
<a href="http://konigi.com/interface/kayak-filtering">http://konigi.com/interface/kayak-filtering</a></p>
<p>FanSnap, an event ticket site<br />
<a href="http://konigi.com/interface/fansnap-search-results-filtering">http://konigi.com/interface/fansnap-search-results-filtering</a></p>
<p>Also, Cookstr is a recipe site that has a ton of interesting facets once you search or click a category: cuisine, cost, dietary considerations, kid friendly, holiday&#8230; Much cleaner and easier to use than other recipe sites I&#8217;ve played with.<br />
<a href="http://www.cookstr.com/recipes">http://www.cookstr.com/recipes</a></p>
<p>Doesn&#8217;t it just make you happy when a company gets search right?</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/23/search-facet-and-filtering-examples/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What to do with the Nasties</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/14/what-to-do-with-the-nasties/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/14/what-to-do-with-the-nasties/#comments</comments>
		<pubDate>Sat, 15 Nov 2008 01:28:30 +0000</pubDate>
		<dc:creator>Annette Greiner</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Retrieval]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=244</guid>
		<description><![CDATA[BBC News is reporting that YouTube has removed some videos from its site that it judged to glorify the Columbine school shooters, which left me wondering what one does when one expunges &#8220;undesirable&#8221; data from a collection. Assuming the expunging is justified, do you keep the reference information so you have a record of having [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://news.bbc.co.uk/2/hi/uk_news/7730679.stm">BBC News is reporting</a> that YouTube has removed some videos from its site that it judged to glorify the Columbine school shooters, which left me wondering what one does when one expunges &#8220;undesirable&#8221; data from a collection. Assuming the expunging is justified, do you keep the reference information so you have a record of having had the thing around (and thereby make yourself better able to detect its reappearance)? Do you expunge the thing from the entire database? It seems good general practice to have a place where one can keep old records that no longer point to something retrievable. Would it be wise to allow people to search and find that an item had been intentionally removed, to save them the trouble of searching and searching for it? Or would it be ethically questionable to have even just the record available, since it could give people the idea to seek it elsewhere or create copycat works? I&#8217;m guessing the videos will appear elsewhere on the net, and there is little anyone can do to keep them out of public view, but keeping them off popular sites could effectively marginalize them. I&#8217;m thinking the benefit of keeping something truly nasty beyond the view of the &#8220;tell me something about &#8230;&#8221; searcher outweighs the benefit of explaining the removal to the &#8220;I want this exact document&#8221; searcher.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/14/what-to-do-with-the-nasties/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Search + Social Networking</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/10/search-social-networking/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/10/search-social-networking/#comments</comments>
		<pubDate>Mon, 10 Nov 2008 18:25:52 +0000</pubDate>
		<dc:creator>Nathan Gandomi</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[social network]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=214</guid>
		<description><![CDATA[A search engine that lets users benefit from their social network to improve search results:
delver.com
From the about us page:
What is Delver?
Delver is an intelligent social search engine that enables you to find, experience and benefit from the wealth of information created and referenced by your social world. Our mission is to empower you to easily [...]]]></description>
			<content:encoded><![CDATA[<p>A search engine that lets users benefit from their social network to improve search results:</p>
<p><a href="http://www.delver.com">delver.com</a></p>
<p>From the about us page:</p>
<h1>What is Delver?</h1>
<p>Delver is an intelligent social search engine that enables you to find, experience and benefit from the wealth of information created and referenced by your social world. Our mission is to empower you to easily discover and benefit from the collective wisdom of your social world. Your circle of friends and extended network are increasingly creating and sharing useful information and media online through: blogs, videos, reviews, articles, websites, music… and the list is only growing. By indexing all that shared knowledge, media, opinions, and activities, we can deliver search results that are truly relevant to you.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/10/search-social-networking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>semantic image retrieval</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/08/semantic-image-retrieval/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/08/semantic-image-retrieval/#comments</comments>
		<pubDate>Sat, 08 Nov 2008 07:44:25 +0000</pubDate>
		<dc:creator>Mohit Gupta</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[foraging]]></category>
		<category><![CDATA[image search]]></category>
		<category><![CDATA[pixolu]]></category>
		<category><![CDATA[Retrieval]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/2008/11/08/semantic-image-retrieval/</guid>
		<description><![CDATA[This may already be old news for regular readers of lifehacker.com, but in case you missed it, here&#8217;s another search engine. 
http://www.pixolu.de/
Pixolu is a semantic image search that lets users refine a search by selecting the images that best represent their query. I tried it for some queries and it seems to do a good [...]]]></description>
			<content:encoded><![CDATA[<p>This may already be old news for regular readers of lifehacker.com, but in case you missed it, here&#8217;s <strong>another search engine.</strong></p>
<p><a href="http://www.pixolu.de/">http://www.pixolu.de/</a></p>
<p>Pixolu is a semantic image search that lets users refine a search by selecting the images that best represent their query. I tried it for some queries and it seems to do a good job, factoring in color, object shapes, size and density in images.</p>
<p>The two-step search-and-refine process is very interesting and represents a more natural way of information gathering. Pixolu, a more 202-ish search, pays attention to recent (and older) research in information gathering and foraging.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/08/semantic-image-retrieval/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Search Flickr by Color</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/search-flickr-by-color/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/search-flickr-by-color/#comments</comments>
		<pubDate>Mon, 03 Nov 2008 05:42:01 +0000</pubDate>
		<dc:creator>Ryan Greenberg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[classification]]></category>
		<category><![CDATA[color]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=191</guid>
		<description><![CDATA[Searching for all the photos on Flickr that are tagged &#8220;red&#8221; is old hat. Besides, searching for colors in tags is fraught with problems: people don&#8217;t have the patience to tag their photos exhaustively with all the colors in them, people may not be able to distinguish all the colors in a photo, and worse, they [...]]]></description>
			<content:encoded><![CDATA[<p>Searching for all the photos on Flickr that are tagged &#8220;red&#8221; is old hat. Besides, searching for colors in tags is fraught with problems: people don&#8217;t have the patience to tag their photos exhaustively with all the colors in them, people may not be able to distinguish all the colors in a photo, and worse, they may be &#8220;wrong&#8221; about the colors. After all, your red is my pink. (If you want to get philosophical, check out the <a href="http://plato.stanford.edu/entries/qualia-inverted/#SimInvQuaSce">inverted spectrum</a> problem, though this doesn&#8217;t pose a problem for Flickr tagging.)</p>
<p>An obvious approach is to tag photos with all their colors algorithmically. We can scan photos for colors and tag any picture with lots of #ff0000 &#8220;red&#8221;. Users who search for red will retrieve these results. This approach would be consistent, but it is still open to the problem of disagreement about colors&#8211;someone still has to define red in the computation. In terms from a recent 202 lecture, a semantic gap remains between the photo and the metadata used to describe (and consequently retrieve) it.</p>
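<p>A minimal sketch of this kind of algorithmic tagging, in Python. Everything here is an assumption for illustration: the five-color reference palette, the nearest-color rule, the 25% threshold, and the names <code>REFERENCE_COLORS</code>, <code>nearest_color_name</code>, and <code>color_tags</code> are hypothetical, not anything Flickr actually does. The palette itself is where someone still has to define red.</p>

```python
from collections import Counter

# Hypothetical reference palette: this is where "red" gets defined by fiat.
REFERENCE_COLORS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def nearest_color_name(pixel):
    """Return the name of the reference color closest to an (r, g, b) pixel."""
    def squared_distance(name):
        ref = REFERENCE_COLORS[name]
        return sum((p - r) ** 2 for p, r in zip(pixel, ref))
    return min(REFERENCE_COLORS, key=squared_distance)


def color_tags(pixels, min_share=0.25):
    """Tag a photo with every color covering at least min_share of its pixels."""
    counts = Counter(nearest_color_name(p) for p in pixels)
    total = len(pixels)
    return sorted(name for name, n in counts.items() if n / total >= min_share)


# A toy 4-pixel "photo": three reddish pixels and one blue pixel.
photo = [(250, 10, 10), (200, 30, 40), (255, 0, 0), (10, 10, 240)]
print(color_tags(photo))
```

<p>Lowering <code>min_share</code> tags more colors per photo; raising it keeps only the dominant ones. Either way the tags are consistent across photos, but the palette remains one person&#8217;s definition of each color.</p>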
<p>A solution to this problem is to search using criteria at the same semantic level that you require in your results. Idée has implemented this idea with its <a href="http://labs.ideeinc.com/multicolr/">Multicolr</a> interface for searching Flickr. You select a color and see pictures that contain that color. Using Multicolr is mesmerizing because you can adjust your search criteria to encompass multiple colors and see results matching your search. Selecting the same color multiple times (i.e., the equivalent of &#8220;<a href="http://flickr.com/search/?w=all&amp;q=redred&amp;m=tags">redred</a>&#8221;) increases its intensity in your search.</p>
<p>Textual search is likely to remain our primary means of retrieval for the foreseeable future&#8211;so much of our discourse is word-dominated&#8211;but this is an example of the frontiers of IR.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/search-flickr-by-color/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Research Engine Searches &#8220;Deep Web&#8221;</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 06:40:36 +0000</pubDate>
		<dc:creator>michael</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Deep Web]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=80</guid>
		<description><![CDATA[How much of the World Wide Web is actually indexed&#8230; 27.65 billion pages? Maybe about 0.2% of the total content? The &#8220;Deep Web&#8221; (web documents not immediately accessible by direct hyperlink from public pages) may contain something like 91,000 terabytes of data&#8230; as compared to an estimated 167 terabytes of Surface Web data.
A new service, [...]]]></description>
			<content:encoded><![CDATA[<p>How much of the World Wide Web is actually indexed&#8230; <a href="http://www.worldwidewebsize.com/">27.65 billion pages</a>? Maybe about 0.2% of the total content? The &#8220;<a href="http://en.wikipedia.org/wiki/Deep_web">Deep Web</a>&#8221; (web documents not immediately accessible by direct hyperlink from public pages) may contain something like <a href="http://en.wikipedia.org/wiki/Deep_web#Size">91,000 terabytes</a> of data&#8230; as compared to an estimated <a href="http://en.wikipedia.org/wiki/Deep_web#Size">167 terabytes</a> of Surface Web data.</p>
<p>A new service, called <a href="http://www.infovell.com/">Infovell</a>, hopes to help users find more of this &#8220;Deep Web&#8221; data&#8230; yet unlike Google and other Surface Web engines, it won&#8217;t be ad-supported. Instead, the service will be subscription-based. Read more at the <a href="http://www.readwriteweb.com/archives/sometimes_google_isnt_enough_when_researching_deep_web.php">ReadWriteWeb</a> blog. Here is an excerpt from the article:</p>
<blockquote><p><img class="alignright" style="border: 0pt none;margin: 5px;float: right" src="http://www.readwriteweb.com/images/infovell1.gif" alt="InfoVell" width="200" height="150" /><em>The engine scours through open-access repositories of information like PubMed Central and the U.S. Patent and Trademark Office Claims, but it also allows access to scholarly journals such as those from Oxford University Press, SAGE, Taylor &amp; Francis, Annual Reviews, Mary Ann Liebert Publications, and more. The culmination of these billions of pages currently unindexed by other engines, gives you access to content in the areas of Life Sciences, Medicines, Patents, Industry News, and other reference content from expert sources.</em></p></blockquote>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/18/new-research-engine-searches-deep-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Not so cool?</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/#comments</comments>
		<pubDate>Wed, 03 Sep 2008 14:53:38 +0000</pubDate>
		<dc:creator>Neha Kumar</dc:creator>
				<category><![CDATA[Assignment 1]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[relevance]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=47</guid>
		<description><![CDATA[http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/
Stealth search start-up Cuil (pronounced &#8220;cool&#8221;) launched its product on July 29th of this year, and was promptly subject to an angry backlash from its users. The start-up &#8211; founded by three former senior Google employees &#8211; claims to have an index size of 120 billion web pages (larger than Google&#8217;s, they say). [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/">http://www.techcrunch.com/2008/07/29/how-to-lose-your-cuil-20-seconds-after-launch/</a></p>
<p>Stealth search start-up <a href="http://www.cuil.com">Cuil</a> (pronounced &#8220;cool&#8221;) launched its product on July 29th of this year, and was promptly subject to an angry backlash from its users. The start-up &#8211; founded by three former senior Google employees &#8211; claims to have an index size of 120 billion web pages (larger than Google&#8217;s, they say). On the day of its launch, however, Cuil&#8217;s search results were not as rewarding. For example, a search for &#8220;Dog&#8221; resulted in 280 million hits on Cuil and 498 million on Google. Of course, quantity isn&#8217;t everything, but even in relevance, Google&#8217;s results were better.</p>
<p>The comparison to Google is only natural, since Google defines what search means to most of us today &#8211; search results that are relevant, but also photos, news articles, video files, etc. that complete the picture. It remains to be seen whether Cuil can offer a &#8216;universal&#8217; search package superior to the one that we&#8217;re already used to.</p>
<p>Relevant lectures: 21/22</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/03/not-so-cool/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>E-Discovery &#8211; Too Much Information (TMI!)</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/08/29/e-discovery-too-much-information-tmi/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/08/29/e-discovery-too-much-information-tmi/#comments</comments>
		<pubDate>Sat, 30 Aug 2008 04:57:02 +0000</pubDate>
		<dc:creator>Nat (Nathaniel) Wharton</dc:creator>
				<category><![CDATA[Assignment 1]]></category>
		<category><![CDATA[e-discovery]]></category>
		<category><![CDATA[law]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[tmi]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=17</guid>
		<description><![CDATA[Electronic discovery, or e-discovery, is the process of demanding and
sifting through &#8220;digital evidentiary artifacts&#8221; for lawsuits.
Information from Facebook, MySpace, chat, email, laptops, smart phones,
memory sticks, back-up tapes, and logs from service providers is now considered
&#8220;fair game&#8221; and subject to inspection when adversaries in lawsuits
demand and are granted access. E-discovery is an increasingly expensive
and Sisyphean reality of [...]]]></description>
			<content:encoded><![CDATA[<p>Electronic discovery, or e-discovery, is the process of demanding and<br />
sifting through &#8220;digital evidentiary artifacts&#8221; for lawsuits.<br />
Information from Facebook, MySpace, chat, email, laptops, smart phones,<br />
memory sticks, back-up tapes, and logs from service providers is now considered<br />
&#8220;fair game&#8221; and subject to inspection when adversaries in lawsuits<br />
demand and are granted access. E-discovery is an increasingly expensive<br />
and Sisyphean reality of modern court proceedings.  Court cases<br />
more frequently face early settlement, plaintiffs are increasingly<br />
unable to sue (or defend), &#8220;for fear of [enormous] e-discovery costs&#8221;,<br />
and the justice system is increasingly over-burdened.</p>
<p>Ordinary court cases risk millions of dollars, and hours of being<br />
bogged down in e-discovery.  As a Verizon attorney explains for his<br />
business, &#8220;Almost every case [now] involves e-discovery and spits out<br />
&#8220;terabytes&#8221; of information&#8230;. 200 lawyers can easily review electronic<br />
documents for four months, at a cost of millions of dollars.&#8221;  As a<br />
result of the increased burden of effort, e-discovery businesses are<br />
booming, frequently charging $125-$600/hr. Annual revenues from<br />
e-discovery businesses &#8220;have grown from $40m in 1999 to about $2<br />
billion in 2006 and may hit $4 billion next year.&#8221;</p>
<p>&#8220;Results [of e-discovery] have to be indexed and reviewed by<br />
humans. This usually falls to the junior staff at law firms, some of<br />
whom are so fed up with the drudgery that they have quit the profession<br />
altogether.&#8221;</p>
<p>Privacy is increasingly subject to invasion, as insurance<br />
companies have demanded personal records of their clients when<br />
disputing customer claims.  For example, in a recent lawsuit, &#8220;Horizon<br />
Blue Cross Blue Shield of New Jersey&#8230; asked and were granted the<br />
right to see practically everything the teenagers had said on their<br />
Facebook and MySpace profiles, in instant-messaging threads, text<br />
messages, e-mails, blog posts and whatever else the girls might have<br />
done online.&#8221;</p>
<p>In this context, it looks like your <a title="memex" href="http://en.wikipedia.org/wiki/Memex" target="_blank">memex</a> could be your worst enemy!</p>
<p>For more, see the original Economist.com article:  <a title="Economist.com - The Big Data Dump" href="http://www.economist.com/business/displaystory.cfm?story_id=12010377" target="_blank"><strong>The Big Data Dump</strong></a></p>
<p>This may touch on the following lectures:<br />
ISSUES AND CONTEXTS (9/3)<br />
ORGANIZATION {AND,OR,VS} RETRIEVAL (9/8)<br />
PERSONAL INFORMATION MANAGEMENT (10/20)</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/08/29/e-discovery-too-much-information-tmi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
