<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>INFO 202 Fall 08 Blog &#187; categorization</title>
	<atom:link href="http://blogs.ischool.berkeley.edu/i202f08/tag/categorization/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.ischool.berkeley.edu/i202f08</link>
	<description>I202 course Fall 08</description>
	<lastBuildDate>Wed, 11 Mar 2009 17:29:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>What&#8217;s a Small Farm?</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2009/02/24/whats-a-small-farm/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2009/02/24/whats-a-small-farm/#comments</comments>
		<pubDate>Tue, 24 Feb 2009 21:05:47 +0000</pubDate>
		<dc:creator>Shawna Hein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bias]]></category>
		<category><![CDATA[categorization]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=423</guid>
		<description><![CDATA[This year the USDA released the much-anticipated 2007 agricultural census.  This census showed a rise in the number of small farms, and this statistic was celebrated in many farm and food articles and blogs.
Gristmill points out that former USDA Economic Research Service researcher, Michael Roberts, argues that there may not actually be more small [...]]]></description>
			<content:encoded><![CDATA[<p>This year the USDA released the much-anticipated <a href="http://www.agcensus.usda.gov/Publications/2007/index.asp">2007 agricultural census</a>.  This census showed a rise in the number of small farms, and this statistic was celebrated in many farm and food articles and blogs.</p>
<p><a href="http://gristmill.grist.org/story/2009/2/19/8501/61694">Gristmill points out</a> that former USDA Economic Research Service researcher, Michael Roberts, <a href="http://greedgreengrains.blogspot.com/2009/02/2007-agricultural-census-and-note-of.html">argues that there may not actually be more small farms</a>, there may simply be a difference in what &#8220;counts&#8221; as a small farm.</p>
<blockquote><p>The important revelation here is that the USDA uses statistical weighting to arrive at the numbers for these micro-farms since many of these people don&#8217;t even self-identify as farmers &#8212; and so their precision is entirely a question of their methodology, i.e. how they decide to model the presence/frequency of these small operations. Census weighting is, of course, both controversial and necessary. Counting everything by hand can have a larger margin for error than rigorous statistical modeling. Indeed, this &#8220;controversy&#8221; is right now at the heart of a monumental battle between Democrats and Republicans over the U.S. Census (just ask <a href="http://www.prospect.org/csnc/blogs/ezraklein_archive?month=02&amp;year=2009&amp;base_name=why_the_census_matters">Sen. Judd Gregg</a>).</p>
<p>That said, there is nothing inherently wrong with the practice. However, even if your overall approach is solid, if you then change your weighting techniques from year to year, comparing annual changes is all but impossible. And that appears to be exactly what the USDA is doing.</p></blockquote>
<p>Needless to say, this is a pretty big deal.  Is the number of small farms actually growing?  Or is the current political climate in this realm simply pushing the USDA to fudge its methods a little, causing a shift in its categorization schemes?</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2009/02/24/whats-a-small-farm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Auto-clustering of UC Berkeley courses</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/auto-clustering-of-uc-berkeley-courses/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/auto-clustering-of-uc-berkeley-courses/#comments</comments>
		<pubDate>Sat, 13 Dec 2008 00:22:00 +0000</pubDate>
		<dc:creator>Kentaro Suzuki</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[classification]]></category>
		<category><![CDATA[EM algorithm]]></category>
		<category><![CDATA[information retrieval]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=369</guid>
		<description><![CDATA[This may be my last post for the 202 blog.
I took the statistical learning theory course in the EECS department this semester. The course provides an introduction to probabilistic models and requires students to do a final project. I chose unsupervised clustering of UC Berkeley courses based on their descriptions.
The problem [...]]]></description>
			<content:encoded><![CDATA[<p>This may be my last post for the 202 blog.</p>
<p>I took the <a href="http://inst.eecs.berkeley.edu/~cs281a/fa08/index.html">statistical learning theory</a> course in the EECS department this semester. The course provides an introduction to probabilistic models and requires students to do a final project. I chose unsupervised clustering of UC Berkeley courses based on their descriptions.</p>
<p>The background to the problem is as follows. UC Berkeley provides an <a href="http://schedule.berkeley.edu/?PageID=srchfall.html">online course search system</a>, but it is very basic. It only lets users search by course name, instructor name, department and so on; we can&#8217;t search for keywords in course descriptions.</p>
<p>First of all, I believe it should provide keyword search over course descriptions. It would also be desirable for the system to include a &#8220;recommendation system&#8221; that offers students lists of courses likely to suit them, based on their course registration histories (perhaps like Amazon&#8217;s recommendation system, a kind of &#8220;folksonomy&#8221; for classifying courses).</p>
<p>To implement a recommendation system based on students&#8217; course registration histories, Berkeley courses would need to be clustered by those histories in an unsupervised manner.</p>
<p>I don&#8217;t have access to students&#8217; registration histories. So, as a substitute, I use the online course descriptions from the current system and try to cluster UC Berkeley courses based on them, using a mixture of multivariate Bernoulli distributions fitted with the EM algorithm (for details, see the text &#8220;Introduction to Information Retrieval&#8221;, pp. 338-340), and test whether I can cluster UC Berkeley&#8217;s courses in a reasonable and explainable way without supervision.</p>
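<p>To make the method concrete, here is a minimal sketch of EM for a mixture of multivariate Bernoulli distributions. It is not the code used for the project: it assumes each course description has already been reduced to a 0/1 vector over a fixed vocabulary, and the function name, the toy four-word vocabulary and the smoothing constant are all illustrative assumptions.</p>
<pre>
```python
import math
import random


def em_bernoulli_mixture(docs, k, iters=50, seed=0, eps=1e-4):
    """Cluster 0/1 term vectors with a mixture of multivariate Bernoullis.

    docs: list of equal-length lists of 0/1 (term absent/present).
    Returns (priors, term_probs, responsibilities).
    """
    rng = random.Random(seed)
    n, m = len(docs), len(docs[0])
    # Random soft initial assignment of each document to the k clusters.
    resp = []
    for _ in range(n):
        row = [rng.random() + 0.1 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(iters):
        # M-step: re-estimate cluster priors and per-term probabilities,
        # smoothed with eps so that no probability is exactly 0 or 1.
        sizes = [sum(resp[i][c] for i in range(n)) for c in range(k)]
        priors = [(sizes[c] + eps) / (n + k * eps) for c in range(k)]
        probs = [
            [
                (sum(resp[i][c] * docs[i][t] for i in range(n)) + eps)
                / (sizes[c] + 2 * eps)
                for t in range(m)
            ]
            for c in range(k)
        ]
        # E-step: recompute responsibilities in log space for stability.
        for i in range(n):
            logs = []
            for c in range(k):
                ll = math.log(priors[c])
                for t in range(m):
                    p = probs[c][t]
                    ll += math.log(p if docs[i][t] else 1.0 - p)
                logs.append(ll)
            top = max(logs)
            weights = [math.exp(v - top) for v in logs]
            total = sum(weights)
            resp[i] = [w / total for w in weights]
    return priors, probs, resp


# Hypothetical vocabulary: ["probability", "regression", "java", "compiler"]
toy_docs = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0],  # statistics-like descriptions
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1],  # programming-like descriptions
]
priors, probs, resp = em_bernoulli_mixture(toy_docs, k=2)
# Hard label for each document: the cluster with the largest responsibility.
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
print(labels)
```
</pre>
<p>On this toy input the first three documents should share one label and the last three the other; which cluster gets which number depends on the random start. Real course descriptions are far noisier, so results there depend on the vocabulary, the number of clusters and the initialization.</p>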
<p>The results are as follows. I tried categorizing the courses of the Math, Information, Statistics, Economics and Computer Science departments (226 courses in total) into 7 clusters.  Several of the categories generated by the algorithm are explainable, such as a &#8220;statistical/mathematical methodology-oriented course cluster&#8221;, a &#8220;programming-related course cluster&#8221;, an &#8220;individual study/seminar-related course cluster&#8221; and an &#8220;economics-related (but less statistics-oriented) course cluster&#8221;.</p>
<p>Of course, not all courses are explained by these labels. But although I applied very basic methods, without standard information-retrieval preprocessing such as lemmatization, stemming and stop-word removal, the results are better than I expected before running the experiment.</p>
<p>1st cluster courses</p>
<p><a href="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster1.png"><img class="alignnone size-medium wp-image-372" src="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster1-274x300.png" alt="" width="274" height="300" /></a></p>
<p>2nd cluster courses</p>
<p><a href="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster2.png"><img class="alignnone size-medium wp-image-373" src="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster2-300x98.png" alt="" width="300" height="98" /></a></p>
<p>3rd cluster courses</p>
<p><a href="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster3.png"><img class="alignnone size-medium wp-image-374" src="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster3-300x98.png" alt="" width="300" height="98" /></a></p>
<p>4th cluster courses</p>
<p><a href="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster4.png"><img class="alignnone size-medium wp-image-375" src="http://blogs.ischool.berkeley.edu/i202f08/files/2008/12/cluster4-300x184.png" alt="" width="300" height="184" /></a></p>
<p>This is a back-of-the-envelope simulation and the result is simple. There are also some problems. But it does show a definite result and, more importantly, for me it is a task that integrates Info202 (information organization and retrieval), Info206 (network programming and Java) and CS281A (statistical learning theory, especially the EM algorithm). I am satisfied with the result and with the fact that I was able to implement this simulation by myself in a short time.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/auto-clustering-of-uc-berkeley-courses/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>My last 202 blog post</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/my-last-202-blog-post/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/my-last-202-blog-post/#comments</comments>
		<pubDate>Fri, 12 Dec 2008 19:14:48 +0000</pubDate>
		<dc:creator>James Tucker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[authority]]></category>
		<category><![CDATA[beer]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[vocabulary]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=365</guid>
		<description><![CDATA[The Beer Judge Certification Program (no, really) has developed a set of guidelines (read categories and vocabulary) for judging beer. They even have it in downloadable XML format . They have developed (what they feel is) an authoritative vocabulary describing the various qualities of beer within their defined categories. Interestingly, they recognize that brewing styles [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.bjcp.org/">Beer Judge Certification Program</a> (no, really) has developed a set of <a href="http://www.bjcp.org/2008styles/catdex.php">guidelines</a> (read categories and vocabulary) for judging beer. They even have it in downloadable <a href="http://www.bjcp.org/docs/xmlstyleguide.zip">XML format</a>. They have developed (what they feel is) an authoritative vocabulary describing the various qualities of beer within their defined categories. Interestingly, they recognize that brewing styles change, so their vocabulary is descriptive rather than prescriptive and it will change over time. The organization also states that they use &#8220;experts&#8221; to choose the commercial examples (of the types of beer listed) instead of online surveys in order to remove the issue of &#8220;popularity contests&#8221; overwhelming the list.</p>
<p>So, they have taken a more Svenonian approach to BeerML. However, I&#8217;ve never heard of these folks before (though, I am intrigued and have installed their iPhone app already). I find it interesting that one of my first questions upon reading their site (aside from how do I get in on this) was &#8220;What makes them an authority?&#8221; I have searched their site and find no association with any governing body. It seems like a bunch of folks trying to develop an authority on their own. Somewhat self-policing. I&#8217;d look more, but need to get back to writing my CMC paper and studying vector modeling. At least now I have vocabulary to use for describing the glass of awesome that is Belgian Witbier.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/12/12/my-last-202-blog-post/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Taxonomy of Philosophy</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/07/taxonomy-of-philosophy/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/07/taxonomy-of-philosophy/#comments</comments>
		<pubDate>Fri, 07 Nov 2008 21:42:17 +0000</pubDate>
		<dc:creator>Nick Doty</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[borges]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[classification]]></category>
		<category><![CDATA[facets]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[weinberger]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=195</guid>
		<description><![CDATA[Weinberger links to this intriguing attempt to categorize philosophical papers for a system to &#8220;access online work in philosophy.&#8221;  
The best part is the discussion that follows David Chalmers&#8217; blog post about the project, which sends me through a microcosm of the 202 course so far.  One commenter links to &#8220;An Essay towards a Real [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.hyperorg.com/blogger/2008/11/07/a-taxonomy-of-philosophy/" target="_blank">Weinberger</a> links to this <a href="http://consc.net/taxonomy.html" target="_blank">intriguing attempt to categorize philosophical papers</a> for a system to &#8220;access online work in philosophy.&#8221;  </p>
<p>The best part is the discussion that follows <a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html" target="_blank">David Chalmers&#8217; blog post</a> about the project, which sends me through a microcosm of the 202 course so far.  One commenter links to <a href="http://en.wikipedia.org/wiki/An_Essay_towards_a_Real_Character_and_a_Philosophical_Language" target="_blank">&#8220;An Essay towards a Real Character and a Philosophical Language&#8221;</a> in which John Wilkins attempts to create a language where every word defines itself based on a hierarchy of 40 Genuses (each divided into Differences and then Species) of his design.  The Wikipedia article points me to Borges&#8217; response, <a href="http://www.crockford.com/wrrrld/wilkins.html" target="_blank">&#8220;The Analytical Language of John Wilkins&#8221;</a>, where he casts doubt on such universal categorization schemes by comparison to <em><a href="http://en.wikipedia.org/wiki/Celestial_Emporium_of_Benevolent_Knowledge%27s_Taxonomy" target="_blank">The Celestial Emporium of Benevolent Knowledge</a></em>.  Other commenters on the blog post point out similar problems: a separate set of categories for the <a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html#comment-138029496" target="_blank">history of philosophy</a> seems strange since many of these papers are relevant to the philosophical topics themselves; there seem to be &#8220;<a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html#comment-138016162" target="_blank">multiple principles of division</a>&#8221;.</p>
<p>One of the authors of the philosophy taxonomy <a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html#comment-138036422" target="_blank">responds</a> with a return to pragmatism:</p>
<blockquote><p>OK, it&#8217;s a pseudo-taxonomy, or maybe just a category scheme. We&#8217;re not doing science here, just trying to come up with something useful and convenient.</p></blockquote>
<p>Excellent.  We all know that classification systems should be judged by their usefulness rather than how essential their representations of the world are.</p>
<p>Finally, the other author of the taxonomy <a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html#comment-138055772" target="_blank">argues for the values of faceted classification</a>:</p>
<blockquote><p>our system allows massive cross-classification both of papers and categories: any paper or category can be in multiple categories. This allows us to cut the pie in many ways at once, and we hope that people will generally be able to find what they are looking for following their intuitive way of cutting the pie (along periods, figures, views, points of disagreement, etc).</p></blockquote>
<p>Though if he is attempting to cut the pie in many different ways at once, <a href="http://fragments.consc.net/djc/2008/11/a-taxonomy-of-philosophy.html#comment-138166014" target="_blank">I would think</a> he would want explicitly orthogonal classifications, rather than one enormous tree.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/07/taxonomy-of-philosophy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On Political Voicemail</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/on-political-voicemail/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/on-political-voicemail/#comments</comments>
		<pubDate>Mon, 03 Nov 2008 06:05:15 +0000</pubDate>
		<dc:creator>Annette Greiner</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[personal space of information]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=192</guid>
		<description><![CDATA[The other day I got voice mail from Bill Clinton. Yes, Bill himself apparently took the time to call me and leave me a message reminding me to vote against Prop. 8. It must have been the real Bill, because my phone number is on the national do-not-call list, so I&#8217;m protected from annoying phone [...]]]></description>
			<content:encoded><![CDATA[<p>The other day I got voice mail from Bill Clinton. Yes, Bill himself apparently took the time to call me and leave me a message reminding me to vote against Prop. 8. It must have been the real Bill, because my phone number is on the national do-not-call list, so I&#8217;m protected from annoying phone calls sent out by machinery. I&#8217;m only sorry I wasn&#8217;t home to talk to him myself, assure him that I will vote against Prop. 8, and ask how Hillary is feeling these days. </p>
<p>But seriously, it&#8217;s funny how phone calls from political campaigns get free rein under the rules around the national do-not-call list. Somehow it was decided that sales calls from for-profit businesses are in a different category from calls trying to sell you on a political agenda. Surveys by for-profit companies seem also to have escaped being categorized as sales calls. The argument for keeping things this way is that some people want to receive calls from nonprofits and some presumably would like to be included in surveys. Assuming that&#8217;s true, what we need is more granularity in the do-not-call list. Wouldn&#8217;t it be nice if we could all decide for ourselves whether sales calls, surveys, and political calls can be categorized as annoying? As I&#8217;m sure there are a few people out there who would not want to miss out on their yearly call from Bill Clinton, they would be able to set their political-call option to &#8220;useful&#8221; rather than &#8220;annoying&#8221; and have it still come through. The problem here is not so much that things have been classified wrong but that someone else is calling the shots on everyone else&#8217;s personal space of information.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/11/02/on-political-voicemail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tagging with pictures &#124; Tagging the physical world.</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/10/15/tagging-with-pictures-tagging-the-physical-world/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/10/15/tagging-with-pictures-tagging-the-physical-world/#comments</comments>
		<pubDate>Wed, 15 Oct 2008 20:41:02 +0000</pubDate>
		<dc:creator>James Tucker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[classification]]></category>
		<category><![CDATA[folksonomy]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[vocabulary]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=158</guid>
		<description><![CDATA[At the risk of fanning political flames, this jpg was just sent to me via email. If you move past the humor and politics of the photo, it seems salient to today&#8217;s topic of tagging. Specifically, using the characteristics we collectively/culturally ascribe to trains of varying types to tag each of the presidential/vice-presidential candidates. It [...]]]></description>
			<content:encoded><![CDATA[<p>At the risk of fanning political flames, this jpg was just sent to me via email. If you move past the humor and politics of the photo, it seems salient to today&#8217;s topic of tagging. Specifically, using the characteristics we collectively/culturally ascribe to trains of varying types to tag each of the presidential/vice-presidential candidates. It was done visually instead of with words (modern, green, fast, powerful, coal powered, archaic, plastic, child&#8217;s toy). Are these &#8220;good&#8221; tags? I think guys named Nick who went to Amherst (the h is silent) would say yes.</p>
<p><img style="vertical-align: baseline" src="http://www.bobcesca.com/images/electiontrains.jpg" alt="Election Trains" width="400" height="475" /></p>
<p>After I stopped laughing, this made me wonder if there were already a system tagging things with pictures out there. I did not find any with a quick google search. Just a number of whitepapers.</p>
<p>However, I did find <a href="http://www.techcrunch50.com/2008/conference/presenter.php?presenter=71">Tonchidot</a>.</p>
<p>While not specifically related to using images to tag other images or ideas, they are developing an iPhone app that adds tags to the images the camera sees in real time. They take community tags and make them mobile in a very compelling way. Want to know what type of flower that is? Tree? Year a building you are looking at was made, who designed it? Which store at the mall has the thing you want to buy? How many stars the restaurant you are looking at has on Yelp? When the next BART train is arriving at your station? Find a lower price for something in a different store. Purchase something via the phone. Leave a message for a friend to pick up by walking by a specific place.</p>
<p>Tagging a specific location is also possible. This reminds me of William Gibson&#8217;s book Spook Country. One aspect of the storyline was the development of location based digital art installations. In order to see a specific digitally created piece you needed specially made hardware (eyeglass digital display) and a computer. You also needed to be in a specific geo-spatial location. Now, you&#8217;ll just need your iPhone.</p>
<p>One of the things an artist in the book said reminds me of the potential of Tonchidot&#8217;s technology. Imagine traveling across the country and seeing a whole 2nd landscape that covers, interacts, and integrates with the physical world. Offering different things to see, information about what you&#8217;re seeing, directions to get there, prices for goods/services (who would not love to know the cheapest place to get gas?). And of course a whole new opportunity for advertising and spam.</p>
<p>Maybe that&#8217;s the problem with spam. No ontological control.</p>
<p>The video is about 18mins long and worth watching. There is a particularly interesting practical question around the 14:15 min mark.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/10/15/tagging-with-pictures-tagging-the-physical-world/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>A Dogma of Categorization</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/10/05/a-dogma-of-categorization/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/10/05/a-dogma-of-categorization/#comments</comments>
		<pubDate>Sun, 05 Oct 2008 09:05:08 +0000</pubDate>
		<dc:creator>Nick Doty</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[dogma]]></category>
		<category><![CDATA[empiricism]]></category>
		<category><![CDATA[facets]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[pragmatism]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=130</guid>
		<description><![CDATA[In determining facets or categories for a set of objects, we might tend to think that some facets are better than others because they are more inherently essential to a particular set of objects.  I believe this is a dogma we should be careful to avoid and as a result I argue that we can [...]]]></description>
			<content:encoded><![CDATA[<p>In determining facets or categories for a set of objects, we might tend to think that some facets are better than others because they are more inherently essential to a particular set of objects.  I believe this is a dogma we should be careful to avoid and as a result I argue that we can only be pragmatic in evaluating ontologies.<span id="more-130"></span></p>
<p>In <a href="http://www.calculemus.org/lect/06transl/quine1.html" target="_blank">Two Dogmas of Empiricism</a>, Quine calls it the first dogma of empiricism that some questions can be answered by appeals to the meanings of the terms while other questions can only be answered by appeals to experience of the world.  He goes on to show at great length that there is no sharp difference here &#8212; the &#8220;meaning&#8221; of the term and its particular analytic properties are not special.  While at first it may seem that &#8220;All bachelors are unmarried&#8221; and &#8220;All swans are white&#8221; are verified in different ways (the first just by looking at the meaning of &#8220;bachelor&#8221;, the second by investigating the world&#8217;s swans), Quine shows that on careful consideration, this distinction is quite blurry and they are verified in the same way.</p>
<p>Similarly, there are not two classes of properties &#8212; the essential and the non-essential* &#8212; from which we can decide upon the best facets for categorizing a group of objects.  (The same argument applies to the ontologies that we created last week &#8212; there are not objects or words that are essential to a particular domain, any vocabulary choices you made were not inherently right or wrong.)  In fact, there are infinitely** many ways we can categorize &#8220;tools&#8221; and many very simple ones can achieve the goal set forth to categorize both the original 10 objects and any set of 5 objects the TAs might throw at you.  (For example, consider the single, boolean facet &#8220;Existence&#8221; which has the headings &#8220;Exists&#8221; and &#8220;Doesn&#8217;t Exist&#8221;.)</p>
<p>What should we do then, if there is no inherent advantage/disadvantage to any one facet or vocabulary?  Quine&#8217;s blunt conclusion is that his arguments will cause &#8220;a shift toward pragmatism.&#8221;  Carnap is a little more explicit (he&#8217;s discussing the question of whether abstract entities exist, but it&#8217;s applicable to any ontological questions about language frameworks):</p>
<blockquote><p>For those who want to develop or use semantical methods, the decisive question is not the alleged ontological question of the existence of abstract entities but rather the question whether the use of abstract linguistic forms [or a particular set of facets or vocabulary we create] is expedient and fruitful for the purposes for which semantical analyses are made, viz. the analysis, interpretation, clarification, or construction of languages of communication, especially languages of science.</p>
<p><a href="http://www.ditext.com/carnap/carnap.html">Empiricism, Semantics, and Ontology</a> by Rudolf Carnap***</p></blockquote>
<p>This seems just right &#8212; we created vocabularies that were handy in their naming and granularity for whoever was using our vocabulary and we should choose facets that are useful for whatever our particular purpose is.  I think <a href="http://courses.ischool.berkeley.edu/i202/f08/assignments/A3-Feedback.html" target="_blank">Professor Glushko agrees</a> when he states that the critical piece is &#8220;to choose an appropriate scope (and hence, an <em>intended user community</em>)&#8221; (emphasis mine).  To repeat, the decisive question is just whether a particular ontology is fruitful, not whether it&#8217;s somehow ontologically better or worse.  We must be pragmatic and results-focused in evaluating our ontological decisions, if for no other reason than there is no clear alternative.</p>
<p>This will also, I hope, cast doubt on such projects as the Colon Classification system which rely on fundamental categories and semantic universals in order to organize and describe all information for all purposes.</p>
<p> </p>
<p><em>Notes:</em></p>
<p><em>* It might be said here that my real complaint is with <a href="http://plato.stanford.edu/entries/aristotle-metaphysics/#Cat" target="_blank">Aristotle and essentialism</a></em><em> rather than with empiricists and meaning.  But I contend that the same arguments will have the same implications.  As Quine says (in the same paper), &#8220;the Aristotelian notion of essence was the forerunner, no doubt, of the modern notion of intension or meaning.&#8221;</em></p>
<p><em>** Infinite?  This may be bounded, depending on how you define it, but at least as large as the number of possible groupings of N distinct objects where N is the number of tools.</em></p>
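<p>As a rough worked example of the lower bound in the note above: if a &#8220;grouping&#8221; is read as a partition of the N objects into non-empty groups (an assumption; the note does not define the term), the count is the Bell number of N. The helper below is an illustrative sketch with a hypothetical name.</p>
<pre>
```python
def bell_number(n):
    """Count the ways to partition n distinct objects into non-empty groups."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n - 1):
        nxt = [row[-1]]  # each row starts with the last entry of the previous row
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[-1]


print(bell_number(10))
```
</pre>
<p>For the 10 original tools this prints 115975, so even a very small collection admits far more groupings than anyone could evaluate one by one.</p>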
<p><em>*** </em><em>There was a question in class last week about how ontology differed in philosophy vs. in our creation of vocabularies.  I believe the key difference is that the ontology in philosophy is at a meta-level.  Rather than investigating what the things are in a particular domain (which we did as vocabulary-creators), philosophers investigate what it means for something to be a thing and question, for example, whether abstract things exist.  This piece by Carnap is an example of philosophical ontology since it&#8217;s discussing whether these abstract entities actually exist and whether we should accept them.</em></p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/10/05/a-dogma-of-categorization/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The intelligent cloud</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/the-intelligent-cloud/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/the-intelligent-cloud/#comments</comments>
		<pubDate>Wed, 24 Sep 2008 06:51:16 +0000</pubDate>
		<dc:creator>Julian Couhault</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Categories]]></category>
		<category><![CDATA[categorization]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=107</guid>
		<description><![CDATA[After our discussion on Monday on automation and Svenonius&#8217;s attitude toward the high cost of categorization, I was interested to read a recent Google blog post about the future of Google&#8217;s search technology.
We discussed that current technology would not allow automation to recognize / understand things such as metaphors, fuzzy words, and multi-word terms.
Google is [...]]]></description>
			<content:encoded><![CDATA[<p>After our discussion on Monday on automation and Svenonius&#8217;s attitude toward the high cost of categorization, I was interested to read a recent Google blog post about the future of Google&#8217;s search technology.</p>
<p>We discussed that current technology would not allow automation to recognize / understand things such as metaphors, fuzzy words, and multi-word terms.</p>
<p>Google is predicting that by 2019 their technology will be able to do much more than fully automate categorization and language comprehension: it will also solve complex problems and learn from its research.  The impact of their technology will go well beyond Google&#8217;s offerings and will generate many significant benefits for mankind.</p>
<blockquote><p>&#8220;Thus, computer systems will have greater opportunity to learn from the collective behavior of billions of humans. They will get smarter, gleaning relationships between objects, nuances, intentions, meanings, and other deep conceptual information.&#8221;</p></blockquote>
<p><a href="http://googleblog.blogspot.com/2008/09/intelligent-cloud.html">The intelligent cloud</a></p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/the-intelligent-cloud/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Dewey Decimal</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/dewey-decimal/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/dewey-decimal/#comments</comments>
		<pubDate>Wed, 24 Sep 2008 01:30:54 +0000</pubDate>
		<dc:creator>K. Joyce Tsai</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[taxonomy]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=105</guid>
		<description><![CDATA[Based on section discussion of Cory Doctorow&#8217;s point &#8220;schemas aren&#8217;t neutral&#8221; and on a librarian friend complaining that Korea got shafted when it came to the folktale section of the Dewey Decimal system, I decided to look at the complete list of Dewey Decimal classes.
Like Nick mentioned in section, the religion section is overwhelmingly dominated [...]]]></description>
			<content:encoded><![CDATA[<p>Based on section discussion of Cory Doctorow&#8217;s point &#8220;schemas aren&#8217;t neutral&#8221; and on a librarian friend complaining that Korea got shafted when it came to the folktale section of the Dewey Decimal system, I decided to look at <a href="http://en.wikipedia.org/wiki/List_of_Dewey_Decimal_classes">the complete list</a> of Dewey Decimal classes.</p>
<p>As Nick mentioned in section, the <a href="http://en.wikipedia.org/wiki/List_of_Dewey_Decimal_classes#200_.E2.80.93_Religion">religion section</a> is overwhelmingly dominated by Christianity. Also, any time languages are mentioned, European languages get multiple categories (English, Other Germanic Languages, French, Spanish, Italian, Slavic, Scandinavian) while the rest of the world is stuck in the &#8220;other&#8221; category. Wikipedia, font of all knowledge, mentions that the Library of Congress system is even more US-centric than the Dewey Decimal system.</p>
<p>Makes you wonder what sort of systems of categorization information scientists in other countries create.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/23/dewey-decimal/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>On the Subject of Important Definitions &#8211; OR &#8211; Why Politics and Categorization Don&#8217;t Mix</title>
		<link>http://blogs.ischool.berkeley.edu/i202f08/2008/09/22/on-the-subject-of-important-definitions-or-why-politics-and-categorization-dont-mix/</link>
		<comments>http://blogs.ischool.berkeley.edu/i202f08/2008/09/22/on-the-subject-of-important-definitions-or-why-politics-and-categorization-dont-mix/#comments</comments>
		<pubDate>Mon, 22 Sep 2008 22:55:11 +0000</pubDate>
		<dc:creator>Michael Lissner</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[categorization]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[poverty]]></category>

		<guid isPermaLink="false">http://blogs.ischool.berkeley.edu/i202f08/?p=97</guid>
		<description><![CDATA[In Erin Knight&#8217;s post, she talks about how the Capitol is trying to figure out how to redefine homelessness. This reminds me of a similar issue that I have encountered year after year while working for Contra Costa County.
Back in about 1970, a bit of research was done to determine what the poverty level should [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://blogs.ischool.berkeley.edu/i202f08/2008/09/17/capitol-strives-to-define-homeless/" target="_self">Erin Knight&#8217;s post</a>, she talks about how the Capitol is trying to figure out how to redefine homelessness. This reminds me of a similar issue that I have encountered year after year while working for Contra Costa County.</p>
<p>Back in about 1970, the government set out to determine what the poverty level should be. They did a bunch of research, but eventually just decided that the thing to do was to take the cost of food for a given family size and multiply it by three. Out of this math, we have the poverty level.</p>
<p>From this number, the government has adjusted every year for inflation, and with that, we arrive at the <a href="http://aspe.hhs.gov/poverty/08Poverty.shtml" target="_blank">federal poverty levels for 2008</a>.</p>
<p>Now, this would be pretty bad research, and were I the professor overseeing the high schoolers responsible for these measures, I would probably scold them for committing every bad research method ever. The federal government, however, has taken these measures and based pretty much every aid program on them&#8230; for the past 30-40 years.</p>
<p>Brilliant.</p>
<p>In class, we have talked about how important it is to have specific and precise ways of categorizing things. Unfortunately, the thing being categorized here is people, and nobody wants to raise the poverty level while in office, because that would mean that X number of people fell into poverty on their watch.</p>
<p>When politics meets categorization, problems ensue.</p>]]></content:encoded>
			<wfw:commentRss>http://blogs.ischool.berkeley.edu/i202f08/2008/09/22/on-the-subject-of-important-definitions-or-why-politics-and-categorization-dont-mix/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
