Archive for September, 2008

Our Digital Lives, Monitored By A Hidden ‘Numerati’

I listened to an interview, broadcast on NPR’s Fresh Air, with Stephen Baker, in which he discusses his new book, “Numerati”, which examines the “mathematical modeling” of humanity and what he believes are some of the potential consequences of this activity. He talks about how information is collected about each of us from cell phone use, credit cards, supermarket scanners, Internet shopping and many other sources. This information about our choices is now being analyzed to build customized profiles, and based on the results consumers will be targeted for particular goods and services. On a more serious note, the same type of data mining is being used in attempts to understand people more deeply and to predict behavior, such as someone’s potential to be involved in terrorism. If you have 20 minutes to listen, this is an interesting topic in light of our current discussions on data mining and classification of information.

http://www.npr.org/templates/story/story.php?storyId=95166854

Copyrights for Recipes?

http://www.theatlantic.com/doc/200810/bread

On the topic of authorship, royalties and intellectual property, I found an article discussing the predicament bakers and cooks face when their recipes are reproduced without acknowledgment (or with only partial acknowledgment), proliferated and popularized. Regardless of whether the creation (e.g., the grilled pizza) is credited to a particular person, that person certainly does not receive a commission or royalties when the item appears on the menus of various dining establishments or on websites.

In baking, where a quarter teaspoon of a single ingredient can mean the difference between a perfectly brown crust and a dull, yellow, doughy one, the baker who perfected the recipe may feel entitled to ownership rights. What I’m having difficulty wrapping my head around is the notion that a pasta recipe with a twist can carry the same copyright weight as, for example, a novel or a textbook. It’s hard for me to place recipes and the next new technological innovation within the same intellectual property realm. In addition, artisanal craftsmanship, under which I think baking, olive oil making, etc. fall, values the hand-me-down, generation-after-generation-of-toiling-to-perfect-the-craft element, which seems to make it impossible to pinpoint ownership.

Shrug, who knows…

“Oh, you are on the blacklist!”

The previous reading, “Name Matching in Law Enforcement and Counter-Terrorism”, reminds me of the time I was mistakenly identified as a debtor in arrears on a cell phone bill.

About 10 years ago, when I tried to sign up for a cell phone contract for the first time, a clerk at the cell phone store refused me because of “a past debt that is still unpaid”. I insisted that this was my first attempt to get a cell phone, so I could not possibly have missed a payment. But the clerk said that someone with the same name and the same birthday as mine had not paid a bill in the past (in short, “I” was on the blacklist), and that he could not sign a contract with me. I had neither the time nor the documents to persuade him, so I gave up on getting a contract that day.

My name, Kentaro Suzuki, is very common. Suzuki is the second most common family name in Japan (about 1.7 million people, based on telephone books). I’m not sure exactly how many people named Kentaro there are in Japan. However, I believe there are a lot of “Kentaro Suzuki”s, because a Google search (using Japanese characters) reports 20,700 web pages that contain “Kentaro Suzuki”. Of course, most of these pages are not related to me. According to Google, there is a “Kentaro Suzuki” who is a former professional soccer player, and others who are a lawyer, a company president, a street musician, and so on. Fortunately, it seems that no Kentaro Suzuki has committed a felony…until now!

When we construct a name-matching system, resolving “homonyms” (different people who share a name) is one of the crucial problems. The invariant attributes most often used to identify a person are “Name” and “Birthday”. But how do we distinguish people who have the same name and the same birthday? “Address” and “Phone” are imperfect because they often change or are not kept up to date. A universal code such as “SSN” is one candidate, but no universal code covers everyone. Moreover, Japan has no universal code like the SSN.
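To make the homonym problem concrete, here is a minimal Python sketch of matching on name and birthday. Everything in it is an assumption for illustration: the `Person` record, the optional `national_id` field (a stand-in for a universal code that, as noted, often does not exist), and the three outcome labels. It is not how any real blacklist system works.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    name: str
    birthday: date
    # Hypothetical universal code; None when no such code is available.
    national_id: Optional[str] = None


def match_key(person: Person):
    """Normalize the two 'invariant' attributes into a comparison key."""
    return (person.name.strip().lower(), person.birthday)


def find_candidates(applicant: Person, blacklist: List[Person]) -> List[Person]:
    """Return blacklist records that share name and birthday with the applicant."""
    return [r for r in blacklist if match_key(r) == match_key(applicant)]


def classify(applicant: Person, blacklist: List[Person]) -> str:
    """Return 'match', 'no match', or 'needs review' for an applicant."""
    candidates = find_candidates(applicant, blacklist)
    if not candidates:
        return "no match"
    undecided = False
    for record in candidates:
        if applicant.national_id and record.national_id:
            if applicant.national_id == record.national_id:
                return "match"
            # Different codes: same name and birthday, but a different person.
        else:
            # Name and birthday agree and nothing else can decide the case.
            undecided = True
    return "needs review" if undecided else "no match"


# Made-up data: the birthday and codes below are invented for the example.
blacklist = [Person("Kentaro Suzuki", date(1975, 1, 1))]
applicant = Person("Kentaro Suzuki", date(1975, 1, 1))
result = classify(applicant, blacklist)
```

With only name and birthday available, the best the sketch can do is hand the case to a human ("needs review") rather than declare a match outright, which is the step the store clerk skipped.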

 

Anyway, this kind of system also needs well-trained registrars who can determine whether a “homonym” is really the person already registered or a different one. In some ways, this is similar to the problem of cataloging a book that has the same title and author as one already cataloged.

Apart from the experience described above, I have been mistaken for another “Kentaro Suzuki” several times. I think this problem will continue for the rest of my life, unless I change my family name, for example when I get married.

More On Categories and Politics: We Aren’t As Divided As It May Seem

I just wanted to make yet another remark on the power of classification, again with respect to the presidential election (a popular theme in 202). Many analysts have already pointed this out, but it’s worth repeating that framing political standpoints as “this versus that” arguments (for example: pro-life v. pro-choice; gun control v. “right to protect your family;” “red states” v. “blue states”) tends to promote the perception that the nation is becoming more and more polarized.

[Two maps of the 2004 presidential election results by county]
Both graphics are taken from a University of Michigan article (http://www-personal.umich.edu/~mejn/election/).

Take the concept of red v. blue counties: the top-most graphic depicts which political party won each county in the 2004 election (all or nothing), while the bottom-most graphic depicts the percentage of voters in each county who voted Democratic versus the percentage who voted Republican. Of course, the top graphic more accurately reflects how the electoral college system works, but the two graphics illustrate that the lines of political division (by geographic location) aren’t as obvious as they are often made to appear.
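The difference between the two classifications can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, the vote counts are invented, and the linear red-to-blue blend is a simple stand-in, not necessarily the color scale used in the Michigan maps.

```python
def winner_take_all(dem_votes: int, rep_votes: int) -> str:
    """All-or-nothing label: the county gets the color of whoever got more votes."""
    if dem_votes == rep_votes:
        return "tie"
    return "blue" if dem_votes > rep_votes else "red"


def proportional_rgb(dem_votes: int, rep_votes: int) -> tuple:
    """Blend red and blue by two-party vote share, giving shades of purple."""
    total = dem_votes + rep_votes
    if total == 0:
        raise ValueError("county has no two-party votes")
    dem_share = dem_votes / total
    # More Republican votes -> more red; more Democratic votes -> more blue.
    return (round(255 * (1 - dem_share)), 0, round(255 * dem_share))


# Two invented counties that land in opposite binary categories
# even though their voters split almost identically.
county_a = (5100, 4900)
county_b = (4900, 5100)
labels = [winner_take_all(*county_a), winner_take_all(*county_b)]
colors = [proportional_rgb(*county_a), proportional_rgb(*county_b)]
```

Under the binary scheme the two counties look like opposites; under the proportional scheme they come out as nearly the same shade of purple.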

I also found an interesting article from the Hoover Institution (a political think tank at Stanford), http://www.hoover.org/publications/digest/6731096.html, which discusses people’s tendency to lean towards the middle on an issue. Some excerpts:

  • On Political Trends Over Time:  “Moderate voters have not disappeared. According to Gallup Poll data, Americans classify themselves ideologically in about the same way as they have since the 1970s: Between 15 to 20 percent identify as liberal, 40 to 45 percent as moderate, and 35 to 40 percent as conservative.”
  • On Abortion:  “Only 30 percent of Democrats believe it should be ‘legal under all circumstances,’ and only 30 percent of Republicans believe it should always be illegal. Large pluralities of both parties prefer the middling option of ‘legal only under certain circumstances.’”
  • On Guns:  “Upward of 35 percent of gun owners voted for John Kerry in 2004, as did a similar proportion of born-again Christians. Public opinion surveys that compare the policy views of red-state and blue-state residents show that they do not differ nearly as much as commonly presumed.”
  • On Majorities:  “In 2004, for example, a narrow majority of red-state residents joined a larger majority of blue-state residents who favored making gun regulations stricter. Solid majorities of blue-state residents share red-state residents’ support for the death penalty and opposition to gay marriage. Political differences? Yes. A cultural chasm? No.”

Perhaps, as others in the class have noted (Michael Manoochehri, Michael Lissner, Ryan Greenberg, et al.), a more enumerative approach to categorizing issues would reveal some greater truths: in this case, that the differences of opinion between the Democratic and Republican camps are not always as vast as they are presented.

Weinberger Needs Statisticians

I’ve always wondered how Weinberger could get meaningful information out of his “huge pile”, and in his interview with Doctorow[1], Weinberger mentioned a way to make use of it: statistical analysis. This is what he said:

“Tags are chaos, and as you get more and more of them, it will get more and more chaotic.  It turns out that when you have a lot of them, the statistical analysis becomes really pretty precise.”

This reminds me of a paper I read previously, “Towards Extracting Flickr Tag Semantics”, written by researchers at Yahoo! Research Berkeley and published at WWW2007[2]. The method described in the paper can identify “place tags” and “event tags” among the tags stored in Flickr. For instance, the authors could “detect that the tag Bay Bridge describes a place, and that the tag WWW2007 is an event.” (WWW2007 is a conference held in Canada in 2007.)

How did they do that? The main idea is that a “place tag” like Bay Bridge has a significant spatial pattern, tending to concentrate within a certain geographic range, while an “event tag” like a conference has a significant temporal pattern, tending to appear around a certain time period. So by using preexisting spatial and temporal statistical methods, computer scientists are able to discover the “semantics” of Flickr tags.
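To show the intuition only, here is a toy Python sketch. It is not the method from the paper: I am swapping in a plain standard-deviation threshold, treating photo locations as flat (x, y) coordinates in kilometers, and using invented data and made-up cutoffs (`max_km`, `max_days`).

```python
from statistics import pstdev


def is_place_like(points_km, max_km=5.0):
    """True if the photos carrying a tag cluster tightly in space.

    points_km: list of (x, y) positions in km on a flat plane (a simplification).
    """
    xs, ys = zip(*points_km)
    return max(pstdev(xs), pstdev(ys)) <= max_km


def is_event_like(days, max_days=3.0):
    """True if the photos carrying a tag cluster tightly in time.

    days: list of day numbers on which the tagged photos were taken.
    """
    return pstdev(days) <= max_days


# Invented data: a landmark photographed in one spot on widely scattered days,
# and a conference photographed within a single week.
bay_bridge_points = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (0.3, 0.4)]
bay_bridge_days = [0, 200, 400, 600]
www2007_days = [128, 129, 130, 131, 132]

bay_bridge_is_place = is_place_like(bay_bridge_points)
bay_bridge_is_event = is_event_like(bay_bridge_days)
www2007_is_event = is_event_like(www2007_days)
```

Even this crude version needs many photos per tag before the spread means anything, which fits Weinberger’s point that the statistics get better as the pile of tags gets bigger.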

In all, statistical analysis can help Weinberger make use of the huge amount of information, and it may also serve as a “filter” to deal with information overload problems.

 

REFERENCES

[1] Metacrap and Flickr Tags: An Interview with Cory Doctorow, http://blog.wired.com/business/2007/05/metacrap_and_fl.html

[2] Towards Extracting Flickr Tag Semantics, http://www2007.org/posters/poster909.pdf

Clay Shirky talk: It’s Not Information Overload. It’s Filter Failure.

Here’s a video of a talk from the recent Web 2.0 Expo NY: Clay Shirky talks about how information overload has been an issue not just since the explosion of the web, but since the printing press. He says that we need to assume that the volume of information will always grow. Managing information at the source isn’t a feasible solution anymore, and we need to design new filters.

Unfortunately, he didn’t present ideas about what the new filtering tools might look or act like. Given the Weinberger-encouraged proliferation of metadata and the recording and saving of basically all electronic communication, how can we design systems that don’t bother us with information we think is unimportant? How can we teach computers which social information we want to see without detailing each type of information and from which sources?

Information overload warning: the video is slightly over 20 minutes.

Dewey or Don’t We?

This article from May 07 is about a library that decided to move away from the Dewey Decimal system and towards a subject-based organization. They used 50 subject headings created by the Book Industry Study Group Inc. The library intentionally mimicked certain aspects of bookstores, not only in how the books are organized by subject, but also in physical layout. It appears they are trying to accommodate their customers’ habits and expectations.

To me, this sounds interesting. I recall thinking while reading Weinberger that I like bookstores, and that as long as the subject areas are clearly labeled I have little trouble finding the specific book I am seeking. At the very least it is no more difficult than in a library, and usually easier. Of course, this is a small library (24,000 books, DVDs, etc.). With a larger set of works, this approach may become too difficult to manage. And it seems more “natural” to me to search by subject than by number.

However, one of the comments on the article is key (in my opinion) to the bookstore/Dewey decision: “That’s OK for leisure reading, but if you need to do research on a specific topic, you are going to have a hard time finding the particular information that you need.” The additional structure in the Dewey system makes it easier (once you know how to use the system) to find ever more granular information. Most bookstores just lump it all together.

I’ve not been able to find any follow-up information as to whether it worked or not. Their page shows they now have over 30,000 items in the library, but nothing about its current layout/organization or popularity. I wish I’d found this article when we read Weinberger’s piece.

PS: I wish I could claim the title as original, but I borrowed it.

Information R/evolution

http://www.youtube.com/watch?v=-4CV05HyAbM

I came across this very interesting short film by Michael Wesch titled Information R/evolution. Information R/evolution reflects Wesch’s views on information management issues: everything from creation and categorization to presentation and retrieval. It shows that as information becomes increasingly digital, our assumptions about its traditional characteristics are no longer valid. The web has fundamentally changed the way we create, manage and use information, and so there is a need for us to “rethink information beyond material constraints.” The video draws on Weinberger and is optimistic about Web 2.0, as it allows knowledge and information to be not only free, but also miscellaneous.

Color and Meaning: Designing a wristband is harder than you think

After reading Jonathan’s post I came across this New York Times article about a national effort to standardize the color coding system for hospital wristbands and the challenges that presents.

I enjoyed it as a concrete example of much of what we have been talking about these last few classes, and it showed how intertwined systems of categorization are with social context. It did a good job of laying out the unexpected meanings, values and assumptions that can affect what might seem to be a fairly straightforward task.

Questions of privacy, granularity of information systems, and the historical nature of how we understand colors (at least in this culture) are all raised here.

http://www.nytimes.com/2008/09/25/nyregion/25bracelets.html

Basic Color Categorization

One of the things that came up in lecture today is that classifications are arbitrary and “biased.” I think there is an interesting counter-example in the story of how the classification of basic colors developed. It is understood that different cultures throughout time share the same basic colors (blue, green, red, yellow, etc.). With advances in neuroscience and discoveries in cultural anthropology, many scholars have come to agree that some sort of evolutionary development makes these colors common to all people. In this case, the classification isn’t arbitrary, but natural. So it is not necessarily the case that all classifications must be arbitrary.

http://www.npr.org/templates/story/story.php?storyId=7051553
http://human-nature.com/science-as-culture/saunders.html
