I found this visualization in the business section of The New York Times’ website, where it accompanied the article “In Investing, It’s When You Start and When You Finish.” The author observes that ordinary investors expect an average return of 7% when they invest over a long period of time (10 to 20 years). The historical data, however, do not support this conventional wisdom: stock market returns vary widely depending on the starting and ending points of individual investments. He chooses this graphic to analyze and communicate the historical evidence for his claim that market returns are more volatile than most people realize, even over longer horizons.
Besides providing clear annotations, this heat map displays two quantitative dimensions, average real annual returns and time periods measured in years, to illustrate that returns on 20-year investments can vary depending on when the money was first invested and when it was withdrawn. It allows easy comparison of various investment-timeframe combinations. The reader can easily see that long-term investments made from the mid-1960s through the mid-1970s could hardly keep up with inflation, whereas short-term investments made in the 1980s and late 1990s produced returns of more than 10%. Following a typical 20-year investment line makes it obvious that 20 years is no magic number: the returns range from failing to keep up with inflation to 7-10%.
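The metric underlying the heat map, average real annual return between a start year and an end year, is essentially a geometric mean. A minimal sketch (a hypothetical calculation for illustration, not the Times’ exact methodology) might look like this:

```python
def average_annual_return(start_value, end_value, years):
    """Annualized (geometric mean) return between two points in time."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Hypothetical example: an inflation-adjusted index doubling from 100 to 200
# over 20 years implies roughly a 3.5% average real annual return,
# well below the 7% that investors conventionally expect.
r = average_annual_return(100.0, 200.0, 20)
print(round(r * 100, 1))  # → 3.5
```

This is why each cell in the heat map depends on both the row (start year) and the column (end year): the same holding period can produce very different annualized returns.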
I like this graphic because it does a good job of organizing and presenting a great deal of quantitative information in support of the author’s claims. It uses colors that are intuitive for readers to interpret (shades of red for losses or low returns; shades of green for returns of 3% or more), and it provides enough context through its legend, annotations, and a worked 20-year-return example to make it very easy to read and interpret.
The graph depicts every NBA shot attempt over five seasons, from 2006 to 2011. The visualization is credited to Prof. Kirk Goldsberry, who teaches geography at Michigan State University. He published a paper titled “CourtVision: New Visual and Spatial Analytics for the NBA” while he was at Harvard. (Paper link: http://www.sloansportsconference.com/wp-content/uploads/2012/02/Goldsberry_Sloan_Submission.pdf) He introduces CourtVision as “a new ensemble of analytical techniques designed to quantify, visualize, and communicate spatial aspects of NBA performance with unprecedented precision and clarity,” and argues that visual and spatial analyses represent vital new methodologies for NBA analysts.
Prof. Goldsberry presents a whole new way to look at NBA data. His paper on CourtVision investigates spatial and visual analytics as a means of enhancing basketball expertise. Goldsberry and his research team propose a new way to quantify and visualize NBA players’ shooting ability with unprecedented precision and clarity. The paper presents an exploratory case study in which Goldsberry applies his CourtVision method to examine the spatial shooting behavior and performance of every NBA player. He concludes, with evidence, that Steve Nash and Ray Allen have the best shooting range in the NBA.
Who is the best shooter in the NBA? This is the question Goldsberry asks. Conventional evaluative approaches would probe FGA (field goals attempted) and the derived FG% (field-goal percentage), but this approach fails to provide a simple answer. For example, Nene Hilario and Dwight Howard led the league in FG% in the 2010-2011 season, yet neither is considered a great shooter. NBA reporter David Aldridge suggests that Ray Allen is the best shooter because of his ability to shoot well from many different locations on the court. Goldsberry introduces two metrics to quantify a player’s shooting performance and test Aldridge’s opinion.
Visualization: Shooting Spread
The research team compiled a spatial field-goal database that includes Cartesian coordinates (x, y) for every field goal attempted from 2006 to 2011. The shooting data are mapped onto a standard NBA basketball court, and the map is divided into 1,284 squares for the analysis of Spread, which describes the overall size of a player’s shooting territory.
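The idea behind Spread can be sketched in a few lines: bin each shot’s (x, y) location into a grid cell and count the unique cells a player has attempted shots from. The cell size below is a made-up value for illustration, not Goldsberry’s exact grid:

```python
CELL = 1.35  # cell edge length in feet; an assumed value, not the paper's

def spread(shots):
    """Spread sketch: shots is an iterable of (x, y) court coordinates in feet.
    Returns the number of unique grid cells with at least one attempt."""
    cells = {(int(x // CELL), int(y // CELL)) for x, y in shots}
    return len(cells)

# The first two shots fall in the same cell, the third in a different one.
shots = [(25.0, 5.0), (25.2, 5.1), (10.0, 20.0)]
print(spread(shots))  # → 2
```

A player who attempts shots from all over the half court accumulates many distinct cells, hence a large Spread; a player who shoots only near the rim does not.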
Visualization: Shooting Range
But Spread alone is not enough: it reveals shooting tendencies but not effectiveness. Range, therefore, indicates the percentage of the scoring area in which a player averages more than 1 PPA (point per attempt). Steve Nash is ranked first, with a Range value of 406, indicating that he averages over 1 PPA from 406 unique shooting cells, or 31.6% of the scoring area. Ray Allen is ranked second (30.1%). Although the hypothesis that Ray Allen is the best shooter in the league turns out to be wrong, Goldsberry shows that Allen is still the second-best shooter by the Range metric.
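The Range metric can be sketched the same way: aggregate points and attempts per cell, then count the cells whose average exceeds 1 point per attempt. This is my own illustrative reconstruction of the idea, not Goldsberry’s actual code:

```python
from collections import defaultdict

TOTAL_CELLS = 1284  # number of cells in the court grid, per the paper

def range_metric(shots):
    """Range sketch: shots is an iterable of (cell, points) pairs,
    where points is 0, 2, or 3 for a given attempt.
    Returns (count, fraction) of cells averaging more than 1 PPA."""
    attempts, points = defaultdict(int), defaultdict(int)
    for cell, pts in shots:
        attempts[cell] += 1
        points[cell] += pts
    good = [c for c in attempts if points[c] / attempts[c] > 1.0]
    return len(good), len(good) / TOTAL_CELLS

# Cell (18, 3) averages exactly 1 PPA (not > 1); cell (7, 14) averages 3 PPA.
shots = [((18, 3), 2), ((18, 3), 0), ((7, 14), 3), ((7, 14), 3), ((2, 2), 0)]
count, frac = range_metric(shots)
print(count)  # → 1
```

Under this definition, Nash’s reported Range of 406 cells works out to 406 / 1284 ≈ 31.6% of the scoring area, matching the figure in the paper.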
The paper presents new spatial metrics and advanced visualizations that allow better understanding of the complex spatial dynamics of NBA players and teams. CourtVision integrates database science, spatial analysis, and visualization to demonstrate players’ or teams’ spatial shooting signatures.
Connections to Lecture/Lab
Visual encoding process: Record, Analyze, and Communicate
Linear mapping of size and value to FGA (see the legend on the Spread visualization)
Use of a color spectrum and correlated dimensions to illustrate PPA (points per attempt)
CourtVision is an informative and beautiful visualization of shooting analysis for NBA fans, players, coaches, analysts, and executives. The spatial and visual analytics it presents could be vital new tools for informing future game plans and practice regimens, and for helping scouts find potential players.
I came across the visualizations of Gramener, an India-based company, a while ago. There are many more interesting ones in their online portfolio, but I especially liked this one because it is the outcome of scanning the text of India’s longest epic and magnum opus, the Mahabharata. At 1.8 million words, I would consider this an “Ancient Big Data” visualization.
The intended function of this visualization is to highlight the parts of the epic where the selected characters are mentioned, letting us navigate directly to the relevant portions of this huge body of unstructured text. The author’s main objective is to ease navigation of 1.8 million words of unstructured data and point us directly to the parts of the epic where we might find information about the characters we are interested in. It could be seen as a visual information-retrieval system.
Other Potential Applications:
Research-paper text scanning: one problem I have often faced while going through research papers for projects or assignments is that I sometimes have to read an entire paper to find information about topics of interest, only to realize it was not an important paper. With such an application, I could decide whether I need to read a paper simply by looking at a visualization of its text like the one above.
Our conversation about Chernoff faces yesterday made me think of emoji (emoticons) and of how the visualization of emotion is becoming more important as text-based messaging and interaction grow relative to voice and face-to-face communication. Our faces and voices carry rich emotional information; our text interactions don’t.
But visualizing emotion is hard. It’s multi-dimensional. And, to me, the somewhat disorderly way companies go about creating emoticons speaks to how hard it is. Popular messaging apps have pages upon pages of emoticons in an attempt to help users add some emotional meat to the cold bones of text.
The iPhone has emoji (you have to enable a special keyboard), and it’s fun to think about how Apple designed and ordered them (though they seem designed primarily for the Japanese market).
The first page of emoji shows five faces that you could call happy (or contented), followed by an assortment of faces grouped by feature: winking, hearts, kissing, tongue-sticking-out, etc. Then there’s surprise (unpleasant?), teeth-gritting (I think?), a sad face, another random happy face, and a few other sad faces whose exact meaning is hard to infer. The second page has a number of other faces, mostly negative in emotion, along with the all-important face mask emoji:
It’s not a super coherent representation of our various possible emotions, but perhaps it’s “good enough” or suits our culture (or, rather, the Japanese culture) well. What would be a better or more complete representation? Maybe a few icons at least per “basic emotion”? Arranged by intensity? (i.e. neutral smiley to really smiley smiley, or neutral smiley to anxious smiley, or…you get the picture). It’s interesting to ponder.
Facebook is clearly pondering this: check out the new experimental feature that lets you set a mood for your status update. At a recent public presentation, a few Facebook researchers also said they had worked with an animator at Pixar to experiment with some new emoticons, using Darwin’s emotions as a basis. I thought some of them were fun:
Unfortunately I wasn’t able to show you the Parallel Coordinates example in class. Parallel coordinates are a great technique for exploring relationships between (seemingly) unrelated dimensions of a multivariate data set.
Take a look at http://exposedata.com/parallel, which was made by Kai Chang, who’s local to the SF Bay Area. Take a second to familiarize yourself with the user interface. You’re able to re-order the parallel axes to better explore different relationships between adjacent dimensions. You can also filter out parts of each axis, which reduces the number of lines drawn.
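The mechanics behind a parallel-coordinates plot are straightforward: each dimension is rescaled to a shared vertical range, and each record becomes a polyline crossing the axes at its rescaled values. A minimal sketch of the rescaling step (plotting omitted; the data and field names are made up for illustration):

```python
def normalize_axes(rows):
    """rows: list of dicts mapping dimension name -> numeric value.
    Returns rows rescaled to [0, 1] per dimension: the vertical positions
    at which each record's polyline would cross its parallel axis."""
    dims = rows[0].keys()
    lo = {d: min(r[d] for r in rows) for d in dims}
    hi = {d: max(r[d] for r in rows) for d in dims}
    return [
        {d: (r[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] != lo[d] else 0.5
         for d in dims}
        for r in rows
    ]

cars = [{"mpg": 18, "hp": 150}, {"mpg": 30, "hp": 90}, {"mpg": 24, "hp": 120}]
print(normalize_axes(cars)[2])  # → {'mpg': 0.5, 'hp': 0.5}
```

Re-ordering the axes, as the demo lets you do, just changes which normalized dimensions sit next to each other, which is what makes different pairwise relationships visible.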
If you would like more information, here is a good blog post on parallel coordinates. On a related note, Robert Kosara’s blog eagereyes.org is one of the best blogs that take information visualization seriously without appearing too academic.