Universities: They’re Learning From You Too by Keri Wheatley

Universities want you to succeed. Did you know that? Besides the altruistic reasons, universities are also incentivized to make sure you do well. School funding is largely doled out based on performance. In Florida, state performance funding is determined by ranking the 11 eligible universities on 10 criteria, such as their six-year graduation rate, salaries of recent graduates, retention of students and student costs. The top-ranked university nets the most funding, while the three lowest-ranked universities don’t get any at all. To put it into perspective, Florida A&M University earned $11.5 million for the 2016-2017 school year, but then lost that funding when it finished 10th on the list the next year. Policies like these, coupled with the trend of decreasing enrollments, have compelled universities to start thinking about ways to improve their numbers.


Data analytics is a rapidly growing field in the higher education industry. Universities are no longer using data just for record keeping; they are also using data to identify who will be successful and who needs more help. What does this mean for you? Before you enroll and while you are there, the university’s basic systems collect thousands of data points about you: high school background, age, hometown, ethnicity, communications with your professor, campus housing, number of gym visits, etc. This is normal. If a university didn’t keep track of these things, it wouldn’t be able to run the basic functions of the organization. However, when a university decides to combine these disparate data systems into one dataset, it raises some eyebrows. The mosaic effect happens when individual tidbits of information get pieced together to form a picture that wasn’t apparent from the individual pieces. And having that knowledge is powerful.
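To make the mosaic effect concrete, here is a minimal sketch in Python. It assumes three separate campus systems that each key their records by a student ID. Every system name, field, and value is hypothetical and invented for illustration; none of it describes a real university's data.

```python
# Hypothetical records from three separate campus systems, keyed by student ID.
# All names and values here are invented for illustration.
admissions = {
    "s001": {"hometown": "Ocala", "age": 19},
    "s002": {"hometown": "Tampa", "age": 22},
}
housing = {"s001": {"dorm": "North Hall"}}
gym_log = {"s001": {"gym_visits": 2}, "s002": {"gym_visits": 31}}


def combine(*systems):
    """Merge per-system records into a single profile per student ID."""
    profiles = {}
    for system in systems:
        for student_id, fields in system.items():
            profiles.setdefault(student_id, {}).update(fields)
    return profiles


profiles = combine(admissions, housing, gym_log)
for student_id, profile in sorted(profiles.items()):
    print(student_id, profile)
```

Each system on its own says little, but the merged profile ties hometown, age, housing, and gym habits to one person. That combined picture is what the individual pieces did not reveal.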

But they’re using their powers for good, right? There are a lot of questions that should be posed when universities begin building such datasets.

What about security? Every day, data is becoming more vulnerable. Organizations, especially those regulated and funded by government agencies, just can’t keep up with new threats. When universities begin aggregating and sharing this data internally, they open themselves and their students to new risks. Do the benefits for the students outweigh the potential harms? This can only be answered on a case-by-case basis, since the security practices and uses of data differ vastly between universities.

Who has access? FERPA, one of the nation’s strictest privacy protection laws, was written to protect student personal information. This law restricts universities from selling student data to other organizations, and also dictates that universities have to create policy to restrict access to only those who need it. In practice, however, these policies are applied ambiguously. Professors shouldn’t have access to students’ grades, but your history professor wants to know why you wrote such a bad essay in his class, so he has a use case to look up your English I grade. Unless a university has stringent data access policies, this dataset could be shared with persons at the university who don’t need access to it.

How do they use the data? There are many ways. Once a university collects the data of all students, it gets a bird’s-eye view. Institutional researchers then have the ability to answer any question. Which students will drop out next semester? Do students who attend school events do better than students who don’t? How about computer lab visits? How does the subject line affect email open rates? These are all investigations I have done. Universities tend to ask more questions than they can act on. This leads to an unintentional imbalance where the university learns more about its students than is necessary to make decisions.

Universities are asking a lot of questions and finding the answers through the data. In doing so, they are learning more about their students than their students are aware of. How would a student feel if he knew someone was monitoring his gym visits and predicting what grades he will get? What if his academic advisor knew this piece of information about him? How would the student feel when he starts getting subtle nudges to go to the gym? These scenarios are a short step from becoming reality.

In the end, you are purchasing a product from universities: your degree. Shouldn’t they have a right to analyze your actions and make sure you are getting the best product? At what point do we consider it an invasion of privacy versus “product development”?


The Cost of Two $1 Bike Rides by Alex Lau

In February 2018, bike sharing was finally introduced to denizens of San Diego, making its presence known overnight and without much forewarning, as multicolored bicycles seemed to sprout on public and private land all across the city. Within weeks of their arrival, multitudes of people could be seen taking advantage of the flexibility these pick-up-and-go bikes provided, and most people liked the idea of offering alternatives to cars for getting around town. Not as widely discussed was the large amount of information these companies gather through payment information, logging of bike pick-up and drop-off locations, and potentially a vast store of other less obvious metadata.

Recently my wife and I grabbed two orange bikes standing on the grass just off the sidewalk, deciding to ride to the nearby UCSD campus. Each of us paired a payment method to the Spin app, and off we went. We hit a snag while pedaling up a one-mile incline that is normally imperceptible behind the wheel of a car, but that forced us to pedal at a moderate jogging pace in the bikes’ first gears. We finally got off the bikes short of the campus, grateful that the service allowed us to drop off the bikes as easily as we had picked them up. After walking them over to a wide part of the sidewalk and securing the wheels with the built-in locking mechanism, we began to walk the rest of the way. Maybe we wouldn’t be competing in the Tour de France, but we got in a little exercise, had some fun riding bikes together, and tried out a new bike app for very little money.

Spin bike. (By SounderBruce - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=61416390)

Within a minute of leaving the bikes, we both received text messages and e-mails informing us that we did not leave the bikes in an approved designated area, and that our profiles may suffer hits if the bikes were not parked well. While trying to understand what constituted a designated area in a neighborhood already littered with bike shares, we began wondering to one another what information we had just handed over to Spin and what kind of profile the company was building on us.

There have been articles in the press about the potential dangers of inadvertent data leakage with ride-sharing apps, using a situation where a high-level executive of a well-known public company uses a ride share to visit the doctor, or perhaps more revealing in this hypothetical, an outpatient cancer therapy center. This type of information could be accidentally or even purposely exposed, invading the rider’s privacy and perhaps used to hurt the company’s stock price. While I doubt my bike app is angling to embarrass me in the tabloids one day, some of the same data that can leak out of ride-sharing habits extends to the simple bike app.


Note: You cannot drop off anywhere.
(https://fifi.myportfolio.com/spin-branding)

In the case of our quick ride, one could begin to imagine how Spin might start to learn personal details about my wife and me both individually and as two users that share some sort of connection. While we each paid through Apple Pay, keeping private some of the payment details from Spin, we had to provide phone numbers and e-mail addresses. Even without providing a street address, repeat uses of the app may build Spin a picture of which neighborhood we live in. When we had the chance to read through Spin’s privacy policy, we found most of it to be in the realm one expects: if you use our service, we have access to your info. A few other items were more concerning, including Spin reserving the right to use our personal information to pull a driving record and to assess our creditworthiness. Although we had assumed there might be some method of ensuring a rider cannot abuse a bike or cause an accident without being exposed to some liability, neither of us thought that might include pulling a driving record. Other areas of the privacy policy mention that a user’s private information is Spin’s business asset, and may be disclosed to a third party in the event of a corporate restructuring or bankruptcy.

Although I am not privy to how Spin uses their user data, if I were in their position I could understand the business reality of protecting the company’s assets and satisfying insurance obligations for running a business where almost anyone with a smartphone and credit card can pick up a bike with no human intervention. But even though the policy may state what the company can do with personal data, I would want to err towards the option of least intrusion, or least potential harm. I find it hard to justify using a user’s information to run a detailed background check on their credit history and driving record for building a user profile, but if a user is involved in an incident, such actions may be required. (If the incident is severe, privacy may not be possible or guaranteed regardless if legal action is involved.) I do worry that the lines between which actions are viewed as ethically right or wrong in relation to user data may shift, especially if the company were facing financial hardship.

While the privacy policy opened my eyes about what our cheap novelty really cost us, I would be naive not to assume every other app and non-app service I use daily has similar wording. It can be worryingly easy, however, to handwave away such concerns as the price for participation and access to these services. Instead, as data professionals, we need to take advantage of our expertise to examine and understand the potential benefits and pitfalls of how other organizations use our user data, and lend our voices where needed to minimize potential areas for abuse.

Spin Privacy Policy: https://www.spin.pm/privacy

Thick Data: ethnography trumps data science by Michael Diamond

“It is difficult / to get the news from poems / yet men die miserably every day / for lack / of what is found there.” William Carlos Williams

As business continues to pivot on its data-obsessed axis, with a fixation on the concrete and measurable, we are in danger of missing true meaning and insight from what surrounds us. The field of ethnography, established long before the bright sparks of data science were kindled, provides some language to enlighten our path, guide us through the thickets of information, and situate the analytics with new perspectives.

Drawing on his field-work in Morocco, the American anthropologist Clifford Geertz introduced the world to “thick descriptions” in the 1970s in the context of ethnography, borrowing from the work of the British philosopher Gilbert Ryle, who used the term to unpack the work of language and thought. Ethnographers, who study human culture and society, need to hack through a path to insight like a hike through an overgrown jungle. Thick descriptions are, in Geertz’s words, the “multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another, which are at once strange, irregular, and inexplicit.”

For today’s ethnographers, and the business consultants who champion these methodologies, thick description has morphed into thick data. Consultant Christian Madsbjerg contrasts this with the “thin data” that consumes the work of data scientists, and which he portrays as simply “the clicks and choices and likes that characterize the reductionist versions of ourselves.” What thin data lacks is context: the rich overlay of impressions, feelings, cultural associations, family and tribal affinities, and societal shifts, the less measurable or unseen aspects that frame and inform our orientation towards the world we experience.

Businesses, or at least the products they launch, die miserably every day for the lack of what is found in this thick data. The story of Ford’s introduction of the Edsel is instructive.

In 1945 Henry Ford II took over the auto manufacturer that his grandfather had founded at the turn of the twentieth century. By the 1930s and 1940s Ford’s growth had slowed and the business’s reputation was waning. Henry Ford II immediately set out to professionalize the management team with modern scientific principles of organization. Bringing together the finest minds from the war effort and from rival companies like General Motors, Ford hired executives with Harvard MBAs and recruited the “whiz kids” from Statistical Control, a management science operation within the Army Air Forces. The senior management team wrestled over strategy and organization and pored over the data, commissioning multiple research studies and building elaborate demand forecasting models. They ultimately concluded that what Ford needed was a new line of cars, pitched to the upwardly-mobile young professional family. The data and analysis identified a gap in the product portfolio, an area where Ford under-served the market demand and a place where their rival General Motors showed growing strength.

The much-heralded “Edsel” launched on September 4, 1957 with a commitment from over 1,000 newly established dealerships. Within weeks it was clear that the public was turning against the product, and the brand never gained traction with its target market. Described by one critic as “fabulously overpriced jukeboxes,” the Edsel came to represent everything that was wrong with the flash and excess of Detroit. Within two years Ford had abandoned the business and Edsel had become a watchword for a failed and misguided project. With over $250 million invested and no sign of the projected 200,000 unit sales in sight, the last Edsel rolled off the production line on November 20, 1959.

Ford missed the cultural moment.
Looking back, Ford’s statisticians and planners missed a series of cultural moments: the thick data that was hidden from their analysis and models. First, there was an emerging sense of fiscal responsibility, as car-buyers increasingly saw vehicles coming out of Detroit as gas-guzzling dinosaurs belonging to an earlier era, an idea that was successfully exploited by one of the best-selling cars that season: American Motors’ more fuel-efficient “Rambler.” Second, there was a deepening sense that America was falling behind the rest of the world culturally and scientifically, and that participating in the American Dream was not quite as glamorous as once believed, a sense heightened by the deep psychic impact felt across America when the Soviet Sputnik went into orbit in October 1957. Third, there was the beginning of a consumer movement against the product-oriented “build it and they will buy” approach to marketing. That concern was captured, in the same year the Edsel launched, by the publication of Vance Packard’s _Hidden Persuaders_, which exposed the manipulation and psychological tactics used by big business and their Madison Avenue advertising agencies, and the approach itself was roundly and succinctly critiqued a few years later in Theodore Levitt’s seminal 1960 essay “Marketing Myopia.”

Lessons for history.
The Edsel may be one of the best-known business failures before the age of Coca-Cola’s New Coke or McDonald’s Arch Deluxe, but it is an interesting and salient case because cars are a uniquely American form of self-expression: they announce who we are, how we see ourselves and what tribe we belong to. Indeed, automakers have been described as the “grammarians of a non-verbal language”.

But these lessons about fetishizing the things that can be measured, ignoring the limits to how well we can quantify key drivers, and mistaking strong measures for true indicators of what matters most, were to have much greater consequences than an abandoned brand. Sadly they were lessons still being learnt by America as the country entered and prosecuted the War in Vietnam a decade later. Robert McNamara, one of the “whiz kids” hired by Ford who rose to be President of the company, was now leading America’s military strategy as Secretary of Defense. His dedication to the “domino theory,” which argued that if one country came under the influence of Communism, all of the surrounding countries would soon follow suit, was the justification used to escalate and prolong one of America’s most misguided foreign interventions. And his obsession with “body count” as the key metric of the war led many to exaggerate and mislead the public.

While it is simplistic to reduce the tragedy of the War in Vietnam to one man or one concept, more than a million Vietnamese, civilian and military, died in that war and nearly 60,000 soldiers from the US lost their lives.

McNamara failed to grapple with the “thick data” of the situation because it was hard to quantify. He refused to embrace a hypothesis about the conduct of the war that differed from his own, as it would have meant pursuing a much deeper understanding of and empathy for the leaders and people of South East Asia. Ultimately McNamara, by then in his 90s, came to understand and champion “empathy” in foreign affairs. “We must try to put ourselves inside their skin and look at us through their eyes, just to understand the thoughts that lie behind their decisions and their actions.”

Algorithmic Misclassification – the (Pretty) Good, the Bad, and the Ugly by Arnobio Morelix

Every day, your identity and your behavior are algorithmically classified countless times. Your credit card transaction is labeled “fraudulent” or not. Political campaigns decide whether you are a “likely voter” for their candidate. You constantly claim and are judged on your identity of “not a robot” through captchas. Add to this the classification of your emails, the face recognition in your phone, the targeted ads you get, and it is easy to imagine hundreds of such classification instances per day.

For the most part, these classifications are convenient and pretty good for you and the organizations running them. So much so we can almost forget they exist, unless they go obviously wrong. I tend to get a lot of examples of these predictions working poorly. I am a Latino living in the U.S. and I often get ads in Spanish. Which would be pretty good targeting, except that I am a Brazilian Latino, and my native language is Portuguese, not Spanish.

Needless to say, this misclassification causes no real harm. My online behavior might look similar enough to that of a native Spanish speaker living in the U.S., and users like me getting mis-targeted ads may not be more than a rounding error. Although it is in no one’s interest that I get these ads (I am wasting my time, and the company is wasting money), the targeting is probably good enough.

This “good enough” mindset is at the heart of a lot of prediction applications in data science. As a field, we constantly put people in boxes to make decisions about them, even though we inevitably know predictions will not be perfect. “Pretty good” is fine most of the time — it certainly is for ad targeting.

But these automatic classifications can go from good to bad to ugly fast, either because of scale of deployment or tainted data. As we move into higher-stakes fields beyond those they have arguably been perfected for, like social media and online ads, we get into problems.

Take psychometric tests, for example. Companies are increasingly using them to weed out candidates, and 8 of the top 10 private employers in the U.S. use related pre-hire assessments. Some of these companies are reporting good results, with higher performance and lower turnover. [1] The problem is, these tests can be pretty good but far from great. IQ tests, a popular component of psychometric assessments, are a poor predictor of cognitive performance across many different tasks, though they are certainly correlated with performance in some of them. [2]

When a single company weeds out a candidate that would otherwise perform well, it may not be a big problem by itself. But it can be a big problem when the tests are used at scale, and a job seeker is consistently excluded from jobs they would perform well in. And while the use of these tests by a single private actor may well be justified on an efficiency for hiring basis, it should give us pause to see these tests used at scale for both private and public decision making (e.g., testing students).
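The arithmetic behind this scale concern can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The 20% false-rejection rate and the 10 employers are hypothetical numbers. The sketch also assumes that a shared test returns the same verdict at every employer, while independent tests err independently of one another.

```python
def shut_out_probability(false_reject_rate, employers, shared_test):
    """Chance that a capable candidate is wrongly rejected by every employer.

    Assumes (hypothetically) that a shared test gives the same result at every
    employer, while independent tests err independently of one another.
    """
    if shared_test:
        return false_reject_rate
    return false_reject_rate ** employers


# Hypothetical numbers: a 20% false-rejection rate and 10 employers.
independent = shut_out_probability(0.2, 10, shared_test=False)
shared = shut_out_probability(0.2, 10, shared_test=True)
print(f"independent tests: {independent:.7f}")
print(f"one shared test:   {shared:.7f}")
```

Under these assumptions, independent screens almost never shut the same capable person out everywhere, while one widely shared screen shuts out the same people everywhere at its full error rate.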

Problems with “pretty good” classifications also arise from blind spots in the prediction, as well as tainted data. Somali markets in Seattle were barred by the federal government from accepting food stamps because many of their transactions looked fraudulent: they saw many infrequent, large-dollar transactions, driven by the fact that many families in the community they serve shopped only once a month, often sharing a car to do so (the USDA later reversed the decision). [3] [4] African American voters in Florida were disproportionately disenfranchised because their names were more often automatically matched to felons’ names, because African Americans have a disproportionate share of common last names (a legacy of original names being stripped due to slavery). [5] Also in Florida, black crime defendants were more likely to be algorithmically classified as “high risk,” and among those defendants who did not reoffend, blacks were over twice as likely as whites to have been labelled risky. [6]

In all of these cases, there is not necessarily evidence of malicious intent. The results can be explained by a mix of “pretty good” predictions and data reflecting previous patterns of discrimination, even if the people designing and applying the algorithms had no intention to discriminate.
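The risk-score disparity described above is a comparison of false positive rates: among people who did not reoffend, the share who were still labelled high risk. The Python sketch below shows how that comparison is computed. The group names and counts are hypothetical, invented for illustration, and are not the figures from the cited study.

```python
# Hypothetical counts, invented for illustration. Each entry covers only
# people who did NOT reoffend: (number labelled high risk, total).
did_not_reoffend = {
    "group_a": (45, 100),
    "group_b": (20, 100),
}


def false_positive_rate(flagged, total):
    """Share of non-reoffenders who were nevertheless labelled high risk."""
    return flagged / total


rates = {g: false_positive_rate(*counts) for g, counts in did_not_reoffend.items()}
ratio = rates["group_a"] / rates["group_b"]
print(rates, f"ratio: {ratio:.2f}")
```

A classifier can look “pretty good” on overall accuracy while this ratio sits well above 1, which is why error rates are worth checking per group and not only in aggregate.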

While the examples I mentioned here had a broad range of technical sophistication, there’s no strong reason to believe the most sophisticated techniques are getting rid of these problems. Even the newest deep learning techniques excel at identifying relatively superficial correlations, not deep patterns or causal paths, as entrepreneur and NYU professor Gary Marcus explains in his January 2018 paper “Deep Learning: A Critical Appraisal.” [7]

The key problem of the explosion in algorithmic classification is the fact that we are invariably designing life around a slew of “pretty good” algorithms. “Pretty good” may be a great outcome for ad targeting. But when we deploy them at scale on applications from voter registration exclusions to hiring to loan decisions, the final outcome may well be disastrous.

References

[1] Weber, Lauren. “Today’s Personality Tests Raise the Bar for Job Seekers.” Wall Street Journal. https://www.wsj.com/articles/a-personality-test-could-stand-in-the-way-of-your-next-job-1429065001

[2] Hampshire, Adam et al. “Fractionating Human Intelligence.” https://www.cell.com/neuron/fulltext/S0896-6273(12)00584-3

[3] Davila, Florangela. “USDA disqualifies three Somalian markets from accepting federal food stamps.” Seattle Times. http://community.seattletimes.nwsource.com/archive/?date=20020410&slug=somalis10m

[4] Parvas, D. “USDA reverses itself, to Somali grocers’ relief.” Seattle Post-Intelligencer. https://www.seattlepi.com/news/article/USDA-reverses-itself-to-Somali-grocers-relief-1091449.php

[5] Stuart, Guy. “Databases, Felons, and Voting: Errors and Bias in the Florida Felons Exclusion List in the 2000 Presidential Elections.” Harvard University, Faculty Research Working Papers Series.

[6] Corbett-Davies, Sam et al. “Algorithmic decision making and the cost of fairness.” https://arxiv.org/abs/1701.08230

[7] Marcus, Gary. “Deep Learning: A Critical Appraisal.” https://arxiv.org/abs/1801.00631