
2017 was a bad year for Uber. If you’re reading this, you probably don’t need me to tell you why. What you might not have seen, though, is how Uber used data science experiments to manipulate drivers. In this New York Times article, Noam Scheiber discusses how Uber uses the results of data-driven experiments to influence drivers, including ways to keep drivers on the app working longer and to nudge them toward certain behaviors (e.g., driving in certain neighborhoods at certain times).

In light of Uber’s widespread bad behavior, it’s been suggested several times that maybe we should have seen this coming. After all, this is a company that has flouted laws and regulations with premeditation and exuberance, operating successfully in cities where its model isn’t legally allowed. Given this, the question I’ll pursue here is: what should we make of Airbnb, a company whose growth to unicorn status has been fueled by a similarly brazen disregard for local laws, pushing into cities where hosts often break the law (or at least the tax code) by listing their homes?

In particular, I’d like to take a look at how Airbnb affects the way hosts price their listings. Why? Well, this is where Airbnb has invested a lot of its data science resources (from what’s known publicly), and it’s one of the key levers through which they can influence hosts. The genesis of their pricing model came in 2012, when Airbnb realized they had a problem. In a study, they found that many potential new hosts were going through the entire signup process, only to leave when prompted to price their listing. People didn’t know how much their listing was worth, or didn’t want to put in the work to find out. So Airbnb built hosts a tool that would offer pricing “tips”. The inference from Airbnb’s blog posts covering their pricing model is that this addressed the problem, as users happily rely on the tips – though they are careful to point out, repeatedly, that users are free to price their listings however they want.

As someone looking at this with the agenda of flagging any potential areas of concern, this caught my attention. The inference I took from reading several accounts of their pricing model is that Airbnb believes users lean heavily (or blindly) on its pricing suggestions. I’d buy that. And the reason that’s concerning is that we don’t really know how the model works. Yes, we know that it’s a machine learning classifier that extracts features from a listing and incorporates dynamic market features (season, events, etc.) to predict the value of the listing. In their posts about the model, they list features it uses, and many make sense. Wifi, private bathrooms, neighborhood, music festivals – all of these are things we’d expect. And others, like “stage of growth for Airbnb” and “factors in demand”, seem innocuous at first pass. But wait, what do those really mean?
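To make the concern concrete, here is a deliberately simplified sketch of what a feature-based pricing-tip function could look like. This is purely hypothetical – the feature names, weights, and the `strategic_adjustment` parameter are my inventions, not Airbnb’s actual model – but it shows how an opaque platform-side factor could sit alongside the legitimate listing and market features without a host ever seeing it.

```python
# Hypothetical sketch of a feature-based pricing-tip model (NOT Airbnb's
# actual algorithm). Listing and market features combine into a suggested
# nightly price; "strategic_adjustment" stands in for opaque factors like
# "stage of growth" that may not maximize this host's revenue.

def suggest_price(listing, market, strategic_adjustment=1.0):
    """Return a suggested nightly price in dollars (illustrative only)."""
    base = 50.0
    base += 15.0 if listing.get("wifi") else 0.0
    base += 25.0 if listing.get("private_bath") else 0.0
    base += 10.0 * listing.get("neighborhood_score", 0)  # 0-5 scale
    # Dynamic market features: seasonality and local events raise demand.
    base *= market.get("seasonal_multiplier", 1.0)
    base *= 1.3 if market.get("major_event") else 1.0
    # An unseen platform-side factor can nudge the tip up or down for
    # reasons unrelated to maximizing the host's revenue.
    return round(base * strategic_adjustment, 2)

listing = {"wifi": True, "private_bath": True, "neighborhood_score": 4}
market = {"seasonal_multiplier": 1.2, "major_event": False}
print(suggest_price(listing, market))       # neutral suggestion: 156.0
print(suggest_price(listing, market, 0.9))  # nudged 10% lower: 140.4
```

The point of the toy example: a host who trusts the tip has no way to tell whether the second, lower number reflects their listing’s market value or the platform’s own growth strategy.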

One of the underlying problems in Scheiber’s Uber article was that, fundamentally, Uber’s and its drivers’ agendas were at odds. And while I wouldn’t say the relationship between Airbnb and its hosts is nearly as fraught as Uber and its drivers, it might not be 100% aligned. For hosts, the agenda is pretty simple: on any given listing, they’re trying to make as much money as possible. But for Airbnb, there’s much more at play. They’re trying to grow and establish themselves as a reliable, go-to source for short-term housing rentals. They’re competing with the hotel industry as a whole, trying to establish themselves in new markets, and trying to change legislation the world over. Any of these could be a reason to include features in the pricing-tips model that do not lead it to price listings at their maximum potential value.

The potential problem here is that while Airbnb likes to share its data science accomplishments, and even open-source tools, it isn’t fully transparent with users and hosts about what factors go into some of the algorithms that affect user decisions. While it would be impossible to share every feature and its associated weight, it is entirely possible for them to inform users if their model takes into account factors whose intent is not to maximize host revenue.

Clearly, this is all speculative, as I can’t say with any certainty what is behind the curtain of Airbnb’s pricing model. In writing this, I’m merely hoping to bring attention to an interaction that is vulnerable to manipulation.

Filter Bubbles

March 13th, 2018

During our last live session, we discussed in detail the concept of filter bubbles: the condition in which we isolate ourselves inside an environment where everyone around us agrees with our points of view. It has been said a lot lately, not just during our live session, that these filter bubbles are exacerbated by the business models and algorithms that power most of the internet. For example, Facebook runs on algorithms that aim to show users the information Facebook thinks they will be most interested in, based on what the platform knows about them. So if you are on Facebook and like an article from a given source, chances are you will keep seeing articles from that source and similar ones in your feed, and you will probably not see articles from publications that are far away on the ideological spectrum. The same thing happens with Google News and Search, Instagram feeds, Twitter feeds, etc. The information you see flowing through is based on the profile these platforms have built around you, and they present the information they think best fits that profile.
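The feedback loop described above can be sketched in a few lines. This is a toy model, not any platform’s real ranking system: each “like” compounds a source’s affinity score, and the feed is sorted by that score, so engagement with one source steadily crowds out the others.

```python
# Toy sketch of engagement-driven feed ranking (illustrative; not any
# platform's actual algorithm). Likes compound a source's affinity, and
# the feed is ordered by affinity, narrowing what the user sees.

from collections import defaultdict

affinity = defaultdict(lambda: 1.0)  # prior: every source starts equal

def like(source):
    affinity[source] *= 2.0  # each engagement compounds the source's weight

def rank_feed(items):
    """Order candidate items by the user's learned source affinity."""
    return sorted(items, key=lambda it: affinity[it["source"]], reverse=True)

items = [{"source": "outlet_a", "title": "A1"},
         {"source": "outlet_b", "title": "B1"}]
like("outlet_a")
like("outlet_a")
print([it["source"] for it in rank_feed(items)])  # outlet_a now ranks first
```

After only two likes, `outlet_a` carries four times the weight of `outlet_b` – the bubble forms without anyone deciding to build one.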

Filter bubbles are highlighted as big contributors to the unexpected outcomes of some major political events around the world during 2016, such as the UK vote to exit the European Union and the result of the US presidential election in favor of Donald Trump. The idea is that in a politically divided society, filter bubbles make it even harder for groups to find common ground, compromise, and work towards a common goal. Another reason filter bubbles are seen as highly influential in collective decision making is that people tend to trust other individuals in their own circles much more than “impartial” third parties. For example, a person would much rather believe what his or her neighbor posts on Facebook than what an article in a major national newspaper reports, if the two ideas are opposed to each other, even if the newspaper is a longstanding and reputable news outlet.

This last effect is, to me, the most detrimental aspect of internet-based filter bubbles, because it lends itself to easy exploitation and abuse. With out-of-the-box functionality, these platforms allow trolls and malicious agents to easily identify and join like-minded cohorts and present misleading and false information while pretending to be just another member of the trusted group. This type of exploitation is currently being exposed and documented, for example, as part of the ongoing investigation into Russian meddling in the 2016 US presidential election. But I believe the most unsettling aspect of this is not the false information itself; it is the fact that the tools being used to disseminate it are not backdoor hacks or sophisticated algorithms. It is being done using the very core functionality of the platforms: the ability of third-party advertisers to identify specific groups in order to influence them with targeted messages. That is the core business model and selling point of all of these large internet companies, and I believe it is fundamentally flawed.
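That targeting mechanism is worth spelling out, because it is so ordinary. Below is a hypothetical sketch (the profile fields and function are my own invention, not a real ad platform’s API) of how an advertiser selects an audience by matching inferred profile attributes – and nothing in the mechanism distinguishes a shoe ad from propaganda dressed up as in-group content.

```python
# Hypothetical sketch of audience targeting (NOT a real ad platform's API).
# An advertiser selects users whose inferred profiles match the chosen
# attributes; the mechanism is identical for a legitimate ad or for
# disinformation aimed at a trusted in-group.

users = [
    {"id": 1, "interests": {"gardening", "local_news"}, "region": "midwest"},
    {"id": 2, "interests": {"politics", "local_news"}, "region": "midwest"},
    {"id": 3, "interests": {"politics"}, "region": "coast"},
]

def target_audience(users, required_interests, region=None):
    """Return ids of users whose profiles match every required attribute."""
    matches = []
    for u in users:
        if not required_interests <= u["interests"]:  # subset check
            continue
        if region is not None and u["region"] != region:
            continue
        matches.append(u["id"])
    return matches

print(target_audience(users, {"politics"}, region="midwest"))  # [2]
```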

So can we fix it? Do we need to pop the filter bubbles and reach across the aisle? That would certainly be helpful, but it would be very difficult to implement. Filter bubbles have always been around. I remember that in my early childhood, in the small town where I grew up, pretty much everyone around me believed somewhat similar things. We all shared relatively similar ideas, values, and world views. That is natural human behavior. We thrive in tribes. But because we all knew each other, it was also very difficult for external agents to use that close-knit community to disguise false information and propaganda. So my recommendation to these big internet companies would not necessarily be to show views and articles across a wider range of ideas. That’d be nice. But most importantly, I would ask them to ensure that the information shared by their advertisers, and the profiles they surface on users’ feeds, are properly vetted. Put truth before bottom lines.