AI innovation or exploitation : Uber’s rideshare digital economy (Musings of a private citizen)

By Anonymous | December 6, 2019

Taxicabs have been around for decades, regulated and controlled by city governments and other local transportation authorities around the world. Drivers typically have to apply for a taxicab license or permit with a city or state agency, need a good driving record, and are often governed by local rules on fares, on the rights they must afford their customers, and on when and how much they can charge during the day, at night, in busy traffic, at airports, and so on.

Behind innovative ideas like Uber or Lyft is a very principled thought: sharing, a basic human trait that each of us learns about as early as we learn to walk. We share our food, toys, books, and pencils as children, and then as adults we share our space and sometimes our possessions for the greater good, with benefits like reducing waste and traffic through carpools and friends driving friends.
There are limits to human sharing, and it does not scale. After all, not all friendships, neighbors, and colleagues are created equal. We don’t all have the same possessions, so our ability or willingness to share varies and reciprocity may be uneven, resulting in broken hearts, spoilt relationships, and a system of carpooling that does not scale. Enter ridesharing services like Uber.

Because of income disparity, a section of the community that is left behind finds a way to earn or supplement an income as a driver during times of hardship, while in job training, between jobs, or when not suitably qualified or experienced for the available local jobs.

Innovation creates hundreds of thousands of “gig economy” jobs and causes new income streams to flow from the pockets of the haves to the have-nots. This in turn increases access to resources like education and child care, and fills the gaps left by public transportation in cities like Chicago, New York, New Delhi, San Francisco, Mumbai, Calcutta, and London, where numerous layers of local public transportation services have existed for decades alongside taxicab drivers.

There are some visible evils of this sharing economy, which is innovative yet exploitative by design. While taxicabs are regulated and their drivers have some rights, on Uber the drivers and riders are all “users”, and drivers cannot expect employee benefits, a sticking point for those who drive 40 or more hours a week. The appeal of a gig job is far greater, and regulated taxicab drivers find Uber’s fares hard to compete with. Another major problem is discrimination against marginalized sections of society, both as riders and as drivers: inherent biases result in drivers getting lower ratings and lower fares, and in riders getting lower ratings and often having to wait longer for their rides.

The algorithm cannot repair the biases of society; Uber amplifies them. Finally, Uber users have to put up with a direct invasion of privacy. The app continues to track riders after they are no longer riding with Uber, resulting in the collection, processing, and use of private data that should not have been collected.

It has been known for some time that Uber has poor internal privacy safeguards. The data it collects can be used for “R&D” projects within Uber, where data scientists have been found using user data, while the user privacy policy remains devoid of a proper disclosure of these research objectives and of how they may affect the user community.

While Uber is a technology platform, it has a powerful ability to manipulate the market. It uses AI to learn and test how high a price a rider will tolerate and how low a price a driver will accept; the margin between what the rider pays and what the driver receives is Uber’s profit.
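The margin described above is simple arithmetic, and a small sketch can make it concrete. The trip prices and payouts below are invented for illustration, and the function names `platform_margin` and `take_rate` are hypothetical; nothing here reflects Uber's actual pricing.

```python
# Illustrative only: all prices and payouts are made-up numbers.

def platform_margin(rider_price: float, driver_payout: float) -> float:
    """Amount the platform keeps on one trip: rider price minus driver payout."""
    return rider_price - driver_payout


def take_rate(rider_price: float, driver_payout: float) -> float:
    """Platform margin expressed as a fraction of what the rider paid."""
    return platform_margin(rider_price, driver_payout) / rider_price


# (rider price, driver payout) for the same hypothetical trip under three scenarios
scenarios = [
    (20.0, 15.0),  # baseline
    (24.0, 15.0),  # rider tolerates a higher price, driver payout unchanged
    (24.0, 13.0),  # rider pays more and driver accepts less
]

for rider_price, driver_payout in scenarios:
    margin = platform_margin(rider_price, driver_payout)
    rate = take_rate(rider_price, driver_payout)
    print(f"rider pays {rider_price:.2f}, driver gets {driver_payout:.2f}, "
          f"platform keeps {margin:.2f} ({rate:.1%})")
```

In this toy example the platform's share grows whenever the rider's tolerance is pushed up or the driver's is pushed down, which is the dynamic the paragraph describes.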

Uber is effectively charging a variable “user fee” based on the value of the transaction and on customers’ willingness to sacrifice their bottom line in the interest of convenience and a sharing mindset. Uber is doing this while capturing far more data than it needs from all its customers, in broad daylight (and at night), blurring the line between innovation and exploitation.

Coupon Browser Extensions: Sweet Deals For A Not So Sweet Price?

By Keane Johnson | December 6, 2019

With Thanksgiving come and gone, the holiday season is in full swing, meaning most Americans are turning their attention to holiday shopping. This year is predicted to be another record-breaking season, with total consumer sales forecasted to grow 3.8% over 2018 and exceed $1 trillion for the first time in history [1].

Although brick-and-mortar stores continue to account for the vast majority of consumer spending, online sales are forecasted to increase 13.2% to $135.5 billion, or 13.4% of total holiday shopping [1].
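As a quick consistency check on the forecast figures quoted above, the short sketch below back-computes what they imply. The inputs are the numbers from the text; the derived values are rough implications of those numbers, not figures reported by the cited source.

```python
# Figures quoted in the text (forecast for the 2019 holiday season).
online_2019 = 135.5    # online holiday sales, in billions of dollars
online_growth = 0.132  # forecast year-over-year growth in online sales
online_share = 0.134   # online share of total holiday shopping

# Total holiday sales implied by the online amount and its share.
implied_total = online_2019 / online_share

# Prior-year online sales implied by the growth rate.
implied_online_2018 = online_2019 / (1 + online_growth)

print(f"Implied total holiday sales: ${implied_total:,.1f} billion")
print(f"Implied 2018 online sales:   ${implied_online_2018:,.1f} billion")
```

The implied total comes out slightly above $1,000 billion, which is consistent with the forecast that total sales will exceed $1 trillion.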

This growth is not limited to the holiday season. Online shopping in the United States has grown from 7.1% of total sales in 2015 to 8.9% of total sales in 2018 [2]. This increase in online shopping has motivated the creation of multiple sites and plugins that deliver discount codes or coupons to consumers. These plugins automatically process what is in a consumer’s online shopping cart, search the internet for available codes or coupons, and apply the best one at checkout.

One of the pioneers in this space is RetailMeNot. The original RetailMeNot service aggregated coupon and discount codes for a wide variety of companies on its website. Consumers would then go to the site, copy the coupon code, and apply the code to their carts at checkout. In 2018, RetailMeNot observed over 453 million site visits and facilitated global sales of $4.9 billion [3].

Two years ago, RetailMeNot released a browser extension – RetailMeNot Genie – that applies discounts and cash-back offers directly to a consumer’s cart at checkout. The plug-in is 100% free, meaning that those savings come at no monetary cost to the consumer.

However, as the saying goes, if you are not buying the product, you are the product. An examination of RetailMeNot’s Privacy Policy and its use of customer data raises some serious ethical concerns. RetailMeNot collects, either online or through a mobile device, consumers’ contact information (email, phone number, and address), “relationship information” (lifestyle, preferences, and interests), “transaction information” (kinds of coupons that are redeemed), location information (a consumer’s proximity to a merchant), and “analytics information” (information about a user’s mobile device, including applications used, web pages browsed, battery level, and more).

RetailMeNot uses this information for a variety of purposes, including: creating user profiles that may infer age range, income range, gender, and interests; inferring the location of places users visit often; providing notifications when users arrive at, linger near, or leave these places; and providing advertisements through display, email, text, and mobile-push notifications [4]. Additionally, RetailMeNot may allow third parties to track and collect this data for their own purposes [4].

RetailMeNot may also share personal information “to effect a merger, acquisition, or otherwise; to support the sale or transfer of business assets” [4]. This final clause is the most troubling because it gives RetailMeNot leeway to sell its users’ personal information to support its business. And so although there is no upfront monetary cost to using RetailMeNot, users could end up paying in the background with their personal data.

However, alternative deal-hunting services are becoming more and more available. One of the fastest-growing is Honey, which, like RetailMeNot’s Genie, is a browser extension that automatically applies coupon codes and discounts. Honey is transparent about the type of data it collects and is upfront about never selling its users’ personal information [5]. However, it may share data “with a buyer or successor if Honey is involved in a merger, acquisition, or similar corporate transaction” [5]. Its recent acquisition by PayPal [6] may mean that its users’ personal information is now in the hands of one of the largest online payment systems in the world, and Honey may no longer have control over how this information is used.

In summary, free deal-hunting and coupon-finding extensions are becoming more popular because they offer consumers an opportunity to easily save money. But this free money may be too good to be true. An inspection of the privacy policies of a couple of popular services shows that consumers may end up paying with their sensitive personal information.

Content References








ICE – License Plate tapping & Immigration control – Privacy and ethical concerns

By Dasa Ponnappan | December 6, 2019


With the Trump era’s stricter stance on immigration beginning in 2016, ICE faced the daunting task of enforcing a stricter immigration policy to fulfill campaign promises of detaining and deporting illegal immigrants. What followed was an unprecedented level of policy design and execution to meet that goal. In this blog post, we look at the historical context and the means through which ICE employed consultants and technology to achieve it. We also look at how once-abandoned license plate tracking became a mainstream ICE tool for targeting, detaining, and deporting illegal immigrants, overlooking the ethical and privacy concerns.

ICE Immigration crackdown

Starting in 2017, under the Trump-era policy of cracking down on illegal immigration, ICE was entrusted with the daunting task of using “all legal means” to stop and deport illegal immigrants across the country. That included a massive ramp-up of a force of 10,000 to handle detention and deportation. With McKinsey as its management consultants, ICE began devising massive recruitment drives for officers in gyms, as well as means of deporting illegal immigrants to border cities with little provision for safety and medical needs.

The Technology means:

ICE started deploying tracking technology to apprehend and deport illegal immigrants. To do so, it adopted license plate tracking, which DHS had ruled out as a policy in 2014 after severe backlash over privacy concerns. License plate tracking records every vehicle passing a given point, which in effect tracks people’s movements across the country and provides unprecedented insight into their lives. The database that provided this comprehensive insight belonged to a private entity, and ICE tapped into it to track, detain, and deport illegal immigrants.
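To see why reads taken at fixed points add up to a record of someone’s movements, consider the minimal sketch below. The plates, camera locations, and timestamps are all invented, and `build_timelines` is a hypothetical helper; this is an illustration of the general idea, not a description of the actual database ICE used.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical reads: (plate, camera location, ISO timestamp). All values are made up.
reads = [
    ("ABC123", "Main St & 1st Ave", "2019-11-01T08:02"),
    ("XYZ789", "Highway 5 North", "2019-11-01T08:05"),
    ("ABC123", "Office Park Gate", "2019-11-01T08:31"),
    ("ABC123", "Main St & 1st Ave", "2019-11-01T17:40"),
]


def build_timelines(reads):
    """Group individual camera reads by plate and sort each group by time."""
    timelines = defaultdict(list)
    for plate, location, timestamp in reads:
        timelines[plate].append((datetime.fromisoformat(timestamp), location))
    for visits in timelines.values():
        visits.sort()  # chronological order, since the datetime comes first
    return dict(timelines)


timelines = build_timelines(reads)
for when, where in timelines["ABC123"]:
    print(when.isoformat(), where)
```

Even with three reads, the timeline for one plate already suggests a commute out in the morning and back in the evening. Retained over years and across many cameras, the same grouping yields the kind of detailed movement history that raises the concerns discussed here.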

Privacy and ethical Concerns:

The privacy concerns are evident given the nature of the data collection, the lack of consent in collecting private data, and the breach of the privacy conceptions described by Solove, such as anti-totalitarianism and the right to be left alone. Beyond that, exposure of license plate tracking data could lead to stalking of individuals and put them in harm’s way. It also serves as a stepping stone toward the state controlling the lives of individuals under the umbrella of national security. Despite the privacy and ethical concerns, an ICE spokesperson argued in favor of the practice, citing year-long training for staff on the protection and ethical use of license plate data. Given that this data is retained for years and was collected through a for-profit organization, it is tough to justify the advantage of such tracking as a means of controlling illegal immigration.

Alternative Data Sources in Investment Management Firms

By Peter Yi Wang | December 6, 2019

Investment management firms have found alternative data to be a way to gain an information advantage over their peers. Industry spending on alternative data by investment firms such as mutual funds, hedge funds, pension funds, and private-equity firms will jump from $232 million in 2016 to a projected $1.1 billion in 2019 and $1.7 billion next year, according to an industry trade group supported by data provider YipitData. There are currently hundreds of alternative data providers across the globe, with a heavy concentration in the United States.

In recent years, investment management firms, particularly hedge funds, which are highly focused on time-sensitive information, have pioneered innovative ways of tracking information. For example, some hedge funds may fly drones over lumberyards to gauge the stockpile of lumber before betting on a fall in lumber prices. Others may retrieve satellite images of retail store parking lots to gauge the performance of department stores. Still others track the online job postings of different companies to decipher their growth trajectory.

While all these methods give hedge funds an edge over their peers, they also raise questions about the privacy rights of the subjects being tracked. For example, do lumberyard owners worry that drones are flying over their yards? Are drivers concerned about satellites taking pictures of their parked cars? Does the management of the subject companies care that their online job postings are being scraped?

The most urgent issue that faces the alternative data and investment management industry that uses alternative data today is the lack of a global best practices standard.

In October 2017, the Investment Data Standards Organization (IDSO) was formed in the United States to support the growth of the alternative data industry through the creation and promotion of industry standards and best practices. This non-governmental organization is focused on three main products: 1) personally identifiable information (PII); 2) web crawling; and 3) dataset compliance for sensitive information (SI).

There are four main areas which can be materially enhanced from a privacy perspective:

  1. Consent: The data subjects like individuals (or websites containing individual information) and businesses need to consent directly or indirectly to the data collection process.
  2. Storage and security: Alternative data storage should have a regulatory time limit, similar to the limits on call transcripts and trading records under securities regulations in many countries. This ensures that personally identifiable information is safely deleted within a certain period. Data subjects should also have the right to have their personal data deleted upon request.
  3. Secondary use: Secondary use of the alternative data should be strictly monitored or prohibited given the unfair distribution of costs and benefits.
  4. Confidentiality: Personally identifiable information should be kept confidential at all times, and data subjects should have the option to opt out and exclude their information from alternative data sets.
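The storage and opt-out recommendations above could be enforced mechanically. The sketch below is one minimal way to do so, under stated assumptions: the five-year `RETENTION` window, the record layout, and the `purge` helper are all hypothetical and are not drawn from IDSO or any regulation.

```python
from datetime import date, timedelta

# Hypothetical retention limit; an actual limit would be set by a regulator or standard.
RETENTION = timedelta(days=365 * 5)


def purge(records, today, opted_out):
    """Keep only records inside the retention window whose subject has not opted out."""
    return [
        record
        for record in records
        if today - record["collected"] <= RETENTION
        and record["subject"] not in opted_out
    ]


# Made-up records keyed by an anonymous subject identifier.
records = [
    {"subject": "a", "collected": date(2019, 1, 1)},  # recent, kept
    {"subject": "b", "collected": date(2012, 1, 1)},  # past the retention window
    {"subject": "c", "collected": date(2018, 6, 1)},  # recent, but subject opted out
]

kept = purge(records, today=date(2019, 12, 6), opted_out={"c"})
print([record["subject"] for record in kept])
```

Running a filter like this on a schedule would cover both the time limit in item 2 and the opt-out in item 4; consent and secondary use (items 1 and 3) need process controls that a filter alone cannot provide.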

Given alternative data is a global phenomenon and rapidly expanding, a global standards organization similar to IDSO should be formed to address the four critical recommendations listed above. Without a proper global standard, the alternative data industry and the investment management industry that utilizes alternative data may continue to breach the privacy boundaries of data subjects. Urgent privacy protection actions are needed in the alternative data industry.

Robo Cop, Meet Robo Lawyer – Using AI to Understand EULAs

By Manpreet Khural | December 6, 2019

No one reads EULAs. Everyone is concerned about their online privacy. Perhaps that is a bit of hyperbole, but it is true nonetheless that End User License Agreements (EULAs) are overlooked documents that users scroll through as quickly as they can, searching for the I Agree button. If they do take the time to read them, they often find the language difficult to decipher and may not have the prerequisite knowledge to identify concerns.

Do Not Sign, a recent tool released by DoNotPay, claims to parse the legal language of EULAs and identify Warnings and potential Loopholes for the user to review before agreeing to the terms of the document. The tool can even, on the user’s behalf, send letters addressing the issues to the company behind the EULA. DoNotPay began its journey to this tool by helping its users successfully contest parking tickets, cancel subscriptions, and even sue companies in small claims court. With its new tool, it seeks to create a landscape in which consumers can protect their privacy and contest problematic and abusive EULAs.

Is it not ironic, however, that we would use a technology product to protect ourselves from mainly technology centric EULAs? Before we hail this as a solution to this modern problem, we must ask what capability it has to inflict harm. According to its developers and journalists who have tried it, Do Not Sign is designed to rarely if ever produce false positives, Warnings where an underlying problem does not actually exist. It can however miss some problematic terms of an agreement. Specifically, it often misses out on terms related to online tracking. If users begin to use and trust this tool, they may feel more protected than when they had to agree to a EULA by themselves. This could provide a false sense of security, a convenience that may miss an important section of terms. This may lead to an agreement made when it should not have been. There does exist a potential for harm.

Is that enough of a roadblock to stay away from Do Not Sign? Likely the answer is no. Users are not able to read EULAs with the level of scrutiny that this tool can. Overall, it gives them the ability to make a more informed decision in the face of legally or technologically opaque terms. A simplification is more than welcome. One of the goals of the tool is to provide users with negotiating power. As more people use it to understand EULAs and subsequently reject questionable ones, the companies behind the agreements may open up to letting users pick and choose terms, or at least to accepting some kind of feedback. This empowers people to have a voice in the matter of their privacy, specifically in the digital sphere. It may also spark a greater interest in consumer protections and create a better framework of principles for constructing EULAs.

Overall, Do Not Sign helps users understand an environment foreign to most of them. While there are concerns about overreliance on the tool or the tool missing critical red flags in documents, the benefits of having something like this widely available far outweigh the hurdles. As people who deal with such privacy-related agreements regularly, we should support this tool so that the masses can begin to protect themselves.


When humans lose from “AI Snake Oil”

By Joe Butcher | December 6, 2019

No matter where you work or live, you don’t have to go far to hear someone talk about the benefits of AI (yes, that’s Artificial Intelligence I’m referring to). What do I mean exactly by AI? Well, for the most part machine learning (ML for all you acronym lovers), but that’s a topic for another blog post. While everyone loves to talk about creating AI tools and systems, one can argue we aren’t talking enough about the human lives impacted by AI-aided decisions.

While I am two years into UC Berkeley’s Master of Information and Data Science (MIDS) program, I realize I have far more to learn about AI and data science. In fact, the more I learn, the more I realize I don’t know. What’s frightening to me is the number of AI-based companies being created and funded that influence decisions with a real impact on humans’ lives. It can be challenging and time consuming even for trained data science professionals to assess the validity of the tools these companies are creating, not to mention for individuals who are less data science savvy, who make up the majority of the workforce.

Arvind Narayanan, an Associate Professor of Computer Science at Princeton, recently gave a talk at MIT titled “How to recognize AI snake oil”. In this presentation, Professor Narayanan articulates the contrast between the hope that AI can be successfully applied to certain domains and the reality of its effectiveness (or lack thereof). He goes on to discuss domains where AI is making real, genuine progress and domains where AI is not only “fundamentally dubious” but also raises ethical concerns due to inaccuracy. Furthermore, he claims that for predicting social outcomes, AI is no better than manual scoring using a few features.

While none of this is likely shocking to anyone in the field, it does raise the question of what is being done to protect society from negative consequences. With policy and regulations struggling to keep up with the pace of technological advancement, some have argued that self-regulation will be enough to combat the likes of “AI Snake Oil”. Neither regulation nor self-regulation seems to be progressing fast enough to protect people from poor decisions made by algorithms. Moreover, political turbulence (both in the U.S. and around the world) and the potential for economic disruption across industries leave most people feeling both uneasy and hopeless.

Ethical frameworks and regulations have been proven ways to protect humans from harm. While the current situation is a daunting one, the data science community should challenge itself to stay committed to work grounded in values and ethics. While it can be tempting to reap the economic benefits from developing solutions that customers are willing to pay for, it is critical that we understand whether our solutions follow the basic principles from the Belmont Report that we started this class with:

  • Respect for persons: Respect people’s autonomy and avoid deception
  • Beneficence: “Do no harm”
  • Justice: Fairly administer procedures and solutions

We can’t control everything that happens in the crazy world out there, but we can control how we apply our newly acquired data science toolkit. We should all choose wisely.

[1] Narayanan, A. “How to recognize AI snake oil”.
[2] Sagar, R. “The Snake Oil Merchants of AI: Princeton Professor Deflates the Hype.”
[3] “Belmont Report”

Modern day Masters: Who controls your personal data?

By Anonymous | December 6, 2019

As we move towards a more automated digital universe, data is considered ‘the new oil’. As data, the most valuable resource of the digital age, continues to grow in a world where every move in the digital realm is recorded, global tech giants such as Facebook, Amazon, and Alphabet have succeeded in making huge profits from this opportunity.

Given that the number of digital devices has grown exponentially, we have involuntarily allowed our personal data to be at the disposal of big companies. This is mainly because devices such as our smartphones are tools that not only help us organize our day but that we also voluntarily feed with private data. As a result, it becomes impossible for us to know and control who gathers what information, where it ends up, who owns it, and what it is used for.

The intrusion on autonomy is taken for granted, and the general public is resigned to the fact that their information is collected, shared, and used without their consent as a tradeoff for using a network tool or social media platform. We accept that it is almost impossible to prevent tracking of our information when we use online platforms for our day-to-day activities. However, we need to be cognizant that this may put individuals at risk if the data is used with harmful intent.

Recently there have been several data privacy breaches, such as the Facebook contagion experiment, the Facebook–Cambridge Analytica data scandal, and unauthorized street view capture by Google, that raised red flags and drew the attention of governments across the globe. Though tech giants have been willing to discuss data protection now that it appears inevitable, it remains to be seen how far they will go in ceding control over the data they collect. While momentum is growing for federal laws and rules on data privacy in the US, the European Union has already taken the lead by enacting the comprehensive General Data Protection Regulation (GDPR) to protect its citizens and residents. The hope is that GDPR, with its comprehensive framework and severe financial and legal penalties, will become a gold standard for empowering people to control their personal data. History, however, does not support this expectation.

The EU has had a data-protection directive since 1995, but studies have repeatedly shown that its rights weren’t well enforced. One of the reasons this has not been successful so far is because there is a lack of understanding of the different types of personal data held by these companies. We can classify personal data into three types:

  1. Data that is fully shared by the user consciously,
  2. Data collected automatically when the user uses a platform or a device to access an application,
  3. Data that is generated, predicted or modeled from user behavior based on piecing together different data sources.

While most of us are mindful of data that’s shared consciously, only some may be cognizant of data that’s collected automatically and most wouldn’t be even aware of such predicted data existing about oneself. Hence, this lack of awareness makes it difficult to enforce control even with stringent regulations like GDPR.

Rather than feeling vulnerable and relying only on regulations to protect and control your data, there is a whole new approach to digital identity, called self-sovereign identity (SSI), that is currently evolving. With this approach, individuals and companies manage their own data and decide which data should be shared with whom, when, and for how long. The idea is that you allow your data to be accessed, but you never actually give it away. Enabled by blockchain technologies, an SSI solution would use a distributed ledger to establish immutable records of lifecycle events for globally unique decentralized identifiers (DIDs). This removes the need for centralized institutions and data barons to store (and capitalize on) personal data separately on their own servers. It allows you to control your own data and make specific data available to those who need it in a given context, and thus to use their services. This does require tech companies to sign on to the solution. Still, it gives us a ray of hope that a solution that truly lets you be the master of your own data is on the horizon!
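The idea of granting access to one piece of data without handing over everything can be illustrated with a toy model. The sketch below assumes salted hash commitments and uses a plain dictionary as a stand-in for the distributed ledger; the identifier `did:example:123`, the attribute names, and the helpers `commit`, `disclose`, and `verify` are all hypothetical. Real SSI systems rely on standardized DID methods and signed credentials, so this is only a simplified picture of selective disclosure.

```python
import hashlib
import secrets


def commit(value: str, salt: str) -> str:
    """Salted SHA-256 commitment to a single attribute value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


# Holder side: the raw attributes and salts stay with the individual.
attributes = {"name": "Alice Example", "birth_year": "1990", "city": "Springfield"}
salts = {key: secrets.token_hex(16) for key in attributes}

# Stand-in for the ledger: only commitments are recorded, never the raw values.
ledger = {
    "did:example:123": {key: commit(value, salts[key]) for key, value in attributes.items()}
}


def disclose(key: str) -> dict:
    """Holder reveals one attribute and its salt, and nothing else."""
    return {"key": key, "value": attributes[key], "salt": salts[key]}


def verify(did: str, disclosure: dict) -> bool:
    """Verifier recomputes the commitment and compares it with the ledger entry."""
    expected = ledger[did][disclosure["key"]]
    return commit(disclosure["value"], disclosure["salt"]) == expected


shared = disclose("city")
print(verify("did:example:123", shared))  # the verifier learns the city only
```

In this model a service that needs to confirm a city never sees the name or birth year, and a tampered value fails verification because its commitment no longer matches the recorded one.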



Protection of AI through the Department of Technology: an Incomplete Solution

By Anonymous | December 5, 2019

From Stephen Hawking to Elon Musk, futurists and scientists have feared for the state of humanity post-singularity. This apocalyptic Roko’s Basilisk scenario has now entered a United States presidential candidate’s platform for the first time, via Andrew Yang.

Yang has received both positive and negative attention for his proposed policies on data as an individual property. This position could potentially mitigate some of the current negative effects of data collection, security, use and reuse by many companies today. But there has been less talk about his idea to create the cabinet level position of the Department of Technology to address advances in algorithms.

In his platform page, Yang vaguely casts artificial intelligence as a bogeyman who is out “displacing…jobs” and “causing unknown psychological issues for our children” created by techies who “don’t fully understand how it works”. While these broad strokes seem to resonate more with language of the 19th century Luddites, one thing we can agree on is that no one truly understands where technology will land and what implications this will mean for our lives. To address this uncertainty, Yang proposes the creation of the Department of Technology to:

  • Monitor the development of new technology
  • More quickly adapt to the changing technological landscape
  • Prevent technological threats to humanity from developing without oversight

While none of these ideas is controversial, they are also unhelpfully broad. Those objectives currently have representation, at least theoretically, in presidential administrations through the Office of Science and Technology Policy (OSTP), both through the directorate and through the position of the United States Chief Technology Officer. Yang does not suggest abolishing or transforming the OSTP but additionally insists on the reinstitution of the Office of Technology Assessment (OTA) as its legislative counterpart.

It is unclear how the proposed Department of Technology would interact with the congressionally confirmed OSTP, the revived OTA, and his recent addition to the Cabinet-rank positions, the Department of Attention Economy, another department that would focus on protections from the negative side effects of social media, especially for children.

The additions of more departments and offices of the executive branch cheapen the sophistication of solutions being offered, with described roles resembling “think tanks” more than governing bodies.

Nevertheless, Mr. Yang’s surfacing of these potential harms to children gets closer to some of the algorithmic needs not addressed in his policies: the biases and potential harms of current applications of machine learning used across the country. While the Department of Technology works to stop the singularity, we still have unaddressed algorithmic issues and processes to deal with. Yang is right that the government’s capacity to respond to these needs is insufficient. But it isn’t clear that creating another Department will actually provide people with the protection that they need to understand the implications of their data as property, to protect them from data exposure or from algorithmic abuse, and to secure their information and safety in the evolving cyberscape.

If we do accept his assumption that the most pressing technological issues are in regulating advanced AI applications, we still don’t have what is needed to protect the electorate: checks and balances. As Mr. Yang’s policy states:

The level of technological fluency that members of our government has shown has created justified fears in the minds of Americans that the government isn’t equipped to create a regulatory system that’s designed to protect them.

Congress would have a difficult time crafting laws for enforcement, as some congresspersons, I gather, think that Pac-Man is an emerging technology. And the judicial branch is already struggling to process technological patent law because of the depth of expertise, both legal and technological, that it requires of content experts.

While the revival of OTA would help, it would be a legislative act, since Congress defunded it in 1995 after “some Republican lawmakers came to view [the OTA] as duplicative, wasteful, and biased against their party.” But no such equivalent judiciary plan has been put forward, leaving the public without one of their key levers for recourse: if technological fluency is not equally available across all branches of government, we risk abuses from those branches that do have intellectual power.

The success of the Cabinet-ranked EPA in protecting the environment and people from economic externalities came in part from an accessible level of content expertise in all three branches (as in the legislative Clean Air Act and the judicial Massachusetts v. Environmental Protection Agency). To be at least as successful as the EPA (whose “success” had arguably been diminished under Scott Pruitt), all these proposed regulatory bodies need both to be given that power through law and to be held accountable for their work in the courts.

In short, to protect individuals from the unknown harmful effects of future technology, the government needs to be much better equipped to address these needs. But the promotion or creation of a Department of Technology led by deeply sophisticated technologists is less than one third of the solution needed to address our algorithmic obstacles. That level of technological acumen would need to spread across all three branches of government and would need to address current algorithmic issues, not just the rise of our computer overlords.