Data collection – Is it ethical?
By Sarah Wang | September 18, 2020

Companies’ data collection is growing rapidly and is projected to continue growing. Data collection refers to the practice of companies using a wide variety of methods and sources to capture customer data across a broad range of metrics. The data collected can range from personal data, such as Social Security numbers and other identifying information, to attitudinal data, such as customer satisfaction, product desirability, and more.

Consumer data is collected for business purposes. For example, companies often analyze customer service records to understand, at scale, which interaction methods worked well, which did not, and how customers responded. It is also common for companies to sell customer data to third parties for profit or as part of business collaborations, yet this practice is almost never clearly disclosed to customers.

Why are companies collecting customer data?
Targeted advertising is the main driver behind customer data collection. Targeted advertising is directed toward audiences with certain traits, based on the product or the person the advertiser is promoting. Contextualized data can help companies understand customers’ behavior and personalize marketing campaigns. As a result, the increased likelihood of a purchase transaction raises companies’ return on investment.

Concerns about data collection
Data privacy and data breaches are the major concerns about data collection. Last year, major corporations such as Facebook, Google, and Uber experienced data breaches that put tens of millions of personal records into the hands of criminals. These breaches are only the “tip of the iceberg” when it comes to hacked accounts and stolen data. Consumers are beginning to take notice: in research conducted by PwC, 69% of consumers believe companies are vulnerable to hacks and cyberattacks.

Over time, this has caused consumers to lose trust in companies that hold their personal information. Only 10% of consumers feel they have complete control over their personal information. If customers don’t trust a business to protect their sensitive data and use it responsibly, the company will get nowhere in harnessing the value of that data to offer customers a better experience.

Last but not least, another downside of data-driven business is that it is subject to model bias. One example is how Amazon’s recruiting algorithm favored male candidates and penalized resumes that included the word “women,” because the algorithm was trained on a sample heavily biased toward male employees.

How to reassure customers that their data is being protected?
First and foremost, companies need to demonstrate respect for their customers by providing full transparency about what data is collected, how the data will be used, and when the data will be purged or expire, if ever.

Second, companies need to give customers the option of not having their data collected. Each individual should be treated as an autonomous person capable of making decisions for themselves. Behind this idea is the principle that data should be owned by customers: individuals may consent to companies using their data, but only within certain boundaries and conditions.

BIBLIOGRAPHY
1. Consumer Intelligence Series: Protect.me, Article by PWC, September 2017
https://www.pwc.com/us/en/advisory-services/publications/consumer-intelligence-series/protect-me/cis-protect-me-findings.pdf

2. Targeted Advertising, Wikipedia, January 2017
https://en.wikipedia.org/wiki/Targeted_advertising

The Social Credit System
By Anonymous | September 18, 2020

Many Westerners are familiar with China’s Great Firewall, a tool used by the Chinese Communist Party to monitor, control, and restrict information over the internet inside China’s borders. However, a less well known surveillance practice has been covertly spreading like wildfire in China, and that is the omnipresent surveillance system incorporated into all levels of government, society, and family life. This is the start of what is called the Social Credit System.

The Social Credit System is a point system used to evaluate the behavior and trustworthiness of people and businesses in China. In the picture below, we see some ways to gain or lose points. For instance, if a citizen donates blood or money, or engages in charity work, they gain points. However, failing to visit aging parents, playing too many video games, publicly opposing the government, or spreading rumors on the internet can result in a decrease in points. Everything about a person is used in the calculation of the social credit score, including daily habits, bill and utility payments, locations visited, online activity, and education.


Ways in which citizens can gain or lose social credit points, and the possible rewards and punishments.
https://www.visualcapitalist.com/the-game-of-life-visualizing-chinas-social-credit-system/

There are serious ethical considerations for this system. Much like having a high credit score in the US, there are perks for having a high Social Credit Score in China, including easier access to financial services, waived deposits on certain rental items, better visibility on dating websites, and better travel deals. There are also serious ramifications for a low score, such as restrictions on trains and airplanes, family members being blocked from prestigious schools, denial of social messaging apps, or being unable to take out a mortgage for a house, not to mention the social ostracism that comes from the disapproval of people around you. An individual’s livelihood is inextricably linked to their Social Credit Score. Which restrictions should be considered acceptable, and which should not?

The Chinese government is constantly collecting data about individuals, including financial records, social media activity, credit history, health records, online transactions, tax payments, social networks, and location information. There is no option to opt out of the collection process, alter or amend records, or voice concerns over abuse or misuse of the data. There are no laws or regulations protecting individuals’ privacy against the government’s collection and use of its citizens’ data. Where should the line be drawn for privacy and legal controls?

This system is made possible by recent advances in technology, big data, and artificial intelligence, such as facial recognition. In addition, the surveillance state created by millions of security cameras, internet monitoring, interconnected IoT devices, and neighborhood watchdogs ensures that nothing is missed. The government then aggregates all the information into databases and logs that record an individual’s whereabouts and actions; if something new happens, a notification is automatically sent to the system to ensure that no action goes unrewarded or unpunished.


Artificial intelligence vision recognizing people and objects in the frame.
https://www.pri.org/stories/2018-08-14/laboratory-far-west-chinas-surveillance-state-spreads-quietly

Although this Social Credit System seems to be infringing on many rights Westerners are accustomed to, Eastern societies tend to be focused more on the benefit of the group versus the individual and are more likely to give up certain rights in order to maintain society’s well-being. Several studies have been done to explain the reasons why a proportion of Chinese people support this Social Credit System (https://doi.org/10.1177/1461444819826402). This shows the importance of cultural norms when considering privacy laws, since people of different backgrounds and attitudes might view the balance of security and privacy differently.

The main concern this system poses is its use to silence societal grievances or governmental opposition. There have been many instances of public outcry against governmental action and policy, such as political corruption and bribes, poorly built buildings and infrastructure, oppression of ethnic minority groups, and limits on free speech critical of the government. The most notable might be the government’s initial (and possibly continuing) cover-up of the coronavirus and its persecution of the initial whistleblower, Li Wenliang, for “making false comments” and “spreading rumours” that had “severely disturbed the social order”. When Chinese social media sites flooded with anger at how the government had failed to respond appropriately, the sites’ comments were quickly censored. Others may have been deterred from speaking out for fear of losing social credit points.

Indeed, many political figures, social advocates, and ordinary citizens who have not been in line with the government’s political agenda have been threatened, denied basic services, falsely imprisoned and tortured, coerced into giving false testimony, or disappeared without a trace. Combined with the omnipresent surveillance state and the unforgiving Social Credit System, there isn’t a single individual who isn’t at the mercy of the government’s new way to control society.

Sources:
https://www.washingtonpost.com/news/theworldpost/wp/2018/11/29/social-credit/
https://www.wired.com/story/china-social-credit-score-system/
https://www.youtube.com/watch?v=0cGB8dCDf3c
https://www.bbc.com/news/world-asia-china-53077072

Authentication and the State
By Julie Nguyen

Introduction

For historical and cultural reasons, American society is one of very few democracies in the world without a universal authentication system at the national level. Surprisingly, Americans do not trust the government the way they trust corporations, because they consider such an identifier system a serious violation of privacy and a major opening to Big Brother government. I will argue that it would be more beneficial for the US to create a universal authentication system to replace the patchwork of de facto paper documents currently used in a disparate fashion at the state level.

Though controversial and difficult to implement, a national-level authentication system would bring substantial benefits.

It is not reasonable to argue that a national-level authentication system is too complex to create; it is hard, but it has proved possible elsewhere.

The debate on a national-level authentication system is not new. In Europe, national census schemes inspired a great deal of resistance, as they focused attention on privacy issues. One of the earliest examples was the protest against a census in the Netherlands in 1971. Likewise, nobody foresaw the storms of protest over the censuses in Germany in 1983 and 1987. In both countries, memories of World War II and of how governments had terrorized the Dutch and German people during and after the war may explain such reactions.

Similarly, proposals for national-level identity cards produced the same reaction in numerous countries. Today, however, almost all modern societies have developed systems to authenticate their citizens. Those systems have evolved with the advent of new technologies, in particular biometric cards and e-cards: the pocket-sized ID card has become a biometric card in almost all European countries and an e-card in Estonia. Citizens of many countries, including democracies, are required by law to carry ID cards with them at all times. Surprisingly, these cards are still viewed by Americans as a major tool of oppressive governments, and establishing a national-level ID card is generally not considered fit for discussion.

In some countries where people shared this American view, governments have learned hard lessons. As a result, contemporary national identification policies tend to be introduced more gradually and under labels other than an ID system per se. Thus, the new Australian policy has been termed an Access Card since its introduction in 2006, and the Canadian government now speaks of a national Identity Management policy. More recently, the Indian government has implemented Aadhaar, the world’s biggest biometric identification scheme, containing the personal details, fingerprints, and iris patterns of 1.2 billion people, nine out of ten Indians.

It is time for the federal government, taking lessons from other countries, to create a national-level authentication system in the United States, given the benefits such a system would bring to Americans.

The advantages of a national authentication system would outweigh its disadvantages, contrary to opponents’ arguments about privacy and discrimination. I will use two main arguments to justify this claim. First and foremost, the most significant justification for identifying citizens is to ensure the public’s safety and well-being. Even in Europe, where the right to privacy is extremely important, Europeans have made a trade-off in favour of their safety. Documents captured from Al Qaeda and ISIS show that terrorists are aware that anonymity is a valuable tool for penetrating an open society. Domestic terrorists would also be easier to catch if the country had a universal authentication system. For instance, the Unabomber became one of the most notorious terrorists in the United States because he was extremely hard to track, having almost no identity in society.

Second, opponents of a national authentication system argue that traditional ID cards or a national authentication system would be a source of discrimination. In fact, universal identifiers could serve to reduce discrimination in some areas. All job applicants would be identified to prevent fake identities, not only immigrants or those who look or sound “foreign”. Take the example of E-Verify, a voluntary online system operated by the U.S. Department of Homeland Security (DHS) in partnership with the Social Security Administration (SSA). It is used to verify an employee’s eligibility to work legally in the United States: E-Verify checks workers’ Form I-9 information for authenticity and work authorization status against SSA and Citizenship and Immigration Services (CIS) databases. Today, more than 20 states have adopted laws that require employers to use the federal government’s E-Verify program. Because E-Verify entails additional administrative costs for potential employers, it acts as a driver of discrimination against immigrant workers in the United States. A national “E-Verify” system covering all US residents would remove this source of discrimination.

The lack of a nationwide authentication system results in significant social costs.

Identity theft has become a serious problem in the United States. Though the federal government passed the Identity Theft and Assumption Deterrence Act in 1998 to crack down on the problem and make it a federal felony, the cost of identity theft has continued to increase significantly[1]. Identity thieves have stolen over $107 billion in the US over the past six years. Identity theft is particularly frightening because there is no completely effective way for most people to protect themselves. Rich and powerful people can also be caught in the trap. For example, Abraham Abdallah, a busboy in New York, succeeded in stealing millions of dollars from famous people’s bank accounts using the Social Security numbers, home addresses, and birthdays of Warren Buffett, Oprah Winfrey, Steven Spielberg, and others.

People usually think that identity theft is mainly a case of someone using another person’s identity to steal money from them, mostly via stolen credit cards or, in more elaborate cases, schemes like that of the above-mentioned New Yorker. But the reality is much broader. In his book The Limits of Privacy, Amitai Etzioni lists several categories of crime related to identity theft:

    • Criminal fugitive
    • Child abuse and sex offenses
    • Income tax fraud and welfare fraud
    • Nonpayment of child support
    • Illegal immigration

Additionally, the highest hidden cost to American society from the lack of a universal identity system is, in my opinion, the vulnerability of its democracy and the inefficiency of society as a whole. In most democracies, a universal authentication system allows citizens to interact with the government, reducing transaction costs while increasing trust in government. Moreover, it is a step toward e-elections in countries where, as in the United States, the turnout rate has become critical. Without a universal and secure authentication system, any reform of elections would be very difficult to put in place.

Overall, the tangible and intangible cost of not having a national authentication system is very high.

Conclusion

The United States is one of the very few democracies with no standardized universal identification system, and the social cost is significant. Today’s technologies can make it possible to protect such a system from abuse; this is not a zero-sum game. Opponents of this kind of authentication system are wrong, and their arguments no longer hold today. “Information does not kill people; people kill people,” as Dennis Bailey wrote in The Open Society Paradox. It is time to create a single, secure, and standardized national-level ID to replace the patchwork of de facto paper documents currently in use in the United States. An incremental implementation of an Estonian-style system, with a possible opt-out option along the lines of the Canadian approach, could be an appropriate answer to opponents of a national authentication system in the United States.

Bibliography

1/ The Privacy Advocates – Colin J. Bennett, The MIT Press, 2008.

2/ The Open Society Paradox – Dennis Bailey, Brassey’s Inc., 2004.

3/ The Limits of Privacy – Amitai Etzioni, Basic Books, 1999.

4/ E-Estonia: The power and potential of digital identity – Joyce Shen, 2016. https://blogs.thomsonreuters.com/answerson/e-estonia-power-potential-digital-identity/

5/ E-Authentication Best Practices for Government – Keir Breitenfeld, 2011. http://www.govtech.com/pcio/articles/E-Authentification-Best-Practices-for-Government.html

6/ My life under Estonia’s digital government – Charles Brett, 2015. https://www.theregister.co.uk/2015/06/02/estonia/

7/ Hello Aadhaar, Goodbye Privacy – Jean Drèze, 2017. https://thewire.in/government/hello-aadhaar-goodbye-privacy

GDPR: Good Intentions, Unintended Consequences?
By Jen Patterson-Radovancevic | July 17, 2020


The EU’s General Data Protection Regulation, or GDPR, has often been lauded for its progressiveness, having seemingly expanded the definition of what can fall under a governing body’s supervision when it comes to technology. The goal was to make Europe “fit for the digital age,” but GDPR has implications for businesses and individuals on a global scale, and only some of them are good.

While GDPR protects European residents (it is not exclusive to citizens) and has inspired adherence to its principles and similar policies abroad, it could impose negative economic costs on non-EU countries, and these costs would potentially fall hardest on the most economically vulnerable countries, or those still catching up in terms of technological globalization.

The EU regulation impacts firms both inside and outside the EU — in effect, it can affect any company that touches the data of EU businesses, residents, or citizens, regardless of whether it has a physical headquarters in Europe. If a business is outside the EU but handles any European data, it is required to designate a representative to monitor the company’s data practices, notify relevant EU authorities of potential data breaches, and attend enforcement proceedings in the event of GDPR non-compliance. In such a case of noncompliance, the ICO or another European Data Protection Authority can serve a formal enforcement notice on the company. This would likely take the form of blocking the service in the case of unlawful data processing, or goods seizure in the case of personal data related to the sale of physical goods being processed unlawfully. Repeat offenders can be fined up to €20 million (approximately $23.5 million USD) or 4% of the company’s worldwide annual turnover, whichever is greater.
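
The fine ceiling reduces to a simple rule of arithmetic. Here is a minimal sketch, with hypothetical turnover figures, of the “€20 million or 4% of worldwide turnover, whichever is greater” cap described above:

```python
# Minimal sketch of the GDPR maximum-fine cap described above:
# up to EUR 20 million or 4% of worldwide annual turnover, whichever is greater.
# The turnover figures below are hypothetical examples.
def max_gdpr_fine(worldwide_turnover_eur: float) -> float:
    return max(20_000_000, 0.04 * worldwide_turnover_eur)

for turnover in (100_000_000, 2_000_000_000):  # two hypothetical companies
    print(f"turnover EUR {turnover:,} -> maximum fine EUR {max_gdpr_fine(turnover):,.0f}")
```

For the smaller company the flat €20 million ceiling dominates; for the larger one, the 4% term does.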

This presents a tricky situation for non-EU countries and companies that have high market contact with the EU, in whatever capacity. These might include, for example, “fringe” European countries, like those in Southeastern Europe. Countries must decide whether or not they will follow in the EU’s footsteps, and create their own “progressive” data protection policies. If they do, they may be able to ensure that their companies can maintain business relationships in Europe, but would face the high cost of actually enforcing the regulation — certainly, this would affect the poorer, most vulnerable countries the most. If they do not pursue such policies, they might save regulation costs, but risk losing overall GDP; furthermore, companies within their borders will have to compete amongst themselves and decide if they will meet GDPR’s requirements, possibly with little governmental support. Firms surviving at the edge may be sunk simply from the administrative cost of determining who among their user base is an EU resident. EU countries, and other rich western countries, can more easily afford to switch over to the GDPR standards either via policy or via natural market competition, and likely have the private technical knowledge and public governmental support to do so.

Aside from the economic burden, there’s another effect to consider when it comes to GDPR’s influence: that the rest of the world, as it mimics Europe’s policies and practices to fit into the global economy, will de facto adopt the European model of data privacy. With the European and California models of privacy poised to become the dominant privacy paradigm globally, the question must be asked — is it right for the West to impose its conceptualization of privacy on the rest of the world? No matter how well-intentioned, the ideological effects of GDPR may, in a sense, act as a form of technological imperialism. Furthermore, exacting regulations after the West’s Internet tech companies have been firmly established is a practice reminiscent of other potentially harmful “progressive” movements by the West: imposing environmental laws on industrial-era Stage 2 and 3 countries, while the West’s service-based economies reached their current state of comfort by engaging themselves in environmental exploitation; or even the apparent solidification of national borders in the name of self-determination, once Europe and the US were satisfied with the outcomes. The West generously exporting its morality is not new; nor is the world’s willingness to adopt that morality, if it means staying competitive in the global market.

In this piece, I don’t intend to obscure the benefits GDPR provides. The transparency that it demands from large companies, especially concerning data practices and data breaches, is a huge leap forward from pre-GDPR times. Rather, my goal is to highlight some of the potential negative externalities of the legislation, in the hope that it may inspire others to consider deeply the true effects of such a premium, global policy on the world’s underdogs.

Tracking Transitions: The Surveillance State, Identification Documents, & Trans Communities
By Kai Nham | July 17, 2020

For many trans people, especially trans people of color, visibility is fraught with contradictions. Who gets to be seen as their full selves? Who is forced to be hypervisible? These contradictions are reflected in the processes of the United States’ surveillance mechanisms that continue to police and enact administrative violence on trans communities. While trans people are disproportionately affected by the rise of surveillance technologies, such as facial recognition, another pervasive element of the surveillance state is identity documentation. Cross-referenced across many databases, identity documentation serves as an access point to many of the rights conferred by citizenship, both formally and informally.


Fig. 1: Art credited to Ana Galvañ, originally illustrated for John Seabrook’s New Yorker article, “Dressing for the Surveillance Age”

Policing Trans as a Category
The category of trans has been one that has consistently been policed through formal mechanisms of identification by the state, as well as through social interactions with those who carry the administrative power of the state (e.g., police officers, Transportation Security Administration (TSA) agents, doctors, etc.). In order to obtain or change identity documents, many trans people in the United States must obtain doctors’ notes that document their “proper” transition, which historically has been prescribed as one’s ability to conform to norms of whiteness, class privilege, heterosexuality, and able-bodiedness. This poses many problems for trans and gender nonbinary people who cannot conform to these prescriptions of normative gender, do not have the financial or material means to medically transition, do not desire to medically transition, or any combination of the above. Doctors, then, serve as one of the gatekeepers to identity-aligned documents. In the 2015 National Transgender Discrimination Survey report, one third of respondents had none of their identity documents updated to align with their gender identity. The consequences of these barriers and practices, which make trans people both invisible and hypervisible simultaneously, are dangerous: 40% of people who presented an ID that did not appear to match their gender expression were harassed, 15% were asked to leave the establishment, and 3% were physically assaulted.

Additionally, the data used “as part of these legal processes (along with any form requiring one to identify as a specific gender) form a paper trail through which state agencies [and other private entities] may track, assess, and manage transgender people” (Beauchamp, 2014). These paper trails expose the trans community to the dangers of hypervisibility by the state, which has historically played a large role in the violence enacted on them, whether through acts of police violence or housing discrimination.

Identification & Post-9/11 Surveillance Policies
In the post-9/11 era, we have seen an expansion and bolstering of the surveillance state in more visible and explicit ways. Even where technologies and processes do not explicitly seek to target or harm trans and gender nonbinary communities, these surveillance practices have been found to disproportionately harm those exact communities. The heightened scrutiny of identity documents under the United States’ “War on Terror” policies has led to the targeting particularly of trans people of color, trans immigrants, and low-income trans people, who are more likely to have inconsistent identity documents. Deeply connected to notions of who is deemed a citizen versus an alien, these surveillance policies work within a nexus of neoliberalism, white supremacy, and cisheteronormativity. For trans and gender nonbinary communities, the lack of access to “proper” and “accurate” documentation excludes them from basic needs, such as work or public spaces, and marks them as undesirable aliens to be policed and surveilled.


Fig. 2: A black and white photo of a crowd of protestors with the words “Expect Trans Resistance” printed on a trans flag.

Revealing how the surveillance state continues to violate and harm trans and gender nonbinary communities through technologies that seem as benign as identity documentation calls into question what liberation would look like. This does not include neoliberal reforms that nod at inclusion into a violent state apparatus. Rather, looking to the continued legacy of radical trans politics from riots against state violence at Compton’s Cafeteria to Stonewall, trans and gender nonbinary communities have been working to dismantle these systems of power that systemically harm them and creatively imagine and build new systems that allow them to thrive.

References

  • Beauchamp, Toby. “Surveillance.” TSQ 1 May 2014; 1 (1-2): 208–210. doi: https://doi.org/10.1215/23289252-2400037.
  • Grant, Jaime M., Lisa A. Mottet, Justin Tanis, Jack Harrison, Jody L. Herman, and Mara Keisling. Injustice at Every Turn: A Report of the National Transgender Discrimination Survey. Washington: National Center for Transgender Equality and National Gay and Lesbian Task Force, 2011.

Computer Vision and Regulation
By Hong (Sophie) Yang | July 3, 2020

What is computer vision?

Computer vision is a field of study focusing on training the computer to see.
“At an abstract level, the goal of computer vision problems is to use the observed image data to infer something about the world.”
(Page 83, Computer Vision: Models, Learning, and Inference, 2012).

The goal of computer vision is to understand the content of digital images. Typically, this involves developing methods that attempt to reproduce the capability of human vision. Object detection is one form of computer vision. Understanding the content of digital images may involve extracting a description from the image, which may be an object, a text description, a three-dimensional model, and so on. During object detection inference, the model draws bounding boxes around objects using learned weights, the coefficients trained from labeled images. Each bounding box gives the xmin, ymin, xmax, and ymax position of the object along with a confidence value.
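
As a rough illustration of the inference output described above, here is a minimal sketch that loads a small pretrained YOLOv5 model through the PyTorch Hub interface and prints one bounding box per detected object. The image filename is a placeholder, and the snippet assumes torch and the YOLOv5 hub dependencies are installed:

```python
# Minimal sketch: run a pretrained YOLOv5 detector and print its bounding boxes.
# Assumes torch and the ultralytics/yolov5 hub dependencies are installed;
# 'example.jpg' is a placeholder image path.
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
results = model('example.jpg')

# Each row of results.xyxy[0]: xmin, ymin, xmax, ymax, confidence, class index
for xmin, ymin, xmax, ymax, conf, cls in results.xyxy[0].tolist():
    print(f"{model.names[int(cls)]}: box=({xmin:.0f}, {ymin:.0f}, {xmax:.0f}, {ymax:.0f}), "
          f"confidence={conf:.2f}")
```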

Use cases of Computer Vision

This is a list of professionally researched areas where computer vision has seen successful use:

  • Optical character recognition (OCR)
  • Machine inspection
  • Retail (e.g. automated checkouts)
  • 3D model building (photogrammetry)
  • Medical imaging
  • Automotive safety
  • Match move (e.g. merging CGI with live actors in movies)
  • Motion capture (mocap)
  • Surveillance
  • Landmark detection
  • Fingerprint recognition and biometrics

It is a broad area of study with many specialized tasks and techniques, as well as specializations to target application domains.

From YOLO to Object Detection Ethical Issues

YOLO (You Only Look Once) is a real-time object detection model created by Joseph Redmon in May 2016; YOLOv5, released in June 2020, is the most recent state-of-the-art model in the family. YOLO solved a low-level computer vision problem, and more tools can be built on top of the YOLO model, from autonomous driving to real-time cancer cell detection and monitoring.

In February 2020, news shocked the machine learning community: Joseph Redmon announced that he had ceased his computer vision research to avoid enabling potential misuse of the technology, citing in particular “military applications and privacy concerns.”

The news sparked discussion about the “broader impact of AI work including possible societal consequences – both positive and negative” and whether “someone should decide not to submit their paper due to Broader Impacts reasons.” That is where Redmon stepped in to offer his own experience. Despite enjoying his work, Redmon tweeted, he had stopped his computer vision research because he found that the related ethical issues “became impossible to ignore.”

Redmon said he felt a certain degree of humiliation for ever believing that “science was apolitical and research objectively moral and good no matter what the subject is.” He said he had come to realize that facial recognition technologies have more downside than upside, and that they would not be developed if enough researchers thought through the enormous downside risks of their broader impact.

When Redmon released YOLOv3 in 2018, he wrote about the implications of having a classifier such as YOLO: “If humans have a hard time telling the difference, how much does it matter?” And on a more serious note: “What are we going to do with these detectors now that we have them?” He also insisted that computer vision researchers have a responsibility to consider the harm their work might be doing and to think of ways to mitigate it.

“We owe the world that much”, he said.

This whole debate led to these questions, which might go unanswered forever:

  • Should the researchers have a multidisciplinary, broader view of the implications of their work?
  • Should all research be regulated in its initial stages to avoid malicious societal impacts?
  • Who gets to regulate the research?
  • Shouldn’t experts create more awareness rather than simply quit?
  • Who should pay the price: the innovator or those who apply the technology?

One big complaint that people have against Redmon’s decision is that experts should not quit. Instead, they should take the responsibility of creating awareness about the pitfalls of AI.

A Forbes article, “Should AI Be Regulated”, published in 2017, pointed out that AI is a fundamental technology: a field of research and development comparable to quantum mechanics, nanotechnology, biochemistry, nuclear energy, or even math, to cite a few examples. All of these could have scary or evil applications, but regulating them at the fundamental level would inevitably hinder advances, some of which could have a much more positive impact than we can envision now. What should be heavily regulated is their use in dangerous applications, such as guns or weapons. This leads to the tough questions: Who gets to regulate it, and at what level?

Surge Pricing – Is it fair?
By Sudipto Dasgupta | July 3, 2020

What is Surge Pricing?

Surge pricing by rideshare companies like Uber and Lyft originates from the idea of adjusting ride prices to match driver supply to rider demand at any given time. During periods of excess demand, the number of riders is high relative to the number of cars, and customers have to wait longer, so the rideshare companies raise their normal fares. Fares are increased by a multiplier that depends on demand in real time. Whenever rates are raised due to surge pricing, the app lets riders know. Some riders will choose to pay, while others will wait a few minutes to see if the rates go back down. Most regular users of these apps will have encountered surge pricing, as depicted below.


Fig 1: Surge Pricing

A brief history

Surge pricing is based on the concept of dynamic pricing, which is not new. In the 1950s, the New York subway faced a problem: at peak times it was overcrowded, while at other times the trains were empty. William Vickrey suggested abandoning the flat-rate fare in favor of a fare structure that takes into account the length and location of the ride and the hour of the day. This was called peak-load pricing.

Rideshare apps extend the concept of peak-load pricing through their surge prices. The difference is that the price calculation depends not only on peak load but also on other factors such as driver availability, weather, zip code, special events, and rush hours, to mention a few. The apps rely on algorithms that are opaque to the consumer to compute the multiplier, and the factors that influence prices are not transparent to riders.
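
To make the idea of a demand-driven multiplier concrete, here is a purely hypothetical sketch. Ride-hailing companies do not publish their pricing algorithms, so the formula below (a demand-to-supply ratio with ad hoc adjustments for weather and events) is an illustration, not a description of any real system:

```python
# Purely illustrative surge multiplier: real ride-hailing pricing algorithms
# are proprietary and far more complex. All adjustment factors are invented.
def surge_multiplier(ride_requests, available_drivers,
                     bad_weather=False, special_event=False, cap=5.0):
    ratio = ride_requests / max(available_drivers, 1)
    multiplier = max(1.0, ratio)           # surge only when demand exceeds supply
    if bad_weather:
        multiplier *= 1.2                  # hypothetical weather adjustment
    if special_event:
        multiplier *= 1.5                  # hypothetical event adjustment
    return round(min(multiplier, cap), 1)  # cap and round to one decimal place

base_fare = 12.50
m = surge_multiplier(ride_requests=180, available_drivers=60, bad_weather=True)
print(m, base_fare * m)  # e.g. a 3.6x multiplier applied to the base fare
```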

Consciously or unconsciously, we as riders accept the convenience in exchange for the cost. We may not even check the fare multiplier when taking a convenient ride.

Fairness Concerns

The opaque algorithms behind surge pricing raise multiple fairness concerns. Prices on Uber and Lyft rose to as much as five times normal rates in the immediate aftermath of a deadly shooting in downtown Seattle in January 2020. The automated surge pricing lasted for about an hour and drew widespread criticism before the companies manually reset prices to normal levels. In 2015, Spencer Meyer, a Connecticut Uber rider, sued Uber co-founder and then-CEO Travis Kalanick, alleging that Uber was engaging in price-fixing. Uber also came under criticism for hiking prices during a hostage crisis unfolding in Sydney in 2014, and subsequently apologized.

An analysis conducted by Akshat Pandey and Aylin Caliskan of George Washington University indicates possible disparate impact, due to social bias based on age, house prices, education, and ethnicity, in the dynamic fare pricing models used by ride-hailing applications.


Fig 2: City of Chicago ride-hailing data. The colors in each chart designate the average fare price per mile for each census tract.

The authors analyzed 100 million rides in the city of Chicago from November 2018 to December 2019 and reported increases in ride-hailing prices when riders were picked up or dropped off in neighborhoods with a low percentage of (1) people over the age of 40, (2) people with a high school education or less, and (3) houses priced under the median for Chicago.
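
For readers curious what this kind of aggregation looks like in practice, here is a rough sketch of computing average fare per mile by pickup census tract. The column names follow the public City of Chicago "Transportation Network Providers - Trips" dataset but should be treated as assumptions, and the CSV path is a placeholder:

```python
# Rough sketch of the aggregation behind Fig 2: average fare per mile by
# pickup census tract. Column names assume the City of Chicago TNP trips
# dataset; 'chicago_tnp_trips.csv' is a placeholder export of that data.
import pandas as pd

trips = pd.read_csv("chicago_tnp_trips.csv",
                    usecols=["Pickup Census Tract", "Fare", "Trip Miles"])

# Drop rows that cannot be attributed to a tract or priced per mile
trips = trips.dropna(subset=["Pickup Census Tract"])
trips = trips[trips["Trip Miles"] > 0]

trips["fare_per_mile"] = trips["Fare"] / trips["Trip Miles"]

avg_by_tract = (trips.groupby("Pickup Census Tract")["fare_per_mile"]
                     .mean()
                     .sort_values(ascending=False))

print(avg_by_tract.head(10))  # tracts with the highest average fare per mile
```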

Surge pricing manifests as decisional interference for riders, and biases in the training data influence the outcomes of the algorithm. Hence the questions arise: What data are the algorithms trained on? How can ridesharing apps reduce the opacity of their algorithms? Is it possible to explain the AI models behind the algorithms, given that the apps are used by a diverse group of riders with different levels of technical understanding?

What do ridesharing companies have to say?

“When demand for rides outstrips the supply of cars, surge pricing kicks in, increasing the price,” Uber explains. The company says that surge pricing reduces the number of requests made during a peak time while drawing more drivers to busy areas. “As a result, the number of people wanting a ride and the number of available drivers come closer together, bringing wait times back down.”

Looking Forward

Answering the questions about the opacity of the algorithms is important for addressing the fairness concerns. Can the complex algorithms be exposed to users? Ridesharing apps could provide simpler mechanisms to explain the multiplier and make it more predictable for riders.

Addressing the Weaponization of Social Media to Spread Disinformation
By Anonymous | July 3, 2020

The use of social media platforms like Facebook and Twitter by political entities to present their perspectives and goals is arguably a key aspect of their utility. However, social media is not always used in a forthcoming manner. One such example is the use of these sites by Russia to spread disinformation by exploiting platform engagement and the cognitive biases of the users. The specific mechanisms of their techniques are documented and summed up as a “Firehose of Falsehood”, which serves as a guide to identify specific harms that we can proactively guard against.

The context of the analysis was rooted in the techniques being employed around the time of Russia’s 2014 invasion of the Crimean Peninsula. The techniques employed would go on to be reused to great effect in 2016, when they were used against the United Kingdom in their Brexit referendum, as well as the United States in their presidential election. More recently, the Firehose has been used against many other targets, including 5G cellular networks and vaccines.

While their techniques share some similarities with those of their Soviet predecessors, the key characteristics of Russian propaganda are that it is high-volume and multichannel, continuous and repetitive, and lacking in commitment to objective reality or consistency. This approach lends itself well to social media platforms, as the speed at which new false claims can be generated and broadly disseminated far outstrips the speed at which fact checkers operate: polluting is easy, but cleaning up is difficult.


Figure 1: The evolution of Russian propaganda towards obfuscation and using available platforms
(Sources: Amazon, CBS)

The Firehose also emphasizes exploiting audience psychology in order to disinform. The cognitive biases exploited include the advantage of the first impression, the use of information overload to force reliance on shortcuts for judging trustworthiness, the use of repetition to create familiarity, the use of evidence regardless of veracity, and peripheral cues such as creating the appearance of objectivity. Repetition in particular works because familiar claims are favoured over less familiar ones: repeating a message frequently creates familiarity, which in turn leads to acceptance. From there, confirmation bias further entrenches those views.


Figure 2: A cross-section of specimens from the 2016 election
(Source: Washington Post)

Given the nature of the methods outlined, some suggested responses are:

1. Do not rely solely on traditional techniques of pointing out falsehoods and inconsistencies
2. Get ahead of misinformation by raising awareness and working to expose manipulation efforts
3. Focus on thwarting the desired effects by redirecting behaviours or attitudes without directly engaging with the propaganda
4. Compete by increasing the flow of persuasive information
5. Turn off the flow by undermining the broadcast and message dissemination through enforcement of terms of service agreements with internet service providers and social media platforms

From an ethical standpoint, some of the proposed measures have some hazards of their own – in particular, the last suggestion (“turn off the flow”) may be construed as viewpoint-based censorship if executed without respect for the users’ autonomy in constructing their perspectives. As well, competing may be tantamount to fighting fire with fire, depending on the implementation. Where possible, getting ahead of the misinformation is preferable, as forewarning acts as an inoculant for the audience – by getting the first impression and highlighting attempts to manipulate the audience, it prepares the users to critically assess new information.

As well, if it’s necessary to directly engage with the claims being made, solutions proposed are:

1. Warn at the time of initial exposure to misinformation
2. Repeat the retraction/refutation, and
3. Provide an alternative story while correcting misinformation, to immediately fill the information gap that arises

These proposed solutions are less problematic than the prior options, as limiting the scope to countering the harms of specific instances of propaganda, despite the limitations highlighted above, preserves respect for users to arrive at their own conclusions.

In fighting propaganda, how can we be sure that our actions remain beneficent in nature? In understanding the objectives and mechanics of the Firehose, we also see that there are ways to address the harms being inflicted in a responsible manner. By respecting the qualifications of the audience to exercise free will in arriving at their own conclusions and augmenting their available information with what’s relevant, we can tailor our response to be effective and ethical.

Sources:
The Russian “firehose of falsehood” propaganda model: Why it might work and options to counter it
Your 5G Phone Won’t Hurt You. But Russia Wants You to Think Otherwise
Firehosing: the systemic strategy that anti-vaxxers are using to spread misinformation
Release of Thousands of Russia-Linked Facebook Ads Shows How Propaganda Sharpened
What we can learn from the 3,500 Russian Facebook ads meant to stir up U.S. politics

Your Health, Your Rights: medical information not covered by HIPAA
By Adam Sohn | June 26, 2020

HIPAA

HIPAA (the Health Insurance Portability and Accountability Act of 1996) protects your personal medical information as held by a medical provider. Under HIPAA, you may obtain your record, add information to your record, seek to change your record, learn who sees your information, and, perhaps most importantly, exercise limited control over who sees your information.

HIPAA protection provides security enshrined in law. However, the internet and Artificial Intelligence have provided additional vectors for personal medical information to be ascertained and distributed outside of a person’s control. The implications of data release from any vector are comparable to sharing from a medical setting.

Technology Generates and Discloses Medical Information
An example of entities not bound by HIPAA for most transactions, yet dealing in medical information, is the retail sector. As customers purchase a market basket of products associated with a certain medical status, astute predictive analytics systems operated by a retailer can infer that status. This inferred medical status is free from HIPAA protections because it has no origins in a medical setting; furthermore, the status was inferred rather than provided by the customer.
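
To illustrate the mechanism (not any retailer’s actual system), here is a purely hypothetical sketch of inferring a sensitive status from purchase history. The product features, synthetic data, and model choice are all invented for the example:

```python
# Hypothetical sketch of market-basket inference of a sensitive status.
# Features, data, and labels are synthetic; no real retailer model is implied.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: unscented lotion, prenatal vitamins, large handbag, cocoa butter
X = np.array([
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = known instance of the status, 0 = not

model = LogisticRegression().fit(X, y)

new_basket = np.array([[1, 1, 0, 0]])         # a new customer's purchases
print(model.predict_proba(new_basket)[0, 1])  # estimated probability of the status
```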

Famously, the astuteness of Target’s predictive analytics was on display in 2012 when coupons for baby supplies were sent to the home of a teenage girl. While it is alarming enough that Target had a database of inferred medical information (in this case, pregnancy), Target went a step further by disclosing this information for anyone handling the teenage girl’s mail to happen upon. This incident triggered a public understanding of the privacy risks related to data aggregation, where mundane data becomes a building block of sensitive information.

Exploring Privacy Protections
Exploring the state of protections that do exist to prevent unwanted disclosures such as the Target case reveals a picture of a system that has room to mature.

– One way to prevent unwanted disclosure is to personally opt out of mailed advertisements from Target, per the instructions in Target’s Privacy Policy. However, it is unrealistic to expect a customer to foresee such a need.
– Another method is to submit a complaint to the FTC regarding a violation of a Privacy Policy. However, Target’s Privacy Policy is vague on these matters.

Expanding the view to regulatory changes that do not yet exist but are in the approval process, there is a relevant bill in Congress. CONSENT (the Customer Online Notification for Stopping Edge-provider Network Transgressions Act) was brought to the Senate in 2018 and is currently under review in the Committee on Commerce, Science, and Transportation. CONSENT would turn the tide in the public’s favor with regard to the security of personally identifying information (PII) by requiring a distinct opt-in for sharing or using PII. However, the bill applies only to data transacted online, which is only a portion of the relationship a consumer has with a retailer.

Clearly, consumer behavior is trending toward online purchases. However, brick-and-mortar purchasing cannot be overlooked, as it is also increasing.

Advice to Consumers
In light of the general laxness of protections, the methods for keeping your information secure fall under the adage caveat emptor: buyer beware. For individual consumers, options to keep your information safe are:
– Only share the combination of PII and medical information in a setting where you are explicitly protected by a Privacy Policy.
– Forgo certain conveniences in order to remain obscure. This entails using cash in a brick-and-mortar store and refraining from participating in loyalty programs.

Sources
[HIPAA]
[New York Times – Shopping Habits]
[Consumer Privacy Bill of Rights]
[CONSENT]

Discriminatory practices in interest-based advertising
By Anonymous | June 26th, 2020

Economics and ethics

The multi-billion dollar online advertising industry is incentivised to ensure that ad dollars convert into sales, or at least high click-through rates. Happy clients equate to healthy revenues. The way to realize this goal is to match the right pair of eyeballs for each ad – quality, not quantity, matters.

Interest-based ads (sometimes referred to as personalized or targeted ads) are strategically placed for specific viewers. The criteria for viewer selection can be based on immutable traits like race, gender, and age, or on online behavioral patterns. Unfortunately, both approaches are responsible for amplifying racial stereotypes and deepening social inequality.

Baby and the bathwater

Dark ads exclude a person or group from seeing an ad by targeting viewers based on an immutable characteristic, such as sex or race. This is not to be confused with the notion of big data exclusion, where ‘datafication unintentionally ignores or even smothers the unquantifiable, immeasurable, ineffable parts of human experience.’ Instead, dark ads refer to a deliberate act by advertisers to shut out certain communities from participating in their product or service offerings.

Furthermore, a behaviorally targeted ad can act as a social label even when it contains no explicit labeling information. When consumers recognize that the marketer has made an inference about their identity in order to serve them the ad, the ad itself functions as an implied social label.


Source: The Guardian

That said, it’s not all bad news with these personalized ads. While there are calls to simply ban targeted advertising, one could argue for the benefits of having public health campaigns, say, delivered in the right language to the right populace. Targeted social programs could also be more effective if they reach the eyes and ears that need them. To take away this potentially powerful tool for social good is, at best, a lazy approach to solving the conundrum.

Regulatory oversight

In 2018, the U.S. Department of Housing and Urban Development filed a complaint against Facebook, alleging that the social media platform had violated the Fair Housing Act. Facebook’s ad targeting tools enabled advertisers to express unlawful preferences by suggesting discriminatory options, and Facebook effectuated the delivery of housing-related ads to certain users and not others based on those users’ actual or imputed protected traits.


Source: The Truman Library

A 2016 investigation by ProPublica found that Facebook’s ad tools allowed posters of housing ads to exclude Black people. Facebook’s privacy and public policy manager defended the practice, underlining the importance of advertisers having the ability to both include and exclude groups as they test how their marketing performs, never mind that A/B testing itself often straddles a grey area in the ethics of human subject research.


Source: ProPublica

Opinion

Insofar as online businesses are driven by advertising revenue, which is dictated by user traffic, interest-based ads are here to stay. Stakeholders with commercial interests will continue to defend their marketing tools with benevolent use cases. Lawmakers need to consistently address the harm itself: that deliberate exclusions (and not just those arising from algorithmic bias and opacity) serve to exacerbate inequalities rooted in discriminatory practices in the physical world.

In the example above, the HUD authorities did well to call out Facebook’s transgressions, which are no less serious than those of the Jim Crow era. As a society, we have moved forward with Brown v. Board of Education. Let us not slip back into complacency, justifying segregatory acts and becoming complicit in a new Plessy v. Ferguson.