GDPR Will Transform Insurance Industry’s Business Model
By Amit Tyagi | October 21, 2018

The European Union-wide General Data Protection Regulation, or GDPR, came into force on May 25, 2018, with significant penalties for non-compliance. In one sweep, GDPR harmonizes data protection rules across the EU and gives individuals greater rights over how their data are used. It will radically reshape how companies can collect, use and store personal information, giving people the right to know how their data are used and to decide whether they are shared or deleted. Companies face fines of up to 4 per cent of global turnover or €20m, whichever is greater.

To comply with GDPR, companies across industries are strengthening their data usage and protection policies and procedures, revamping old IT systems to ensure they have the functionality to meet GDPR requirements, and reaching out to customers to obtain the required consents.

However, GDPR will also require a fundamental rethink of business models for some industries, especially those that rely heavily on personal data to make business and pricing decisions. A case in point is the insurance industry. Insurers manage and underwrite risks, and collecting, storing, processing and analyzing data is central to their business model. The data insurers collect go beyond basic personal information: they include sensitive information such as health records, genetic history of illness, criminal records, accident-related information, and much more.

GDPR is going to affect insurance companies in many ways. Start with pricing. Setting the right price for underwriting risks relies heavily on data. With GDPR's data protection and usage restrictions, insurers will have to revisit their pricing models. This may have an inflationary effect on insurance prices: not a good thing for consumers. The effect will be compounded by 'data minimization', a core principle of GDPR that limits the amount of data companies can lawfully collect.

Insurance companies typically store their data for long periods, which aids pricing analytics and customer segmentation. With the right to erasure, customers can ask insurers to erase their personal data and claims history. These requests may well come from customers with an unfavorable claims history, leading to adverse selection driven by information asymmetry.

Insurance fraud is another area that will be affected by GDPR. Insurance companies protect themselves from fraudulent claims by analyzing myriad data points, including criminal convictions. With limits on the types of data they can lawfully use, fraud may well spike.

Insurance companies will also have to rethink internal processes and IT systems that were built for a pre-GDPR era. Most decisions in the insurance industry are automated, including, inter alia, whether to issue a policy, how much premium to charge, and whether to process a claim fully or partially. Under GDPR, customers can lawfully request human intervention in such decision making.

GDPR gives customers the right to receive the personal data an insurer holds on them, or to have it transmitted to another insurer, in a structured, commonly used and machine-readable format. This will be a challenge, as insurers will have to maintain interoperable data formats across disparate legacy IT systems. Further, this has to be done free of charge, and the resulting increase in competition among insurers will likely squeeze profitability.

GDPR mandates that data be retained only as long as necessary for the purpose for which it was collected, after which it must be deleted or anonymized; data stored for longer should be pseudonymized. This will require significant system changes, a huge challenge for insurance companies because they rely on disparate systems and data sources, all of which will have to change to meet GDPR requirements.
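As an illustration of what pseudonymization can look like in practice, here is a minimal sketch in Python, assuming a keyed hash whose secret is held outside the analytics store; the field names are hypothetical and this is not a compliance recipe.

```python
import hashlib
import hmac
import secrets

# Secret "pepper" kept outside the analytics database (a real deployment
# would manage this key in a vault, not generate it per run).
PEPPER = secrets.token_bytes(32)

def pseudonymize(customer_id: str) -> str:
    """Replace a direct identifier with a stable pseudonym.

    HMAC-SHA256 keyed with a secret means the mapping cannot be reversed
    without the key, yet the same customer always maps to the same token,
    so longitudinal pricing analytics can still run on the stored records.
    """
    return hmac.new(PEPPER, customer_id.encode(), hashlib.sha256).hexdigest()

record = {"customer_id": "C-10234", "claim_amount": 1250.0, "region": "DE"}
stored = {**record, "customer_id": pseudonymize(record["customer_id"])}
print(stored)
```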

Though insurers may be acutely affected by GDPR, their path to compliance should follow a disciplined approach: revisiting systems and processes to assess readiness for the regulation and investing to fill the gaps. Some changes will be big, such as data retention and privacy by design, while others are more straightforward, such as providing privacy notices. In all cases, effective change management is key.

Making AI “Explainable”: Easier Said than… Explained
By Julia Buffinton | October 21, 2018

Technological advances in machine learning and artificial intelligence (AI) have opened up applications for these computational techniques across a range of industries. AI is now used in facial recognition for TSA PreCheck, in resume screening at large companies, and in determining criminal sentences. In all of these examples, however, the algorithms have drawn attention for being biased. Biased predictions can have grave consequences, and determining how biases end up in an algorithm is key to preventing them.

Understanding how algorithms reach their conclusions is crucial to their adoption in industries such as insurance, medicine, finance, security, legal, and military. Unfortunately, the majority of the population is not trained to understand these models, viewing them as opaque and non-intuitive. This is challenging when accounting for the ethical considerations that surround these algorithms – it’s difficult to understand their bias if we don’t understand how they work in general. Thus, seeking AI solutions that are explainable is key to making sure that end users of these approaches will “understand, appropriately trust, and effectively manage an emerging generation of artificially intelligent machine partners.”

How can we do that?

Developing AI and ML systems is resource-intensive, and being thorough about ethical implications and safety adds to that. The federal government has recognized the importance not only of AI but also of ethical AI, and has increased its attention and budget for both.

In 2016, the Obama administration formed the new National Science and Technology Council (NSTC) Subcommittee on Machine Learning and Artificial Intelligence to coordinate federal activity relating to AI. Additionally, the Subcommittee on Networking and Information Technology Research and Development (NITRD) created a National Artificial Intelligence Research and Development Strategic Plan to recommend roadmaps for federal AI research and development investments.

This report identifies seven priorities:

  1. Make long-term investments in AI research
  2. Develop effective methods for human-AI collaboration
  3. Understand and address the ethical, legal, and societal implications of AI
  4. Ensure the safety and security of AI systems
  5. Develop shared public datasets and environments for AI training and testing
  6. Measure and evaluate AI technologies through standards and benchmarks
  7. Better understand the national AI R&D workforce needs

Three of the seven priorities focus on minimizing the negative impacts of AI. The plan indicates that "researchers must strive to develop algorithms and architectures that are verifiably consistent with, or conform to, existing laws, social norms and ethics," and that to achieve this, they must "develop systems that are transparent, and intrinsically capable of explaining the reasons for their results to users."

Is this actually happening?

Even though topics related to the security, ethics, and policy of AI comprise almost half of the federal government's funding priorities for AI, this has not translated directly into program funding levels. A brief survey of the budget for the Defense Advanced Research Projects Agency (DARPA) shows an overall increase in funding for 18 existing programs that advance basic and applied AI research, almost doubling each year. However, only one of these programs fits the ethical and security priorities.


Funding Levels for DARPA AI Programs

The yellow bar in the chart represents the Explainable AI program, which aims to develop machine learning techniques that produce more explainable yet still accurate models and enable human users to understand, trust, and manage them. Target results from this program include a library of "machine learning and human-computer interface software modules that could be used to develop future explainable AI systems" that would be available for "refinement and transition into defense or commercial applications." While funding for Explainable AI is increasing, it is not at a rate proportional to the overall spending increases for DARPA AI programs.
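As a rough illustration of the "explainable yet still accurate" goal, the sketch below fits an interpretable surrogate to a black-box model; this is one common explainability technique, not the DARPA XAI toolkit, and the scikit-learn dataset and estimators are assumptions made only for the example.

```python
# Minimal global-surrogate sketch: train a black-box model, then fit a shallow
# decision tree to mimic its predictions and read human-readable rules off it.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# "Black box": an ensemble whose internals are hard to narrate to an end user.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Surrogate: a depth-3 tree trained to imitate the black box's outputs.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# The surrogate's rules give an approximate, human-readable explanation.
print(export_text(surrogate, feature_names=list(X.columns)))
print("fidelity to black box:", surrogate.score(X, black_box.predict(X)))
```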

What’s the big deal?

This issue will become more prevalent as the national investment in AI grows. Recently, predictions have been made that China will close the AI gap by the end of this year. As US organizations in industry and academia strive to compete with their international counterparts, careful consideration must be given not only to improving technical capabilities but also to developing an ethical framework for evaluating these approaches. This affects not only US industry and the economy; it has major consequences for national security. Reps. Will Hurd (TX) and Robin Kelly (IL) argue that, "The loss of American leadership in AI could also pose a risk to ensuring any potential use of AI in weapons systems by nation-states comports with international humanitarian laws. In general, authoritarian regimes like Russia and China have not been focused on the ethical implications of AI in warfare." These AI tools give us great power, but with great power comes great responsibility, and we have a responsibility to ensure that the systems we build are fair and ethical.

What do GDPR and the #MeToo Movement Have in Common?
By Asher Sered | October 21, 2018

At first glance it might be hard to see what the #MeToo movement has in common with the General Data Protection Regulation (GDPR), the monumental new European regulation that governs the collection and analysis of commercially collected data. One is a 261-page document composed by a regulatory body; the other is a grassroots movement largely facilitated by social media. However, both are attempts to protect against commonplace injustices that are only now starting to be recognized for what they are. And, fascinatingly, both have brought the issue of consent to the forefront of public consciousness. In the rest of this post, I examine the issue of consent from the perspectives of sexual assault prevention and data privacy, and lay out what I believe to be a major issue with both consent frameworks.

Consent and Coercion

Feminists and advocates who work on confronting sexual violence have pointed out several issues with the consent framework, including the fact that consent is treated as a static and binary state rather than an evolving 'spectrum of conflicting desires'[1]. For our purposes, I focus on the issue of 'freely given' consent and coercion. Most legal definitions require that, for an agreement to count as genuine consent, the affirmation be given freely and without undue outside influence. However, drawing a line between permissible attempts to achieve a desired outcome and unacceptable coercion can be quite difficult in theory and in practice.

Consider a young man out on a date where both parties seem to be hitting it off. He asks his date if she is interested in having sex with him and she says 'yes'. Surely this counts as consent, and we would be quick to laud the young man for doing the right thing. Now, how would we regard the situation if the date was going poorly and the man shrugged off repeated 'nos' and continued asking for consent before his date finally acquiesced? What if his date feared retribution if she were to continue saying 'no'? What if the man was a celebrity? A police officer? The woman's supervisor? At some point a 'yes' stops being consent and starts being a coerced response. But where that line falls is both practically and conceptually difficult to disentangle.

Consent Under GDPR

The authors of GDPR are aware that consent can also be a moving target in the realm of data privacy, and have gone to some lengths to articulate under what conditions an affirmation qualifies as consent. The Regulation spends many pages laying out what is required of a business trying to procure consent from a customer, and attempts to build in consumer protections that shield individuals from unfair coercion. GDPR emphasizes eight primary principles of consent, including that consent be easy to withdraw, free of coercion and given with no imbalance in the relationship.

How Common is Coercion?

Just because the line between consent and coercion is difficult to draw doesn’t necessarily mean that the concept of consent isn’t ethically and legally sound. After all, our legal system rests on ideas such as intention and premeditation that are similarly difficult to disentangle. Fair point. But the question remains, in our society how often is consent actually coerced?

Michal Buchhandler-Raphael, a professor of law at Washington and Lee University, argues that problems with legal frameworks in which the definition of rape is built around non-consensual sex '[are] most noticeable in the context of sexual abuse of power stemming from professional and institutional relationships.'[2] She cites numerous cases in which a supervisor or someone in an otherwise privileged position managed to extract consent from a subordinate and therefore went unpunished by the legal system. Since 70% of rapes are committed by someone known to the victim, and presumably an even larger percentage of sexual interactions take place between parties who know each other, we can expect some amount of coercion in numerous day-to-day sexual interactions, especially given that we continue to live in a patriarchal society where men are much more likely than women to be in positions of power[3].

This observation about power imbalances in sexual interactions neatly parallels a major concern with consent under GDPR. While GDPR requires that data subjects have a ‘genuine or free choice’ about whether to give consent, it fails to adequately account for the fact that there is always a power differential between a major corporation and a data subject. Perhaps I could decide to live without email, a smartphone, social networks or search engines, but giving up those technologies would have a major impact on my social, political and economic life. It matters much more to me that I have access to a Facebook account than it does to Facebook that they have access to my data. If I opt out, they can sell ads to their other 2 billion customers.

Conclusion

I should be clear that I do not intend to suggest that companies stop offering Terms and Conditions to potential data subjects, or that prospective sexual partners stop seeking affirmative consent. But we do need to realize that consent is only part of the equation for healthy sexual relationships and just data practices. The next step is to think about what a world would look like where people are not constantly pressured to give things away, but instead are empowered to pursue their own ends.

Notes

[1] See, https://economictimes.indiatimes.com/news/politics-and-nation/thoughts-on-metoo-why-cant-men-understand-the-concept-of-consent-a-flimmaker-explains/articleshow/66198444.cms?from=mdr&fbclid=IwAR0O8fzj4cQ4d68nwqWciQPSLrepZIV00RJKAnUmsC0id6JBqnNb4CR69WQ for a fascinating take on the topic

[2] https://repository.law.umich.edu/cgi/viewcontent.cgi?article=1014&context=mjgl

[3] Of course, coercion can be used by people of all genders to convince potential partners to agree to have sex.

Workplace monitoring
by Anonymous on September 30, 2018

We never intended to build a pervasive workplace monitoring system; we just wanted to replace a clunky system of shared spreadsheets and email.

Our new system was intended to track customer orders as they moved through the factory. Workers finding a problem in the production process could quickly check order history and take corrective action. As a bonus, the system would also capture newly mandatory safety-related records. It also appealed to the notion of "Democratization of Data," giving workers direct access to customer orders. No more emails or phone-tag with production planners.

It goes live

The system was well received, and we started collecting a lot of data: a log of every action performed on each sales order, with user IDs and timestamps. Workers could see all the log entries for each order they processed, and the log entries became invisible to workers once an order was closed.

Invisible, but not deleted.

Two years later, during a time of cost-cutting, it came to the attention of management that the logs could be consolidated and sorted by *any* field.

A new report was soon generated: logs sorted by worker ID. It didn't seem like such a major request. After all, the data was already there. No notice was given to workers about the new report, or about its potential use as a worker performance evaluation tool.

Re-purposed data

The personally identifiable information was re-purposed without notice or consent. The privacy issue may seem intangible to the workers now, but it could one day become very tangible as a factor in pay or layoff decisions. There is also potential for misinterpretation of the data: a worker doing many small tasks could appear to be doing far more than a worker doing a few time-consuming tasks.

Protection for workers’ information

California employers may monitor workers' computer usage. The California Consumer Privacy Act of 2018 covers consumers, not workers.

However, the European Union's General Data Protection Regulation (GDPR) addresses this directly, and some related portions of the system operate in the EU.

GDPR's scope is broad, covering personally identifiable information in all walks of life (e.g. as a worker, as a consumer, as a citizen). Recital 26 makes clear: "The principles of data protection should apply to any information concerning an identified or identifiable natural person." Other recitals cover consent, re-purposing, fairness/transparency, and erasability (Recitals 32, 60, 65, and 66).

Most particularly, Article 88 requires that the collection and use of personally identifiable information in the workplace be subject to particularly high standards of transparency.

Failings

It’s easy in hindsight to find multiple points at which this might have been avoided. Mulligan’s and Solove’s frameworks suggest looking at “actors” and causes.

  • Software Developer's action (harm during data collection): There could have been a login at the level of a processing station ID, rather than a worker's personal ID (a minimal sketch of this appears after the list).
  • Software Developer's action (and timescale considerations, yielding harm during data processing): The developer could have completely deleted the worker IDs once the order closed.
  • Software Developer's action (harm during data processing and dissemination: increased Accessibility, Distortion): The developer could have written the report to show that simple "event counting" wasn't a reliable way of measuring worker contribution.
  • Management's action (harm during data gathering and processing): The secondary use of the data resulted in intrusive surveillance. Businesses have a responsibility (particularly under GDPR) to be transparent with respect to workplace data. Due concern for control over personal information was not shown.
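As a concrete sketch of the first two mitigations above, the following hypothetical code logs at the station level and purges worker IDs when an order closes; the schema is invented for illustration, not the actual factory system.

```python
# Hypothetical sketch: station-level logging plus worker-ID purge on closure.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class LogEntry:
    order_id: str
    station_id: str                  # identifies the processing step, not the person
    action: str
    worker_id: Optional[str] = None  # kept only while the order is open
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def close_order(log: List[LogEntry], order_id: str) -> None:
    """On order closure, delete worker IDs so the retained history can no
    longer be consolidated into a per-person performance report."""
    for entry in log:
        if entry.order_id == order_id:
            entry.worker_id = None

log = [LogEntry("SO-1001", "PAINT-03", "rework", worker_id="W-42")]
close_order(log, "SO-1001")
print(log[0])  # worker_id is gone; the order history itself remains usable
```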

Prevention

One way forward, when working on systems that evolve over time: consider Mulligan's contrast-concept dimension of privacy analysis, applied with Solove's meta-harm categories. Over the full evolutionary life of a software system, we can ask: "What is private and what is not?" If the actors – developer, manager, and even workers – had asked, as each new feature was requested, "What is being newly exposed, and what is made private?", they might not have drifted into the current state. It's a question that could readily be built into formal software development processes.

But in an era of "Data Democratization," when data-mining may be more broadly available within organizations, such checklists may be necessary but not sufficient. Organizations will likely need broad internal training on the protection of personal information.

May we recommend…some Ethics?
by Jessica Hays | September 30, 2018

The internet today is awash with recommender systems. Some pop to mind quickly – like Netflix's suggestions of shows to binge, or Amazon's nudges toward products you may like. These tools use your personal history on their site, along with troves of data from other users, to predict which items or videos are most likely to tempt you. Other examples of recommender systems include social media ("people you may know!"), online dating apps, news aggregators, search engines, restaurant finders, and music or video streaming services.


Screenshots of recommendations from LinkedIn, Netflix

Recommender systems have proliferated because there are benefits to be shared on both sides of the coin. Data-driven recommendations mean customers have to spend less time digging for the perfect product themselves. The algorithm does the heavy lifting – once they set off on the trail of whatever they’re seeking, they are guided to things they may have otherwise only found after hours of searching (or not at all). Reaping even more rewards, however, are the companies using the hook of an initial search to draw users further and further into their platform, increasing their revenue potential (the whole point!) with every click.
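To make that mechanism concrete, here is a minimal sketch of item-based collaborative filtering over a toy purchase matrix; the data are invented and the numpy approach is only illustrative of how "people who bought X also bought Y" recommendations can be computed, not any particular retailer's system.

```python
import numpy as np

users = ["ana", "ben", "cai"]
items = ["mug", "blanket", "shoes", "movies"]
# 1 = user interacted with the item (toy interaction matrix)
interactions = np.array([
    [1, 1, 0, 0],   # ana
    [1, 0, 1, 0],   # ben
    [0, 1, 0, 1],   # cai
], dtype=float)

# Cosine similarity between item columns.
norms = np.linalg.norm(interactions, axis=0, keepdims=True)
item_sim = (interactions.T @ interactions) / (norms.T @ norms + 1e-9)

def recommend(user_idx: int, k: int = 2):
    scores = interactions[user_idx] @ item_sim       # weight items by similarity
    scores[interactions[user_idx] > 0] = -np.inf     # drop items already seen
    return [items[i] for i in np.argsort(scores)[::-1][:k]]

print(recommend(users.index("ana")))  # items ana hasn't bought but similar users have
```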

Not that innocent

In the last year, however, some of these recommender giants (YouTube, Facebook, Twitter) have gotten attention for the ways in which their algorithms have been unwitting enablers of political radicalization and the proliferation of conspiratorial content. It’s not surprising, in truth, that machine learning quickly discovered that humans are drawn to drama and tempted by content more extreme than what they originally set out to find. And if drama means clicks and clicks mean revenue, that algorithm has accomplished its task! Fortunately, research and methods are underway to redirect and limit radicalizing behavior.

However, dangers need not be as extreme as ISIS sympathizing to merit notice. Take e-commerce. With over 70% of the US population projected to make an online purchase this year, behind-the-scenes algorithms could be influencing the purchasing habits of a healthy majority of the population. The sheer volume of people impacted by recommender systems, then, is cause for a closer look.

It doesn't take long to think up recommendation scenarios that could raise an eyebrow. While humans fold ethical considerations into their recommendations, algorithms programmed to drive revenue do not. For example, imagine an online grocery service, like Instacart, that notices a customer's regular junk-food purchases and responds by recommending even more junk food.


Junk food -> more junk food!

Can systems be held accountable?

While this may be great for retailers’ bottom line, it’s clearly not for our country’s growing waistlines. Some might argue that nothing has fundamentally changed from the advertising and marketing schemes of yore. Aren’t companies merely responding to the same old pushes and pulls of supply and demand – supplying what they know users would ask for if they had the knowledge/chance? Legally, of course, they’re right – nothing new is required of companies to suggest and sell ice cream, just because they now know intimately that a customer has a weakness for it.

Ethics, however, points the other way. Increased access to millions of users' preferences and habits, and the opportunity to influence their behavior, are not negligible. Between the power of suggestion, knowledge of users' tastes, and the lack of barriers between hitting "purchase" and having the treat delivered – what role should ethically responsible retailers play in helping their users avoid decisions that could negatively impact their well-being?

Unfortunately, there’s unlikely to be a one-size-fits-all approach across sectors and systems. However, it would be advantageous to see companies start integrating approaches that mitigate potential harm to users. While following the principles of respect for persons, beneficence, and justice laid out in the Belmont Report is always a good place to start, some specific approaches could include:

  • Providing users more transparency and access to the algorithm (e.g. being able to turn off/on recommendations for certain items)
  • Maintaining manual oversight of sensitive topics where there is potential for harm
  • Allowing users to flag and provide feedback when they encounter a detrimental recommendation

As users become more savvy and aware of the ways in which recommender systems influence their interactions online, it is likely that the demand for ethical platforms will only rise. Companies would be wise to take measures to get out ahead of these ethical concerns – for both their and their users’ sakes.

Mapping the Organic Organization
Thoughts on the development of a Network Graph for your company

Creating the perfect organization for a business is hard. It has to maintain alignment of goals and vision, enable the movement of innovation and ideas, develop and deliver the products and services, apply the processes and controls needed to remain disciplined, safe and sustainable, and much else besides. But getting it right means a significant competitive advantage.


Solid Organization

There are myriad organization designs, but even with all the effort and thought in the world, the brutal truth is that the perfect organizational structure for your company today will be imperfect tomorrow. But what if we could track how our organization was developing and evolving over time? Would that be useful in identifying the most effective organization structure, i.e. the one that, informally, is in place already? Furthermore, could tracing the evolution of that organization predict the structure of the future?

Monitoring the way information flows in a company can be tricky as it takes many formats through many systems, but communication flows are becoming easier to map. Large social media platforms such as Facebook or LinkedIn have developed functions to visualize your social network or business contact network. Twitter maps of key influencers and their networks also make interesting viewing. And increasingly, corporations are developing such tools to understand their inner workings.


Network graph example

A corporation's 'social network' can be visualised by capturing the metadata from communication between colleagues – email, IM, VoIP calls, meetings – and mapping it onto a network graph. Each individual engaged in work with the company constitutes a node, and communications form the edges (with the weight of an edge reflecting the frequency of communication).

The insights could be fascinating and important. The communities within the graph can be identified – perhaps these line up with the organization chart, but more likely not. The graph would identify key influencers or knowledge centers that may not be easily recognised outside a specific work group; when many nodes connect to one person, that aids risk management and succession planning. It could yield the 'alpha index', a measure of connectivity within a department that provides some understanding of how controlled or independent workgroups are. Perhaps key influencers, those with wide and disparate connection profiles, could help drive cultural change or the adoption of new concepts and values.
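On made-up data, and assuming the networkx library, such a graph and a couple of its readouts might be sketched as follows; the names and message counts are hypothetical, and the centrality and community outputs stand in for the "key influencer" and community insights described above.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical communication metadata: (colleague A, colleague B, message count).
comms = [
    ("ana", "ben", 42), ("ana", "cai", 7), ("ben", "cai", 31),
    ("cai", "dee", 55), ("dee", "eli", 12), ("eli", "ana", 3),
]

G = nx.Graph()
G.add_weighted_edges_from(comms)

# Weighted degree as a rough connectivity score per person.
print("connectivity:", dict(G.degree(weight="weight")))

# Betweenness centrality flags potential brokers between otherwise separate groups.
print("influencers:", nx.betweenness_centrality(G))

# Informal communities, which may or may not match the official org chart.
print("communities:", [sorted(c) for c in greedy_modularity_communities(G, weight="weight")])
```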

You could even match the profiles of the people in a team against Belbin's team roles to see how balanced it is! But maybe that's getting a little carried away.

And I think that’s the risk. What is being collected and what are you using it for?

Data is collected on communications within a corporation. Many companies have Computer User Guidelines that are reviewed and agreed to upon joining the company. The use of computers and networks owned by an employer comes with responsibilities and an agreement about these corporate assets' intended use – and about how the company will use the metadata generated. Clearly it matters, for the protection of company technology and IP, to know when large files are transferred to third parties, or when illicit websites are accessed from within the company network. But how far can employers go when using this data for other insights into their employees' behaviour?


Dilbert privacy

Before embarking on this data processing activity, a company should develop an understanding of its employees' expectations of privacy. There is a clear contrast here between an evil, all-seeing, big-brother style corporation spying on staff, and employees who are paid to fulfill specific job functions for that company and who must accept a level of oversight. An upfront and transparent engagement will reduce employee concern and distrust.

Guidelines for the collection, processing and dissemination of this data can be developed in conjunction with employee representatives using the multi-dimensional analytic developed by Deirdre Mulligan, Colin Koopman and Nick Doty, which systematically considers privacy along five dimensions.

  • The theory dimension helps in identifying the object of the data collection, specifying the justification for collecting it, and the archetypal threats involved.
  • The protection dimension helps to develop the boundaries of data collection that the employee population is comfortable with and that the company desires. It will also help the company to understand the value of the data it is generating, and the risks to its operations should information about critical personnel be made available outside, for example to competitors or head hunters.
  • The dimension of harm can help in understanding the concerns employees have about the use of the collected data; a company should be open about what the expected uses are and what they are not – results would not be used as input for any employee assessments, for example.
  • An agreement on which 'independent' entities will oversee how the guidelines are carried out can make up the provision dimension.
  • And, finally, the dimension of scope documents concerns about limits to the data managed and access to it, including storage duration. An employee engaging in business on behalf of a company has limited grounds to object to the company monitoring email, but allowances for "reasonable use" of company email for personal business – corresponding with your bank, for example – muddy the waters. It will be key, for example, to agree that the content of communications will not be monitored.

In order to maintain motivation, trust and empowerment, it is key to be open about such things. There is an argument that providing this openness may impact the way the organization communicates: Perhaps people become more thoughtful or strategic in communications; perhaps verbal communications start to occur electronically or vice versa as people become conscious of the intrusion. Much like arguments on informed consent, however, I believe the suspicion and demotivation generated if employees’ privacy is dismissed will far outweigh the benefits gained from organizational graphing.

Works Cited

Mulligan, D., Koopman, C., & Doty, N. (2016). "Privacy is an essentially contested concept: a multi-dimensional analytic for mapping privacy".

Rothstein, M.A., & Shoben, A.B. (2013). "Does consent bias research?".

Ducruet, C., & Rodrigue, J. "Graph Theory: Measures and Indices". Retrieved September 29, 2018 from https://transportgeography.org/?page_id=5981

Nodus Labs, "Learning to Read and Interpret Network Graph Data Visualizations". Retrieved September 29, 2018 from https://noduslabs.com/cases/learn-read-interpret-network-graphs-data-visualization/

Content Integrity and the Dubious Ethics of Censorship
By Aidan Feay | September 30th, 2018

In the wake of Alex Jones' exile from every social media platform, questions about censorship and content integrity are swirling around the web. Platforms like Jones' Infowars propagate misinformation under the guise of genuine journalism while serving a distinctly more sinister agenda. With the rise of native advertising blurring the lines between sponsored or otherwise nefariously motivated content and traditional editorial media, the general populace finds it increasingly difficult to distinguish between the two. Consumers are left in a space where truth is nebulous and the ethics of content production are constantly questioned. Platforms are forced to evaluate the ethics of censorship and balance profitability with public service.

At the heart of this crisis is the concept of fake news, which we can define as misinformation that imitates the form but not the content of editorial media. Whether it’s used to generate ad revenue or sway entire populations of consumers, fake news has found marked success on social media. The former is arguably less toxic but no less deceitful. As John Oliver so aptly put it in his 2014 piece on native advertising, publications are “like a camouflage manufacturer saying ‘only an idiot could not tell the difference between that man [gesturing to a camouflage advertisement] and foliage’.” There’s a generally accepted suggestion of integrity in all content that is allowed publication or propagation on a platform.


Image via Last Week Tonight with John Oliver

Despite the assumption that the digital age has advanced our access to information and thereby made us smarter, it has simultaneously accelerated the spread of misinformation. The barriers to entry for mainstream consumption are lower, and the isolation of like-minded communities has created ideological echo chambers that feed confirmation bias, widening the political divide and reinforcing extremist beliefs. According to the Pew Research Center, this gap has more than doubled over the past twenty years, making outlandish claims all the more palatable to general media consumers.


Democrats and Republicans are More Ideologically Divided than in the Past
Image via People Press.org

Platforms are stuck weighing attempts to bridge the gap and open up echo chambers against cries of censorship. On the pro-censorship side, arguments are made in favor of the safety of the general public. Take the Comet Ping Pong scandal, for example, wherein absurd claims based on the John Podesta emails found fuel on 4chan and gained traction on far-right blogs, which propagated the allegations of a pedophilia ring as fact. These articles spread on Twitter and Reddit and ultimately led to an assailant armed with an assault rifle firing shots inside the restaurant in an attempt to rescue the supposed victims. What started as a fringe theory took hold and led to real-world violence.

The increasing pressure on platforms to prevent this sort of escalation has led media actors to partner with platforms to arrive at a solution. One such effort is the Journalism Trust Initiative, a global effort to create accountability standards for media organizations and develop a whitelisted group of outlets that social media platforms could adopt as a low-lift means of censoring harmful content.

On the other hand, strong arguments have been made against censorship. In the Twitter scandal of Beatrix von Storch, evidence can be found of legal pressure from the German government to promote certain behaviors by the platform’s maintainers. Similarly, Courtney Radsch of the Committee to Protect Journalists points out that authoritarian regimes have been the most eager to acknowledge, propagate and validate the concept of fake news within their nations. Egypt, China, and Turkey have jailed more than half of imprisoned journalists worldwide, illustrating the dangers of censorship to a society that otherwise enjoys a free press.


Committee to Protect Journalists map of imprisoned journalists worldwide
Image via the Committee to Protect Journalists

How can social media platforms ethically engage with the concept of censorship? While censorship can prevent violence, it can also reinforce governmental bias and suppress free speech. For a long time, platforms like Facebook tried to hide behind their terms of service to avoid the debate entirely. During the Infowars debacle, the Head of News Feed at Facebook said that they don't "take down false news" and that "being false […] doesn't violate the community standards." Shortly after, under public pressure, they contorted the language of their Community Standards and cited their anti-violence clause in the Infowars ban.

It seems, then, that platforms are only beholden to popular opinion and the actions of their peers (Facebook only banned Infowars after YouTube, Stitcher and Spotify did). Corporate profitability will favor censorship as an extension of consumer sentiment so long as those with purchasing power remain ethically conscious and exert that power by choosing which platforms to use and passively fund via advertising.

Hiding in Plain Sight: A Tutorial on Obfuscation
by Andrew Mamroth | 14 October 2018

In 2010, the Federal Trade Commission pledged to give internet users the power to determine if or when websites were allowed to track their behavior. This was the so-called "Do Not Track" list. Now, in 2018, the project has been sidelined and widely ignored by content providers and data analytics platforms, with users left to the wolves. So if we want to avoid these trackers, or at the very least disincentivize their use by rendering them useless, what options do we have?

Imagine the orb-weaving spider. This animal creates similar-looking copies of itself in its web to prevent wasps from reliably striking it. It introduces noise to hide the signal. This is the idea behind many of the obfuscation tools used to nullify online trackers today. By hiding your actual request or intentions under a pile of noise-making signals, you get the answer you desire while the tracking services are left with an unintelligible mess of information.

Here we'll cover a few of the most common obfuscation tools used today to add to your digital camouflage so you can start hiding in plain sight. All of these are browser plug-ins and work with most modern browsers (Firefox, Chrome, etc.).


camo spider

TrackMeNot

TrackMeNot is a lightweight browser extension that helps protect web searchers from surveillance and data-profiling by search engines. It accomplishes this by running randomized search queries against various online search engines such as Google, Yahoo!, and Bing. By hiding your queries under a deluge of noise, it becomes near impossible, or at the very least impractical, to aggregate your searches into a profile of you.

TrackMeNot was designed in response to the U.S. Department of Justice's request for Google's search logs, and in response to the surprising discovery by a New York Times reporter that some identities and profiles could be inferred even from anonymized search logs published by AOL Inc (Brunton & Nissenbaum, 2016, p. 13).

The hope was to protect those facing criminal prosecution from having their search histories seized by government or state entities and used against them. Under the Patriot Act, the government can demand library records via a secret court order, without probable cause that the information is related to a suspected terrorist plot, and can block the librarian from revealing that request to anyone. Additionally, the term "records" covers not only the books you check out but also search histories and hard drives from library computers. Introducing noise into your search history makes snooping in these logs very unreliable.
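The decoy-query idea is simple enough to sketch in a few lines of Python; this is an illustration of the concept rather than TrackMeNot's actual implementation, and the search URL and term list are placeholders.

```python
import random
import time
import urllib.parse
import urllib.request

# Invented decoy vocabulary; a real tool draws from much larger, evolving lists.
DECOY_TERMS = [
    "weather tomorrow", "banana bread recipe", "used bikes", "tide tables",
    "python tutorial", "cheap flights", "local news", "how to tie a bowline",
]

def send_decoy_query(engine: str = "https://www.bing.com/search?q=") -> str:
    """Fire one plausible-looking search that a tracker cannot tell from ours."""
    query = " ".join(random.sample(DECOY_TERMS, k=random.randint(1, 2)))
    url = engine + urllib.parse.quote_plus(query)
    urllib.request.urlopen(url, timeout=10)  # response body is discarded
    return query

if __name__ == "__main__":
    for _ in range(3):                       # a real extension runs continuously
        print("decoy:", send_decoy_query())
        time.sleep(random.uniform(5, 30))    # irregular timing looks more human
```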

AdNauseam

AdNauseam is a tool that thwarts online advertisers' efforts to collect meaningful information about you. Many content providers make money by the click, so by clicking on everything you generate a significant amount of noise without leaving a meaningful signal.

AdNauseam quietly clicks on every blocked ad, registering a visit on ad networks’ databases. As the collected data gathered shows an omnivorous click-stream, user tracking, targeting and surveillance become futile.

FaceCloak

FaceCloak is a tool that allows users to use social media sites such as Facebook while providing fake information to the platform.

Users of social networking sites, such as Facebook or MySpace, need to trust a site to properly handle their personal information. Unfortunately, this trust is not always justified. FaceCloak is a Firefox extension that replaces your personal information with fake information before sending it to a social networking site. Your actual personal information is encrypted and stored securely somewhere else. Only friends whom you have explicitly authorized have access to this information, and FaceCloak transparently swaps the real information back in when those friends view your profile.
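Conceptually, the substitution works something like the following minimal sketch, which assumes the Python cryptography package and invented profile fields; FaceCloak's real protocol differs in its details.

```python
import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # shared out-of-band with trusted friends
box = Fernet(key)

real_profile = {"birthday": "1990-04-12", "hometown": "Oakland"}
fake_profile = {"birthday": "1985-01-01", "hometown": "Springfield"}

# What the social network sees vs. what is stored encrypted elsewhere.
sent_to_platform = fake_profile
stored_elsewhere = box.encrypt(json.dumps(real_profile).encode())

def view_as_friend(ciphertext: bytes, friend_key: bytes) -> dict:
    """A friend holding the key sees the real values instead of the decoys."""
    return json.loads(Fernet(friend_key).decrypt(ciphertext))

print("platform sees:", sent_to_platform)
print("friend sees:  ", view_as_friend(stored_elsewhere, key))
```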


FaceCloak

Conclusion

Online privacy is becoming ever more important and ever more difficult to achieve. It has become increasingly clear that the government is either unable or unwilling to protect its citizens' privacy in the digital age, and is more often than not itself a major offender in using individuals' personal information for dubious goals. Obfuscation tools are becoming ever more prevalent as companies quietly take our privacy with or without consent. Protecting your privacy online isn't impossible, but it does take work; fortunately, the tools exist to take it back.

Reference

Brunton, Finn, and Helen Nissenbaum. Obfuscation: a User’s Guide for Privacy and Protest. MIT Press, 2016.

Shopping in the Digital Age – a Psychological Battle
by Krissy Gianforte | 14 October 2018

Imagine your last trip to Target. You entered the store with your shopping list in-hand, intending to buy “just a few things”. As you walked down the aisles, though, extra items caught your eye – a cute coffee mug, or a soft throw blanket that would definitely match your living room. And so began the battle of wits – you versus the store. Hopefully you stayed strong and focused, and ended the day with your budget at least partially intact.

Now instead of that coffee mug and blanket, imagine that Target had an aisle designed specifically for you, containing all of the items you’ve *almost* purchased but decided against. As you walk through the store, you are bombarded by that pair of shoes you always wanted, sitting on a shelf right next to a box set of your favorite movies. How can you resist?

As stores collect more and more personal data on their customers, they are able to create exactly that shopping experience. But is that really a fair strategy?


Just a few things…

Classic economics

Stores have always used psychological tricks to get you to spend more money. Shopping malls don't have clocks, so you aren't reminded that you need to leave; fast food restaurants offer "Extra Value Meals" that actually aren't any less expensive than buying the items individually; and movie theatres use price framing to entice you into purchasing the medium popcorn even though it is more than you want to eat.


Really a value?

All of these tactics are fairly well known – and shoppers often consciously recognize that they are being manipulated. There is still a sense that the shopper can ‘win’, though, and overcome the marketing ploys. After all, these offers are generic, designed to trick the “average person”. More careful, astute shoppers can surely avoid the traps. But that is changing in the age of big data…

An unfair advantage

In today’s digital world, stores collect an incredible amount of personal information about each and every customer: age, gender, purchase history, even how many pets you have at home. That deep knowledge allows stores to completely personalize their offerings for the individual – what an ex-CEO of Toys-R-Us called “marketing to the segment of one…the holy grail of consumer marketing”(Reuters, 2014). Suddenly, the usual psychological tricks seem a lot more sinister.

For example, consider the Amazon shopping site. As you check out, Amazon offers you a quick look at a list of suggested items, 100% personalized based on your purchase history and demographics. This is similar to the “impulse buy” racks of gum and sweets by the grocery store register, but much more powerful because it contains *exactly the items most likely to tempt you*.

Even familiar chains like Domino's Pizza have begun using personal data to increase sales: the restaurant now offers a rewards program where customers can earn free pizza by logging in with each purchase. Each time the customer visits the Domino's site, he is shown a progress bar toward his next free reward. This type of goal-setting is a well-recognized gamification technique designed to increase purchase frequency. Even further, the Domino's site uses the customer's purchase history to create an "Easy Meal", which can be ordered with a single button click. Ordering pizza is already tempting – even more so when it is made so effortless!


Personal Pizza

But has it crossed a line?

Retailers have enough personal information to tempt you with *exactly* the product you find irresistible. The catch is, they also know how hard you have tried to resist it. As clearly as Amazon can see the products you've repeatedly viewed, it can see that you have *never actually purchased them*. Domino's knows that you typically purchase pizza only on Fridays, with your weekend budget.

And yet, the personalized messages continue to come, pushing you to act on that extra indulgent urge and make that extra purchase. The offers are no longer as easy to resist as a generic popcorn deal or value meal. They pull at exactly the thing you’ve been resisting…and eventually overrule the conscious decision you’d previously made against the purchase.

Shoppers may begin to question whether personal data-based tactics are actually fair, with the balance of power shifting so heavily in favor of the seller. Can these psychological manipulations really be acceptable when they are *so effective*? To help determine whether some ethical line has truly been crossed, we can apply personal privacy analysis frameworks to this use of customers’ personal data.

Daniel Solove's Taxonomy of Privacy provides a perfect starting point for identifying what harms may be occurring. In particular, these data-based marketing strategies may represent invasions – Intrusion on your personal, quiet shopping time, or Decisional Interference as you are coerced into buying products you don't want or need. However, this case is not as clear as typical examples of decisional interference, where *clearly uninvolved* parties interfere (such as the government stepping into personal reproductive decisions). Here, the seller is already an integral part of the transaction environment – so perhaps they have a right to be involved in customers' decisions.

Deirdre Mulligan’s analytic for mapping privacy helps define seller and customers’ involvement and rights even more concretely. Using the analytic’s terminology, the issue can be understood along multiple dimensions:

  • Dimensions of Protection: privacy protects the subject’s (customers’) target (decision-making space and peaceful shopping time)
  • Dimensions of Harm: the action (using personal data for manipulating purchases) is conducted by the offender (merchants)

Those definitions are straightforward; however, defining the Dimension of Theory is more difficult. Where we would hope to assign a clear and noble object that privacy provides – something as universal as dignity or personal freedom – in this case we find simply a desire not to be watched or prodded by 'big brother'. Though an understandable sentiment, it does not provide sufficient justification – a motivation and basis for providing privacy. Any attempt to assign a from-whom – an actor against whom privacy is a protection – comes up somewhat empty: privacy would protect consumers from the very merchants who collected the data, which they rightfully own and may use for their own purposes, as agreed when a customer acknowledges a privacy policy.

Business as usual

As we are unable to pinpoint any actually unethical behavior from sellers, perhaps we must recognize that personalized marketing is simply the way of the digital age. Ads and offers will become more tempting, and spending will be made excessively easy. It is a dangerous environment, and as consumers we must be aware of these tactics and train ourselves to combat them. But in the end, shopping remains a contentious psychological battle between seller and shopper. May the strongest side win.

Hate Speech – How far should social media censorship go, and can there be unintended consequences?

By Anonymous, 10/14/2018

Today, a great deal of harmful or false information is spread around the internet, and among the most egregious of it is hate speech. It may seem a difficult task to stop hate speech from spreading across the vast expanse of the internet; however, some countries are taking matters into their own hands. In 2017, Germany banned hate speech, as well as defamatory "fake news", on social media sites. Some social media sites have already begun censoring hate speech; for example, both Twitter and Facebook disallow it.

This raises the question: should there be social media censorship of hate speech, and could it have a disparate impact on certain groups, particularly protected groups? Some may argue that there shouldn't be censorship at all; in the US, there is very little regulation of hate speech, since courts have repeatedly ruled that hate speech regulation violates the First Amendment. However, others may take the view that we should do all we can to stop language that could incite violence against others, particularly minorities. Furthermore, since social media companies are private platforms, they ultimately control the content allowed on their sites, and it makes sense for them to enhance their reputations by removing bad actors. Thus some social media companies have decided that, while it may not appease everyone, censoring hate speech is positive for their platform and will be enforced.

Could social media censoring of hate speech lead to unintended consequences that harm some groups or individuals more than others? Firstly, since hate speech is not always well defined, could the list of phrases treated as hate speech disproportionately affect certain groups of people? Certain phrases may not be offensive in some cultures or countries but may be in others. How should social media sites determine what constitutes hate speech, and is there a risk that certain groups have their speech censored more than others? In addition, could the way hate speech is monitored be subject to reviewer bias or algorithmic bias?

Some social media sites do seem to recognize the complexity of censoring hate speech. Facebook has a detailed blog post discussing its approach to hate speech; it explains that determining whether a comment is hate speech involves considering both context and intent. The post even provides nuanced examples of words that may appear offensive but are not, because the phrase was made in sarcasm or the word was being reclaimed in a non-offensive way. This is an important recognition in the censorship process; as with many issues in ethics, there is often no absolute right or wrong answer, with context being the major determinant, and this is no exception.

Facebook also notes that they "are a long way from being able to rely on machine learning and AI to handle the complexity involved in assessing hate speech." Algorithmic bias is thus not an issue as of yet, but more importantly, it is good that there is no rush to use algorithms in this situation; something as context-based as hate speech would be extremely difficult to flag correctly, and doing so is certain to lead to many false positives. It does, however, mean that the identification of hate speech comes mainly from user reports and the employees who review the content, which could introduce two additional forms of bias.

Firstly, there could be bias in the types of posts that are reported or in whose posts get reported. There is a particular risk that certain posts are not reported because the people they target are too afraid to report them, or never see them. This is a difficult problem to address, although anonymous reporting should somewhat resolve the fear of reporting. The second form of bias is that the reviewers themselves may be biased in a certain way. While it is difficult to remove all types of bias, it is important to increase understanding of potential sources of bias and then address them. Facebook has pledged that its teams will continue learning about local context and changing language in an effort to combat this. It is a difficult battle, and we must hope that the social media companies are able to get it right; in the meantime, we should continue to monitor that hate speech censorship censors just hate speech, and no more.
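As a toy illustration of why context-free flagging over-triggers (this is not any platform's actual system, and the placeholder term list is invented), consider a naive keyword filter:

```python
# Toy keyword filter: flags any post containing a listed term, with no notion
# of context or intent. Real moderation systems are far more sophisticated.
SLUR_LIST = {"<slur>"}  # placeholder token; real lists are long and curated

posts = [
    "all <slur> people should leave this country",          # attack: should be flagged
    "proud to be <slur> and nobody can take that from me",  # reclaimed use: false positive
    "someone called me a <slur> today and it hurt",          # reporting abuse: false positive
]

def keyword_flag(text: str) -> bool:
    """Return True if the post contains any listed term, ignoring context."""
    return any(term in text.lower() for term in SLUR_LIST)

for post in posts:
    print(keyword_flag(post), "|", post)
# All three posts are flagged even though intent and context differ completely,
# which is why human review still carries most of the load today.
```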

References:

BBC News, “Germany starts enforcing hate speech law”, retrieved October 13, 2018 from
https://www.bbc.com/news/technology-42510868

Twitter, “Hateful Conduct Policy”, retrieved October 13, 2018 from https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy

Facebook, "Community Standards – Hate Speech", retrieved October 13, 2018 from https://www.facebook.com/communitystandards/hate_speech

Facebook Newsroom, "Hard Questions: Who Should Decide What Is Hate Speech in an Online Global Community?", retrieved October 13, 2018 from https://newsroom.fb.com/news/2017/06/hard-questions-hate-speech/