Archive for the ‘blogpost’ Category

Workplace monitoring

October 17th, 2018

by Anonymous on September 30, 2018

We never intended to build a pervasive workplace monitoring system; we just wanted to replace a clunky system of shared spreadsheets and email.

Our new system was intended to track customer orders as they moved through the factory. Workers finding a problem in the production process could quickly check order history and take corrective action. As a bonus, the system would also record newly mandatory safety-related records. It also appealed to the notion of “Democratization of Data,” giving workers direct access to customer orders. No more emails or phone-tag with production planners.

It goes live

The system was well received, and we started collecting a lot of data: a log of every action performed on each sales order, with user IDs and timestamps. Workers could see all the log entries for each order they processed. And the log entries became invisible to workers once an order was closed.

Invisible, but not deleted.

Two years later, during a time of cost-cutting, it came to the attention of management that the logs could be consolidated and sorted by *any* field.

A new report was soon generated; logs sorted by worker ID. It didn’t seem like such a major request. After all, the data was already there. No notice was given to workers about the new report, or its potential use as a worker performance evaluation tool.
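In code terms, that pivot is trivial: the same rows, re-keyed. Here is a minimal sketch in Python with an invented log schema (the original system's actual fields are not described here):

```python
from collections import Counter
from operator import itemgetter

# Invented sample of an order-event log (field names are illustrative).
log = [
    {"order": 1001, "worker": "W17", "action": "weld",    "ts": "2018-03-01T08:02"},
    {"order": 1001, "worker": "W23", "action": "inspect", "ts": "2018-03-01T09:40"},
    {"order": 1002, "worker": "W17", "action": "weld",    "ts": "2018-03-01T10:15"},
]

# The original view: entries grouped by order, as workers saw them.
by_order = sorted(log, key=itemgetter("order", "ts"))

# The repurposed view: the same rows re-sorted by worker ID, turning an
# order-tracking log into a per-worker activity report.
by_worker = sorted(log, key=itemgetter("worker", "ts"))
events_per_worker = Counter(row["worker"] for row in log)
```

No new data collection was needed; one sort key turned order history into worker surveillance.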

Re-purposed data

The personally identifiable information was re-purposed without notice or consent. The privacy issue may be intangible now to the workers, but could one day become very tangible as a factor in pay or layoff decisions. There is also potential for misinterpretation of the data. A worker doing many small tasks could be seen as doing far more than a worker doing a few time-consuming tasks.

Protection for workers’ information

California employers may monitor workers’ computer usage. The California Consumer Privacy Act of 2018 covers consumers, not workers.

However, the European Union’s General Data Protection Regulation (GDPR) addresses this directly, and some related portions of the system operate in the EU.

GDPR’s scope is broad, covering personally identifiable information in all walks of life (e.g., as a worker, as a consumer, as a citizen). Recital 26 makes clear: “The principles of data protection should apply to any information concerning an identified or identifiable natural person.” Other recitals cover consent, re-purposing, fairness/transparency, and erasability (Recitals 32, 60, 65, and 66).

Most particularly, Article 88 provides that the collection and use of personally identifiable information in the workplace should be subject to particularly high standards of transparency.


It’s easy in hindsight to find multiple points at which this might have been avoided. Mulligan’s and Solove’s frameworks suggest looking at “actors” and causes.

  • Software developer’s action (harm during data collection): There could have been a login at the level of a processing-station ID, rather than a worker’s personal ID.
  • Software developer’s action (and timescale considerations, yielding harm during data processing): The developer could have completely deleted the worker IDs once the order closed.
  • Software developer’s action (harm during data processing and dissemination: increased Accessibility, Distortion): The developer could have written the report to show that simple “event counting” wasn’t a reliable way of measuring worker contribution.
  • Management’s action (harm during data gathering and processing): The secondary use of the data resulted in intrusive surveillance. Businesses have a responsibility (particularly under GDPR) to be transparent with respect to workplace data. Due concern for control over personal information was not shown.
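The “event counting” distortion is easy to demonstrate. In this sketch (task durations are invented for illustration), two workers contribute equal time, but a raw event count makes one look four times as productive:

```python
# Two workers, equal total effort, very different event counts.
tasks = [
    {"worker": "A", "minutes": 5},   # worker A: many small tasks
    {"worker": "A", "minutes": 5},
    {"worker": "A", "minutes": 5},
    {"worker": "A", "minutes": 5},
    {"worker": "B", "minutes": 20},  # worker B: one long task
]

def summarize(rows):
    """Tally both event counts and total minutes per worker."""
    out = {}
    for r in rows:
        w = out.setdefault(r["worker"], {"events": 0, "minutes": 0})
        w["events"] += 1
        w["minutes"] += r["minutes"]
    return out

summary = summarize(tasks)
# Counting events makes A look 4x as productive; counting minutes shows parity.
```

A report built only on event counts silently encodes this bias.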


One way forward, when working on systems that evolve over time: consider Mulligan’s contested-concept dimensions of privacy analysis, applied with Solove’s meta-harm categories. Over the full evolutionary life of a software system, we can ask: “What is private and what is not?” If the actors – developer, manager, and even workers – had asked, as each new feature was requested, “What is being newly exposed, and what is made private?” they might not have drifted into the current state. It’s a question that could readily be built into formal software development processes.

But in an era of “Data Democratization,” when data mining may be more broadly available within organizations, such checklists might be necessary but not sufficient. Organizations will likely need broad-based internal training on the protection of personal information.

May we recommend…some Ethics?

October 17th, 2018

by Jessica Hays | September 30, 2018

The internet today is awash with recommender systems. Some pop to mind quickly – like Netflix’s suggestions of shows to binge, or Amazon’s nudges towards products you may like. These tools use your personal history on their site, along with troves of data from other users, to predict which items or videos are most likely to tempt you. Other examples of recommender systems include social media (“people you may know!”), online dating apps, news aggregators, search engines, restaurant finders, and music or video streaming services.

Screenshots of recommendations from LinkedIn, Netflix

Recommender systems have proliferated because the benefits are shared by both sides. Data-driven recommendations mean customers spend less time digging for the perfect product themselves. The algorithm does the heavy lifting – once they set off on the trail of whatever they’re seeking, users are guided to things they might otherwise have found only after hours of searching (or not at all). Reaping even more rewards, however, are the companies using the hook of an initial search to draw users further and further into their platform, increasing their revenue potential (the whole point!) with every click.
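The core mechanic – predicting items from your history plus other users’ histories – can be sketched as a toy co-occurrence recommender. This is purely illustrative; real systems like Netflix’s or Amazon’s are vastly more sophisticated:

```python
from collections import Counter

# Invented purchase histories from three other users.
histories = [
    {"mug", "blanket", "candle"},
    {"mug", "blanket"},
    {"mug", "shoes"},
]

def recommend(your_items, all_histories, k=2):
    """Suggest the k items most often bought by users who share items with you."""
    scores = Counter()
    for other in all_histories:
        if your_items & other:                 # any overlap with your history
            scores.update(other - your_items)  # credit their *other* items
    return [item for item, _ in scores.most_common(k)]

suggestions = recommend({"mug"}, histories)
# "blanket" ranks first: it co-occurs with "mug" most often.
```

Even this toy version shows why the data troves matter: the more histories the system can mine, the sharper its guesses become.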

Not that innocent

In the last year, however, some of these recommender giants (YouTube, Facebook, Twitter) have gotten attention for the ways in which their algorithms have been unwitting enablers of political radicalization and the proliferation of conspiratorial content. It’s not surprising, in truth, that machine learning quickly discovered that humans are drawn to drama and tempted by content more extreme than what they originally set out to find. And if drama means clicks and clicks mean revenue, that algorithm has accomplished its task! Fortunately, research and methods are underway to redirect and limit radicalizing behavior.

However, dangers need not be as extreme as ISIS sympathizing to merit notice. Take e-commerce. With over 70% of the US population projected to make an online purchase this year, behind-the-scenes algorithms could be influencing the purchasing habits of a healthy majority of the population. The sheer volume of people impacted by recommender systems, then, is cause for a closer look.

It doesn’t take long to think up recommendation scenarios that could raise an eyebrow. While humans fold ethical considerations into their recommendations, algorithms programmed to drive revenue do not. For example, imagine an online grocery service, like Instacart, that learns a customer’s weakness for junk food and dutifully recommends more of it.

Junk food -> more junk food!

Can systems be held accountable?

While this may be great for retailers’ bottom line, it’s clearly not for our country’s growing waistlines. Some might argue that nothing has fundamentally changed from the advertising and marketing schemes of yore. Aren’t companies merely responding to the same old pushes and pulls of supply and demand – supplying what they know users would ask for if they had the knowledge/chance? Legally, of course, they’re right – nothing new is required of companies to suggest and sell ice cream, just because they now know intimately that a customer has a weakness for it.

Ethics, however, points the other way. Increased access to millions of users’ preferences and habits and opportunity to influence behavior aren’t negligible. Between the power of suggestion, knowledge of users’ tastes, and lack of barriers between hitting “purchase” and having the treat delivered – what role should ethically-responsible retailers play in helping their users avoid decisions that could negatively impact their well-being?

Unfortunately, there’s unlikely to be a one-size-fits-all approach across sectors and systems. However, it would be advantageous to see companies start integrating approaches that mitigate potential harm to users. While following the principles of respect for persons, beneficence, and justice laid out in the Belmont Report is always a good place to start, some specific approaches could include:

  • Providing users more transparency and access to the algorithm (e.g. being able to turn off/on recommendations for certain items)
  • Maintaining manual oversight of sensitive topics where there is potential for harm
  • Allowing users to flag and provide feedback when they encounter a detrimental recommendation

As users become more savvy and aware of the ways in which recommender systems influence their interactions online, it is likely that the demand for ethical platforms will only rise. Companies would be wise to take measures to get out ahead of these ethical concerns – for both their and their users’ sakes.

Mapping the Organic Organization
Thoughts on the development of a Network Graph for your company

Creating the perfect organization for a business is hard. It has to maintain alignment of goals and vision, enable the movement of innovation and ideas, develop the products and services and deliver them, apply the processes and controls needed to remain disciplined, safe and sustainable, and many things besides. But getting it right means a significant competitive advantage.

Solid Organization

There are myriad organization designs, but even with all the effort and thought in the world, the brutal truth is that the perfect organizational structure for your company today will be imperfect tomorrow. But what if we could track how our organization was developing and evolving over time? Would that be useful in identifying the most effective organization structure, i.e. the one that, informally, is in place already? Furthermore, could tracing the evolution of that organization predict the structure of the future?

Monitoring the way information flows in a company can be tricky as it takes many formats through many systems, but communication flows are becoming easier to map. Large social media platforms such as Facebook or LinkedIn have developed functions to visualize your social network or business contact network. Twitter maps of key influencers and their networks also make interesting viewing. And increasingly, corporations are developing such tools to understand their inner workings.

Network graph example

A corporation’s ‘social network’ can be visualised using the metadata captured from communication between colleagues – email, IM, VOIP calls, meetings – mapping them onto a network graph. Each individual engaged in work with the company constitutes a node, and the edges are communications (the weight of an edge relating to the frequency of communication).
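A minimal sketch of that construction, in plain Python with invented names (a production system would draw sender/recipient pairs from mail and chat logs):

```python
from collections import Counter

# Invented communication metadata: (sender, recipient) pairs.
messages = [
    ("alice", "bob"), ("alice", "bob"), ("bob", "carol"),
    ("alice", "dana"), ("carol", "dana"), ("alice", "carol"),
]

# Each person is a node; an edge's weight is how often the pair communicated.
# Sorting each pair makes the edges undirected.
edges = Counter(tuple(sorted(pair)) for pair in messages)

# Weighted degree: a rough per-person 'connectedness' score.
degree = Counter()
for (a, b), weight in edges.items():
    degree[a] += weight
    degree[b] += weight
```

From this weighted edge list, a graph library could then compute communities, centrality, or the alpha index discussed below.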

The insights could be fascinating and important. The communities within the graph can be identified – perhaps these line up with the organization chart, but more likely not. The graph would identify key influencers or knowledge centers that may not easily be recognised outside a specific work group; where many nodes connect to one person, that aids risk management and key succession planning. It could yield the ‘alpha index’ – the degree of connectivity within a department – providing some understanding of how controlled or independent workgroups are. Perhaps key influencers, those with a wide and disparate connection profile, could help in driving cultural changes or adoption of new concepts and values.

You could even match the profiles of the people in a team against Belbin’s team roles to see how balanced it is! But maybe that’s getting a little carried away.

And I think that’s the risk. What is being collected and what are you using it for?

Data is collected on communications within a corporation. Many companies have Computer User Guidelines that are reviewed and agreed to upon joining a company. The use of computers and networks owned by an employer comes with some responsibilities and an agreement about these corporate assets’ intended use – and how the company will use the metadata generated. Clearly it is a matter of importance for the protection of company technology and IP to understand when large files are transferred to third parties, or when illicit websites are accessed from within a company network. But how far can employers go when using this data for other insights into their employees’ behaviour?

Dilbert privacy

Before embarking on this data-processing activity, a company should develop an understanding of its employees’ expectations of privacy. There is a clear contrast between an evil, all-seeing, big-brother-style corporation spying on staff and employees who are compensated to fulfill specific job functions for that company and who must accept a level of oversight. An upfront and transparent engagement will keep the employee population from becoming concerned and will reduce distrust.

Guidelines for the collection, processing and dissemination of this data can be developed in conjunction with employee representatives using the multi-dimensional analytic developed by Deirdre Mulligan, Colin Koopman and Nick Doty which systematically considers privacy in five dimensions.

  • The theory dimension helps in identifying the object of the data, specifying the justification for collecting it, and the archetypal threats involved.
  • The protection dimension helps to develop the boundaries of data collection that the employee population is comfortable with and that the company desires. It will also help the company to understand the value of the data it is generating, and the risks to its function should information about critical personnel be made available outside – to competitors or head-hunters, for example.
  • The dimension of harm can help in understanding employees’ concerns about the use of the collected data; a company should be open about what the expected uses are and are not – results would not be used as input to any employee assessment, for example.
  • An agreement on which ‘independent’ entities will oversee how the guidelines are carried out can make up the provision dimension.
  • And, finally, the scope dimension documents concerns about limits to the data managed and access to it, including storage duration. An employee engaging in business on behalf of a company has limited grounds to object that the company monitors email, but provisions for “reasonable use” of company email for personal business – corresponding with your bank, for example – muddy the waters. It will be key, for example, to agree that the content of communications will not be monitored.

In order to maintain motivation, trust and empowerment, it is key to be open about such things. There is an argument that providing this openness may impact the way the organization communicates: Perhaps people become more thoughtful or strategic in communications; perhaps verbal communications start to occur electronically or vice versa as people become conscious of the intrusion. Much like arguments on informed consent, however, I believe the suspicion and demotivation generated if employees’ privacy is dismissed will far outweigh the benefits gained from organizational graphing.

Works Cited

Mulligan, D., Koopman, C., & Doty, N. “Privacy is an essentially contested concept: a multi-dimensional analytic for mapping privacy,” 2016.

Rothstein, M. A., & Shoben, A. B. “Does consent bias research?” 2013.

Ducruet, C., & Rodrigue, J. “Graph Theory: Measures and Indices.” Retrieved September 29, 2018.

Nodus Labs. “Learning to Read and Interpret Network Graph Data Visualizations.” Retrieved September 29, 2018.

Content Integrity and the Dubious Ethics of Censorship
By Aidan Feay | September 30th, 2018

In the wake of Alex Jones’ exile from every social media platform, questions about censorship and content integrity are swirling around the web. Platforms like Jones’ Infowars propagate misinformation under the guise of genuine journalism while serving a distinctly more sinister agenda. With the rise of native advertising blurring the lines between sponsored or otherwise nefariously motivated content and traditional editorial media, the general populace finds it increasingly difficult to distinguish between the two. Consumers are left in a space where truth is nebulous and the ethics of content production are constantly questioned. Platforms are forced to evaluate the ethics of censorship and balance profitability with public service.

At the heart of this crisis is the concept of fake news, which we can define as misinformation that imitates the form but not the content of editorial media. Whether it’s used to generate ad revenue or sway entire populations of consumers, fake news has found marked success on social media. The former is arguably less toxic but no less deceitful. As John Oliver so aptly put it in his 2014 piece on native advertising, publications are “like a camouflage manufacturer saying ‘only an idiot could not tell the difference between that man [gesturing to a camouflage advertisement] and foliage’.” There’s a generally accepted suggestion of integrity in all content that is allowed publication or propagation on a platform.

Image via Last Week Tonight with John Oliver

Despite the assumption that the digital age has advanced our access to information and thereby made us smarter, it has simultaneously accelerated the spread of misinformation. The barriers to entry for mainstream consumption are lower, and the isolation of like-minded communities has created ideological echo chambers that feed confirmation bias, widening the political divide and reinforcing extremist beliefs. According to the Pew Research Center, this gap has more than doubled over the past twenty years, making outlandish claims all the more palatable to general media consumers.

Democrats and Republicans are More Ideologically Divided than in the Past
Image via People

Platforms are stuck weighing attempts to bridge the gap and open up echo chambers against cries of censorship. On the pro-censorship side, arguments are made in favor of safety for the general public. Take the Comet Ping Pong scandal, for example, wherein absurd claims based on the John Podesta emails found fuel on 4chan and gained traction within far-right blogs, which propagated the allegations of a pedophilia ring as fact. These articles found purchase on Twitter and Reddit and ultimately led to an assailant armed with an assault rifle firing shots inside the restaurant in an attempt to rescue the supposed victims. What started as a fringe theory gained mainstream traction and led to real-world violence.

The increasing pressure on platforms to prevent this sort of exacerbation has led media actors to partner with platforms in order to arrive at a solution. One such effort is the Journalism Trust Initiative, a global effort to create accountability standards for media organizations and develop a whitelist of trusted outlets that social media platforms could adopt as a low-lift means of filtering harmful content.

On the other hand, strong arguments have been made against censorship. In the Twitter scandal of Beatrix von Storch, evidence can be found of legal pressure from the German government to promote certain behaviors by the platform’s maintainers. Similarly, Courtney Radsch of the Committee to Protect Journalists points out that authoritarian regimes have been the most eager to acknowledge, propagate and validate the concept of fake news within their nations. Egypt, China, and Turkey have jailed more than half of imprisoned journalists worldwide, illustrating the dangers of censorship to a society that otherwise enjoys a free press.

Committee to Protect Journalists map of imprisoned journalists worldwide
Image via the Committee to Protect Journalists

How can social media platforms ethically engage with the concept of censorship? While censorship can prevent violence amongst the population, it can also reinforce governmental bias and suppress free speech. For a long time, platforms like Facebook tried to hide behind their terms of service in order to avoid the debate entirely. During the Infowars debacle, the Head of News Feed at Facebook said that they don’t “take down false news” and that “being false […] doesn’t violate the community standards.” Shortly after, they contorted the language of their Community Standards due to public pressure and cited their anti-violence clause in the Infowars ban.

It seems, then, that platforms are beholden only to popular opinion and the actions of their peers (Facebook banned InfoWars only after YouTube, Stitcher, and Spotify did). Corporate profitability will favor censorship as an extension of consumer preference only so long as those with purchasing power remain ethically conscious and exert that power by choosing which platforms to use and passively fund via advertising.

Hiding in Plain Sight: A Tutorial on Obfuscation
by Andrew Mamroth | 14 October 2018

In 2010, the Federal Trade Commission pledged to give internet users the power to determine if or when websites were allowed to track their behavior. This was the so-called “Do Not Track” initiative. Now, in 2018, the project has been sidelined and widely ignored by content providers and data-analytics platforms, leaving users to the wolves. So if you want to avoid these trackers, or at the very least disincentivize their use by rendering them useless, what options do you have?

Imagine the orb-weaving spider. This animal creates similar-looking copies of itself in its web so as to prevent wasps from reliably striking it. It introduces noise to hide the signal. This is the idea behind many of the obfuscation tools used to nullify online trackers today. By hiding your actual request or intentions under a pile of noise-making signals, you get the answer you desire while the tracking services are left with an unintelligible mess of information.

Here we’ll cover a few of the most common obfuscation tools used today to add to your digital camouflage so you can start hiding in plain sight. All of these are browser plug-ins and work with most modern browsers (Firefox, Chrome, etc.).

camo spider


TrackMeNot is a lightweight browser extension that helps protect web searchers from surveillance and data-profiling by search engines. It accomplishes this by running randomized search queries against various online search engines such as Google, Yahoo!, and Bing. By hiding your queries under a deluge of noise, it becomes nearly impossible, or at the very least impractical, to aggregate your search results into a profile of you.

TrackMeNot was designed in response to the U.S. Department of Justice’s request for Google’s search logs, and in response to the surprising discovery by a New York Times reporter that some identities and profiles could be inferred even from anonymized search logs published by AOL Inc. (Nissenbaum, 2016, p. 13).

The hope was to protect those under criminal prosecution from having government or state entities seize their search histories for use against them. Under the Patriot Act, the government can demand library records via a secret court order, without showing probable cause that the information is related to a suspected terrorist plot, and it can block the librarian from revealing that request to anyone. Additionally, “records” covers not only the books you check out but also search histories and hard drives from library computers. Introducing noise into your search history makes snooping in these logs very unreliable.
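The decoy-query idea can be sketched in a few lines. This is a toy illustration, not TrackMeNot's actual implementation; the decoy list and function names are invented:

```python
import random

# Invented pool of innocuous decoy queries.
DECOYS = ["weather tomorrow", "pasta recipe", "used cars", "tax forms",
          "hiking trails", "movie times", "diy shelf", "exchange rate"]

def query_stream(real_query, n_decoys=5, seed=None):
    """Hide one real query among random decoys, in a random position."""
    rng = random.Random(seed)
    stream = [real_query] + rng.sample(DECOYS, n_decoys)
    rng.shuffle(stream)  # the real query's position is also randomized
    return stream

stream = query_stream("symptoms of flu", seed=42)
# An observer of the stream cannot tell which of the six queries was real.
```

The real extension issues such decoys continuously in the background, so a search log contains mostly noise.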


AdNauseam is a tool to thwart online advertisers from collecting meaningful information about you. Many content providers make money by the click, so clicking on everything provides a significant amount of noise without leaving a meaningful footprint.

AdNauseam quietly clicks on every blocked ad, registering a visit on ad networks’ databases. As the collected data gathered shows an omnivorous click-stream, user tracking, targeting and surveillance become futile.
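The same idea in miniature (ad categories are invented; this is not AdNauseam's code): selective clicks reveal a preference, while clicking everything merely mirrors whatever was served:

```python
# Invented stream of ad categories served to a user.
ads_served = ["sports", "travel", "sports", "finance", "sports", "travel"]

def click_profile(clicked):
    """Tally clicks per ad category, as an ad network might."""
    profile = {}
    for category in clicked:
        profile[category] = profile.get(category, 0) + 1
    return profile

# Clicking only what interests you reveals a clear preference...
selective = click_profile([c for c in ads_served if c == "sports"])

# ...while clicking every ad produces a profile that just mirrors what
# was served, saying nothing about the user's actual tastes.
omnivorous = click_profile(ads_served)
```

The omnivorous click-stream is why targeting becomes futile: the "preferences" it records are indistinguishable from the ad inventory itself.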


FaceCloak is a tool that allows users of social media sites such as Facebook to provide fake information to the platform while reserving their real information for authorized friends.

Users of social networking sites, such as Facebook or MySpace, need to trust a site to properly deal with their personal information. Unfortunately, this trust is not always justified. FaceCloak is a Firefox extension that replaces your personal information with fake information before sending it to a social networking site. Your actual personal information is encrypted and stored securely somewhere else. Only friends who were explicitly authorized by you have access to this information, and FaceCloak transparently replaces the fake information while your friends are viewing your profile.



Online privacy is becoming ever more important and ever more difficult to achieve. It has become increasingly clear that the government is either unable or unwilling to protect its citizens’ privacy in the digital age, and more often than not is itself a major offender in using individuals’ personal information for dubious goals. Obfuscation tools are becoming ever more prevalent as companies quietly take our privacy with or without consent. Protecting your privacy online isn’t impossible, but it does take work; fortunately, the tools exist to take it back.


Brunton, Finn, and Helen Nissenbaum. Obfuscation: a User’s Guide for Privacy and Protest. MIT Press, 2016.

Shopping in the Digital Age – a Psychological Battle
by Krissy Gianforte | 14 October 2018

Imagine your last trip to Target. You entered the store with your shopping list in-hand, intending to buy “just a few things”. As you walked down the aisles, though, extra items caught your eye – a cute coffee mug, or a soft throw blanket that would definitely match your living room. And so began the battle of wits – you versus the store. Hopefully you stayed strong and focused, and ended the day with your budget at least partially intact.

Now instead of that coffee mug and blanket, imagine that Target had an aisle designed specifically for you, containing all of the items you’ve *almost* purchased but decided against. As you walk through the store, you are bombarded by that pair of shoes you always wanted, sitting on a shelf right next to a box set of your favorite movies. How can you resist?

As stores collect more and more personal data on their customers, they are able to create exactly that shopping experience. But is that really a fair strategy?

Just a few things…

Classic economics

Stores have always used psychological tricks to get you to spend more money. Shopping malls omit clocks so you are never reminded that you need to leave; fast-food restaurants offer “Extra Value Meals” that actually aren’t any less expensive than buying the items individually; and movie theatres use price framing to entice you into purchasing the medium popcorn even though it is more than you want to eat.

Really a value?

All of these tactics are fairly well known – and shoppers often consciously recognize that they are being manipulated. There is still a sense that the shopper can ‘win’, though, and overcome the marketing ploys. After all, these offers are generic, designed to trick the “average person”. More careful, astute shoppers can surely avoid the traps. But that is changing in the age of big data…

An unfair advantage

In today’s digital world, stores collect an incredible amount of personal information about each and every customer: age, gender, purchase history, even how many pets you have at home. That deep knowledge allows stores to completely personalize their offerings for the individual – what an ex-CEO of Toys-R-Us called “marketing to the segment of one…the holy grail of consumer marketing”(Reuters, 2014). Suddenly, the usual psychological tricks seem a lot more sinister.

For example, consider the Amazon shopping site. As you check out, Amazon offers you a quick look at a list of suggested items, 100% personalized based on your purchase history and demographics. This is similar to the “impulse buy” racks of gum and sweets by the grocery store register, but much more powerful because it contains *exactly the items most likely to tempt you*.

Even familiar chains like Domino’s Pizza have begun using personal data to increase sales: the restaurant now offers a rewards program where customers can earn free pizza by logging in with each purchase. Each time the customer visits the Domino’s site, he is shown a progress bar toward his next free reward. This type of goal-setting is a well-recognized gamification technique designed to increase purchase frequency. Even further, the Domino’s site uses the customer’s purchase history to create an “Easy Meal,” which can be ordered with a single button click. Ordering pizza is already tempting – even more so when it is made so effortless!

Personal Pizza

But has it crossed a line?

Retailers have enough personal information to tempt you with *exactly* the product you find irresistible. The catch is, they also know how hard you have tried to resist it. As clearly as Amazon can see the products you’ve repeatedly viewed, it can see that you have *never actually purchased them*. Domino’s knows that you typically purchase pizza only on Fridays, with your weekend budget.

And yet, the personalized messages continue to come, pushing you to act on that extra indulgent urge and make that extra purchase. The offers are no longer as easy to resist as a generic popcorn deal or value meal. They pull at exactly the thing you’ve been resisting…and eventually overrule the conscious decision you’d previously made against the purchase.

Shoppers may begin to question whether personal data-based tactics are actually fair, with the balance of power shifting so heavily in favor of the seller. Can these psychological manipulations really be acceptable when they are *so effective*? To help determine whether some ethical line has truly been crossed, we can apply personal privacy analysis frameworks to this use of customers’ personal data.

Daniel Solove’s Taxonomy of Privacy provides a perfect starting point to help identify what harms may be occurring. In particular, these data-based marketing strategies may represent invasions – Intrusion on your personal, quiet shopping time, or Decisional Interference as you are coerced into buying products you don’t want or need. However, this case is not as clear as typical examples of decisional interference, where *clearly uninvolved* parties interfere (such as the government stepping into personal reproductive decisions). Here, the seller is already an integral part of the transaction environment – so perhaps they have a right to be involved in customers’ decisions.

Deirdre Mulligan’s analytic for mapping privacy helps define seller and customers’ involvement and rights even more concretely. Using the analytic’s terminology, the issue can be understood along multiple dimensions:

  • Dimensions of Protection: privacy protects the subject’s (customers’) target (decision-making space and peaceful shopping time)
  • Dimensions of Harm: the action (using personal data for manipulating purchases) is conducted by the offender (merchants)

Those definitions are straightforward; however, defining the Dimension of Theory becomes more difficult. Where we would hope to assign a clear and noble object that privacy provides – something as universal as dignity or personal freedom – in this case we find simply a desire not to be watched or prodded by ‘big brother’. Though an understandable sentiment, it does not provide sufficient justification – a motivation and basis for providing privacy. Any attempt to assign a from-whom – an actor against whom privacy is a protection – comes up somewhat empty. Privacy would protect consumers from the very merchants who collected the data (which they rightfully own and may use for their own purposes, as agreed when a customer acknowledges a privacy policy).

Business as usual

As we are unable to pinpoint any actually unethical behavior from sellers, perhaps we must recognize that personalized marketing is simply the way of the digital age. Ads and offers will become more tempting, and spending will be made excessively easy. It is a dangerous environment, and as consumers we must be aware of these tactics and train ourselves to combat them. But in the end, shopping remains a contentious psychological battle between seller and shopper. May the strongest side win.

Hate Speech – How far should social media censorship go, and can there be unintended consequences?

By Anonymous, 10/14/2018

Today, a great deal of harmful or false information is spread around the internet, among the most egregious of which is hate speech. Stopping hate speech from spreading across the vast expanse of the internet may seem a difficult task; however, some countries are taking matters into their own hands. In 2017, Germany passed a law requiring social media sites to remove hate speech, as well as defamatory “fake news.” Some social media sites have already begun censoring hate speech; for example, both Twitter and Facebook disallow it.

This raises the question: should there be social media censorship of hate speech, and could there be disparate impact on certain groups, particularly on protected groups? Some may argue that there shouldn’t be censorship at all; in the US, there is very little regulation of hate speech, since courts have repeatedly ruled that hate speech regulation violates the First Amendment. However, others may take the view that we should do all we can to stop language which could potentially incite violence against others, particularly minorities. Furthermore, since social media companies are private platforms, they ultimately have control over the content allowed on their sites, and it makes sense for them to enhance their reputations by removing bad actors. Thus some social media companies have decided that, while it may not appease everyone, censoring hate speech is positive for their platforms and will be enforced.

Could social media censoring of hate speech lead to unintended consequences that unintentionally harm some groups or individuals over others? Firstly, since hate speech is not always well defined, could the list of phrases classified as hate speech disproportionately affect certain groups of people? Certain phrases may not be offensive in some cultures or countries, but may be in others. How should social media sites determine what constitutes hate speech, and is there a risk that certain groups have their speech censored more than other groups? In addition, could the way hate speech is monitored be subject to bias from the reviewer or to algorithmic bias?

Some social media sites do seem to recognize the complexity of censorship of hate speech. Facebook has a detailed blog which discusses its approach to hate speech; it details that the decision to determine whether a comment is hate speech or not includes a consideration of both context and intent. The blog even provides complex examples of words that may appear offensive but are not because the phrase was made in sarcasm, or the use of the word was to reclaim the word in a non-offensive manner. This is an important recognition in the censorship process; among many issues in ethics, there is often no absolute right or wrong answer, with context being the major determinant, and this is no exception.

Facebook also notes that they “are a long way from being able to rely on machine learning and AI to handle the complexity involved in assessing hate speech.” Algorithmic bias is thus not an issue as of yet; more importantly, it is good that there is no rush to use algorithms in this situation. Something as context-based as hate speech would be extremely difficult to flag correctly, and doing so would certainly lead to many false positives. It does, however, mean that the identification of hate speech comes mainly from user reports and the employees who review the content. This could lead to two additional forms of bias. Firstly, there could be bias in the types of posts that are reported or the people whose posts get reported. There could be a particular issue if certain posts are not being reported because the people they target are too afraid to report them, or simply never see them. This is a difficult problem to address, although anonymous reporting should somewhat resolve the fear of reporting. The second type of bias is that the reviewers themselves may be biased in a certain way. While it is difficult to remove all types of bias, it is important to understand its potential sources and then address them. Facebook has pledged that its teams will continue learning about local context and changing language in an effort to combat this. It is a difficult battle, and we must hope that the social media companies are able to get it right; in the meantime, we should continue to monitor that hate speech censorship censors just hate speech, and no more.


BBC News, “Germany starts enforcing hate speech law”, retrieved October 13, 2018 from

Twitter, “Hateful Conduct Policy”, retrieved October 13, 2018 from

Facebook, Community Standards – Hate Speech, retrieved October 13, 2018 from

Facebook Newsroom, “Hard Questions: Who Should Decide What Is Hate Speech in an Online Global Community?”, retrieved October 13, 2018 from

The Doctor Will Model You Now:
The Rise and Risks of Artificial Intelligence in Healthcare
by Dani Salah | September 23, 2018

As artificial intelligence promises to change many facets of our everyday lives, perhaps nowhere will that change be greater than in the healthcare industry. The increase in data collected on individuals has already proven paramount in improving health outcomes, and AI is a natural fit in this infrastructure. Computing power orders of magnitude greater than could have been imagined just decades ago, along with increasingly complex models that can learn as well as, if not better than, humans, has already changed healthcare capabilities worldwide.

While the applications for AI today and in the near future hold tremendous potential, there are ethical concerns that must be considered each step of the way. Doctors, whose oaths swear them to uphold the ethical standards of medicine, may soon be pledging to abide by ethical standards of data as well.

Promising Applications

AI Along Patient Experience

AI has begun demonstrating its potential to touch individuals at every point of their patient experiences. From diagnosis to surgical visit to recovery, AI and robotics can improve the efficiency and accuracy of a patient’s entire health experience. The British healthcare startup Babylon Health demonstrated this year that its product could assign the correct diagnosis 75% of the time, 5% more accurate than the average rate for human physicians. They have since raised $60 million in funding and plan to expand their algorithmic service to chatbots next (Locker, M.).

Radiology and pathology are in line for major operational shifts as AI brings new capabilities to medical imaging (G. Zaharchuk). Although testing is nascent, results have been promising. Prediction accuracy rates often surpass human abilities here as well, and one deep learning network accurately identified breast cancer cells in 100% of the images it scanned (Bresnick, 2018).

AI in medical imaging

But some areas of AI’s potential have less to do with replacing doctors and more to do with making their jobs easier. Medical centers today are concerned with burnout, which is driving professionals across the patient experience to reduce hours, retire early, or leave medicine altogether. A leading cause of this burnout is administrative requirements that don’t provide the job satisfaction many get from working in medicine. By some accounts, physicians today spend more than two thirds of their time at work handling paperwork. Increasingly sophisticated data infrastructure in the healthcare industry has done little to change the amount of time required to document a patient’s health records; it has only changed the form in which that data is documented. But automating more of these required tasks could give physicians more time to spend with their patients, increase their satisfaction in their work, and even boost the accuracy of data collection and documentation (Bresnick, 2018).

Significant Risks

The many improvements that AI experts promise for healthcare operations certainly do not come without costs. Taking advantage of these advanced analytics possibilities requires feeding the models data – lots and lots of data. With the collection and storage of vast amounts of patient data comes the potential for data breaches, misuse of data, incorrect interpretations, and automated biases. Healthcare perhaps more than other industries necessitates that patient data is held private and secure. In fact, the emerging frameworks with which to evaluate and protect population data must be made even more stringent when the data represents extremely telling details of the health and wellness of individual people.

Works Cited

Bresnick, J. (2018). Arguing the Pros and Cons of Artificial Intelligence in Healthcare. [online] Health IT Analytics. Available at: [Accessed 24 Sep. 2018].

Bresnick, J. (2018). Deep Learning Network 100% Accurate at Identifying Breast Cancer. [online] Health IT Analytics. Available at: [Accessed 24 Sep. 2018].

G. Zaharchuk, E. Gong, M. Wintermark, D. Rubin, C.P. Langlotz (2018). Deep Learning in Neuroradiology. American Journal of Neuroradiology. [online] Available at: [Accessed 24 Sep. 2018].

Locker, M. (2018). Fast Company. [online] Available at: [Accessed 24 Sep. 2018].

Healthcare’s Painful: Is HIPAA to blame?
By Keith LoMurray on 9/23/2018

Working in healthcare technology, I regularly hear painful stories about people’s experiences with healthcare. People will often talk about the great care they received from an individual doctor or nurse, but then mention the challenges of navigating the impersonal healthcare bureaucracy. Coordinating care between clinics, waiting for records, signing the same forms multiple times, scheduling appointments, and other tasks that healthcare organizations perform as a normal course of business often seem unnecessarily burdensome to a patient already dealing with an illness.

Over time you pick up nuggets of information about health care, such as the fact that most medical records are still transferred via fax machine, or that only one in three hospitals can send and receive medical records for care that happened outside their organization. HIPAA, the Health Insurance Portability and Accountability Act, has a central role in healthcare, dutifully guarding the patient’s right to privacy. HIPAA sets standards for how and when healthcare information can be shared and sets severe penalties for violations. Transferring medical records by individual faxes seems antiquated, which raises the question: how much is HIPAA responsible for this painful process?

Image credit: Byrd Pinkerton/Vox

Healthcare technology requires additional effort compared to other industries, which results in slower and more expensive processes. A common challenge arises around using a web analytics tool to track the usage of a website. The two largest web analytics vendors, Google Analytics and Adobe Analytics, both advertise the simplicity of adding analytics tracking to a website. Both companies leverage the IP address of each website visit to compute metrics, a method that is not compliant with HIPAA, and neither vendor supports a configuration that doesn’t read the IP address. As a result, healthcare companies using an analytics solution must either find a compliant tool, which presents its own challenges, or implement workarounds to Google or Adobe Analytics for HIPAA compliance. Performing this work may be in the best interest of patient privacy, but it requires healthcare companies to expend additional resources, time, and energy.
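One workaround of this kind can be sketched in a few lines. The sketch below is purely illustrative (the function name and the masking widths are assumptions, not part of either vendor’s API): it assumes the site proxies analytics requests through its own server, where the host portion of the visitor’s IP address is zeroed out before any third-party tool sees it.

```python
import ipaddress

def anonymize_ip(ip: str) -> str:
    """Zero the host portion of an IP address before forwarding a
    request to a third-party analytics service (illustrative sketch).

    IPv4 addresses keep only the /24 prefix; IPv6 keep only /48.
    """
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us build the network from a host address,
    # e.g. 203.0.113.77/24 -> network 203.0.113.0/24
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(net.network_address)

print(anonymize_ip("203.0.113.77"))  # 203.0.113.0
```

A proxy like this keeps the vendor’s aggregate metrics roughly intact while ensuring the full IP address, a HIPAA identifier, never leaves the healthcare organization’s own infrastructure.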

Another challenge for a healthcare organization is the need for a “business associate agreement” (BAA) with every vendor that handles medical information. BAAs are contracts that specify the safeguards around medical information as well as the liability of each partner involved. In many cases a vendor can’t be used because it won’t sign a BAA. BAAs also require determining who is liable for violations and responsible for penalties. This is a good principle, but it is slow and requires companies to accept the liabilities. Often organizations will decide to avoid explicit risks and instead opt to remain on legacy technology, which still carries hidden risks, just without the explicit assertions of liability within a BAA.

As much as requirements within HIPAA slow healthcare companies and make processes more expensive, healthcare companies also make choices that contribute to the problems in healthcare. Interoperability, the ability to share healthcare records across healthcare organizations, has been a goal of the US government since at least the 2009 HITECH Act that included digital health records and interoperability as core standards.

Many healthcare organizations have adopted digital health records, but progress on interoperability has been more limited. Sharing medical data in a secure manner is already complicated, but interoperability is also deprioritized for other reasons. There is no incentive to share records when a patient is switching healthcare providers; for hospital systems, reducing the burden of medical record sharing could make it easier to lose a customer to a competitor. HIPAA allows sharing of records across organizations for patient care, so the lack of interoperability can’t be blamed entirely on HIPAA.

There are many challenges with HIPAA, and in many situations it makes healthcare companies move slower and become more risk averse than other technology companies. But it also makes healthcare technology companies think explicitly about the risks they take and prioritize strategies to protect a person’s medical data. A further challenge is that other industries haven’t historically prioritized protecting and securing personal data to the degree HIPAA requires. In protecting personal data, healthcare companies are at the forefront, building the technology to support these standards. Compare this to when healthcare companies adopt technologies such as cloud computing: there they leverage a second-wave technology that was refined in another area. Perhaps if more companies prioritized protecting user data, they could help healthcare companies fix some of the unnecessary burdens that cause patients additional heartache.

Been There, Done That: What Data Science Can Learn from Psychology

by Kim Darnell on September 20, 2018

In the wake of recent revelations regarding misuses and abuses of personal data by a variety of well-known and successful companies, as well as the growing evidence that big data are being used to actively perpetuate and increase socioeconomic inequality, it might seem like data science as a discipline has wandered so far down a dark ethical path that there is no clear map to recovery.

Image credit: Getty Images

As a professor of psychology and a data scientist, however, I see something very different: A young field that is still trying to figure out how to maximize its potential in a fast-paced, dynamic world while still following a stable, practical moral compass. Psychology was there once, too, performing highly controversial studies, such as the Milgram experiment, which showed everyday Americans that, like their counterparts in Nazi Germany, they would engage in the potentially life-threatening torture of strangers if instructed to do so by an authority figure. Or the Stanford prison experiment, which revealed that even the most privileged among us can become predators or prey at the flip of a coin when placed in a prison environment.

Image credit: Yale University Manuscript and Archives

Today, data scientists and those who employ them are struggling publicly, if not painfully, to find the right balance between getting the data they want while respecting the rights of those they get the data from. Fortunately, psychology can offer a detailed and time-tested framework for making that struggle less difficult.

In the United States, any licensed psychologist or employee of a training program approved by the American Psychological Association (APA) is bound by the Ethical Principles of Psychologists and Code of Conduct, also known as the APA Code of Ethics. This code is centered on five principles that are intended to “guide and inspire … toward the very highest ethical ideals of the profession.” They include A) Beneficence and Nonmaleficence, B) Fidelity and Responsibility, C) Integrity, D) Justice, and E) Respect for People’s Rights and Dignity. Taken together, these principles and the rules they give rise to govern psychologists’ behavior in all areas of professional practice (e.g., therapy, research, education, public service) and describe how we must:

  • resolve conflicts of interest among our domains of practice, and that practice with the law;
  • interact with our colleagues and clients in a way that guarantees they understand what we are doing, why we are doing it, and what we think the consequences of our collective actions might be;
  • define the limits of our professional competence, as well as that of our colleagues and clients;
  • and address any mistrust or harmful effects that arise from our professional conduct, and do so in a meaningful and timely fashion.

Image credit: American Psychological Association

Of course, psychology’s model is not the only plausible source for a data science code of ethics. Data for Democracy, for example, has attempted to crowdsource a code of ethical conduct from the data science community itself, an effort supported by former U.S. Chief Data Scientist and data ethics evangelist D.J. Patil. Others have proposed a data science code of ethics based on the Hippocratic Oath or the code of ethics for the National Association of Social Workers. Each of these approaches has its strengths and weaknesses, but none seems to offer the comprehensive perspective that the APA Code of Conduct does.

However we ultimately choose to resolve the crafting of a data science code of ethics, there are a few things we can be sure of. First, we as data scientists need to break the bad habit of asking for forgiveness rather than permission. If we don’t, the general public will become so mistrustful of us that they refuse to provide us with the honest and representative data we need to do our jobs well. Second, we need to avoid falling prey to the entropy of procrastination. Otherwise, we will find our own code of ethics defined for us piecemeal by various government entities, the majority of which have members who know far less about the ethics of human subjects research and data science technology than they do about their current polling numbers and chances for re-election.

Psychology as a discipline runs the gamut from social science to biological science, and thus has constructed its code of ethical conduct to function effectively in diverse intellectual, cultural, and professional contexts. Given that data science is facing a comparably Herculean but highly related task, it seems both reasonable and efficient for our young discipline to take advantage of the insight that psychology can offer and base our own code of ethics on its well-validated model.