HR Analytics – An ethical dilemma?
By Christoph Jentzsch | June 28, 2019

In the times of “Big Data Analytics”, the “War for Talents” and “Demographic Change”, the keywords HR Analytics and People Analytics seem to be ubiquitous in the realm of HR departments. Many Human Resource departments are ramping up their skills in analytical technologies to mine the golden nuggets of data they have about their own workforce. But what does HR Analytics even mean?

Mick Collins, Global Vice President of Workforce Analytics & Planning Solution Strategy at SAP SuccessFactors, sets the context of HR Analytics as follows:

“The role of HR – through the management of an organization’s human capital assets – is to impact four principal outcomes: (a) generating revenue, (b) minimizing expenses, (c) mitigating risks, and (d) executing strategic plans. HR analytics is a methodology for creating insights on how investments in human capital assets contribute to the success of those four outcomes. This is done by applying statistical methods to integrated HR, talent management, financial, and operational data.” (Lalwani, 2019)

In short, HR Analytics is a data-driven approach to HR management.

Figure 1: Data-Driven Decision Making in HR – Source: https://www.analyticsinhr.com/blog/what-is-hr-analytics/

So, what’s the big fuss about it? Well, the example of Marketing Analytics, which had a revolutionary impact on the field of marketing, suggests that HR Analytics will change the way HR departments operate tomorrow. A more data-driven approach enables HR to:

  • …make better decisions using data, instead of relying on a manager’s gut feeling
  • …move from an operational partner to a tactical, or even strategic, partner (Vulpen, 2019)
  • …attract more talent by improving the hiring process and the employee experience
  • …continuously improve workforce planning through informed talent development (MicroStrategy Incorporated, 2019)

However, the increased availability of new findings and information, as well as the ongoing digitalization that unlocks new opportunities to understand and interpret that information, also raises new concerns. The most critical challenges are:

  • Having employees in HR functions with the right skillset to gather, manage, and report on the data
  • Confidence in data quality as well as cleansing and interpretation problems
  • Data privacy and compliance risks
  • Ethical and moral concerns about using the data

The latter two aspects in particular merit closer investigation, and some guidance is given here on how to overcome those challenges. It is important to understand, first of all, that corporate organizations collect data about their employees at a very detailed level. In theory, they could reconcile findings back down to a level that identifies an individual employee. However, legal requirements do not always allow this, and with the implementation of the GDPR, organizations are now forced to treat employee data and privacy the same way they treat customer data.

Secondly, it is crucial to understand that HR Analytics uses a range of statistical techniques that are incredibly valuable at the population level but can be problematic when used to make decisions about an individual (Croswell, 2019).

Figure 2: HRForecast Recruiting Analytics Dashboard source: https://www.hrforecast.de/portfolio-item/smartinsights/

This is confirmed by Florian Fleischmann, CEO of the HR analytics provider HRForecast, who states: “The real lever of HR Analytics is not taking place on an individual employee level; it is instead happening on a corporate macro level, when organizational processes, such as the hiring procedure or overarching talent programs, are being improved.” (Fleischmann, 2019). Mr. Fleischmann is right: managing people on an individual level is still a person-to-person relationship between employee and manager, which is nothing that requires a Big Data algorithm. Consider the worst-case scenario: job cuts. If low performers are to be identified, line managers simply have to be interviewed – there is no need for a Big Data solution.

Analytics on an individual level does not add value and can even create harm, as Mr. Fleischmann points out: “According to our experience the application of AI technology to predict for example, employee attrition rates on an individual basis can create more harm than benefit. It can cause a self-fulfilling prophecy, as the manager believes to know what team member is subject to leave and changes his behavior accordingly in a negative way”. (Fleischmann, 2019)

For that reason, HRForecast advocates two paradigms for the ethical use and application of HR Analytics:

  1. Information on an employee level is only provided to the individual employee and is not shared with anyone else. “This empowers the employee to stay performant as he or she can analyze for example his or her own skill set against a benchmark of skills that are required in the future”, confirms Fleischmann.
  2. Information is shared with management only at an aggregated level. The concept of “Derived Privacy” applies in this context: it allows enough insight to draw conclusions at a larger scale while protecting the individual employee. Given the legal regulations, data at that level needs to be fully anonymized, and groups smaller than 5 employees are excluded from any analysis (a minimal sketch of this rule follows below). Fleischmann adds: “The implementation of GDPR did not affect HRForecast, as we applied those standards already pre-GDPR. Our company stands to a high ethical code of conduct, which is a key element if you want to be a successful player in the field of HR Analytics.”
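To make the aggregation rule concrete, here is a minimal sketch in Python of what such small-group suppression could look like. The column names, example data, and helper function are hypothetical illustrations, not HRForecast’s actual implementation; only the threshold of five employees comes from the rule described above.

```python
# Minimal sketch of aggregate-only reporting with small-group suppression.
# Column names and example data are hypothetical; the threshold of five
# employees mirrors the rule described in the article.
import pandas as pd

MIN_GROUP_SIZE = 5  # groups smaller than this are excluded from any analysis

def aggregate_for_management(df: pd.DataFrame, group_col: str, metric_col: str) -> pd.DataFrame:
    """Return per-group averages, suppressing groups below the minimum size."""
    grouped = df.groupby(group_col)[metric_col].agg(["count", "mean"]).reset_index()
    # Keep only groups large enough that no individual can be singled out.
    keep = grouped["count"] >= MIN_GROUP_SIZE
    return grouped[keep].rename(columns={"count": "n_employees", "mean": f"avg_{metric_col}"})

if __name__ == "__main__":
    employees = pd.DataFrame({
        "department": ["Sales"] * 7 + ["Legal"] * 3,   # Legal has only 3 employees
        "engagement_score": [72, 65, 80, 78, 90, 61, 70, 55, 88, 77],
    })
    # The Legal department is dropped entirely because it falls below the threshold.
    print(aggregate_for_management(employees, "department", "engagement_score"))
```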

In conclusion, the application of Big Data Analytics and AI in a Human Resources context can create a huge leap in organizational transparency. However, this newly won information can pose major privacy risks for employees if it is not treated responsibly. To mitigate the risk of abusing that increased transparency, modern organizations need to apply an ethical code of conduct such as the one advocated by third-party experts like HRForecast. Thus, Big Data in HR can lead to an ethical dilemma, but it does not have to.

Bibliography

  • Croswell, A. (2019, June 25). Why we must rethink ethics in HR analytics. Retrieved from https://www.cultureamp.com/blog/david-green-is-right-we-must-rethink-ethics-in-hr
  • Fleischmann, F. (2019, June 25). CEO, HRForecast. (C. Jentzsch, Interviewer)
  • Lalwani, P. (2019, April 29). What Is HR Analytics? Definition, Importance, Key Metrics, Data Requirements, and Implementation. Retrieved from https://www.hrtechnologist.com/articles/hr-analytics/what-is-hr-analytics/
  • MicroStrategy Incorporated. (2019). HR Analytics – Everything You Need to Know. Retrieved from https://www.microstrategy.com/us/resources/introductory-guides/hr-analytics-everything-you-need-to-know
  • Vulpen, E. v. (2019). What is HR Analytics? Retrieved from https://www.analyticsinhr.com/blog/what-is-hr-analytics/

Privacy Movements within the Tech Industry
By Jill Rosok | June 24, 2019

An increasing number of people have become fed up with major tech companies and are choosing to divest from companies that violate their ethical standards. There’s been #deleteuber, #deletefacebook and other similar boycotts of big tech companies that violated consumer trust.

In particular, five companies have an outsized influence on the technology industry and the economy in general: Amazon, Apple, Facebook, Google (now a unit of parent company Alphabet), and Microsoft. Among numerous scandals, Facebook has insufficiently protected user data, leading to Russian hacking and the Cambridge Analytica controversy. Amazon and Apple have been chastised for unsafe working conditions in their factories. Google contracts with the military and collects a massive amount of data on its users. Microsoft has been repeatedly involved in antitrust suits. Those who have attempted to eliminate the big five tech companies from their lives have found it nearly impossible. It’s one thing to delete your Facebook and Instagram accounts and stop ordering packages from Amazon. However, eliminating the big five companies from your life is actually much more complicated than that. The vast majority of smartphones have hardware and/or software built by Apple and Google. Furthermore, Amazon’s services run the backend of a huge number of websites, meaning that stepping away from these companies would essentially mean giving up the internet.

For a limited few, it might be possible to simply log off and never come back, but most people rely on tech companies in some capacity to provide them basic access to work, connection to friends and family, and the internet in general. As the big five acquire more and more services that encompass the entirety of people’s lives it is extremely difficult for an individual to participate in a meaningful boycott of all five companies.

In light of the dominance of these five companies, to what extent is the government responsible for some kind of intervention? And if the government were to intervene, what might this look like? Antitrust legislation is intended to protect the consumer from monopoly powers. Historically, the government’s focus has been ensuring that companies are not behaving in ways that lead consumers to pay higher prices for goods and services. However, this doesn’t protect users where no cash is exchanged, as in the case of Facebook. It’s a great example of the classic adage: if you’re not paying money for a service, then the product is you. It also does not hold up in a circumstance where venture backing or other product lines in the business enable companies to artificially deflate prices below cost for many years until all other competitors are wiped off the map. Senator and presidential candidate Elizabeth Warren recently proposed breaking up big tech. While her piece was received more as a symbolic gesture than as a fully formed plan to regulate the tech industry, there were aspects that appear to have resonated strongly with the general public. In particular, the idea that mergers and acquisitions by large companies should undergo much deeper scrutiny, and perhaps be banned entirely, was well received by analysts.

Like with most complex problems in life, there are no easy solutions to simultaneously protect consumers and maximize technological innovation. However, it is vital to avoid becoming stagnant in response to the scale of the problem. Rather, as individuals, we must remain informed and put pressure on our political leaders to enact meaningful legislation to ensure the tech industry does not violate the basic rights of consumers.

Breast Cancer, Genetic Testing, and Privacy
By Anna Jacobson | June 24, 2019

5%-10% of breast cancer is believed to be hereditary, meaning that it results directly from a genetic mutation passed on from a parent. The most common known cause of hereditary breast cancer is an inherited mutation in the BRCA1 or BRCA2 gene; about 70% of women with these mutations will develop breast cancer before the age of 80. Identification of these mutations can determine a breast cancer patient’s course of treatment and post-treatment monitoring, inform decisions about if and how she has children, and raise awareness in her family members of their potentially higher risk.

Because of this, newly diagnosed breast cancer patients may be referred for genetic risk evaluation if they meet criteria laid out in the National Comprehensive Cancer Network (NCCN) genetic testing guidelines, including family medical history, tumor pathology, ethnicity, and age. These at-risk patients typically undergo multi-gene panel testing that looks for BRCA1 and BRCA2 mutations, as well as a handful of other less common gene mutations, some of which are associated with inherited risk for other forms of cancer as well as breast cancer.

Genetic testing for breast cancer is a complex issue that raises many concerns. One concern is that not enough patients have access to the testing; some recent studies have shown that the genetic testing guidelines’ criteria are too restrictive, excluding many patients who in fact do carry hereditary gene mutations. Another concern is that the testing is not well understood; for example, patients and even doctors may not be aware that there are many BRCA mutations that are not detected by current tests, including some that are more common than those that are currently tested. Yet another set of concerns revolves around the value of predictive genetic testing of family members who do not have a positive cancer diagnosis, and whether the benefit of the knowledge of possible risk outweighs the potential harms.

To help a patient navigate this complexity, this genetic testing is ideally offered in the context of professional genetic expertise for pre- and post-test counseling. However, under a 2013 Supreme Court ruling which declared that genes are not patentable, companies like 23andMe now offer direct-to-consumer BRCA testing without professional medical involvement or oversight. And even at its best, genetic counseling comes at a time at which breast cancer patients and their caregivers may be least able to comprehend it. They may be suffering from the shock of their recent diagnoses. They may be overwhelmed by the vast amount of information that comes with a newly diagnosed illness. Most of all, they may only be able to focus on the immediate and urgent need to take the steps required to treat their disease. To many, it is impossible to think about anything other than whether the test results are positive, and if they are, what to do.

But to a breast cancer survivor, other concerns about her genetic testing may arise months or years later. One such concern may be about privacy. Genetic testing for breast cancer is not anonymous; as with all medical testing, the patient’s name is on the test order and the results, which then become part of the patient’s medical record. All medical records, including genetic test results, are protected under HIPAA (Health Insurance Portability and Accountability Act of 1996). However, the recent proliferation of health data breaches from cyberattacks and ransomware has given rise to growing awareness that the confidentiality of medical records can be compromised. This in turn leads to fears that exposure of a positive genetic test result — one that suggests increased lifetime cancer risk — could lead to discrimination by employers, insurers, and others.

In the United States, citizens are protected against such discrimination by GINA (Genetic Information Nondiscrimination Act of 2008), which forbids most employers and health insurers from making decisions based on genetic information. However, GINA does not apply to small businesses (with fewer than 15 employees), federal and military health insurance, and other types of insurance, such as life, disability, and long-term care. It also does not address other settings of potential discrimination, such as in housing, social services, education, financial services and lending, elections, and legal disputes. Furthermore, in practice it could be very difficult to prove that discrimination prohibited by GINA took place, particularly in the context of hiring, in which it is not required that an employer give complete or truthful reasons – or sometimes, any reasons at all – to a prospective employee for why they were not hired. And perhaps the greatest weakness of GINA, from the standpoint of a breast cancer survivor, is that it only prohibits discrimination based on genetic information about someone who has not yet been diagnosed with a disease.

Though not protected by GINA, cancer survivors are protected by the Americans with Disabilities Act (ADA), which prohibits discrimination in employment, public services, accommodations, and communications based on a disability. In 1995, the Equal Employment Opportunity Commission (EEOC) issued an interpretation that discrimination based on genetic information relating to illness, disease, or other disorders is prohibited by the ADA. In 2000, the EEOC Commissioner testified before the Senate that the ADA “can be interpreted to prohibit employment discrimination based on genetic information.” However, these EEOC opinions are not legally binding, and whether the ADA protects against genetic discrimination in the workplace has never been tested in court.

Well beyond existing legislative and legal frameworks, genetic data may have implications in the future of which we have no conception today, more than perhaps any other health data. The field of genomics is rapidly evolving; it is possible that a genetic mutation that is currently tested because it signals an increased risk for ovarian cancer might in the future be shown to signal something completely different and possibly more sensitive. And unlike many medical tests which are relevant at the time of the test but have decreasing relevance over time, genetic test results are eternal, as true on the day of birth as on the day of death. Moreover, an individual’s genetic test results can provide information about their entire family, including family members who never consented to the testing and family members who did not even exist at the time the test was done.

The promise of genetic testing is that it will become a powerful tool for doctors to use in the future for so-called “precision prevention”, as well as personalized, targeted treatment. However, in our eagerness to prevent and cure cancer, we must remember to consider that as the area of our knowledge grows, so too grows its vulnerable perimeter – and so must our defenses against those who might wish to misuse it.

References

  • “Genetic Testing and Privacy.” Breastcancer.org, 28 Sept. 2016, www.breastcancer.org/symptoms/testing/genetic/privacy.
  • “Genetic Testing Guidelines for Breast Cancer Need Overhaul.” Clinicaloncology.com, 24 August 2018, https://www.clinicaloncology.com/Breast-Cancer/Article/08-18/Genetic-Testing-Guidelines-for-Breast-Cancer-Need-Overhaul/52544?sub=–esid–&enl=true.
  • “Genetic Information Privacy.” Eff.org. https://www.eff.org/issues/genetic-information-privacy.
  • “Genetic Discrimination.” Genome.gov. https://www.genome.gov/about-genomics/policy-issues/Genetic-Discrimination.
  • “NCCN Guidelines Version 3.2019.” NCCN.org. https://www.nccn.org/professionals/physician_gls/pdf/genetics_screening.pdf.
  • “Understanding Genetic Testing for Cancer.” Cancer.org. https://www.cancer.org/cancer/cancer-causes/genetics/understanding-genetic-testing-for-cancer.html.

Maintaining Data Integrity in an Enterprise
By Keith Wertsching | June 21, 2019

Everyone suffers when an enterprise does not maintain the integrity of its data, yet its leaders employ that data to make important decisions for the enterprise. Many roles are involved in mitigating the risk of poor data integrity – data integrity being defined by Digital Guardian as “the accuracy and consistency (validity) of data over its lifecycle.” But who should be responsible for making sure that the integrity of the data is preserved throughout collection, extraction, and use by the data consumers?
The agent who maintains data accuracy should ideally be someone who:

  • Understands where the data is collected from and how it is collected
  • Understands where and how the data is stored
  • Understands who is accessing the data and how they are accessing it
  • Has the ability to recognize when that data is not accurate and understands the steps required to correct it

Too often, the person responsible for maintaining data integrity is focused primarily on the second bullet point, with a casual understanding of the first and third bullet points. Take this job description for a data integrity analyst from Investopedia:
“The primary responsibility of a data integrity analyst is to manage a company’s computer data by way of monitoring its security…the data integrity analyst tracks records indicating who is accessing what information held by company computer systems at specific times.”

The job description demonstrates that someone working in data integrity should be an expert on where and how the data is stored, and be familiar with who should be accessing that information in order to make sure that company data is not stolen or used inappropriately. But who is ultimately responsible for making sure that the information is accurate in the first place, and for making sure that any changes needed are done in a timely fashion and tracked for future records?

In today’s world of enterprise database administrators, there is often a distinct separation between the person or team that understands how the data is stored and maintained and the person or team that has the ability to recognize when the data is not accurate. Let’s take the example of a configuration management database (CMDB) to highlight the potential issues from separation of data integrity responsibility. SearchDataCenter defines a CMDB as “a database that contains all relevant information about the hardware and software components used in an organization’s IT services and the relationships between those components.” The information stored in the CMDB is important because it allows the entire organization to refer to technical components in the same manner. In a larger organization, the team that is responsible for provisioning hardware and software components will often be responsible for also making sure that any information related to newly provisioned components makes its way into the CMDB. There is often an administrator or set of administrators that will maintain the information in the CMDB. The data will then be consumed by a large number of teams, including IT Support, Project Teams, and Finance.

When the data is inaccurate or incomplete, the teams consuming it lose the ability to speak the same language regarding IT components. The Finance Team may allocate dollars based on the number of components or the breakdown of component types. If they do not have adequate information, they may fail to allocate the right budget for the project teams to complete their work on time. A different understanding of enterprise components may cause delays in assistance from the IT Support organization, which has the potential to push out timelines and delay projects.

One potential solution to this issue: make one team responsible for maintaining the accuracy of the data from collection to consumption. As mentioned before, this team needs to understand where the data comes from, how it is stored, and how it is consumed, and it must be able to recognize when the data is not accurate and know the steps required to correct it. The data integrity team must be accessible to the rest of the organization to correct data accuracy problems when they arise. As the team grows and matures, it should develop proactive measures to test that data is accurate and complete, so that it can solve data integrity issues before they impact the user. By assigning specific ownership over the entire data lifecycle to one team, the organization can enforce accountability, preserve integrity, and mitigate the risk that leaders make poor decisions based on false information.
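As an illustration of what such a proactive measure might look like, here is a minimal Python sketch of an automated completeness check over CMDB records. The record fields, valid values, and sample data are hypothetical and are not drawn from any particular CMDB product.

```python
# Minimal sketch of a proactive data-quality check over CMDB records.
# The field names, valid values, and example records are hypothetical,
# not taken from any specific CMDB product.
from dataclasses import dataclass

@dataclass
class CmdbRecord:
    ci_name: str
    ci_type: str       # e.g. "server", "database", "application"
    owner_team: str
    environment: str   # e.g. "prod", "test", "dev"

REQUIRED_FIELDS = ("ci_name", "ci_type", "owner_team", "environment")
VALID_ENVIRONMENTS = {"prod", "test", "dev"}

def validate(record: CmdbRecord) -> list[str]:
    """Return a list of data-quality issues found for one configuration item."""
    issues = []
    for name in REQUIRED_FIELDS:
        if not getattr(record, name):
            issues.append(f"missing value for '{name}'")
    if record.environment and record.environment not in VALID_ENVIRONMENTS:
        issues.append(f"unknown environment '{record.environment}'")
    return issues

if __name__ == "__main__":
    records = [
        CmdbRecord("web-01", "server", "Infrastructure", "prod"),
        CmdbRecord("db-17", "database", "", "production"),  # missing owner, bad environment
    ]
    for r in records:
        problems = validate(r)
        print(f"{r.ci_name}: {'OK' if not problems else '; '.join(problems)}")
```

Run regularly (for example, as a nightly job), a check like this surfaces inaccurate records before the Finance or IT Support teams ever consume them.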

Links:

[1] Digital Guardian: https://digitalguardian.com/blog/what-data-integrity-data-protection-101
[2] Investopedia: https://www.investopedia.com/articles/professionals/120115/data-integrity-analyst-job-description-average-salary.asp
[3] SearchDataCenter: https://searchdatacenter.techtarget.com/definition/configuration-management-database

Using Social Media to Screen Job Candidates: Ethical and Future Implications
By Anonymous | June 24, 2019

Image Source: https://www.cfm-online.com/research-blog/2017/7/26/eu-looks-at-limiting-employer-social-media-snooping

Hiring qualified people is hard. Most of the time, the foundation of a hiring manager’s decision is built off of a 1-page resume, a biased reference or two (sometimes none), and a few hours of interviews with the candidate on their best behavior.

It’s no surprise that around 70% of employers have admitted to snooping on personal social media sites as a method for screening candidates [1]. Since hiring someone who isn’t the right fit can be expensive, it’s only natural for companies to turn to Facebook, Twitter, Instagram, or other social media sites to get a deeper glimpse into the personality they’re hiring. Unfortunately, there’s a lot that can go wrong for all parties involved due to the ethical implications.

What could go wrong?

Using social media to screen candidates doesn’t just weed out people who are vocal online about their criminal or illegal behavior. Doing this can lead to hiring managers screening out perfectly qualified candidates.

Recently, CIPD (an employee advocate group based in London) wrote a comprehensive pre-employment guide for organizations to follow, and included a section on using social media for job screening [2]. They outlined the risks of employers doing this, including a case study about a company deciding not to hire a transgender candidate, even after indicating that the individual was suitable for the job prior to the social media check. This was considered an act of direct discrimination based on a protected characteristic, brought on by the company using social media to get more information on the candidate.

It doesn’t stop there. For some people, it’s common sense that employers review social media profiles, and they are able to keep their private thoughts secured. However, not everybody is a social media expert, and deciphering exactly what is and isn’t private can be unwieldy. Many people are not aware that they are consenting to disclosing posts from 5+ years ago to potential employers. When companies don’t directly disclose that all content from personal social media sites is subject to review, this could be considered a breach of privacy for individuals who are unaware.

The Future of Social Media Screening

Manually reading through social media sites for potential issues with a candidate is time consuming. Why can’t someone just create an algorithm that parses through social media content when it’s available, and labels attributes of candidates for you?

Image Source: http://fortune.com/2017/05/19/ai-changing-jobs-hiring-recruiting/

With the massive influx of artificial intelligence being leveraged within the job-hunting industry, it’s surprising that this isn’t already an industry norm. However, there are a myriad of potential ethical concerns around creating algorithms to do this.

It’s entirely possible that job candidates can fall victim to algorithmic bias and be categorized as something they’re not because of an imperfect algorithm. If someone is new to social media and undergoes a screening like this, it’s possible the result will find no positive traits for the candidate, and the company will reject the candidate based on the algorithm’s decision.
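The cold-start failure described above can be illustrated with a toy sketch. The keyword lists, weights, and threshold below are invented for the example; the point is only that a candidate with no social media presence scores zero and is rejected by default.

```python
# Toy illustration of the cold-start failure mode described above.
# The keyword lists and threshold are invented for the example; real
# screening tools (where they exist) would be far more complex.
POSITIVE_KEYWORDS = {"volunteer", "award", "mentor", "certified"}
NEGATIVE_KEYWORDS = {"fired", "lawsuit"}

def screen_candidate(posts: list[str], threshold: int = 1) -> tuple[int, str]:
    """Score a candidate's posts and return (score, decision)."""
    score = 0
    for post in posts:
        words = set(post.lower().split())
        score += len(words & POSITIVE_KEYWORDS)
        score -= len(words & NEGATIVE_KEYWORDS)
    decision = "advance" if score >= threshold else "reject"
    return score, decision

if __name__ == "__main__":
    active_user = ["Proud to volunteer at the local food bank", "Got certified in first aid"]
    new_to_social_media = []  # no posts at all
    print(screen_candidate(active_user))          # (2, 'advance')
    print(screen_candidate(new_to_social_media))  # (0, 'reject') -- no data means no "positive traits"
```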

Between the start-ups that continue to sprout up for the purpose of mining data to gain valuable insights on individuals and the “Social Credit Score” going live in China in 2020 [3], it’s hard to discount the possibility that algorithmic social media screenings that score how “hirable” a candidate is will become prevalent. Because of this, all aspects of the hiring process should continually be subjected to ethical laws and frameworks to protect job candidates from unfair discrimination.

References

[1] https://www.cbia.com/news/hr-safety/employers-continue-rejecting-jobseekers-social-media/

[2] https://www.cipd.co.uk/knowledge/fundamentals/emp-law/recruitment/pre-employment-checks-guide

[3] https://digitalcommons.law.yale.edu/cgi/viewcontent.cgi?article=1122&context=yjolt

Ethical Implication of Generative AI
By Gabriel Hudson | April 1, 2019

Generative data models are rapidly growing in popularity and sophistication in the world of artificial intelligence (AI). Rather than using existing data to classify an individual or predict some aspect of a dataset, these models actually generate new content. Recent developments in generative data modeling have begun to blur the lines not only between real and fake, but also between machine- and human-generated content, creating a need to look at the ethical issues that arise as these technologies evolve.

Bots
Bots are an older technology that has already been used across a large range of functions, such as automated customer service or directed personal advertising. Bots are generative (almost exclusively creating language), but historically they have been very narrow in function and limited to small interactions on a specified topic. In May of 2018 Google debuted a bot system called Duplex that was able to successfully “fool” a significant number of test subjects while carrying out daily tasks such as booking restaurant reservations and making a hair salon appointment (link). This, combined with the ubiquity of digital assistants, sparked a resurgence in bot advancement.

Deepfake
In this case, Deepfake is a generalized term used to describe very realistic “media” (such as images, videos, music, and speech) created with an AI technology known as a Generative Adversarial Network (GAN). GANs were originally introduced in 2014 but came into prominence when a new training method was published in 2018. GANs represent the technology behind seemingly innocuous generated media such as the first piece of AI-generated art sold (link):

as well as a much more harmful set of fake pornographic videos created using celebrities’ faces (link):

The key technologies in this area were fully released to the public upon their completion.
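For readers unfamiliar with how a GAN works, the following is a minimal, simplified sketch of the adversarial training idea on toy one-dimensional data: a generator learns to produce samples that a discriminator cannot tell apart from samples drawn from the real distribution. The network sizes and data are arbitrary choices for illustration; this is a far cry from the large image models behind deepfakes.

```python
# Minimal sketch of the adversarial idea behind a GAN, on toy 1-D data.
# This illustrates the training loop only, not the large-scale image
# models used to produce deepfakes.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator maps random noise to a fake "sample"; discriminator scores real vs. fake.
G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

def real_data(n: int) -> torch.Tensor:
    # "Real" distribution the generator tries to imitate: Normal(mean=3, std=0.5)
    return torch.randn(n, 1) * 0.5 + 3.0

for step in range(2000):
    # Train the discriminator to separate real samples from generated ones.
    real = real_data(64)
    fake = G(torch.randn(64, 4)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Train the generator to fool the discriminator.
    fake = G(torch.randn(64, 4))
    g_loss = bce(D(fake), torch.ones(64, 1))  # generator wants D to output "real"
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# After training, generated samples should drift toward the real mean of 3.
print(G(torch.randn(1000, 4)).mean().item())
```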

Open AI’s GPT-2
In February 2019 OpenAI (a non-profit AI research organization founded in part by Elon Musk) released a report claiming a significant technology breakthrough in generating human-sounding text, along with promising sample results (link). OpenAI, however, against longstanding trends in the field and its own history, chose not to release the full model, citing potential for misuse on a large scale. Similar to GPT-2, there have also been breakthroughs in generative technology for other media, like images, that have been released to the public. All of the images in the subsequent frame were generated with technology developed by Nvidia.

By limiting access to a new technology, OpenAI brought to the forefront a discussion about how the rapid evolution of generative models must be handled. Now that almost indistinguishable “false” content can be generated in large volumes with ease, it is important to consider who is tasked with deciding on and maintaining the integrity of online content. In the near future, the discussion must be extended to the responsibilities of both consumers and distributors of data, and the way their “rights” to know fact from fiction and human from machine may be changing.

ESG Investing and Data Privacy
By Nate Velarde | March 31, 2019

Much of the focus on how to better protect individuals’ data privacy revolves around legal remedies and more stringent regulatory requirements. Market-based solutions are either not discussed or seen as unrealistic, ineffective, or impractical. However, the “market”, in the form of “responsible” or “sustainability”-driven investors, is imposing market discipline on companies with insufficient data privacy safeguards through lower share prices, and is redirecting investment capital to those companies with lower data privacy risks. Responsible investing as a market force is poised to grow dramatically. Blackrock, the world’s largest asset manager, is forecasting that responsible investing strategies will comprise 21% of total fund assets by 2028, up from only 3% today.

Responsible investing involves the integration of environmental, social and governance (“ESG”) factors into investment processes and decision-making. Many investors recognize that ESG information about companies is vital to understand a company’s business model, strategy and management quality. Several academic studies have shown that good corporate sustainability performance is associated with good financial results and superior investment returns. The best known ESG factors having financial relevance are those related to climate change. The reason for this is that climate change is no longer a hypothetical threat, but one that is real with multi-billion dollar consequences for investment portfolios.

Why Do ESG Investors Care About Data Privacy?

ESG investors are becoming increasingly focused on data privacy issues. Under the ESG framework, data privacy is considered a human rights issue – falling under the “S” of ESG. Privacy is a fundamental human right according to international norms established by the United Nations and the US and EU constitutions, but it is increasingly at odds with the business models of technology companies. As these companies have become more reliant on personal data collection, processing, and distribution, they have faced increased scrutiny from users and regulators, heightening reputational, litigation, and regulatory risks.

Data has been dubbed the “new oil”, the commodity that powers the digital economy. But, as investors are finding, scandals caused by privacy breaches can be just as damaging to tech behemoths as oil spills are to fossil fuel companies. Facebook-Cambridge Analytica was the tech industry’s Exxon-Valdez moment in regards to data privacy. $120 billion was wiped off Facebook’s market value in the aftermath of the scandal. Many of the sellers were ESG investors who sold the stock because of what they perceived as Facebook’s poor data stewardship.

For ESG investors, data privacy risk has become a crucial metric in assessing the companies in which they invest. ESG funds are pushing companies to be more transparent in their data-handling processes (collection, use and protection) and privacy safeguards with shareholders. ESG investors want companies to be proactive and self-regulate rather than wait for government involvement, which often tends to be overbearing and ultimately, more damaging to long-term profitability.

How ESG Investors Advocate for Data Privacy

ESG investors have three levers to advocate for stronger privacy safeguards – one carrot and two sticks. The first is dialog with senior management. As shareholders and/or potential shareholders, ESG investors are given the opportunity to meet regularly with the CEO, CFO, and other key executives. ESG investors use their management face time to discuss business opportunities and risks, of which privacy is top of mind. ESG investors can highlight deficiencies in privacy policies (relative to what they see as industry best practice) and advocate for increased management and board oversight, spending on privacy and security audits and staff training, and a shift in the mindset of executives toward designing privacy into their products and services. The key message ESG investors convey to tech executives is that companies that are better at managing privacy risks have a lower probability of suffering incidents that can meaningfully impact their share price. Any direct incremental expense associated with privacy risk mitigation is minuscule (in dollar terms) compared to the benefit of the higher share price valuation associated with lower risk.

As demonstrated by the Facebook-Cambridge Analytica share price sell-off in mid-2018, ESG investors’ second lever is to vote with their feet and sell their shares if companies fall short of data privacy expectations. Large share price declines are never pleasant, but they are often temporary. As long as business model profitability is not permanently impaired, the share price will eventually recover in most cases. Management may not feel enough pain to see through the hard work of implementing the technical and cultural changes required to adequately protect their users’ data. This is when ESG investors’ third lever can be deployed. Acting in concert with other shareholders, ESG investors can engage in a proxy fight and vote to replace the company management and/or board with one more focused on data privacy concerns. The mere threat of a proxy fight has proved to be a powerful catalyst for change at many companies across many industries. While this has yet to happen specifically with regard to data privacy, given the growing market power of ESG investors and their focus on privacy issues, that day is likely to come sooner rather than later.

Conclusion

Data privacy researchers and advocates should establish relationships with ESG investors, ESG research firms (such as Sustainalytics), and influential proxy voting advisory firms (Institutional Shareholder Services and Glass Lewis) to highlight concerns, make recommendations, and mold the overall data privacy conversation at publicly traded technology companies. Data privacy advocacy through ESG investors is a more direct, and likely much faster, route to positive change (albeit incremental) than litigation or regulation.

The Privacy Tradeoff
By John Pette | March 31, 2019

I often see privacy referenced as an all-or-nothing proposition, usually in discussions of whether one has it or one does not. In the realm of data, though, privacy exists on a continuum. It is a tradeoff between the benefits of having data readily available and the protection of people’s privacy. There is tremendous gray area in this discussion, but some things are clear. Few would argue that all social security numbers should be public. Things like people’s names and addresses are less clear. It is easy to argue that these data have always been publicly available in America via the White Pages. This is not a valid argument, as it ignores context. While that information was certainly available, the internet was not. Name, phone, and address records were not collected in one location; they existed only at the local level, and they were not digitized. As such, there were limits to the danger of dissemination. Also, there was only so much a bad actor could do with the information. In the modern world, anyone can use these basic data elements to commit fraud from anywhere in the world. The context has changed, and the need to protect information has changed with it.

Of course, to what extent data should be protected is also a gray area. Technology and, arguably, society benefit greatly from data availability. People want Waze to work reliably. Many of those same people probably do not want Google to track their locations. It is easy to go too far in either direction. These sorts of situations should all have privacy assessments to evaluate the benefits and risks.

The privacy tradeoff is particularly tricky in government, which has the responsibility for protecting its citizens, but also an obligation for transparency. In studying public crime data from all U.S. municipalities with populations of more than 100,000, I uncovered enormous differences in privacy practices. Some cities made full police reports publicly available to any anonymous user, exposing the private details of anyone involved in an incident. Others locked down all data under a blanket statement like, “All data are sensitive. If you want access to a report, file a FOIA request in person.” In the latter case, the data are certainly protected, but the police departments provide no data of value to their citizens. At the risk of making a fallacious “slippery slope” argument, I fear the expansion of government using privacy as a catch-all excuse for hiding information and eliminating transparency. The control of information is a key element of any authoritarian regime, and it is easy to reach that point without the public noticing.

The Freedom of Information Act (FOIA) is intended to provide the American public transparency in government information. It is a flawed system with good intentions. Having worked in an office responsible for FOIA responses for one government bureau, I have seen both sides of FOIA in action. When people discuss their FOIA requests publicly, it is generally in the form of complaints, and usually in one of two contexts:

  1. “They are incompetent.”
  2. “They’re hiding something.”

Most of the time, no one is intentionally hiding anything, though that makes for the most convenient conspiracy theories. In reality, there is an overwhelming volume of FOIA requests. Records are not kept in any central database, so each response requires any involved employee to dig through their email, and their regular jobs are already full-time affairs. Then, each response goes through multiple legal reviews to redact the private data of U.S. citizens. Eventually, this all gets packaged, approved, and delivered to the requestor. It is far from a perfect system. However, it does, to a sufficient degree, serve its original intent. As long as FOIA is in place and respected, I do not see the information-control aspect of government devolving into authoritarianism.

What is the proper balance? This is the ultimate question in the privacy tradeoff. Privacy risk should be assessed with every new technology or application that could contain threats of exposure, and the benefits should always outweigh those risks to the public. If companies provide transparency in their privacy policies and mechanisms for privacy data removal, the benefits and risks should coexist harmoniously.

The Bias is Real
By Collin Reinking | March 31, 2019

In 2017, after a very public controversy in which a Google employee wrote about Google’s “ideological echo chamber”, a poll conducted by Digital Examiner found that 57.1% of respondents believed search results were “biased” in some way. In the years since, Google and other big tech companies have increasingly found themselves at the center of public debate about whether their products are biased.

Of course they are biased.

This is nothing new.

That doesn’t mean it can’t be a problem.

Search is Biased
In its purest form, a search engine filters the massive corpus of media hosted on the world wide web down to just the selections that relate to our desired topic. Whatever bias that corpus has, the search engine will reflect. Search engines are biased because the Internet is made by people, and people are biased.

This aspect of bias rose to the international spotlight in 2016 when a teenager from Virginia posted a video showing how Google’s image search results for “three white teenagers” differed from the results for “three black teenagers”. The results for “three white teenagers” were dominated by stock photos of three smiling teens, while the results for “three black teenagers” were dominated by mugshots (performing the same searches today mostly returns images from articles referencing the controversy).

In its response to the controversy, Google asserted that its search engine results are driven by what images were found next to what text on the Internet. In other words, Google was only reflecting the corpus its search engine was searching over. Google didn’t create the bias, the Internet did.
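That claim is easy to demonstrate with a toy example: a ranking function that does nothing more than count matching words will still return skewed results if the underlying corpus is skewed. The captions below are invented stand-ins for the indexed pages in the incident described above; the ranking logic contains no preference of its own.

```python
# Toy illustration: a "search engine" that only reflects its corpus.
# The captions are invented; the point is that a purely corpus-driven
# ranking mirrors whatever skew the corpus already contains.
corpus = [
    "three white teenagers smiling stock photo",
    "three white teenagers studying stock photo",
    "three white teenagers at the beach stock photo",
    "three black teenagers mugshot news article",
    "three black teenagers arrest mugshot photo",
]

def search(query: str, docs: list[str]) -> list[str]:
    """Rank documents by how many query words they contain."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.split())), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True) if score > 0]

print(search("three black teenagers", corpus))
# The top results are the mugshot captions -- not because the ranking function
# "decided" anything, but because that is what this (skewed) corpus contains.
```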

This is Not New
Before the Internet there were libraries. Before search engines there were card catalogs, many of which relied on the Dewey Decimal Classification system. Melvil Dewey was a serial sexual harasser whose classification system reflected the racism and homophobia, along with other biases, that were common in the dominant culture at the time of its invention in 1876. If you had searched the Google of 1919 for information about homosexuality, you would have landed in the section for abnormal psychology, or similar. Of the 100 numbers in the system dedicated to religion, 90 of them covered Christianity. Google didn’t invent search bias.

Pick your Bias
Do we want Google, or any other company, to try to filter or distort our view of the corpus? This is the first question we must ask ourselves when we consider the conversation around “fixing” bias in search. Some instances clearly call for action, such as Google’s early image-labeling efforts not adequately distinguishing images of African Americans from images of gorillas. Other questions, like how to handle content that some might consider political propaganda or hate speech, are more confusing and would require Google to serve as an arbiter of truth and social norms.

But We Know The Wrong Answer When We See It
Google is currently working to build an intentionally censored search engine, Dragonfly, to allow itself to enter the Chinese market. This project, which is shrouded in more secrecy than usual (even for a tech company), is the wrong answer. Google developing a robust platform for managing censorship is basically pouring the slippery onto the slope. With the current political climate, both here in the United States and around the globe, it is not hard to imagine actors of all political stripes looking to exert more control over the flow of information. Developing an interface for bias to exert that control is not a solution; it’s a problem.

A New Danger To Our Online Photos
By Anonymous | March 29, 2019

This is the age of photo sharing.

We as humans have replaced some of our socialization needs by posting our captured moments online. Those treasured pictures on Instagram and Facebook fulfill many psychological and emotional needs – from keeping in touch with our family and reinforcing our ego to collecting our memories and even keeping up with the Joneses.

You knew what you were doing when you posted your Lamborghini to your FB group. Photo credit to @Alessia Cross

We do this even when the dangers of posting photos at times appear to outweigh the benefits. Our pictures can be held for ransom by digital kidnappers, used in catfishing scams, used to power fake gofundme campaigns or be gathered up by registered sex offenders. Our photos could expose us to real world perils such as higher insurance premiums, real life stalking (using location metadata) and blackmail. This doesn’t even include activities which aren’t criminal but still expose us to harm – like our photos being used against us in job interviews, being taken out of context or being used to embarrass us years later. As they say, the internet never forgets.

As if all this weren’t enough, our private photos are now being used by companies to train their algorithms. According to this article in Fortune, IBM released a collection of nearly a million photos which were scraped from Flickr and then annotated to describe the subjects’ appearance. IBM touted the collection of pictures as a way to help eliminate bias in facial recognition. The pictures were used without consent from the photographers and subjects; IBM relied on “Creative Commons” licenses to use them without paying licensing fees.

IBM has issued the following statement:

IBM has been committed to building responsible, fair and trusted technologies for more than a century and believes it is critical to strive for fairness and accuracy in facial recognition. We take the privacy of individuals very seriously and have taken great care to comply with privacy principles, including limiting the Diversity in Faces dataset to publicly available image annotations and limiting the access of the dataset to verified researchers. Individuals can opt-out of this dataset.

Opting out, however, is easier said than done. Removing any images requires photographers to email IBM links to the images they would like removed, which is a bit hard since IBM has not revealed the usernames of any users it pulled photos from.

Given all the dangers our photos are already exposed to, it might be easy to dismiss this. Is a company training models on your pictures really more concerning than, say, what your creepy uncle is doing with downloaded pictures of your kids?

Well, it depends.

The scary part of our pictures being used to train machines is that we don’t know a lot of things. We don’t know which companies are doing it, and we don’t know what they are doing it for. They could be doing it for a whole spectrum of purposes, from the beneficial (making camera autofocus algorithms smarter) to the innocuous (detecting if someone is smiling) to the possibly iffy (detecting if someone is intoxicated) to the ethically dubious (detecting someone’s race or sexual orientation) to the downright dangerous (teaching Terminators to hunt humans).

It’s all fun and games until your computer tries to kill you. Photo by @bwise

Not knowing means we don’t get to choose. Our online photos are currently treated as a public good and used for any conceivable purpose, even if those purposes are not only something we may not support but possibly even harmful to us. Could your Pride Parade photos be used to train detection of sexual orientation? Could insurance companies use your photos to train detection of participation in risky activities? Could T2000s use John Connor’s photos to find out what Sarah Connor would look like? Maybe these are extreme examples, but it is not much of a leap to think there might be companies developing models that you might find objectionable. And now your photos could be helping them.

All of this is completely legal, of course, though it goes against the principles laid out in the Belmont Report. It fails to respect persons because of its lack of consent (Respect for Persons), it provides no real advantage to the photographers or subjects (Beneficence), and all the benefits go to the companies exploiting our photos while we absorb all of the costs (Justice).

With online photo sharing, a Pandora’s box has been opened and there is no going back. As much as your local Walgreens Photo Center might wish it, wallet-sized photos and printed glossies are things of the past. Online photos are here to stay, so we have to do better.

Maybe we can start with not helping Skynet.

Hasta la vista, baby.

Sources:

Millions of Flickr Photos Were Scraped to Train Facial Recognition Software, Emily Price, Fortune March 12, 2019, http://fortune.com/2019/03/12/millions-of-flickr-photos-were-scraped-to-train-facial-recognition-software/