Why representation matters in AI development
By Mohamed Gesalla | February 23, 2022

A recent genocide took place against the Rohingya in Myanmar, a country formerly known as Burma. The Rohingya are a stateless Muslim minority in Myanmar's Rakhine state who are not recognized by Myanmar as citizens or as one of the country's 135 recognized ethnic groups. The casualty toll of the humanitarian crisis reached a staggering 10,000 people, and more than 700,000 Rohingya fled to Bangladesh. The US Holocaust Memorial Museum found these numbers, along with other evidence, compelling enough to conclude that Myanmar's military committed ethnic cleansing, crimes against humanity, and genocide against the Rohingya. That the government in Myanmar partook in these crimes was not surprising, as the succession of military regimes that ruled Myanmar failed for many years to address ethnic minority grievances or to provide security for these communities, which led to an arms race and created powerful non-state armed groups (Crisisgroup2020).

In 2010, the new, predominantly Buddhist government introduced wide-ranging reforms toward political and economic liberalization but continued to discriminate against the Rohingya. Since the reforms, the country has seen a rise in Buddhist nationalism and anti-Muslim violence (HumanRightsCouncil2018). As a result of the economic liberalization reforms instituted by the government in 2011, the telecommunication sector in Myanmar saw an unprecedented drop in SIM-card prices, from $200 to $2. This explosion of internet access allowed more than 18 million people out of 50 million to reach Facebook, compared to just 1.1% of the population having internet access in 2011.

Facebook became the main source of information in a country that was emerging from military dictatorship and ethnic division, with a population that had no proper education on how misinformation spreads through the internet and social media platforms. All of this created a fertile environment for widespread hate speech against the Muslim minority, and especially the Rohingya. Facebook was the surprise factor that exacerbated the crisis in Myanmar: it was used as a platform not only by Buddhist extremist groups but also by authorities fueling those violent groups, as seen in posts such as "every citizen has the duty to safeguard race, religion, cultural identities and national interest" (HumanRightsCouncil2018).

Facebook and other social media platforms use sophisticated machine learning and AI-driven tools for hate speech detection; however, these systems are reviewed and improved by content reviewers who have a deep understanding of specific cultures and languages. Even though Facebook, now known as Meta, was warned by human rights activists that its platform was being used to spread anti-Muslim hate speech, it did not take action because the company could not deal with the Burmese language. As of 2014, the social media giant had only one content reviewer who spoke Burmese. Facebook's statement revealed that its DeepText engine failed to detect hate speech, that its workforce does not represent the people it serves, and, evidently, that it does not care about the disastrous impacts of its platform. The Facebook (Meta)–Myanmar incident is just one prime example of the catastrophic impacts that bias and lack of representation in technology development can cause.

In recent years, technology development across different industries has been adopting AI techniques to make systems more optimized, advanced, and sophisticated. As this dependency on AI continues, all segments of society will use AI in one form or another. To ensure that technology is developed with as few biases and defects as possible, there needs to be fair representation of all the communities served. Even though there have been efforts at the federal level to push corporations to diversify their workforces, more thorough policies are needed. For example, a tech company might have a diverse workforce, but most members of minority groups work in the factory while only a few work in design and executive positions. From a policy perspective, the government needs to go beyond requiring corporations to meet certain diversity numbers and specify that those numbers need to be met at the group level inside the organization.

Companies often define diversity as diversity of thought, which is valid, but it cannot be separated from diversity of religion, gender, race, socioeconomic status, and so on. The benefits of diversity and representation are not limited to meeting the needs of consumers; they accrue directly to corporations as well. Research found that companies with the most gender and ethnic/cultural diversity on their executive teams were "33% more likely to have industry-leading profitability" (medium).

Conclusion:
There is no doubt that technology has made our lives better in many ways. While it may be a means for corporations to make profits, there is still a mutually beneficial incentive for consumers and companies to diversify the workforce. Diversity and representation ensure the development of technologies that serve all segments of society, minimize discrimination and bias, prevent tragedies, and increase profitability for corporations. Individuals from particular communities know their problems and needs best, and they must be included in the conversation and the decision making. Policy makers and businesses have an obligation to ensure the inclusion of the people they serve. In my opinion, representation is an effective way to avoid, or at least minimize, the risk of technologies contributing to another genocide.

References:
https://www.atlanticcouncil.org/blogs/new-atlanticist/now-is-the-time-to-recognize-the-genocide-in-burma/
https://www.crisisgroup.org/asia/south-east-asia/myanmar/312-identity-crisis-ethnicity-and-conflict-myanmar
https://www.ohchr.org/Documents/HRBodies/HRCouncil/FFM-Myanmar/A_HRC_39_64.pdf
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
https://medium.com/groupon-design-union/diversity-in-product-development-is-always-important-even-when-selling-coupons-7138cf6946f8

Doctors See the Patient – AI Sees Everything Else
By Andi Morey Peterson | February 23, 2022

As a woman who has never fit into the ideal mold of what it means to be a physically healthy 30-something female, I am quite excited about the prospects of what machine learning can bring to the healthcare industry. It took me years, nearly a decade, of interviewing physicians to find one who would not just focus on the numbers and who would take my concerns seriously as a woman.

I had all too often felt the gaslighting many women experience when seeking health care. It has been reported often that women's complaints are more easily ignored or brushed off as "normal". We get seen as anxious and emotional, and it has been proven that we wait longer to receive relief when expressing pain[1]. Black women and other minorities have it even worse. For example, black women die at three times the rate of white women during pregnancy[2]. They are recommended fewer screenings and prescribed less pain medication. Knowing this, the question now is: can machine learning help doctors correctly diagnose patients while setting their biases aside? Can it be more objective?

What we can look forward to:

Today, we are seeing the results of machine learning in our health care systems. Some emergency rooms are using AI to scan in paperwork, saving clerical time, and using NLP to document conversations between doctors and patients. Researchers are building models that use computer vision to better detect cancer cells[3]. While all of this is very exciting, will it truly improve patient care?

We want to fast forward to the day when social bias decreases as more machine learning algorithms are used to help doctors make decisions and diagnoses. Especially as gender becomes more fluid, algorithms will be forced to look at more features than what the doctor sees in front of them. In a way, a doctor, with their bias, sees the patient and their demographics, but the algorithms can see everything. In addition, the more algorithms are released, the more doctors can streamline their work, decreasing their errors and reducing the amount of paperwork.

We must remain diligent:

We know that with these solutions we must be careful. Most solutions will not apply to all patients, and some simply don't work no matter how much training data we throw at them. IBM Watson's catastrophic failure to even come close to real physician knowledge is a good example[4]. It saw only the symptoms; it didn't see the patient. Worse, unlike simpler, well-defined ML tasks such as Jeopardy (which Watson dominated), what counts as "healthy" is often disputed even among the most senior doctors[5]. The industry is learning this is the case and is heavily focused on fixing these issues.

However, if one of the goals of AI in healthcare is to remove discrimination, we ought to tread lightly. We cannot just focus on improving the algorithms and fine-tuning the models. Human bias has a way of sneaking into our artificial intelligence systems even when we intend to make them blind. We have witnessed it with Amazon's recruiting system being biased against women and facial recognition systems being biased against people of color. In fact, we are starting to see it in previously released models for predicting patient outcomes[5]. We must feed these models more accurate and unbiased data; that is the only way to make sure we get the best of both worlds. Otherwise, society will have to reckon with the idea that AI can make healthcare disparities worse, not better. Under the Belmont principle of beneficence, we must maximize benefits and minimize potential harms, and that should be at the forefront of our minds as we expand AI in healthcare[6].

My dream of an unbiased AI to handle my health care is not quite as close as I had hoped. My search for a good doctor will continue. In the future, the best doctors will use AI as a tool in their arsenal to help decide what to do for a patient. It will be a practice of art: knowing what to use and when, and, more importantly, knowing when their own biases are coming into play, so that they can treat the patient in front of them and be sure the data fed into future models isn't contaminating the results. We need the doctor to see the patient and AI to see everything else. We cannot have one without the other.

References:
[1] Northwell Health. (2020) Gaslighting in women’s health: No, it’s not just in your head
[2] CDC. (2021) Working Together to Reduce Black Maternal Mortality | Health Equity Features | CDC
[3] Forbes. (2022) AI For Health And Hope: How Machine Learning Is Being Used In Hospitals
[4] Goodwins, Rupert. The Register. (2022) https://www.theregister.com/2022/01/31/machine_learning_the_hard_way/
[5] Nadis, Steve. MIT. (2022) The downside of machine learning in health care | MIT News
[6] The Belmont Report. (1979) https://www.hhs.gov/ohrp/sites/default/files/the-belmont-report-508c_FINAL.pdf

Children's Online Privacy: Parents', Developers', or Regulators' Responsibility?
By Francisco Miguel Aguirre Villarreal | February 23, 2022

As a parent of four daughters under 10 and a user of social media and gaming platforms, I am in awe of the current trends, viral videos, and what kids (not just minors) are posting nowadays. Looking at it, I constantly ask myself what the girls will be exposed to and socially pressured to do, see, or say in their teen and pre-teen years.

This concern is not only mine, and it is not exclusive to this point in time. In the 90s, legislators and social organizations drafted ideas to protect children from misuse and harm on websites. Those ideas culminated in Congress passing the 1998 Children's Online Privacy Protection Act (COPPA). It basically states that websites aimed at children under the age of 13 must get parental consent and provide other rights that protect children's privacy and personally identifiable information.

But because of the additional burden that COPPA compliance carries, most apps and sites not designed for children, with social media at the top, prefer to direct their services to adults by adding clauses to their privacy statements that ban minors or state that the services are intended for users 13+, 16+, or 18+.

The foregoing creates two main problems. First, for apps and sites intended for children, COPPA's parental consent doesn't always work: children will not wait for parental approval and can falsify it or search for less restrictive sites, which opens the door for them to access inappropriate content or share data that can be used for advertising or for targeting them in ways that can be harmful. Second, social media and other apps not intended for children are still used by them, simply by lying about their age to enroll, without any parental consent or, in many cases, without their parents even knowing about it. Since apps and sites assume they are not children, they are not treated as such, which can leave them completely exposed to identity theft, predators, and a handful of other risks. Meanwhile, parents don't even know about their children's enrollment, so they can't do anything to protect them, and the law, as progressive as it may be, will in most cases only react to events that have already occurred.

Among the findings of the report "Responding to Online Threats: Minors' Perspectives on Disclosing, Reporting, and Blocking," conducted by Thorn and Benenson Strategy Group, was that 45% of kids under the age of 13 already use Facebook and 36% of them reported having experienced a potentially harmful situation online. And this is just Facebook; the numbers do not include Snapchat, TikTok, Instagram, and the many other platforms available on the market, in addition to whatever may appear in the future.

So, who is responsible for children's online privacy? First, parents have the responsibility of communicating with their children to make them aware of the risks they are exposed to, and of monitoring their online activity to detect potential harms early. This might sound easy, but finding the balance between surveillance for protection and spying on their children may not come easily. Still, it is a necessary, even mandatory, task for parents to protect and inform them. Second, legislators and regulators should keep a database of complaints and confirmed cases so they can gradually classify them and incorporate them into the applicable laws, both for developers and for perpetrators. These constant updates can modernize children's online privacy legislation and make it more proactive than reactive. Third, developers must create minimum ethics standards within their companies, communicate possible harms in an easily readable format, and report cases to children and parents in a way that makes potential harms easy to understand. If social organizations, developers, legislators, and regulators work together on regulations and principles to protect minors, the result would be more fluid, efficient and, above all, safe for children, with the understanding that the final responsibility will always be carried by parents. Let's work together to help parents minimize that burden and better protect children.

Works Cited:
Children’s Online Privacy Protection Act (COPPA) of 1998
https://www.ftc.gov/enforcement/rules/rulemaking-regulatory-reform-proceedings/childrens-online-privacy-protection-rule

16 CFR Part 312, the FTC’s Children’s Online Privacy Protection Rule
https://www.govinfo.gov/app/details/CFR-2003-title16-vol1

Thorn, Benenson Strategy Group, Responding to Online Threats: Minors’ Perspectives on Disclosing, Reporting, and Blocking, 2021
https://info.thorn.org/hubfs/Research/Responding%20to%20Online%20Threats_2021-Full-Report.pdf


Biometric Data: Don’t Get Third-Partied!
By Kayla Wopschall | February 23, 2022

In 2020, Pew Research estimated that one in five Americans regularly uses a Fitness Tracker, putting the market at $36.34 billion. Then you have Health Applications, where you can connect your trackers and get personalized insights into your health, your progress toward goals, and your general fitness.

With so much personal data held in one little application on your phone or computer, it is easy to feel like it stays personal. But the Fitness Tracker and Health Application companies have your data, and it is critically important to understand what you've agreed to when you quickly click that "Accept User Agreement" button while setting up your account.

Biometric Data – How Personal Is it?

Fitness Trackers collect an incredible amount of biometric data that reveals very personal information about your health, your behaviors, and even your exact GPS coordinates with timestamps. From this, it is easy to analyze someone's health, lifestyle, and patterns of movement in space… in fact, many Health Applications that read in your Fitness Tracker data are designed to do just that: show you patterns in your behavior, let you share things like bike routes with friends, and evaluate ways for you to improve your health.

But what do the companies do with this data? It can feel like it is used just for you and your fitness goals, the service that they're providing… in reality, this data is used for much more. And it may be provided to third parties… in other words, other companies you have not directly consented to.

What is required in a Privacy Policy?

In 2004, the State of California became the first to implement a law, the California Online Privacy Protection Act (CalOPPA for short), requiring that a Privacy Policy be posted and easily accessible online by all commercial organizations. Because online businesses routinely deal with individuals across state lines, the requirements effectively apply to any commercial site or app collecting data from California residents and visitors.

CalOPPA highlights the following basic goals for organizations that collect personally identifying information (PII):

1. Readability – use common and understandable language in an easy-to-understand format.
2. Do Not Track – contain a clear statement about how, and whether, the device/app tracks you online, and state clearly whether other parties may be collecting PII while you use the service.
3. Data Use and Sharing – explain the use of your PII and, when possible, provide links to the third parties with whom your PII is shared.
4. Individual Choice and Action – make it clear what choices you as a user have regarding the collection, use, and sharing of PII.
5. Accountability – have a clear point of contact if you as a user have questions or concerns regarding privacy policies and practices.

The implementation of CalOPPA has greatly improved the accessibility and understandability of Privacy policies. However, improvements are needed for third-party data sharing.

Privacy Policies should have a clear explanation of how, if, and when data is shared or sold to a third-party (e.g. another company). There are some protections in place that require companies to aggregate (combine) data and make it anonymous so that any individual can not be identified from the data shared. However, this can be extremely difficult to achieve in practice.

For example, if a third-party purchases data from a health application or fitness tracker, which does not contain personal identifying information like a name or address, the same third-party could then purchase the missing data from a food delivery service that does, making it easy to determine identity.
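
To make that linkage concrete, here is a minimal sketch using pandas on entirely made-up data: an "anonymized" fitness dataset with no names is joined to a delivery dataset that does have names, using only ZIP code and birth date as quasi-identifiers. All column names and values are hypothetical.

import pandas as pd

# Hypothetical "anonymized" fitness data: no names, but quasi-identifiers remain.
fitness = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "zip": ["94107", "94110", "94607"],
    "birth_date": ["1988-03-02", "1990-07-19", "1985-11-30"],
    "avg_resting_hr": [52, 61, 74],
    "daily_steps": [12000, 8500, 4300],
})

# Hypothetical food-delivery data purchased separately, which does include names.
delivery = pd.DataFrame({
    "name": ["A. Jones", "B. Smith", "C. Lee"],
    "zip": ["94107", "94110", "94607"],
    "birth_date": ["1988-03-02", "1990-07-19", "1985-11-30"],
})

# Joining on the shared quasi-identifiers re-attaches names to the "anonymous" health data.
reidentified = fitness.merge(delivery, on=["zip", "birth_date"])
print(reidentified[["name", "avg_resting_hr", "daily_steps"]])

With real data the matching is noisier, but the principle is the same: the more quasi-identifiers two datasets share, the less "anonymous" either of them really is.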

It is overwhelming to think of all the ways biometric data can travel throughout the web and how it might be used to market, discriminate, and/or monitor individuals.

The first step to keeping data safe is understanding the policies you’ve opted into regarding third-party sharing. Decide if this is a policy you feel comfortable with, and if not, take steps to request the removal of your data from the platform. You do have the right to both fully understand how companies use your data and make more informed choices when clicking that Accept User Agreement button.

Social Media Screening in Recruiting: Biased or Insightful?
By Anonymous | February 23, 2022

Introduction
Did you know that your prospective employer may screen your social media content before initiating a conversation or extending a job offer to you? You may not know that a grammatical error in your social post could make your prospective employer question your communication skills, or an image of you drinking at a party could make them pass on your resume.

Social media does not work like a surveillance camera and it does not show a holistic view of someone’s life. People post content on social media selectively, which may not reflect who they are and how they behave in real life. Regardless of our views on the usage of social media for background screening, social intelligence is on the rise.

Social Intelligence in the Data Era
If you google the phrase “social intelligence”, the first definition you may see is the capacity to know oneself and to know others. If you keep browsing, you will eventually see something different stand out prominently:

Social intelligence is often used as part of the social media screening process. This automates the online screening process and gives a report on a candidate’s social behavior.

The internet holds no secrets. In this data era, the capacity to know oneself and to know others has expanded more than ever. According to a 2018 CareerBuilder survey, 70% of employers research job candidates as part of the screening process, and 57% have decided not to move forward with candidates because of what they found. What's more surprising is that the monitoring does not stop once the hiring decision has been made. Some employers continue to monitor employees' social media presence even after they are hired. Almost half of employers indicated that they use social media sites to research current employees, and about 1 in 3 have terminated an employee based on content they found online.

Biases from Manual Social Media Screenings
LinkedIn, Facebook, Twitter, and Instagram are commonly screened platforms. You may question whether it is legal to screen candidates' online presence in the hiring process. The short answer is yes, as long as employers and recruiting agencies comply with laws such as the Fair Credit Reporting Act and the Civil Rights Act in the recruiting and hiring process.

While there are rules in place, complying with them may not be easy. Although employers and recruiting agencies should only flag inappropriate content such as crime, illegal activities, violence, and sexually explicit material, social media profiles contain a lot more information that does not exist on candidates' resumes. Federal laws prohibit discrimination based on protected characteristics such as age, gender, race, religion, sexual orientation, or pregnancy status. However, it is almost impossible to avoid seeing protected information when navigating through someone's social media profile. In an ideal world, recruiters would ignore the protected information they have seen and make an unbiased decision based on the information within the work context, but is that even possible? New research revealed that seeing such information tends to impact recruiters' evaluations of candidates' hireability. In the study, recruiters reviewed the Facebook profiles of 140 job seekers. While they clearly looked at work-related criteria such as education, their final assessments were also affected by prohibited factors such as relationship status and religion: married and engaged candidates got higher ratings, while those who indicated their beliefs got lower ratings.

Some people may consider deleting their social media accounts so nothing can be found online, but that may not help you get a better chance with your next career opportunity. According to the 2018 CareerBuilder survey, almost half of employers say that they are less likely to give a candidate a call if they can’t find the candidate online.

Social Intelligence: Mitigate the Biases or Make It Worse?
There has been heated debate over whether social media usage in the hiring process is ethical. While it seems to increase the efficiency of screening and help employers understand the candidates’ personality, it’s hard to remain unbiased when being exposed to a wide variety of protected information.

Could social intelligence help mitigate the biases? Based on the description of social intelligence, we can see that it automates the scanning process which seems to reduce human biases. However, we don’t know how the results are reported. Is protected information reported? How does the social intelligence model interpret grammatical errors or a picture of someone drinking? Are the results truly bias-free? Only those who have access to those social intelligence results know the answer.

Privacy Computing
By Anonymous | October 29, 2021

The collection, use, and sharing of user data can enable companies to better judge users' needs and provide better services to customers. From the perspective of contextual integrity [1], all of the above is reasonable. However, studying the multi-dimensional privacy model [2] and privacy classification methods [3], there are many privacy risks in the processing and sharing of user data, such as data abuse, third-party leakage, and data blackmail. Because enterprises and institutions must protect both the commercial value of their data and the privacy authorizations of their users, data is stored in separate places, and it is difficult to connect and use datasets together effectively. Traditional commercial agreements cannot effectively protect the security of data: once the raw data leaves the database, the owner risks losing control of it completely. A typical negative case is the Facebook–Cambridge Analytica scandal. The two parties had an agreement under which data on tens of millions of Facebook users would be shared with Cambridge Analytica for academic research [4]. However, once the raw data was released, it was completely out of control and was used for non-academic purposes, resulting in huge fines for Facebook. A more secure solution is needed at the technical level to ensure that rights to use data can be subdivided as data circulates and is used collaboratively.

"Privacy computing" is a new computing theory and set of methods for protecting private information across its entire life cycle [5]. Models of privacy leakage, privacy protection, and privacy computation, along with an axiomatic separation of data ownership from the right to use data and other methods, are used to protect information while it is being used. Privacy computing essentially aims to solve data-service problems such as data circulation and data application while protecting data privacy. The concept is captured by slogans such as "data is available but not visible; the data does not move, the model moves", "data is available but invisible; data is controllable and measurable", and "do not share the data, share the value of the data". Based on the main related technologies on the market, privacy computing can be divided into three categories: secure multi-party computation, trusted hardware, and federated learning.

Federated learning is a distributed machine learning technology and system that involves two or more participants. These participants conduct joint machine learning through a secure algorithmic protocol: they jointly build a model, and provide model inference and prediction services, by exchanging intermediate data (such as model updates) rather than raw data. Cryptographic techniques such as homomorphic encryption, which allow computation on encrypted data so that the decrypted result matches the result of the same operation on plaintext, are often used to protect those intermediate exchanges. The model obtained this way performs almost the same as a model trained by traditional centralized machine learning, as shown in Fig.1.
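
As a rough illustration of the "data does not move, the model moves" idea, here is a minimal federated-averaging sketch in NumPy, assuming a toy linear model and three participants holding private data. It is not any particular framework's API, and it omits the encryption of the intermediate updates mentioned above.

import numpy as np

def local_update(weights, X, y, lr=0.1):
    # One gradient-descent step on a participant's private data (linear regression).
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights, participants):
    # Each participant trains locally; only its updated weights leave the device.
    updates = [local_update(weights, X, y) for X, y in participants]
    # A coordinator averages the updates (federated averaging); the raw data never moves.
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
participants = []
for _ in range(3):  # three participants, each with its own private dataset
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    participants.append((X, y))

weights = np.zeros(2)
for _ in range(200):
    weights = federated_round(weights, participants)
print(weights)  # approaches [2.0, -1.0] without any raw data being pooled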

Secure multi-party computation is a technology and system that can safely calculate agreed functions without requiring participants to share their own data and without a trusted third party. Through security algorithms and protocols, participants encrypt or convert data in plain text before providing the data to other parties. No participant can access other parties’ data in plain text, thus ensuring the security of all parties’ data, as shown in Fig.2.
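
One of the simplest building blocks behind secure multi-party computation is additive secret sharing. The sketch below, with invented salary figures, shows how three parties can learn their total without any party revealing an individual input; real protocols are considerably more elaborate.

import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo a large prime

def share(value, n_parties):
    # Split a private value into n random shares that sum to it modulo MODULUS.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values):
    n = len(private_values)
    # Each party splits its input and distributes one share to every party.
    all_shares = [share(v, n) for v in private_values]
    # Each party locally sums the shares it holds; no one ever sees a raw input.
    partial_sums = [sum(all_shares[p][i] for p in range(n)) % MODULUS for i in range(n)]
    # Only the combined total is reconstructed at the end.
    return sum(partial_sums) % MODULUS

salaries = [72_000, 98_000, 65_000]   # private inputs of three parties
print(secure_sum(salaries))           # 235000, computed without exposing any single salary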

Trusted computing starts from a security root of trust and then establishes a chain of trust from the hardware platform through the operating system to the application system. Along this chain, each level measures and certifies the next, starting from the root, so that trust is extended step by step and a safe, trustworthy computing environment is constructed. A trusted computing system consists of a root of trust, a trusted hardware platform, a trusted operating system, and trusted applications. Its goal is to improve the security of the computing platform.
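
The measurement chain can be illustrated with a small, hypothetical sketch: each boot stage is hashed (measured) before control is handed to it, and the measurement is folded into a running register, similar in spirit to how a TPM extends a platform configuration register. The stage names are placeholders.

import hashlib

def measure(data):
    return hashlib.sha256(data).digest()

def extend(register, measurement):
    # A PCR-style extend: the new value binds the old register to the new measurement.
    return hashlib.sha256(register + measurement).digest()

# Boot stages measured in order: firmware, bootloader, OS kernel, application.
stages = [b"firmware image", b"bootloader image", b"kernel image", b"application image"]

register = b"\x00" * 32  # the root of trust starts from a known value
for stage in stages:
    register = extend(register, measure(stage))  # each level measures the next

print(register.hex())  # tampering with any stage changes this final value and breaks the chain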

With increasing attention from many fields, privacy computing has become a hot emerging technology and a competitive track for business and capital. Data circulation is the key link for releasing the value of data, and privacy computing provides a technical solution for it. The development of privacy computing has clear advantages and broad application space. However, because the technology is still immature, it also faces problems. Whether through engineering breakthroughs or through optimization and adaptation between software and hardware, the performance of privacy computing still has a long way to go.

References:
[1] Helen Nissenbaum, "Privacy as Contextual Integrity", Washington Law Review, Volume 79, Number 1, Symposium: Technology, Values, and the Justice System, Feb 1, 2004.
[2] Daniel J. Solove, "A Taxonomy of Privacy", University of Pennsylvania Law Review, Vol. 154, No. 3, pp. 477-564, 2006. https://doi.org/10.2307/40041279
[3] Mulligan, Deirdre K., Koopman, Colin, and Doty, Nick (2016), "Privacy is an essentially contested concept: a multi-dimensional analytic for mapping privacy", Phil. Trans. R. Soc. A. 374: 20160118. http://doi.org/10.1098/rsta.2016.0118
[4] Confessore, Nicholas (April 4, 2018). "Cambridge Analytica and Facebook: The Scandal and the Fallout So Far". The New York Times. ISSN 0362-4331. Retrieved May 8, 2020.
[5] F. Li, H. Li, B. Niu, J. Chen, "Privacy Computing: Concept, Computing Framework, and Future Development Trends", Engineering 5, 1179-1192, 2019.

Alternative credit scoring – a rocky road to credit
By Teerapong Ninvoraskul | October 29, 2021

Aimee, a food truck owner in the Philippines, was able to expand her business after getting access to a loan. She opened a second business selling beauty products on the side. Stories like Aimee's are common in the Philippines, where 85% of the Filipino population is outside of the formal banking system.

"Aimee makes money, she's clearly got an entrepreneurial spirit, but previously had no way of getting a formal bank to cooperate," said Shivani Siroya, founder and CEO of Tala, a fintech company providing financial access to individuals and small businesses.

Loan providers using alternative credit scoring, like Tala, are spreading fast through developing countries. In just a few years, China's Ant Financial, an affiliate of Alibaba Group, has built up an extensive scoring system, called Zhima Credit (or Sesame Credit), covering 325m people.

Alternative credit scoring can be viewed as a development in building loan-default prediction systems. Unlike the traditional credit scoring system, which estimates a consumer's probability of default using financial information such as payment history, alternative scoring models use behavior on the internet to predict default rates.

Personal information such as email address, devices used, time of day when browsing, IP address, and purchase history is collected. These data have been found to correlate with loan default rates.


Alternative credit scoring
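
As a hedged sketch of what such a loan-default predictor might look like, the following trains a logistic regression on purely synthetic behavioral features (device age, share of night-time browsing, phone top-ups per month). The features, coefficients, and data are invented for illustration; real systems such as Tala's reportedly draw on thousands of signals.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 5_000

# Synthetic behavioral signals standing in for alternative data.
device_age_months = rng.uniform(1, 60, n)
night_browse_share = rng.uniform(0, 1, n)
topups_per_month = rng.poisson(4, n)

# Synthetic default labels loosely correlated with the signals (purely illustrative).
logit = -1.5 + 0.8 * night_browse_share - 0.02 * device_age_months - 0.05 * topups_per_month
default = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([device_age_months, night_browse_share, topups_per_month])
X_train, X_test, y_train, y_test = train_test_split(X, default, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]   # predicted probability of default
print("AUC:", roc_auc_score(y_test, scores))

The transparency concerns discussed below follow naturally from this setup: the fitted coefficients are easy to print, but explaining to a declined applicant why, say, their browsing hours lowered their score is much harder.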

Financial access for the unbanked
Historically, lower-income consumers have been the market segment too costly for traditional banking to serve, given small ticket sizes, the expensive infrastructure investment required, and high default rates.

For this market segment, traditional credit-scorers have limited data to work with. They could use payment records for services that are provided first and paid later, such as utilities, cable TV or internet. Such proven payment data are a good guide to default risk in the absence of credit history. In most cases, this specialized score is the only possible channel to get credible scores for consumers that were un-scorable based on traditional credit data alone.

In smaller and poorer countries with no financial infrastructure, credit-scorers have even more limited financial data to work with. Utilities are registered to households, not individuals, if they are registered at all. Thanks to high penetration of pay-as-you-go mobile phones among the poor, rapidly emerging alternative lenders are able to look at payment records for mobile phones.

A new breed of startups has spotted the opportunity to bring these data-driven, algorithm-based approaches to individuals and small businesses. Tala, which operates in India, Mexico, the Philippines, and East Africa, says it uses over 10,000 data points collected from a customer's smartphone to determine whether to grant a loan. It has lent more than $2.7 billion to over 6 million customers since 2014.

With inexpensive cost structure and lower loan default rates, these fintech startups achieve attractive investment returns and are able to provide cost-efficient financing to the previously unbanked and underbanked.

Lack of transparency & fairness challenges
Despite its benefits in expanding financial inclusion, alternative credit scoring presents new challenges that raise issues of transparency and fairness.

First, alternative scores are harder to explain to people seeking credit than traditional scores. While consumers generally have some sense of how their financial behavior affects their traditional credit scores, it may not always be readily apparent to consumers, or even to regulators, what specific information is utilized by certain alternative credit scoring systems, how such use impacts a consumer's ability to secure a loan or its pricing, and what behavioral changes consumers might make to improve their credit access and pricing.

Difficulty in explaining the alternative scores is further amplified by the secretive “blackbox” roles that alternative scoring systems play as competitive edges against each other in producing better default predictions for lenders.

Second, improving one's own credit standing is more difficult. Traditional credit scoring is heavily influenced by a person's own financial behavior; there are therefore clear, targeted actions one can take to improve one's credit standing, such as punctual monthly mortgage payments.

However, most alternative data may not be related to a person’s own financial conduct, making it beyond consumers’ control to positively influence the scores. For example, a scoring system using your social media profile, or where you attended high school, or where you shop to determine your creditworthiness would be very difficult for you to take actions to positively influence.

Third, big data can contain inaccuracies and biases that might lead to discrimination against low-income consumers, thereby failing to provide equitable opportunity for underserved populations.

Using some alternative data, especially data about a trait or attribute that is beyond a consumer's control to change, even if not illegal to use, could harden barriers to economic and social mobility, particularly for those currently outside the financial mainstream. For example, landlords often don't report the rental payments that millions of people make on a regular basis, including more than half of Black Americans.

Predicting the predictors

The ultimate goal of an alternative scoring system is to predict the likelihood of timely payment, the same outcome captured by the predictive factors in the traditional FICO scoring system. One could argue that alternative scoring is simply an attempt to use correlations between these non-traditional characteristics and payment history to arrive at a creditworthiness prediction.

It is arguable whether these alternative models can match the predictive power of actual financial records, and whether they are simply a transitional road to the traditional model for underserved populations whose financial payment records are not yet available.

References:

  • www.economist.com/international/2019/07/06/a-brief-history-and-future-of-credit-scores
  • Big Data: A Tool for Inclusion or Exclusion? Understanding the Issues (FTC Report)
  • Credit Scoring in the Era of Big Data, Mikella Hurley & Julius Adebayo, 18 Yale J.L. & Tech. 148 (2016)
  • Is an Algorithm Less Racist Than a Loan Officer? New York Times, Sep 2020
  • What Your Email Address Says About Your Credit Worthiness, Duke University's Fuqua School of Business, Sep 2021
  • Data Point: Credit Invisibles, The Consumer Financial Protection Bureau Office of Research
  • On the Rise of FinTechs – Credit Scoring using Digital Footprints
  • Zest AI Comments on the Federal Guidance for Regulating AI
  • Memorandum for the Heads of Executive Departments and Agencies, from Russell T. Vought, Acting Director

The New Need to Teach Technology Ethics
By Tony Martinez | October 29, 2021

The Hippocratic oath was written in the 5th century BC, with one of the first lines stating, "I will use those dietary regimens which will benefit my patients according to my greatest ability and judgement, and I will do no harm or injustice to them."1 Iterations of this oath have been adopted and updated for use in medicine and in other industries, with the main purpose of stating "do no harm." In these industries, the onus of the oath falls on the industry and not on the patients or users. Is it time now for technology companies to take a similar oath?

Discussion:
Like many people, I use a plethora of applications and websites for things like mobile banking, my daily work, or the occasional dog video. In doing this, I blindly accept terms of service and cookie policies, and even share data such as my email for more targeted advertisements. Then I took W231, "Behind the Data: Humans and Values," at the University of California, Berkeley and was tasked with reviewing these terms and understanding them. It was here that, as a master's-level student, I was frustrated and unable to grasp some of the concepts companies discussed in their terms of service. So how can we expect the 88.75% of US households with social media accounts to navigate such technical legalese?

With the average reading level in the United States being slightly above the 8th grade…

…the onus to protect the users of an application is shifting to the developers. As this shift occurs, and as public outcries continue over data breaches and research like the Facebook contagion study, we must ask whether these developers have the tools to make ethical choices, or whether companies should require them to be better trained and to think through all the ethical implications.

These ethical issues are not new to technology or to Silicon Valley. Evidence of them can be found in the founding of the Markkula Center in 1986. The purpose of the center was to provide Silicon Valley decision makers with the tools to properly practice ethics when making decisions. The founder of the center, former Apple chairman Mike Markkula Jr., created it after he felt it was clear "that there were quite a few people who were in decision-making positions who just didn't have ethics on their radar screen." To him, it was not that decision makers were being unethical but that they didn't have the tools needed to think ethically. Now the center provides training to companies on technology, AI, and machine learning. This has led larger companies like Google to send a number of employees to train at the Markkula Center, which has since helped Google develop a fairness module to train developers on the notions of fairness and ethics. More importantly, after its creation, Google made the module publicly available, as it felt the onus of protecting the users of its virtual world fell on the system developers. Google's fairness module even signals this by stating, "As ML practitioners build, evaluate, and deploy machine learning models, they should keep fairness considerations (such as how different demographics of people will be affected by a model's predictions) in the forefront of their minds."

It is clear from Google's stance and the growing coursework at public universities that an oath of no harm is needed in technology and is making its way into the education of developers. Such large paradigm shifts regarding ethics show the increasing importance for these companies of training their employees. The public view has shifted: companies must not only state their ethical views but prove them with actions, and making items like the fairness module publicly available lays the groundwork for such training eventually becoming mandatory in the technology sector and for developers.

References:
1. National Institute of Health. (2012, February 07). Greek Medicine: “I Swear by Apollo Physician …” Greek Medicine from the Gods to Galen. https://www.nlm.nih.gov/hmd/greek/greek_oath.html
2. Statista Research Department. (2021, June 15). Social media usage in the United States – Statistics & Facts. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/#dossierKeyfigures
3. Wriber. (Accessed on 2021, October 27). A case for writing below a grade 8 reading level. https://wriber.com/writing-below-a-grade-8-reading-level/
4. Kinster, L. (2020, February). "Ethicists were hired to save tech's soul. Will anyone let them?". https://www.protocol.com/ethics-silicon-valley
5. Kleinfeld, S (2018, October 18). “A new course to teach people about fairness in machine learning”. https://www.blog.google/technology/ai/new-course-teach-people-about-fairness-machine-learning/

Are the kids alright?
By Anonymous | October 29, 2021

Today, 84% of teenagers in the US own a cellphone. Further, teens spend an average of 9 hours per day online. While half of parents with teenagers aged 14 to 17 say they are "extremely" or "very aware" of what their kids are doing online, only 30 percent of teens say their parents are "extremely" or "very aware" of what they're doing online. There are plenty of books, resources, and programs/applications to help parents track what their teens are doing online. In truth, however, there are just as many ways for kids to get around these types of controls.

This is even more disturbing when we consider that the privacy policies of many companies only protect children 13 and under and do not apply to teenagers. This means that teens are treated as adults when it comes to privacy. For example, TikTok, which is the number one app used by teenagers in the US today, states the following in its privacy policy:

By contrast, here is an excerpt from TikTok's privacy policy for children under 13. It states clear retention and deletion processes.

While teens may be fine sharing their data with TikTok in what feels like a friendly community, they may not realize how many partners TikTok is sharing their data with. This list of partners includes ones that we might expect like payment processors, but it also includes advertising vendors that might be less expected/desirable.

In turn, each of these partners has their own data handling, retention, sharing, privacy and deletion policies and practices that are completely unknown to TikTok users.

What about the government?
While we might expect private corporations to do what is in their own best interests, even Congress has been slow to protect the privacy of teens. This week the Congressional subcommittee on Consumer Protection, Product Safety and Data Security questioned policy leaders from TikTok and Snap about the harmful effects of social media on kids and teens.

While these types of investigations are necessary and increase visibility into these companies' opaque practices, the bottom line is that there are no formal protections for teens today. The Children's Online Privacy Protection Act (COPPA), enacted in 1998, does impose certain restrictions on websites targeted at children, but it only protects children 13 and under. The bill S.1628, which looks to amend COPPA to include protections for teenagers, was only introduced in May of this year. Additionally, there is the Kids Internet Design and Safety (KIDS) Act, proposed last month to protect the online safety of children under 16. However, all of this is still only under discussion – nothing has been approved.

What about protections such as GDPR and CCPA?
The General Data Protection Regulation (GDPR) which went into effect in Europe in 2018, was enacted to give European citizens more control over their data. It includes the “right to be forgotten” which states:

“The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay” if one of a number of conditions applies. “Undue delay” is considered to be about a month.

Similarly, in the US, California has enacted the California Consumer Privacy Act (CCPA), which went into effect in 2020 and extends similar protections to California residents. While it is likely that many other states will follow suit with similar protections, companies are able to interpret their implementation of these regulations as they see fit, and many are still figuring out exactly how to implement these policies tactically in their organizations. Until then, teens will continue to create a digital footprint and audit trail that could follow them for many years into the future.

How do we move forward?
As we see, there are many places where privacy protections for teens break down. They are legally children, but have none of the protections that kids should have. Google this week announced that children (persons under the age of 18) or adults on their behalf have the ability to request that photos of them be removed from the search engine. This is a step in the right direction. However, we need more. We need governmental agencies to move more quickly to enact legislation to provide stronger, explicit protections for teens so that their privacy protections are not dictated by the whims of online companies – we owe them that much.

Sources:
“It’s A Smartphone Life: More Than Half Of US Children Now Have One.” 31 Oct. 2019, https://www.npr.org/2019/10/31/774838891/its-a-smartphone-life-more-than-half-of-u-s-children-now-have-one. Accessed 7 Oct. 2021.
“How much time does a teenager spend on social media?.” 31 May. 2021, https://www.mvorganizing.org/how-much-time-does-a-teenager-spend-on-social-media/. Accessed 25 Oct. 2021.
“Think You Know What Your Teens Do Online? – ParentMap.” 16 Jan. 2018, https://www.parentmap.com/article/teen-online-digital-internet-safety. Accessed 25 Oct. 2021.
“Text – S.1628 – 117th Congress (2021-2022): Children and Teens ….” https://www.congress.gov/bill/117th-congress/senate-bill/1628/text. Accessed 7 Oct. 2021.
“Google now lets people under 18 or their parents request to delete ….” 27 Oct. 2021, https://techcrunch.com/2021/10/27/how-to-delete-your-kids-pictures-google-search/. Accessed 28 Oct. 2021.

Trends in Modern Medicine and Drug Therapy
By Anonymous | October 11, 2021

The prescription drug industry has been a constant headline in the news over the past decade for a variety of reasons. Opioid addiction is probably the most prominent, drawing attention to the negative aspects of prescription drug abuse. Another current headline and topic in Congress is prescription drug costs, a large issue for certain demographics who are unable to access drugs essential to their well-being. Overshadowed are discussions of the benefits of drug therapy and the opportunities for advancement in the medical field through research and a combination of modernized and alternative methodologies.

Three interesting methodologies and fields of research that overlap with drug therapy are personalized medicine, patient engagement, and synergies between modern and traditional medicine. Interestingly, data collection, data analytics, and data science are important components of each. Below is a quick synopsis of these topics including some of the opportunities and challenges with the integration of data in the research. I include a number of research papers I reviewed at the end.

Patient engagement defined broadly is the practice of the patient being involved in decision making throughout their treatment. A key component of patient engagement is education in various aspects of one’s own personal health and the treatment options available. A key benefit is collection of better treatment intervention and outcome data.

One of the primary aspects of decision making in pursuing a treatment option is that the benefits outweigh the risks (fda). Patients who take an active role in their treatment and are more aware of the associated risks are naturally better able to minimize the negative effects. One common example of a risk is weight gain. Another benefit of patient engagement is better decision making with respect to lifestyle changes such as having children.

Patient engagement also creates the opportunity to gather better data through technological advances in smartphone devices and apps, which allow patients to enter data or collect it through automatic sensors. Social media data is actually a common data source today, and it is tough to argue that patient-provided data is not a better alternative.

Traditional medicine, also known as alternative medicine, refers to practices long used by indigenous cultures that rely on natural products and therapies to provide health care. Two examples are Traditional Chinese Medicine and the Ayurveda of India. For the purposes of this discussion, I would broaden the field to include the evolving use of natural products such as CBD and medicinal marijuana.

While the efficacy of alternative medicine is debated, it can probably be agreed that components of traditional medicine can provide practical medical benefits to modern health care. One of the main constraints of identifying these components is the access to data. In the case of Ayurveda, one researcher has proposed a data gathering framework combining a digital web portal, operational training of practitioners, and leveraging India’s vast infrastructure of medical organizations to gather and synthesize data (P. S. R. K. Haranath). As the developed world becomes more comfortable with alternative medicine, these types of data collection frameworks will be critical to formalizing treatments.

Personalized medicine is the concept of medicine that can be tailored to the individual rather than taking a one-size-fits-all approach (Bayer). The complex scientific framework relies on biomarkers, biogenetics, and patient stratification to develop targeted treatment for individual patients.

Data analytics and data professionals will play a vital role in the R & D of personalized medicine and the pharmaceutical industry in general. Operationalized data is a key component to the research methodologies. Many obstacles exist with clinical data including the variety of data sources, types, and terminology, siloed data across the industry, and data privacy and security. Frameworks are being developed to lead to more data uniformity and promising efforts are being made to share data across organizations. With operationalized data, advanced predictive and prescriptive analytics can be conducted to develop customized treatments and decision support (Peter Tormay). Although complex, hopefully continued progress in research and application of data analytics will lead to incremental innovations for medical treatment.

The broader purpose of the discussion is to bring awareness and advocacy for these fields of research as healthcare data is a sensitive topic for patients. The opportunities with respect to data are also highlighted to help build confidence in the prospect of jobs in the fields of data engineering, data analytics, and data science in medicine. Hopefully, the long term results of this medical research will be to provide patients with more and better treatment options, increase treatment effectiveness and long term sustainability, and lower costs and increase availability.

Resource Materials:

Pharmaceutical Cost, R&D Challenges, and Personalized Medicine
Ten Challenges in Prescription Drug Market – Cost <https://www.brookings.edu/research/ten-challenges-in-the-prescription-drug-market-and-ten-solutions/>
Big Data in Pharmaceutical R&D: Creating a Sustainable R&D Engine <https://link.springer.com/article/10.1007/s40290-015-0090-x> Peter Tormay
Bayer’s Explanation of Personalized Medicine <https://www.bayer.com/en/news-stories/personalized-medicine-from-a-one-size-fits-all-to-a-tailored-approach>

Patient Engagement and Centricity
Making Patient Engagement a Reality <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5766722/>
Think It Through: Managing the Benefits and Risks of Medicines <https://www.fda.gov/drugs/information-consumers-and-patients-drugs/think-it-through-managing-benefits-and-risks-medicines>
Patient Centricity and Pharmaceutical Companies: Is It Feasible? <https://journals.sagepub.com/doi/full/10.1177/2168479017696268>

Traditional and Alternative Medicine
The Traditional Medicine and Modern Medicine from Natural Products <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6273146/>
Role of pharmacology for integration of modern medicine and Ayurveda <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4621664/> , P. S. R. K. Haranath
Number of States with Legalized Medical Marijuana <https://www.insidernj.com/press-release/booker-warren-call-doj-decriminalize-cannabis/>

Prescription Drug Stats
https://hpi.georgetown.edu/rxdrugs/ <https://hpi.georgetown.edu/rxdrugs/>
https://www.cdc.gov/nchs/data/hus/2019/039-508.pdf

Images
5 elements of successful patient engagement <https://hitconsultant.net/2015/07/17/5-elements-of-successful-patient-engagement/#.YWUB3bhKhyw> – HIT Consultant News
Personalized Medicine Image <https://blog.crownbio.com/pdx-personalized-medicine>