Archive for November, 2018

This is an update on a 2017 CTSP project: Assessing Race and Income Disparities in Crowdsourced Safety Data Collection (with Kate Beck, Aditya Medury, and Jesus M. Barajas)

Project Update

This work has led to the development of Street Story, a community engagement tool developed through UC Berkeley SafeTREC that collects street safety information from the public.

The tool collects qualitative and quantitative information, and then creates maps and tables that can be publicly viewed and downloaded. The Street Story program aims to collect information that can create a fuller picture of transportation safety issues, and make community-provided information publicly accessible.


The Problem

Low-income groups, people with disabilities, seniors, and racial minorities are at higher risk of being injured while walking and biking, but experts have limited information on what these groups need to reduce these disparities. Transportation agencies typically rely on statistics about transportation crashes aggregated from police reports to decide where to make safety improvements. However, police-reported data is limited in a number of ways. First, crashes involving pedestrians or cyclists are significantly under-reported to police, with reports finding that up to 60% of pedestrian and bicycle crashes go unreported. Second, some demographic groups, including low-income groups, people of color, and undocumented immigrants, have histories of contentious relationships with police, and may therefore be less likely to report crashes when they do occur. Third, crash data doesn't include locations where near misses have happened, or where individuals feel unsafe even though no incident has yet occurred. In other words, the data allow professionals to react to safety issues, but don't necessarily allow them to be proactive.

One solution to improve and augment the data agencies use to make decisions and allocate resources is to provide a way for people to report transportation safety issues themselves. Some public agencies and private firms are developing apps and websites where people can report issues for this purpose. But one concern is that the people most likely to use these crowdsourcing platforms are those who have access to smartphones or the internet and who trust that government agencies will use the data to make changes, biasing the data toward the needs of these privileged groups.

Our Initial Research Plan

We chose to examine whether crowdsourced traffic safety data reflected similar patterns of underreporting and potential bias as police-reported safety data. To do this, we created an online mapping tool that people could use to report traffic crashes, near misses, and general safety issues. We planned to work with a city to release this tool to, and collect data from, the general public, then work directly with a historically marginalized community, under-represented in police-reported data, to target data collection in a high-need neighborhood. We planned to reduce barriers to entry for this community by meeting participants in person to explain the tool, providing in-person and online training, providing participants with cell phones, and covering their data plans for the month. By crowdsourcing data from the general public and from this specific community, we planned to analyze whether there were any differences in the types of information reported by different demographic groups.

This plan seemed to fit well with the research question and with community engagement best practices. However, we came up against a number of challenges. Although many municipal agencies and community organizations found our work interesting and were addressing the same transportation safety issues we were focusing on, many seemed daunted by the prospect of using technology to address underlying issues of under-reporting. We also found that a year was not enough time to build trusting relationships with the organizations and agencies we had hoped to work with. Nevertheless, we were able to release a web-based mapping tool to collect some crowdsourced safety data from the public.

Changing our Research Plan

To better understand how more well-integrated digital crowdsourcing platforms perform, we pivoted our research project to explore how different neighborhoods engage with government platforms to report non-emergency service needs. We assumed some of these non-emergency services would mirror the negative perceptions of bicycle and pedestrian safety we had hoped to collect through our crowdsourced safety platform. The City of Oakland relies on SeeClickFix, a smartphone app, to let residents request service for several types of issues: infrastructure issues, such as potholes, damaged sidewalks, or malfunctioning traffic signals; and non-infrastructure issues, such as illegal dumping or graffiti. The city also provides phone, web, and email-based platforms for reporting the same types of service requests, collectively known as 311 services. We looked at 45,744 SeeClickFix reports and 35,271 311 reports made between January 2013 and May 2016, and classified Oakland neighborhoods by whether they qualified as communities of concern: 69 Oakland neighborhoods meet the definition, while 43 do not. Because we did not have data on the characteristics of each person reporting a service request, we assumed that people reporting requests lived in the neighborhood where the request was needed.
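To make the tract-level tally concrete, here is a minimal sketch of the kind of computation described above. The file names and column names are assumptions for illustration, not the project's actual pipeline; it assumes each report carries coordinates and each census tract carries a community-of-concern flag.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical inputs: a table of service requests with coordinates and
# source platform, and a tract shapefile with a boolean COC flag.
requests = pd.read_csv("oakland_requests.csv")   # assumed columns: lat, lon, source
tracts = gpd.read_file("oakland_tracts.shp")     # assumed columns: tract_id, coc, geometry

# Attach each request to the census tract containing it. As in the study,
# we assume the reporter lives in the neighborhood where the request was made.
points = gpd.GeoDataFrame(
    requests,
    geometry=gpd.points_from_xy(requests.lon, requests.lat),
    crs="EPSG:4326",
).to_crs(tracts.crs)
joined = gpd.sjoin(points, tracts, how="inner", predicate="within")

# Share of reports originating in communities of concern, by platform.
print(joined.groupby("source")["coc"].mean())
```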

How did communities of concern interact with the SeeClickFix and 311 platforms to report service needs? Our analysis highlighted two main takeaways. First, communities of concern were more engaged in reporting than other communities, but their reporting dynamics differed by the type of issue reported. About 70 percent of service issues came from communities of concern, even though they represent only about 60 percent of Oakland's communities. They were nearly twice as likely to use SeeClickFix as the 311 platforms, but only for non-infrastructure issues. Second, even though communities of concern were more engaged, the level of engagement was not equal for everyone within them. For example, neighborhoods with higher proportions of limited-English-proficient households were less likely to report any type of incident through either 311 or SeeClickFix.
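A comparison like "twice as likely to use SeeClickFix, but only for non-infrastructure issues" can be read off a cross-tabulation of platform choice by issue type and COC status. The sketch below uses a tiny invented sample purely to show the computation; the column names and values are illustrative, not the study's data.

```python
import pandas as pd

# Invented report-level records: platform, issue type, and whether the
# originating tract is a community of concern (all values illustrative).
reports = pd.DataFrame({
    "source": ["seeclickfix", "seeclickfix", "311", "seeclickfix", "311",
               "seeclickfix", "311", "311", "seeclickfix", "311"],
    "issue":  ["graffiti", "dumping", "graffiti", "pothole", "pothole",
               "graffiti", "dumping", "pothole", "pothole", "graffiti"],
    "coc":    [True, True, True, True, True,
               False, False, False, False, False],
})

# Infrastructure vs. non-infrastructure, mirroring the categories above.
infra = reports.issue.isin(["pothole", "sidewalk", "signal"])
table = pd.crosstab([reports.coc, infra], reports.source)

# A ratio above 1 means that group favored SeeClickFix over 311.
print(table["seeclickfix"] / table["311"])
```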

Preliminary Findings from Crowdsourcing Transportation Safety Data

We deployed the online tool in August 2017. The crowdsourcing platform was aimed at collecting transportation safety-related concerns pertaining to pedestrian and bicycle crashes, near misses, perceptions of safety, and incidents of crime while walking and bicycling in the Bay Area. We disseminated the link to the platform primarily through Twitter and some email lists. Organizations that were contacted through Twitter-based outreach and subsequently interacted with the tweet (through likes and retweets) include Transform Oakland, Silicon Valley Bike Coalition, Walk Bike Livermore, California Walks, Streetsblog CA, and Oakland Built. By December 2017, we had received 290 responses from 105 respondents. Half of the responses corresponded to perceptions of traffic safety concerns (“I feel unsafe walking/cycling here”), while 34% corresponded to near misses (“I almost got into a crash but avoided it”). In comparison, 12% of responses reported an actual pedestrian or bicycle crash, and 4% reported a crime while walking or bicycling. The sample size is too small to report any statistical differences.

Figure 1 shows the spatial patterns of the responses in the Bay Area aggregated to census tracts. Most of the responses were concentrated in Oakland and Berkeley. Oakland was specifically targeted as part of the outreach efforts since it has significant income and racial/ethnic diversity.

Figure 1 Spatial Distribution of the Crowdsourcing Survey Responses


In order to assess disparities in the crowdsourced data collection, we compared responses between census tracts classified as communities of concern and those not so classified. A community of concern (COC), as defined by the Metropolitan Transportation Commission, a regional planning agency, is a census tract that ranks highly on several markers of marginalization, including the proportion of racial minorities, low-income households, limited-English speakers, and households without vehicles, among others.

Table 1 compares census tracts that received at least one crowdsourcing survey response. The average number of responses received in COCs versus non-COCs across the entire Bay Area was similar and statistically indistinguishable. However, when focusing on Oakland-based tracts, the average number of crowdsourced responses was statistically higher in non-COCs. To see how these self-reported pedestrian/cyclist concerns compare with police-reported crashes, we also assessed police-reported pedestrian and bicycle crashes from 2013-2016: more police-reported pedestrian/bicycle crashes were observed on average in COCs, both across the Bay Area and in Oakland. The divergence between the crowdsourced concerns and the police-reported crashes suggests either that walking/cycling concerns are genuinely greater in non-COCs (and thus underrepresented in police crash data), or that participation from COCs is relatively underrepresented on our platform.

Table 1 Comparison of crowdsourced concerns and police-reported pedestrian/bicycle crashes in census tracts that received at least 1 response

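The COC vs. non-COC comparison above amounts to a difference-of-means test over per-tract response counts. Here is a minimal sketch of such a test on an invented per-tract table; the variable and column names are assumptions, and Welch's t-test is one reasonable choice rather than necessarily the exact test used in the study.

```python
import pandas as pd
from scipy import stats

# Invented per-tract counts (one row per tract with at least one response);
# real values would come from the spatial join shown earlier.
tract_counts = pd.DataFrame({
    "n_responses": [3, 1, 2, 5, 1, 4, 2, 6, 2, 3],
    "coc":         [True, True, True, True, True,
                    False, False, False, False, False],
})

coc = tract_counts.loc[tract_counts.coc, "n_responses"]
non_coc = tract_counts.loc[~tract_counts.coc, "n_responses"]

# Welch's t-test (no equal-variance assumption) on mean responses per tract.
t, p = stats.ttest_ind(coc, non_coc, equal_var=False)
print(f"COC mean={coc.mean():.2f}, non-COC mean={non_coc.mean():.2f}, p={p:.3f}")
```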

Table 2 compares the self-reported income and race/ethnicity characteristics of respondents with the locations where their responses were reported. For reference, the Bay Area's median household income in 2015 was estimated at $85,000 (source: http://www.vitalsigns.mtc.ca.gov/income), and the Bay Area's population was estimated to be 58% White per the 2010 Census (source: http://www.bayareacensus.ca.gov/bayarea.htm).

Table 2 Distribution of all Bay Area responses based on the location of response and the self-reported income and race/ethnicity of respondents

The results reveal that White, medium-to-high-income respondents reported more walking/cycling-related safety issues in our survey, and more so in non-COCs. This trend is consistent with the definition of COCs, which have a higher representation of low-income people and people of color. However, if digital crowdsourcing without widespread community outreach is more likely to attract responses from medium-to-high-income groups, and, more importantly, if those groups live, work, or play in only a small portion of the region being investigated, the aggregated results will paint a biased picture of the region's transportation safety concerns. Thus, while the scalability of digital crowdsourcing provides an opportunity for capturing underrepresented transportation concerns, it may require greater collaboration with low-income, diverse neighborhoods to ensure uniform adoption of the platform.

Lessons Learned

From our attempts to work directly with community groups and agencies and our subsequent decision to change our research focus, we learned a number of lessons:

  1. Develop a research plan in partnership with communities and agencies. Doing so would have given us a research plan that community groups and agencies could more readily partner with us on, and would have ensured that partners were on board with both the topic of interest and the methods we hoped to use.
  2. Recognize the time it takes to build relationships. Building relationships with agencies and communities was more time-intensive and took longer than we had hoped; these groups often have limited time to dedicate to unfunded projects. Next time, we should account for this in our initial research plan.
  3. Use existing data sources to supplement research. Using SeeClickFix and 311 data let us collect and analyze information that added context to our research question. Although the data did not include all the demographic information we had hoped to analyze, it usefully supplemented the data we collected ourselves.
  4. Speak in a language that the general public understands. When we used the term self-reporting rather than crowdsourcing, potential partners and members of the public were more willing to see the use of technology to collect safety information from the public as legitimate. Using vocabulary and phrasing that people are familiar with is crucial when attempting to use technology for social good.

This blog post is a version of a talk given at the 2018 ACM Computer Supported Cooperative Work and Social Computing (CSCW) conference, based on a paper by Richmond Wong, Deirdre Mulligan, Ellen Van Wyk, John Chuang, and James Pierce entitled Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks, which was honored with a best paper award. Find out more on our project page and our summary blog post, or download the paper: [PDF link] [ACM link]

In the work described in our paper, we created a set of conceptual speculative designs to explore privacy issues around emerging biosensing technologies, technologies that sense human bodies. We then used these designs to help elicit discussions about privacy with students training to be technologists. We argue that this approach can be useful for Values in Design and Privacy by Design research and practice.


Image from publicintelligence.net. Note the middle bullet point in the middle column – “avoids all privacy issues.”


Let me start with a motivating example, which I've discussed in previous talks. In 2007, the US Department of Homeland Security proposed a program to try to predict criminal behavior in advance of the crime itself, using thermal sensing, computer vision, eye tracking, gait sensing, and other physiological signals. And supposedly it would “avoid all privacy issues.” But it seems pretty clear that privacy was not fully thought through in this project. Homeland Security projects do go through privacy impact assessments, and I would guess that in this case the assessment would find that the system doesn't store the biosensed data, and conclude that privacy is protected. But while this might address one conception of privacy related to storing data, there are other conceptions of privacy at play: questions about consent and movement in public space, about data use and collection, and about fairness and privacy from algorithmic bias.

While that particular imagined future hasn't come to fruition, a lot of these types of sensors are now becoming available as consumer devices, used in applications ranging from health and quantified self, to interpersonal interactions, to tracking and monitoring. And it often seems like privacy isn't fully thought through before new sensing devices and services are publicly announced or released.

A lot of existing privacy approaches, like privacy impact assessments, are deductive, checklist-based, or assume that privacy problems are already known and well defined in advance, which often isn't the case. Furthermore, the term “design” in discussions of Privacy by Design is often seen as a way of providing solutions to problems identified by law, rather than as a generative set of practices useful for understanding what privacy issues might need to be considered in the first place. We argue that speculative design-inspired approaches can help explore and define problem spaces of privacy in inductive, situated, and contextual ways.

Design and Research Approach

We created a design workbook of speculative designs. Workbooks are collections of conceptual designs drawn together to allow designers to explore and reflect on a design space. Speculative design is a practice of using design to ask social questions, by creating conceptual designs or artifacts that help create or suggest a fictional world. We can use speculative designs to explore different configurations of the world and to imagine possible alternative futures, which helps us think through issues that have relevance in the present. So rather than start by trying to find design solutions for privacy, we wanted to combine design workbooks and speculative design to create a collection of designs that would help us explore what the problem space of privacy might look like for emerging biosensing technologies.


A sampling of the conceptual designs we created as part of our design workbook

In our prior work, we created a design workbook to do this exploration and reflection. Inspired by recent research, science fiction, and trends in the technology industry, we created a couple dozen fictional products, interfaces, and webpages depicting biosensing technologies. These included smart-camera-enabled neighborhood watch systems, advanced surveillance systems, implantable tracking devices, and non-contact remote sensors that detect people's heart rates. This process is documented in a paper from Designing Interactive Systems. The designs were created as part of a self-reflective exercise, for us as design researchers to explore the problem space of privacy. However, we wanted to know how non-researchers, particularly technology practitioners, might discuss privacy in relation to these conceptual designs.

A note on how we're approaching privacy and values: following other values in design work and privacy research, we want to avoid providing a single universalizing definition of privacy as a social value. We recognize privacy as inherently multiple, something that is situated and differs across contexts and situations.

Our goal was to use our workbook as a way to elicit values reflections and discussion about privacy from our participants – rather than looking for “stakeholder values” to generate design requirements for privacy solutions. In other words, we were interested in how technologists-in-training would use privacy and other values to make sense of the designs.

Growing regulatory calls for “Privacy by Design” suggest that privacy should be embedded into all aspects of the design process, and at least partially done by designers and engineers. Because of this, the ability of technology professionals to surface, discuss, and address privacy and related values is vital. We wanted to know how people training for those jobs might use privacy to discuss their reactions to these designs. We conducted an interview study, recruiting 10 graduate students from a West Coast US university who were training to go into technology professions, most of whom had prior tech industry experience through jobs or internships. At the start of the interview, we gave them a physical copy of the designs and explained that the designs were conceptual, but didn't tell them that the designs were initially made to think about privacy issues. Below, I'll show a few examples of the speculative design concepts; you can see more of them in the paper. I'll then discuss the ways in which participants used values to make sense of or react to some of the designs.

Design examples

This design depicts an imagined surveillance system for public spaces, like airports, that automatically assigns threat statuses to people by color-coding them. We intentionally left ambiguous how the design makes its color-coding determinations, to invite questions about how the system classifies people.


Conceptual TruWork design – “An integrated solution for your office or workplace!”

In our designs, we also began to iterate on ideas relating to tracking implants and the different social contexts they could be used in. Here's a scenario advertising a workplace implantable tracking device called TruWork. Employers can subscribe to the service and make their employees implant these devices to keep track of their whereabouts and work activities, in the name of improving efficiency.


Conceptual CoupleTrack infographic depicting an implantable tracking chip for couples

We also re-imagined the implant as CoupleTrack, an implantable tracking chip for couples to use, as shown in this infographic.

Findings

We found that participants centered values in their discussions of the designs: predominantly privacy, but also related values such as trust, fairness, security, and due process. We identified eight themes in how participants interacted with the designs in ways that surfaced discussion of values; I'll highlight three here: imagining the designs as real, seeing one's self as multiple users, and seeing one's self as a technology professional. The rest are discussed in more detail in the paper.

Imagining the Designs as Real


Conceptual product page for a small, hidden, wearable camera

Even though participants were aware that the designs were imagined, some treated them as real by thinking about the long-term effects they might have in the fictional world of the design. This design (pictured above) is an easily hideable, wearable, live-streaming HD camera. One participant imagined what could happen to social norms if these became widely adopted, saying, “If anyone can do it, then the definition of wrong-doing would be questioned, would be scrutinized.” He suggested that previously unmonitored activities would become open to surveillance and tracking, like “are the nannies picking up my children at the right time or not? The definition of wrong-doing will be challenged.” Participants became actively involved in fleshing out and creating the worlds in which these designs might exist. This reflection is also interesting because it begins to consider secondary implications of widespread adoption, highlighting potential changes in social norms with increasing data collection.

Seeing One’s Self as Multiple Users

Second, participants took multiple user subject positions in relation to the designs. One participant read the webpage for TruWork and laughed at the design's claim to create a “happier, more efficient workplace,” saying, “This is again, positioned to the person who would be doing the tracking, not the person who would be tracked.” She noted that the website is really aimed at the employer. She then imagined herself as an employee using the system, saying:

If I called in sick to work, it shouldn’t actually matter if I’m really sick. […] There’s lots of reasons why I might not wanna say, “This is why I’m not coming to work.” The idea that someone can check up on what I said—it’s not fair.

This participant put herself in the viewpoints of both an employer using the system and an employee subject to it, bringing up issues of workplace surveillance and fairness. Taking different subject positions or stakeholder viewpoints in this way allowed participants to see the values implications of the designs from multiple angles.

Seeing One’s Self as a Technology Professional

Third, participants looked at the designs through the lens of being a technology practitioner, relating the designs to their own professional practices. Looking at the design that automatically detects and flags supposedly suspicious people, one participant reflected on his self-identification as a data scientist and the values implications of predicting criminal behavior with data:

the creepy thing, the bad thing is, like—and I am a data scientist, so it’s probably bad for me too, but—the data science is predicting, like Minority Report… [and then half-jokingly says] …Basically, you don’t hire data scientists.

Here he began to reflect on how his own practices as a data scientist might be implicated in this product's creepiness: his initial propensity to use data to predict whether subjects are criminals might not be a good way to approach the problem, and might have implications for due process.

Another participant compared the CoupleTrack design to a project he was working on. He said:

[CoupleTrack] is very similar to our idea. […] except ours is not embedded in your skin. It’s like an IOT charm which people [in relationships] carry around. […] It’s voluntary, and that makes all the difference. You can choose to keep it or not to keep it.

When he compared the fictional CoupleTrack product to the product he was working on in his own technical practice, the value of consent, and how one might revoke it, became very clear to this participant. Again, we found it compelling that the designs led some participants to reflect on the privacy implications of their own technical practices.

Reflections and Takeaways

Given the workbooks’ ability to help elicit reflections on and discussion of privacy in multiple ways, we see this approach as useful for future Values in Design and Privacy by Design work.

The speculative workbooks helped open up discussions about values, similar to what Katie Shilton identifies as “values levers”: activities that foreground values and cause them to be viewed as relevant and useful to design. Participants seeing themselves as users in order to reflect on privacy harms resembles prior work showing how self-testing can lead to discussion of values. Participants looking at the designs from multiple subject positions evokes value sensitive design's foregrounding of multiple stakeholder perspectives. Participants reflected on the designs both from stakeholder subject positions and through the lenses of their professional practices as technology practitioners in training.

While Shilton identifies a range of people who might surface values discussions, we see the workbook itself as an actor that can help do so. By depicting provocative designs that elicited visceral and affective reactions, the workbooks drew attention to questions about potential sociotechnical configurations of biosensing technologies. Future values in design work might consider creating and sharing speculative design workbooks to elicit values reflections with experts and technology practitioners.

More specifically, given this project's focus on privacy, we think this approach might be useful for “Privacy by Design,” particularly for technologists trying to surface discussions about the nature of the privacy problem at play for an emerging technology. We analyzed participants' responses using Mulligan et al.'s privacy analytic framework. The paper discusses this in more detail, but the important thing is that participants went beyond just saying that privacy and other values are important to think about. They began to grapple with specific, situated, and contextual aspects of privacy, such as considering different ways to consent to data collection, or noting the different types of harms that might emerge when the same technology is used in a workplace setting versus an intimate relationship. Privacy professionals are looking for tools to help them “look around corners,” to understand what new types of privacy problems might occur in emerging technologies and contexts. This approach offers a potential new tool for privacy professionals, in addition to current top-down, checklist approaches, which assume that the concepts of privacy at play are well known in advance. Speculative design practices can be particularly useful here: not to predict the future, but to help open and explore the space of possibilities.

Thank you to my collaborators, our participants, and the anonymous reviewers.

Paper citation: Richmond Y. Wong, Deirdre K. Mulligan, Ellen Van Wyk, James Pierce, and John Chuang. 2017. Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 111 (December 2017), 26 pages. DOI: https://doi.org/10.1145/3134746