Data is big. Data can also be scary. But instead of shying away from these concerns, we should engage with them and tackle them head-on. This hub aims to do just that: it raises the legal and ethical concerns that data scientists face, and it shares insights, projects, and arguments that we should all be thinking about, whether or not our job titles contain the word “data”.

This blog serves as a hub for discussing and showcasing the finer points of the legal and ethical ‘gray space’ of data science. Much of the content here comes from students enrolled in the Data Science master’s program at UC Berkeley’s School of Information.

The Course

Much of the content and many of the key issues discussed in this blog arise from the course “Behind the Data: Humans and Values” (formerly titled “Legal, Policy, and Ethical Considerations for Data Scientists”), which explores this area in detail from many different perspectives. This course provides an introduction to the legal, policy, and ethical implications of data. The course will examine legal, policy, and ethical issues that arise throughout the full life cycle of data science, from collection to storage, processing, analysis, and use, including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Case studies will be used to explore these issues across various domains such as criminal justice, national security, health, marketing, politics, education, automotive, employment, athletics, and development. Attention will be paid to legal and policy constraints and considerations that attach to specific domains as well as particular data types, collection methods, and institutions. Technical, legal, and market approaches to mitigating and managing discrete and compound sets of concerns will be introduced, and the strengths and benefits of competing and complementary approaches will be explored.

Much of the content on this blog showcases the thoughts and work of past and present students who have taken this course; their projects delve into this new area and encourage thought and discussion, especially as big data continues to grow and technology continues to integrate itself into our daily lives.

The People

PhD, Information Management and Systems; UC Berkeley; 2002 – 2008
Master’s degree in Computer Science; UC Berkeley; 2002 – 2008
Bachelor of Science in Computer Science; University of Minnesota-Twin Cities; 1997 – 2000


Nathan Good is the current Principal at Good Research, LLC. As a grad student, Nathan interned at PARC, Yahoo!, and HP Labs in Bernardo Huberman’s Information Dynamics Lab. Before that he was at PARC (formerly Xerox PARC) in Mark Stefik’s Human Document Interaction Group. Nathan also worked with Joe Konstan and John Riedl in the GroupLens group at the University of Minnesota.

GitHub link for readings:

Class Syllabus: