Abstract: Data is now everywhere: enormous amounts of it are produced and processed every day. Data is gathered and used extensively in computations that serve many purposes, e.g., computing statistics on populations, refining bidding strategies in ad auctions, improving recommendation systems, and making loan or hiring decisions. Many organizations hold large amounts of data from their customers or users, and many also aim to build or complement their datasets by buying and aggregating data from other sources.


Yet data is not always transacted and processed in a responsible manner. Often, data about individuals is collected without their consent and without appropriate transparency and compensation. Numerous leaks of private data have occurred in the past decade, highlighting the need for better privacy protections in transactions and computations involving data. Data-driven machine learning and decision-making algorithms have been shown to mimic past bias or introduce additional bias in their decisions and predictions, leading to inequities and disparate impacts across individuals and populations.


In this talk, I will focus on my research on using data in a more responsible manner. I will first address the optimization and economic challenges that arise when agents are allowed to opt in and out of data sharing and must be compensated sufficiently for their data contributions. I will then briefly cover some of my work on the privacy issues that arise in data transactions and data-driven analysis. Finally, I will talk about how to reduce the disparate and discriminatory impact of data-driven decision-making, with a focus on long-term fairness considerations.

 

Bio: Juba Ziani is a Warren Center Postdoctoral Fellow at the University of Pennsylvania, hosted by Sampath Kannan, Michael Kearns, Aaron Roth, and Rakesh Vohra. Prior to this, he was a PhD student at Caltech in the Computing and Mathematical Sciences department, where he was advised by Katrina Ligett and Adam Wierman.


Juba studies the optimization, game-theoretic, economic, ethical, and societal challenges that arise from transactions and interactions involving data. In particular, his research focuses on the design of markets for data, on data privacy with a focus on "differential privacy", on fairness in machine learning and decision-making, and on strategic considerations in machine learning.