Wednesday, February 05, 2020
11:00 AM - 12:00 PM
Annenberg 105

Special Seminar in Computing and Mathematical Sciences

Reliability, Equity, and Reproducibility in Modern Machine Learning
Yaniv Romano, Postdoctoral Scholar, Department of Statistics, Stanford University
Speaker's Bio:
Yaniv Romano is a postdoctoral scholar in the Department of Statistics at Stanford University, advised by Prof. Emmanuel Candès. He earned his Ph.D. and M.Sc. degrees in 2017 from the Department of Electrical Engineering at the Technion – Israel Institute of Technology, under the supervision of Prof. Michael Elad. Before that, in 2012, Yaniv received his B.Sc. from the same department. His research spans the theory and practice of selective inference, sparse approximation, machine learning, data science, and signal and image processing. His goal is to advance the theory and practice of modern machine learning, as well as to develop statistical tools that can be wrapped around any data-driven algorithm to provide valid inferential results. Yaniv is also interested in image recovery problems: the super-resolution technology he invented together with Dr. Peyman Milanfar is used in Google's flagship products, improving the quality of billions of images and yielding significant bandwidth savings. In 2017, he and Prof. Michael Elad created a MOOC on the theory and practice of sparse representations, hosted on the edX platform. Yaniv is a recipient of the 2015 Zeff Fellowship, the 2017 Andrew and Erna Finci Viterbi Fellowship, the 2017 Irwin and Joan Jacobs Fellowship, the 2018–2020 Zuckerman Postdoctoral Fellowship, the 2018–2020 ISEF Postdoctoral Fellowship, the 2018–2020 Viterbi Fellowship for nurturing future faculty members, Technion, and the 2019–2020 Koret Postdoctoral Scholarship, Stanford University. Yaniv was awarded the 2020 SIAG/IS Early Career Prize.

Modern machine learning algorithms have achieved remarkable performance in a myriad of applications, and are increasingly used to make impactful decisions in hiring, criminal sentencing, and healthcare diagnostics, and even to make new scientific discoveries. The use of data-driven algorithms in high-stakes applications is exciting yet alarming: these methods are extremely complex, often brittle, and notoriously hard to analyze and interpret. Naturally, concerns have been raised about the reliability, fairness, and reproducibility of the output of such algorithms. This talk introduces statistical tools that can be wrapped around any "black-box" algorithm to provide valid inferential results while taking advantage of the algorithm's impressive performance. We present novel developments in conformal prediction and quantile regression, which rigorously guarantee the reliability of complex predictive models, and show how these methodologies can be used to treat individuals equitably. Next, we focus on reproducibility and introduce an operational selective inference tool that builds upon the knockoff framework and leverages recent progress in deep generative models. This methodology allows for reliable identification of a subset of important features that is likely to explain a phenomenon under study in a challenging setting where the data distribution is unknown, e.g., mutations that are truly linked to changes in drug resistance.
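
To make the idea of wrapping a statistical tool around a "black-box" predictor concrete, the following is a minimal sketch of textbook split conformal prediction for regression. It is an illustration only and not necessarily the method presented in the talk. The function names (`conformal_quantile`, `split_conformal_interval`, `fit_least_squares`), the synthetic data, and the least-squares stand-in for the black box are all assumptions made for this example.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the interval must be unbounded, so return inf.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else scores[k - 1]


def split_conformal_interval(fit, X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    """Wrap any fitting routine `fit` (hypothetical interface: returns a
    predict function) and produce intervals with marginal coverage of at
    least 1 - alpha, assuming calibration and test points are exchangeable."""
    predict = fit(X_train, y_train)            # black box is trained once
    scores = np.abs(y_cal - predict(X_cal))    # absolute residuals on held-out data
    q = conformal_quantile(scores, alpha)
    preds = predict(X_test)
    return preds - q, preds + q


def fit_least_squares(X, y):
    """Stand-in black box: ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef


# Synthetic data (an assumption for the demo): linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=3000)

X_train, y_train = X[:1000], y[:1000]
X_cal, y_cal = X[1000:2000], y[1000:2000]
X_test, y_test = X[2000:], y[2000:]

lower, upper = split_conformal_interval(
    fit_least_squares, X_train, y_train, X_cal, y_cal, X_test, alpha=0.1
)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"Empirical coverage on test data: {coverage:.3f}")
```

The wrapper never inspects the model's internals; it only needs predictions on held-out calibration data, which is why the same recipe applies to any predictor. The intervals here have constant width; combining conformal prediction with quantile regression, as the abstract mentions, is one way to let the width adapt to the input.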

For more information, please contact Sydney Garstang by phone at 626-395-4555 or by email at sydney@caltech.edu.