ICBINB promotes “slow science”, pushes back against “leaderboard-ism”, revalues unexpected negative results, and helps people push their research forward when “stuck”. More broadly, our moonshot is to transform how we do research by cracking open the research process and inviting meta-dialog. Starting in February 2022, we will host a monthly virtual seminar series to promote these values and share stories from or about research.

The ICBINB Monthly Seminar Series seeks to shine a light on the “stuck” phases of research. Speakers will tell us about their most beautiful ideas that didn’t “work”, about when theory didn’t match practice, or perhaps just when the going got tough. These talks will let us peek inside the file drawer of unexpected negative results and peer behind the curtain to see the real story of how real researchers do real research.

Join us for our next ICBINB seminar on August 4th, 2022 at 8am PDT / 11am EDT / 5pm CEST, with Benjamin Bloem-Reddy as Invited Speaker (details below)!

August 4th, 2022

Benjamin Bloem-Reddy

Title: From Identifiability to Structured Representation Spaces, and a Case for (Precise) Pragmatism in Machine Learning

Date: August 4th, 2022 at 11am EDT / 5pm CEST / 8am PDT


Abstract: There has been a recent surge in research activity related to identifiability in generative models involving latent variables. Why should we care whether a latent variable model is identifiable? I will give some pragmatic reasons, which differ philosophically from, and have different practical and theoretical implications than, classical views on identifiability, which usually relate to recovering the “true” distribution or “true” latent factors of variation. In particular, a pragmatic approach requires us to consider how the structure we are imposing (or not imposing) on the latent space relates to the problems we’re trying to solve. I will highlight how I think a lack of precise pragmatism is holding back modern methods in challenging settings, including how aspects of my own research on identifiability have gotten stuck without problem-specific constraints. Elaborating on methods for representation learning more generally, I will discuss some ways we can (and are beginning to) structure our latent spaces to achieve specific goals other than vague appeals to general AI.

Biography: Benjamin Bloem-Reddy is an Assistant Professor of Statistics at the University of British Columbia. He works on problems in statistics and machine learning, with an emphasis on probabilistic approaches. He has a growing interest in causality and its interplay with knowledge and inference, and he also collaborates with researchers in the sciences on statistical problems arising in their research.
Bloem-Reddy was a PhD student with Peter Orbanz at Columbia and a postdoc with Yee Whye Teh in the CSML group at the University of Oxford. Before moving to statistics and machine learning, he studied physics at Stanford University and Northwestern University.

July 7th, 2022

Thomas Dietterich

Title: Struggling to Achieve Novelty Detection in Deep Learning

Date: July 7th, 2022 at 11am EDT / 5pm CEST / 8am PDT


Abstract: In 2005, motivated by an open world computer vision application, I became interested in novelty detection. However, there were few methods available in computer vision at that time, and my research turned to studying anomaly detection in standard feature vector data. In that arena, many good algorithms were being published. Fundamentally, these methods rely on a notion of distance or density in feature space and detect anomalies as outliers in that space.

Returning to deep learning 10 years later, my students and I attempted, without much success, to apply these methods to the latent representations in deep learning. Other groups attempted to apply deep density models, again with limited success. Summary: I couldn’t believe it was not better. In the meantime, simple anomaly scores such as the maximum softmax probability or the max logit score were shown to be doing very well.

We decided that we had reached the limits of what macro-level analysis (error rates, AUC scores) could tell us about these techniques. It was time to look closely at the actual feature values. In this talk, I’ll show our analysis of feature activations and introduce the Familiarity Hypothesis, which states that the max logit/max softmax score is measuring the amount of familiarity in an image rather than the amount of novelty. This is a direct consequence of the fact that the only features that are learned are ones that capture variability in the training data. Hence, deep nets can only represent images that fall within this variability. Novel images are mapped into this representation, and hence cannot be detected as outliers.

I’ll close with some potential directions to overcome this limitation.
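As a rough illustration of the two scores named in the abstract, the sketch below computes the maximum softmax probability and the max logit from a classifier's output logits. It is a minimal sketch, not the speaker's code: the function names and the example logits are hypothetical, and a low score is treated as a sign that the input looks unfamiliar.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def max_softmax_score(logits):
    # Largest class probability per input; high means the input looks familiar.
    return softmax(logits).max(axis=-1)

def max_logit_score(logits):
    # Largest raw logit per input; same reading as above, without normalization.
    return logits.max(axis=-1)

# Hypothetical logits for two inputs over three known classes.
logits = np.array([
    [8.0, 0.5, 0.2],  # one class clearly dominates
    [1.1, 1.0, 0.9],  # no class stands out
])

msp = max_softmax_score(logits)
mls = max_logit_score(logits)

# Order inputs from lowest to highest score, i.e. most to least suspicious.
ranking = np.argsort(msp)
```

Under the Familiarity Hypothesis described above, both scores respond to the presence of familiar features, so a novel image that happens to contain familiar features can still receive a high score.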

Biography: Dr. Dietterich (AB Oberlin College 1977; MS University of Illinois 1979; PhD Stanford University 1984) is Distinguished Professor Emeritus in the School of Electrical Engineering and Computer Science at Oregon State University.  Dietterich is one of the pioneers of the field of Machine Learning and has authored more than 225 refereed publications and two books. His current research topics include robust artificial intelligence, robust human-AI systems, and applications in sustainability.

Dietterich has devoted many years of service to the research community and was recently given the ACML and AAAI distinguished service awards. He is a former President of the Association for the Advancement of Artificial Intelligence and the founding president of the International Machine Learning Society. Other major roles include Executive Editor of the journal Machine Learning, co-founder of the Journal of Machine Learning Research, and program chair of AAAI 1990 and NIPS 2000. He currently serves as one of the moderators for the cs.LG category on arXiv.

June 16th, 2022

Finale Doshi-Velez

Title: Research Process for Interpretable Machine Learning

Date: June 16th, 2022 at 8:30am EDT / 2:30pm CEST


Abstract: There has been much interest in interpretable machine learning (and/or explainable AI) as a way to allow domain experts to vet machine learning systems as well as a way to assist in human+AI teaming.  In this “chalk” talk, I’ll briefly provide a framework for thinking about the interdisciplinary ecosystem that interpretable machine learning provides and then dive into the process of doing high-quality, impactful machine learning research.  Specifically, I’ll talk about:

  • Which kinds of interpretable machine learning questions are computational, and which are human factors?
  • How and when should we define abstractions between computational and human factor elements in interpretable machine learning?
  • When is a user study needed, and how should it be set up?

In the spirit of ICBINB, I’ll draw on my own experience, including examples of times when I think we got things right, and when we could have done better.

Biography: Finale Doshi-Velez is a Gordon McKay Professor in Computer Science at the Harvard Paulson School of Engineering and Applied Sciences. She completed her MSc from the University of Cambridge as a Marshall Scholar, her PhD from MIT, and her postdoc at Harvard Medical School. Her interests lie at the intersection of machine learning, healthcare, and interpretability.

May 5th, 2022

Anna Korba

Title: Limitations of the theory for sampling with kernelised Wasserstein gradient flows

Date: May 5th, 2022 at 10am EDT / 4pm CEST

[Recording] [Slides]

Abstract: Sampling from a probability distribution whose density is only known up to a normalisation constant is a fundamental problem in statistics and machine learning. Recently, several algorithms based on interacting particle systems were proposed for this task, as an alternative to Markov Chain Monte Carlo methods or Variational Inference.
These particle systems can be designed by adopting an optimisation point of view for the sampling problem: an optimisation objective is chosen (which typically measures the dissimilarity to the target distribution), and its Wasserstein gradient flow is approximated by an interacting particle system, which can involve kernels. The stationary states of these particle systems define an empirical measure approximating the target distribution.
In this talk I will present recent work on such algorithms, such as Stein Variational Gradient Descent [1] or Kernel Stein Discrepancy Descent [2]. I will discuss some recent results that highlight bottlenecks and open questions: on the empirical side, these particle systems may suffer from convergence issues, while on the theoretical side, optimisation tools may not be sufficient to analyse these algorithms. Still, I will also discuss recent empirical results that show that there is hope in demonstrating nice approximation properties of these particle systems.

[1] A Non-Asymptotic Analysis of Stein Variational Gradient Descent. Korba, A., Salim, A., Arbel, M., Luise, G., Gretton, A. NeurIPS, 2020.
[2] Kernel Stein Discrepancy Descent. Korba, A., Aubin-Frankowski, P.C., Majewski, S., Ablin, P. ICML, 2021.
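To make the idea of a kernelised interacting particle system concrete, here is a minimal sketch of a Stein Variational Gradient Descent update with a Gaussian (RBF) kernel. It is an illustration under stated assumptions rather than the code behind the cited papers: the fixed bandwidth, step size, particle count, and the standard normal target are all hypothetical choices. Note that only the gradient of the log density is needed, so the normalisation constant never appears.

```python
import numpy as np

def rbf_kernel(x, bandwidth):
    # x has shape (n, d); diffs[j, i] = x[j] - x[i].
    diffs = x[:, None, :] - x[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    # grad_k[j, i] is the gradient of k(x_j, x_i) with respect to x_j.
    grad_k = -diffs / bandwidth ** 2 * k[:, :, None]
    return k, grad_k

def svgd_step(x, grad_log_p, step_size, bandwidth):
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, bandwidth)
    scores = grad_log_p(x)  # shape (n, d)
    # Attraction: kernel-weighted sum of scores, pulling particles to high density.
    attraction = k.T @ scores
    # Repulsion: sum over j of grad_{x_j} k(x_j, x_i), pushing particles apart.
    repulsion = grad_k.sum(axis=0)
    return x + step_size * (attraction + repulsion) / n

# Assumed target: a standard normal, whose score is simply -x.
def grad_log_p(x):
    return -x

rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, scale=1.0, size=(50, 1))  # start far from the target
initial_mean = particles.mean()

for _ in range(500):
    particles = svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0)

final_mean = particles.mean()
```

In this toy setting the particles drift from their starting location toward the target. The convergence issues mentioned in the abstract concern harder cases than this one-dimensional example.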

Biography: Anna Korba has been an assistant professor in the Statistics Department at ENSAE/CREST since September 2020. Her main line of research is in statistical machine learning. She has been working on kernel methods, optimal transport and ranking data. Currently, she is particularly interested in dynamical particle systems for ML and kernel-based methods for causal inference.

April 7th, 2022

Cynthia Rudin

Date: April 7th, 2022 at 10am EDT / 4pm CEST

Title: Applications Really Matter (And Publishing Them Is Essential For AI & Data Science)

[Recording] [Slides]

Abstract: Many of us want to work on real-world machine learning problems that matter. However, it’s really hard for us to focus on such problems because it is extremely difficult to publish applied machine learning papers in top venues. I will argue that the lack of respect for applied papers has several wide-ranging implications:

1) Benefits to Science: We are unable to leverage scientific lessons learned through applications if we cannot publish them. Applications should actually be driving ML methods development. It is important to point out that applied papers are scientific. A boring bake-off or technical report is not a scientific applied paper. An applied scientific paper provides knowledge that is systematized and generalizes, just like any good scientific paper in any area of science.

2) Benefits to the Real World: We publish overly complicated methods when simpler ones would suffice. If we could focus on solving problems rather than developing methods, this issue could vanish. Much more importantly, if we actually focus on problems that benefit humanity, we might actually solve them.

3) Broadening our Community: By limiting our top venues mainly to methodology papers, we limit our community to those who care primarily about methods development. This further limits our community to those who come from narrow training pipelines. It also limits our field to exclude those whose primary goal is to directly improve the world. A really good applied data scientist from any country should be able to publish in a top tier venue in data science or AI.

4) Freeing our Top Scientists: Tying promotions of our top data scientists to publication venues that accept (essentially only) methodology means our top scientists cannot focus on real-world problems. This is particularly problematic if one wants to publish a data science paper in an area for which a specialized journal does not exist.

My proposed fix is to have tracks in major ML conferences and journals that focus on applications.

Biography: Cynthia Rudin is a professor at Duke University. Her goal is to design predictive models that are understandable to humans. She applies machine learning in many areas, such as healthcare, criminal justice, and energy reliability. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from AAAI (the “Nobel Prize of AI”). She is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and AAAI. She is a three-time winner of the INFORMS Innovative Applications in Analytics Award. Her work has been featured in news outlets including the NY Times, Washington Post, Wall Street Journal, and Boston Globe.

March 3rd, 2022

Sebastian Nowozin

Date: March 3rd, 2022 at 10am EST / 4pm CET

Title: I Can’t Believe Bayesian Deep Learning is not Better

[Recording] [Slides]

Abstract: Bayesian deep learning is seductive: it combines the simplicity, coherence, and beauty of the Bayesian approach to problem solving together with the expressivity and compositional flexibility of deep neural networks. Yes, inference can be challenging, but the promises of improved uncertainty quantification, better out-of-distribution behaviour, and improved sample efficiency are worth it. Or are they? In this talk I will tell a personal story of being seduced by, then frustrated with, and now recovering from Bayesian deep learning. I will present the context of our work on the cold posterior effect (Wenzel et al., 2020) and its main findings, as well as some more recent work that tries to explain the effect. I will also offer some personal reflections on research practice and narratives that contributed to the lack of progress in Bayesian deep learning.

Biography: Sebastian Nowozin is a deep learning researcher at Microsoft Research Cambridge, UK, where he currently leads the Machine Intelligence research theme.  His research interests are in probabilistic deep learning and applications of machine learning models to real-world problems. He completed his PhD in 2009 at the Max Planck Institute in Tübingen, and has since worked on domains as varied as computer vision, computational imaging, cloud-based machine learning, and approximate inference.

February 3rd, 2022

Tamara Broderick

Date: February 3rd, 2022 at 10am EST / 4pm CET

Title: An Automatic Finite-Sample Robustness Metric: Can Dropping a Little Data Change Conclusions?

[Recording] [Slides]

Abstract: Imagine you’ve got a bold new idea for ending poverty. To check your intervention, you run a gold-standard randomized controlled trial; that is, you randomly assign individuals in the trial to either receive your intervention or to not receive it. You recruit tens of thousands of participants. You run an entirely standard and well-vetted statistical analysis; you conclude that your intervention works with a p-value < 0.01. You publish your paper in a top venue, and your research makes it into the news! Excited to make the world a better place, you apply your intervention to a new set of people and… it fails to reduce poverty. How can this possibly happen? There seems to be some important disconnect between theory and practice, but what is it? And is there any way you could have been tipped off about the issue when running your original data analysis? In the present work, we observe that if a very small percentage of the original data was instrumental in determining the original conclusion, we might worry that the conclusion could be unstable under new conditions. So we propose a method to assess the sensitivity of data analyses to the removal of a very small fraction of the data set. Analyzing all possible data subsets of a certain size is computationally prohibitive, so we provide an approximation. We call our resulting method the Approximate Maximum Influence Perturbation. Empirics demonstrate that while some (real-life) applications are robust, in others the sign of a treatment effect can be changed by dropping less than 0.1% of the data — even in simple models and even when p-values are small.
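The core idea, approximating the effect of dropping a small subset of the data instead of refitting on every subset, can be sketched for ordinary least squares. This is a simplified illustration and not the authors' implementation of the Approximate Maximum Influence Perturbation: the function names, the synthetic data, and the 1% drop fraction are assumptions made for the example. Each point's removal is approximated by a first-order (influence-function) change in the slope, the most damaging points are selected, and a single refit checks the prediction.

```python
import numpy as np

def ols_slope(x, y):
    # Slope b of the least-squares fit y ~ a + b * x.
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def approx_most_influential_drop(x, y, n_drop):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    XtX_inv = np.linalg.inv(X.T @ X)
    # Row i is the first-order estimate of (beta without point i) - beta.
    delta_beta = -(X * residuals[:, None]) @ XtX_inv
    slope_change = delta_beta[:, 1]
    # Choose the points whose removal is predicted to lower the slope the most.
    drop = np.argsort(slope_change)[:n_drop]
    predicted_slope = beta[1] + slope_change[drop].sum()
    return drop, predicted_slope

# Synthetic data with a weak positive effect (hypothetical numbers).
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 0.05 * x + rng.normal(size=n)

full_slope = ols_slope(x, y)
drop, predicted_slope = approx_most_influential_drop(x, y, n_drop=n // 100)

# One exact refit without the selected points, to check the approximation.
keep = np.ones(n, dtype=bool)
keep[drop] = False
refit_slope = ols_slope(x[keep], y[keep])
```

Comparing `full_slope`, `predicted_slope`, and `refit_slope` shows how far a conclusion can move when a small fraction of the data is removed, at the cost of one fit plus one refit rather than a search over all subsets.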

Biography: Tamara Broderick is an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT. She is a member of the MIT Laboratory for Information and Decision Systems (LIDS), the MIT Statistics and Data Science Center, and the Institute for Data, Systems, and Society (IDSS). She completed her Ph.D. in Statistics at the University of California, Berkeley in 2014. Previously, she received an AB in Mathematics from Princeton University (2007), a Master of Advanced Study for completion of Part III of the Mathematical Tripos from the University of Cambridge (2008), an MPhil by research in Physics from the University of Cambridge (2009), and an MS in Computer Science from the University of California, Berkeley (2013). Her recent research has focused on developing and analyzing models for scalable Bayesian machine learning. She has been awarded selection to the COPSS Leadership Academy (2021), an Early Career Grant (ECG) from the Office of Naval Research (2020), an AISTATS Notable Paper Award (2019), an NSF CAREER Award (2018), a Sloan Research Fellowship (2018), an Army Research Office Young Investigator Program (YIP) award (2017), Google Faculty Research Awards, an Amazon Research Award, the ISBA Lifetime Members Junior Researcher Award, the Savage Award (for an outstanding doctoral dissertation in Bayesian theory and methods), the Evelyn Fix Memorial Medal and Citation (for the Ph.D. student on the Berkeley campus showing the greatest promise in statistical research), the Berkeley Fellowship, an NSF Graduate Research Fellowship, a Marshall Scholarship, and the Phi Beta Kappa Prize (for the graduating Princeton senior with the highest academic average).

Complete List: Monthly Seminar Series 2022

We will host a seminar talk on the first Thursday of each month, so save the date! Times may vary based on the time zone and availability of the speaker.

  • February 3rd: Tamara Broderick
  • March 3rd: Sebastian Nowozin
  • April 7th: Cynthia Rudin
  • May 5th: Anna Korba
  • June 16th: Finale Doshi-Velez (Please note the shift in the date!)
  • July 7th: Thomas Dietterich
  • August 4th: Benjamin Bloem-Reddy
  • September 1st: Javier Gonzalez
  • October 6th: TBD
  • November 3rd: TBD
  • December 1st: Lena Maier-Hein