Selected Presentations
Upcoming
- 2018 September 15--16. FoMICS-DADSi Summer School on Data Assimilation, Lugano, Switzerland.
- 2018 September 25. Systems and Technology Research, Woburn, MA.
- 2018 September 28. Instituto Tecnológico Autónomo de México (ITAM), Mexico City, Mexico.
- 2018 October 1. 33 Foro Nacional de Estadística (FNE) y 13 Congreso Latinoamericano de Sociedades de Estadística (CLATSE), Guadalajara, Mexico.
- 2018 October 4. Allerton Conference, University of Illinois at Urbana-Champaign, USA.
- 2018 October 14. Computational and Systems Biology Retreat, Maine, USA.
- 2018 October 15. IDS.900, MIT, Massachusetts, USA.
- 2018 October 17. Yale University, Connecticut, USA.
- 2018 October 26. Johns Hopkins University, Maryland, USA.
- 2018 October 30. Northeastern University, Massachusetts, USA.
- 2018 November 9. University of Michigan, USA.
- 2018 November 20. University of Tübingen, Germany.
- 2018 November 23. Lancaster University, UK.
- 2018 November 25. Workshop: Young Bayesians and Big Data for Social Good, CIRM, Marseille Luminy, France.
- 2018 November 29. Conference: Bayesian Statistics in the Big Data Era, CIRM, Marseille Luminy, France.
- 2018 December 2. Symposium on Advances in Approximate Bayesian Inference, Montreal, Canada.
- 2018 December 10. National Research Council Canada, Ottawa, Canada.
- 2018 December 14. CMStatistics, Pisa, Italy.
- 2019 March 28. Machine Learning Advances and Applications Seminar, Fields Institute for Research in Mathematical Sciences, Toronto, Canada.
- 2019 June 24--28. Keynote, 12th International Conference on Bayesian Nonparametrics (BNP12), Oxford, UK.
Research presentations
- Automated Scalable Bayesian Inference via Data Summarization.
[abstract]
The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical relationships, uncertainty quantification, and prior specification these methods provide. Standard Bayesian inference algorithms are often computationally expensive, however, so their direct application to large datasets can be difficult or infeasible. Other standard algorithms sacrifice accuracy in the pursuit of scalability. We take a new approach. Namely, we leverage the insight that data often exhibit approximate redundancies to instead obtain a weighted subset of the data (called a "coreset") that is much smaller than the original dataset. We can then use this small coreset in existing Bayesian inference algorithms without modification. We provide theoretical guarantees on the size and approximation quality of the coreset. In particular, we show that our method provides geometric decay in posterior approximation error as a function of coreset size. We validate on both synthetic and real datasets, demonstrating that our method reduces posterior approximation error by orders of magnitude relative to uniform random subsampling.
- [slides pdf]
- [video: 2018 November 9]. Michigan Institute for Data Science (MIDAS), University of Michigan, USA.
- Fast Quantification of Uncertainty and Robustness with Variational Bayes.
[abstract]
In Bayesian analysis, the posterior follows from the data and a choice of a prior and a likelihood. These choices may be somewhat subjective and may reasonably vary over some range. Thus, we wish to measure the sensitivity of posterior estimates to variation in these choices. While the field of robust Bayes has been formed to address this problem, its tools are not commonly used in practice. We demonstrate that variational Bayes (VB) techniques are readily amenable to robustness analysis. Since VB casts posterior inference as an optimization problem, its methodology is built on the ability to calculate derivatives of posterior quantities with respect to model parameters. We use this insight to develop local prior robustness measures for mean-field variational Bayes (MFVB), a particularly popular form of VB due to its fast runtime on large data sets. MFVB has a well-known major failing, however: it can severely underestimate uncertainty and provides no information about covariance. We generalize linear response methods from statistical physics to deliver accurate uncertainty estimates for MFVB---both for individual variables and coherently across variables. We call our method linear response variational Bayes (LRVB).
- [slides pdf]
- [video: 2017 August 23]. Microsoft Research New England.
- [video: 2016 October 17]. Biostatistics-Biomedical Informatics Big Data (B3D) Seminar, Harvard University, USA.
- [video: 2016 August 23]. 2016 Big Data Conference & Workshop, Harvard, Cambridge, Massachusetts, USA.
- Posteriors, conjugacy, and exponential families for completely random measures.
[abstract]
We demonstrate how to calculate posteriors for general Bayesian nonparametric priors and likelihoods based on completely random measures (CRMs). We further show how to represent Bayesian nonparametric priors as a sequence of finite draws using a size-biasing approach---and how to represent full Bayesian nonparametric models via finite marginals. Motivated by conjugate priors based on exponential family representations of likelihoods, we introduce a notion of exponential families for CRMs, which we call exponential CRMs. This construction allows us to specify automatic Bayesian nonparametric conjugate priors for exponential CRM likelihoods. We demonstrate that our exponential CRMs allow particularly straightforward recipes for size-biased and marginal representations of Bayesian nonparametric models. Along the way, we prove that the gamma process is a conjugate prior for the Poisson likelihood process and the beta prime process is a conjugate prior for a process we call the odds Bernoulli process. We deliver a size-biased representation of the gamma process and a marginal representation of the gamma process coupled with a Poisson likelihood process.
- Feature allocations, probability functions, and paintboxes.
[abstract]
The problem of inferring a clustering of a data set has been the subject of much research in Bayesian analysis, and there currently exists a solid mathematical foundation for Bayesian approaches to clustering. In particular, the class of probability distributions over partitions of a data set has been characterized in a number of ways, including via exchangeable partition probability functions (EPPFs) and the Kingman paintbox. Here, we develop a generalization of the clustering problem, called feature allocation, where we allow each data point to belong to an arbitrary, non-negative integer number of groups, now called features or topics. We define and study an "exchangeable feature probability function" (EFPF)---analogous to the EPPF in the clustering setting---for certain types of feature models. Moreover, we introduce a "feature paintbox" characterization---analogous to the Kingman paintbox for clustering---of the class of exchangeable feature models. We use this feature paintbox construction to provide a further characterization of the subclass of feature allocations that have EFPF representations.
- [slides pdf]
- [video: 2015 February 25]. Harvard Applied Statistics Workshop, Harvard, Cambridge, Massachusetts, USA.
- [video: 2014 October 15]. Redwood Center for Theoretical Neuroscience, UC Berkeley, Berkeley, California, USA.
- Streaming variational Bayes.
[abstract]
We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior. The framework makes streaming updates to the estimated posterior according to a user-specified approximation batch primitive. We demonstrate the usefulness of our framework, with variational Bayes (VB) as the primitive, by fitting the latent Dirichlet allocation model to two large-scale document collections. We demonstrate the advantages of our algorithm over stochastic variational inference (SVI) by comparing the two after a single pass through a known amount of data---a case where SVI may be applied---and in the streaming setting, where SVI does not apply.
- Clusters and features from combinatorial stochastic processes.
[abstract]
In partitioning---a.k.a. clustering---data, we associate each data point with one and only one of some collection of groups called clusters or partition blocks. Here, we formally establish an analogous problem, called feature allocation, for associating data points with arbitrary non-negative integer numbers of groups, now called features or topics. Just as the exchangeable partition probability function (EPPF) can be used to describe the distribution of cluster membership under an exchangeable clustering model, we examine an analogous "exchangeable feature probability function" (EFPF) for certain types of feature models. Moreover, recalling Kingman's paintbox theorem as a characterization of the class of exchangeable clustering models, we develop a similar "feature paintbox" characterization of the class of exchangeable feature models. We use this feature paintbox construction to provide a further characterization of the subclass of feature allocations that have EFPF representations. We examine models such as the Bayesian nonparametric Indian buffet process as examples within these broader classes.
- [video: 2012 September 20]. Bayesian Nonparametrics, ICERM Semester Program on Computational Challenges in Probability, Brown University, Providence, Rhode Island, USA.
- MAD-Bayes: MAP-based asymptotic derivations from Bayes.
[abstract]
The classical mixture of Gaussians model is related to K-means via small-variance asymptotics: as the covariances of the Gaussians tend to zero, the negative log-likelihood of the mixture of Gaussians model approaches the K-means objective, and the EM algorithm approaches the K-means algorithm. Kulis & Jordan (2012) used this observation to obtain a novel K-means-like algorithm from a Gibbs sampler for the Dirichlet process (DP) mixture. We instead consider applying small-variance asymptotics directly to the posterior in Bayesian nonparametric models. This framework is independent of any specific Bayesian inference algorithm, and it has the major advantage that it generalizes immediately to a range of models beyond the DP mixture. To illustrate, we apply our framework to the feature learning setting, where the beta process and Indian buffet process provide an appropriate Bayesian nonparametric prior. We obtain a novel objective function that goes beyond clustering to learn (and penalize new) groupings for which we relax the mutual exclusivity and exhaustivity assumptions of clustering. We demonstrate several other algorithms, all of which are scalable and simple to implement. Empirical results demonstrate the benefits of the new framework.
- [slides pdf]
- [poster pdf]
- [video: 2013 June 17]. 30th International Conference on Machine Learning (ICML 2013), Atlanta, Georgia, USA.
- User-friendly conjugacy for completely random measures.
- [slides pdf] Joint Statistical Meetings (JSM) 2013, Montreal, Canada.
- Fast and flexible selection with a single switch.
- [video: 2009 December 10]. Mini-Symposia on Assistive Machine Learning for People with Disabilities, Neural Information Processing Systems (NIPS) 2009, Vancouver, British Columbia, Canada.
- [video] Nomon keyboard tutorial.
- [video] Example sentence written using Nomon.
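The small-variance limit described in the MAD-Bayes abstract above can be illustrated with a short sketch. This is not code from the talk or the paper. It assumes an equal-weight mixture of spherical Gaussians with a shared variance `sigma2`, and the function names `responsibilities` and `kmeans_assignment`, the point `x`, and the two centers are hypothetical, chosen only for illustration. As `sigma2` shrinks, the soft E-step weights for a point concentrate on its nearest center, which is the hard assignment that K-means makes.

```python
import math


def squared_distances(x, centers):
    """Squared Euclidean distance from point x to each center."""
    return [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]


def responsibilities(x, centers, sigma2):
    """Soft E-step weights for x under an equal-weight mixture of spherical
    Gaussians with shared variance sigma2 (an assumed, simplified setup)."""
    d2 = squared_distances(x, centers)
    # Shift by the smallest distance before exponentiating so the largest
    # weight is exactly 1 and tiny variances do not cause division by zero.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma2)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def kmeans_assignment(x, centers):
    """Hard K-means assignment: index of the nearest center."""
    d2 = squared_distances(x, centers)
    return d2.index(min(d2))


if __name__ == "__main__":
    x = (1.0, 0.0)                       # hypothetical data point
    centers = [(0.0, 0.0), (3.0, 0.0)]   # hypothetical cluster centers
    for sigma2 in (10.0, 1.0, 0.1, 0.001):
        print(sigma2, responsibilities(x, centers, sigma2))
    print("K-means assignment:", kmeans_assignment(x, centers))
```

Running the script prints the weights for progressively smaller variances next to the K-means assignment, so the hardening of the soft assignment can be read off directly. The sketch covers only the classical Gaussian-mixture case mentioned in the abstract, not the beta process or Indian buffet process objectives.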