May 1, 2013

New England Machine Learning Day 2013

10:00 AM–5:00 PM

Location: Cambridge, MA, USA

10:00‑10:05, Jennifer Chayes (MSR)
Opening remarks

10:10‑10:40, Sham Kakade (MSR)
Learning latent structure in documents, social networks, and more…

In many applications, we face the challenge of modeling the interactions between multiple observations and hidden causes; such problems range from document retrieval, where we seek to model the underlying topics, to community detection in social networks. The (unsupervised) learning problem is to accurately estimate the model (e.g., the hidden topics, the underlying clusters, or the hidden communities in a social network) with only samples of the observed variables. In practice, many of these models are fit with local search heuristics. This talk will overview how simple and scalable linear algebra approaches provide closed-form estimation methods for a wide class of these models—including Gaussian mixture models, hidden Markov models, topic models (including latent Dirichlet allocation), and mixed membership models for communities in social networks.
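The full moment-based estimators behind this line of work are beyond a short snippet, but their core spectral step can be sketched in a few lines of numpy: the top singular vectors of a second-moment matrix of documents approximately span the topic subspace, with no local search involved. The two-topic vocabulary below is invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical "topics": distributions over a 6-word vocabulary.
topics = np.array([[0.40, 0.30, 0.20, 0.05, 0.03, 0.02],
                   [0.02, 0.03, 0.05, 0.20, 0.30, 0.40]])

# Generate documents: pick a topic, then draw word counts from it.
n_docs, doc_len = 2000, 200
z = rng.integers(0, 2, size=n_docs)
X = np.array([rng.multinomial(doc_len, topics[k]) for k in z]) / doc_len

# Spectral step: the top singular vectors of the empirical
# second-moment matrix approximately span the topic subspace.
M2 = X.T @ X / n_docs
U, s, _ = np.linalg.svd(M2)
subspace = U[:, :2]

# Each true topic vector should lie close to the recovered subspace.
proj = subspace @ subspace.T
err = max(np.linalg.norm(proj @ t - t) for t in topics)
```

The sharp drop after the second singular value (`s[1]` versus `s[2]`) is what signals that two latent topics suffice; full algorithms in this family then use higher-order moments to recover the individual topic vectors inside this subspace.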

10:45‑11:15, Stefanie Tellex (Brown)
Learning Word Meanings for Human-Robot Interaction

As robots become more powerful and autonomous, it is critical to develop ways for untrained users to quickly and easily tell them what to do. Natural language is a powerful and flexible modality for conveying complex requests, but in order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. I will present approaches to learning these grounded meaning representations from a corpus of natural language sentences paired with a robot’s perceptual model of the environment. The robot can use these learned models to recognize events, follow commands, ask questions, and request help.

11:20‑11:50, Pablo Parrilo (MIT)
From Sparsity to Rank, and Beyond: algebra, geometry, and convexity

Optimization problems involving sparse vectors or low-rank matrices are of great importance in applied mathematics and engineering. They provide a rich and fruitful interaction between algebraic-geometric concepts and convex optimization, with strong synergies with popular techniques like L1 and nuclear norm minimization. In this lecture we will provide a gentle introduction to this exciting research area, highlighting key algebraic-geometric ideas as well as a survey of recent developments, including extensions to very general families of parsimonious models such as sums of a few permutation matrices, low-rank tensors, orthogonal matrices, and atomic measures, as well as the corresponding structure-inducing norms. Based on joint work with Venkat Chandrasekaran, Maryam Fazel, Ben Recht, Sujay Sanghavi, and Alan Willsky.
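As a toy illustration of the nuclear-norm machinery mentioned above (not an example from the talk), singular-value thresholding — the proximal operator of the nuclear norm, and the matrix analogue of the soft-thresholding used in L1 minimization — recovers a low-rank matrix from noisy observations. All data below are synthetic, and the threshold `tau` is chosen by hand.

```python
import numpy as np

rng = np.random.default_rng(1)

# A rank-2 "signal" matrix, observed with small additive noise.
n = 30
L = rng.standard_normal((n, 2)) @ rng.standard_normal((2, n))
Y = L + 0.1 * rng.standard_normal((n, n))

def svt(M, tau):
    """Singular-value thresholding: shrink each singular value
    toward zero by tau, zeroing the small ones.  This is the
    proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# One thresholding step already suppresses the full-rank noise
# while keeping the two dominant (signal) directions.
L_hat = svt(Y, tau=2.0)
rel_err = np.linalg.norm(L_hat - L) / np.linalg.norm(L)
```

The same shrinkage appears as the inner step of iterative algorithms for matrix completion and robust PCA, just as coordinate-wise soft thresholding appears inside L1 solvers.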

11:50‑1:45
Posters and lunch

1:45‑2:15, Erik Sudderth (Brown)
Toward Reliable Bayesian Nonparametric Learning

Applications of Bayesian nonparametrics increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure. While basic inference algorithms such as the Gibbs sampler are easily generalized to such models, in practice they can fail in subtle and hard-to-diagnose ways. We explore this issue via variants of a simple and popular nonparametric Bayesian model, the hierarchical Dirichlet process. By optimizing variational learning objectives in non-traditional ways, we build improved models of text, image, and social network data.

2:20‑2:50, Ryan Adams (Harvard)
Practical Bayesian Optimization of Machine Learning Algorithms

Machine learning algorithms frequently involve careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a “black art” requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. I will describe my recent work on solving this problem with Bayesian nonparametrics, using Gaussian processes. This approach of “Bayesian optimization” models the generalization performance as an unknown objective function with a GP prior. I will discuss new algorithms that account for variable cost in function evaluation and take advantage of parallelism in evaluation. These new algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation for text analysis, structured SVMs for protein motif finding, and convolutional neural networks for visual object recognition.
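The loop described in this abstract can be sketched minimally in numpy, assuming a zero-mean GP with a fixed-length-scale squared-exponential kernel and an expected-improvement acquisition maximized on a grid; the 1-D objective `f` below is an invented stand-in for an expensive validation-loss evaluation, and none of the constants come from the talk.

```python
import numpy as np
from math import erf, sqrt

def f(x):
    # Hypothetical expensive "validation loss" over one hyperparameter.
    return (x - 0.65) ** 2 + 0.1 * np.sin(12 * x)

def kernel(a, b, ell=0.15):
    # Squared-exponential kernel with a fixed length scale.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs, noise=1e-5):
    # Standard GP regression equations for mean and standard deviation.
    K = kernel(X, X) + noise * np.eye(len(X))
    Ks = kernel(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = 1.0 - np.sum(Ks * (Kinv @ Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for minimization: expected amount by which a new point
    # beats the best observed value, under the GP posterior.
    z = (best - mu) / sigma
    Phi = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in z])
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return (best - mu) * Phi + sigma * phi

grid = np.linspace(0, 1, 201)
X = np.array([0.0, 1.0])          # two initial evaluations
y = f(X)
for _ in range(15):
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

x_best = X[np.argmin(y)]
```

Each iteration spends its budget where the posterior predicts either a good value or high uncertainty; the extensions in the talk (evaluation cost, parallelism) modify this acquisition step rather than the GP model.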

2:50‑3:20
Coffee break

3:20‑3:50, Hanna Wallach (UMass)
Machine Learning for Complex Social Processes

From the activities of the US Patent Office or the National Institutes of Health to communications between scientists or political legislators, complex social processes—groups of people interacting with each other in order to achieve specific and sometimes contradictory goals—underlie almost all human endeavor. In order to draw thorough, data-driven conclusions about complex social processes, researchers and decision-makers need new quantitative tools for exploring, explaining, and making predictions using massive collections of interaction data. In this talk, I will discuss the development of machine learning methods for modeling interaction data. I will concentrate on exploratory analysis of communication networks—specifically, discovery and visualization of topic-specific subnetworks in email data sets. I will present a new Bayesian latent variable model of network structure and content and explain how this model can be used to analyze intra-governmental email networks.

3:55‑4:25, Cynthia Rudin (MIT)
ML for the Future: Healthcare, Energy, and the Internet

I will overview recent applications of ML to some of society’s critical domains, including healthcare, energy grid reliability, and information retrieval. Specifically:
1) Stroke risk prediction in medical patients, using ML techniques for interpretable predictive modeling.
2) Energy grid reliability in New York City, using point process models.
3) Growing a list using the Internet, using clustering techniques.
These applications show how real-world problems can drive the development of effective new ML techniques.
Collaborators: Ben Letham, Seyda Ertekin, Tyler McCormick, David Madigan, and Katherine Heller

4:30‑5:00, Antonio Torralba (MIT)
Who is to blame in object detection failures?