May 6, 2016

New England Machine Learning Day 2016

Location: Cambridge, MA, USA

9:50 – 10:00
Opening remarks

10:00 – 10:30, Bill Freeman, MIT / Google
Learning to see by listening
Children may learn about the world by pushing, banging, and manipulating things, watching and listening as materials make their distinctive sounds: dirt makes a thud; ceramic makes a clink. These sounds reveal physical properties of the objects, as well as the force and motion of the physical interaction. We’ve explored a toy version of that learning-through-interaction by recording audio and video while we hit many things with a drumstick. We developed an algorithm to predict sounds from silent videos of the drumstick interactions. The algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We demonstrate that the sounds generated by our model are realistic enough to fool participants in a “real or fake” psychophysical experiment, and that the task of predicting sounds allows our system to learn about material properties in the scene. Joint work with Andrew Owens, Phillip Isola, Josh McDermott, Antonio Torralba, and Edward H. Adelson. http://arxiv.org/abs/1512.08512; to appear in CVPR 2016.
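As a rough illustration of the example-based synthesis step described in the abstract (a hypothetical sketch, not the authors’ implementation: feature dimensions, the exemplar bank, and snippet lengths are all made up), each predicted sound-feature vector is matched to its nearest exemplar in a training bank, and the exemplars’ waveform snippets are concatenated:

```python
import numpy as np

# Hypothetical exemplar bank: 50 training clips, each with an 8-D
# sound-feature vector and a 100-sample waveform snippet.
rng = np.random.default_rng(0)
bank_features = rng.random((50, 8))
bank_snippets = rng.standard_normal((50, 100))

def synthesize(predicted_features):
    """For each predicted feature vector (one per video frame), find the
    nearest exemplar feature and emit its waveform snippet."""
    out = []
    for f in predicted_features:
        nearest = np.argmin(np.linalg.norm(bank_features - f, axis=1))
        out.append(bank_snippets[nearest])
    return np.concatenate(out)

predicted = rng.random((4, 8))   # stand-in for the RNN's per-frame predictions
waveform = synthesize(predicted)
assert waveform.shape == (4 * 100,)
```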

10:35 – 11:05, Nicolo Fusi, Microsoft Research
Dissecting Genetic Signals Using Gaussian Processes

11:10 – 11:40, Stefanie Jegelka, MIT
Determinantal Point Processes in Machine Learning—old and new ideas
Many machine learning problems are, at their core, subset selection problems. Probabilistic models and practical algorithms for such scenarios rely on having sufficiently accurate yet tractable distributions over discrete sets. As one such example, Determinantal Point Processes (DPPs) have gained popularity in machine learning as elegant probabilistic models of diversity. Yet, their wide applicability has been hindered by computationally expensive sampling algorithms. In this talk, I will outline “old” and new applications of DPPs, and ideas for faster sampling procedures. These procedures build on new insights for algorithms that compute bilinear inverse forms. Our results find applications beyond DPPs, such as submodular maximization for sensing. This is joint work with Chengtao Li, Suvrit Sra, Josip Djolonga and Andreas Krause.
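To make the “diversity” property concrete (a toy sketch, not from the talk; the kernel below is invented), a DPP with kernel L assigns a subset S probability det(L_S) / det(L + I), so subsets of mutually similar items get low probability:

```python
import numpy as np
from itertools import combinations

# Hypothetical similarity kernel over a ground set of 3 items.
# Items 0 and 1 are highly similar (0.9); item 2 is dissimilar to both.
L = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])

def dpp_probability(L, subset):
    """P(S) = det(L_S) / det(L + I): probability the DPP draws exactly S."""
    idx = list(subset)
    L_S = L[np.ix_(idx, idx)]
    return np.linalg.det(L_S) / np.linalg.det(L + np.eye(len(L)))

n = len(L)
probs = {S: dpp_probability(L, S)
         for k in range(n + 1) for S in combinations(range(n), k)}

# Probabilities over all 2^n subsets sum to 1.
assert abs(sum(probs.values()) - 1.0) < 1e-9
# The similar pair {0, 1} is much less likely than the diverse pair {0, 2}.
assert probs[(0, 1)] < probs[(0, 2)]
```

Exact sampling from this distribution normally goes through an eigendecomposition of L, which is what makes it expensive at scale and motivates the faster procedures in the talk.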

11:40 – 1:45
Lunch and posters

1:45 – 2:15, Eugene Charniak, Brown
Syntactic Parsing, a Machine Learning Success Story (whiteboard talk)
Syntactic parsing is one of the great success stories of modern, machine-learning-based natural-language processing. We briefly examine why it is useful in NLP and how it has gone from non-functional to very accurate in the last twenty years or so.

2:20 – 2:50, Lorenzo Orecchia, Boston University
Spectral Graph Algorithms Without Eigenvectors
Classical spectral algorithms for graph problems aim to extract information from the top k eigenvectors with the goal of reducing the dimensionality of the problem on the way to detecting significant features, such as well-separated clusters or dense subsets. Similarly, classical spectral graph theory focuses on the relation between the top eigenvectors of graph matrices and combinatorial quantities of interest, such as conductance and the size of the maximum independent set. For these reasons, eigenvectors are often the main object of study in these fields.

However, eigenvectors are inherently unstable objects. For instance, the top eigenvector of a graph can change completely under very small modifications of the graph, e.g., removal of edges. This is particularly problematic for large-data applications, where the edges of the graph may be noisy and where we may not want to compute a large number of eigenvectors.

In this talk, I will survey how the eigenvector problem can be regularized to construct a convex optimization problem whose optimal solution approximates the eigenvector, while changing smoothly as the instance matrix is modified.
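A small numerical sketch of both points (assuming one common choice of regularizer, the entropy penalty, whose solution is exp(eta*A)/tr(exp(eta*A)); the talk’s exact formulation may differ): a tiny perturbation flips the top eigenvector entirely, while the regularized solution moves only slightly:

```python
import numpy as np

def top_eigvec(A):
    """Unit top eigenvector of a symmetric matrix (eigh sorts ascending)."""
    w, V = np.linalg.eigh(A)
    return V[:, -1]

def regularized_density(A, eta=10.0):
    """Entropy-regularized relaxation of the top-eigenvector problem:
    argmax over PSD X with tr(X)=1 of <A, X> + (1/eta) H(X),
    whose solution exp(eta*A)/tr(exp(eta*A)) varies smoothly with A."""
    w, V = np.linalg.eigh(A)
    e = np.exp(eta * (w - w.max()))      # shift for numerical stability
    return (V * (e / e.sum())) @ V.T     # sum_i p_i v_i v_i^T

# Two symmetric matrices differing by only 0.001 in each diagonal entry.
A1 = np.diag([1.0, 1.001])
A2 = np.diag([1.001, 1.0])

v1, v2 = top_eigvec(A1), top_eigvec(A2)
# The top eigenvectors are orthogonal despite the tiny perturbation...
assert abs(v1 @ v2) < 1e-8
# ...while the regularized solutions remain close.
X1, X2 = regularized_density(A1), regularized_density(A2)
assert np.linalg.norm(X1 - X2) < 0.02
```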

Besides being more robust when the instance graph is noisy, these “regularized eigenvectors” can be used to speed up a number of fundamental spectral algorithms, e.g., to compute balanced partitions of a graph or to sparsify a matrix. At the same time, the smoothness of these objects also allows us to simplify many classical proofs of results in spectral graph theory.

2:50 – 3:20
Coffee break

3:20 – 3:50, Brendan O’Connor, University of Massachusetts Amherst
Measuring social phenomena in news and social media
What can text analysis tell us about society? Corpora of news, social media, and historical documents record events, beliefs, and culture. Natural language processing and machine learning methods hold great promise to better explore this type of data. At the same time, our current NLP methods are confounded with social variables: I’ll preview ongoing work assessing NLP technology for social media messages, which shows disparate effectiveness for texts authored by different demographic groups. This is not surprising given what we know about sociolinguistics, but these phenomena may be less well known to technical practitioners. As the scope of available textual data expands to creative and non-standard language from a wide variety of social groups, we encounter crucial modeling and data collection challenges to ensure effective and equitable language technologies.

3:55 – 4:25, Guy Bresler, MIT
Learning tree-structured Ising models in order to make predictions

4:30 – 5:00, Finale Doshi-Velez, Harvard
Characterizing Non-Identifiability in Non-negative Matrix Factorization
Nonnegative matrix factorization (NMF) is a popular dimension reduction technique that produces an interpretable decomposition of the data into parts. However, this decomposition is often not identifiable, even beyond simple cases of permutation and scaling. Non-identifiability is an important concern in practical data exploration settings, in which the basis of the NMF factorization may be interpreted as having some kind of meaning: it may be important to know that other non-negative characterizations of the data were also possible. While other studies have provided criteria under which NMF is unique, in this talk I’ll discuss when and how an NMF might *not* be unique. Then I’ll discuss some algorithms that leverage these insights to find alternate solutions.
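A toy construction (invented for illustration, not from the talk) makes the non-identifiability concrete: mixing the factors through an invertible matrix Q yields a second nonnegative factorization of the same data that is not a permutation or rescaling of the first.

```python
import numpy as np

# First nonnegative factorization X = W1 @ H1 (values chosen by hand so that
# the mixed factors below also stay nonnegative).
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])
H1 = np.array([[1.0, 2.0, 1.0],
               [1.0, 1.0, 2.0]])
X = W1 @ H1

# Mix the two parts through an invertible (non-permutation) matrix Q.
Q = np.array([[1.0, 0.5],
              [0.5, 1.0]])
W2 = W1 @ Q                      # still entrywise nonnegative
H2 = np.linalg.inv(Q) @ H1       # also nonnegative for this choice of H1

assert np.all(W2 >= 0) and np.all(H2 >= 0)
assert np.allclose(W2 @ H2, X)   # same data, genuinely different "parts"
```

Any NMF algorithm could return either pair, so interpreting the basis columns as meaningful parts implicitly picks one of several valid explanations of the data.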