January 31, 2019 - February 1, 2019

Swiss Joint Research Center Workshop 2019

Location: Zürich, Switzerland

  • EPFL PI: Tobias J. Kippenberg; Microsoft PI: Hitesh Ballani

    The substantial growth of optical data transmission and cloud computing has fueled research into new technologies that can increase communication capacity. Optical fiber communication, traditionally reserved for long-haul links, is now also employed for short-haul communication, even within data centers. At the same time, the growing capacity crunch in optical fibers, driven in particular by video streaming, can only be met through two degrees of freedom: spatial and wavelength division multiplexing. Spatial multiplexing refers to the use of optical fibers with multiple cores, allowing the same carrier wavelength to be transmitted through several cores in parallel. Wavelength division multiplexing (WDM, or dense WDM, DWDM) refers to the use of multiple optical carriers on the same fiber. A key advantage of WDM is that it increases line rates on existing legacy networks without requiring the installed SMF28 single-mode fibers to be replaced. WDM is also expected to be employed in data centers. Yet to date, WDM deployment within data centers faces a key challenge: the lack of a CMOS-compatible, power-efficient source of multiple wavelengths. Existing solutions, such as InP-based multi-laser chips (as developed by Infinera), cannot readily be scaled to a larger number of carriers. As a result, the prevalent solution today is a bank of individual laser modules, an approach that is not viable for data centers due to space and power constraints. Over the past years, a new technology developed at EPFL has rapidly matured: microresonator frequency combs, or microcombs, which satisfy these requirements. Their potential for telecommunications has recently been demonstrated by using microcombs for massively parallel coherent communication on both the transmitter and receiver side. Yet to date, the use of such microcombs in data centers has not been addressed. (A back-of-the-envelope WDM capacity sketch follows the references below.)

    1. Kippenberg, T. J., Gaeta, A. L., Lipson, M. & Gorodetsky, M. L. Dissipative Kerr solitons in optical microresonators. Science 361, eaan8083 (2018).
    2. Brasch, V. et al. Photonic chip–based optical frequency comb using soliton Cherenkov radiation. Science 351, 357–360 (2016). doi:10.1126/science.aad4811
    3. Marin-Palomo, P. et al. Microresonator-based solitons for massively parallel coherent optical communications. Nature 546, 274–279 (2017).
    4. Trocha, P. et al. Ultrafast optical ranging using microresonator soliton frequency combs. Science 359, 887–891 (2018).
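
    As a rough illustration of why a scalable multi-wavelength source matters: the aggregate capacity of a WDM link grows linearly with the number of carriers. The short sketch below uses illustrative, assumed numbers (comb-line count, symbol rate, modulation format), not measured project results.

    ```python
    # Back-of-the-envelope WDM capacity estimate (illustrative, assumed numbers).
    num_carriers = 96        # assumed number of usable comb lines / WDM channels
    symbol_rate_gbaud = 40   # assumed symbol rate per carrier (GBd)
    bits_per_symbol = 4      # e.g. 16-QAM carries 4 bits per symbol
    polarizations = 2        # dual-polarization transmission doubles the rate

    per_carrier_gbps = symbol_rate_gbaud * bits_per_symbol * polarizations
    aggregate_tbps = num_carriers * per_carrier_gbps / 1000

    print(f"Per-carrier rate: {per_carrier_gbps} Gb/s")
    print(f"Aggregate line rate: {aggregate_tbps:.2f} Tb/s")
    ```
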
  • ETH Zurich PI: Juan Gómez Luna

    Data movement between storage/memory and compute units is an increasingly critical bottleneck for overall system performance, scalability, and energy efficiency. Near-Data Processing (including Processing-in-Memory and Near-Storage Processing) is a promising paradigm for significantly alleviating this bottleneck by placing computation closer to where the data resides. The Near-Data Processing paradigm is becoming a reality with a variety of new substrates, such as 3D-stacked DRAM, in-DRAM analog computation, and Open-Channel SSDs. However, what characteristics make an application a good fit for Near-Data Processing remains an open question.

    In this talk, we will present the first large-scale characterization of over 345 applications to identify the program characteristics that determine suitability for processing near data. Understanding these characteristics and the underlying computing substrates also allows us to design new algorithmic expressions of data-intensive workloads that leverage the Near-Data Processing paradigm. For instance, one of the most fundamental computational steps in bioinformatics is the detection of differences and similarities between two genomic sequences. We can express this similarity measurement as bulk logic and arithmetic operations, which are a perfect fit for analog computation within DRAM. Finally, we will outline some of our next research directions to enable computation in the context of tiering mechanisms for memory and storage (e.g., specialized DRAM, Open-Channel SSDs, Storage-Class Memories, tiered NVM devices).
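
    To make the genomics example concrete, here is a minimal sketch, entirely our own illustration rather than the actual in-DRAM implementation, of expressing a similarity measurement between two binary-encoded sequences as bulk bitwise operations; these are exactly the kind of row-wide operations that in-DRAM analog computation can execute without moving the data to the CPU.

    ```python
    import numpy as np

    def bitwise_similarity(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
        """Fraction of matching bits between two binary-encoded sequences.

        XOR marks differing bit positions; counting them gives the number of
        mismatches. In a Processing-in-Memory substrate, the bulk XOR over
        entire DRAM rows would run inside the memory array itself.
        """
        diff = np.bitwise_xor(seq_a, seq_b)      # bulk XOR over the whole sequence
        mismatches = int(np.unpackbits(diff).sum())
        return 1.0 - mismatches / (seq_a.size * 8)

    # Toy usage: two 1 KiB bit vectors that differ in a short prefix.
    rng = np.random.default_rng(0)
    a = rng.integers(0, 256, size=1024, dtype=np.uint8)
    b = a.copy()
    b[:64] = rng.integers(0, 256, size=64, dtype=np.uint8)
    print(f"similarity = {bitwise_similarity(a, b):.3f}")
    ```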

  • EPFL PIs: Robert West, Arnaud Chiolero, Magali Rios-Leyvraz; Microsoft PIs: Ryen White, Eric Horvitz, Emre Kiciman 

    The overall goal of this project is to develop methods for monitoring, modeling, and modifying dietary habits and nutrition based on large-scale digital traces. We will leverage data from both EPFL and Microsoft to shed light on dietary habits from different angles and at different scales. Our team has access to logs of food purchases made on the EPFL campus with the badges carried by all EPFL members. Via the Microsoft collaborators involved, we have access to Web usage logs from IE/Edge and Bing, and via MSR’s subscription to the Twitter firehose, we gain full access to a major social media platform. Our agenda broadly decomposes into three sets of research questions: (1) Monitoring and modeling: How can we mine digital traces for spatiotemporal variation in dietary habits? What nutritional patterns emerge? And how do they relate to, and expand, the current state of research in nutrition? (2) Quantifying and correcting biases: The log data does not directly capture food consumption but provides indirect proxies; these are likely to be affected by data biases, and correcting for those biases will be an integral part of this project. (3) Modifying dietary habits: Our lab is co-organizing an annual EPFL-wide event called the Act4Change challenge, whose goal is to foster healthy and sustainable habits on the EPFL campus. Our close involvement with Act4Change will allow us to validate our methods and findings on the ground via surveys and A/B tests. Applications of our work will include new methods for conducting population-level nutrition monitoring, recommending better-personalized eating practices, optimizing food offerings, and minimizing food waste.
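
    As one concrete, hypothetical illustration of the bias-correction step: campus purchase logs over-represent whoever badges most often, and a standard remedy is to reweight log-based estimates by known population shares. The post-stratification sketch below uses made-up groups and numbers purely for illustration.

    ```python
    import pandas as pd

    # Hypothetical badge-purchase log: one row per purchase.
    purchases = pd.DataFrame({
        "group":    ["student", "student", "staff", "staff", "staff", "faculty"],
        "calories": [650, 820, 540, 610, 700, 580],
    })

    # Assumed true population shares on campus (e.g. from HR statistics).
    population_share = {"student": 0.70, "staff": 0.20, "faculty": 0.10}

    # Share of purchases per group actually observed in the log (the biased proxy).
    log_share = purchases["group"].value_counts(normalize=True)

    # Post-stratification weight: how under- or over-represented each group is.
    weights = purchases["group"].map(lambda g: population_share[g] / log_share[g])

    naive_mean = purchases["calories"].mean()
    weighted_mean = (purchases["calories"] * weights).sum() / weights.sum()
    print(f"naive mean: {naive_mean:.0f} kcal, reweighted mean: {weighted_mean:.0f} kcal")
    ```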

  • EPFL PIs: Marios Kogias, Edouard Bugnion; Microsoft PIs: Irene Zhang, Dan Ports

    The deployment of a web-scale application within a datacenter can comprise hundreds of software components, deployed on thousands of servers organized in multiple tiers and interconnected by commodity Ethernet switches. These components communicate with each other via Remote Procedure Calls (RPCs), with the cost of an individual RPC service typically measured in microseconds. The end-user performance, availability, and overall efficiency of the entire system depend largely on the efficient delivery and scheduling of these RPCs. Yet, RPCs are ubiquitously deployed today on top of general-purpose transport protocols such as TCP.

    We propose to make RPCs first-class citizens of datacenter deployments. This requires revisiting the overall architecture, application APIs, and network protocols. Our research direction is based on a novel RPC-oriented protocol, R2P2, which separates control flow from data flow and provides in-network scheduling opportunities to tame tail latency. We are also building the tools necessary to scientifically evaluate microsecond-scale services.
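
    To illustrate why scheduling matters at microsecond scale, the toy simulation below compares the 99th-percentile response time of random dispatch against a join-shortest-queue policy; it is only a stand-in for the kind of in-network scheduling decision an RPC-aware transport could take, not a model of R2P2 itself.

    ```python
    import random

    def simulate(policy: str, n_requests: int = 20000, n_workers: int = 16) -> float:
        """Return the 99th-percentile response time of a tiny dispatch simulation."""
        random.seed(42)
        free_at = [0.0] * n_workers          # time at which each worker becomes idle
        latencies = []
        arrival = 0.0
        for _ in range(n_requests):
            arrival += random.expovariate(n_workers * 0.8)   # ~80% offered load
            if policy == "jsq":              # join-shortest-queue dispatch
                w = min(range(n_workers), key=lambda i: free_at[i])
            else:                            # random dispatch
                w = random.randrange(n_workers)
            start = max(arrival, free_at[w])
            free_at[w] = start + random.expovariate(1.0)     # service time
            latencies.append(free_at[w] - arrival)
        latencies.sort()
        return latencies[int(0.99 * len(latencies))]

    print("p99 latency, random dispatch    :", round(simulate("random"), 2))
    print("p99 latency, join-shortest-queue:", round(simulate("jsq"), 2))
    ```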

  • EPFL PIs: Pascal Fua, Mathieu Salzmann, Helge Rhodin; Microsoft PIs: Bugra Tekin, Sudipta Sinha, Federica Bogo, Marc Pollefeys

    In recent years, there has been tremendous progress in camera-based 6D object pose, hand pose, and human 3D pose estimation. These can now be done in real time, but not yet at the level of accuracy required to properly capture how people interact with each other and with objects, which is a crucial component of modeling the world in which we live. For example, when someone grasps an object, types on a keyboard, or shakes someone else’s hand, the position of their fingers with respect to what they are interacting with must be precisely recovered for the resulting models to be usable by AR devices, such as the HoloLens or consumer-level video see-through AR headsets. This remains a challenge, especially given that hands are often severely occluded in the egocentric views that are the norm in AR.

    We will, therefore, work on accurately capturing the interaction between hands and the objects they touch and manipulate. At the heart of this effort will be the precise modeling of contact points and the resulting physical forces between interacting hands and objects. This is essential for two reasons. First, objects in contact exert forces on each other; their pose and motion can only be accurately captured and understood if reaction forces at contact points and areas are modeled jointly. Second, touch and touch-force devices, such as keyboards and touch-screens, are the most common human-computer interfaces, and by sensing contact and contact forces purely visually, everyday objects could be turned into tangible interfaces that react as if they were equipped with touch-sensitive electronics. For instance, a soft cushion could become a non-intrusive input device that, unlike virtual mid-air menus, provides natural force feedback.
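
    As a deliberately simplified, hypothetical illustration of the contact-modeling component, the sketch below merely flags which estimated fingertip positions lie within a small distance of an object's surface points; actually recovering contact areas and forces under occlusion is the research question itself.

    ```python
    import numpy as np

    def detect_contacts(fingertips: np.ndarray, object_points: np.ndarray,
                        threshold_m: float = 0.005) -> np.ndarray:
        """Boolean flag per fingertip: True if it lies within `threshold_m`
        of the closest object surface point (inputs are N x 3 arrays, in metres)."""
        diff = fingertips[:, None, :] - object_points[None, :, :]
        dists = np.linalg.norm(diff, axis=-1)      # pairwise fingertip-to-point distances
        return dists.min(axis=1) <= threshold_m

    # Toy usage with hypothetical data: five fingertips and a coarsely sampled object.
    fingertips = np.array([[0.00, 0.00, 0.00],
                           [0.05, 0.00, 0.00],
                           [0.10, 0.02, 0.01],
                           [0.20, 0.20, 0.20],
                           [0.00, 0.05, 0.00]])
    surface = np.random.default_rng(1).uniform(-0.05, 0.12, size=(500, 3))
    print(detect_contacts(fingertips, surface, threshold_m=0.01))
    ```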

    In this talk, I will present some of our preliminary results and discuss our research agenda for the year to come.

  • ETH Zurich PI: Andreas Krause; Microsoft PI: Sebastian Tschiatschek

    Reinforcement learning (RL) is a promising paradigm in machine learning that has gained considerable attention in recent years, partly because of its successful application to previously unsolved challenging games like Go and Atari. While these are impressive results, applying reinforcement learning in most other domains, e.g. virtual personal assistants, self-driving cars, or robotics, remains challenging. One key reason for this is the difficulty of specifying the reward function a reinforcement learning agent is intended to optimize. For instance, in a virtual personal assistant, the reward function might correspond to the user’s satisfaction with the assistant’s behavior and is difficult to specify as a function of the observations (e.g. sensory information) available to the system. In such applications, an alternative to specifying the reward function is to directly query the user for the reward. This, however, is only feasible if the number of queries to the user is limited and the user’s responses can be provided in a natural way, so that the system’s queries are not irritating. Similar problems arise in other application domains such as robotics, in which, for instance, the true reward can only be obtained by actually deploying the robot, while an approximation to the reward can be computed by a simulator. In this case, it is important to optimize the agent’s behavior while simultaneously minimizing the number of costly deployments. This project’s aim is to develop algorithms for these types of problems via scalable active reward learning for reinforcement learning. The project’s focus is on scalability in terms of computational complexity (to scale to large real-world problems) and sample complexity (to minimize the number of costly queries).
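
    A minimal sketch of the core idea under our own simplifying assumptions (a bandit-style setting, a sample-mean reward model per action, and a fixed query budget): the agent asks the user for the true reward only while its estimate for the chosen action is still uncertain, and otherwise relies on the learned reward model.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    true_reward = np.array([0.2, 0.5, 0.8, 0.4])   # hidden from the agent (toy task)
    n_actions = len(true_reward)

    counts = np.zeros(n_actions)    # how often each action's reward has been queried
    means = np.zeros(n_actions)     # learned reward model (running sample means)
    query_budget, queries_used = 30, 0

    def ask_user(action: int) -> float:
        """Stand-in for querying the user: returns a noisy reward observation."""
        return float(true_reward[action] + rng.normal(0, 0.1))

    for step in range(500):
        # Prefer actions whose reward estimate is still uncertain (rarely queried).
        bonus = 1.0 / np.sqrt(counts + 1e-6)
        action = int(np.argmax(means + 0.3 * bonus))

        # Query the user only while this action is uncertain and budget remains.
        if counts[action] < 5 and queries_used < query_budget:
            r = ask_user(action)
            counts[action] += 1
            means[action] += (r - means[action]) / counts[action]
            queries_used += 1

    print("estimated rewards:", np.round(means, 2))
    print("best action:", int(np.argmax(means)), "| user queries used:", queries_used)
    ```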

  • ETH Zurich PIs: Roland Siegwart, Nicholas Lawrance, Jen Jen Chung; Microsoft PIs: Andrey Kolobov, Debadeepta Dey

    A major factor restricting the utility of UAVs is the amount of energy they can carry on board, which limits the duration of their flights. Birds face largely the same problem, but they are adept at using their vision to spot and exploit opportunities for extracting extra energy from the air around them. Project Altair aims at developing infrared (IR) sensing techniques for detecting, mapping, and exploiting naturally occurring atmospheric updrafts called thermals to extend the flight endurance of fixed-wing UAVs. In this presentation, we will introduce our vision and goals for this project.
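
    As a purely illustrative sketch, and not the project's actual pipeline, one simple way to propose thermal candidates from a downward-looking IR image is to threshold unusually warm ground patches and report the centroid of each warm region:

    ```python
    import numpy as np
    from scipy import ndimage

    def thermal_candidates(ir_image: np.ndarray, z_thresh: float = 2.0):
        """Return (row, col) centroids of regions much warmer than the scene mean.

        `ir_image` is a 2D array of brightness temperatures; pixels more than
        `z_thresh` standard deviations above the mean form candidate regions.
        """
        warm = ir_image > ir_image.mean() + z_thresh * ir_image.std()
        labels, n = ndimage.label(warm)
        return ndimage.center_of_mass(ir_image, labels, list(range(1, n + 1)))

    # Toy usage: a cool field with one sun-heated patch (hypothetical data).
    img = np.full((64, 64), 290.0) + np.random.default_rng(0).normal(0, 0.5, (64, 64))
    img[20:28, 40:48] += 8.0
    print(thermal_candidates(img))
    ```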

  • ETH Zurich PIs: Torsten Hoefler, Renato Renner; Microsoft PIs: Matthias Troyer, Martin Roetteler

    QIRO will establish a new intermediate representation for compilation systems targeting quantum computers. Since quantum computation is still emerging, I will provide an introduction to the general concepts of quantum computation and a brief discussion of its strengths and weaknesses from a high-performance computing perspective. This talk is tailored to a computer science audience with basic (popular-science) or no background in quantum mechanics and will focus on the computational aspects. I will also discuss systems aspects of quantum computers and how to map quantum algorithms to their high-level architecture. I will close with the principles of practical implementations of quantum computers and an outline of the project.
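
    To give a computer-science audience a feel for what an intermediate representation for quantum programs can look like, here is a toy sketch of our own devising, not QIRO's actual design: a circuit represented as a list of gate operations, plus a trivial optimization pass that cancels adjacent self-inverse gates.

    ```python
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Gate:
        name: str       # e.g. "H", "X", "CNOT"
        qubits: tuple   # indices of the qubits the gate acts on

    SELF_INVERSE = {"H", "X", "Z", "CNOT"}

    def cancel_adjacent_inverses(circuit: list) -> list:
        """Remove pairs of identical, adjacent self-inverse gates on the same qubits."""
        out = []
        for g in circuit:
            if out and out[-1] == g and g.name in SELF_INVERSE:
                out.pop()            # G followed by G is the identity
            else:
                out.append(g)
        return out

    # Toy program: H q0; H q0; CNOT q0,q1; X q1  ->  the two H gates cancel.
    prog = [Gate("H", (0,)), Gate("H", (0,)), Gate("CNOT", (0, 1)), Gate("X", (1,))]
    print(cancel_adjacent_inverses(prog))
    ```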

  • ETH Zurich PIs: Stelian Coros, Roi Poranne; Microsoft PIs: Federica Bogo, Bugra Tekin, Marc Pollefeys

    With this project, we aim to accelerate the development of intelligent robots that can assist those in need with a variety of everyday tasks. People suffering from physical impairments, for example, often need help dressing or brushing their own hair. Skilled robotic assistants would allow these persons to live an independent lifestyle. Even such seemingly simple tasks, however, require complex manipulation of physical objects, advanced motion-planning capabilities, and close interaction with human subjects. We believe the key to robots being able to undertake such societally important functions is learning from demonstration. The fundamental research question is, therefore: how can we enable human operators to seamlessly teach a robot to perform complex tasks? The answer, we argue, lies in immersive telemanipulation. More specifically, we are inspired by the vision of James Cameron’s Avatar, where humans are endowed with alternative embodiments. In such a setting, the human’s intent must be seamlessly mapped to the motions of a robot as the human operator becomes completely immersed in the environment the robot operates in. To achieve this ambitious vision, many technologies must come together: mixed reality as the medium for robot-human communication, perception and action recognition to detect the intent of both the human operator and the human patient, motion-retargeting techniques to map the actions of the human to the robot’s motions, and physics-based models to enable the robot to predict and understand the implications of its actions.
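
    As a deliberately simplified, hypothetical illustration of the motion-retargeting step mentioned above, the sketch below maps human joint angles onto a robot arm with different joint limits; this is only one small ingredient of the full immersive-telemanipulation problem.

    ```python
    import numpy as np

    # Hypothetical joint limits (radians) for a human-arm model and a robot arm.
    HUMAN_LIMITS = np.array([[-2.9, 2.9], [-1.8, 1.8], [-2.6, 2.6], [0.0, 2.5]])
    ROBOT_LIMITS = np.array([[-2.0, 2.0], [-1.2, 1.2], [-2.0, 2.0], [0.0, 2.2]])

    def retarget(human_angles: np.ndarray) -> np.ndarray:
        """Map human joint angles to robot joint angles, joint by joint.

        Each angle is normalised within the human joint's range, re-expanded
        into the robot joint's range, and clipped to the robot's limits.
        """
        lo_h, hi_h = HUMAN_LIMITS[:, 0], HUMAN_LIMITS[:, 1]
        lo_r, hi_r = ROBOT_LIMITS[:, 0], ROBOT_LIMITS[:, 1]
        normalised = (human_angles - lo_h) / (hi_h - lo_h)
        return np.clip(lo_r + normalised * (hi_r - lo_r), lo_r, hi_r)

    print(retarget(np.array([1.5, -0.9, 0.3, 2.4])))
    ```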

  • ETH Zurich PIs: Roland Siegwart, Cesar Cadena, Juan Nieto; Microsoft PIs: Johannes Schönberger, Marc Pollefeys

    AR/VR allow new and innovative ways of visualizing information and provide a very intuitive interface for interaction. At their core, they rely only on a camera and inertial measurement unit (IMU) setup or a stereo-vision setup to provide the necessary data, either of which is readily available on most commercial mobile devices. Early adoptions of this technology have already been deployed in the real estate business, sports, gaming, retail, tourism, transportation, and many other fields. The current technologies for visual-aided motion estimation and mapping on mobile devices have three main requirements to produce highly accurate 3D metric reconstructions:

    1. An accurate spatial and temporal calibration of the sensor suite, a procedure which is typically carried out with the help of external infrastructure, like calibration markers, and by following a set of predefined movements.
    2. Well-lit, textured environments and feature-rich, smooth trajectories.
    3. The continuous and reliable operation of all sensors involved.

    This project aims to relax these requirements in order to enable continuous and robust lifelong mapping on end-user mobile devices. The specific objectives of this work are to:

    1. Formalize a modular and adaptable multi-modal sensor fusion framework for online map generation;
    2. Improve the robustness of mapping and motion estimation by exploiting high-level semantic features;
    3. Develop techniques for the automatic detection and execution of sensor calibration in the wild.

    A modular SLAM (simultaneous localization and mapping) pipeline that is able to exploit all available sensing modalities can overcome the individual limitations of each sensor and increase the overall robustness of the estimation. Such an information-rich map representation allows us to leverage recent advances in semantic scene understanding, providing an abstraction from low-level geometric features, which are fragile to noise, sensing conditions, and small changes in the environment, to higher-level semantic features that are robust against these effects. Using this complete map representation, we will explore new ways to detect miscalibrations and sensor failures, so that the SLAM process can be adapted online without the need for explicit user intervention.
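
    One concrete, hypothetical way to phrase objective 3 is as online monitoring of per-sensor residuals: if a sensor's reprojection or innovation errors drift far above their running statistics, the pipeline flags it for recalibration. The sketch below shows that idea in its simplest form; a real SLAM system would use properly whitened innovations and statistical tests.

    ```python
    import numpy as np

    class ResidualMonitor:
        """Flags a sensor whose latest residual drifts far above its running statistics."""

        def __init__(self, window: int = 200, z_thresh: float = 4.0):
            self.window, self.z_thresh = window, z_thresh
            self.history = []

        def update(self, residual: float) -> bool:
            self.history.append(residual)
            if len(self.history) < 30:                   # need a baseline first
                return False
            baseline = self.history[-self.window:-1]
            mean, std = np.mean(baseline), np.std(baseline) + 1e-9
            return (residual - mean) / std > self.z_thresh   # True -> recalibrate

    # Toy usage: well-calibrated residuals, then a sudden bias after "miscalibration".
    rng = np.random.default_rng(0)
    monitor = ResidualMonitor()
    residuals = np.concatenate([rng.normal(1.0, 0.2, 300), rng.normal(3.0, 0.2, 50)])
    flags = [monitor.update(float(r)) for r in residuals]
    print("first flagged frame:", flags.index(True))
    ```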

  • ETH Zurich PI: Ce Zhang; Microsoft PI: Matteo Interlandi

  • ETH Zurich PI: Onur Mutlu; Microsoft PIs: Michael Cornwell, Kushagra Vaid
