May 23, 2012 - May 25, 2012

Latin American Faculty Summit 2012

Location: Riviera Maya, Mexico

  • Microsoft Research is a global research organization dedicated to expanding the possibilities of computing. It takes on a wide variety of activities, ranging from mission-focused problem solving and technology development to “blue-sky,” curiosity-driven basic research. The Microsoft Research lab delivers product-focused innovations and at the same time contributes to the broader scientific community by openly publishing basic research results. In this talk, Peter Lee will provide a glimpse into the breadth of research in the lab today and will describe the lab’s strategy for creating high impact for the company and the world. A key part of this approach involves embracing the diversity that is inherent in computing research, using a “quadrant” model that spans short-term to long-term research on one axis and reactive problem solving to open-ended exploration on the other.

  • The Mexican research community has historically collaborated well with colleagues from different countries, but as research systems have become more internationalized over the past decades, we must redouble our efforts to integrate effectively into the expanding global networks of knowledge. In this context, governments should support the strengthening of national systems along the chain of education-science-technology-innovation, and should facilitate their internationalization through the mobility of students and researchers, the undertaking of joint projects, and the funding of technology-based companies. I will describe our efforts at CONACYT, where we are working to support the involvement of Mexican researchers with high-level groups internationally by incorporating programs that provide remote access to instruments, data, computational resources, and large-scale facilities located throughout the world, among other activities.

  • Crowdsourcing uses human intelligence to solve tasks that are simple for humans but difficult for computers. Crowdsourcing can also use humans as sources of valuable information, for example, to exploit the “wisdom of the crowd.” In this talk, I will give an overview of the crowdsourcing work we are doing in the Stanford InfoLab. In particular, I will describe DeCo, a database system that seamlessly gives access to traditional data as well as to crowd information. I will also describe some crowd algorithms, where a computer orchestrates human tasks that solve a larger problem.
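As a toy illustration of the kind of crowd algorithm mentioned above (not the DeCo system itself), the sketch below finds the maximum of a list by orchestrating simulated pairwise "human" comparisons with majority voting; the vote count and error rate are invented parameters.

```python
import random

def noisy_vote(a, b, error_rate):
    """One simulated worker answering 'is a larger than b?'; wrong with
    probability error_rate. A stand-in for posting a real pairwise task."""
    truth = a > b
    return truth if random.random() >= error_rate else not truth

def crowd_max(items, votes=5, error_rate=0.1):
    """Tournament-style crowd algorithm: keep the current best item and
    challenge it with each candidate, deciding each duel by majority vote."""
    best = items[0]
    for cand in items[1:]:
        yes = sum(noisy_vote(cand, best, error_rate) for _ in range(votes))
        if yes > votes // 2:
            best = cand
    return best
```

With a small per-worker error rate and a handful of votes per comparison, the majority vote makes each step reliable, which is the basic trade-off (cost versus accuracy) such algorithms manage.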

  • Resource poverty is a fundamental constraint that severely limits the type of applications that can be run on mobile devices. This constraint is not just a temporary limitation of current technology; it is intrinsic to mobility. In this talk, I will offer a vision of mobile computing that breaks free of these fundamental constraints, thereby opening up a new world in which mobile computing seamlessly augments the cognitive abilities of users by employing such compute-intensive capabilities as speech recognition, natural language processing, computer vision and graphics, machine learning, augmented reality, planning, and decision-making. By thus empowering mobile users, we could transform many areas of human activity. In this vision, mobile users seamlessly utilize the cloud to obtain resource benefits without incurring delays and jitter, and without worrying about energy. I will highlight challenges we face and solutions we are pursuing, and will describe the successes we are having.

  • The future of computer science lies neither in stability nor in change, but in the interplay between them. Both in terms of theory and in the software that brings the theories to life, there is a need for stability, so that ideas and techniques can become well known and part of the scientific ecosystem. But at the same time, researchers are always striving to push the boundaries and bring about change that reflects the concerns of our time, such as handling big data, ensuring privacy and security, and facilitating mobile computing, as well as reaching previously neglected groups through computer vision, natural user interfaces, and machine translation. In this talk, we explore the advances made by Microsoft Research in the development of new theories, new tools, and new communities in computer science.

  • In April, CareerCast.com placed software engineer at the top of its rankings of 200 jobs, noting that for software engineers the “pay is great, hiring demand for their skills is through the roof, and working conditions have never been better.” But what do software engineers do in 2012? From developing new tools for verifying software, to assisting product groups in coping with bugs and big data, to securing applications on mobile phones, there is a huge variety in software engineering jobs. In this talk, we’ll give an overview of the tasks that software engineers at Microsoft Research tackle, and how we are driving the future of this field.

  • Kinect has changed the way people play games and experience entertainment. Now, Kinect for Windows offers the potential to transform how people interact with computers and Windows-embedded devices in multiple industries, including education, healthcare, retail, transportation, and beyond. The February 2012 release of the Kinect for Windows sensor and software development kit (SDK) for commercial applications opens up the limitless possibilities offered by Kinect technology, following the preview release from Microsoft Research last spring. Kinect for Windows supports applications built with C++, C#, or Visual Basic by using Microsoft Visual Studio 2010. The latest Kinect for Windows SDK version 1 offers “near mode,” improved skeletal tracking, enhanced speech recognition, a modified API, and the ability to support up to four Kinect for Windows sensors plugged into one computer.

  • Data-intensive research is a fast-evolving activity across a variety of fields. As much as it has changed in the past few years, the advances expected in the coming years are even more sweeping. This talk will map the current and near-term landscape by walking through the eResearch lifecycle—from data collection, to authoring, to publication/dissemination, and through to archiving and preservation. Specific industry and academic examples from around the world and across the community will be highlighted, as will be select contributions from Microsoft and Microsoft Research.

  • The use of service robots is rapidly expanding and soon will become part of everyday life. To personalize their functions, service robots will need to learn new tasks according to the preferences of the users, who will have to teach them in natural and accessible ways. In this talk, we will show how to teach a robot a new task using simple demonstrations and voice feedback. We consider three demonstration modes: (1) controlling the robot with a joystick, (2) instructing the robot by voice, and (3) executing the task while the robot is watching with a Kinect sensor. The effectiveness of the proposed approach is shown in simulation and with real robots performing simple navigation and pick-and-place tasks.

  • The world is experiencing a technology shift. In 2012, touchscreen-based mobile devices, namely smartphones and tablets, will outsell desktops, laptops, and netbooks combined. Powerful, easy-to-use smartphones are likely to be the first and, especially in developing countries, possibly the only computing devices that virtually everyone will own and carry at all times. Is it possible to develop new software directly on these mobile devices, without using a PC? What would a user interface for such a new development model look like? We will present a new tool from Microsoft Research, TouchDevelop, that tries to address these questions. TouchDevelop is an application-creation environment that runs on the smartphone itself—no separate PC required. Its programming language and code editor have been built from scratch around the idea that all code is entered via a touchscreen, without a keyboard. We will report on how TouchDevelop is being used today by thousands of people.

  • Computers have proved a boon in education, but cash-strapped schools struggle to provide PCs. In this presentation, we address the question of how to get the same benefits of active participation and personal feedback that a computer provides at a cost of just a dollar per child per year. The answer is an “Interpersonal Computer,” in our case consisting of a PC, a projector, and a mouse for each child participating in the activity. We show how we can teach math and language, using a personal and a collaborative approach, and analyze the value of games.

  • Cloud computing is facilitating unlimited access to the computing and storage resources needed to build applications. The underlying infrastructure manages such resources transparently, without requiring the application to manage or reserve more resources than it really needs. As a result, database management systems have evolved into cloud services that must be tuned and composed to manage, query, and exploit huge data sets efficiently and cost-effectively. In this talk, we will address a querying approach that consists of composing services that provide data and data management functions (aggregation, storage, refreshment). We will discuss how query processing is tuned with respect to the cost of accessing data and services, the cost of using cloud resources for executing the query, and the mashing up of results according to quality dimensions of completeness, data provenance, and data freshness. We will examine how this approach has provided solutions for e-government applications that integrate services from different countries.

  • Looking at the last 10 years, we see a shift towards multi- and many-core processors. In the 1990s, processor manufacturers were designing monolithic single-core processors and were struggling to increase performance through hardware design complexity that pushed power consumption beyond acceptable ranges. To deal with the power-density problems of single complex cores, processor manufacturers started putting more cores on the chip, doubling the number of cores with each new technology generation. Realizing the potential of these additional cores requires parallel programming expertise, because it is very difficult to program these multiple processors using current hardware and software. This problem has led many to ponder whether a programmer-productivity wall is looming in the future. How to design multi-core processors to make them more effective and easier to program is a challenge for computer architects. The vision of the BSC-Microsoft Research Centre is of a top-down computer architecture approach in which software requirements drive the hardware innovation forward rather than letting the hardware design condition software development. With this perspective in mind, computer architecture experts at BSC have teamed up with computer scientists at Microsoft Research to look for innovative solutions to the challenges and opportunities that massively parallel processing represents. This talk highlights our research in top-down computer architecture, with special emphasis on handheld/datacenter application analysis, hardware support for synchronization and language runtime systems, and programming-model/hardware interaction.

  • We are currently generating data at unprecedented rates, much more than can be analyzed “manually.” Discovering useful, actionable knowledge in this data presents enormous challenges. In this talk, I will review some of the principal problems of trying to use electronic data to better understand the world and, in particular, how to understand and model complex adaptive phenomena, such as diseases or financial markets, through data mining. Such phenomena depend on a myriad of factors, from the micro to the macro, and are both highly dynamic and spatially heterogeneous. I will discuss the challenges these characteristics present.

  • In this talk, we will present a measure of compactness for two-dimensional (2-D) and three-dimensional (3-D) shapes composed, respectively, of pixels and voxels. We will demonstrate the ease of computation, which uses only one equation, and we will show how this proposed measure makes it possible to compute the compactness of any kind of object, including porous and fragmented ones. We will demonstrate this by calculating the measures of discrete compactness of different objects, and we will present potential applications of the proposed measure of discrete compactness.
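The talk's one-equation measure can be sketched in code. One published formulation of discrete compactness for 2-D pixel shapes normalizes the "contact perimeter" (the number of shared edges between pixels) between its minimum and maximum possible values; the normalization constants assumed below may differ from the speaker's exact definition.

```python
import math

def contact_edges(pixels):
    """Number of shared edges between 4-connected pixels: the contact perimeter."""
    s = set(pixels)
    return sum(1 for (x, y) in s for nb in ((x + 1, y), (x, y + 1)) if nb in s)

def discrete_compactness(pixels):
    """Normalized contact perimeter: 0 for a straight line of pixels,
    1 for the most compact (square-like) arrangement. Assumes n > 1."""
    n = len(set(pixels))
    pc = contact_edges(pixels)
    pc_min = n - 1                    # a 1-pixel-wide path of n pixels
    pc_max = 2 * (n - math.sqrt(n))   # a perfect square of n pixels
    return (pc - pc_min) / (pc_max - pc_min)
```

Because it only counts pixel adjacencies, the same idea extends directly to porous or fragmented objects, and to voxels in 3-D by counting shared faces.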

  • Common engineering practices today use testing to ensure the quality of software. But relying solely on testing has several well-known drawbacks, such as only testing the program for the given inputs and applying tests only after the entire program has been developed. An idealistic, long-standing dream has been to formally verify the correctness of a program, for all inputs. Is there some reality in that dream? In this talk, I present Dafny, a state-of-the-art tool for program verification. Dafny has been used to verify the full correctness of some challenging algorithms. It was used by two medalist teams in the VSTTE 2012 program verification competition and is being used in teaching. Through demos of this research prototype, I will show the vision for how a program verifier can help during software development.

  • In this talk, I will present an architectural overview of the SQL Server Parallel Data Warehouse DBMS system. PDW is a massively parallel-processing, shared-nothing, scale-out version of SQL Server for data warehouse (DW) workloads. The product is packaged as a database appliance built on industry-standard hardware.

  • LiveANDES (Advanced Network for Distribution of Endangered Species) provides a software platform where users can upload, visualize, and share wildlife data, helping to create a global conservation community in the Americas. Currently, LiveANDES covers all terrestrial vertebrates of Chile, displaying a database searchable by ecological, administrative, and protected areas. It empowers citizen scientists, enabling them to share data that helps map the presence and distribution of endangered species—information that is vital to assessing their conservation status. In this talk, we will cover the technological underpinnings of LiveANDES, including its web solution based on Microsoft .NET technologies and its mobile implementations for Windows Phone and Android devices. We will also cover our plans to migrate the platform to the cloud using Windows Azure, thereby creating a mobile, cloud-shared space for wildlife conservation, and our goal of adding Bolivian and North American libraries and regions for data-sharing and mapping.

  • Anurans (frogs and toads) are commonly used by biologists as bioindicators of the early stages of ecological stress. Unfortunately, most current monitoring methods are intrusive and error prone. By using sensor networks, we can automatically classify anuran calls and determine the species in a target site, thereby acquiring relevant and accurate data about the environment in a less intrusive way. Our research aims at using signal processing and machine learning techniques to classify anuran calls as a tool to continuously monitor the environment, allowing us to find correlations between destabilizing events, such as fire, flooding, and deforestation, and the anuran population in a given observation site.
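A minimal sketch of the signal-processing idea above, using a deliberately simplified feature (zero-crossing rate as a proxy for the dominant call frequency) and hypothetical species labels and thresholds; the actual system would use far richer features and learned classifiers.

```python
import math

def tone(freq_hz, sr=8000, dur_s=0.5):
    """Synthetic stand-in for a recorded call: a pure tone at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sr) for t in range(int(sr * dur_s))]

def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs where the sign flips."""
    return sum((a < 0) != (b < 0) for a, b in zip(signal, signal[1:])) / (len(signal) - 1)

def classify_call(signal, sr=8000, threshold_hz=1500):
    """A pure tone of frequency f crosses zero about 2*f/sr times per
    sample, so f can be estimated as zcr * sr / 2. The species names and
    the frequency threshold here are hypothetical."""
    est_freq = zero_crossing_rate(signal) * sr / 2
    return "low_pitched_species" if est_freq < threshold_hz else "high_pitched_species"
```

Running such a lightweight feature extractor on sensor nodes, and sending only labels or summaries upstream, is what makes continuous, unattended monitoring practical.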

  • Probabilistic graphical models include a variety of techniques based on probability and decision theory—techniques that give us a theoretically well-founded basis for making decisions under conditions of uncertainty and for solving complex problems efficiently. In recent years, these methods have been used in a great variety of applications, from medical expert systems to intelligent user interfaces. In this talk, I will give a general introduction to probabilistic graphical models and describe some of the most popular ones, such as Bayesian networks and Markov decision processes. Then I will demonstrate their application in three complex problems in biomedicine: (1) helping a physician guide an endoscope in the colon, (2) modeling the evolutionary networks of HIV, and (3) adapting a stroke rehabilitation system for the patient.
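As the simplest possible example of the reasoning these models support, the two-node Bayesian network below (a hypothetical disease node with a test node as its child, with made-up probabilities) computes a posterior by Bayes' rule:

```python
def posterior(prior, sensitivity, specificity):
    """Two-node Bayesian network, Disease -> Test: P(disease | positive test)
    by Bayes' rule. The false-positive rate is (1 - specificity)."""
    p_positive = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_positive

# Made-up numbers: a rare condition (1% prior) and a fairly accurate test.
# Even then, most positive results turn out to be false positives.
p = posterior(prior=0.01, sensitivity=0.9, specificity=0.95)
```

Larger networks, like those used in the biomedical applications mentioned in the talk, chain many such conditional probability tables together and answer queries by propagating evidence through the graph.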

  • Over the past two and one-half years, the engineering team within Microsoft Research Connections has shipped 15 exciting software products for academics and researchers, including Microsoft Translator Hub, ChronoZoom, Layerscape, Try F#, Hawaii, .NET Bio, and the Chemistry Add-in for Word. How does a team of eight full-time Microsoft engineers consistently deliver great software for academics? What lessons have they learned in the process? Join us for this interactive talk to find out.

  • This talk will provide an overview of an enhancement that enables the Microsoft Translator to provide targeted and customizable translation systems. This new system, called the Microsoft Translator Hub, enables personalized, private, and/or crowdsourced translation models to be independently built by companies, communities, and language preservationists. By way of example, this talk will also cover the recent release of Hmong Daw—the first language to be empowered by the Microsoft Translator Hub—including the lessons learned during the Hmong community’s pre-release and post-release usage of the tool for language preservation purposes.

  • Big data and cloud computing are two of the hottest areas in computer science research. In this talk, we will cover the architecture design patterns and research challenges involved in building large, linearly scalable systems.
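One classic design pattern for linearly scalable systems is consistent hashing, which lets storage nodes join and leave while relocating only a small fraction of keys. A minimal sketch follows, with virtual nodes for load balancing; the hash choice and vnode count are illustrative, not prescriptive.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Keys and nodes hash onto a ring; a key belongs to the first node
    clockwise from it. Each node gets `vnodes` positions to even out load."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []   # sorted vnode hashes
        self._ring = {}   # vnode hash -> node name
        for n in nodes:
            self.add(n)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._ring[h] = node

    def remove(self, node):
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            self._keys.remove(h)
            del self._ring[h]

    def locate(self, key):
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]
```

The key property: removing one node only reassigns the keys that node owned, so the rest of the cluster is undisturbed as capacity scales up or down.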

  • A service robot is a system with inferential, perceptual, and action capabilities oriented to assist people with diverse daily living tasks. In this talk, we will present an overview of the conceptual framework and methodology that went into the construction of the Golem series of service robots, which were developed over the last few years by the Golem group at IIMAS, UNAM. We will discuss the current state of the technology, highlighting the kinds of advances that are required for service robots to perform well in the RoboCup competition, especially in the @Home category. The talk will conclude with two reflections: one about the value of service robots in practical settings, and the other about the construction of service robots as a case study of technological development in the Latin American context.

  • Learn about pilot programs launched in Brazil, Colombia, and Mexico to expand the numbers and influence of women in computing. These programs are supported by the Microsoft Research Connections Latin America Women in Computing Call for Proposals. Learn about the programs’ goals and progress to date, and hear about what’s worked and what hasn’t. Join the discussion and provide ideas on how we can make a difference in growing the number of women in computing in Latin America, and learn about opportunities to apply for a similar program in the future.

  • Neural networks are experiencing a renaissance, thanks to a new mathematical formulation, known as restricted Boltzmann machines, and the availability of powerful GPUs and increased processing power. Unlike past neural networks, these new ones can have many layers and thus are called “deep neural networks”; and because they are a machine learning technique, the technology is also known as “deep learning.” In this talk I’ll describe this new formulation and its signal-processing application in such fields as speech recognition and image recognition. In all these applications, deep neural networks have resulted in significant reductions in error rate. This success has sparked great interest from computer scientists, who are also eager to learn from neuroscientists how neurons in the brain work.
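To make the notion of "many layers" concrete, here is a tiny forward pass through stacked sigmoid layers in pure Python. It illustrates layered networks in general, not restricted Boltzmann machines or any production system; real weights would be learned, for example by RBM pretraining followed by backpropagation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected sigmoid layer: out[j] = sigmoid(w[j] . x + b[j])."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """A 'deep' network is just layers stacked: each layer's output
    becomes the next layer's input."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x
```

Each added layer lets the network represent more abstract features of its input, which is what drives the error-rate reductions in speech and image recognition mentioned above.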

  • Creating compelling-looking content using conventional graphics techniques is often laborious and requires significant artistry and experience. Over the past few years, I have been looking into how this content-creation process can be simplified through using computer vision techniques. In this talk, I will describe a variety of projects undertaken with this goal in mind, discussing how computer vision techniques can be used to simplify animations of Chinese paintings by analyzing brush strokes; to generate free-viewpoint videos from a small number of cameras; to produce 3-D models of plants and trees from images; and to personalize automatic enhancements of photographs.

  • Our understanding of the world around us is evolving, and with evolution comes the need for adaptation. Environmental scientific research increasingly has to adapt—from dealing with ever-larger and still-growing datasets, to credibly informing the public and policy makers. There is a need for new types of applications grounded in scientific research to move from raw discovery, to knowledge, to informing practical decisions. Understanding environmental changes at the levels of neighborhoods, regions, and the globe is the focus of scientific study and policy decisions. Technology reinforced by computing is demonstrating the capacity to improve our environmental understanding.

  • Microsoft Research is the leading global research organization in computing. In addition to its core research laboratories, Microsoft Research has several Advanced Technology Labs (ATLs), which focus on driving innovation through advanced technology projects with high impact on Microsoft’s business and on industry at large. ATL projects are broad and involve collaborations with the core Microsoft Research labs, industrial and academic partners, and governmental institutions. In this talk, we present an overview of the ATL labs, their strategic directions, and examples of their core projects.

  • Scientific computing increasingly revolves around massive amounts of data. From physical sciences, to numerical simulations, to high-throughput genomics and homeland security, we are quickly dealing with petabytes if not exabytes of data. This new, data-centric computing requires a fresh look at computing architectures and strategies. We will revisit Amdahl’s Law establishing the relation between CPU and I/O in a balanced computer system, using this to analyze current computing architectures and workloads. We will discuss how existing hardware can be used to build systems that are much closer to an ideal Amdahl machine, and we will describe a hypothetical cheap, yet high-performance, multi-petabyte system currently under consideration at Johns Hopkins. We will also explore strategies of interacting with very large amounts of data, and compare various large-scale data analysis platforms.
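Amdahl's balanced-system rule of thumb holds that a balanced machine sustains roughly one bit of I/O per second for every instruction per second. The sketch below computes that ratio for a hypothetical machine; all the numbers are invented for illustration.

```python
def amdahl_io_ratio(io_bytes_per_sec, instructions_per_sec):
    """Amdahl's I/O rule of thumb: a balanced system sustains about one
    bit of I/O per instruction per second. A ratio well below 1 suggests
    the CPUs are starved for data."""
    return (io_bytes_per_sec * 8) / instructions_per_sec

# Invented example: 8 cores at 2.5e9 instructions/s fed by a 500 MB/s disk
# array reach only 0.2 bits of I/O per instruction, far from balanced.
ratio = amdahl_io_ratio(500e6, 8 * 2.5e9)
```

Gaps like this one are why data-centric designs pair modest CPUs with many cheap disks rather than fast CPUs with little I/O bandwidth.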

  • Since May 2007, the LACCIR Federation has been promoting collaborative research activities focused on ICT applications among Latin American and Caribbean researchers. This joint research work has allowed the development of innovative ICT solutions to address region-wide challenges in such areas as education, healthcare, environment, energy, and e-government. This presentation will provide an up-to-date account of LACCIR results and will highlight future opportunities for ICT research in the region. During the presentation, we will also announce the RFP (request for proposal) process for the 2012 LACCIR collaborative research programs.