October 11, 2010 - October 13, 2010

eScience Workshop 2010

Location: Berkeley, California, US

Tutorial Abstracts

  • Mark Smith, JulMar Technology

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. The initiative comprises two primary components: the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT).

    The Microsoft Biology Foundation (MBF) is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework, initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, and documentation are freely downloadable.
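    As a rough illustration of the kind of functionality described above (parsing a common sequence file format and manipulating DNA sequences), the following is a minimal sketch in Python. It does not use or mirror the MBF API; the function names `parse_fasta` and `reverse_complement` and the sample records are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, Optional

# Complement table for unambiguous DNA bases (ambiguity codes are not handled).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the id is the first whitespace-delimited token.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            # Sequence data may be wrapped over several lines.
            records[current_id] += line
    return records


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical sample input, not taken from any real dataset.
example = """>seq1 demo record
ATGGCC
A
>seq2
GGATCC
""".splitlines()

records = parse_fasta(example)
rc = {name: reverse_complement(seq) for name, seq in records.items()}
```

    Here `records` maps each record id to its joined sequence, and `rc` holds the reverse complement of each one.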

    The Microsoft Biology Tools (MBT) are a collection of tools that help biology and bioinformatics researchers be more productive in making scientific discoveries. These tools build on the capabilities of the Microsoft Biology Foundation and are good examples of how MBF can be used to create other tools.

    This tutorial will provide an overview of the library, details about how to extend and re-use it, and demonstrations of the released tools that use it: the MSR Biology Extension for Excel and the MSR Sequence Assembler.

  • Dean Guo, Microsoft Corporation

    As described in the book, The Fourth Paradigm: Data-Intensive Scientific Discovery, scientific breakthroughs will be increasingly powered by advanced computing capabilities that help researchers manipulate and explore massive datasets.

    This tutorial uses three case studies to demonstrate the application of a wide range of technologies: the Parallel Extensions for .NET on multicore machines, distributed computing on multiple nodes with Dryad/DryadLINQ, Windows Azure, HPC, data processing automation through workflows, and visualization in WorldWide Telescope. We hope that the techniques and technologies used are applicable to other data-intensive research.

    WorldWide Telescope (WWT) enables your computer to function as a virtual telescope, bringing together imagery from the best ground- and space-based telescopes in the world. We are working on extending WWT to visualize scientific data on Earth.

    The three case studies are: WWT LCAPI (Loosely Coupled API), TeraPixel, and MODISAzure. We will demonstrate how to use WWT to visualize the results from these projects.

    1. LCAPI: The WorldWide Telescope “Loosely Coupled API” uses RESTful communication between a standalone application (SA) and a WorldWide Telescope Client (WTC). We will explore using this loosely coupled interface to read time-series event data into the SA, push this data to the WTC Layer Manager, control WTC layer-based data rendering, and control WTC state (location, perspective angles, time, and time rate). From this overview we will explore both what the LCAPI enables and potential future directions in visualization.
    2. Terapixel sky image – creating the largest and clearest image of the sky from the Digitized Sky Survey data. We turned 1,800 pairs of red and blue individual image plates into 1,800 colored plates, adjusted the brightness of each pixel on each plate, and stitched and smoothed the plates together into a terapixel sky image. The image is then visualized in WorldWide Telescope.
    3. MODISAzure – accessing the vast and varied remote sensing data from the MODIS (Moderate Resolution Imaging Spectroradiometer) on NASA’s Terra satellite and other data sources to study evapotranspiration (ET), which is key to water balance and hence to understanding interactions between global climate change and the biosphere. We will demonstrate how we generated monthly time-series ET maps for the state of California from MODISAzure results and visualized them in WWT.
  • Alex Wade, Microsoft Research

    Microsoft External Research strongly supports the process of research and its role in the innovation ecosystem, including developing and supporting efforts in open access, open tools, open technology, and interoperability. These projects demonstrate our ongoing work towards producing next-generation documents that increase productivity and empower authors to increase the discoverability and appropriate re-use of their work.

    This workshop will provide a deep dive into several freely available tools from Microsoft External Research, and will demonstrate how these can help supplement and enhance current repository offerings. Come learn more about how the Microsoft Research tools can help extend the reach and utility of your repository efforts. Each session will include a hands-on component so that attendees can gain a deeper technical understanding of the available toolset.

  • Stephen Toub, Microsoft Corporation

    The Microsoft .NET Framework 4 and Visual Studio 2010 include new technologies for expressing, debugging, and tuning parallelism in managed applications. Dive into key areas of support, including Parallel Language Integrated Query (PLINQ), cutting-edge concurrency views in the Visual Studio profiler, and debugger tool windows for analyzing the state of concurrent code. In addition to exploring such features, we will examine some common parallel patterns prevalent in technical computing and how these features can be used to best implement such patterns.
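    One common parallel pattern of the kind mentioned above is a data-parallel map followed by an aggregation, which PLINQ expresses as a query over a partitioned data source. The sketch below shows the same pattern in Python rather than in .NET, purely as an illustration; it is not PLINQ, and the helper names (`count_primes`, `split_range`, `parallel_count_primes`) are hypothetical. It uses a thread pool for simplicity; in CPython, CPU-bound work like this would need a process pool to run on multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def is_prime(n: int) -> bool:
    """Trial-division primality test (adequate for a small demo)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(lo: int, hi: int) -> int:
    """Count primes in the half-open range [lo, hi)."""
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo: int, hi: int, parts: int) -> List[Tuple[int, int]]:
    """Partition [lo, hi) into at most `parts` contiguous sub-ranges."""
    step = max(1, (hi - lo + parts - 1) // parts)
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def parallel_count_primes(lo: int, hi: int, workers: int = 4) -> int:
    """Map count_primes over the partitions, then reduce by summing."""
    partitions = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda r: count_primes(*r), partitions)
        return sum(partial_counts)


total = parallel_count_primes(0, 10_000)
```

    The partition, map, and reduce steps are kept separate so that the executor can be swapped without touching the per-partition work.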

  • Corrado Priami, University of Trento Centre for Computational and Systems Biology

    CoSBiLab is a software platform implementing the new conceptual framework of algorithmic systems biology. It is centered on the idea of representing biological elements as programs and the interactions between elements as message passing between the corresponding programs. This idea guides the programming paradigm supported by the new programming language BlenX. The approach is higher level and provides a component-based view of systems, rather than the reaction-based descriptions usually adopted in ODE or rewriting-system tools. CoSBiLab allows its users to exploit compositionality and stochasticity, addressing concurrency and complexity natively.

    To make the approach intuitive, CoSBiLab has a tabular interface for modeling systems and retrieving data from databases, so that non-experts can use CoSBiLab; that is, they can program in BlenX without having programming skills. CoSBiLab also has tools that help infer missing data, perform network analysis, and visualize simulation outcomes. In addition to an introduction to the conceptual framework, demos will be provided to help attendees understand the software that will support the e-scientists of the future in their work.

    For more information, see “Algorithmic Systems Biology,” Communications of the ACM, 52(5):80–88, May 2009.

  • Alex James, Microsoft Corporation

    A vast amount of data is available today, and data is now being collected and stored at an unprecedented rate. Much, if not most, of this data, however, is locked into specific applications or formats and is difficult to access or to integrate into new uses.

    The Open Data Protocol (OData) is a web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional websites.
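    To give a feel for what querying over OData looks like, the sketch below builds a query URL using the standard OData system query options `$filter`, `$orderby`, `$top`, and `$select`. The service root `https://example.org/service.svc`, the entity set `Products`, and the helper name `odata_query_url` are hypothetical and used only for illustration; the code constructs the URL and does not contact any server.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def odata_query_url(
    service_root: str,
    entity_set: str,
    *,
    filter_expr: Optional[str] = None,
    order_by: Optional[str] = None,
    top: Optional[int] = None,
    select: Optional[str] = None,
) -> str:
    """Build an OData query URL from a few system query options."""
    options: Dict[str, object] = {}
    if filter_expr is not None:
        options["$filter"] = filter_expr
    if order_by is not None:
        options["$orderby"] = order_by
    if top is not None:
        options["$top"] = top
    if select is not None:
        options["$select"] = select

    url = f"{service_root.rstrip('/')}/{entity_set}"
    if options:
        # Keep "$" and "," readable; percent-encode spaces as %20.
        url += "?" + urlencode(options, safe="$,", quote_via=quote)
    return url


# Hypothetical service and entity set, for illustration only.
url = odata_query_url(
    "https://example.org/service.svc",
    "Products",
    filter_expr="Price gt 20",
    order_by="Name desc",
    top=5,
)
```

    Because the query is an ordinary URL, any HTTP client on any platform can issue it, which is the basis of the interoperability the abstract describes.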

    Join us in this tutorial to learn how OData can enable a new level of data integration and interoperability across a broad range of clients, servers, services, and tools. Bring your laptop and you will have a chance to work OData into your own projects on whatever platform you choose.

  • Cancelled