December 5–8, 2011

eScience Workshop 2011

Location: Stockholm, Sweden

Monday, December 5, 2011

8:00–8:30  Registration
9:00–10:30  Keynote

Advancing Environmental Understanding: The Role of eScience—Dan Fay, Microsoft Research

Our understanding of the world around us is evolving, and with evolution comes the need for adaptation. Environmental scientific research is increasingly having to adapt—from dealing with ever-larger, still-growing datasets to credibly informing the public and policy makers. There is a need for new types of applications, grounded in scientific research, that move from raw discovery, to knowledge, to informing practical decisions. Understanding environmental changes at the level of neighborhoods, regions, and the globe is the focus of scientific study and policy decisions. Technology, reinforced by computing, is demonstrating the capacity to improve our environmental understanding. This talk presents examples of environmental changes and challenges that are the focus of scientific investigation; it also identifies technologies and tools that can make an impact on this understanding.

The Pillar Hall
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
10:30–11:00  Coffee break
11:00–12:30  Breakout Sessions
Citizen Science on Windows Phone 7 Platform

Session Chair: Yan Xu, Microsoft Research

Citizen Science Using Windows Phone 7—Yong Liu, University of Illinois at Urbana-Champaign

Citizen science is the inclusion of citizens, who may not be scientists, in conducting large-scale scientific experiments. Citizen scientists usually collect or validate data based on observations and measurements in the field. In the past few years, there has been growing interest in using information technology to enable easy collection, storage, and analysis of data. The ubiquitous presence of smart mobile devices, together with the availability of sensors in such devices, enables scientists to deploy their experiments to a large number of citizen scientists who can collect observations in near real time and with rich metadata. In this session, authors will talk about challenges, opportunities, and lessons learned while using smart mobile platforms for large-scale citizen science experiments.

Listen-n-Feel: Participatory Emotion Detection Using Windows Phone 7—Na Yang, University of Rochester

This talk will present an emotion sensor on Windows phones, named Listen-n-Feel, which listens to the phone user’s speech and tells whether the user is happy or sad, based on audio signal features. This phone application can be widely used in social networks, integrated into character-playing games, or used to monitor patients with mental health problems and in other health care areas. Recorded audio data is processed in the cloud, and signal features are extracted in both the time domain and the frequency domain. A machine learning method is applied to predict emotions from statistics of speech signal features, with the training data derived from a prosody database.
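A minimal sketch of this approach—statistics of time- and frequency-domain speech features fed to a classifier—written in Python with librosa and scikit-learn. The feature set, corpus paths, and classifier choice are illustrative assumptions, not the Listen-n-Feel implementation (which runs its processing in the cloud):

    # Sketch: binary emotion prediction from statistics of speech features.
    # Feature set and classifier are assumptions for illustration only.
    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def feature_vector(wav_path):
        # Load speech, then extract a time-domain track (energy) and
        # frequency-domain tracks (MFCCs).
        y, sr = librosa.load(wav_path, sr=16000)
        energy = librosa.feature.rms(y=y)[0]
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        # Summarize each track by its statistics, as the talk describes.
        return np.concatenate([[energy.mean(), energy.std()],
                               mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Hypothetical labelled prosody corpus.
    train_paths = ["happy_01.wav", "sad_01.wav"]
    train_labels = ["happy", "sad"]

    X = np.stack([feature_vector(p) for p in train_paths])
    clf = SVC().fit(X, train_labels)
    print(clf.predict([feature_vector("new_utterance.wav")]))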

Room 351
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
Scientific Programming for Biologists (and Everyone Else)

An Open Source Library for Bioinformatics—Simon Mercer, Microsoft Research

Microsoft has created a general-purpose library of useful functions for the assembly, comparison and manipulation of DNA, RNA and protein sequences. Built on the Microsoft .NET platform, this toolkit enables the scientific programmer to rapidly develop the applications needed by genomics scientists to cope with extracting knowledge from the increasing volumes of data common in the field of genomics research. Under development in Microsoft Research for the past three years, it contains a core of standard functions but also enables easy access to a wide range of other Microsoft technologies including Microsoft Silverlight, DeepZoom, and Microsoft Office as well as unique research tools such as Sho and PhyloD. The library is open source under the Apache 2 license, and is freely available for commercial and academic use. Developers are encouraged to adapt and extend the basic library, making their modifications available for others to use at Microsoft Biology Foundation.
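Since the toolkit itself exposes a .NET API, the plain-Python lines below are only a sketch of the kind of core sequence operations such a library standardizes; the function names are invented for illustration and are not the library's API:

    # Illustration of the kind of core sequence functions such a toolkit
    # standardizes; plain Python, not the .NET library's actual API.
    COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

    def reverse_complement(dna: str) -> str:
        """Return the reverse complement of a DNA sequence."""
        return "".join(COMPLEMENT[b] for b in reversed(dna.upper()))

    def transcribe(dna: str) -> str:
        """Transcribe a DNA coding strand into RNA."""
        return dna.upper().replace("T", "U")

    assert reverse_complement("ATGC") == "GCAT"
    assert transcribe("ATGC") == "AUGC"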

Programming with Massive Data Sources Made Simple, Fast, and Error-Free—Don Syme, Microsoft Research

The F# 3.0 language is a breakthrough in productivity for programmers working with online data sources, databases, and ontologies. It combines three beautifully tuned features: F# 3.0 type providers, which give the programmer access to massive quantities of scientific data in an intuitive, strongly typed, and navigable way; F# units of measure, which let you attach unit annotations to numerical data; and F# functional programming, which lets you write succinct and efficient transformations of data with a low error rate. With the open source and highly interoperable nature of F#, these features work together to give the scientific programmer a uniquely powerful tool tuned to the needs of scientific programming in the 2010s and beyond.

Room 353
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
12:30–13:30  Lunch
13:30–15:00  Breakout Sessions
Community Capability Model for Data-Intensive Science

Session Chair: Alex Wade, Microsoft Research

Community Capability Model for Data-Intensive Science Overview—Kenji Takeda, Microsoft Research

Microsoft Research Connections and UKOLN, University of Bath, are working in partnership on an exciting and challenging new project to develop a Community Capability Model for Data-Intensive Science, building upon and extending the principles described in The Fourth Paradigm. The speed at which any given discipline advances will depend on how well its researchers collaborate with one another as well as with others responsible for the computational infrastructure now presumed to be a core part of the research process. The model will include technological aspects such as common data infrastructure, standards, and ontologies. Social aspects such as collaborative approaches, openness, and socio-legal issues will also be explored. This talk will introduce the draft model framework.

Community Capability Model (panel session)—Alex Wade, Microsoft Research

In this panel session, the draft community model will be opened up for discussion: experts in various domains who are helping influence research outcomes will speak, and the audience can provide feedback on areas of key importance in different disciplines and suggestions for improvements to the model. This session provides the opportunity to help guide the deep-dive discussion session.

Room 351
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
VENUS-C: eScience in the Cloud

Cloud Computing for Research—a European Perspective—Fabrizio Gagliardi, Microsoft Research

Cloud computing promises to revolutionise how science and engineering can be carried out by industry, research organisations, and universities alike. To accelerate the adoption of this key technology, it is important that users are able to access cloud resources to best match their real-world applications to this new way of working. VENUS-C brings together end-users, developers, and cloud providers to create a seamless framework for cloud computing in Europe and globally. This talk will provide an overview of the VENUS-C project and how it is helping to democratise research.

VENUS-C and the Generic Worker Execution Environment—Götz Brasche, European Microsoft Innovation Center

In this presentation, we provide a technical overview of the VENUS-C project. VENUS-C delivers a set of services for researchers and scientists to run scientific applications in the cloud, such as services for data management and processing. In particular, this talk focuses on an implementation of the generic worker pattern, which supports data processing on hybrid cloud deployments, spanning public clouds (such as Windows Azure) and private on-premises deployments. This presentation will highlight implementation tradeoffs, as well as the design and technology choices that were made to build a single extensible code base that supports a broad variety of deployment models (such as PaaS, IaaS, and on-premises deployments) and a broad variety of storage systems.
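As a rough illustration of the pattern (and explicitly not the VENUS-C code base), a generic worker reduces to a pull-based loop that takes jobs from a queue and dispatches them to registered application handlers, so the same loop can sit behind a public cloud queue, a private deployment, or, as in this Python sketch, a local in-process queue:

    # Simplified sketch of the generic worker pattern (not VENUS-C code).
    from queue import Queue, Empty

    def generic_worker(jobs: Queue, handlers: dict, poll_seconds: float = 1.0):
        """Pull jobs from a queue and dispatch them to application handlers."""
        while True:
            try:
                job = jobs.get(timeout=poll_seconds)
            except Empty:
                continue                      # queue idle: poll again
            if job is None:                   # sentinel: shut the worker down
                return
            # Stage inputs, run the registered application, upload results.
            handlers[job["application"]](job["inputs"])

    # A deployment registers its scientific applications as handlers:
    jobs = Queue()
    jobs.put({"application": "render", "inputs": ["frame_001.dat"]})
    jobs.put(None)
    generic_worker(jobs, {"render": lambda inputs: print("rendering", inputs)})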

Data Management in VENUS-C—Ilja Livenson, Kungliga Tekniska Högskolan

Data management is an important part of making cloud services usable. In this talk, we describe work done in VENUS-C to provide an integration layer for data management facilities.

Room 353
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
15:00–15:30  Coffee break
15:30–17:00  Breakout Sessions
Developing Communication Maturity Models for Data-Intensive Science

Session Chair: Lee Dirks, Microsoft Research

Community Capability Model Deep Dive Interactive Discussion Session—Alex Wade, Microsoft Research

Participants will take part in guided discussions to provide in-depth feedback on the development of the community capability model. This is particularly important for identifying how the model can be refined and finalized for use by the community. The ultimate aim is to provide a framework that is useful for researchers and funders in modeling a range of disciplinary and community behaviors with respect to the adoption, usage, development, and exploitation of cyber-infrastructure for data-intensive research.

Summary and Next Steps for Community Capability Model—Kenji Takeda, Microsoft Research

The session will conclude with a summary of the day’s discussion and an outline of the next steps for the project, including how participants can continue to be involved.

Room 351
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
Cloud Computing: Real-World Experiences

Drug Design Experiments in the Cloud—Simon Woodman, Newcastle University

This talk describes how VENUS-C is exploiting existing drug-discovery services and components, improving them by experimenting on large-scale infrastructures and ultimately deploying them for the benefit and engagement of the pharmaceutical research and healthcare industries.

Green Prefab, Civil Engineering in Cloud Computing—Furio Barzon, Collaboratorio

This talk describes how Collaboratorio is extending its product GreenPreFab to exploit the VENUS-C platform extensively, adding features and models that will allow civil engineers to simulate building structural behavior within the same integrated environment as the green-building commercial value chain. For this scenario, Collaboratorio is applying the emerging platform to a real-case simulation: the “Manifattura Domani” pilot building in Rovereto, Italy. We are also engaging with the community of architects, enabling them to approach the e-Infrastructure via specific tools for experimenting with new building solutions and models.

Panel Session – Supporting Scientific Users in the Cloud

Session Chair: Fabrizio Gagliardi, Microsoft Research

  • Götz Brasche, European Microsoft Innovation Center
  • Ignacio Blanquer, Universitat Politècnica de València
  • Andrea Manieri, Ingegneria Informatica SpA

While cloud computing promises to be a key enabling technology for eScience, its success relies not only on technical capabilities, but also on socio-economic factors. VENUS-C has been addressing a broad spectrum of areas: software architecture, cloud infrastructure, user scenarios, dissemination, cooperation, and training. In this panel session, we will discuss how all of these factors support scientific users in adopting cloud computing. We will also highlight the most important issues from both end-user and service provider perspectives.

Room 353
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
17:00–18:00  Travel break
18:00–22:00  Microsoft Research Banquet

Clarion Hotel Sign
Östra Järnvägsgatan 35, Stockholm
Presentation of Jim Gray Award
Keynote

Knowledge Ecosystems: Data-Intensive Science is More Than Speeds and Feeds—Mark Abbott, College of Oceanic and Atmospheric Sciences, Oregon State University

Data-intensive science is driven by both technology and the nature of our scientific questions. The relentless pace of technology in computation, storage, and networking has enabled an array of new services and systems, ranging from the well-known improvements in high-performance computing to the proliferation of smart sensors in unexpected platforms. Science has also blossomed into new areas of complexity, driven by interdisciplinary questions such as climate change and the treatment of cancer. In both science and information systems, old architectures and institutions are struggling with the disruptive forces of data-intensive science. This new knowledge ecosystem is characterized by radical personalization, rapidly assembling (and disassembling) collaborative networks, and intense real-time communications. Traditional networks of mainframes and clients or isolated desktop PCs are as out of place in this ecosystem as are the traditional networks of scientists working in isolation and publishing in obscure journals three to five years after collecting their data. An ecosystem concept that is far from equilibrium and uniformity, and is characterized more by resilience and flexibility than by robustness and predictability, may be a better model for both science and information technology. Jim Gray’s “laws” of data-intensive science can help guide us through this new conceptualization.

Tuesday, December 6, 2011

8:00–8:30  Registration
9:00–9:30  Opening of IEEE International eScience Conference
9:30–10:30  IEEE e-Science Conference Keynote 1
10:30–11:00  Coffee break
11:00–12:30  Breakout Sessions
Digital Humanities 1

Session Chair: Donald Brinkman, Microsoft Research

Big Archaeology: Creation, Integration, Analysis, and Dissemination of Archaeological Data—Graeme Earl, University of Southampton

This presentation will introduce the work of the Archaeological Computing Research Group at Southampton, tracing the archaeological data life-cycle from creation in the field and lab to public and academic dissemination. Our work at sites such as Portus, the Port of Imperial Rome, has demonstrated the potential of an integrated digital approach to archaeological practice on a large scale. In addition to the data processing and management needs implied by techniques such as laser scanning, three-dimensional geophysics, and reflectance transformation imaging, the use of digital technologies on site has a transformative role to play in interpretation and field practice. We are therefore exploring the potential of wearable prototypes to blur the boundary between the digital and physical research environments of laboratories and field sites, and examining their impact on the archaeological research process. In partnership with the Southampton μ-VIS Centre, we have also begun to employ micro-computed tomography in the imaging of archaeological materials, which presents considerable data visualisation and management tasks to be explored in the paper. Finally, having explored the production of these datasets, the paper will explore modes for integrated examination of rich, spatial information at micro and geographic scales, the construction of graphical simulations for analysis and dissemination, and methods for creating rich publications.

The Archive Without Walls—Jeffrey Schnapp, Harvard University

This presentation will focus on a metaLAB project entitled extraMUROS: an open-source HTML5 infrastructure built on public APIs that enables faculty, students, and the general public to view, annotate, and remix digital multimedia collections and to interconnect them with other high-quality digital repositories across the web. While books (in material and digital form alike) are vital to the future of libraries, we believe that in an increasingly audiovisual world of scholarship and public discourse, it is essential that libraries play a major role in preserving, making available, and providing innovative tools for interpreting and disseminating society’s audiovisual past, present, and future across media.

Room 351
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
Semantics in Action with Data Enrichment

Session Chair: Evelyne Viegas, Microsoft Research

Machines are Users Too: Towards Computational Research Objects—David De Roure, University of Oxford

We are seeing the emergence of new sharable digital artifacts as the practice of data-intensive research evolves, and we can look at these as an alternative to the paper publications which underpin the traditional scholarly knowledge cycle. Unlike papers, these “research objects” may be interactive and executable for the human reader/author, and they facilitate reproducibility and reuse. But they also have a computational dimension, as they are processed, composed, validated, and curated by machine, perhaps autonomically or to assist the human user, for example, through recommendation. We achieve this by bringing semantic computing to Research Objects, in their expression as Linked Data and also through a programming language semantics approach to execution and composition. This talk will present a semantic computing perspective on Research Objects, drawing on our experience with the myexperiment.org social website and the Wf4Ever digital preservation project, and suggest a future where machines better facilitate the practice of data-intensive research.

Publishing Open Government (Linked) Data in Brazil—Karin Breitman, Pontificia Universidade Católica do Rio de Janeiro

The publication of Open Government Data (OGD) that conforms to the W3C Linked Data standard requires that a myriad of public information datasets be converted to RDF triples. A major step in this process is deciding how to represent the database schema concepts in terms of RDF classes and properties. This is done by mapping database concepts to a vocabulary, to be used as the base for generating the RDF triples. Although the construction of this vocabulary is extremely important—because the more one reuses well-known standards, the easier it will be to interlink the result with other existing datasets—most triplifying engines do not support this activity. The problem is aggravated when the original datasets are in a language other than English. Because the vast majority of the de facto standard RDF vocabularies are in English (Dublin Core, SKOS, dCat, FOAF, Good Relations, and so forth), one needs to map and translate concepts from the original databases to the various vocabularies. In this talk, we explore the topic by describing our experience in publishing OGD for the Brazilian government: problems, tooling, process, cloud infrastructure, translation, visualization, publication, LOD, challenges, and opportunities.
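The vocabulary-mapping step can be sketched in Python with rdflib: each database column is mapped either to a well-known vocabulary term or, when none fits, to a locally minted one. The namespace, columns, and sample row below are illustrative assumptions, not the actual Brazilian OGD schema:

    # Sketch of column-to-vocabulary mapping for triplification (rdflib).
    # Namespace, columns, and sample row are illustrative only.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    EX = Namespace("http://example.gov.br/dataset/")
    COLUMN_TO_PROPERTY = {
        "nome": FOAF.name,     # a standard vocabulary term fits: reuse it
        "orgao": EX.agency,    # no standard term fits: mint a local one
    }

    def triplify(rows):
        g = Graph()
        for i, row in enumerate(rows):
            subject = URIRef(EX + f"record/{i}")
            g.add((subject, RDF.type, FOAF.Person))
            for column, value in row.items():
                g.add((subject, COLUMN_TO_PROPERTY[column], Literal(value)))
        return g

    g = triplify([{"nome": "Ana Silva", "orgao": "Ministério da Saúde"}])
    print(g.serialize(format="turtle"))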

F# 3.0 Type Providers – How a Simple Change to Programming Languages Can Open the Floodgates to the Semantic Web—Don Syme, Microsoft Research

The worlds of programming languages and the semantic web seem universes apart. However, in this talk we will show how a simple and intuitive change in programming language architecture can lead to a wonderful union between these two worlds, combining the simplicity and power of modern professional programming tools with the masses of organized data appearing through the structured, organized, schematized data sources now populating the web.

Room 353
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
12:30–13:30  Lunch
13:30–15:00  Breakout Sessions
Digital Humanities 2

Session Chair: Donald Brinkman, Microsoft Research

TextGrid – An Architecture to Store, Access and Manage Diverse Humanities Data—Andreas Witt, Institut für Deutsche Sprache, Zentrale Forschung

TextGrid is a virtual research environment for scholars in the humanities, currently focusing on German literature, art history, classics, linguistics, and musicology. TextGrid’s architecture includes a repository accessible via dedicated web service APIs. It consists of a storage component, which manages file access and replication and uses a storage grid as its data backend, and another component that handles role-based access control. In addition to access rights, certain license policies must be enforced for published content. A TextGrid middleware component, called TG-license, has been designed to enforce these policies. For high scalability, these three TextGrid middleware components are designed to be distributed.

The presentation will give a general overview of the TextGrid architecture and the content of the TextGrid repository. It then will focus on license requirements and describe how the TG-license can handle licensing issues such as data that is:

  • Only publicly accessible 70 years after the death of the author
  • Only accessible if the user has acknowledged the license

The tool to solve these license issues is based on XACML policies that refer to attributes of the user and of the resources, information that is stored in a distributed LDAP directory.
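The following is not XACML but a Python paraphrase of the attribute-based logic such policies encode, with invented attribute names, to make the two rules above concrete:

    # Python paraphrase of the two license rules above (attribute names
    # invented; the real system evaluates XACML against LDAP attributes).
    from datetime import date

    def may_access(resource: dict, user: dict, today=None) -> bool:
        today = today or date.today()
        # Rule 1: publicly accessible 70 years after the death of the author.
        if today.year - resource["author_death_year"] > 70:
            return True
        # Rule 2: accessible once the user has acknowledged the license.
        return resource["license_id"] in user["acknowledged_licenses"]

    text = {"author_death_year": 1950, "license_id": "tg-license-01"}
    workshop_day = date(2011, 12, 5)
    print(may_access(text, {"acknowledged_licenses": set()}, workshop_day))              # False
    print(may_access(text, {"acknowledged_licenses": {"tg-license-01"}}, workshop_day))  # True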

Large-Scale Music Analysis in the Key of E-Science—Stephen Downie, University of Illinois at Urbana-Champaign

The use of eScience technologies in the digital arts and digital humanities research domains represents a fast-growing area of scholarly activity. The Structural Analysis of Large Amounts of Music Information (SALAMI) project is an excellent example of the fruitful union of classic musicology and eScience. SALAMI is a multinational (Canada, United Kingdom, and United States) and multidisciplinary (music theory, library science, and computer science) digital humanities research collaboration. Exploiting 250,000 hours of compute time donated by the National Center for Supercomputing Applications (NCSA) at the University of Illinois, the SALAMI project is conducting the structural analysis of some 20,000 hours (in other words, roughly 2.3 years) of music audio.

Thus, the SALAMI team is undertaking formal music analyses at a scale that no individual human scholar could ever hope to undertake. They will also contextualize the SALAMI project within the broader frameworks of the ongoing Networked Environment for Music Analysis (NEMA) and the Music Information Retrieval Evaluation eXchange (MIREX) projects. The motivations, goals, and developments of these three interrelated projects are presented to help illustrate the kinds of questions being explored by music informatics scholars and the roles that the eScience suite of tools—including HPC, semantic web, and Linked Data techniques—can play in answering those questions.

Room 351
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
Semantics in Action via Services and Policies

Session Chair: Evelyne Viegas, Microsoft Research

Supporting the Virtual Physiological Human with Semantics and Services—Carlos Pedrinaci, Open University

The Virtual Physiological Human (VPH) is a methodological and technological framework that, once established, will enable collaborative investigation of the human body as a single complex system. The collective framework will make it possible to share resources and observations across institutions and organizations that are creating disparate, but integrated, computer models of the mechanical, physical, and biochemical functions of a living human body. Over the last few years, we have been applying semantic and service technologies to support this ambition. Recent work has focused in particular on service descriptions, with supporting tools, which can be represented as Linked Open Data.

Provenance in the Semantic Web—Steffen Staab, University of Koblenz-Landau

The Semantic Web has been developed in order to combine data from a broad range of underlying data sources. Once data sources are combined, the question follows how to identify the provenance of data contributed from these many sources and how to pursue the policies that are bound to such data. In this talk, we will present our approaches for tracking the provenance of RDF data and OWL axioms in SPARQL-based question answering and how to realize (privacy) policies concerning such data.
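One common way to realize such tracking—sketched here with Python's rdflib, with illustrative URIs, and not necessarily the speaker's approach—is to keep each source's triples in its own named graph, so a SPARQL query can return the asserting graph alongside the answer:

    # Sketch: per-source named graphs as provenance, queried via SPARQL.
    from rdflib import Dataset, Literal, URIRef

    ds = Dataset()
    source = URIRef("http://example.org/source/station-42")
    g = ds.graph(source)                 # one named graph per data source
    g.add((URIRef("http://example.org/lake"),
           URIRef("http://example.org/temperature"),
           Literal(4.2)))

    # Ask for the answer *and* the graph (source) that asserted it.
    q = "SELECT ?g ?s ?o WHERE { GRAPH ?g { ?s ?p ?o } }"
    for graph_name, subj, obj in ds.query(q):
        print(graph_name, subj, obj)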

Making the Semantic Web Easier to Use for eScience Applications—Tim Finin, University of Maryland, Baltimore County

Semantic Web technologies have the potential to support science by providing a web-based data representation that ties data to semantics models, facilitates data sharing and linking, supports provenance annotations, and can exploit a large and growing collection of background knowledge on the web. While the concepts and technologies are mature and supported by sound standards, their use within scientific communities remains relatively low. This talk will discuss current research aimed at reducing some of the barriers to wider adoption and use by both professional and citizen scientists. It will describe approaches to publishing and sharing scientific data on social media systems, techniques enabling scientists to generate Semantic Web data by using familiar software tools, and new approaches to querying large collections of Semantic Web data.

Room 353
Norra Latin, Drottninggatan 71B, 111 36 Stockholm
15:00–15:30  Coffee break
15:30–17:00  Software for Science

Session Chair: Yan Xu, Microsoft Research

European Union Support for e-Science Through e-Infrastructures—Kirsti Ala-Mutka, European Commission

Predicting Where on Earth Life Is and Will Be – SDM 2.0—Greg McInerny, Microsoft Research, Cambridge

Predicting the future distribution of biodiversity on the planet is of paramount scientific, economic, governmental, and societal importance as we try to develop understanding and policy that address global environmental change. Downloadable software and data resources have enabled an immense research domain to develop in response to this research agenda, known as SDM (Species Distribution Modelling). Despite more than 20 years of intensive research activity in SDM and more than 20,000 published papers, significant methodological issues remain, and the relationship between the software and the science is highly controversial. Current modelling methods are almost entirely based on correlative statistical models that include a very small part of ecological knowledge and use coarse binary information on species’ occurrence that is prone to measurement error. I will talk about our work to develop a framework for a new generation of methods: SDM 2.0. This framework goes beyond our novel methodological developments that address major sources of uncertainty in biodiversity predictions (for example, Bayesian modelling of measurement errors and biotic interactions); we are also addressing novel techniques for handling uncertain spatial information, all the way to understanding the sociology of the software that drives this scientific domain, which will inform our own development of new SDM software. This holistic approach aims to produce software that enables new kinds of policy-relevant science.
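For reference, the correlative baseline that SDM 2.0 moves beyond amounts to fitting a statistical model from environmental covariates to binary occurrence records. The Python sketch below does exactly that on synthetic data, and by construction ignores the measurement error and biotic interactions the talk addresses:

    # Bare-bones correlative SDM on synthetic data: the baseline that
    # SDM 2.0 aims to move beyond (no measurement-error model here).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))                   # e.g., temperature, rainfall
    p_true = 1 / (1 + np.exp(-(1.5 * X[:, 0] - X[:, 1])))
    y = rng.random(500) < p_true                    # binary occurrence records

    sdm = LogisticRegression().fit(X, y)
    # Predicted probability of occurrence under a new climate scenario.
    print(sdm.predict_proba([[0.5, -0.2]])[0, 1])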

Are We Losing Science Within Software? (panel session)

Panel Chair: Kenji Takeda, Microsoft Research. Panelists: Jeremy Frey, University of Southampton; Alex Szalay, Johns Hopkins University

In this panel session, we will describe the sociology of scientific software development. We will then discuss key questions concerning how the scientific community can work together better to produce new tools, technologies, and platforms to accelerate the pace of scientific discovery.

Room 351
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
18:00–21:00  Dinner at City Hall

Wednesday, December 7, 2011

8:00–9:00  Registration
9:00–10:00  IEEE e-Science Conference Keynote 2
10:00–10:30  Coffee break
10:30–12:00  Breakout Sessions
Open Data for Open Science – An Environmental Informatics Workshop (part 1)

Session Chair: Yan Xu, Microsoft Research

How to effectively share, discover, access, and consume data from heterogeneous sources is one of the outstanding challenges in today’s environmental research. Environmental informatics is about innovatively using computing technologies to solve these data problems and advance the transformation of environmental data to information, to knowledge, and to social impact. The presentations in this session will share various perspectives, scenarios, and demos in environmental data discovery, data processing via HPC and/or cloud computing, and data visualization. Each presentation will end with one or more questions to encourage audience discussion. Please join us with your questions and contribute to an interactive experience at this workshop.

Presentations

Open Data for Open Science – Environmental Informatics at Microsoft—Yan Xu, Microsoft Research

This talk presents the Microsoft Environmental Informatics Framework (EIF), which defines the role that Microsoft Research plays in contributing to the advancement of environmental research worldwide. EIF provides cutting-edge Microsoft technologies to researchers as well as successful examples of engaging the technologies with the most challenging current environmental research problems.

PivotViewer: a Living Infographic for Your Data—Bryan Kraus, Microsoft

PivotViewer takes data in any state of polish and construction and disassembles it into the visual and data elements that tell the true underlying story. This visualization tool provides a bridge from the unknown into the heart of what your results have to offer. Very simply, providing the right visualization and the right UI metaphors with which to immerse yourself in the data means that the value of the whole of the data becomes much greater than the sum of its parts. PivotViewer enables you to see previously invisible trends, which then enables you to iterate on hypotheses rapidly until you see what your data actually supports, thereby reducing the barrier to entry into a world of interactive, fluid, and gracefully designed data-driven experiences.

Data Integration in Environmental Observatory Networks—Ilya Zaslavsky, University of California, San Diego

Environmental observatories represent comprehensive, highly instrumented observing systems that are key components for monitoring and studying interdisciplinary terrestrial, aquatic, and meteorological processes across a range of temporal and spatial scales. Understanding the dynamics of complex environmental systems often requires cross-observatory comparison and data integration, and sophisticated information infrastructure to support data sharing. This presentation will review information infrastructure challenges and solutions from several large-scale environmental observatory projects, including the CUAHSI Hydrologic Information System, the Critical Zone Observatory, and the Tropical Ecology Assessment and Monitoring Network. We will focus, in particular, on issues of semantic consistency, standard information models and services for information exchange, and advanced visualization techniques used to summarize and represent data within a network of observatories.

Visual Analytics Challenges in Environmental Informatics: The Case of Paleoclimatology—Roberto Therón, University of Salamanca (Spain)

Current efforts in environmental research pose several challenges that can be summarized as the great difficulty of making sense of vast and heterogeneous datasets within an increasingly complex process that involves many different disciplines, institutions, and practices. Visual Analytics has emerged in recent years as a field that provides tools and techniques to facilitate the analytical reasoning process through the creation of software that maximizes the human capacity to perceive, understand, and reason about complex and dynamic data and situations.

In order to gain insight into environmental problems, all the available spatio-temporal data sources must be integrated, analyzed, and presented in a meaningful way; it is imperative that new tools and new methodologies be developed to help the experts extract the relevant information. In this talk, Roberto will present several visual analysis examples and lessons learned from his research in the field of Paleoclimatology.

Room 351
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
Is NUI “Natural” for Scientists?

Session Chairs: Kenji Takeda and Stewart Tansley, Microsoft Research

Speakers: Mark Abbott, Oregon State University; Hans-Christian Jetter, University of Konstanz; Madhusudhanan Srinivasan, KAUST; Anne Trefethen, University of Oxford

Natural user interface (NUI) technology has captured imaginations across industry, education, and entertainment with the release of groundbreaking new technologies for interacting with computer systems, such as Microsoft Surface and Kinect for Xbox 360. You don’t have to look far to realize that technology is becoming more natural and intuitive. People use touch and speech to interact with their phones, at the ATM, at the grocery store, and in their cars. The learning curve for working with computers is becoming less and less of a barrier, thanks to more natural ways to interact. How is this technology being used by scientists today, and what is its potential? This panel will explore best practices and new directions, ultimately seeking to present a slice through the current research agenda for NUI research in scientific applications.

Room 353
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
12:00–13:00  Lunch
13:00–14:30 Breakout Sessions
Open Data for Open Science – An Environmental Informatics Workshop (part 2)

Session Chair: Yan Xu, Microsoft Research

Using the OData protocol in eScience—Chris Robinson, Microsoft

OData is a protocol for sharing, querying, and manipulating data that is ideal for use in eScience. In this talk, Chris will explain why a simple, open standard for sharing data is vital and how OData solves common challenges you might face in sharing your data. You’ll also hear about recent additions to OData that cover everything from spatial data to concepts like inheritance.
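A flavor of the protocol: queries are expressed as uniform URL query options. The $filter, $select, $orderby, and $top options in this Python sketch are standard OData, but the service URL and response shape are placeholder assumptions:

    # Sketch: querying a hypothetical OData feed with standard query options.
    import requests

    FEED = "https://example.org/odata/Observations"   # placeholder service
    params = {
        "$filter": "Temperature gt 20 and Site eq 'Stockholm'",
        "$select": "Time,Temperature",
        "$orderby": "Time desc",
        "$top": "10",
    }
    resp = requests.get(FEED, params=params, headers={"Accept": "application/json"})
    for row in resp.json()["value"]:       # JSON payload shape varies by OData version
        print(row["Time"], row["Temperature"])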

On the Processing of Sensing Data in eScience—Antônio Loureiro, Federal University of Minas Gerais

Future eScience applications will rely heavily on sensing technologies to obtain a diversity of data. This creates new challenges for the scientists involved in those eScience applications, challenges that computer scientists are now studying. In this talk, we will discuss the different problems of processing sensing data in eScience and how information and communication technology—in other words, software, hardware, and data communication—can help eScientists solve those problems. We will present some case studies that support these points.

Advancing Digital Urban Informatics for Innovative Environmental and Water Resources Management—Yong Liu, University of Illinois at Urbana-Champaign

Continuing urbanization and regional climate change will increase the likelihood of extreme hydrological events, such as storms and drought, in the coming decades. Data from heterogeneous environmental sensor networks, as well as the social web (such as Twitter), provide an opportunity to develop data-driven participatory science, while increasingly high-resolution physics-based models continue to provide predictive understanding of the natural environment. The “Digital Urban Informatics” project at NCSA, in collaboration with Microsoft Research, has begun to develop a new computational framework that can harmonize data-driven computing and physics-based modeling for science-based, innovative environmental and water resources decision support. Currently, we focus on two science use cases: long-term drought risk analysis in Arizona and short-term flooding situational awareness in South Florida. In this presentation, we will report our progress on three informatics tools and their associated applications:

  • Development of mobile-plus-cloud technology that supports real-time citizen sensing and visualization, using Windows Phone 7 and NCSA streaming data services on the Windows Azure cloud for localized situational awareness (sketched below)
  • On-demand “dropbox”-style ensemble run services on Windows Azure for groundwater risk analysis
  • Interoperability study using Linked Open Data approach for provenance-aware data mash-up across multiple domains

Preliminary results of this project have already attracted great interest in the environmental community, and we will discuss our future plans to extend these services to the broader community.
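On the wire, the citizen-sensing upload in the first bullet above reduces to a phone client posting a geotagged observation to a cloud streaming endpoint; the endpoint URL and payload schema in this Python sketch are hypothetical:

    # Sketch of the citizen-sensing upload pattern (hypothetical endpoint
    # URL and payload schema, for illustration only).
    import requests

    observation = {
        "source": "citizen-phone",
        "lat": 26.12, "lon": -80.14,            # South Florida use case
        "time": "2011-12-07T13:00:00Z",
        "water_level_cm": 34,
    }
    resp = requests.post(
        "https://example.cloudapp.net/streams/flooding",   # placeholder
        json=observation,
        timeout=10,
    )
    resp.raise_for_status()                     # surface upload failures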

Room 351
Norra Latin
Drottninggatan 71B, 111 36 Stockholm
14:30–15:00 Coffee break
15:00–16:00 IEEE Workshops
16:15–17:15 IEEE e-Science Conference Keynote 3
18:30–22:00 Gala dinner at Stockholm venue

Thursday, December 8, 2011

Visit the IEEE International Conference on e-Science website for information about Thursday’s program.