UZH.7

Chemistry - Grid integration

Long Title: Framework for Integration of Chemistry applications with the Swiss Grid Portal
Leading Organization: Universität Zürich
Domain: Grid
Status: finished
Start Date: 01.07.2009
End Date: 31.12.2010
Project Leader: S. Maffioletti

This project extends the grid portal framework developed by the Swiss Grid Portal project, in particular the capabilities of its existing modules, to provide dynamic access to application services. Once the capabilities of different applications are integrated, users benefit, for example, from a considerable reduction in time-consuming data format conversions and other code restructuring.

The efforts of this project are aligned with the Swiss Grid Portal project. Both projects continue together in the Swiss Grid Portal extension (after the end of 2010).

Results

Component | Description
MyGAMESS | MyGAMESS portal (available to UZH Chemistry users only)
Portal implementation | Journal of Grid Computing: "GridCertlib: a Single Sign-on Solution for Grid Web Portals"
Portal implementation | Cited in this accepted paper; available through Arxiv.org

Computational chemists are given a simple-to-use web interface for running large computer simulations and analyzing digital data.
Access to the underlying computational infrastructure is hidden behind an analysis-oriented interface that does not expose the technical details of the execution framework; instead, it presents end users with abstractions and services tailored to their research activities.
The result is a web portal specific to computational chemistry, named MyGAMESS and centered around the GAMESS-US application. The portal provides access to the Swiss National Grid Infrastructure (SMSCG) and thus to a large-scale computing facility. Users log into the portal using AAI; no further authentication is required to run data analysis. A simple visualization tool (pyMol) lets users view both input data and output results.

This project is one of the first implementations of a web portal solution that fully integrates the AAI infrastructure with the access to the SMSCG computing infrastructure.
The operation of the portal as well as the subsequent development of customized user interfaces is performed as part of the regular activities of the UZH/GC3 team.
The ability of end users to log into the portal with their familiar AAI credentials and seamlessly access the national computing infrastructure is an unprecedented added value that has been acknowledged at the international level, as experienced during the European Grid Initiative (EGI) conferences.
The approach followed by the GC3 team, separating the portal layer (graphical interface) from the execution layer (based on the GC3Pie framework), allowed an additional degree of flexibility in building end-to-end solutions for a large variety of user groups (as opposed to monolithic approaches, where only the available services can be customized).
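
The separation of portal layer and execution layer can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `JobRequest`, `ExecutionLayer`, `InMemoryExecution` and `Portal` are hypothetical and do not reflect the GC3Pie API or the MyGAMESS code. The point is only that the portal talks to an interface, so the back end can be exchanged without touching the user-facing code.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class JobRequest:
    """What the portal layer hands to the execution layer (hypothetical shape)."""
    owner: str
    application: str
    input_file: str


class ExecutionLayer(Protocol):
    """Hypothetical interface of the execution layer; not the GC3Pie API."""

    def submit(self, request: JobRequest) -> str: ...

    def status(self, job_id: str) -> str: ...


class InMemoryExecution:
    """Stand-in back end that pretends every submitted job is finished."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobRequest] = {}

    def submit(self, request: JobRequest) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = request
        return job_id

    def status(self, job_id: str) -> str:
        return "DONE" if job_id in self._jobs else "UNKNOWN"


class Portal:
    """Portal layer: knows about users and inputs, not about schedulers."""

    def __init__(self, backend: ExecutionLayer) -> None:
        self._backend = backend

    def run_simulation(self, user: str, input_file: str) -> str:
        request = JobRequest(owner=user, application="gamess", input_file=input_file)
        return self._backend.submit(request)

    def job_status(self, job_id: str) -> str:
        return self._backend.status(job_id)


portal = Portal(InMemoryExecution())
job_id = portal.run_simulation("alice", "water.inp")
print(job_id, portal.job_status(job_id))
```

Replacing `InMemoryExecution` with a class that submits to a real grid would leave `Portal` unchanged, which is the flexibility the layered approach is meant to provide.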


Initial situation

In the past we have been working in collaboration with community researchers and teams to leverage new developments and to build application-level services for Computational Chemistry and Biomedical scientific communities. Working with the domain scientists, we have wrapped a set of existing scientific applications in the services model. The services are coupled with back-end computational clusters and job schedulers and use standard grid security best practices for client authentication and authorization.
Based on our experience building and operating these services, we found that end-users wish to integrate the capabilities of the different application services in arbitrary and novel ways that can't be defined a priori.
Currently, integrating these capabilities requires users to learn the intricacies of each software implementation and to perform time-consuming data format conversions and/or other code restructuring. As the number of available applications grows, integrating across them becomes increasingly difficult.
Scientists often need to combine multiple separate application services in highly interactive, user-directed ways: a scientist may inspect data at each step in a sequence in order to determine the next step. Support for such exploratory workflows could lead to great scientific advances.

Goals

In our approach, we leverage and extend the capabilities of the existing modules and projects to provide dynamic access to application services. We will extend the grid portal framework developed by the Swiss Grid Portal project in the following way:

  1. Define a set of desired capabilities across a set of domain applications, and develop the common data representations they require.
  2. Explore algorithms to record exploratory workflows and to convert them into appropriate high-throughput workflows.
  3. Design and build necessary applications workflows for.
  4. Test Quantum Chemistry applications workflows with the grid portal and the JOpera workflow engine.
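
One way to picture goal 2 is the following minimal sketch, which records the steps a user runs interactively and then replays the same sequence over many inputs. It is an illustration under stated assumptions, not the project's algorithm: `WorkflowRecorder`, `scale` and `keep_below` are hypothetical names, the steps are plain Python functions, and no JOpera or grid submission is involved.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Callable[..., Any]


class WorkflowRecorder:
    """Records interactively executed steps so they can be replayed in bulk."""

    def __init__(self) -> None:
        self.steps: List[Tuple[Step, Dict[str, Any]]] = []

    def run(self, step: Step, data: Any, **params: Any) -> Any:
        """Run one step now and remember it (with its parameters) for replay."""
        self.steps.append((step, params))
        return step(data, **params)

    def replay(self, inputs: List[Any]) -> List[Any]:
        """Apply the recorded steps, in order, to every input."""
        results = []
        for data in inputs:
            for step, params in self.steps:
                data = step(data, **params)
            results.append(data)
        return results


# Two toy analysis steps (hypothetical stand-ins for application services).
def scale(values: List[float], factor: float) -> List[float]:
    return [v * factor for v in values]


def keep_below(values: List[float], threshold: float) -> List[float]:
    return [v for v in values if v < threshold]


# Exploratory phase: the user inspects each intermediate result.
recorder = WorkflowRecorder()
scaled = recorder.run(scale, [1.0, 2.0, 3.0], factor=2.0)
filtered = recorder.run(keep_below, scaled, threshold=5.0)

# High-throughput phase: the same sequence is applied to a batch of inputs.
batch = recorder.replay([[1.0, 4.0], [0.5, 2.0, 2.6]])
print(filtered, batch)
```

In a real setting each replayed input would become an independent job, which is what makes the recorded sequence suitable for high-throughput execution.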

The development and testing will be fully integrated with the infrastructure provided through the Swiss Multi-Science Computing Grid (SMSCG) project.

Benefits

The applications are complex and require a flexible user interface that allows for parameterization and modification of the input and visualization of the output. The portal modules must be plug-and-play and easily adaptable to other applications and grid portal solutions.

Our framework will be reusable for other applications and extensible to other grid portal solutions. It is intended to benefit the whole Swiss Grid scientific user community, whose members will be able to create complex application workflows and submit them to the grid resources.

Steps

The work is divided into:

  1. Identification and design:
    • Analysing the application use cases together with the domain scientists.
    • Mapping the use cases to the portal functionalities and modules.
    • Creation of application specific workflows.
    • Creation of requirements for user interfaces.
    • Identification of requirements for data management.
    • Identification of the common data representation.
  2. Integration and deployment:
    • Providing support for customizable workflows.
    • Building, validation and testing of workflows on real use cases.
    • Identification of the high-throughput workflows requirements.
    • Enabling and testing of high-throughput workflows.
    • Strong interaction with other SWITCH/AAI efforts.
