ETHZ.13

RSNAS

Long Title: Remote Scalable Network Attached Storage
Leading Organization: ETH Zürich
Participating Organizations: Swiss National Supercomputing Centre, Universität Zürich
Domain: Grid
Status: finished
Start Date: 10.07.2012
End Date: 28.02.2013
Project Leader: P. Kunszt
Deputy Project Leader: M. De Lorenzi

Do you need to share large amounts of data across the country, or would you like to access large remote storage systems? Then you should take a look at the RSNAS prototype, which proposes a new path.

Results

The result of this project is a working system that provides NAS services anywhere in the country, linked directly to the CSCS storage or, in fact, to any other storage running the same GPFS-based software.
The results have been presented at the HPC Advisory Council and will also be written up in other publications later in the year, including the technical details of the work.

Remote Scalable Network Attached Storage explored a technology by IBM that makes the large remote storage at the Swiss National Supercomputing Centre CSCS in Lugano available to the SystemsX.ch projects in Zurich and Basel as if it were local storage. Network Attached Storage systems are easy-to-use storage systems, usually for smaller amounts of data, that can be attached to any computer or device. The storage at CSCS is a very large (currently 8PB) hierarchical storage system (meaning data is stored more than once, so that it cannot be lost). Due to its very large size, the cost per terabyte is quite low. Several hundred terabytes can be rented at very low cost by research projects that need the space for just 3-4 years, provided the data can be made available "locally".
Such a system, with 200TB, was prototyped at ETH Zurich and the University of Zurich.
Know-how on how to design and configure such a setup was built up, and the setup was tested extensively in close collaboration with CSCS and IBM.

A tested technology for data sharing can now be offered to other projects. CSCS is already offering this technology to other large projects: the new Human Brain Project will use it to access data stored in Lugano from Lausanne.

The main use case for RSNAS came from SystemsX.ch, which will continue to make use of it during the next four years of operations, including for new projects at other universities. Depending on demand, further NAS head nodes can now be deployed easily. This effort will also be rolled into the CRUS Bridge project for the Swiss Academic Compute Cloud (SwissACC), led by UZH.
Currently there are two user communities, one at ETH Zurich and one at UZH. The storage is used as the project storage for the imaging and proteomics platforms, which in turn support several labs at both ETH and UZH, counting on the order of 100 people.


Goals

Within RSNAS a remote storage setup between CSCS and ETHZ will be developed and tested, and a working instance at UZH will be set up. The basic idea is to expose the scalable HSM operated at CSCS over the WAN as local Network Attached Storage at ETHZ.
This prototype will prove that exposing the user's share as local Network Attached Storage (NAS) inside the university's own network can be replicated as a technology and used in many other projects that need to share data across the country, whether for Systems Biology research or, more generally, for accessing large remote storage systems. In a sense, this can be seen as essential technology for building a federated Swiss storage 'cloud' in the future.

Benefit

Currently, remote users need to copy large amounts of data to their local storage systems after their projects at CSCS have completed or before their projects start. With new technology it is possible to expose the user's share (up to several hundred terabytes) as a local Network Attached Storage (NAS) inside the university's own network (in this case ETH Zurich and the University of Zurich). A first installation is already in place at ETHZ, connecting directly to the CSCS HSM storage, with a SystemsX.ch share of currently 100TB.
Scalable storage is an essential prerequisite for modern research, especially in the Life Sciences. Many projects underestimate the needs of their data repositories, and the groups are not well prepared to deal with the large volumes generated within large collaborative projects.
In addition, there are no good concepts for sharing and accessing data across collaborating institutions. Copying terabytes of data over the wide area network is slow, and both sides need large storage infrastructures to accommodate all the data. Changes and additions need to be carefully synchronized within collaborations. Local policies often prevent easy access to collaborative data as well: many historically grown firewall rules make sharing and collaboration difficult.
Several research groups will need large data analytics resources in the future, like those currently available at CSCS from Cray and SGI. How the computational resources of the ETHZ Brutus cluster and the UZH Schroedinger cluster can be attached (provided the data is available at all sites) will be tested within this project.

Description

A technology was prototyped to remotely access the CSCS storage from ETH in Zurich as if it were a local Network Attached Storage (NAS). A server at ETHZ (called storagex.ethz.ch) passes requests through to CSCS and provides NFS access to the data at CSCS, but many open questions remain.
This project does not start from scratch; it extends an already working setup to make it more usable and to cover a wider range of use cases.
The technology used comes from IBM: General Parallel File System (GPFS) and parallel Network File System (pNFS).
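
From an application's point of view, the exported share behaves like any local file system, so existing tools and pipelines can use plain POSIX file I/O without a special client library. The short Python sketch below illustrates this; the mount point /mnt/cscs-project and the directory names are hypothetical examples, not the actual paths used in the project.

    import os

    # Hypothetical local mount point of the remote GPFS/pNFS share; the real
    # export paths are site-specific and not taken from the project documentation.
    SHARE = "/mnt/cscs-project"

    def write_result(name: str, data: bytes) -> str:
        """Write an analysis result onto the remote share as if it were local disk."""
        path = os.path.join(SHARE, "results", name)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(data)  # the NAS head forwards this I/O to CSCS over the WAN
        return path

    def list_datasets() -> list:
        """List datasets kept at CSCS; metadata is fetched transparently over NFS."""
        return sorted(os.listdir(os.path.join(SHARE, "datasets")))

    if __name__ == "__main__":
        print(write_result("example.txt", b"pipeline output"))
        print(list_datasets())
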
The following technologies will be considered:

  • Caching of data locally at every site (setup and testing). This has many implications that need to be addressed: cache updates, retention, etc.
  • Caching of the same data at several sites (work out the conditions and implement ways to work with a distributed system also for writable data). The use cases mostly involve read-only data to which new data is added, but the details will need to be worked out to avoid conflicts.
  • Security mappings (explore the SWITCH AAI path and Active Directory Federation Services; a simplified sketch of such an identity mapping follows after this list). In the initial setup, only specific User ID/Group ID pairs belonging to the CSCS domain have access to the remote data, through dedicated services.
  • Integration into the Systems Biology data lifecycle (propose an economical and efficient way to provide a highly scalable project datastore; adapt services and workflow servers). Currently the services developed by SyBIT perform much of the caching 'manually'; these services need to be adapted to work with this new technology.
  • In a second phase, use locally defined users with Switzerland-wide unique UserID/GroupID pairs.
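
As a rough illustration of the security-mapping item above, the following Python sketch shows the kind of table that translates local campus accounts into the UID/GID pairs valid on the remote storage. All account names and numbers are invented placeholders, and whether such a mapping is ultimately fed by SWITCH AAI, Active Directory Federation Services, or a Switzerland-wide registry is exactly what the project set out to explore.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class IdMapping:
        local_user: str   # account name inside the university network
        remote_uid: int   # UID valid in the CSCS storage domain
        remote_gid: int   # GID valid in the CSCS storage domain

    # Hypothetical mapping table; in a real deployment this would be populated
    # from an identity federation or a site-wide registry, not hard-coded.
    MAPPINGS = [
        IdMapping("eth_alice", 21001, 30010),
        IdMapping("uzh_bob",   21002, 30010),
    ]

    def to_remote_identity(local_user: str):
        """Resolve a local account to the UID/GID pair used on the remote storage."""
        for m in MAPPINGS:
            if m.local_user == local_user:
                return m.remote_uid, m.remote_gid
        raise KeyError("no remote identity registered for " + local_user)

    if __name__ == "__main__":
        print(to_remote_identity("eth_alice"))
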

Steps

  1. Direct access to CSCS from ETHZ: no caching
  2. Integration at ETHZ
  3. Direct access to CSCS from UZH: no caching
  4. Caching design
  5. Security design
  6. Caching pilot
  7. Security implementation
  8. Caching server in place for ETHZ
  9. Caching server in place for UZH, and security services
