ETHZ.12 | Academic Compute Cloud |
Long Title: | Academic Compute Cloud Provisioning and Usage |
Leading Organization: | ETH Zürich |
Participating Organizations: | Universität Zürich |
Domain: | Grid |
Status: | finished |
Start Date: | 10.07.2012 |
End Date: | 28.02.2013 |
Project Leader: | P. Kunszt |
Deputy Project Leader: | S. Maffioletti |
Component | Description |
Academic Compute Cloud wiki | Results, recommendations, comparisons, surveys and slides (public) |
Workshop April 29 2013 | Slides (public); an 85-page detailed report for the ETH and UZH, intended as input to their respective IT cloud strategies, is available upon request |
Octavius cloud resource | Cloud resource at ETHZ (restricted access; accounts are given to test users on request) |
Hobbes cloud resource | Cloud resource at UZH (restricted access; accounts are given to test users on request) |
Project wiki | Project wiki (protected, open by request to project partners and universities) |
The aim of this project was to build knowledge and to share it with all interested parties in Switzerland.
The team first set out to define what a cloud is, as there are already too many definitions of the term "cloud computing".
In the team's definition, it is a cloud if a user can request certain resources or services on demand at any time without
any prior contracts or long-term commitments. The user can make use of these services for a certain amount
of time and then only has to pay for the actual usage of the resource. The usage is monitored and
the user can see at any time how much is being used and what it costs. Should the user need more or fewer resources,
they can be dynamically scaled up or down. What's more, the user can drive all these services programmatically with
his or her own software. If all this is possible, it is a cloud according to the project team's definition.
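The metering and pay-per-use parts of this definition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `UsageRecord` type, the resource names and the prices per unit-hour are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    resource: str    # hypothetical resource name, e.g. "vcpu" or "storage_gb"
    quantity: float  # number of units in use
    hours: float     # how long the units were in use


# Hypothetical prices per unit-hour (illustrative values, not project figures).
PRICES = {"vcpu": 0.05, "storage_gb": 0.001}


def current_cost(records, prices=PRICES):
    """Sum the cost of all metered usage so far: quantity * hours * unit price."""
    return sum(r.quantity * r.hours * prices[r.resource] for r in records)


# Example: 4 vCPUs for 10 hours plus 100 GB of storage for 24 hours.
records = [UsageRecord("vcpu", 4, 10), UsageRecord("storage_gb", 100, 24)]
print(f"cost so far: {current_cost(records):.2f}")
```

Because the cost is derived only from the usage records, the user can query it at any time and pays nothing for resources that were never requested.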
A ready-made solution from Hewlett Packard was rented and two open-source cloud systems were built on the
OpenStack software, one at the ETH and one at the University of Zurich. All three systems analyzed (the commercial HP
system and the two OpenStack systems) need professionals to operate them. The commercial system ideally needs
expert support directly from the vendor. The open-source solutions are not yet as mature but can be made to work if local expertise exists.
Some charging models for such pay-per-use facilities are proposed, as it is currently not easy for researchers
to obtain funding for such infrastructures.
Both the ETH and the UZH take the output of this project as input to their respective cloud IT strategies. For the immediate future, both pilot infrastructures will be kept operational and will be part of the CRUS Bridge project for the Swiss Academic Compute Cloud (SwissACC), led by the UZH.
The project team will investigate
Cloud approaches provide many advantages that are attractive to the scientific community. Through virtualization, applications have maximum flexibility in their deployment, and new computing models (MapReduce, BigTable, etc.) can be accommodated without the need to reconfigure the cluster. On the infrastructure level, virtualization allows a cluster to be extended dynamically to partners such as another private university cloud (making resource sharing possible on the infrastructure level) or to a public cloud (Amazon EC2, IBM SmartCloud, etc.). This dynamic extension of a local private cloud is generally called Cloud Bursting. It will be tested between UZH and ETH Zurich, and also from both sites to public clouds.
1. Cloud Bursting batch cluster jobs to private and public clouds
The computing cluster can be extended dynamically to another available private cloud only when needed and under
certain conditions. This can also be done without virtualization.
Cloud Bursting to a public cloud such as Amazon EC2 will be explored, as will bursting to the SMSCG infrastructure,
which can run virtual cluster image instances, extending the institutional private cloud dynamically. For
that purpose, results of the VM-MAD project will be used.
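The idea of bursting "only when needed and under certain conditions" can be sketched as a small decision function. This is not the VM-MAD logic; the `ClusterState` and `BurstPolicy` types and their thresholds are hypothetical and only illustrate one possible condition (a minimum backlog) and one possible limit (a cap on remote slots).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    queued_jobs: int       # jobs waiting in the local batch queue
    free_local_slots: int  # idle job slots on the local cluster


@dataclass(frozen=True)
class BurstPolicy:
    min_backlog: int       # burst only if at least this many jobs cannot run locally
    max_remote_slots: int  # upper bound on slots requested from the remote cloud


def slots_to_burst(state, policy):
    """Return how many remote slots to request; 0 means stay local."""
    backlog = state.queued_jobs - state.free_local_slots
    if backlog < policy.min_backlog:
        return 0
    return min(backlog, policy.max_remote_slots)


policy = BurstPolicy(min_backlog=5, max_remote_slots=20)
print(slots_to_burst(ClusterState(queued_jobs=50, free_local_slots=10), policy))
```

A real policy would also have to weigh cost, data locality and the suitability of each application, which this sketch deliberately leaves out.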
Cloud Bursting the cluster has the advantage that end users do not need to change how they work with the system
or their data analysis methods, as the access patterns remain unchanged. Nevertheless, not all applications
are suitable for such a model, and the performance impact on various applications must be investigated.
Applications that will be tested are:
2. Virtualized Environments
In this model an EC2-like service on the institutional private cloud will be offered; end-users can dynamically
instantiate virtual infrastructures for their scientific applications. They control and configure their
own virtual instances. This has the advantage that virtual instances are fully customizable by end-users
to their application-specific requirements that might otherwise not be met on a standardized cluster infrastructure.
With this approach, otherwise unsupported applications, such as those used in VM-MAD by the Functional Genomics
Center, or full virtual clusters (for example a dedicated MapReduce/Hadoop cluster environment) can be set up
to operate on the institutional private cloud.
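As a rough illustration of user-defined virtual infrastructures, the sketch below models a request for a small virtual cluster and checks it against a project quota. The types, the image name `hadoop-worker` and the quota values are all hypothetical; they do not describe the interface of the pilot systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    image: str      # hypothetical name of a user-prepared virtual machine image
    vcpus: int      # virtual CPUs per instance
    memory_gb: int  # memory per instance


@dataclass(frozen=True)
class VirtualClusterRequest:
    spec: InstanceSpec
    count: int      # number of identical instances to start


def fits_quota(request, vcpu_quota, memory_gb_quota):
    """Check that the total request stays within the (hypothetical) quota."""
    total_vcpus = request.spec.vcpus * request.count
    total_memory = request.spec.memory_gb * request.count
    return total_vcpus <= vcpu_quota and total_memory <= memory_gb_quota


# Example: eight Hadoop worker nodes with 4 vCPUs and 16 GB of memory each.
hadoop = VirtualClusterRequest(InstanceSpec("hadoop-worker", 4, 16), count=8)
print(fits_quota(hadoop, vcpu_quota=64, memory_gb_quota=128))
```

The point of the model is that the instance image and size are chosen by the end user, while the provider only enforces overall limits.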