
Chapter 13

Mixing Grids and Clouds: High-Throughput Science Using the Nimrod Tool Family

Blair Bethwaite, David Abramson, Fabian Bohnert, Slavisa Garic, Colin Enticott, and Tom Peachey

B. Bethwaite (*)
Faculty of Information Technology, Monash University, Clayton, Australia
e-mail: blair.bethwaite@infotech.monash.edu.au

Abstract  The Nimrod tool family facilitates high-throughput science by allowing researchers to explore complex design spaces using computational models. Users are able to describe large experiments in which models are executed across changing input parameters. Different members of the tool family support complete and partial parameter sweeps, numerical search by non-linear optimisation and even workflows. In order to provide timely results and to enable large-scale experiments, distributed computational resources are aggregated to form a logically single high-throughput engine. To date, we have leveraged grid middleware standards to spawn computations on remote machines. Recently, we added an interface to Amazon’s Elastic Compute Cloud (EC2), allowing users to mix conventional grid resources and clouds. A range of schedulers, from round-robin queues to those based on economic budgets, allows Nimrod to mix and match resources. This provides a powerful platform for computational researchers, because they can use a mix of university-level infrastructure and commercial clouds. In particular, the system allows a user to pay money to increase the quality of the research outcomes and to decide exactly how much they want to pay to achieve a given return. In this chapter, we will describe Nimrod and its architecture, and show how this naturally scales to incorporate clouds. We will illustrate the power of the system using a case study and will demonstrate that cloud computing has the potential to enable high-throughput science.
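To make the notion of a complete sweep concrete, the short Python sketch below enumerates the cross-product of some parameter values and launches one model execution per combination. It is illustrative only: the parameter names, value ranges and the "./model" command line are invented, and Nimrod itself expresses such experiments in its own declarative plan files rather than in Python.

import itertools
import subprocess

# Hypothetical design space: a complete sweep runs the model once for
# every combination of these values (4 * 3 * 2 = 24 independent jobs).
parameters = {
    "viscosity": [0.1, 0.2, 0.4, 0.8],
    "temperature": [280, 300, 320],
    "mesh_size": [64, 128],
}

def run_model(point):
    # Launch one execution of the computational model for one point in
    # the design space; each such job could be dispatched to a grid node
    # or a cloud instance.
    args = ["--%s=%s" % (name, value) for name, value in point.items()]
    subprocess.run(["./model"] + args, check=True)

names = list(parameters)
for values in itertools.product(*(parameters[name] for name in names)):
    run_model(dict(zip(names, values)))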

13.1  Introduction

Traditionally, university research groups have used varying sources of infrastructure to perform computational science, from clusters owned by individual departments to high-end facilities funded by federal governments. While these are priced differently, they have rarely been provided on a strict commercial basis. Local clusters, for example, are usually funded by recurrent university funding or by one-off grants. Further, access to these machines is controlled by the users themselves. With regard to high-end facilities, such as the Australian National Computational Infrastructure (http://www.nci.org.au) or the US TeraGrid (http://www.teragrid.org), it is often necessary to apply for a peer-reviewed grant, and the quality of the application is assessed by a resource-allocation committee. However, these grants are usually made in terms of CPU hours rather than dollars.

Cloud computing represents a major shift in the provisioning and delivery of computing infrastructure and services. It replaces distributed, unmanaged resources with scalable, centralised services that are managed in professional data-centres and can be provisioned elastically on demand. Most importantly, commercial cloud services have appeared in which users pay for access by the hour. These services give university researchers the opportunity to buy compute time on an ad hoc basis, shifting university funding models from capital expenditure to recurrent costs.

This transition poses many policy issues as well as a range of technical challenges. Existing resources that are free will not disappear; there is clearly a role for continued investment in university infrastructure. On the other hand, commercial clouds could provide an overflow, or elastic, capability for individual researchers. One could easily imagine a research group performing much of their base-load computations on ‘free’ resources, but resorting to pay-as-you-go services to meet peak demand. To date, very few tools can support both styles of resource provisioning.

Many years ago, we introduced the idea of a computational economy as a mechanism to enable resource sharing on an open basis [1]. In this model, resource providers charged for time and users paid for it. At that time, we envisaged only a pseudo-currency that allowed different users to compete for scarce resources. A user willing to pay more has a better chance of meeting a deadline, and will complete more work than one who is prepared to pay less. We implemented this scheme in the Nimrod tool family [2], though the lack of global infrastructure based on this model made it more of an academic proposal.

However, the Nimrod computational economy provides an ideal mechanism for mixing free and pay-as-you-go commercial cloud services. Interestingly, the same algorithms that we proposed for the computational economy can be used to trade-off resources in such a mixed grid.
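As a rough illustration of that trade-off, the sketch below estimates how many paid cloud cores, and roughly how much money, would be needed to finish an experiment by a deadline once the free resources are fully loaded. This is a minimal back-of-the-envelope calculation under made-up assumptions (identical, independent jobs and a flat per-core-hour price), not Nimrod's actual scheduling algorithm.

import math

def overflow_cost(n_jobs, job_hours, free_cores, deadline_hours, price_per_core_hour):
    # How many jobs one core can finish before the deadline.
    jobs_per_core = int(deadline_hours // job_hours)
    if jobs_per_core == 0:
        raise ValueError("deadline is shorter than a single job")
    # Jobs the free grid resources can absorb, and the overflow left over.
    jobs_on_free = min(n_jobs, free_cores * jobs_per_core)
    overflow = n_jobs - jobs_on_free
    if overflow == 0:
        return 0, 0.0
    # Paid cloud cores needed to clear the overflow, and their cost if
    # they are rented for the whole deadline window.
    cloud_cores = math.ceil(overflow / jobs_per_core)
    return cloud_cores, cloud_cores * deadline_hours * price_per_core_hour

# Example: 5,000 one-hour jobs, 64 free cluster cores, a 24-hour deadline
# and a hypothetical $0.10 per core-hour cloud price.
cores, dollars = overflow_cost(5000, 1.0, 64, 24, 0.10)
print("need %d cloud cores, costing about $%.2f" % (cores, dollars))

Paying more, by renting additional cores or renting them for longer, buys a tighter deadline or more completed work, which is exactly the lever the computational economy exposes to the user.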

In this chapter, we discuss the Nimrod tool family and describe the kind of high-throughput problems that it solves. We discuss the scheduling system that Nimrod uses to balance time- and cost-based deadlines, and show that these can be used on a mixed test-bed consisting of grid and cloud resources. We then illustrate the power of the system to achieve scientific outcomes. Our case study shows that a user has the ability to decide how much money they are prepared to pay for improved science outcomes. Specifically, the case study explores the basic science that can be delivered from a typical university department cluster, and shows how the Amazon Elastic Compute Cloud (EC2) (http://aws.amazon.com/ec2/) can augment this to improve the science outcomes. The chapter also discusses some of the
