Fungibility and Virtualization in Grids
Ideally, the resources in a computing grid should be fungible and virtualized. Two resources in a system are fungible if one can be used instead of the other with no loss of functionality. Two single dollar bills are fungible in the sense that each purchases the same amount of goods; if one is destroyed, the other serves just as well. In contrast, in most computer systems today, if one of two physically identical servers breaks, the second is unlikely to take over smoothly. The second server may not be in the right place, or the broken server may hold critical data on one of its hard drives, without which the computation cannot continue.
A system can be architected to attain fungibility, for instance, by keeping data separate from the servers that process it. A long-running computation can checkpoint its data every so often, so that if a host breaks, a replacement host picks up the computation at the last checkpoint when it comes online. If the failed server was running an enterprise application, the replacement can roll back any uncommitted transactions and proceed from there. An online user may notice a hiccup, but the computations remain correct.
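The checkpoint-and-resume idea can be illustrated with a minimal sketch. It assumes the checkpoint file lives on storage that outlasts any single host; the function name `run_with_checkpoints`, the JSON state layout, and the `fail_at` parameter (which simulates a host breaking) are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def run_with_checkpoints(items, checkpoint_path, fail_at=None):
    """Sum `items`, saving progress to `checkpoint_path` after each step.

    `fail_at` is a hypothetical knob that simulates a host breaking
    just before the item at that index is processed.
    """
    # Resume from the last checkpoint if one exists; otherwise start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_index": 0, "total": 0}

    for i in range(state["next_index"], len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated host failure")
        state["total"] += items[i]
        state["next_index"] = i + 1
        # Write to a temporary file, then rename it into place, so a crash
        # in the middle of a write never leaves a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state["total"]


def demo():
    """Run the job on a 'host' that fails, then finish it on a replacement."""
    path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
    data = [1, 2, 3, 4, 5]
    try:
        run_with_checkpoints(data, path, fail_at=3)  # first host breaks
    except RuntimeError:
        pass
    return run_with_checkpoints(data, path)  # replacement host resumes
```

Because the state is kept outside the process that computes it, the second call does not care which host made the first one. That is the sense in which the two hosts become interchangeable in this sketch.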
A virtualized resource has been abstracted out of certain physical limitations. For instance, a 32-bit program can address up to 4 GB of virtual memory, even if the amount of actual physical memory is substantially less. Virtualization can also apply to whole machines: multiple logical servers can be created out of a single physical server. Each logical server runs its own copy of the operating system and applications. This setup makes sense in a consolidation setting, where maintaining the consolidated server costs less than hosting the same workloads on separate, smaller machines. A hosting service provider can provide a client with what looks like an isolated machine but is actually a virtualized portion of a larger machine.
The nodes in a cluster may be “heavy” in the sense of being built as two, four, or more CPUs sharing memory in a symmetric multiprocessor (SMP) configuration. Programs that take more than one node to run can operate in a hybrid Message Passing Interface (MPI)/OpenMP* configuration. These programs expose large-grain parallelism, with major portions running on different nodes and communicating through the MPI message-passing library. Within one node, each portion is split into a number of threads that are allocated to the CPUs of that node. Building software for a hybrid configuration can increase development costs enormously.
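The two levels of decomposition described above can be sketched in a few lines. This is not MPI or OpenMP code: it is a single-process stand-in, assuming a simple sum-of-squares workload, in which the "nodes" are simulated one after another and a thread pool plays the role of the CPUs within a node. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(seq, parts):
    """Split `seq` into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(seq), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks


def node_work(portion, cpus_per_node):
    """Fine-grain level: one node splits its portion across its CPUs."""
    with ThreadPoolExecutor(max_workers=cpus_per_node) as pool:
        partials = pool.map(
            lambda chunk: sum(x * x for x in chunk),
            split(portion, cpus_per_node),
        )
        return sum(partials)


def hybrid_sum_of_squares(data, nodes, cpus_per_node):
    """Coarse-grain level: one portion per node, then combine the results."""
    # On a real cluster each portion would run on a different node and the
    # partial results would be combined by message passing. Here the nodes
    # are simulated sequentially inside one process.
    node_results = [
        node_work(portion, cpus_per_node) for portion in split(data, nodes)
    ]
    return sum(node_results)
```

Even in this toy form, the program has to manage two partitioning schemes and two ways of combining results, which hints at why hybrid development is costly.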
Fungibility also simplifies operations. A node operating in a fungible fashion can be taken out of operation and replaced by another one on the fly. In a lights-out environment, malfunctioning nodes can be left in the rack until the next scheduled maintenance.
In a highly virtualized, fungible, and modularized environment, it is possible to deploy computing resources in small increments to respond to correspondingly small variations in demand. Contrast this with the mainframe environment of two decades ago: because of the expense involved, a shop would wait until the resources of an existing mainframe were maxed out before purchasing and bringing in a new one, in what was literally a forklift upgrade.
The main innovation introduced by IBM’s System/360* was the ability to run the same software base over a range of machine sizes. An organization could purchase a bigger machine as its business grew, a change that was expected to happen over months or years. This capability represented enormous progress over having to re-implement the application base for every new model, as was the case before.
The bar for business agility today is much higher. The expectation for the grid is that resources dedicated to applications can be scaled up and down almost in real time. Outsourcing to service providers represents an alternative to long procurement cycles. Since commodity servers are less expensive than mainframes, the budgetary impact of adding a new server is much smaller than that of adding or upgrading a mainframe. Despite this affordability, however, not all applications can take advantage of extra servers smoothly.
The capability for incremental deployment simplifies business processes and reduces the cost of doing business. It enables new business models, such as utility computing, where service provisioning is metered to match demand.
A pure utility model is not yet practical, because the concept can be taken only so far. Even traditional utilities have different granularities and costs. Consider, for example, a traditional electric utility company, where electrons have different costs depending on the time of day and the energy source from which they were generated. Most utilities hide this fact, presenting most residential customers with a single, integrated bill. On-demand computing is a more attainable degree of utility computing, where relatively non-fungible resources are allocated dynamically, within certain restrictions. One example is capacity-on-demand, where a large server is sold with extra CPUs that are turned on at customer request. One restriction is that, once activated, the new CPUs cannot be turned off, and hence the rates cannot be rolled back.
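The one-way nature of capacity-on-demand can be made concrete with a small model. This is a sketch under stated assumptions, not any vendor's actual scheme: the class name `CapacityOnDemandServer`, the flat `rate_per_cpu` pricing, and the method names are hypothetical.

```python
class CapacityOnDemandServer:
    """Hypothetical model of a server sold with dormant spare CPUs."""

    def __init__(self, installed_cpus, active_cpus, rate_per_cpu):
        if not 0 < active_cpus <= installed_cpus:
            raise ValueError("active_cpus must be between 1 and installed_cpus")
        self.installed_cpus = installed_cpus
        self.active_cpus = active_cpus
        # Assumed pricing: a flat rate per active CPU.
        self.rate_per_cpu = rate_per_cpu

    @property
    def rate(self):
        """Current billing rate, proportional to the active CPU count."""
        return self.active_cpus * self.rate_per_cpu

    def activate(self, count):
        """Turn on `count` dormant CPUs at customer request."""
        if count <= 0:
            raise ValueError("count must be positive")
        if self.active_cpus + count > self.installed_cpus:
            raise ValueError("not enough dormant CPUs installed")
        self.active_cpus += count
        return self.rate

    def deactivate(self, count):
        """Activation is one-way: the rate can rise but never roll back."""
        raise NotImplementedError("activated CPUs cannot be turned off")
```

In this model, a server sold with 8 CPUs and 4 active can move to 6 active and a higher rate, but it has no path back to 4. A pure utility model would meter in both directions.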