Clusters: Affordable and Available
By Larry Stevens
High availability, scalability and reasonable pricing are the reasons
users are choosing cluster systems, but clusters may not be the best
choice for every application.
Two heads may be better than one, but two complete bodies--or four, six
or eight--are better yet. That's the theory behind clusters. Along with
massively parallel processing (MPP) and symmetric multiprocessing (SMP),
clusters form a trio of technologies that aim to boost server functionality,
performance and availability.
Each of these three technologies has its pros and cons. SMP machines, which
allow you to add processors that share other resources, are good for boosting
processing power incrementally. For example, if you have 1,000 users accessing
a database application and you want to be able to double or triple the number
of users without investing in an entirely new system, SMP offers that
scalability at the lowest cost and with the least development time.
However, SMP processors have local caches but share a single main memory.
As a result, they don't always scale well. Each time you add a processor,
the performance boost is smaller, because you're increasing the burden on
central memory and the memory-processor bus. At a certain point, adding
another processor may actually decrease performance. Accordingly, SMP machines
rarely go beyond eight processors. At that limit, the normal choice is to
replace the SMP box with MPP--a process that will require much reprogramming--or
buy a second SMP system and connect it to the first in a cluster.
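To see why the increments shrink, consider a back-of-the-envelope model in the spirit of Amdahl's law. The sketch below is illustrative only; the 10 percent figure for work that serializes on shared memory is an assumption chosen for the example, not a measurement of any particular machine.

```python
# Illustrative only: an Amdahl-style model of SMP scaling in which an
# assumed 10 percent of the work serializes on shared memory and the
# memory-processor bus. The numbers are not measurements.

def smp_speedup(processors: int, serial_fraction: float = 0.10) -> float:
    """Speedup over one processor when serial_fraction of the work
    contends for shared resources and cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

if __name__ == "__main__":
    previous = smp_speedup(1)  # 1.0 by definition
    for n in range(2, 9):
        current = smp_speedup(n)
        print(f"{n} processors: {current:.2f}x total, "
              f"+{current - previous:.2f}x from the last processor added")
        previous = current
```

Under this model, the second processor adds about 0.8x, while the eighth adds barely 0.3x, which is the shrinking return the text describes.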
In MPP, sometimes called "shared nothing," each processor has
its own local memory and operates independently. Accordingly, MPP systems,
which may have hundreds of processors, don't suffer from the bottlenecks
inherent in SMP. But MPP machines are expensive and present some programming
challenges. The use of MPP systems in business, as opposed to scientific
or engineering applications, is still new. Most of the early commercial
adopters plan to use them to replace high-end mainframes.
In contrast, clustering offers a way to scale up to about 64 processors,
if not hundreds, and to do so less expensively than with MPP, incrementally,
and in some cases with minimal reprogramming. A cluster is a set of two
or more loosely coupled systems that act as one complete server. Clusters
generally are made up of SMP machines (usually with two to eight processors
each), a shared-disk subsystem and software that takes over in case of
a failure.
The key advantage of clusters over the other technologies for many users
is that they provide a degree of high availability at a much lower cost
than traditional fault-tolerant (or hot standby) systems. Clusters appeal
to companies that want high availability but aren't content to let the
extra processors sit idle until they're needed.
"It's easy for an SMP machine to go down if one processor fails,"
says Derek Kaufman, middleware manager for clothing manufacturer Levi Strauss
& Co. in San Francisco. "MPP is still too new and expensive, although
that might be changing. So far clusters are the most flexible and reasonably
priced solution for scaling up and for high availability."
New to Unix
Unlike SMP and MPP, clustering is not a new technology. The VAXcluster from
Digital Equipment Corp., which uses the OpenVMS operating system, has been
around since the early 1980s. But clusters are new to Unix. It wasn't until
companies began to use Unix servers for mission-critical applications that
high availability, and therefore clusters, became important in the Unix
arena.
In 1993 DEC leveraged its clustering know-how and introduced its Unix-based
DECAdvantage. The same year, IBM, with the help of Clam Associates of Cambridge,
MA, introduced High-Availability Cluster Multi-Processing (HACMP) for AIX,
its Unix variant. Now over half a dozen companies provide Unix clusters,
including Data General, Hewlett-Packard, Pyramid Technology, Sequent Computer
Systems and Sun Microsystems.
Since most clusters are made up of SMP machines, virtually all cluster makers
also sell SMP systems. MPP, on the other hand, is new and specialized. IBM
and Pyramid are two cluster companies that also sell MPP systems, but only
Pyramid allows you to include its MPP system in a cluster.
Hot for Uptime
While clusters give companies both increased scalability and high availability,
currently most users are more interested in availability. They also value
the fact that members of a cluster can be busy with processing tasks while
standing ready to take over another member's work in the event of a failure.
"Our hot standby system gave us high availability, but clustering allows
us to make the best use of our investment," says Peter Smith, systems
manager for TMI Communications in Ottawa, Canada. The company provides MSAT
satellite communications to telecommunications carriers. It currently has
two DEC AlphaServer 2100s (each with two processors) but uses only one.
The second is a hot standby connected to the first via DECsafe Available
Server.
Clusters achieve their high availability through redundancy. A fail-over
system, such as DECsafe, includes "heartbeat" software, in which
each member of the cluster continually checks the others to ensure that
they are running properly. As soon as a member senses a failure, it takes
over the jobs of the failed node. Normally, users who were logged on
to the failed machine are automatically logged back in to another machine,
a process that may take anywhere from several seconds to several minutes.
Automatic fail-over
is important when purchasing a cluster for high availability. "Without
it, getting users back online is manual and can take too long," says
Wayne Kernochan, director of commercial systems research at the Aberdeen
Group in Boston.
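In outline, the heartbeat mechanism works something like the following sketch. It is a minimal illustration of the idea, not DECsafe's or HACMP's actual protocol; the interval and miss limit are arbitrary values, and real failover software must also handle network partitions, quorum and shared-disk takeover.

```python
# A minimal sketch of the "heartbeat" idea: each cluster member records
# when it last heard from each peer, and a peer that stays silent for
# too many intervals is declared failed and its work taken over.
import time

HEARTBEAT_INTERVAL = 1.0  # seconds between expected heartbeats
MISSED_LIMIT = 3          # silent intervals before declaring a failure

class ClusterMember:
    def __init__(self, name, peers):
        self.name = name
        self.last_heard = {peer: time.monotonic() for peer in peers}

    def record_heartbeat(self, peer):
        """Call whenever a heartbeat message arrives from a peer."""
        self.last_heard[peer] = time.monotonic()

    def check_peers(self):
        """Return the peers that have been silent for too long."""
        now = time.monotonic()
        return [peer for peer, heard in self.last_heard.items()
                if now - heard > MISSED_LIMIT * HEARTBEAT_INTERVAL]

    def take_over(self, peer):
        # A real cluster would mount the shared disks, assume the failed
        # node's network address and restart its services here.
        print(f"{self.name}: {peer} failed, taking over its workload")

if __name__ == "__main__":
    node = ClusterMember("node_a", peers=["node_b"])
    time.sleep(3.5)  # simulate node_b going silent past the limit
    for failed in node.check_peers():
        node.take_over(failed)
```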
MSAT, an alternative to cellular technology, provides communications via
pager, fax or voice phone to users through resellers such as Bell Mobility.
Currently, one AlphaServer 2100 is enough to run TMI's Customer Management
Information System (CMIS) applications, which hold data for each MSAT customer
it serves. The server is used to activate the service for customers and
to collect usage data for billing purposes. While TMI provides 24-hour-by-7-day
service, employees work 9 to 5 and not on weekends. "We have pagers,
but we don't want to be wakened in the middle of the night," Smith
says.
Soon, TMI will convert the pair of servers to a cluster solution DEC calls
TruCluster. It includes a wide-channel bus technology called Memory Channel.
Based on Peripheral Component Interconnect (PCI, a bus standard developed
by Intel) and licensed from Encore Computer Corp. of Ft. Lauderdale, FL,
Memory Channel enables clustered servers to process queries much faster
than through SCSI I/O channels. Also part of TruCluster is the Oracle Parallel
Server, which can distribute a database query across multiple nodes in the
cluster. This activity is transparent to the user.
The hot standby solved the problem of availability, but Smith says with
Memory Channel the company can make better use of both machines. He plans
to put all system software on one machine and user information and applications
on the other. Memory Channel will allow the two systems to work together
as a single machine, increasing overall system performance. In the event
of a failure of one server, all the processing tasks (both systems and applications)
will be moved to the second server. In that case, performance will degrade
a bit, but Smith isn't worried. "In the worst-case scenario, in the
event of a failure, the performance of the system will revert to what it
is now, and that's acceptable," he says.
TMI's requirement is 99 percent uptime, and Smith expects to beat that when
the cluster system is installed. But the main advantage, in his view, is
economic. "We're getting the most bang for the buck, because we can
utilize both machines most of the time," he says.
Not all of the system's availability comes from the cluster itself. The
fact is that, in a stand-alone mode, many of today's servers already have
availability over 99 percent. To increase that percentage, you can add things
like mirrored or duplexed drives, data striping, RAID disk arrays and hot-swap
features. With a carefully designed configuration, a company can achieve
about 99.9 percent on a stand-alone SMP machine. What clusters bring to
the party is availability above that number.
Demanding, say, 99.9999 percent availability instead of 99.9 percent may
seem obsessive. But translated into actual downtime, that means moving from
roughly 500 minutes a year to only 30 seconds. Virtually any application
can stand 30 seconds of downtime, but many companies cannot accept having
their sales or service applications down for the better part of an hour
each month.
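The arithmetic is easy to check. A few lines of Python, assuming a 525,600-minute year:

```python
# Downtime implied by an availability percentage, using a
# 525,600-minute year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def annual_downtime_minutes(availability_pct: float) -> float:
    return MINUTES_PER_YEAR * (1.0 - availability_pct / 100.0)

print(annual_downtime_minutes(99.9))          # about 526 minutes a year
print(annual_downtime_minutes(99.9999) * 60)  # about 32 seconds a year
```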
Down on Purpose
While clusters provide a modest reduction in unexpected downtime compared
with stand-alone servers, their effect is more dramatic when it comes to
scheduled downtime. Shutdowns for maintenance, which can total as much as
15 hours a month for a large operation, are becoming increasingly unacceptable.
Clusters let you shift processing off a node before powering it down for
maintenance.
"The only time we need to shut down is when we're upgrading our cluster
software. Otherwise, moving processing to another node takes seconds,"
says Phil Zmigrodski, director of software development at Hygrade Furniture
Distribution and Delivery Systems in South Kearny, NJ. Hygrade uses a two-node
IBM RS/6000 HACMP cluster. Overall, Zmigrodski says the company achieves
about 99.99 percent availability.
Hygrade purchased a HACMP cluster a year and a half ago to support its new
video catalog system. Furniture retailers purchase or lease the video catalog,
which is a PC combined with a satellite hookup to the RS/6000 model 570
at Hygrade's headquarters. Each day the 570 downloads an upgraded video
catalog to each store. Customers shop through the catalog and, if they choose
to make a purchase, enter name, address and product choice. The data is
sent back to the RS/6000 570, which in turn sends it to the company's RS/6000
model 590. The 590 system handles fulfillment functions such as routing
the order information to the appropriate shipping dock facility.
Because many of Hygrade's retailers rely on the video catalog and don't
stock floor models of the furniture, high availability of the system is
essential. Even a few minutes of downtime could result in lost sales. But
purchasing a hot standby for each of the firm's RS/6000 servers could increase
the price of the system by over $200,000, according to Zmigrodski.
The solution was to cluster the two servers. "The 570 and the 590 have
very different jobs to do, but when we bought them, we kept in mind that
either one had to be able to take over the job of the other," he says.
Hygrade hasn't had an emergency shutdown since purchasing the system, but
it has moved processes from one machine to the other when upgrading software.
Of course, these activities usually are performed at off-peak periods. But
Zmigrodski says that when he does so, he sees no degradation of performance.
Flexible Options
Although clusters don't provide the 100 percent availability offered by
fault-tolerant machines, they are more economical and flexible. That's because,
when configuring your cluster, you can select where on the scalability/availability
continuum your requirements lie. For example, to optimize for availability,
you can have one machine in the cluster standing idle for each one that
is running. You can move a bit closer to the scalability side of the continuum
by having one machine idle for every two, three or four machines operating
at full capacity. Or you can optimize for scalability by using all the machines
at full tilt. If a member of the cluster goes down, performance on the other
machines, which now have an added burden, will degrade. But for some companies,
that's an acceptable trade-off.
And there are ways to configure a system to minimize degradation in performance.
For example, you can configure the cluster to shut down all batch operations
in the event of a failure. The multitude of options is what makes clusters
so flexible.
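One way to picture those options is as a failover policy. The sketch below is hypothetical: the data layout and the batch-shedding rule are invented for the illustration, and real cluster managers express the same choices through their own configuration tools.

```python
# Hypothetical failover policy capturing the trade-offs above: services
# from a failed node move to the least-loaded survivor, and batch work
# can be shed first to limit the performance hit.

def fail_over(nodes, failed, shed_batch=True):
    """Reassign a failed node's services among the survivors."""
    survivors = [n for n in nodes if n["name"] != failed]
    orphaned = next(n for n in nodes if n["name"] == failed)["services"]
    for service in orphaned:
        if shed_batch and service["type"] == "batch":
            continue  # drop batch work to protect interactive response
        target = min(survivors, key=lambda n: len(n["services"]))
        target["services"].append(service)
    return survivors

cluster = [
    {"name": "node1", "services": [{"name": "oltp", "type": "online"}]},
    {"name": "node2", "services": [{"name": "billing", "type": "batch"}]},
    {"name": "node3", "services": []},  # an idle standby, if you opt for one
]
# node1 fails; its online work lands on the idle standby, node3.
for node in fail_over(cluster, failed="node1"):
    print(node["name"], [s["name"] for s in node["services"]])
```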
The emphasis on high availability doesn't minimize the need for scalability,
a key item on nearly everyone's IT features list these days. Data, applications
and the number of users who access them are growing constantly, and computer
systems have to stay ahead of this curve.
The ultimate in scalability comes not from clusters alone but from combining
them with SMP or MPP technology. If you want to replicate the processing power
and availability of a mainframe operation with a Unix system, MPP, with
its hundreds of microprocessors, may be the only option. Until recently,
clustering SMP was the norm, and doing so with MPP was virtually unheard
of. Last March, Pyramid Technology announced the Reliant RM1000 Cluster
Server, which lets users integrate the MPP-based Reliant1000 Parallel Server,
which can run up to 300 nodes, with up to 16 Nile or RM600 SMP servers
(for a possible total of 256 processors) in a single system running Pyramid's
Reliant
Unix variant.
This is an example of what Kernochan of Aberdeen Group calls "fusion
technology," which he defines as the tight integration of SMP and MPP
in a cluster. "It lets you solve the problem with the right technology,
whatever that might be," he says. For example, if you need more processing
power, the cheapest way to get it is to add another processor to the SMP
machine. But that could result in bottlenecks, so the next option might
be to add a second SMP machine and create a cluster. Or if you want to move
a decision support system or online transaction processing (OLTP) onto the
cluster and need a tremendous amount of power, add an MPP machine. "This
makes it virtually impossible to hit a computational limit for commercial
applications," says Kernochan.
Not for All Uses
Despite the advantages enumerated above, cluster technology has limitations.
In particular, when compared to SMP or MPP in terms of speed, "Clusters
always lose," says Jonathan Eunice, an analyst at Illuminata, Inc.,
a consulting firm in Nashua, NH.
According to Eunice, the greatest benefit of clustering will come when it
can be put to use in parallel processing, where multiple queries or tasks
are handled simultaneously because they're assigned to different processors.
Parallel processing is especially important for OLTP and other realtime
applications in industries such as banking and insurance. In order to be
successful, he estimates, throughput needs to be at least in the range of
200 to 300 megabytes per second (MB/s). In the past, when 20MB/s to 40MB/s
over a SCSI storage interconnect was the norm, those applications ran poorly
on clusters, he says. DEC's TruCluster with Memory Channel, announced in
April, has a bandwidth of 100MB/s. Eunice calls this a significant improvement,
though he hasn't been able to evaluate its effect on these types of applications.
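The pattern Eunice describes--independent queries assigned to different processors and handled simultaneously--looks like this in miniature. The sketch uses a local process pool purely for illustration and says nothing about how Oracle Parallel Server or any particular cluster product actually distributes work.

```python
# The parallel-processing pattern in miniature: independent queries are
# assigned to different processors and handled simultaneously. A local
# process pool stands in for the nodes of a cluster; run_query is a
# placeholder for real work against a shared database.
from concurrent.futures import ProcessPoolExecutor

def run_query(query: str) -> str:
    return f"result of {query!r}"

if __name__ == "__main__":
    queries = [f"lookup account {i}" for i in range(8)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        for result in pool.map(run_query, queries):
            print(result)
```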
Another potential roadblock in moving from a single machine to a cluster
is whether you'll have to reprogram your applications
to be cluster-ready. Programming an application that runs on a single SMP
or MPP machine to run on a cluster may take little or no work, or it may
require extensive rewriting. In general, moving from a single SMP to a cluster
that includes MPP does require reprogramming. However, if you're running
a parallel server on an SMP machine, moving to a clustered solution may
be easy, because the database application runs on the cluster just as it
would on a single SMP machine.
According to Patrick Smyth, director of marketing for DEC's Unix business
segment in Maynard, MA, clustering is a good fit for an application such
as SAP America's R/3 suite, where the configuration is a huge database server
surrounded by many application servers. "You can manage the cluster
as one database environment," Smyth says.
While in the near term Digital has raised the prospects of Unix servers
by creating fast clusters, at the same time it's romancing users of Microsoft
Windows NT. Digital plans to use the Memory Channel interconnect with its
clustering software on Intel processor-based Prioris servers running NT.
However,
skepticism is rampant regarding NT's scalability above four processors.
Still, running either operating system, clustering is a technology that
most analysts say will continue to make gains. "In five years every
serious server will be clustered," predicts Eunice. He asserts that
even if circa-1999 stand-alone servers are powerful enough to handle
all business applications, customers will want clusters for high availability.
"It's a simple matter of not wanting to put all your processing eggs
in one basket," he says.
If your Unix machines and client/server systems are taking on larger and
more important applications that used to be run by mainframes, you may need
a range of options. You'll have to find some way--preferably more than one--to
replicate the mainframe's millions of instructions per second as well as
its high availability or fault tolerance. Clustering SMP machines
allows you to boost the performance of an already powerful server while
at the same time providing high availability. Adding to the cluster an MPP
machine, while expensive and sometimes difficult to program, will allow
your company to offload virtually any mission-critical system to a Unix
base.
Larry Stevens writes about business and technology from
Monson, MA. He can be reached at 71412.631@compuserve.com.