Evergrid Launches Global Resource Management Solution for Next Generation Data Centers

Posted by dcparris on Jun 12, 2007 2:26 PM EDT
PR Newswire

New Cluster Availability Management Suite Provides Near 100-Percent Reliability; Scales to Thousands of Nodes

FREMONT, Calif., June 12 /PRNewswire/ -- Evergrid, Inc., a provider of global resource management software for next generation data centers, today announced the Evergrid Cluster Availability Management Suite (CAMS), a new continuous availability and resource management software solution for High Productivity Computing grid environments and the Utility Enterprise data center. CAMS manages server clusters from power-on through operating system provisioning and application scheduling to load management. CAMS is integrated with Evergrid's Availability Management Service (AvS) to provide checkpoint/resume capabilities for applications, including massively parallel distributed applications. With CAMS, batch applications run at near 100-percent reliability.

Evergrid provides transparent fault tolerance using an OS abstraction layer that loads between the operating system (OS) and the application. Without modifying either the application or the operating system, CAMS/AvS periodically captures the collective state of the application across the entire infrastructure while the application continues processing. By recording the state of the application along with all associated OS and system state, Evergrid can checkpoint and resume from failures or interruptions rapidly and with minimal overhead. Even the failure of multiple servers or software systems does not prevent an application from resuming processing from a checkpoint.
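
As a rough illustration of the general checkpoint/resume idea described above, the following Python sketch shows a batch job that periodically saves its state and, after a failure, continues from the last saved state instead of starting over. It is not Evergrid's implementation: Evergrid describes a transparent layer that needs no application changes, whereas this sketch checkpoints explicitly inside the application and on a single machine. The function names (save_checkpoint, load_checkpoint, run_job), the checkpoint file, and the fail_at parameter used to simulate a crash are all hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in as a single atomic rename.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run_job(path, total_steps, checkpoint_every, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    fail_at is a hypothetical hook that simulates a failure at a given step.
    """
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

In this sketch, a run that fails partway through loses only the work done since the last checkpoint; calling run_job again with the same path picks up from the saved step and produces the same result as an uninterrupted run.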

Evergrid's recovery capabilities are aimed especially at long-running, multi-server batch jobs whose runtime is limited by the inherent reliability characteristics of software and hardware. The patented checkpoint/resume technology also allows transparent stateful job preemption and application migration of batch workloads on multiple servers. Moreover, recovery and preemptive scheduling for applications can be done globally, scaling across geographically dispersed data centers.

"What differentiates Evergrid from other solutions that attempt to solve the checkpoint problem is our ability to scale up to thousands of nodes, with less than five percent performance overhead and without OS or application changes," said Dave Anderson, CEO of Evergrid. "You can't get this capability anywhere else."

Evergrid Cluster Availability Management Suite (CAMS) comprises two products, Evergrid Availability Services (AvS-Batch) and Evergrid Resource Manager (RM-Batch). Evergrid AvS-Batch captures the collective state of single or multiple nodes running distributed applications and prevents downtime by performing checkpoint, migration and recovery of the application, thus providing automatic failover across multiple nodes and tiers. Evergrid RM-Batch allows efficient allocation of resources and stateful preemptive scheduling of jobs. CAMS ensures that no compute cycle is lost by recovering, migrating or preempting jobs. This translates to greater flexibility, reliability and utilization of computing resources.

"Software solutions that minimize downtime for compute-intensive applications, improve job execution, and minimize job preemption while maximizing utilization of servers will fundamentally change how we serve our user community," said Henry Neeman, director of the OU Supercomputing Center for Education & Research (OSCER) at the University of Oklahoma.

Evergrid's software is designed for demanding, compute-intensive sectors such as manufacturing, financial services, and pharmaceutical and petrochemical research. Currently, Evergrid solutions target High Performance Technical Computing (HPTC) applications that are computationally intensive and use high-speed interconnects. In the near future, Evergrid will also provide solutions for the High Performance Enterprise Computing (HPEC) and online transaction processing (OLTP) database and enterprise application markets.

Evergrid licenses its Cluster Availability Management Suite software on a per-socket, annual subscription basis, with substantial discounts for large deployments. Evergrid's Availability Management Service can be licensed separately for integration with other resource managers. Currently CAMS and AvS are implemented on multiple versions of Linux. Both Cluster Availability Management Suite and Availability Management Services are available immediately from Evergrid. For more information, go to http://www.evergrid.com.

About Evergrid, Inc.

Evergrid, a provider of global resource management software for next generation data centers, lets massively parallel, distributed applications run properly on high performance cluster grids, at near 100-percent reliability. Evergrid's fault-tolerant application virtualization software prevents downtime, automates checkpoint, migration, and recovery of applications, and scales to thousands of nodes, with less than five percent performance overhead.

Evergrid's leadership team brings extensive management and technology expertise from IBM, Amdahl, VERITAS, Motorola, Tandem Computers and the Virginia Polytechnic Institute and State University. Evergrid is a private company that is funded, in part, by Menlo Ventures and the Acartha Group. For more information, visit http://www.evergrid.com.

Contact: Patricia Colby
Page One PR
650-543-4703
patricia@pageonepr.com

