Fault Tolerant MPI
Clusters of every size experience failures: processors die, hard disks crash, and interface cards have been known to produce spurious errors. Software can fail too, for any number of reasons. Prevention is a necessity, but it cannot catch everything, so the next best option is to detect and respond to faults as they occur. If you're a cluster developer, Fault Tolerant MPI (FT-MPI) can help keep your compute jobs humming.