Automatically assigned DDC number:

Manually assigned DDC number: 00435

Title: Replication For Efficiency And Fault Tolerance In A DSM System

Author:

Subject: Anne-Marie Kermarrec Replication For Efficiency And Fault Tolerance In A DSM System

Description: Distributed Shared Memory (DSM) systems implemented on a network of workstations (NOW) have become a convenient alternative to shared memory architectures for executing long-running parallel applications. However, such architectures are prone to failures. This paper presents the design and implementation of a recoverable DSM (RDSM) based on a backward error recovery (BER) mechanism. The design of our RDSM focuses on exploiting data replication for both fault tolerance and efficiency. The RDSM has been implemented on a NOW, and the performance evaluation shows the benefits of exploiting both types of replication to design an efficient, scalable and low-cost recoverable DSM. Key Words: Distributed Shared Memory, Replication, Fault Tolerance, Network of Workstations. 1 INTRODUCTION Networks of workstations (NOW) are an attractive and much cheaper alternative [1] to shared memory parallel architectures for executing long-running parallel applications. A DSM [2] implemented o...

Contributor: The Pennsylvania State University CiteSeer Archives

Publisher: unknown

Date: 1998-04-03

Pubyear: unknown

Format: ps

Identifier: http://citeseer.ist.psu.edu/140391.html

Source: http://www.irisa.fr/EXTERNE/projet/solidor/members/../doc/ps97/pdcs.ps.gz

Language: en

Rights: unrestricted

Graph

<?xml version="1.0" encoding="UTF-8"?>

<references_metadata>

      <rec ID="SELF" Type="SELF" CiteSeer_Book="SELF" CiteSeer_Volume="SELF" Title="Replication For Efficiency And Fault Tolerance In A DSM System" />

</references_metadata>
