JISE




Journal of Information Science and Engineering, Vol. 21 No. 2, pp. 239-257


More Properties of Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability


Jichiang Tsai*, Sy-Yen Kuo** and Yi-Min Wang+
*Department of Electrical Engineering 
National Chung Hsing University 
Taichung, 407 Taiwan 
E-mail: jctsai@ee.nchu.edu.tw 
**Department of Electrical Engineering 
National Taiwan University 
Taipei, 106 Taiwan 
E-mail: sykuo@cc.ee.ntu.ed.tw 
+Microsoft Research, Microsoft Corporation 
Redmond, Washington, U.S.A. 
E-mail: ymwang@microsoft.com


    Rollback-Dependency Trackability (RDT) is a property stating that all rollback dependencies between local checkpoints are on-line trackable using a transitive dependency vector. In this paper, we introduce some properties of communication-induced checkpointing protocols possessing the RDT property. First, we demonstrate that wherever an RDT protocol detects a PCM-path in the checkpoint and communication pattern associated with a distributed computation, it can also detect an EPSCM-path there. Moreover, if this detected PCM-path is non-visibly doubled, its corresponding EPSCM-path is also non-visibly doubled. Next, we prove that if an RDT protocol breaks all EPSCM-cycles and non-visibly doubled EPSCM-paths, it breaks all visibly doubled EPSCM-paths as well. From these results, we find that some RDT protocols actually have the same behavior for all possible patterns. Furthermore, we construct patterns to show that a few RDT protocols are incomparable in terms of the number of forced checkpoints. Last but not least, we discuss a simulation study that verifies the preceding theoretical results.


Keywords: distributed systems, fault tolerance, rollback-dependency trackability, communication-induced checkpointing protocols, rollback-recovery
