JISE




Journal of Information Science and Engineering, Vol. 10 No. 2, pp. 259-269


Fault Tolerant Distributed Simulation


Yi-Bing Lin
Bell Communications Research 
Rm 2D297, 445 South Street, Morristown NJ 07960 
U.S.A.


    This paper presents a fault-tolerant protocol for distributed Time Warp simulation. Based on the concept of global virtual time, we show that a distributed snapshot of a Time Warp simulation can be taken efficiently. A set of simple distributed snapshot algorithms and fault recovery algorithms is proposed. The distributed snapshot algorithms checkpoint the system state (as distributed snapshots) from time to time. The fault recovery algorithms restore the system state from the most recent distributed snapshot. The protocol is robust enough to tolerate failures occurring at any moment.


Keywords: discrete event simulation, distributed simulation, distributed snapshot, fault tolerance, global virtual time, time warp protocol
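
    The idea summarized in the abstract (checkpoint at global virtual time, then restore from the most recent checkpoint) can be illustrated with a minimal sketch. This is not the paper's algorithm: the names LogicalProcess, global_virtual_time, take_snapshot and recover are hypothetical, the processes live in one Python program rather than on separate machines, and the sketch leaves out in-transit messages, which a real snapshot would presumably also have to record. It assumes the standard definition of global virtual time as the minimum over all local virtual times and the timestamps of messages still in transit.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Hypothetical stand-in for one Time Warp process."""
    name: str
    lvt: float = 0.0                             # local virtual time
    state: dict = field(default_factory=dict)
    saved: list = field(default_factory=list)    # (virtual_time, state copy)

    def advance(self, new_time, **updates):
        # Save the state as of the current local virtual time, then
        # apply the event that moves the process to new_time.
        self.saved.append((self.lvt, copy.deepcopy(self.state)))
        self.lvt = new_time
        self.state.update(updates)


def global_virtual_time(processes, in_transit_timestamps):
    # Assumed definition: minimum of all local virtual times and of the
    # timestamps of messages that have been sent but not yet received.
    return min([p.lvt for p in processes] + list(in_transit_timestamps))


def take_snapshot(processes, gvt):
    # For each process, keep the latest saved state whose virtual time
    # does not exceed GVT. No rollback can reach below GVT, so these
    # states are never invalidated later.
    snapshot = {}
    for p in processes:
        candidates = [(t, s) for t, s in p.saved + [(p.lvt, p.state)] if t <= gvt]
        t, s = max(candidates, key=lambda c: c[0])
        snapshot[p.name] = (t, copy.deepcopy(s))
    return snapshot


def recover(processes, snapshot):
    # Restore every process from the snapshot and discard later history.
    for p in processes:
        t, s = snapshot[p.name]
        p.lvt, p.state = t, copy.deepcopy(s)
        p.saved = []
```

    In this sketch a failure at any point is handled the same way: every process is reset to the state recorded in the last snapshot, and the simulation resumes from there.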
