Stochastic Models for Fault Tolerance Restart, Rejuvenation and Checkpointing

As modern society relies on the fault-free operation of complex computing systems, system fault-tolerance has become an indispensable requirement. Therefore, we need mechanisms that guarantee correct service in cases where system components fail, be they software or hardware elements. Redundancy pat...

Bibliographic Details
Main Author: Wolter, Katinka
Format: eBook
Language: English
Published: Berlin, Heidelberg: Springer Berlin Heidelberg, 2010
Edition: 1st ed. 2010
Collection: Springer eBooks 2005- - Collection details see MPG.ReNa
LEADER 03151nmm a2200361 u 4500
001 EB000383842
003 EBX01000000000000000236894
005 00000000000000.0
007 cr|||||||||||||||||||||
008 130626 ||| eng
020 |a 9783642112577 
100 1 |a Wolter, Katinka 
245 0 0 |a Stochastic Models for Fault Tolerance  |h Elektronische Ressource  |b Restart, Rejuvenation and Checkpointing  |c by Katinka Wolter 
250 |a 1st ed. 2010 
260 |a Berlin, Heidelberg  |b Springer Berlin Heidelberg  |c 2010, 2010 
300 |a XVI, 269 p  |b online resource 
505 0 |a Basic Concepts and Problems -- Task Completion Time -- Restart -- Applicability Analysis of Restart -- Moments of Completion Time Under Restart -- Meeting Deadlines Through Restart -- Software Rejuvenation -- Practical Aspects of Preventive Maintenance and Software Rejuvenation -- Stochastic Models for Preventive Maintenance and Software Rejuvenation -- Checkpointing -- Checkpointing Systems -- Stochastic Models for Checkpointing -- Summary, Conclusion and Outlook 
653 |a Mathematical statistics 
653 |a Electronic digital computers / Evaluation 
653 |a Computer science 
653 |a Mathematics of Computing 
653 |a System Performance and Evaluation 
653 |a Computer science / Mathematics 
653 |a Probability and Statistics in Computer Science 
653 |a Computer simulation 
653 |a Computer Modelling 
653 |a Theory of Computation 
041 0 7 |a eng  |2 ISO 639-2 
989 |b Springer  |a Springer eBooks 2005- 
028 5 0 |a 10.1007/978-3-642-11257-7 
856 4 0 |u https://doi.org/10.1007/978-3-642-11257-7?nosfx=y  |x Verlag  |3 Volltext 
082 0 |a 004.0151 
520 |a As modern society relies on the fault-free operation of complex computing systems, system fault-tolerance has become an indispensable requirement. Therefore, we need mechanisms that guarantee correct service in cases where system components fail, be they software or hardware elements. Redundancy patterns are commonly used, for either redundancy in space or redundancy in time. Wolter’s book details methods of redundancy in time that need to be issued at the right moment. In particular, she addresses the so-called "timeout selection problem", i.e., the question of choosing the right time for different fault-tolerance mechanisms like restart, rejuvenation and checkpointing. Restart indicates the pure system restart, rejuvenation denotes the restart of the operating environment of a task, and checkpointing includes saving the system state periodically and reinitializing the system at the most recent checkpoint upon failure of the system. Her presentation includes a brief introduction to the methods, their detailed stochastic description, and also aspects of their efficient implementation in real-world systems. The book is targeted at researchers and graduate students in system dependability, stochastic modeling and software reliability. Readers will find here an up-to-date overview of the key theoretical results, making this the only comprehensive text on stochastic models for restart-related problems.