全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

Fault Tolerance In Grid Computing: State of the Art and Open Issues

Keywords: Grid Computing , Fault Tolerance , Workflow Grid

Full-Text   Cite this paper   Add to My Lib

Abstract:

Fault tolerance is an important property for large scale computational grid systems, wheregeographically distributed nodes co-operate to execute a task. In order to achieve high level of reliabilityand availability, the grid infrastructure should be a foolproof fault tolerant. Since the failure of resourcesaffects job execution fatally, fault tolerance service is essential to satisfy QOS requirement in gridcomputing. Commonly utilized techniques for providing fault tolerance are job checkpointing andreplication. Both techniques mitigate the amount of work lost due to changing system availability but canintroduce significant runtime overhead. The latter largely depends on the length of checkpointing intervaland the chosen number of replicas, respectively. In case of complex scientific workflows where tasks canexecute in well defined order reliability is another biggest challenge because of the unreliable nature ofthe grid resources.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133