%0 Journal Article
%T JIACKPT: A Recoverable Software Distributed Shared Memory System
%A ZHANG Long-Bing (章隆兵)
%A ZHANG Fu-Xin (张福新)
%A HU Wei-Wu (胡伟武)
%A TANG Zhi-Min (唐志敏)
%J Journal of Software (软件学报)
%D 2005
%X Software distributed shared memory (DSM) systems provide a virtual shared memory abstraction on clusters, combining the programmability of shared memory with the scalability of clusters, and are therefore widely studied. Because a software DSM system is a distributed system, it is prone to failure, and some form of fault tolerance is needed to make it more practical. JIACKPT (JIAjia with ChecKPoinTing), a recoverable and portable software DSM system, has been designed and implemented to tolerate system faults. JIACKPT is based on JIAJIA and adopts checkpointing. By maintaining a strict globally consistent state and applying several optimization techniques, JIACKPT achieves high performance. Experimental results on an 8-node PC cluster show that the checkpoint overhead is less than 10% of the total execution time when a checkpoint is taken once per minute. JIACKPT is also portable and runs on several operating systems, such as Linux and Solaris. JIACKPT is a practical recoverable software DSM system.
%K software distributed shared memory system
%K checkpoint
%K global consistent state
%K JIAJIA
%K software DSM system
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=7AB0E9FE16FBC9CB&yid=2DD7160C83D0ACED&vid=7801E6FC5AE9020C&iid=0B39A22176CE99FB&sid=31611641D4BB139F&eid=DABEF202280E7EF1&journal_id=1000-9825&journal_name=软件学报&referenced_num=0&reference_num=16