%0 Journal Article
%T Adaptive Scalable RPC Timeout Mechanism for Large Scale Clusters
大规模集群中一种自适应可扩展的RPC超时机制
%A QIAN Ying-Jin
%A XIAO Nong
%A JIN Shi-Yao
%A 钱迎进
%A 肖侬
%A 金士尧
%J 软件学报
%D 2010
%X Systems based on RPC (remote procedure call) usually detect failures with timeouts, which are typically reported on a per-call basis. Stress testing on a very large cluster showed that the traditional fixed timeout mechanism causes many unnecessary timeouts, especially when the server is under load. This paper proposes an Adaptive Scalable RPC Timeout (AST) mechanism that takes network conditions, server load, scalability, and performance into account. Under AST, the timeout value set by a client is adjusted dynamically according to congestion in the network and on the server. In addition, the server can notify the client to modify the timeout value of an RPC. A series of simulations shows that AST is a more suitable failure detection mechanism for timeout-based RPC models, and that it improves system responsiveness, reliability, and stability without degrading performance, even on large-scale clusters.
%K RPC (remote procedure call)
%K failure detection
%K timeout
%K large scale
%K scalability
%K responsiveness
%K reliability
%K 远程过程调用
%K 失效检测
%K 超时
%K 大规模
%K 扩展性
%K 响应性
%K 可靠性
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=03464258023B9B462DA6C414BC06D0B3&yid=140ECF96957D60B2&vid=659D3B06EBF534A7&iid=59906B3B2830C2C5&sid=F81D6EA4425F431D&eid=4A6E57A9C5ECD9DA&journal_id=1000-9825&journal_name=软件学报&referenced_num=0&reference_num=16