This paper presents a parametric fault detection algorithm that can discriminate the persistence (permanent, intermittent, or transient) of faults in wireless sensor networks. The main characteristic of these faults is the amount of time for which the fault appears. We adopt this state-holding time to discriminate transient from intermittent faults. A neighbor-coordination-based approach is adopted, in which faulty sensor nodes are detected based on comparisons between neighboring nodes and dissemination of the decision made at each node. Simulation results demonstrate the robustness of the approach at varying transient fault rates.
1. Introduction
Node failures and environmental hazards cause frequent topology changes, communication failures, and network partitions. Such perturbations are far more common in wireless sensor networks (WSNs) than in traditional wireless networks. The extent of such perturbations depends on the persistence of faults. Based on persistence, faults can be classified as transient, intermittent, or permanent. A transient fault will eventually disappear without any apparent intervention, whereas a permanent one will remain unless it is removed by some external agency [1]. After their first appearance, intermittent faults reappear at a relatively high rate, and intermittently faulty nodes eventually tend to become permanently faulty [2, 3]. Permanent or hard faults are software or hardware faults that always produce errors when they are fully exercised [4]. In fact, experimental studies have shown that more than of the faults that occur in real systems are transient or intermittent faults [3, 5, 6]. These faults are more severe from both the data aggregation and the network lifetime perspectives, and they are much more problematic to diagnose and handle. In contrast, permanent faults are considerably easier to diagnose and handle.
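The persistence classes described above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in this paper, which relies on a state-holding time: as a simpler stand-in, the sketch classifies a node by how often it appears faulty over repeated test rounds. The function names, the majority-vote comparison rule, the tolerance `tol`, and the threshold `intermittent_threshold` are all assumptions made for illustration.

```python
from typing import Sequence


def disagrees_with_neighbors(own: float, neighbors: Sequence[float], tol: float) -> bool:
    """Return True if the node's reading differs by more than `tol`
    from the readings of a majority of its neighbors.

    Hypothetical comparison rule: the tolerance and the majority vote
    are illustrative choices, not taken from the paper.
    """
    if not neighbors:
        return False  # no neighbors, so no evidence of a fault
    mismatches = sum(1 for reading in neighbors if abs(own - reading) > tol)
    return mismatches > len(neighbors) / 2


def classify_persistence(outcomes: Sequence[bool],
                         intermittent_threshold: float = 0.3) -> str:
    """Classify a node from per-round test outcomes (True = looked faulty).

    Assumed rule: a node that fails every round is treated as permanent,
    one that fails at or above `intermittent_threshold` of the rounds as
    intermittent, and one that fails only rarely as transient.
    """
    if not outcomes:
        raise ValueError("need at least one test round")
    rate = sum(outcomes) / len(outcomes)  # fraction of rounds that failed
    if rate == 0:
        return "fault-free"
    if rate == 1:
        return "permanent"
    if rate >= intermittent_threshold:
        return "intermittent"
    return "transient"
```

In use, a node would call `disagrees_with_neighbors` once per test round and pass the collected outcomes to `classify_persistence`. A single isolated failure in ten rounds is then reported as transient and does not by itself lead to isolating the node.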
Because the effect of such faults is not always present, detecting intermittent or transient faults requires repeated testing at discrete times, in contrast to the single test that suffices to detect permanent faults. Discriminating transient from intermittent or permanent faults is crucial, because a transient fault does not necessarily imply that the sensor node should be isolated, although an unstable environment might warrant a temporary shutdown [4]. Discriminating between transient and intermittent or permanent faults addresses the following key problems.
Effective Bandwidth Utilization. Isolating permanently faulty nodes restricts the traffic that they generate.
Effective Energy Utilization. The depletion
References
[1]
B. Selic, “Fault tolerance techniques for distributed systems,” July 2004, http://www.ibm.com/developerworks/rational/library/114.html.
[2]
A. Bondavalli, S. Chiaradonna, F. Di Giandomenico, and F. Grandoni, “Threshold-based mechanisms to discriminate transient from intermittent faults,” IEEE Transactions on Computers, vol. 49, no. 3, pp. 230–245, 2000.
[3]
D. P. Siewiorek and R. S. Swarz, Reliable Computer System Design and Evaluation, Digital Press, 1992.
[4]
M. Barborak, A. Dahbura, and M. Malek, “Consensus problems in fault-tolerant computing,” ACM Computing Surveys, vol. 25, no. 2, pp. 171–220, 1993.
[5]
R. Horst, D. Jewett, and D. Lenoski, “The risk of data corruption in microprocessor-based systems,” in Proceedings of the 23rd International Symposium on Fault-Tolerant Computing, pp. 576–585, June 1993.
[6]
A. Avižienis, J. C. Laprie, B. Randell, and C. Landwehr, “Basic concepts and taxonomy of dependable and secure computing,” IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 1, pp. 11–33, 2004.
[7]
M. Serafini, A. Bondavalli, and N. Suri, “Online diagnosis and recovery: on the choice and impact of tuning parameters,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 4, pp. 295–312, 2007.
[8]
M. Malek, “A comparison connection assignment for diagnosis of multiprocessor systems,” in Proceedings of the 7th Annual Symposium on Computer Architecture (ISCA '80), pp. 31–36, ACM, 1980.
[9]
H. W. Brown and D. M. Blough, “The broadcast comparison model for on-line fault diagnosis in multicomputer systems: theory and implementation,” IEEE Transactions on Computers, vol. 48, no. 5, pp. 470–493, 1999.
[10]
G. M. Megson, X. Yang, and D. J. Evans, “A comparison-based diagnosis algorithm tailored for crossed cube multiprocessor systems,” Microprocessors and Microsystems, vol. 29, no. 4, pp. 169–175, 2005.
[11]
Y. Y. Tang and X. Yang, “Efficient fault identification of diagnosable systems under the comparison model,” IEEE Transactions on Computers, vol. 56, no. 12, pp. 1612–1618, 2007.
[12]
Y. S. Chen and S. Y. Hsieh, “Strongly diagnosable product networks under the comparison diagnosis model,” IEEE Transactions on Computers, vol. 57, no. 6, pp. 721–732, 2008.
[13]
G. Y. Chang, “(t, k)-diagnosability for regular networks,” IEEE Transactions on Computers, vol. 59, no. 9, pp. 1153–1157, 2010.
[14]
M. H. Lee and Y. H. Choi, “Fault detection of wireless sensor networks,” Computer Communications, vol. 31, no. 14, pp. 3469–3475, 2008.
[15]
S. Chessa and P. Santi, “Crash faults identification in wireless sensor networks,” Computer Communications, vol. 25, no. 14, pp. 1273–1282, 2002.
[16]
J. Chen, S. Kher, and A. Somani, “Distributed fault detection of wireless sensor networks,” in Proceedings of the 2006 Workshop on Dependability Issues in Wireless Ad Hoc Networks and Sensor Networks (DIWANS '06), pp. 65–71, ACM, September 2006.
[17]
P. Jiang, “A new method for node fault detection in wireless sensor networks,” Sensors, vol. 9, no. 2, pp. 1282–1294, 2009.
[18]
C. Hsin and M. Liu, “Self-monitoring of wireless sensor networks,” Computer Communications, vol. 29, no. 4, pp. 462–476, 2006.
[19]
X. Miao, K. Liu, Y. He, Y. Liu, and D. Papadias, “Agnostic diagnosis: discovering silent failures in wireless sensor networks,” in Proceedings of IEEE (INFOCOM '11), pp. 1548–1556, April 2011.
[20]
S. Guo, Z. Zhong, and T. He, “Find: faulty node detection for wireless sensor networks,” in Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems (SenSys '09), pp. 253–266, ACM, New York, NY, USA, November 2009.
[21]
J. L. Gao, Y. J. Xu, and X. W. Li, “Weighted-median based distributed fault detection for wireless sensor networks,” Journal of Software, vol. 18, no. 5, pp. 1208–1217, 2007.
[22]
B. Krishnamachari and S. Iyengar, “Distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks,” IEEE Transactions on Computers, vol. 53, no. 3, pp. 241–250, 2004.
[23]
M. Elhadef, A. Boukerche, and H. Elkadiki, “A distributed fault identification protocol for wireless and mobile ad hoc networks,” Journal of Parallel and Distributed Computing, vol. 68, no. 3, pp. 321–335, 2008.
[24]
J. Y. Choi, S. J. Yim, Y. J. Huh, and Y. H. Choi, “A distributed adaptive scheme for detecting faults in wireless sensor networks,” WSEAS Transactions on Communications, vol. 8, no. 2, pp. 269–278, 2009.
[25]
A. Weber, A. Kutzke, and S. Chessa, “Energy-aware test connection assignment for the self-diagnosis of a wireless sensor network,” Journal of the Brazilian Computer Society, vol. 18, no. 1, pp. 19–27, 2012.
[26]
W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, “An application-specific protocol architecture for wireless microsensor networks,” IEEE Transactions on Wireless Communications, vol. 1, no. 4, pp. 660–670, 2002.
[27]
E. N. Gilbert, “Capacity of a burst-noise channel,” Bell System Technical Journal, vol. 39, pp. 1253–1265, 1960.
[28]
E. O. Elliott, “Estimates of error rates for codes on burst error channels,” Bell System Technical Journal, vol. 42, pp. 1977–1997, 1963.
[29]
M. C. Vuran, Ö. B. Akan, and I. F. Akyildiz, “Spatio-temporal correlation: theory and applications for wireless sensor networks,” Computer Networks, vol. 45, no. 3, pp. 245–259, 2004.
[30]
A. Mahapatro and P. M. Khilar, “Detection of node failure in wireless image sensor networks,” ISRN Sensor Networks, vol. 2012, pp. 1–8, 2012.
[31]
A. Boulis, Castalia: A Simulator For Wireless Sensor Networks and Body Area Networks, National ICT Australia, 2009.
[32]
A. Varga and R. Hornig, “An overview of the OMNeT++ simulation environment,” in Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications, Networks and Systems & Workshops, pp. 1–10, 2008.