Model learning is the process of extracting, analysing and synthesising information from data sets. Graphical models are a suitable framework for probabilistic modelling. A Bayesian Network (BN) is a probabilistic graphical model that represents joint distributions in an intuitive and efficient way. It encodes the probability density (or mass) function of a set of variables by specifying a number of conditional independence statements in the form of a directed acyclic graph. Specifying the structure of the model is one of the most important design choices in graphical modelling. Notwithstanding their potential, there are several limitations to learning BNs from small data sets. In this paper, we introduce a set of practical guidelines a modeller can use to deal with these limitations. The main goal of the guidelines is to increase awareness of the underlying assumptions and the tacit implications of several learning techniques. Unsurprisingly, one of the authors' findings is that learning BNs from small data sets is a complex and challenging task, yet potentially very rewarding. The paper also draws attention to the amount of subjective input needed from the modeller and the necessity of tailoring solutions to the particularities of the application.
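As a minimal illustration of the encoding described above (not taken from the paper; the network structure and all probability values below are hypothetical), consider a three-variable chain A → B → C. The DAG asserts that C is conditionally independent of A given B, so the joint mass function factorises as P(a, b, c) = P(a) · P(b | a) · P(c | b):

```python
from itertools import product

# Hypothetical CPTs for binary variables in the chain A -> B -> C.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_c_given_b[b][c]

def joint(a, b, c):
    # The DAG encodes the factorisation P(a, b, c) = P(a) * P(b | a) * P(c | b).
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The eight joint probabilities sum to one because each CPT row sums to one.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

# The encoded conditional independence: P(C = 1 | A = 0, B = 1) equals
# P(C = 1 | B = 1) = 0.75, regardless of the value of A.
p_c1_given_a0_b1 = joint(0, 1, 1) / (joint(0, 1, 0) + joint(0, 1, 1))
```

The example shows why the representation is efficient: the chain needs only 1 + 2 + 2 = 5 free parameters instead of the 7 required by an unconstrained joint distribution over three binary variables.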
Cite this paper
Bookholt, F.D., Stuurman, P. and Hanea, A.M. (2014) Practical Guidelines for Learning Bayesian Networks from Small Data Sets. Open Access Library Journal, 1, e481. http://dx.doi.org/10.4236/oalib.1100481