We present a computationally tractable approach for dynamically measuring statistical dependencies in multivariate non-Gaussian signals. The approach uses extensions of independent component analysis to calculate information coupling, a proxy measure for mutual information, between multiple signals, and it allows the uncertainty associated with the information coupling measure to be estimated in a straightforward way. We empirically validate the relative accuracy of the information coupling measure using a set of synthetic data examples and showcase its practical utility for analysing multivariate financial time series.

1. Introduction

The task of accurately inferring the statistical dependency structure (association) in multivariate systems has been an area of active research for many years, with a wide range of practical applications [1]. Many of these applications require real-time sequential analysis of dependencies in multivariate data streams with dynamically changing properties. However, most existing measures of dependence have serious limitations, either in the types of data sets they are suitable for or in their computational complexity. If the data being analysed is generated by a known stable process, with known marginal and multivariate distributions, the degree of dependence can be estimated relatively easily. However, most real-world data sets have dynamically changing properties to which a single distribution cannot be assigned. Multivariate data generated in global financial markets is one example of such a complex data set. Financial data exhibits rapidly changing dynamics and is non-Gaussian in nature; this is especially true for financial data recorded at high frequencies [2]. In fact, as the scale over which financial returns are calculated decreases, their distribution becomes increasingly non-Gaussian; equivalently, returns aggregated over longer scales become closer to Gaussian, a feature referred to as aggregational Gaussianity.
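The aggregational Gaussianity effect described above can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes synthetic Laplace-distributed noise as a stand-in for high-frequency log-returns (which are additive across time), and the helper names `excess_kurtosis` and `aggregate` are hypothetical. Excess kurtosis is used as a simple indicator of non-Gaussianity, since it is close to zero for Gaussian data.

```python
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def aggregate(returns, scale):
    """Sum consecutive non-overlapping blocks of `scale` log-returns."""
    last_start = len(returns) - scale + 1
    return [sum(returns[i:i + scale]) for i in range(0, last_start, scale)]


rng = random.Random(0)

# Assumption: heavy-tailed Laplace noise (difference of two exponentials)
# plays the role of fine-scale log-returns; this is synthetic, not market data.
fine = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(200_000)]

# Excess kurtosis at increasingly coarse aggregation scales.
kurt_by_scale = {s: excess_kurtosis(aggregate(fine, s)) for s in (1, 5, 25)}

for scale, kurt in kurt_by_scale.items():
    print(f"scale={scale:>3}  excess kurtosis={kurt:.2f}")
```

For independent draws of this kind, the excess kurtosis of a sum of n terms falls off roughly as 1/n, so the printed values should shrink towards zero as the aggregation scale grows. Real financial returns are not independent, so the decay in practice may be slower than in this toy setting.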
The recent explosive growth in the availability and use of financial data sampled at high frequencies therefore requires computationally efficient algorithms that are suitable for dynamically analysing dependencies in non-Gaussian data streams. The most commonly used measure of statistical dependence is linear correlation. However, practical use of the linear correlation measure has three main limitations: it cannot accurately model dependencies between signals with non-Gaussian distributions [3], it is restricted to measuring linear statistical dependencies, and it is very sensitive to outliers [4]. Rank correlation is another frequently used
References
[1]
D. Berg and K. Aas, “Models for construction of multivariate dependence—a comparison study,” The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.
[2]
M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.
[3]
J. Hlinka, M. Paluš, M. Vejmelka, D. Mantini, and M. Corbetta, “Functional connectivity in resting-state fMRI: is linear correlation sufficient?” NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.
[4]
S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, “Robust estimation and outlier detection with correlation coefficients,” Biometrika, vol. 62, no. 3, pp. 531–545, 1975.
[5]
A. R. Cowan, “Nonparametric event study tests,” Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.
[6]
G. J. Glasser and R. F. Winter, “Critical values of the coefficient of rank correlation for testing the hypothesis of independence,” Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.
[7]
P. Embrechts, F. Lindskog, and A. McNeil, “Modelling dependence with copulas and applications to risk management,” in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.
[8]
E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.
[9]
J. D. Fermanian and O. Scaillet, “Some statistical pitfalls in copula modeling for financial applications,” in Capital Formation, Governance and Banking, p. 59, 2005.
[10]
T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.
[11]
A. Kraskov, H. Stögbauer, and P. Grassberger, “Estimating mutual information,” Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.
[12]
S. Khan, S. Bandyopadhyay, and A. R. Ganguly, “Computing mutual information based nonlinear dependence among noisy and finite geophysical time series,” in Proceedings of the American Geophysical Union, Fall Meeting, 2005, abstract no. NG22A-03.
[13]
M. M. V. Hulle, “Edgeworth approximation of multivariate differential entropy,” Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.
[14]
L. B. Almeida, “Linear and nonlinear ICA based on mutual information,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.
[15]
A. D. Back and A. S. Weigend, “A first application of independent component analysis to extracting structure from stock returns,” International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.
[16]
E. Oja, K. Kiviluoto, and S. Malaroiu, “Independent component analysis for financial time series,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.
[17]
A. Hyvärinen, “Survey on independent component analysis,” Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.
[18]
C. J. Lu, T. S. Lee, and C. C. Chiu, “Financial time series forecasting using independent component analysis and support vector regression,” Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.
[19]
H. Joe, “Relative entropy measures of multivariate dependence,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.
[20]
M. Novey and T. Adali, “Complex ICA by negentropy maximization,” IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.
[21]
S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.
[22]
J. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.
[23]
R. A. Choudrey and S. J. Roberts, “Variational mixture of Bayesian independent component analyzers,” Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.
[24]
R. Everson and S. Roberts, “Independent component analysis: a flexible nonlinearity and decorrelating manifold approach,” Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.
[25]
R. E. Kass, L. Tierney, and J. B. Kadane, “Laplace's method in Bayesian analysis,” in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.
[26]
W. Addison and S. Roberts, “Blind source separation with non-stationary mixing using wavelets,” in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.
[27]
N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, 1989.
[28]
K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.
[29]
K. Boudt, J. Cornelissen, and C. Croux, “The Gaussian rank correlation estimator: robustness properties,” Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.
[30]
D. Evans, “A computationally efficient estimator for mutual information,” Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.
[31]
K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I/821–I/824, IEEE, May 2002.
[32]
K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.
[33]
C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.
[34]
B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.
[35]
A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.
[36]
J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.
[37]
G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.
[38]
D. M. Guillaume, M. M. Dacorogna, R. R. Davé, U. A. Müller, R. B. Olsen, and O. V. Pictet, “From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.
[39]
R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.
[40]
Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.
[41]
A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.
[42]
S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.
[43]
M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.
[44]
R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.
[45]
A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices. Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.
[46]
C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.
[47]
R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.
[48]
N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.
[49]
A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.
[50]
J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.
[51]
J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.
[52]
M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.
[53]
L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.