Abstract:
The paper suggests a simple method of deriving minimax lower bounds to the accuracy of statistical inference on heavy tails. A well-known result by Hall and Welsh (Ann. Statist. 12 (1984) 1079-1084) states that if $\hat{\alpha}_n$ is an estimator of the tail index $\alpha_P$ and $\{z_n\}$ is a sequence of positive numbers such that $\sup_{P\in{\mathcal{D}}_r}\mathbb{P}(|\hat{\alpha}_n-\alpha_P|\ge z_n)\to0$, where ${\mathcal{D}}_r$ is a certain class of heavy-tailed distributions, then $z_n\gg n^{-r}$. The paper presents a non-asymptotic lower bound to the probabilities $\mathbb{P}(|\hat{\alpha}_n-\alpha_P|\ge z_n)$. We also establish non-uniform lower bounds to the accuracy of tail constant and extreme quantiles estimation. The results reveal that normalising sequences of robust estimators should depend in a specific way on the tail index and the tail constant.
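
For readers who want a concrete instance of a tail index estimator $\hat{\alpha}_n$, the sketch below uses the classical Hill estimator on simulated exact Pareto data. The abstract does not name an estimator, so the choice of the Hill estimator, the helper names `hill_tail_index` and `pareto_sample`, and the values of `n` and `k` are illustrative assumptions only, not the paper's method.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    Assumes all observations are positive and 0 < k < len(sample).
    """
    xs = sorted(sample, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = xs[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess


def pareto_sample(alpha, n, rng):
    """Draw n values with P(X > x) = x**(-alpha), x >= 1, by inverse transform."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is reproducible
data = pareto_sample(alpha=2.0, n=20000, rng=rng)
estimate = hill_tail_index(data, k=2000)
print(f"Hill estimate of alpha: {estimate:.3f} (true value 2.0)")
```

On exact Pareto data the estimate should be close to the true index; for the broader classes ${\mathcal{D}}_r$ considered in the abstract, the choice of `k` matters and the attainable accuracy is what the lower bounds constrain.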

Abstract:
Researchers are often interested in drawing inferences regarding the order between two experimental groups on the basis of multivariate response data. Since standard multivariate methods are designed for two-sided alternatives, they may not be ideal for testing for order between two groups. In this article we introduce the notion of the linear stochastic order and investigate its properties. Statistical theory and methodology are developed to both estimate the direction which best separates two arbitrary ordered distributions and to test for order between the two groups. The new methodology generalizes Roy's classical largest root test to the nonparametric setting and is applicable to random vectors with discrete and/or continuous components. The proposed methodology is illustrated using data obtained from a 90-day pre-chronic rodent cancer bioassay study conducted by the National Toxicology Program (NTP).

Abstract:
This article provides a fully Bayesian approach for modeling single-dose and complete pharmacokinetic data in a population pharmacokinetic (PK) model. To reduce the impact of outliers and the computational difficulty, a generalized linear model is chosen under the assumption that the errors follow a multivariate Student t distribution, which is heavy-tailed. The aim of this study is to implement the multivariate t distribution for the analysis of population pharmacokinetic data and to investigate its performance. Bayesian predictive inference and Metropolis-Hastings algorithm schemes are used to handle the intractable posterior integration. The precision and accuracy of the proposed model are illustrated using simulated data and a real example of theophylline data.

Abstract:
A complete and user-friendly directory of tails of Archimedean copulas is presented which can be used in the selection and construction of appropriate models with desired properties. The results are synthesized in the form of a decision tree: Given the values of some readily computable characteristics of the Archimedean generator, the upper and lower tails of the copula are classified into one of three classes each, one corresponding to asymptotic dependence and the other two to asymptotic independence. For a long list of single-parameter families, the relevant tail quantities are computed so that the corresponding classes in the decision tree can easily be determined. In addition, new models with tailor-made upper and lower tails can be constructed via a number of transformation methods. The frequently occurring category of asymptotic independence turns out to conceal a surprisingly rich variety of tail dependence structures.
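
As a small numerical illustration of the "asymptotic dependence" class for a lower tail, the sketch below evaluates the ratio $C(u,u)/u$ for the Clayton copula as $u \to 0$. The Clayton family is chosen here only as a familiar single-parameter Archimedean example (the abstract does not single it out), and the function names are hypothetical. For Clayton with parameter $\theta > 0$, the ratio is known to converge to $2^{-1/\theta}$.

```python
def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def lower_tail_ratio(theta, u):
    """C(u, u) / u, whose limit as u -> 0 is the lower tail dependence coefficient."""
    return clayton_copula(u, u, theta) / u


theta = 2.0
limit = 2.0 ** (-1.0 / theta)  # known closed-form limit for the Clayton family
for u in (1e-1, 1e-3, 1e-5):
    print(f"u={u:g}: C(u,u)/u = {lower_tail_ratio(theta, u):.6f}, limit = {limit:.6f}")
```

A strictly positive limit corresponds to asymptotic dependence in the lower tail; a limit of zero would place the tail in one of the asymptotic independence classes.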

Abstract:
Branching random walks on a multidimensional lattice with heavy tails and a constant branching rate are considered. It is shown that under these conditions (heavy tails and constant rate), the front propagates exponentially fast, but the particles inside the front are distributed very non-uniformly. The particles exhibit intermittent behavior in a large part of the region behind the front (i.e., the particles are concentrated only in very sparse spots there). The zone of non-intermittency (where particles are distributed relatively uniformly) extends at a power rate. This rate is found.

Abstract:
The covariance matrix is formulated in the framework of a linear multivariate ARCH process with long memory, where the natural cross product structure of the covariance is generalized by adding two linear terms with their respective parameters. The residuals of the linear ARCH process are computed using historical data and the (inverse square root of the) covariance matrix. Simple quality measures assessing the independence and unit magnitude of the residual distributions are proposed. The salient properties of the computed residuals are studied for three data sets of size 54, 55 and 330. Both new terms introduced in the covariance help in producing uncorrelated residuals, but the residual magnitudes are very different from unity. The large sizes of the inferred residuals are due to the limited information that can be extracted from the empirical data when the number of time series is large, and indicate a fundamental limitation to the inference that can be achieved.

Abstract:
The Marcinkiewicz Strong Law, $\displaystyle\lim_{n\to\infty}\frac{1}{n^{\frac1p}}\sum_{k=1}^n (D_{k}- D)=0$ a.s. with $p\in(1,2)$, is studied for outer products $D_k=X_k\overline{X}_k^T$, where $\{X_k\},\{\overline{X}_k\}$ are both two-sided (multivariate) linear processes (with coefficient matrices $(C_l), (\overline{C}_l)$ and i.i.d.\ zero-mean innovations $\{\Xi\}$, $\{\overline{\Xi}\}$). The matrix sequences $C_l$ and $\overline{C}_l$ can decay slowly enough (as $|l|\to\infty$) that $\{X_k,\overline{X}_k\}$ have long-range dependence while $\{D_k\}$ can have heavy tails. In particular, the heavy-tail and long-range-dependence phenomena for $\{D_k\}$ are handled simultaneously, and a new decoupling property is proved that shows the convergence rate is determined by the worse of the heavy tails and the long-range dependence, but not by their combination. The main result is applied to obtain Marcinkiewicz Strong Laws of Large Numbers for stochastic approximation, non-linear function forms and autocovariances.

Abstract:
Block (1975) extended the bivariate exponential distributions (BVEDs) of Freund (1961) and Proschan and Sullo (1974) to the multivariate case and called them generalized Freund-Weinman's multivariate exponential distributions (MVEDs). In this paper, we obtain MLEs of the parameters and large-sample tests for independence and symmetry of the k components in the generalized Freund-Weinman's MVEDs.

Abstract:
In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and benchmarked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs as well as or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling), and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.

Abstract:
Multivariate normal mixtures provide a flexible model for high-dimensional data. They are widely used in statistical genetics, statistical finance, and other disciplines. Due to the unboundedness of the likelihood function, classical likelihood-based methods, which may have nice practical properties, are inconsistent. In this paper, we recommend a penalized likelihood method for estimating the mixing distribution. We show that the maximum penalized likelihood estimator is strongly consistent when the number of components has a known upper bound. We also explore a convenient EM-algorithm for computing the maximum penalized likelihood estimator. Extensive simulations are conducted to explore the effectiveness and the practical limitations of both the new method and the ratified maximum likelihood estimators. Guidelines are provided based on the simulation results.
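
The unboundedness of the likelihood mentioned above can be seen in a minimal sketch. The example below is univariate for simplicity (an assumption; the abstract concerns multivariate mixtures), and the data values and function names are made up for illustration. One component is centred exactly on an observation and its standard deviation is shrunk towards zero, which drives the log-likelihood upwards without bound. The specific penalty used in the paper is not stated in the abstract and is not reproduced here.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weight, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of a two-component normal mixture."""
    return sum(
        math.log(weight * normal_pdf(x, mu1, sigma1)
                 + (1.0 - weight) * normal_pdf(x, mu2, sigma2))
        for x in data
    )


data = [-1.2, -0.4, 0.1, 0.7, 1.5]  # illustrative values only
sigmas = (1.0, 1e-2, 1e-4, 1e-6)
# Component 1 sits on the first observation; component 2 is a fixed N(0, 1).
logliks = [mixture_loglik(data, 0.5, data[0], s, 0.0, 1.0) for s in sigmas]
for s, ll in zip(sigmas, logliks):
    print(f"sigma1={s:g}: log-likelihood = {ll:.3f}")
```

Each further shrinkage of `sigma1` raises the log-likelihood, so no finite maximiser exists; a penalty on small component variances is one way to restore a well-defined estimator.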