Search Results: 1 - 10 of 7449 matches for " Variable selection "
All listed articles are free for downloading (OA Articles)
Variable Selection for Partially Linear Varying Coefficient Transformation Models with Censored Data  [PDF]
Jiang Du, Zhongzhan Zhang, Ying Lu
Open Journal of Statistics (OJS) , 2012, DOI: 10.4236/ojs.2012.25072
Abstract: In this paper, we study the problem of variable selection for varying coefficient transformation models with censored data. We fit the varying coefficient transformation models by maximizing the marginal likelihood subject to a shrinkage-type penalty, which encourages sparse solutions and hence facilitates the process of variable selection. We further provide an efficient computation algorithm to implement the proposed methods. A simulation study is conducted to evaluate the performance of the proposed methods and a real dataset is analyzed as an illustration.
Estudo sobre métodos de seleção de variáveis em DEA
Senra, Luis Felipe Aragão de Castro; Nanci, Luiz Cesar; Mello, João Carlos Correia Baptista Soares de; Meza, Lidia Angulo
Pesquisa Operacional , 2007, DOI: 10.1590/S0101-74382007000200001
Abstract: One of the main issues in DEA modeling is the choice of variables. This choice may have conflicting objectives, such as increasing the mean efficiency or maximizing the model's ranking capability, a classic DEA weakness. In this paper, we compare four variable selection methods, focused on the ranking of DMUs. These methods are applied to a real case: assessing the efficiency of third-party logistics providers in newspaper home delivery in Rio de Janeiro.
Cross-Validation, Shrinkage and Variable Selection in Linear Regression Revisited  [PDF]
Hans C. van Houwelingen, Willi Sauerbrei
Open Journal of Statistics (OJS) , 2013, DOI: 10.4236/ojs.2013.32011
Abstract: In deriving a regression model, analysts often have to use variable selection, despite the problems introduced by data-dependent model building. Resampling approaches are proposed to handle some of the critical issues. In order to assess and compare several strategies, we will conduct a simulation study with 15 predictors and a complex correlation structure in the linear regression model. Using sample sizes of 100 and 400 and estimates of the residual variance corresponding to R² of 0.50 and 0.71, we consider 4 scenarios with varying amounts of information. We also consider two examples with 24 and 13 predictors, respectively. We will discuss the value of cross-validation, shrinkage and backward elimination (BE) with varying significance level. We will assess whether 2-step approaches using global or parameterwise shrinkage (PWSF) can improve selected models and will compare results to models derived with the LASSO procedure. Besides MSE, we will use model sparsity and further criteria for model assessment. The amount of information in the data has an influence on the selected models and the comparison of the procedures. None of the approaches was best in all scenarios. The performance of backward elimination with a suitably chosen significance level was no worse than that of the LASSO, and the models selected by BE were much sparser, an important advantage for interpretation and transportability. Compared to global shrinkage, PWSF had better performance. Provided that the amount of information is not too small, we conclude that BE followed by PWSF is a suitable approach when variable selection is a key part of data analysis.
Efficiency of Selecting Important Variable for Longitudinal Data  [PDF]
Jongmin Ra, Ki-Jong Rhee
Psychology (PSYCH) , 2014, DOI: 10.4236/psych.2014.51002
Abstract: Variable selection with a large number of predictors is a challenging and important problem in educational and social domains. However, relatively little attention has been paid to issues of variable selection in longitudinal data with application to education. Using longitudinal educational data (Test of English for International Communication, TOEIC), this study compares multiple regression, backward elimination, the group least absolute shrinkage and selection operator (LASSO), and linear mixed models in terms of their performance in variable selection. The results show that the four statistical methods select different sets of predictors. The linear mixed model (LMM) yields the smallest number of predictors (4 of a total of 19). In addition, LMM is the only method appropriate for repeated measurements and is the best method with respect to the principle of parsimony. This study also provides an interpretation of the model selected by LMM in the conclusion, using marginal R².
Automatic Variable Selection for Single-Index Random Effects Models with Longitudinal Data  [PDF]
Suigen Yang, Liugen Xue
Open Journal of Statistics (OJS) , 2014, DOI: 10.4236/ojs.2014.43022
Abstract: We consider the problem of variable selection for the single-index random effects models with longitudinal data. An automatic variable selection procedure is developed using smooth-threshold. The proposed method shares some of the desired features of existing variable selection methods: the resulting estimator enjoys the oracle property; the proposed procedure avoids the convex optimization problem and is flexible and easy to implement. Moreover, we use the penalized weighted deviance criterion for a data-driven choice of the tuning parameters. Simulation studies are carried out to assess the performance of our method, and a real dataset is analyzed for further illustration.
Model Detection for Additive Models with Longitudinal Data  [PDF]
Jian Wu, Liugen Xue
Open Journal of Statistics (OJS) , 2014, DOI: 10.4236/ojs.2014.410082
Abstract: In this paper, we consider the problem of variable selection and model detection in additive models with longitudinal data. Our approach is based on spline approximation for the components aided by two Smoothly Clipped Absolute Deviation (SCAD) penalty terms. It can perform model selection (finding both zero and linear components) and estimation simultaneously. With appropriate selection of the tuning parameters, we show that the proposed procedure is consistent in both variable selection and linear components selection. Besides being theoretically justified, the proposed method is easy to understand and straightforward to implement. Extensive simulation studies as well as a real dataset are used to illustrate its performance.
Clustering of the Values of a Response Variable and Simultaneous Covariate Selection Using a Stepwise Algorithm  [PDF]
Olivier Collignon, Jean-Marie Monnez
Applied Mathematics (AM) , 2016, DOI: 10.4236/am.2016.715141
Abstract: In supervised learning, the number of values of a response variable can be very high. Grouping these values into a few clusters can be useful for performing accurate supervised classification analyses. On the other hand, selecting relevant covariates is a crucial step in building robust and efficient prediction models. We propose in this paper an algorithm that simultaneously groups the values of a response variable into a limited number of clusters and selects stepwise the best covariates that discriminate this clustering. These objectives are achieved by alternate optimization of a user-defined model selection criterion. This process extends a former version of the algorithm to a more general framework. Moreover, possible further developments are discussed in detail.
Group Variable Selection via a Combination of Lq Norm and Correlation-Based Penalty  [PDF]
Ning Mao, Wanzhou Ye
Advances in Pure Mathematics (APM) , 2017, DOI: 10.4236/apm.2017.71005
Abstract: Considering the problem of feature selection in the linear regression model, a new method called LqCP is proposed to simultaneously select variables and favor a grouping effect, where strongly correlated predictors tend to be in or out of the model together. LqCP is based on penalized least squares with a penalty function that combines the Lq (0n. In addition, a simulation on grouped variable selection is performed. Finally, the model is applied to two real datasets: US Crime Data and Gasoline Data. In terms of prediction error and estimation error, empirical studies show the efficiency of LqCP.
Sparse Additive Gaussian Process with Soft Interactions  [PDF]
Garret Vo, Debdeep Pati
Open Journal of Statistics (OJS) , 2017, DOI: 10.4236/ojs.2017.74039
Abstract: This paper presents a novel variable selection method in additive nonparametric regression model. This work is motivated by the need to select the number of nonparametric components and number of variables within each nonparametric component. The proposed method uses a combination of hard and soft shrinkages to separately control the number of additive components and the variables within each component. An efficient algorithm is developed to select the importance of variables and estimate the interaction network. Excellent performance is obtained in simulated and real data examples.
Identification of significant genes in genomics using Bayesian variable selection methods
Eugene Lin, Lung-Cheng Huang
Advances and Applications in Bioinformatics and Chemistry , 2008, DOI: http://dx.doi.org/10.2147/AABC.S3624
Abstract: In genomics studies, it is essential to select a small number of genes that are more significant than the others for research ranging from candidate gene studies to genome-wide association studies. In this study, we proposed a Bayesian method for identifying the promising candidate genes that are significantly more influential than the others. We employed the framework of variable selection and a Gibbs sampling based technique to identify significant genes. The proposed approach was applied to a genomics study for persons with chronic fatigue syndrome. Our studies show that the proposed Bayesian methodology is effective for deriving models for genomic studies and for providing information on significant genes.

Copyright © 2008-2017 Open Access Library. All rights reserved.