Smoothed Linear Modeling for Smooth Spectral Data

DOI: 10.1155/2013/604548


Abstract:

Classification and prediction problems using spectral data lead to high-dimensional data sets. Spectral data are, however, different from most other high-dimensional data sets in that information usually varies smoothly with wavelength, suggesting that fitted models should also vary smoothly with wavelength. Functional data analysis, widely used in the analysis of spectral data, meets this objective by changing perspective from the raw spectra to approximations using smooth basis functions. This paper explores linear regression and linear discriminant analysis fitted directly to the spectral data, imposing penalties on the values and roughness of the fitted coefficients, and shows by example that this can lead to better fits than existing standard methodologies.

1. Introduction

There are a number of settings in which one wishes to predict some dependent variable from measurements of an optical spectrum. For example, Brown et al. [1] were concerned with the regression problem of predicting the composition of dough on the basis of measurements of the near infrared spectrum emitted from dough samples. Schomacker et al. [2] discussed the classification problem of deciding whether a colon polyp was benign or malignant on the basis of the optical spectrum emitted after illuminating the polyp with a laser. Both these applications involve electromagnetic spectra, but the scope of spectral data modeling is much broader: auditory spectra and chemical chromatography data also fit the same framework as the general problem considered here.

Many fields of science now deal with high-dimensional data sets: “large n” data sets have many cases, and “large p” data sets have many measurements per case. “Small n large p” problems are particularly challenging and have been the subject of much recent research.
Spectral data are generally in the “large p” class and may also have “small n.” They differ, however, from “small n large p” problems in many other areas, such as microarrays, in that the regression model is often expected to be smooth: if the signal measured at 450 nm is predictive, then one would expect the signals measured at 449 nm and at 451 nm to be about equally predictive. This is the setting considered in this paper, where it is assumed that subject-matter knowledge motivates models in which the information varies smoothly with wavelength. This feature sets such spectral data apart from the typical statistical high-dimensional data set and leads to considering methods that fit models in which the coefficients are smooth functions of the wavelengths to which
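The idea stated in the abstract, penalizing both the values and the roughness of the fitted coefficients, can be illustrated with a small sketch. The excerpt does not give the paper's exact penalty, so the form used here is an assumption: a ridge penalty on the coefficients plus a squared second-difference penalty between neighbouring wavelengths, which has a closed-form least-squares solution. The function names, the penalty weights `lam_value` and `lam_rough`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def second_difference_matrix(p):
    """(p-2) x p matrix D with rows [1, -2, 1]; D @ b is the discrete
    second derivative of the coefficient vector b."""
    D = np.zeros((p - 2, p))
    for i in range(p - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def smoothed_ridge(X, y, lam_value, lam_rough):
    """Minimize ||y - X b||^2 + lam_value * ||b||^2 + lam_rough * ||D b||^2.

    Assumed penalty form (not taken from the paper): the first penalty
    shrinks the coefficient values, the second penalizes their roughness.
    """
    p = X.shape[1]
    D = second_difference_matrix(p)
    A = X.T @ X + lam_value * np.eye(p) + lam_rough * (D.T @ D)
    return np.linalg.solve(A, X.T @ y)


def roughness(b):
    """Sum of squared second differences of b."""
    return float(np.sum(np.diff(b, n=2) ** 2))


# Synthetic "small n large p" data: 30 cases, 100 wavelengths.
# The predictors are i.i.d. noise, not realistic spectra; only the true
# coefficient curve is smooth in wavelength, peaking near 450 nm.
rng = np.random.default_rng(0)
n, p = 30, 100
wavelengths = np.linspace(400.0, 500.0, p)
true_coef = np.exp(-0.5 * ((wavelengths - 450.0) / 10.0) ** 2)
X = rng.normal(size=(n, p))
y = X @ true_coef + 0.1 * rng.normal(size=n)

coef_ridge = smoothed_ridge(X, y, lam_value=1.0, lam_rough=0.0)
coef_smooth = smoothed_ridge(X, y, lam_value=1.0, lam_rough=100.0)

print("roughness, ridge only:       ", roughness(coef_ridge))
print("roughness, ridge + roughness:", roughness(coef_smooth))
```

With the value penalty held fixed, adding the roughness term cannot increase the roughness of the solution, so the second fit varies more smoothly across neighbouring wavelengths. In practice both weights would be chosen by a procedure such as cross-validation; the values above are arbitrary.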

References

[1]  P. J. Brown, T. Fearn, and M. Vannucci, “Bayesian wavelet regression on curves with application to a spectroscopic calibration problem,” The Journal of the American Statistical Association, vol. 96, no. 454, pp. 398–408, 2001.
[2]  K. T. Schomacker, J. K. Frisoli, C. C. Compton et al., “Ultraviolet laser-induced fluorescence of colonic tissue: basic biology and diagnostic potential,” Lasers in Surgery and Medicine, vol. 12, no. 1, pp. 63–78, 1992.
[3]  W. Saeys, B. de Ketelaere, and P. Darius, “Potential applications of functional data analysis in chemometrics,” Journal of Chemometrics, vol. 22, no. 5, pp. 335–344, 2008.
[4]  Y. Fan and G. James, “Functional additive regression,” 2012, http://www-bcf.usc.edu/~gareth/.
[5]  R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society B, vol. 58, pp. 267–288, 1996.
[6]  A. E. Hoerl and R. W. Kennard, “Ridge regression: applications to nonorthogonal problems,” Technometrics, vol. 12, pp. 27–51, 1970.
[7]  R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, “Sparsity and smoothness via the fused lasso,” Journal of the Royal Statistical Society B, vol. 67, no. 1, pp. 91–108, 2005.
[8]  J. O. Ramsay and B. W. Silverman, Applied Functional Data Analysis: Methods and Case Studies, Springer, New York, NY, USA, 2002.
[9]  J. O. Ramsay and B. W. Silverman, Functional Data Analysis, Springer, New York, NY, USA, 2nd edition, 2005.
[10]  H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society B, vol. 67, no. 2, pp. 301–320, 2005.
[11]  D. M. Witten and R. Tibshirani, “Covariance-regularized regression and classification for high dimensional problems,” Journal of the Royal Statistical Society B, vol. 71, no. 3, pp. 615–636, 2009.
[12]  L. S. Chen, D. Paul, R. L. Prentice, and P. Wang, “A regularized Hotelling's T² test for pathway analysis in proteomic studies,” The Journal of the American Statistical Association, vol. 106, no. 496, pp. 1345–1360, 2011.
[13]  S. Dudoit, J. Fridlyand, and T. P. Speed, “Comparison of discrimination methods for the classification of tumors using gene expression data,” The Journal of the American Statistical Association, vol. 97, no. 457, pp. 77–87, 2002.
[14]  P. J. Bickel and E. Levina, “Some theory for Fisher's linear discriminant function, ‘naive Bayes’, and some alternatives when there are many more variables than observations,” Bernoulli, vol. 10, no. 6, pp. 989–1010, 2004.
