%0 Journal Article
%T Effects of Subsetting by Carbon Content, Soil Order, and Spectral Classification on Prediction of Soil Total Carbon with Diffuse Reflectance Spectroscopy
%A Meryl L. McDowell
%A Gregory L. Bruland
%A Jonathan L. Deenik
%A Sabine Grunwald
%J Applied and Environmental Soil Science
%D 2012
%I Hindawi Publishing Corporation
%R 10.1155/2012/294121
%X Subsetting of samples is a promising avenue of research for the continued improvement of prediction models for soil properties with diffuse reflectance spectroscopy. This study examined the effects of subsetting by soil total carbon content, soil order, and spectral classification with k-means cluster analysis on visible/near-infrared and mid-infrared partial least squares models for total carbon prediction. Our sample set was composed of various Hawaiian soils from primarily agricultural lands with total carbon contents from <1% to 56%. Slight improvements in the coefficient of determination (R²) and other standard model quality parameters were observed in the models for the subset of the high activity clay soil orders compared to the models of the full sample set. The other subset models explored did not exhibit improvement across all parameters. Models created from subsets consisting of only low total carbon samples (e.g., total carbon < 10%) showed improvement in the root mean squared error (RMSE) and percent error of prediction for low total carbon soil samples. These results provide a basis for future study of practical subsetting strategies for soil total carbon prediction.
1. Introduction
Diffuse reflectance spectroscopy (DRS) and chemometric analysis have become popular subjects of research for their potential to predict soil carbon and other soil properties. This methodology could be beneficial for monitoring soil quality and temporal variation, as well as helping to facilitate digital soil mapping efforts.
Both visible/near-infrared (VNIR) and mid-infrared (MIR) spectra show promise for the prediction of soil total carbon and organic carbon, as well as organic matter, total N, total P, sand, silt, and clay fractions, cation exchange capacity, and pH (e.g., [1–8]). Particular attention has been given to soil carbon, which is an important indicator of soil fertility and biological activity and is crucial to carbon sequestration endeavors [9–12]. Partial least squares regression (PLSR) appears to be the most widely used chemometric method for developing prediction models from soil diffuse reflectance spectra. A sample set is commonly divided into two groups with the larger used for calibration and the smaller for validation to approximate true independent model validation, but no clear or consistent guidelines have been adopted for this process. Model results are known to vary with different groupings of samples for the calibration and validation sets. To address this issue, some studies have created multiple models, each with different random divisions of the sample set into calibration and validation sets, to
%U http://www.hindawi.com/journals/aess/2012/294121/