%0 Journal Article
%T msmsEval: tandem mass spectral quality assignment for high-throughput proteomics
%A Jason WH Wong
%A Matthew J Sullivan
%A Hugh M Cartwright
%A Gerard Cagney
%J BMC Bioinformatics
%D 2007
%I BioMed Central
%R 10.1186/1471-2105-8-51
%X We describe an application, msmsEval, that builds on previous work by statistically modeling the spectral quality discriminant function using a Gaussian mixture model. This allows a researcher to filter spectra based on the probability that a spectrum will ultimately be identified by database searching. We show that spectra that are predicted by msmsEval to be of high quality, yet remain unidentified in standard database searches, are candidates for more intensive search strategies. Using a well studied public dataset we also show that a high proportion (83.9%) of the spectra predicted by msmsEval to be of high quality but that elude standard search strategies, are in fact interpretable. msmsEval will be useful for high-throughput proteomics projects and is freely available for download from http://proteomics.ucd.ie/msmseval. Supports Windows, Mac OS X and Linux/Unix operating systems. The identification of proteins by tandem mass spectrometry (MS/MS) is an important step in many proteomics studies [1]. The introduction of orthogonal peptide separation techniques coupled to the mass spectrometer, such as multidimensional protein identification technology (MudPIT) [2] and combined fractional diagonal chromatography (COFRADIC) [3], has significantly increased the potential throughput of tandem mass spectrometry experiments, enabling the identification of 100s or 1000s of proteins from a single sample. Yet, this potential has not been fully realized because the vast amount of primary data generates computational burdens, notably time-consuming and processor-intensive tandem mass spectra interpretation. The most widely-used interpretation programs, such as SEQUEST [4], X!Tandem [5] and Mascot [6], use amino acid sequence databases that are expanding in size daily. Recently, heuristic programs such as X!Tandem [5] and PFSM [7] have been reported to reduce search times by 80–90%. 
Even so, an emerging goal for the biologist is to identify the post-translational modif
%U http://www.biomedcentral.com/1471-2105/8/51