Abstract:
This paper focuses on the sensitivity of the Laplacian eigenmap (LE) to outliers and presents a robust Laplacian eigenmap (RLE). RLE is based on outlier detection: it projects the outliers and their neighbors onto a low-dimensional tangent space using the robust PCA method. In this low-dimensional tangent space, RLE constructs the weight graph connecting the outliers and their neighbors, which reflects the intrinsic local geometry of the outliers. The algorithm reduces the impact of outliers on the Laplacian matrix. Simulated and real examples show that RLE is robust against outliers.
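For orientation, the core LE embedding that RLE robustifies can be sketched as follows. This is a minimal, illustrative implementation using the unnormalized graph Laplacian; the function name and the parameters k (neighborhood size) and t (heat-kernel width) are placeholders, not the paper's notation.

```python
import numpy as np

def laplacian_eigenmap(X, k=10, t=1.0, dim=2):
    """Basic Laplacian eigenmap: kNN graph with heat-kernel weights,
    then the bottom non-trivial eigenvectors of the graph Laplacian."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # kNN adjacency with heat-kernel weights
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D2[i])[1:k + 1]   # skip the point itself
        W[i, idx] = np.exp(-D2[i, idx] / t)
    W = np.maximum(W, W.T)                 # symmetrize the graph
    L = np.diag(W.sum(1)) - W              # unnormalized Laplacian
    vals, vecs = np.linalg.eigh(L)
    # drop the trivial constant eigenvector (eigenvalue ~ 0)
    return vecs[:, 1:dim + 1]
```

RLE's contribution is precisely that the weights W attached to detected outliers are recomputed in a tangent space obtained by robust PCA, rather than taken from raw Euclidean distances as above.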

Abstract:
The conventional Laplacian eigenmap preserves neighborhood relationships based on Euclidean distance; that is, neighboring high-dimensional data points are mapped to neighboring points in the low-dimensional space. However, the selection of the neighborhood may influence the global low-dimensional coordinates. In this paper, both the geodesic distance and the generalized Gaussian function are incorporated into the Laplacian eigenmap algorithm. First, a generalized Gaussian Laplacian eigenmap algorithm based on geodesic distance (GGLE) is proposed. The global low-dimensional coordinates obtained by GGLE have different clustering properties when different generalized Gaussian functions are used to measure the similarity between the high-dimensional data points. This paper then exploits these properties to propose an ensemble-based discriminant algorithm built on GGLE. The main advantages of the ensemble-based algorithm are that the neighborhood parameter K is fixed and that the neighborhood graph and geodesic distance matrix need to be constructed only once. Finally, recognition experiments on a wood texture dataset show that it is an efficient manifold-based ensemble discriminant algorithm.
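The two ingredients named above can be sketched together: geodesic distances approximated by shortest paths on a kNN graph, and similarities measured by a generalized Gaussian exp(-(d/t)^p), whose shape parameter p is what varies across the ensemble members. The function name and default parameter values are illustrative assumptions, not the paper's.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def ggle_weights(X, k=8, t=1.0, p=1.5):
    """GGLE-style weight matrix: geodesic distances approximated by
    shortest paths on a kNN graph, similarity measured with a
    generalized Gaussian exp(-(d/t)**p)."""
    n = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # kNN graph: keep only the k nearest edges per point (inf = no edge)
    G = np.full((n, n), np.inf)
    for i in range(n):
        idx = np.argsort(D[i])[1:k + 1]
        G[i, idx] = D[i, idx]
    G = np.minimum(G, G.T)                 # symmetrize
    geo = shortest_path(G, method='D')     # geodesic distance matrix
    # generalized Gaussian similarity; p controls how fast weights decay
    W = np.exp(-(geo / t) ** p)
    np.fill_diagonal(W, 0.0)
    return W
```

The graph and the matrix `geo` depend only on K, which is why the ensemble can reuse them: each member only recomputes the cheap final line with a different p.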

Abstract:
We derive several new applications of the concept of sequences of Laplacian cut-off functions on Riemannian manifolds (which we prove to exist on geodesically complete Riemannian manifolds with nonnegative Ricci curvature): In particular, we prove that this existence implies $\mathsf{L}^q$-estimates of the gradient, a new density result of smooth compactly supported functions in Sobolev spaces on the whole $\mathsf{L}^q$-scale, and a slightly weaker and slightly stronger variant of the conjecture of Braverman, Milatovic and Shubin on the nonnegativity of $\mathsf{L}^2$-solutions $f$ of $(-\Delta+1)f\geq 0$. The latter fact is proved within a new notion of positivity preservation for Riemannian manifolds which is related to stochastic completeness.
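For readers unfamiliar with the term, the notion assumed here is the standard one: a sequence of Laplacian cut-off functions on a Riemannian manifold $M$ is a sequence $(\chi_n)\subset\mathsf{C}^\infty_c(M)$ satisfying

```latex
0 \le \chi_n \le 1, \qquad
\chi_n \to 1 \ \text{pointwise as } n \to \infty, \qquad
\|\mathrm{d}\chi_n\|_\infty \to 0, \qquad
\|\Delta \chi_n\|_\infty \to 0 .
```

The last condition, uniform control of $\Delta\chi_n$ and not merely of the gradient, is what distinguishes Laplacian cut-offs from ordinary first-order cut-off sequences and drives the $\mathsf{L}^q$-results stated above.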

Abstract:
We show how 'test' vector fields may be used to give lower bounds for the Cheeger constant of a Euclidean domain (or Riemannian manifold with boundary), and hence for the lowest eigenvalue of the Dirichlet Laplacian on the domain. Also, we show that a continuous version of the classical Max Flow Min Cut Theorem for networks implies that Cheeger's constant may be obtained precisely from such vector fields. Finally, we apply these ideas to reprove a known lower bound for Cheeger's constant in terms of the inradius of a plane domain.
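The mechanism behind the lower bound is a one-line divergence-theorem argument. With the Cheeger constant and Cheeger's inequality in their usual form,

```latex
h(\Omega) \;=\; \inf_{A \subset \Omega} \frac{|\partial A|}{|A|},
\qquad
\lambda_1(\Omega) \;\ge\; \frac{h(\Omega)^2}{4},
```

a test vector field $V$ on $\Omega$ with $\sup_\Omega |V| \le 1$ and $\operatorname{div} V \ge c$ gives, for every admissible $A \subset \Omega$,

```latex
c\,|A| \;\le\; \int_A \operatorname{div} V
\;=\; \int_{\partial A} \langle V, \nu\rangle \, dS
\;\le\; |\partial A|,
\qquad\text{hence}\qquad h(\Omega) \ge c .
```

The Max Flow Min Cut argument in the paper shows that the supremum of such constants $c$ over admissible vector fields is not merely a lower bound but equals $h(\Omega)$.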

Abstract:
Purpose: To compare PCA3 score cut-offs of 35 vs. 20 in PCa diagnosis in patients undergoing repeat saturation prostate biopsy (SPBx). Materials and methods: From January 2010 to May 2011, 118 patients (median age 62.5 years) with a primary negative extended biopsy underwent transperineal SPBx (median 30 cores) for persistent suspicion of PCa. The indications for repeat biopsy were: persistently high or increasing PSA values; PSA > 10 ng/ml; PSA values between 4.1-10 or 2.6-4 ng/ml with free/total PSA ≤ 25% and ≤ 20%, respectively. Moreover, the urinary PCA3 score was evaluated before performing SPBx. Results: All patients had a negative DRE, and median PSA was 8.5 ng/ml (range: 3.7-24 ng/ml). A T1c PCa was found in 32 patients (27.1%): the median PCA3 score was 59 (range: 7-201) in the presence of PCa and 35 (range: 3-253) in the absence of cancer (p < 0.05). In the presence of ASAP and HGPIN, the median PCA3 score was 109 (range: 42-253) and 40 (range: 30-140), respectively. Diagnostic accuracy, sensitivity, specificity, PPV and NPV of the PCA3 score cut-off of 20 vs. 35 in PCa diagnosis were 44.9 vs. 50%, 90.6 vs. 71.9%, 27.9 vs. 41.8%, 31.9 vs. 31.5% and 88.9 vs. 80%, respectively. ROC analysis demonstrated an AUC for PCA3 ≥ 20 vs. ≥ 35 of 0.678 and 0.634, respectively. Conclusions: Our data suggest that PCA3 is more useful as an exclusion tool; moreover, setting the PCA3 cut-off at 20 vs. 35 would have avoided 22.9 vs. 38.1% of biopsies while missing 9.4% and 28% of PCa diagnoses.
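The reported percentages are mutually consistent with a single set of confusion-matrix counts per cut-off. The counts below are a back-calculation from the stated figures (32 cancers among 118 patients), not numbers given in the abstract, and the short script checks that they reproduce all five metrics.

```python
def biopsy_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts."""
    n = tp + fn + tn + fp
    return {
        'accuracy':    (tp + tn) / n,
        'sensitivity': tp / (tp + fn),
        'specificity': tn / (tn + fp),
        'ppv':         tp / (tp + fp),
        'npv':         tn / (tn + fn),
    }

# Counts back-calculated from the abstract (hypothetical reconstruction):
# cut-off 20: TP=29, FN=3, TN=24, FP=62
# cut-off 35: TP=23, FN=9, TN=36, FP=50
m20 = biopsy_metrics(29, 3, 24, 62)   # ~44.9%, 90.6%, 27.9%, 31.9%, 88.9%
m35 = biopsy_metrics(23, 9, 36, 50)   # ~50%,   71.9%, 41.8%, 31.5%, 80%
```

Under the same reconstruction, the "biopsies avoided" figures also check out: 27/118 ≈ 22.9% of patients fall below cut-off 20 (missing 3/32 ≈ 9.4% of cancers) and 45/118 ≈ 38.1% fall below cut-off 35 (missing 9/32 ≈ 28%).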

Abstract:
The presence of a sparse "truth" has been a constant assumption in the theoretical analysis of sparse PCA and is often implicit in its methodological development. This naturally raises questions about the properties of sparse PCA methods and how they depend on the assumption of sparsity. Under what conditions can the relevant variables be selected consistently if the truth is assumed to be sparse? What can be said about the results of sparse PCA without assuming a sparse and unique truth? We answer these questions by investigating the properties of the recently proposed Fantope projection and selection (FPS) method in the high-dimensional setting. Our results provide general sufficient conditions for sparsistency of the FPS estimator. These conditions are weak and can hold in situations where other estimators are known to fail. On the other hand, without assuming sparsity or identifiability, we show that FPS provides a sparse, linear dimension-reducing transformation that is close to the best possible in terms of maximizing the predictive covariance.
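In the notation commonly used for this estimator, FPS solves a convex relaxation of sparse PCA over the Fantope:

```latex
\widehat{X} \;=\; \operatorname*{arg\,max}_{X \in \mathcal{F}^d}\;
\langle S, X \rangle \;-\; \lambda \|X\|_{1,1},
\qquad
\mathcal{F}^d \;=\; \{\, X : 0 \preceq X \preceq I,\ \operatorname{tr}(X) = d \,\},
```

where $S$ is the sample covariance matrix, $d$ is the target dimension, and $\lambda \ge 0$ tunes the sparsity of the projection-like matrix $\widehat{X}$. The paper's two regimes correspond to reading $\widehat{X}$ either as a variable selector (sparsistency) or simply as a linear dimension-reducing transformation.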

Abstract:
We give a reduction from {\sc clique} to establish that sparse PCA is NP-hard. The reduction has a gap which we use to exclude an FPTAS for sparse PCA (unless P=NP). Under weaker complexity assumptions, we also exclude polynomial constant-factor approximation algorithms.
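The hardness results concern the standard cardinality-constrained formulation of sparse PCA:

```latex
\mathrm{OPT}(A, k) \;=\; \max_{x \in \mathbb{R}^n}\; x^{\mathsf{T}} A\, x
\quad \text{subject to} \quad \|x\|_2 = 1,\ \ \|x\|_0 \le k,
```

where $A$ is a positive semidefinite matrix (e.g., a sample covariance matrix) and $k$ bounds the number of nonzero loadings. The reduction encodes a {\sc clique} instance into $A$ so that the optimal value separates graphs with and without a $k$-clique, and the size of that separation is the gap used to rule out an FPTAS.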

Abstract:
Sparse Principal Component Analysis (PCA) methods are efficient tools to reduce the dimension (or the number of variables) of complex data. Sparse principal components (PCs) are easier to interpret than conventional PCs, because most loadings are zero. We study the asymptotic properties of these sparse PC directions for scenarios with fixed sample size and increasing dimension (i.e. High Dimension, Low Sample Size (HDLSS)). Under the previously studied spike covariance assumption, we show that Sparse PCA remains consistent under the same large spike condition that was previously established for conventional PCA. Under a broad range of small spike conditions, we find a large set of sparsity assumptions where Sparse PCA is consistent, but PCA is strongly inconsistent. The boundaries of the consistent region are clarified using an oracle result.
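A single-spike instance of the covariance model studied in this literature, stated here only for orientation, takes the form

```latex
\Sigma_p \;=\; \lambda_p\, u\, u^{\mathsf{T}} \;+\; \sigma^2 I_p,
\qquad \lambda_p \asymp p^{\alpha},
```

with a fixed sample size $n$ and dimension $p \to \infty$. The "large spike" condition corresponds to $\alpha > 1$, where both PCA and Sparse PCA recover the direction $u$ consistently; the "small spike" regime $\alpha \le 1$ is where, under suitable sparsity of $u$, Sparse PCA remains consistent while conventional PCA is strongly inconsistent.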

Abstract:
It is well known that Sparse PCA (Sparse Principal Component Analysis) is NP-hard to solve exactly on worst-case instances. What is the complexity of solving Sparse PCA approximately? Our contributions include: 1) a simple and efficient algorithm that achieves an $n^{-1/3}$-approximation; 2) NP-hardness of approximation to within $(1-\varepsilon)$, for some small constant $\varepsilon > 0$; 3) SSE-hardness of approximation to within any constant factor; and 4) an $\exp\exp\left(\Omega\left(\sqrt{\log \log n}\right)\right)$ ("quasi-quasi-polynomial") gap for the standard semidefinite program.