OALib Journal期刊

ISSN: 2333-9721




2019 ( 5 )

2018 ( 56 )

2017 ( 47 )

2016 ( 72 )


匹配条件: “Alexander Gray” ,找到相关结果约21006条。
UPAL: Unbiased Pool Based Active Learning
Ravi Ganti,Alexander Gray
Computer Science , 2011,
Abstract: In this paper we address the problem of pool based active learning, and provide an algorithm, called UPAL, that works by minimizing the unbiased estimator of the risk of a hypothesis in a given hypothesis space. For the space of linear classifiers and the squared loss we show that UPAL is equivalent to an exponentially weighted average forecaster. Exploiting some recent results regarding the spectra of random matrices allows us to establish consistency of UPAL when the true hypothesis is a linear hypothesis. Empirical comparison with an active learner implementation in Vowpal Wabbit, and a previously proposed pool based active learner implementation show good empirical performance and better scalability.
Stochastic Smoothing for Nonsmooth Minimizations: Accelerating SGD by Exploiting Structure
Hua Ouyang,Alexander Gray
Computer Science , 2012,
Abstract: In this work we consider the stochastic minimization of nonsmooth convex loss functions, a central problem in machine learning. We propose a novel algorithm called Accelerated Nonsmooth Stochastic Gradient Descent (ANSGD), which exploits the structure of common nonsmooth loss functions to achieve optimal convergence rates for a class of problems including SVMs. It is the first stochastic algorithm that can achieve the optimal O(1/t) rate for minimizing nonsmooth loss functions (with strong convexity). The fast rates are confirmed by empirical comparisons, in which ANSGD significantly outperforms previous subgradient descent algorithms including SGD.
Data-Distributed Weighted Majority and Online Mirror Descent
Hua Ouyang,Alexander Gray
Computer Science , 2011,
Abstract: In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.
Local Support Vector Machines:Formulation and Analysis
Ravi Ganti,Alexander Gray
Statistics , 2013,
Abstract: We provide a formulation for Local Support Vector Machines (LSVMs) that generalizes previous formulations, and brings out the explicit connections to local polynomial learning used in nonparametric estimation literature. We investigate the simplest type of LSVMs called Local Linear Support Vector Machines (LLSVMs). For the first time we establish conditions under which LLSVMs make Bayes consistent predictions at each test point $x_0$. We also establish rates at which the local risk of LLSVMs converges to the minimum value of expected local risk at each point $x_0$. Using stability arguments we establish generalization error bounds for LLSVMs.
Maximum Inner-Product Search using Tree Data-structures
Parikshit Ram,Alexander G. Gray
Computer Science , 2012,
Abstract: The problem of {\em efficiently} finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied in literature. However, a closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting to the best of our knowledge. In this paper we consider this general problem and contrast it with the existing best-match algorithms. First, we propose a general branch-and-bound algorithm using a tree data structure. Subsequently, we present a dual-tree algorithm for the case where there are multiple queries. Finally we present a new data structure for increasing the efficiency of the dual-tree algorithm. These branch-and-bound algorithms involve novel bounds suited for the purpose of best-matching with inner products. We evaluate our proposed algorithms on a variety of data sets from various applications, and exhibit up to five orders of magnitude improvement in query time over the naive search technique.
Faster Gaussian Summation: Theory and Experiment
Dongryeol Lee,Alexander G. Gray
Computer Science , 2012,
Abstract: We provide faster algorithms for the problem of Gaussian summation, which occurs in many machine learning methods. We develop two new extensions - an O(Dp) Taylor expansion for the Gaussian kernel with rigorous error bounds and a new error control scheme integrating any arbitrary approximation method - within the best discretealgorithmic framework using adaptive hierarchical data structures. We rigorously evaluate these techniques empirically in the context of optimal bandwidth selection in kernel density estimation, revealing the strengths and weaknesses of current state-of-the-art approaches for the first time. Our results demonstrate that the new error control scheme yields improved performance, whereas the series expansion approach is only effective in low dimensions (five or less).
The World Wide Telescope: An Archetype for Online Science
Jim Gray,Alexander S. Szalay
Computer Science , 2004,
Abstract: Most scientific data will never be directly examined by scientists; rather it will be put into online databases where it will be analyzed and summarized by computer programs. Scientists increasingly see their instruments through online scientific archives and analysis tools, rather than examining the raw data. Today this analysis is primarily driven by scientists asking queries, but scientific archives are becoming active databases that self-organize and recognize interesting and anomalous facts as data arrives. In some fields, data from many different archives can be cross-correlated to produce new insights. Astronomy presents an excellent example of these trends; and, federating Astronomy archives presents interesting challenges for computer scientists.
Building Bridges: Viewing Active Learning from the Multi-Armed Bandit Lens
Ravi Ganti,Alexander G. Gray
Computer Science , 2013,
Abstract: In this paper we propose a multi-armed bandit inspired, pool based active learning algorithm for the problem of binary classification. By carefully constructing an analogy between active learning and multi-armed bandits, we utilize ideas such as lower confidence bounds, and self-concordant regularization from the multi-armed bandit literature to design our proposed algorithm. Our algorithm is a sequential algorithm, which in each round assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for the label of this sampled point. The design of this sampling distribution is also inspired by the analogy between active learning and multi-armed bandits. We show how to derive lower confidence bounds required by our algorithm. Experimental comparisons to previously proposed active learning algorithms show superior performance on some standard UCI datasets.
Where the Rubber Meets the Sky: Bridging the Gap between Databases and Science
Jim Gray,Alexander S. Szalay
Computer Science , 2005,
Abstract: Scientists in all domains face a data avalanche - both from better instruments and from improved simulations. We believe that computer science tools and computer scientists are in a position to help all the sciences by building tools and developing techniques to manage, analyze, and visualize peta-scale scientific information. This article is summarizes our experiences over the last seven years trying to bridge the gap between database technology and the needs of the astronomy community in building the World-Wide Telescope.
Multibody Multipole Methods
Dongryeol Lee,Arkadas Ozakin,Alexander G. Gray
Physics , 2011, DOI: 10.1016/j.jcp.2012.06.027
Abstract: A three-body potential function can account for interactions among triples of particles which are uncaptured by pairwise interaction functions such as Coulombic or Lennard-Jones potentials. Likewise, a multibody potential of order $n$ can account for interactions among $n$-tuples of particles uncaptured by interaction functions of lower orders. To date, the computation of multibody potential functions for a large number of particles has not been possible due to its $O(N^n)$ scaling cost. In this paper we describe a fast tree-code for efficiently approximating multibody potentials that can be factorized as products of functions of pairwise distances. For the first time, we show how to derive a Barnes-Hut type algorithm for handling interactions among more than two particles. Our algorithm uses two approximation schemes: 1) a deterministic series expansion-based method; 2) a Monte Carlo-based approximation based on the central limit theorem. Our approach guarantees a user-specified bound on the absolute or relative error in the computed potential with an asymptotic probability guarantee. We provide speedup results on a three-body dispersion potential, the Axilrod-Teller potential.

Copyright © 2008-2017 Open Access Library. All rights reserved.