Computer Science, 2015

The Limitations of Optimization from Samples

Abstract: As we grow highly dependent on data for making predictions, we translate these predictions into models that help us make informed decisions. But what guarantees do we have? Can we optimize decisions on models learned from data and be guaranteed that we achieve desirable outcomes? In this paper we formalize this question through a novel model called optimization from samples (OPS). In the OPS model, we are given sampled values of a function drawn from some distribution, and our objective is to optimize the function under some constraint. Our main interest is in the following question: are functions that are learnable (from samples) and approximable (given oracle access to the function) also optimizable from samples? We show that there are classes of submodular functions which have desirable approximation and learnability guarantees, yet for which no reasonable approximation for optimization from samples is achievable. In particular, our main result shows that even for maximization of coverage functions under a cardinality constraint $k$, there exists a hypothesis class of functions that cannot be approximated within a factor of $n^{-1/4 + \epsilon}$ (for any constant $\epsilon > 0$) of the optimal solution, from samples drawn from the uniform distribution over all sets of size at most $k$. For the general case of monotone submodular functions, we show an $n^{-1/3 + \epsilon}$ lower bound and an almost matching $\tilde{\Omega}(n^{-1/3})$-optimization from samples algorithm. Additive and unit-demand functions can be optimized from samples to within arbitrarily good precision. Finally, we also consider a corresponding notion of additive approximation for continuous optimization from samples, and show near-optimal hardness for concave maximization and convex minimization.
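For concreteness, the following Python sketch sets up the OPS scenario described in the abstract for a coverage function under a cardinality constraint $k$: samples $(S, f(S))$ are drawn with $S$ uniform over all sets of size at most $k$, and a naive strategy returns the best sampled set. The helper names and the "best observed sample" baseline are illustrative assumptions, not the paper's constructions; the lower bounds quoted above apply to any algorithm that sees only such samples, this baseline included.

```python
import math
import random

# Illustrative sketch of the optimization-from-samples (OPS) setting for
# coverage maximization under a cardinality constraint k. All helper names
# below are hypothetical and for illustration only; they are not the paper's
# constructions or algorithms.

def make_coverage_function(n, universe_size, rng):
    """Element i covers a small random subset of the universe;
    f(S) is the number of universe items covered jointly by the elements of S."""
    covers = [frozenset(rng.sample(range(universe_size), rng.randint(1, 5)))
              for _ in range(n)]

    def f(S):
        covered = set()
        for i in S:
            covered |= covers[i]
        return len(covered)

    return f

def draw_samples(f, n, k, num_samples, rng):
    """Draw (S, f(S)) pairs with S uniform over all sets of size at most k:
    pick the size s with probability proportional to C(n, s), then a uniform
    set of that size."""
    sizes = list(range(k + 1))
    weights = [math.comb(n, s) for s in sizes]
    samples = []
    for _ in range(num_samples):
        s = rng.choices(sizes, weights=weights)[0]
        S = frozenset(rng.sample(range(n), s))
        samples.append((S, f(S)))
    return samples

def best_observed_sample(samples):
    """A naive OPS strategy: return the highest-valued sampled set."""
    return max(samples, key=lambda pair: pair[1])[0]

if __name__ == "__main__":
    rng = random.Random(0)
    n, k = 50, 5
    f = make_coverage_function(n, universe_size=200, rng=rng)
    samples = draw_samples(f, n, k, num_samples=2000, rng=rng)
    S_hat = best_observed_sample(samples)
    print("chosen set:", sorted(S_hat), "value:", f(S_hat))
```

Choosing the sample size with probability proportional to $\binom{n}{s}$ is what makes the distribution uniform over all sets of size at most $k$, rather than uniform over sizes.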