%0 Journal Article
%T A Sample Path Convergence Time Analysis of Drift-Plus-Penalty for Stochastic Optimization
%A Xiaohan Wei
%A Hao Yu
%A Michael J. Neely
%J Mathematics
%D 2015
%I arXiv
%X This paper considers the problem of minimizing the time average of a controlled stochastic process subject to multiple time average constraints on other related processes. The probability distribution of the random events in the system is unknown to the controller. A typical application is time average power minimization subject to network throughput constraints for different users in a network with time varying channel conditions. We show that with probability at least $1-2\delta$, the classical drift-plus-penalty algorithm provides a sample path $\mathcal{O}(\varepsilon)$ approximation to optimality with a convergence time $\mathcal{O}(\frac{1}{\varepsilon^2}\max\left\{\log^2\frac1\varepsilon\log\frac2\delta,~\log^3\frac2\delta\right\})$, where $\varepsilon>0$ is a parameter related to the algorithm. When there is only one constraint, we further show that the convergence time can be improved to $\mathcal{O}\left(\frac{1}{\varepsilon^2}\log^2\frac1\delta\right)$.
%U http://arxiv.org/abs/1510.02973v2