Abstract: Many stochastic optimization and empirical dynamic programming algorithms proposed in the literature approximate certain deterministic learning algorithms. Examples include stochastic gradient descent, empirical value iteration, and empirical Q-value iteration for discounted or average-cost MDPs. We refer to them as stochastic recursive algorithms, in which an exact contraction operator over a Euclidean space is replaced with an approximate iid random operator at every step of the iteration. These algorithms can be viewed within the framework of iterated random maps, so Markov chain theory can be leveraged to study their convergence properties. In this talk, we will present some new insights into the convergence properties of stochastic recursive algorithms. We complement the theoretical findings with extensive numerical simulations.
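As a rough illustration of the setup described in the abstract (not the speaker's own construction), the sketch below compares an exact contraction with an iid random approximation of it. The scalar map T(x) = A*x + B, the Gaussian noise model, the sample size, and all function names are assumptions chosen only for this example.

```python
import random

# Hypothetical exact contraction T(x) = A*x + B on the real line.
# Its modulus is A = 0.5 and its unique fixed point is B / (1 - A) = 2.
A, B = 0.5, 1.0


def exact_operator(x):
    """Apply the deterministic contraction once."""
    return A * x + B


def random_operator(x, n_samples, rng):
    """Apply an empirical version of T: B is replaced by the sample mean
    of n_samples noisy observations (assumed noise: standard Gaussian)."""
    noisy_b = sum(B + rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples
    return A * x + noisy_b


def iterate(step, x0, n_iters):
    """Run x_{k+1} = step(x_k) for n_iters steps and return the last iterate."""
    x = x0
    for _ in range(n_iters):
        x = step(x)
    return x


rng = random.Random(0)  # fixed seed so the run is reproducible
fixed_point = B / (1 - A)

# Exact recursion: converges geometrically to the fixed point.
x_exact = iterate(exact_operator, 10.0, 50)

# Stochastic recursion: a fresh iid random operator is drawn at every step,
# so the iterates form a Markov chain that hovers near the fixed point.
x_random = iterate(lambda x: random_operator(x, 1000, rng), 10.0, 50)

print("fixed point:", fixed_point)
print("exact iterate:", x_exact)
print("random iterate:", x_random)
```

In this toy setting the stochastic iterates do not settle on a single point; they fluctuate around the fixed point with a spread that shrinks as the number of samples per step grows.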
Bio: Abhishek Gupta is an assistant professor in the ECE department at The Ohio State University. He completed his PhD in Aerospace Engineering at UIUC in 2014. His research interests are in stochastic control theory, probability theory, and game theory, with applications to transportation markets, electricity markets, and cybersecurity of control systems.