Talk 1: Sequential Stopping for Parallel Monte Carlo
The most natural way to run many Monte Carlo computations is to sample until a given error tolerance is achieved. When the underlying variance can be easily estimated from the samples generated, the solution of this problem goes back to Chow and Robbins. But there are many Monte Carlo problems in which estimating variances is difficult. We discuss a new procedure for use in such settings, in which the Bessel process turns out to play a key role. The procedure is especially well-suited to parallel computing applications.
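To ground the classical setting, here is a minimal Python sketch of a Chow-Robbins-style fixed-width stopping rule in which the sample variance serves as a plug-in estimate; the function names, tolerance, and test integrand are illustrative assumptions, and this is the classical rule rather than the new procedure discussed in the talk.

import numpy as np

def sequential_stopping(sampler, eps, z=1.96, n_min=30, n_max=10**7, rng=None):
    """Sample until the half-width of an approximate confidence interval
    falls below eps (classical fixed-width sequential stopping).
    sampler(rng) is assumed to return one i.i.d. replication."""
    rng = np.random.default_rng(rng)
    samples = []
    while len(samples) < n_max:
        samples.append(sampler(rng))
        n = len(samples)
        if n >= n_min:
            s = np.std(samples, ddof=1)        # plug-in variance estimate
            if z * s / np.sqrt(n) <= eps:      # stop once the tolerance is met
                break
    return np.mean(samples), len(samples)

# Illustrative use: estimate E[exp(Z)], Z ~ N(0, 1), to within +/- 0.01.
est, n_used = sequential_stopping(lambda g: np.exp(g.standard_normal()), eps=0.01)
print(f"estimate = {est:.4f} after n = {n_used} samples")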
The work discussed in this talk is joint with Jing Dong.
Talk 2: Non-stationary Markov Processes: Approximations and Numerical Methods
In many Markov modeling contexts, the system under consideration exhibits strong time-of-day effects, day-of-week effects, or seasonality effects. In fact, most real-world systems that are modeled as Markov processes exhibit such non-stationarities. Nevertheless, the great majority of the academic literature focuses on modeling and theory for Markov processes with stationary transition probabilities, which describe systems that have no such time-of-day effects. In this talk, we will briefly describe three different methodologies for handling non-stationary Markov models. The first approach involves analytical approximations, anchored on traditional stationary analysis, that are available when the transition probabilities change slowly over the time horizon in question. The second class of methods relates to the use of stable numerical solvers for the Kolmogorov differential equations that exploit the stochastic structure present in these models. Finally, the third methodology involves the use of Monte Carlo simulation to study such non-stationary processes. In this setting, we will discuss the question of how to initialize such simulations and the role of backwards coupling in this context.
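As a small illustration of the second class of methods, the sketch below numerically integrates the Kolmogorov forward equations p'(t) = p(t) Q(t) for a toy two-state chain whose transition rate varies sinusoidally, a crude stand-in for a time-of-day effect; the generator, rates, and choice of a stiff off-the-shelf solver are my own assumptions, not the specialized stable solvers discussed in the talk.

import numpy as np
from scipy.integrate import solve_ivp

def Q(t):
    # Time-dependent generator of a toy 2-state chain; the "arrival" rate
    # oscillates with period 24 to mimic a time-of-day effect.
    lam = 1.0 + 0.5 * np.sin(2.0 * np.pi * t / 24.0)
    mu = 1.5
    return np.array([[-lam, lam],
                     [mu, -mu]])

def forward_equations(t, p):
    # Kolmogorov forward equations: dp/dt = p Q(t), with p a row vector.
    return p @ Q(t)

p0 = np.array([1.0, 0.0])                     # start in state 0
sol = solve_ivp(forward_equations, (0.0, 48.0), p0, method="Radau",
                dense_output=True, rtol=1e-8, atol=1e-10)

for t in (0.0, 12.0, 24.0, 48.0):
    p = sol.sol(t)
    print(f"t = {t:5.1f}   p(t) = {p}   (total mass {p.sum():.6f})")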
This work is joint with Harsha Honnappa, Alex Infanger, Mohammad Mousavi, and Zeyu Zheng.
Talk 3: Experimentation with Temporal Interference: Adaptive Markov Chain Sampling and Estimation
For many experiments, the treatment is a new algorithm or policy (e.g., a matching policy in a ridesharing system, delivery system, or ad auction). These policies change the state of the system, and so classical A/B tests that randomize platform participants to treatment or control are infeasible. Instead, the typical design is a “switchback” experiment: each policy is alternately applied in successive time intervals, and then the performance of each policy is estimated by its sample average across intervals where it was used.
Despite their appeal, switchback designs suffer from temporal interference: the application of policy A in one interval affects the initial state of the system as seen by policy B in the next interval. Motivated by the need to mitigate temporal interference, in this talk we study optimal experimental design and estimation for the comparison of two policies. Our key innovation is to model the problem as one of efficiently estimating the difference in steady-state reward between two different Markov chains on the same state space (the two policies).
While the dynamic programming optimality principle does not hold here, it turns out that the optimal policy can be computed as the solution to a convex optimization problem over the standard polyhedron of feasible state-action frequencies.
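To make the estimand concrete, the toy sketch below compares the exact difference in steady-state reward between two small Markov chains with the estimate produced by a naive switchback experiment on the same chains; the transition matrices, rewards, and interval length are illustrative assumptions, and the alternating design shown here is the naive one, not the optimal design derived in the talk.

import numpy as np

# Two policies = two transition matrices on the same 2-state space,
# with a common per-state reward vector (all values illustrative).
P_A = np.array([[0.9, 0.1],
                [0.5, 0.5]])
P_B = np.array([[0.7, 0.3],
                [0.2, 0.8]])
r = np.array([1.0, 0.0])

def stationary(P):
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

true_diff = stationary(P_A) @ r - stationary(P_B) @ r

# Naive switchback: alternate policies over fixed-length intervals and
# average the rewards collected while each policy was in force.
rng = np.random.default_rng(0)
state, interval_len, n_intervals = 0, 50, 2000
rewards = {0: [], 1: []}                      # key 0 -> policy A, key 1 -> policy B
for k in range(n_intervals):
    P = P_A if k % 2 == 0 else P_B
    for _ in range(interval_len):
        rewards[k % 2].append(r[state])
        state = rng.choice(2, p=P[state])     # the state carries over between
                                              # intervals: temporal interference
switchback_diff = np.mean(rewards[0]) - np.mean(rewards[1])

print(f"true steady-state reward difference : {true_diff:.4f}")
print(f"naive switchback estimate           : {switchback_diff:.4f}")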
This is joint work with Ramesh Johari and Mohammad Rasouli.