Abstract
In this paper, we introduce the Steady-State Policy Synthesis (SSPS) problem, which consists of finding a stochastic decision-making policy that maximizes expected rewards while satisfying a set of asymptotic behavioral specifications. These specifications are determined by the steady-state probability distribution of the Markov chain induced by a given policy. Since such distributions necessitate recurrence, we propose a solution that finds policies inducing recurrent Markov chains within possibly non-recurrent Markov Decision Processes (MDPs). The SSPS problem generalizes steady-state control, which has been shown to be in PSPACE. We improve upon this result by showing that SSPS is in P via linear programming. Our results are validated using CPLEX simulations on MDPs with over 10,000 states. We also prove that the deterministic variant of SSPS is NP-hard.