Reinforcement learning theory and approaches are applied to JLQ model and Q function-based policy iteration algorithm is designed to optimize system performance.

英美