Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning



Wide adoption of deep networks as function approximators in modern reinforcement learning (RL) is changing the research environment, both with regard to best practices and application domains. Yet, our understanding of RL methods has been shaped by theoretical and empirical results with tabular representations and linear function approximators. These results suggest that RL methods using temporal differencing (TD) are superior to direct Monte Carlo (MC) estimation...