Traditional dynamic programming requires a mathematical model of the transition function for the state vector. Leveraging reinforcement learning techniques, we develop a framework to solve dynamic optimization problems that does not require modeling the data-generating process (DGP) of exogenous states. Instead, the method samples realizations of these states directly from the data, allowing the modeler to be “agnostic” about the DGP. We apply our method to a canonical life-cycle consumption-saving problem, solving the model without specifying the DGP for income. Using income data from the Current Population Survey (CPS), we find that the welfare loss from using a standard parametric income process, relative to placing no restrictions on the DGP, is small. We conclude by verifying that our method achieves a global optimum when given a known DGP and by discussing directions for future work.
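
To make the core idea concrete, the following is a minimal sketch, not the paper's implementation, of how the expectation over next-period income can be replaced by an average over realizations sampled from data. It solves a toy finite-horizon consumption-saving problem by plain backward induction rather than the paper's reinforcement learning framework; all parameter values, the asset grid, and the lognormal placeholder standing in for observed income draws are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: the expectation over next-period income is taken as a
# simple average over sampled realizations ("agnostic" about the DGP), rather
# than as an integral against a parametric income process. All parameters and
# the lognormal stand-in for observed income data are assumptions for this toy.

rng = np.random.default_rng(0)

T, beta, R, gamma = 10, 0.96, 1.02, 2.0        # horizon, discount, gross return, CRRA (assumed)
a_grid = np.linspace(1e-6, 20.0, 120)          # grid over beginning-of-period assets
y_draws = rng.lognormal(0.0, 0.3, size=200)    # placeholder for income observations (e.g., CPS data)

def u(c):
    """CRRA period utility."""
    return c ** (1 - gamma) / (1 - gamma)

V = np.zeros((T + 1, a_grid.size))             # V[T] = 0: no bequest motive

for t in range(T - 1, -1, -1):
    for i, a in enumerate(a_grid):
        values = []
        for y in y_draws:                      # loop over sampled income realizations
            m = R * a + y                      # cash on hand under this realization
            c = m - a_grid                     # consumption implied by each savings choice on the grid
            feasible = c > 0
            values.append(np.max(u(c[feasible]) + beta * V[t + 1][feasible]))
        V[t, i] = np.mean(values)              # empirical average replaces the parametric expectation
```

The only step that changes relative to a standard parametric solution is the last line: the continuation value is averaged over sampled income draws instead of being integrated against a specified income distribution, so the same sketch applies whatever process actually generated the data.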