A discrete-time Markov decision problem with finite state and action spaces is considered, in which the objective is to maximize the expected value of an arbitrary utility function defined on the sequence of rewards. Basic concepts are formulated that generalize the standard notions of the optimality equations, conserving and unimprovable strategies, and strategy and value iteration. Analogues of positive, negative and convergent dynamic programming are analyzed.
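As one possible formalization of this objective (a sketch only; the symbols $\Pi$, $\mathbb{E}^{\pi}_{x}$, $r_t$ and $U$ are assumed notation, not taken from the paper), the criterion may be written as

\[
  \sup_{\pi \in \Pi} \; \mathbb{E}^{\pi}_{x}\!\left[\, U(r_1, r_2, \dots) \,\right],
\]

where $x$ is the initial state, $\Pi$ the set of admissible strategies, $r_t$ the reward received at stage $t$, and $U$ a utility function on reward sequences. The familiar additive criterion corresponds to the special case $U(r_1, r_2, \dots) = \sum_{t} \beta^{t-1} r_t$ with discount factor $\beta$, which the formulation above generalizes.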