Learning And Planning With The Average-Reward Formulation