Beyond Classical Statistics: Optimality In Transfer Learning And Distributed Learning