Unsupervised Learning Of Spatiotemporal Features By Video Completion