The tragedy of data science is that 79% of an analyst’s time goes to data preparation. Data preparation is not only tedious; it also steals time from analysis. Data packages make analysis fast and reproducible by simplifying data preparation, eliminating parsing, and versioning data. In round numbers, data packages speed up both I/O and data preparation by a factor of 10.
Apr 25, 2017