The Feather format produces smaller files than CSV and is cheaper to read and write. It also stores dtypes, which helps avoid some problems when loading the data for further processing.
We initially moved to .csv.gz, which was an improvement on uncompressed CSVs. However, gzip compression uses a significant amount of CPU. We believe that moving to Arrow/Feather would use much less CPU and be an overall improvement.
To do:
.feather/.arrow files