Can I read 1 big CSV file in parallel in R? [duplicate]
Based upon the comment by the OP, fread from the data.table package worked. Here's the code:
library(data.table)
dt <- fread("myFile.csv")
In the OP's case, reading a 1.2GB file took about 4-5 minutes with read.csv and just 14 seconds with fread.
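To check the difference on your own machine, a rough timing sketch along these lines may help. It writes a small throwaway file, so tmp_csv and its 100,000 rows are assumptions for illustration only, not the OP's 1.2GB file, and the gap will likely be much smaller at this size.

```r
library(data.table)

# Hypothetical test file: 100,000 rows written to a temporary path.
tmp_csv <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1:1e5, value = rnorm(1e5)), tmp_csv)

# Elapsed wall-clock seconds for each reader on the same file.
t_base  <- system.time(df <- read.csv(tmp_csv))[["elapsed"]]
t_fread <- system.time(dt <- fread(tmp_csv))[["elapsed"]]

print(c(read.csv = t_base, fread = t_fread))

# Clean up the temporary file.
unlink(tmp_csv)
```

Swap tmp_csv for the path to your own file to time a realistic workload.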
Update 29 January 2021: It appears that fread() now works in parallel, per a Tweet from the package's creator.
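If you want to see or control how many threads fread() uses, something like the following sketch should work in recent data.table versions. The file here is a small temporary stand-in (tmp_csv is hypothetical), and whether extra threads actually help depends on the file size and your machine.

```r
library(data.table)

# Hypothetical stand-in for "myFile.csv": a small temporary CSV.
tmp_csv <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1:1000, value = rnorm(1000)), tmp_csv)

# Number of threads data.table is currently configured to use.
n_threads <- getDTthreads()

# nThread defaults to getDTthreads(); it is passed explicitly here
# only to show where the setting goes.
dt <- fread(tmp_csv, nThread = n_threads)

# Clean up the temporary file.
unlink(tmp_csv)
```

setDTthreads() changes the thread count for the whole session, which may be preferable to passing nThread on every call.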