write.csv for large data.table
UPDATE 2019.01.07:
fwrite has been on CRAN since 2016-11-25.
install.packages("data.table")
UPDATE 08.04.2016:
fwrite was recently added to the development version of the data.table package. It also runs in parallel (implicitly).
# Install development version of data.table
install.packages("data.table", repos = "https://Rdatatable.github.io/data.table", type = "source")

# Load package
library(data.table)

# Load data
data(USArrests)

# Write CSV
fwrite(USArrests, "USArrests_fwrite.csv")
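One thing to watch for: fwrite does not write row names by default (row.names = FALSE), so the state names of USArrests do not end up in the file written above. A minimal sketch of one way to keep them, assuming a column name "State" (an arbitrary choice for this example, not part of the dataset) and a temporary output file:

```r
library(data.table)

data(USArrests)

# Move the row names (state names) into a regular column.
# "State" is an arbitrary column name chosen for this example.
arrests_dt <- as.data.table(USArrests, keep.rownames = "State")

# Write to a temporary file so the working directory stays clean
out_file <- tempfile(fileext = ".csv")
fwrite(arrests_dt, out_file)

# Read the file back to check that the state names survived
arrests_back <- fread(out_file)
```

Passing row.names = TRUE to fwrite should also work, but the keep.rownames route gives the column an explicit name.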
According to the detailed benchmark tests shown under "speeding up the performance of write.table", fwrite is ~17x faster than write.csv there (YMMV).
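To check the speedup on your own machine, a rough timing sketch along these lines may help. The table size, the column names and the use of temporary files are assumptions made for illustration; the measured ratio will depend on the data, the disk and the number of threads.

```r
library(data.table)

# Synthetic data; size and column names are arbitrary choices for illustration
set.seed(1)
n <- 1e5
dt <- data.table(
  id = seq_len(n),
  x  = rnorm(n),
  g  = sample(letters, n, replace = TRUE)
)

csv_base <- tempfile(fileext = ".csv")
csv_fast <- tempfile(fileext = ".csv")

# Time both writers on the same data
t_base <- system.time(write.csv(dt, csv_base, row.names = FALSE))[["elapsed"]]
t_fast <- system.time(fwrite(dt, csv_fast))[["elapsed"]]

cat("write.csv:", t_base, "s\n")
cat("fwrite:   ", t_fast, "s\n")

# Sanity check: both files should contain all rows
n_base <- nrow(fread(csv_base))
n_fast <- nrow(fread(csv_fast))
```

A single run like this is noisy; repeating it a few times, or increasing n, gives a more reliable picture.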
UPDATE 15.12.2015:
In the future there might be an fwrite function in the data.table package, see: https://github.com/Rdatatable/data.table/issues/580. That thread links a gist with a prototype of such a function, which speeds up the process by a factor of 2 according to its author (https://gist.github.com/oseiskar/15c4a3fd9b6ec5856c89).
ORIGINAL ANSWER:
I had the same problem (trying to write even larger CSV files) and finally decided against using CSV files.
I recommend using SQLite instead, as it is much faster than dealing with CSV files:
library(DBI)
library(RSQLite)

# Set up database
con <- dbConnect(SQLite(), dbname = "test.db")

# Load example data
data(USArrests)

# Write data "USArrests" to table "arrests" in database "test.db"
# (row.names = TRUE stores the state names in a "row_names" column)
dbWriteTable(con, "arrests", USArrests, row.names = TRUE)

# Test if the data was correctly stored in the database, i.e.
# run an exemplary query on the newly created database
dbGetQuery(con, "SELECT * FROM arrests WHERE Murder > 10")
#         row_names Murder Assault UrbanPop Rape
#  1        Alabama   13.2     236       58 21.2
#  2        Florida   15.4     335       80 31.9
#  3        Georgia   17.4     211       60 25.8
#  4       Illinois   10.4     249       83 24.0
#  5      Louisiana   15.4     249       66 22.2
#  6       Maryland   11.3     300       67 27.8
#  7       Michigan   12.1     255       74 35.1
#  8    Mississippi   16.1     259       44 17.1
#  9         Nevada   12.2     252       81 46.0
# 10     New Mexico   11.4     285       70 32.1
# 11       New York   11.1     254       86 26.1
# 12 North Carolina   13.0     337       45 16.1
# 13 South Carolina   14.4     279       48 22.5
# 14      Tennessee   13.2     188       59 26.9
# 15          Texas   12.7     201       80 25.5

# Close the connection to the database
dbDisconnect(con)
For further information, see http://cran.r-project.org/web/packages/RSQLite/RSQLite.pdf
You can also use a tool such as http://sqliteadmin.orbmu2k.de/ to browse the database and export it to CSV or other formats.
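If a CSV file is still needed at the end, the export can also be done from R itself. The following is only a sketch under a few assumptions: it builds its own throwaway database in a temporary file (rather than reusing test.db), reuses the table name "arrests" from the example above, and writes the CSV with base R's write.csv.

```r
library(DBI)
library(RSQLite)

# Throwaway database in a temporary file (an assumption for this sketch)
db_file <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), dbname = db_file)

# Store the example data, keeping the state names as a column
data(USArrests)
dbWriteTable(con, "arrests", USArrests, row.names = TRUE)

# Pull the table (or any subset, via SQL) back into R
arrests <- dbGetQuery(con, "SELECT * FROM arrests")
dbDisconnect(con)

# Export the result to CSV and read it back as a check
csv_file <- tempfile(fileext = ".csv")
write.csv(arrests, csv_file, row.names = FALSE)
exported <- read.csv(csv_file)
```

For a table that does not fit in memory, the SELECT could be restricted with WHERE or LIMIT clauses and the pieces exported one at a time.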