Ways to read only select columns from a file into R? (A happy medium between `read.table` and `scan`?) [duplicate]


Sometimes I do something like this when I have the data in a tab-delimited file:

df <- read.table(pipe("cut -f1,5,28 myFile.txt"))

That lets cut do the data selection, which it can do without using much memory at all.

See "Only read limited number of columns" for a pure R version, which uses "NULL" in the colClasses argument to read.table.


One possibility is to use pipe() in lieu of the filename and have awk or similar filters extract only the columns you want.

See help(connection) for more on pipe and friends.
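For illustration, a minimal sketch of the awk variant. It assumes a Unix-like system with awk on the PATH; the temporary file and its column names (id, a, b) are made up for the example.

```r
# Create a small tab-delimited file to work with (hypothetical data).
tmp <- tempfile(fileext = ".txt")
write.table(
  data.frame(id = 1:3, a = c(0.1, 0.2, 0.3), b = c("x", "y", "z")),
  file = tmp, sep = "\t", quote = FALSE, row.names = FALSE
)

# awk splits each line on tabs and prints only fields 1 and 3,
# joined by a tab again, so R never sees the middle column.
cmd <- sprintf("awk -F'\\t' -v OFS='\\t' '{ print $1, $3 }' %s", shQuote(tmp))

# pipe() runs the command and hands its output to read.table().
df <- read.table(pipe(cmd), header = TRUE, sep = "\t")
str(df)
```

Because the header line passes through awk like any other line, header = TRUE still picks up the names of the columns that were kept.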

Edit: read.table() can also do this for you if you are explicit about colClasses: an entry of "NULL" (the quoted string) for a given column skips that column altogether. See help(read.table). So there we have a solution in base R without additional packages or tools.
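A small sketch of the colClasses approach, using a made-up five-column file (the column names are only for illustration). Note that colClasses needs one entry per column in the file, so you have to know how many columns there are.

```r
# Write a small tab-delimited file with five columns (hypothetical data).
tmp <- tempfile(fileext = ".txt")
write.table(
  data.frame(id = 1:3, a = c(0.1, 0.2, 0.3), b = c("x", "y", "z"),
             c = c(TRUE, FALSE, TRUE), score = c(10L, 20L, 30L)),
  file = tmp, sep = "\t", quote = FALSE, row.names = FALSE
)

# One entry per column: the string "NULL" skips that column,
# NA lets read.table() work out the type as usual.
keep <- c(NA, "NULL", "NULL", "NULL", NA)
df <- read.table(tmp, header = TRUE, sep = "\t", colClasses = keep)

str(df)
unlink(tmp)
```

Only the first and last columns end up in df; the three in between are skipped while the file is read.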


I think Dirk's approach is straightforward as well as fast. An alternative that I've used is to load the data into SQLite, which loads MUCH faster than read.table(), and then pull out only what you want. The sqldf package makes this all quite easy. A previous Stack Overflow answer gives code examples for sqldf.
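As a rough sketch of what that can look like (not the code from the linked answer), sqldf provides read.csv.sql(), which loads the file into a temporary SQLite database and returns only the result of the query. This assumes the sqldf package is installed; the file and its column names are invented for the example.

```r
library(sqldf)

# Create a small tab-delimited file to work with (hypothetical data).
tmp <- tempfile(fileext = ".txt")
write.table(
  data.frame(id = 1:3, a = c(0.1, 0.2, 0.3), score = c(10L, 20L, 30L)),
  file = tmp, sep = "\t", quote = FALSE, row.names = FALSE
)

# Inside the SQL statement the input is referred to as the table "file".
# Only the selected columns come back into R as a data frame.
df <- read.csv.sql(tmp, sql = "select id, score from file",
                   header = TRUE, sep = "\t")
str(df)
```

The same query can also carry a where clause, so rows can be filtered on the SQLite side as well as columns.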