How do you read multiple .txt files into R? [duplicate]
You can try this:
filelist = list.files(pattern = ".*.txt")#assuming tab separated values with a header datalist = lapply(filelist, function(x)read.table(x, header=T)) #assuming the same header/columns for all filesdatafr = do.call("rbind", datalist)
There are three fast ways to read multiple files and put them into a single data frame or data table
First get the list of all txt files (including those in sub-folders)
list_of_files <- list.files(path = ".", recursive = TRUE, pattern = "\\.txt$", full.names = TRUE)
1) Use fread()
w/ rbindlist()
from the data.table
package
#install.packages("data.table", repos = "https://cran.rstudio.com")library(data.table)# Read all the files and create a FileName column to store filenamesDT <- rbindlist(sapply(list_of_files, fread, simplify = FALSE), use.names = TRUE, idcol = "FileName")
2) Use readr::read_table2()
w/ purrr::map_df()
from the tidyverse
framework:
#install.packages("tidyverse", # dependencies = TRUE, repos = "https://cran.rstudio.com")library(tidyverse)# Read all the files and create a FileName column to store filenamesdf <- list_of_files %>% set_names(.) %>% map_df(read_table2, .id = "FileName")
3) (Probably the fastest out of the three) Use vroom::vroom()
:
#install.packages("vroom", # dependencies = TRUE, repos = "https://cran.rstudio.com")library(vroom)# Read all the files and create a FileName column to store filenamesdf <- vroom(list_of_files, .id = "FileName")
Note: to clean up file names, use basename
or gsub
functions
Benchmark: readr
vs data.table
vs vroom
for big data
Edit 1: to read multiple csv
files and skip the header
using readr::read_csv
list_of_files <- list.files(path = ".", recursive = TRUE, pattern = "\\.csv$", full.names = TRUE)df <- list_of_files %>% purrr::set_names(nm = (basename(.) %>% tools::file_path_sans_ext())) %>% purrr::map_df(read_csv, col_names = FALSE, skip = 1, .id = "FileName")
Edit 2: to convert a pattern including a wildcard into the equivalent regular expression, use glob2rx()
There is a really, really easy way to do this now: the readtext package.
readtext::readtext("path_to/your_files/*.txt")
It really is that easy.