
Blank space not recognised as NA in fread


If you want to avoid the additional manipulation after reading the file, you could try using

quote = FALSE

when writing the csv. This stops write.csv() from putting quotation marks (" ") around the values, so empty values are written as empty fields and all missing values should now be read as NA. It should look like this:

# also turned off row names to prevent an additional column when reading the file
write.csv(df, "tr.csv", quote = FALSE, row.names = FALSE)
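To see why this can help, it may be useful to look at the raw lines that write.csv() produces with and without quoting. This is a minimal sketch on a made-up data frame; `df`, its values, and the temp file names are illustrative assumptions, not the asker's data.

```r
# Hypothetical two-row data frame with one empty string in a character column.
df <- data.frame(x2 = c(1006678566, NA),
                 x3 = c("", "ac"),
                 stringsAsFactors = FALSE)

quoted_file <- tempfile(fileext = ".csv")
plain_file  <- tempfile(fileext = ".csv")

# Default behaviour: character values are wrapped in double quotes.
write.csv(df, quoted_file, row.names = FALSE)
# Same data, written without any quoting.
write.csv(df, plain_file, quote = FALSE, row.names = FALSE)

quoted_lines <- readLines(quoted_file)
plain_lines  <- readLines(plain_file)

print(quoted_lines)  # the empty string appears as a quoted empty field: ""
print(plain_lines)   # the empty string appears as nothing after the comma
```

With the default quoting, a reader can treat the quoted empty field as an explicit empty string rather than a missing value, which seems to be what trips up fread here.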

Output -

tr1 <- fread("tr.csv", header=T, fill = T,
             sep= ",", na.strings = c("",NA), data.table = F,
             stringsAsFactors = FALSE)
tr1
  x1         x2   x3 x4
1 NA 1006678566 <NA> NA
2 NA         NA   ac  2
3 NA 1011160152 <NA>  3

tr2 <- read.table("tr.csv", fill = TRUE, header=T,
                   sep= ",", na.strings = c(""," ", NA),
                   stringsAsFactors = FALSE)
tr2
  x1         x2   x3 x4
1 NA 1006678566 <NA> NA
2 NA         NA   ac  2
3 NA 1011160152 <NA>  3


One thing I found is that the problem comes from the way the data gets saved when we do a write.csv().

Open the csv file, hit Delete on the blank cells in x4, and save. If you import it now, the NAs show up in R.

To check:

apply(tr1, 2, function(x) length(which(is.na(x))))

V1 x1 x2 x3 x4
 0  3  1  2  1
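The same per-column count can also be written with colSums(is.na(...)). This is a minimal sketch on a made-up data frame; `toy` is hypothetical and only mirrors the column names and NA pattern shown above, without the V1 row-name column.

```r
# Hypothetical data frame with the same NA pattern as the printed example.
toy <- data.frame(x1 = c(NA, NA, NA),
                  x2 = c(1006678566, NA, 1011160152),
                  x3 = c(NA, "ac", NA),
                  x4 = c(NA, 2, 3),
                  stringsAsFactors = FALSE)

# is.na() returns a logical matrix; summing each column counts its NAs.
na_counts <- colSums(is.na(toy))
print(na_counts)
```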

If there is a csv file with blanks and we read it with fread using

na.strings = c("", NA)

the blanks in character columns also show up as NA.
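A small round trip can be used to try this out. The sketch below assumes the data.table package is installed, uses a made-up two-column file, and spells the NA token as the string "NA" instead of a bare NA. How empty fields are treated may differ between data.table versions, so it is something to run and inspect, not a guaranteed result.

```r
library(data.table)

# Hypothetical csv with one blank character field and one literal NA.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("x2,x3",
             "1006678566,",
             "NA,ac"), csv_path)

# Read it back as a plain data.frame, treating blanks and "NA" as missing.
dt <- fread(csv_path, header = TRUE, sep = ",",
            na.strings = c("", "NA"), data.table = FALSE)

# Count missing values per column to see whether the blank became NA.
print(colSums(is.na(dt)))
```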


@SJB Use na.strings = c(NA_character_, "") as argument in fread() and blank spaces/cells will be read as NA.

There are forms of NA for various data types. See help(NA): NA_character_, NA_real_, NA_integer_, etc.
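The typed NA constants can be inspected directly with typeof(). This short sketch shows that a bare NA is logical, and that it is coerced to a character NA when combined with a string, as in c("", NA). The variable names are only for illustration.

```r
# Storage type of each NA constant.
na_types <- c(plain     = typeof(NA),
              character = typeof(NA_character_),
              real      = typeof(NA_real_),
              integer   = typeof(NA_integer_))
print(na_types)

# Inside c("", NA) the logical NA is coerced to character,
# so the vector is equivalent to c("", NA_character_).
mixed <- c("", NA)
print(typeof(mixed))
print(identical(mixed, c("", NA_character_)))
```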