Insert rows for missing dates/times


This is an old question, but I just wanted to post a dplyr way of handling this, as I came across this post while searching for an answer to a similar problem. I find it more intuitive and easier on the eyes than the zoo approach.

    library(dplyr)

    # Build the complete minute-by-minute sequence for the period of interest
    ts <- seq.POSIXt(as.POSIXlt("2001-09-01 0:00"), as.POSIXlt("2001-09-01 0:07"), by = "min")
    # Format it the same way as the timestamp column in original_data
    ts <- format.POSIXct(ts, '%m/%d/%y %H:%M')
    df <- data.frame(timestamp = ts)

    # Timestamps missing from original_data come back with NA in every other column
    data_with_missing_times <- full_join(df, original_data, by = "timestamp")

       timestamp     tr tt sr st
    1 09/01/01 00:00 15 15 78 42
    2 09/01/01 00:01 20 64 98 87
    3 09/01/01 00:02 31 84 23 35
    4 09/01/01 00:03 21 63 54 20
    5 09/01/01 00:04 15 23 36 15
    6 09/01/01 00:05 NA NA NA NA
    7 09/01/01 00:06 NA NA NA NA
    8 09/01/01 00:07 NA NA NA NA

Also using dplyr, this makes it easier to do something like change all those missing values to something else, which came in handy for me when plotting in ggplot.

    # Replace NA with 0 in every column except timestamp
    data_with_missing_times %>%
      mutate(across(-timestamp, ~ ifelse(is.na(.x), 0, .x)))

       timestamp     tr tt sr st
    1 09/01/01 00:00 15 15 78 42
    2 09/01/01 00:01 20 64 98 87
    3 09/01/01 00:02 31 84 23 35
    4 09/01/01 00:03 21 63 54 20
    5 09/01/01 00:04 15 23 36 15
    6 09/01/01 00:05  0  0  0  0
    7 09/01/01 00:06  0  0  0  0
    8 09/01/01 00:07  0  0  0  0
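For anyone who wants to try the join end to end, here is a minimal sketch with made-up data. The `original_data` below is a hypothetical stand-in (only two measurement columns), and unlike the snippet above it keeps the timestamps as POSIXct instead of formatted strings, which is an assumption on my part about what is convenient, not a requirement of the approach.

```r
library(dplyr)

# Hypothetical stand-in for original_data: minutes 00:05 to 00:07 are missing.
original_data <- data.frame(
  timestamp = as.POSIXct(
    c("2001-09-01 00:00", "2001-09-01 00:01", "2001-09-01 00:02",
      "2001-09-01 00:03", "2001-09-01 00:04"),
    tz = "UTC"
  ),
  tr = c(15, 20, 31, 21, 15),
  tt = c(15, 64, 84, 63, 23)
)

# Complete minute-by-minute grid covering the full range of interest.
all_times <- data.frame(
  timestamp = seq(
    as.POSIXct("2001-09-01 00:00", tz = "UTC"),
    as.POSIXct("2001-09-01 00:07", tz = "UTC"),
    by = "min"
  )
)

# Rows that exist only in the grid get NA in the measurement columns.
padded <- full_join(all_times, original_data, by = "timestamp")

# Replace the NAs with 0 in every column except the timestamp.
filled <- padded %>%
  mutate(across(-timestamp, ~ ifelse(is.na(.x), 0, .x)))
```

Keeping the column as POSIXct means both sides of the join must share the same time zone, which is why `tz = "UTC"` is set explicitly on both.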


I think the easiest thing is to convert the timestamp column to a date-time first, as already described, then convert the data to a zoo object and merge it with a complete time index:

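    library(zoo)

    df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M")
    df1.zoo <- zoo(df[, -1], df[, 1])  # use the timestamp column as the index
    # Merge with a zero-width zoo series that carries the complete minute index
    df2 <- merge(df1.zoo, zoo(, seq(start(df1.zoo), end(df1.zoo), by = "min")), all = TRUE)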

The start and end of the sequence are taken from df1.zoo (your original data), and by sets the step, here "min" to match your example. With all=TRUE, every timestamp from either series is kept, so the timestamps missing from the original data appear as rows of NAs.
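A small self-contained sketch of the zoo merge, using made-up data with a two-minute gap. The sample values and the names `z`, `grid` and `result` are hypothetical, and the conversion back to a data frame at the end is my own addition, not part of the answer above.

```r
library(zoo)

# Hypothetical sample data with a gap between 00:01 and 00:04.
df <- data.frame(
  timestamp = c("09/01/01 00:00", "09/01/01 00:01", "09/01/01 00:04"),
  tr = c(15, 20, 15),
  tt = c(15, 64, 23)
)
df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M", tz = "UTC")

# Measurement columns as a matrix, indexed by the timestamps.
z <- zoo(as.matrix(df[, -1]), order.by = df$timestamp)

# Zero-width zoo series that carries only the complete minute index.
grid <- zoo(, seq(start(z), end(z), by = "min"))

# Union of both indexes; minutes absent from z become rows of NA.
padded <- merge(z, grid, all = TRUE)

# Back to a data frame, with the index restored as a column.
result <- data.frame(timestamp = index(padded), coredata(padded))
```

The zero-width series contributes no columns of its own, so the merge only extends the index.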


Date padding is implemented in the padr package in R. If the date-time variable in your data frame is stored as POSIXct or POSIXlt, all you need to do is:

    library(padr)
    pad(df_name)

See vignette("padr") or this blog post for details on how it works.
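As a rough illustration, here is pad() on a tiny made-up data frame. The data is hypothetical, and passing `interval = "min"` explicitly is an assumption on my part to make the step size obvious; pad() can also work out the interval from the data.

```r
library(padr)

# Hypothetical sample data with a gap between 00:01 and 00:04.
df <- data.frame(
  timestamp = as.POSIXct(
    c("2001-09-01 00:00", "2001-09-01 00:01", "2001-09-01 00:04"),
    tz = "UTC"
  ),
  tr = c(15, 20, 15)
)

# Insert a row for every missing minute; the other columns are NA there.
padded <- pad(df, interval = "min")
```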