Permute all unique enumerations of a vector in R Permute all unique enumerations of a vector in R r r

Permute all unique enumerations of a vector in R


EDIT: Here's a faster answer; again based on the ideas of Louisa Grey and Bryce Wagner, but with faster R code thanks to better use of matrix indexing. It's quite a bit faster than my original:

> ddd <- c(1,0,3,4,1,0,0,3,0,4)> system.time(up1 <- uniqueperm(d))   user  system elapsed   0.183   0.000   0.186 > system.time(up2 <- uniqueperm2(d))   user  system elapsed   0.037   0.000   0.038 

And the code:

uniqueperm2 <- function(d) {  dat <- factor(d)  N <- length(dat)  n <- tabulate(dat)  ng <- length(n)  if(ng==1) return(d)  a <- N-c(0,cumsum(n))[-(ng+1)]  foo <- lapply(1:ng, function(i) matrix(combn(a[i],n[i]),nrow=n[i]))  out <- matrix(NA, nrow=N, ncol=prod(sapply(foo, ncol)))  xxx <- c(0,cumsum(sapply(foo, nrow)))  xxx <- cbind(xxx[-length(xxx)]+1, xxx[-1])  miss <- matrix(1:N,ncol=1)  for(i in seq_len(length(foo)-1)) {    l1 <- foo[[i]]    nn <- ncol(miss)    miss <- matrix(rep(miss, ncol(l1)), nrow=nrow(miss))    k <- (rep(0:(ncol(miss)-1), each=nrow(l1)))*nrow(miss) +                l1[,rep(1:ncol(l1), each=nn)]    out[xxx[i,1]:xxx[i,2],] <- matrix(miss[k], ncol=ncol(miss))    miss <- matrix(miss[-k], ncol=ncol(miss))  }  k <- length(foo)  out[xxx[k,1]:xxx[k,2],] <- miss  out <- out[rank(as.numeric(dat), ties="first"),]  foo <- cbind(as.vector(out), as.vector(col(out)))  out[foo] <- d  t(out)}

It doesn't return the same order, but after sorting, the results are identical.

up1a <- up1[do.call(order, as.data.frame(up1)),]up2a <- up2[do.call(order, as.data.frame(up2)),]identical(up1a, up2a)

For my first attempt, see the edit history.


The following function (which implements the classic formula for repeated permutations just like you did manually in your question) seems quite fast to me:

upermn <- function(x) {    n <- length(x)    duplicates <- as.numeric(table(x))    factorial(n) / prod(factorial(duplicates))}

It does compute n! but not like permn function which generates all permutations first.

See it in action:

> dat <- c(1,0,3,4,1,0,0,3,0,4)> upermn(dat)[1] 18900> system.time(uperm(dat))   user  system elapsed   0.000   0.000   0.001 

UPDATE: I have just realized that the question was about generating all unique permutations not just specifying the number of them - sorry for that!

You could improve the unique(perm(...)) part with specifying unique permutations for one less element and later adding the uniqe elements in front of them. Well, my explanation may fail, so let the source speak:

uperm <- function(x) {u <- unique(x)                    # unique values of the vectorresult <- x                       # let's start the result matrix with the vectorfor (i in 1:length(u)) {    v <- x[-which(x==u[i])[1]]    # leave the first occurance of duplicated values    result <- rbind(result, cbind(u[i], do.call(rbind, unique(permn(v)))))}return(result)}

This way you could gain some speed. I was lazy to run the code on the vector you provided (took so much time), here is a small comparison on a smaller vector:

> dat <- c(1,0,3,4,1,0,0)> system.time(unique(permn(dat)))   user  system elapsed   0.264   0.000   0.268 > system.time(uperm(dat))   user  system elapsed   0.147   0.000   0.150 

I think you could gain a lot more by rewriting this function to be recursive!


UPDATE (again): I have tried to make up a recursive function with my limited knowledge:

uperm <- function(x) {    u <- sort(unique(x))    l <- length(u)    if (l == length(x)) {        return(do.call(rbind,permn(x)))    }    if (l == 1) return(x)    result <- matrix(NA, upermn(x), length(x))    index <- 1    for (i in 1:l) {        v <- x[-which(x==u[i])[1]]        newindex <- upermn(v)        if (table(x)[i] == 1) {            result[index:(index+newindex-1),] <- cbind(u[i], do.call(rbind, unique(permn(v))))            } else {                result[index:(index+newindex-1),] <- cbind(u[i], uperm(v))            }        index <- index+newindex    }    return(result)}

Which has a great gain:

> system.time(unique(permn(c(1,0,3,4,1,0,0,3,0))))   user  system elapsed  22.808   0.103  23.241 > system.time(uperm(c(1,0,3,4,1,0,0,3,0)))   user  system elapsed   4.613   0.003   4.645 

Please report back if this would work for you!


One option that hasn't been mentioned here is the allPerm function from the multicool package. It can be used pretty easily to get all the unique permutations:

library(multicool)perms <- allPerm(initMC(dat))dim(perms)# [1] 18900    10head(perms)#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]# [1,]    4    4    3    3    1    1    0    0    0     0# [2,]    0    4    4    3    3    1    1    0    0     0# [3,]    4    0    4    3    3    1    1    0    0     0# [4,]    4    4    0    3    3    1    1    0    0     0# [5,]    3    4    4    0    3    1    1    0    0     0# [6,]    4    3    4    0    3    1    1    0    0     0

In benchmarking I found it to be faster on dat than the solutions from the OP and daroczig but slower than the solution from Aaron.