Counts & Percentages in xTable, Sweave, R, cross tabulations Counts & Percentages in xTable, Sweave, R, cross tabulations r r

Counts & Percentages in xTable, Sweave, R, cross tabulations


In the Tables-package it is one line:

# data:dow <- sample(1:7, 100, replace=TRUE)purp <- sample(1:4, 100, replace=TRUE)dow <- factor(dow, 1:7, c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))purp <- factor(purp, 1:4, c("Business", "Commute", "Vacation", "Other"))dataframe <-  data.frame( dow, purp)# The packageslibrary(tables)library(Hmisc)# The tabletabular(  (Weekday=dow) ~  (Purpose=purp)*(Percent("row")+ 1)    ,data=dataframe        )# The latex tablelatex(  tabular(  (Weekday=dow) ~  (Purpose=purp)*(Percent("col")+ 1)    ,data=dataframe        ))

Using booktabs, you get this (can be further customised):

enter image description here


Great question, this one's bothering me for a while (it's not that hard, it's just me being lazy as hell... as usual). However... though the question's great, your approach, I'm afraid, isn't. There's priceless package called xtable that you can (mis)use. Besides, this issue is too common - there's a great chance that there's already some ready-made solution sitting somewhere on the Internets.

One of these days I'm about to work it out once and for all (I'll post the code on GitHub). The main idea goes a little bit like this: would you like frequency and/or percentage values within one cell (separated by \) or rows with absolute and relative frequencies (or %) in succession? I'd go with the 2nd one, so I'll post a "first-aid" solution for now:

ctab <- function(tab, dec = 2, ...) {  tab <- as.table(tab)  ptab <- paste(round(prop.table(tab) * 100, dec), "%", sep = "")  res <- matrix(NA, nrow = nrow(tab) * 2, ncol = ncol(tab), byrow = TRUE)  oddr <- 1:nrow(tab) %% 2 == 1  evenr <- 1:nrow(tab) %% 2 == 0  res[oddr, ] <- tab  res[evenr, ] <- ptab  res <- as.table(res)  colnames(res) <- colnames(tab)  rownames(res) <- rep(rownames(tab), each = 2)  return(res)}

Now try something like:

data(HairEyeColor)           # load an appropriate datasettb <- HairEyeColor[, , 1]    # choose only male respondentsctab(tb)      Brown  Blue   Hazel GreenBlack 32     11     10    3    Black 11.47% 3.94%  3.58% 1.08%Brown 53     50     25    15   Brown 19%    17.92% 8.96% 5.38%Red   10     10     7     7    Red   3.58%  3.58%  2.51% 2.51%Blond 3      30     5     8    Blond 1.08%  10.75% 1.79% 2.87%

Make sure you loaded xtable package and use print (it's a generic function, so you must pass a xtable classed object). It's important that you suppress the row names. I'll optimize this one tomorrow - it should be xtable compatible. It's 3AM in my time zone, so with these lines I'll end my answer:

print(xtable(ctab(tb)), include.rownames = FALSE)

Cheers!


I wasn't able to figure out how to generate a multi column header using xtable, but I did realize that i could concatenate my counts & percentages into the same column for printing purposes. Not ideal, but seems to get the job done. Here's the function I've written:

ctab3 <- function(row, col, margin = 1, dec = 2, percs = FALSE, total = FALSE, tex = FALSE, caption = NULL){    tab <- as.table(table(row,col))    ptab <- signif(prop.table(tab, margin = margin), dec)    if (percs){        z <- matrix(NA, nrow = nrow(tab), ncol = ncol(tab), byrow = TRUE)         for (i in 1:ncol(tab)) z[,i] <- paste(tab[,i], ptab[,i], sep = " ")        rownames(z) <- rownames(tab)        colnames(z) <- colnames(tab)        if (margin == 1 & total){            rowTot <- paste(apply(tab, 1, sum), apply(ptab, 1, sum), sep = " ")            z <- cbind(z, Total = rowTot)        } else if (margin == 2 & total) {            colTot <- paste(apply(tab, 2, sum), apply(ptab, 2, sum), sep = " ")            z <- rbind(z,Total = colTot)        }    } else {        z <- table(row, col)        }ifelse(tex, return(xtable(z, caption)), return(z))}

Probably not the final product, but does allow for some flexibility in parameters. At the most basic level, is only a wrapper of table() but can also generate LaTeX formatted output as well. Here is what I ended up using in a Sweave document:

<<echo = FALSE>>=for (i in 1:ncol(df)){    print(ctab3(        col = df[,1]        , row = df[,i]        , margin = 2        , total = TRUE        , tex = TRUE        , caption = paste("Dow by", colnames(df[i]), sep = " ")    ))}@