
Include zero frequencies in frequency table for Likert data


EDIT:

tabulate produces frequency tables, while table produces contingency tables. However, to get zero frequencies in a one-dimensional contingency table as in the above example, the code below still works.


This question provided the missing link. If you convert the Likert item to a factor and explicitly specify its levels, levels with a frequency of 0 are still counted:

data <- factor(data, levels = c(1:5))
table(data)

produces the desired output
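As a self-contained sketch of the factor approach: the responses below are made up for illustration (a 5-point item where nobody chose 3), and the variable names are not from the question.

```r
# Hypothetical responses to one 5-point Likert item; category 3 is unobserved.
responses <- c(1, 1, 1, 2, 4, 4, 5)

# Without explicit levels, table() only reports the values that occur.
observed_only <- table(responses)

# With explicit levels, category 3 is kept and reported with a count of 0.
responses_f <- factor(responses, levels = 1:5)
counts <- table(responses_f)
print(counts)
```

Here `observed_only` has four entries, while `counts` has five, one per level of the scale.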


table produces a contingency table, while tabulate produces a frequency table that includes zero counts.

tabulate(data)
# [1] 3 1 0 2 1

Another way (if you have integers starting from 1 - but easily modifiable for other cases):

setNames(tabulate(data), 1:max(data))  # to make the output easier to read
# 1 2 3 4 5 
# 3 1 0 2 1 
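One caveat worth sketching: tabulate only counts up to the largest observed value, so a zero at the top of the scale is lost unless you pass nbins. The data and the scale length below are assumptions made for illustration.

```r
# Hypothetical 5-point item where the top category (5) was never chosen.
responses <- c(1, 1, 2, 4)
n_levels <- 5  # assumed length of the scale

# By default the result stops at max(responses), so there are only 4 bins.
short <- tabulate(responses)

# nbins forces one bin per scale point, including the empty top category.
full <- tabulate(responses, nbins = n_levels)
named <- setNames(full, seq_len(n_levels))
print(named)
```

Using a fixed `n_levels` rather than `max(data)` keeps the output the same length for every item, which matters when you compare items.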


If you want to quickly calculate the counts or proportions for multiple Likert items and get your output in a data.frame, you may like the response.frequencies function in the psych package.

Let's create some data (note that there are no 8s or 9s):

df <- data.frame(item1 = sample(1:7, 2000, replace = TRUE),
                 item2 = sample(1:7, 2000, replace = TRUE),
                 item3 = sample(1:7, 2000, replace = TRUE))

If you want to calculate the proportion in each category:

psych::response.frequencies(df, max = 1000, uniqueitems = 1:9)

you get the following:

           1      2     3      4      5      6      7 8 9 miss
item1 0.1450 0.1435 0.139 0.1325 0.1380 0.1605 0.1415 0 0    0
item2 0.1535 0.1315 0.126 0.1505 0.1535 0.1400 0.1450 0 0    0
item3 0.1320 0.1505 0.132 0.1465 0.1425 0.1535 0.1430 0 0    0

If you want counts, you can multiply by the sample size:

psych::response.frequencies(df, max = 1000, uniqueitems = 1:9) * nrow(df)

You get the following:

        1   2   3   4   5   6   7 8 9 miss
item1 290 287 278 265 276 321 283 0 0    0
item2 307 263 252 301 307 280 290 0 0    0
item3 264 301 264 293 285 307 286 0 0    0
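If you would rather not depend on psych, a similar items-by-categories table can be sketched in base R by applying the factor approach to every column. This is an alternative, not what response.frequencies does internally; the seed and the `possible` vector are assumptions for illustration, and there is no column for missing values.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
df <- data.frame(item1 = sample(1:7, 2000, replace = TRUE),
                 item2 = sample(1:7, 2000, replace = TRUE),
                 item3 = sample(1:7, 2000, replace = TRUE))

possible <- 1:9  # assumed full set of response options

# One row per item, one column per possible response; unobserved ones are 0.
counts <- t(sapply(df, function(col) table(factor(col, levels = possible))))

# Row-wise proportions, computed from the counts rather than the reverse.
props <- prop.table(counts, margin = 1)
print(counts)
```

Computing the counts first and deriving proportions from them avoids recovering counts from rounded proportions.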

A few notes:

  • the default max is 10. Thus, if you have more than 10 response options, you'll have issues. Otherwise, in your case, and many Likert item cases, you could omit the max argument.
  • uniqueitems specifies the possible values. If all your values were present in at least one item, then this would be inferred from the data.
  • I think the function only works with numeric data. So if you have your Likert categories coded as "Strongly disagree", etc., it won't work.
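For items stored as text labels, one possible workaround is to convert them to a factor with the labels in scale order. That gives zero-inclusive counts directly, and as.integer yields numeric codes that a numeric-only function could then be given. The labels and answers below are hypothetical, and I have not checked how response.frequencies treats such codes.

```r
# Hypothetical label set, listed in scale order.
labels <- c("Strongly disagree", "Disagree", "Neutral",
            "Agree", "Strongly agree")

# Hypothetical answers; nobody chose "Disagree".
answers <- c("Agree", "Strongly disagree", "Agree", "Neutral", "Strongly agree")

# Explicit levels keep the unobserved label with a count of 0.
answers_f <- factor(answers, levels = labels)
label_counts <- table(answers_f)

# Integer codes 1..5 follow the order of `labels`.
codes <- as.integer(answers_f)
print(label_counts)
```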