Extract the labels attribute from "labeled" tibble columns from a haven import from Stata

r data-structures attributes stata r-haven

The original question asks how 'to extract the values of the labels attribute to a list.' A solution to the main question follows (assuming some_df is imported via haven and has label attributes). Update: I've now added a way to extract a label vector with the package sjlabelled.

library(purrr)

n <- ncol(some_df)
labels_list <- map(1:n, function(x) attr(some_df[[x]], "label"))

# if a vector of character strings is preferable
labels_vector <- map_chr(1:n, function(x) attr(some_df[[x]], "label"))

# to make a simple codebook
library(knitr)  # kable() comes from knitr; there is no package named "kable"
variable_name <- names(some_df)
data.frame(variable_name, description = labels_vector) %>%
  kable(format = 'markdown')

# UPDATE: another approach with package sjlabelled
library(sjlabelled)
sjlabelled::get_label(some_df)
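One caveat with map_chr(): it expects exactly one string per column, so it stops with an error if any column has no "label" attribute. Below is a minimal base R sketch of a fallback. It assumes a toy data frame in place of the haven import; the data and the helper name get_label_or_na are made up for illustration.

```r
# Toy stand-in for a haven import (hypothetical data, base R only).
some_df <- data.frame(sex = c(1, 2, 1), age = c(30, 41, 25))
attr(some_df$sex, "label") <- "Sex of respondent"  # age is left unlabelled

# Return the variable label, or NA_character_ when the column has none.
get_label_or_na <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)
}

# vapply() checks that every result is a single string, like map_chr().
labels_vector <- vapply(some_df, get_label_or_na, character(1))
print(labels_vector)
```

The result is a character vector named after the columns, with NA wherever a label is missing, so it can still go straight into the codebook data frame.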

I'm going to take a go at answering this one, though my code isn't very pretty.

First I make a function to extract a named attribute from a single column.

ColAttr <- function(x, attrC, ifIsNull) {
  # Returns column attribute named in attrC, if present, else ifIsNull.
  atr <- attr(x, attrC, exact = TRUE)
  atr <- if (is.null(atr)) {ifIsNull} else {atr}
  atr
}

Then a function to lapply it to all the columns:

AtribLst <- function(df, attrC, isNullC) {
  # Returns list of values of the col attribute attrC, if present, else isNullC
  lapply(df, ColAttr, attrC = attrC, ifIsNull = isNullC)
}

Finally I run it for each attribute.

stub93 <- AtribLst(cps_00093.df, attrC = "label", isNullC = NA)
labels93 <- AtribLst(cps_00093.df, attrC = "labels", isNullC = NA)
labels93 <- labels93[!is.na(labels93)]

All the columns have a "label" attribute, but only some are of type "labeled" and so also have a "labels" attribute. The "labels" attribute is a named vector: its values match the values that occur in the data, and its names tell you what those values signify.
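To make the structure of "labels" concrete, here is a small sketch in base R. The column is built by hand to mimic a labelled import, so the data and the label text are assumptions, not output from a real Stata file.

```r
# Hypothetical column mimicking what a labelled import carries (base R only).
sex <- structure(c(1, 2, 2, 1),
                 label  = "Sex of respondent",
                 labels = c(Male = 1, Female = 2))

# The named vector: values are the codes, names are their meanings.
value_labels <- attr(sex, "labels", exact = TRUE)

# Find each data value among the codes, then take the matching name.
sex_text <- names(value_labels)[match(sex, value_labels)]
print(sex_text)
```

Any data value with no matching code comes back as NA, which is a quick way to spot unlabelled values.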

Jumping off @omar-waslow's answer above, but adding the use of attr_getter.

If the data (some_df) is imported using read_dta in the haven package, then each column in the tibble has an attribute called "label". So we go through the data frame column by column and pull out that attribute. This creates a two-column data frame of column names and labels, which can be joined back onto the data (after pivot_longer, for example).

library(tidyverse)

label_lookup_map <- tibble(
  col_name = some_df %>% names(),
  labels = some_df %>% map_chr(attr_getter("label"))
)
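The join-back step can be sketched as follows. This is a base R stand-in rather than the tidyverse pipeline: a manual reshape replaces pivot_longer and merge() replaces a dplyr join. The toy data and the names long_df and labelled_long are made up for illustration.

```r
# Toy stand-in for a haven import (hypothetical data, base R only).
some_df <- data.frame(height = c(170, 182), weight = c(65, 80))
attr(some_df$height, "label") <- "Height in cm"
attr(some_df$weight, "label") <- "Weight in kg"

# Two-column lookup: one row per variable.
label_lookup <- data.frame(
  col_name = names(some_df),
  labels = unname(vapply(some_df, function(x) attr(x, "label", exact = TRUE),
                         character(1))),
  stringsAsFactors = FALSE
)

# Long format: one row per (variable, value) pair.
# as.vector() drops the attributes so the columns can be combined.
long_df <- data.frame(
  col_name = rep(names(some_df), each = nrow(some_df)),
  value = unlist(lapply(some_df, as.vector), use.names = FALSE),
  stringsAsFactors = FALSE
)

# Join the labels back on by variable name.
labelled_long <- merge(long_df, label_lookup, by = "col_name")
print(labelled_long)
```

Keeping the labels in a separate lookup means they survive reshaping, which tends to drop or clash on column attributes.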
