Question

Assume that I have several dataframes in a workspace in R and I want a list of the names of the columns in all the dataframes.

I thought the following would work. But it does not. Try it in your own workspace.

sapply(ls(),names) 

Why does it not work? I assumed ls() creates a list of all the dataframes, so the names function would then be applied to each dataframe. That is my simple question for now.

Coming next: I want to find all the columns whose names contain the letters "date" so that I can then apply the following function to each of those columns, no matter which dataframe they are in.

as.Date(dataframe$dateofenrollment,origin="1899-12-30")

Solution

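It doesn't work because ls() returns the names of the objects in your workspace as a character vector, not the objects themselves. sapply() therefore calls names() on each character string, and a plain string has no names, so every result is NULL.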
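To make the difference concrete, here is a minimal sketch. The data frame df1 and its columns are made up for illustration; get() is the base R function that looks an object up by its name.

```r
# Made-up example data frame
df1 <- data.frame(x = 1:3, y = 4:6)

# names() applied to the string "df1" (what sapply(ls(), names) effectively does)
names("df1")       # NULL

# names() applied to the object that the string refers to
names(get("df1"))  # "x" "y"
```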

You probably want something like the following:

lapply(ls(), function(x) if(is.data.frame(o <- get(x))) names(o))

This will have NULL elements for any objects that are not data frames, but presumably you can work around that.
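One possible way to avoid the NULL elements is to filter for data frames before asking for names. The sketch below is only an illustration: the helper name df_colnames, the scratch environment demo, and its contents are hypothetical, and the function defaults to the global environment.

```r
# Return a named list of column names, one element per data frame in `envir`.
# df_colnames is a hypothetical helper name, not part of base R.
df_colnames <- function(envir = .GlobalEnv) {
  objs <- mget(ls(envir = envir), envir = envir)  # fetch objects by name
  lapply(Filter(is.data.frame, objs), names)      # keep data frames only
}

# Demonstration in a scratch environment with made-up objects
demo <- new.env()
demo$df1 <- data.frame(x = 1:3, y = 4:6)
demo$ch1 <- c("a", "b", "c")

df_colnames(demo)
# $df1
# [1] "x" "y"
```

Calling df_colnames() with no arguments would inspect the workspace itself.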

OTHER TIPS

The first part of your question can be answered with allNames <- lapply(ls(), function(x) names(get(x))). Finding the columns of interest with one of the regex functions should also be pretty straightforward, with something like lapply(allNames, function(x) grepl("date", x)). I'm running out of steam as to how to take those first two bits and update the columns, but maybe this will get you and others down the right path.
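One way those two pieces might be combined to update the columns is sketched below. It rests on several assumptions: the "date" columns hold numeric day counts (which is what origin="1899-12-30" in the question suggests), matching is case-insensitive, and non-numeric columns are left alone. The function name convert_date_columns and the demo objects are hypothetical.

```r
# Convert every numeric column whose name contains "date", in every data
# frame found in `envir`, to class Date. Modifies the objects in place.
convert_date_columns <- function(envir = .GlobalEnv, origin = "1899-12-30") {
  for (nm in ls(envir = envir)) {
    obj <- get(nm, envir = envir)
    if (!is.data.frame(obj)) next

    # Column names containing "date" (case-insensitive is an assumption)
    date_cols <- grep("date", names(obj), ignore.case = TRUE, value = TRUE)

    for (col in date_cols) {
      # Only touch numeric serials; skip columns that are already dates or text
      if (is.numeric(obj[[col]])) {
        obj[[col]] <- as.Date(obj[[col]], origin = origin)
      }
    }

    # Write the modified copy back under the same name
    if (length(date_cols) > 0) assign(nm, obj, envir = envir)
  }
  invisible(NULL)
}

# Demonstration in a scratch environment with made-up data
demo <- new.env()
demo$enroll <- data.frame(id = 1:2, dateofenrollment = c(40000, 40001))
convert_date_columns(demo)
class(demo$enroll$dateofenrollment)  # "Date"
```

The write-back with assign() is needed because R copies data frames on modification; changing obj inside the loop does not alter the original object by itself.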

Here is another solution with a simple example to accomplish both of your objectives. You can modify it quite easily to suit your situation. Let me know if you have questions.

# create a set of dummy data frames
df1 = data.frame(x = rnorm(100), y = rnorm(100))
df2 = data.frame(x = rnorm(100), z = rnorm(100))
ch1 = c('a', 'b', 'c')


# get all objects
all.obj = sapply(ls(), get, simplify = FALSE)  # simplify = FALSE always returns a named list

# get data frames
dfrs = all.obj[sapply(all.obj, is.data.frame)]

# get data frames containing 'x' as column name
dfrs2 = dfrs[vapply(dfrs, function(df) 'x' %in% names(df), logical(1))]

# replace x with square of x in all these data frames
dfrs3 = lapply(dfrs2, function(df) {df$x = df$x^2; df})

# Another approach: return the column names of every data frame in the global environment
f <- function(){
  lo <- ls(envir=.GlobalEnv)
  # keep only the names that refer to data frames
  lo <- lo[vapply(lo, function(x) is.data.frame(get(x, envir=.GlobalEnv)), logical(1))]
  if(length(lo)>0){
    res <- lapply(lo, function(x) names(get(x, envir=.GlobalEnv)))
    names(res) <- lo
  } else res <- NULL
  return(res)
}

EDIT

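ls.names <- function(){
  # fetch every object in the global environment as a named list
  objs <- mget(ls(pos=1), envir=as.environment(1))
  # is.data.frame() also handles objects with more than one class
  res <- lapply(objs, function(x) if(is.data.frame(x)) names(x))
  # drop the NULL entries left by non-data-frame objects
  res <- Filter(Negate(is.null), res)
  return(res)
}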

EDIT2

eapply(env=.GlobalEnv,function(x) if(is.data.frame(x)) names(x))
Licensed under: CC-BY-SA with attribution