Question

I have a general question about how to use and manipulate factors. In my work, R often coerces something into a factor because R does not allow different modes in a matrix, but I would prefer those columns to remain numeric.

When working with such factors I noticed:

  • When you have two similar factors (e.g., all values between 1 and 5) in different columns, coercing the first column's factor to a number with as.numeric() works fine. Coercing the second, third or fourth with as.numeric() always adds 1 to every "factor". Why?

  • There seems to be a difference between

    go$V4 <- as.double(go$V4)

    and

    go[,4] <- as.numeric(levels(go[,4]))[go[,4]]

Assuming as.double and as.numeric are indeed largely identical, the difference must lie somewhere else, but I don't see where.

Any syntax experts?


Solution

The statement about coercion to factor because of matrix requirements is simply wrong. (R matrices cannot hold factor variables.) Perhaps you are thinking of a data.frame. As the R FAQ says, you need to use:

    go$V4 <- as.numeric(as.character(go$V4))
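
To see why the detour through as.character() matters, here is a minimal sketch using a made-up factor `f` (hypothetical, not the asker's `go` data). Calling as.numeric() directly on a factor returns its internal integer codes, i.e. positions in levels(f), rather than the numbers the labels spell out. That is presumably the source of the apparent "adds 1": whenever a column's levels start at 0 or skip a value, the codes no longer coincide with the labels.

```r
# Hypothetical factor whose labels look like numbers (not the asker's data)
f <- factor(c("0", "3", "1", "4", "3"))

levels(f)                                # sorted labels: "0" "1" "3" "4"

# Internal integer codes: the position of each element's label in levels(f)
codes <- as.numeric(f)

# The numbers the labels actually spell out
values <- as.numeric(as.character(f))

# Same result, but each distinct level is converted only once
values2 <- as.numeric(levels(f))[f]

print(codes)    # 1 3 2 4 3
print(values)   # 0 3 1 4 3
```

The second form is the one from the question, as.numeric(levels(go[,4]))[go[,4]]: it converts the levels to numbers and then uses the factor's codes to index into them, so it gives the same values as the as.character() route.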

If a numeric vector is concatenated (with c()) to any character vector, it is immediately coerced to "character" mode. If a column in a text file has non-numeric characters in it, the same thing happens on input.
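
Both coercions can be reproduced in a few lines. This is only a sketch: the input text and the stray "n/a" entry are invented for illustration, and stringsAsFactors = TRUE is set explicitly because it was the default only before R 4.0.0.

```r
# c() promotes everything to the most general type present
x <- c(1, 2, "three")
class(x)        # "character"

# Hypothetical input: one stray non-numeric entry ("n/a") in the second column
txt <- "1 10\n2 n/a\n3 30"
go <- read.table(text = txt, stringsAsFactors = TRUE)

class(go$V1)    # "integer": every entry parsed as a number
class(go$V2)    # "factor": read as character, then converted to a factor

# Convert via the labels; the unparseable entry becomes NA (normally with a warning)
v2 <- suppressWarnings(as.numeric(as.character(go$V2)))
print(v2)       # 10 NA 30
```

Checking which entries come back as NA after the conversion is a quick way to locate the offending values in the input.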

Licensed under: CC-BY-SA with attribution