· r-2 rstats

R: dplyr - mutate with strptime (incompatible size/wrong result size)

Having worked out how to translate a string into a date or NA if it wasn’t the appropriate format the next thing I wanted to do was store the result of the transformation in my data frame.

I started off with this:

data = data.frame(x = c("2014-01-01", "2014-02-01", "foo"))
> data
           x
1 2014-01-01
2 2014-02-01
3        foo

And when I tried to do the date translation ran into the following error:

> data %>% mutate(y = strptime(x, "%Y-%m-%d"))
Error: wrong result size (11), expected 3 or 1

As I understand it this error is telling us that we are trying to put a value into the data frame which represents 11 rows rather than 3 rows or 1 row.

It turns out that storing POSIXlts in a data frame isn’t such a good idea! In this case we can use the as.character function to create a character vector which can be stored in the data frame:</a>

> data %>% mutate(y = strptime(x, "%Y-%m-%d") %>% as.character())
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01
3        foo       <NA>

We can then get rid of the NA row by using the is.na function:

> data %>% mutate(y = strptime(x, "%Y-%m-%d") %>% as.character()) %>% filter(!is.na(y))
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01

And a final tweak so that we have 100% pipelining goodness:

> data %>%
    mutate(y = x %>% strptime("%Y-%m-%d") %>% as.character()) %>%
    filter(!is.na(y))
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01
  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket