· r-2 rstats

R: String to Date or NA

I’ve been trying to clean up a CSV file which contains some rows with dates and some not - I only want to keep the cells which do have dates so I’ve been trying to work out how to do that.

My first thought was that I’d try and find a function which would convert the contents of the cell into a date if it was in date format and NA if not. I could then filter out the NA values using the http://www.statmethods.net/input/missingdata.html function.

I started out with the http://www.statmethods.net/input/dates.html function...

> as.Date("2014-01-01")
[1] "2014-01-01"

> as.Date("foo")
Error in charToDate(x) :
  character string is not in a standard unambiguous format

...but that throws an error if we have a non date value so it’s not so useful in this case.

Instead we can make use of the http://stackoverflow.com/questions/14755425/what-are-the-standard-unambiguous-date-formats function which does exactly what we want:

> strptime("2014-01-01", "%Y-%m-%d")
[1] "2014-01-01 GMT"

> strptime("foo", "%Y-%m-%d")
[1] NA

We can then feed those values into is.na..

> strptime("2014-01-01", "%Y-%m-%d") %>% is.na()

> strptime("foo", "%Y-%m-%d") %>% is.na()
[1] TRUE

...and we have exactly the behaviour we were looking for.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket