
Then, you have to read it into R because JudgeIt is only an R package so far.
All the dataframes were made in the same way, yet R decided that indicators of California’s 112th districts were factors, not numbers, causing JudgeIt sanity checks to fail.
These errors read something like:
ERROR: Some $VOTE values not in [0,1] interval.
So, I did what any reasonable person who is used to float() would do and just ran as.numeric() over the offending columns.
And got garbage.
Now, I worked in R for my undergraduate honors thesis, about two years ago. I don’t remember this being an issue, but I really should go back and revisit my code. But, apparently, as.numeric() translates factors into R’s internal integer codes for the factor levels rather than what the typical individual (I’d assume) would want from a function that converts something to numeric type. Instead, to convert factors to numbers, you have to use:
as.numeric(as.character($1))
That is, you convert the factor into its character-based representation, and then convert those characters into numbers. To me, that’d be like converting some object to a number by using float(str($1)). Not impossible, but definitely unintuitive.
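To make the difference concrete, here is a minimal sketch in R. The vote shares are made up for illustration and the variable names are hypothetical; nothing here comes from the actual district data.

```r
# Hypothetical vote shares stored as a factor (values are made up)
votes <- factor(c("0.52", "0.48", "0.61", "0.48"))

# as.numeric() on a factor returns the integer level codes, not the labels.
# The levels sort as "0.48", "0.52", "0.61", so the codes are 2, 1, 3, 1.
codes <- as.numeric(votes)
print(codes)

# Going through character first recovers the numbers the labels spell out
shares <- as.numeric(as.character(votes))
print(shares)

# Equivalent idiom: convert the levels once, then index them by the factor
shares2 <- as.numeric(levels(votes))[votes]
print(shares2)
```

The level codes are whole numbers starting at 1, which is presumably why a check that expects values in [0,1] would reject them.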
More annoying than this double-conversion, though, is the fact that only one of my data frames, all generated from the same base document and parsed in the same for loop in the same way, had a column designated as a factor. And all that work was just to move the data from a standard dataframe type into some unique, non-standard object for the analysis package I’m using.
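When only one data frame out of several misbehaves, it can help to check column classes before handing anything to the analysis package. The following is a rough sketch, assuming a hypothetical data frame df with made-up DISTRICT and VOTE columns, and assuming every factor column is really meant to be numeric.

```r
# Hypothetical data frame; the column names and values are made up
df <- data.frame(
  DISTRICT = 1:3,
  VOTE = factor(c("0.52", "0.48", "0.61"))
)

# Find which columns were stored as factors
is_fac <- sapply(df, is.factor)
print(names(df)[is_fac])

# Convert each factor column via character; other columns are left alone
df[is_fac] <- lapply(df[is_fac], function(col) as.numeric(as.character(col)))

# Confirm that no factor columns remain
print(sapply(df, class))
```

If a label cannot be parsed as a number, as.numeric() turns it into NA with a warning, which is one way to find a stray non-numeric entry. Whether that is what happened in this case is not something the sketch can tell you.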
Just silly. Data-wrangling in R is easy but unintuitive, and that’s coming from someone whose first language was R 2.8.
Originally posted on yetanothergeographer.tumblr.com.