r - How can I specify only some colClasses in sqldf file.format?


I have some csv files with columns that are problematic for sqldf, causing some numeric columns to be classed as character. How can I specify the classes for just those columns, and not every column? There are many columns, and I don't want to have to specify the class of all of them.

Much of the data in these problem columns is zeros, so sqldf reads them as integer when they are really a numeric (or real) data type. Note that read.csv correctly assigns the classes. I'm not clever enough to generate a suitable data set that has the right properties (first 50 values zero, then a value of, say, 1.45 in the 51st row), but here's an example call to load the data:

df <- read.csv.sql("data.dat", sql = "select * from file",
                   file.format = list(colClasses = c("attr4" = "numeric")))

which returns this error:

Error in sqldf(sql, envir = p, file.format = file.format, dbname = dbname,  :
  formal argument "file.format" matched by multiple actual arguments

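Can I somehow use the read.table call to work out the data types? Can I read all the columns in as character, and then convert some to numeric? Only a small number of columns are character, and it would be easier to specify those than all of the numeric columns. I have come up with this ugly partial solution, but it still fails on the final line with the same error message: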

df.head <- read.csv("data.dat", nrows = 10)
classes <- lapply(df.head, class)  # fails if the inferred classes are not correct
classes <- replace(classes, classes == "integer", "numeric")
df <- read.csv.sql("data.dat", sql = "select * from file",
                   file.format = list(colClasses = classes))

Take a closer look at the documentation for read.csv.sql, specifically at the argument nrows:

nrows: Number of rows used to determine column types. It defaults to 50. Using -1 causes it to use all rows for determining column types.
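As a rough sketch of what that looks like in practice, the snippet below writes a throwaway file with the properties described in the question (50 zeros, then 1.45 in row 51) and reads it back with nrows = -1. The temporary file and the id column are made up for illustration, and the sqldf call is guarded so the snippet still runs where the package is not installed.

```r
# Hypothetical test file: attr4 is zero for 50 rows, then 1.45 in row 51.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:51, attr4 = c(rep(0, 50), 1.45)),
          path, row.names = FALSE, quote = FALSE)

if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # nrows = -1 asks for every row to be used when guessing column types,
  # instead of only the first 50.
  df <- read.csv.sql(path, sql = "select * from file", nrows = -1)
  str(df)
}
```

Scanning every row costs an extra pass over the file, so this trades some speed for correct types.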

Another thing you'll note from looking at the documentation for read.csv.sql and sqldf is that there is no colClasses parameter. If you read the file.format documentation in sqldf, you'll see that the parameters in the file.format list are not passed to read.table but rather to sqliteImportFile, which has no understanding of R's data types. If you don't like modifying the nrows parameter, you can read the entire data frame as character type and then use whatever methods you like to figure out which column should be which class. You're always going to have the problem of not knowing whether an integer column is really integer or numeric until you have read the entire column, however. Also, if speed is the issue that is really killing you here, you may want to consider moving away from CSVs.
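One possible way to do the "read everything as character, then convert" step is sketched below in base R only. Using type.convert for the inference is just one choice among several, and the file and its columns (name, count, attr4) are invented for the example.

```r
# Hypothetical test file: one text column, one integer column, and attr4,
# which is zero for 50 rows and 1.45 in row 51.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(name  = sprintf("row%02d", 1:51),
                     count = 1:51,
                     attr4 = c(rep(0, 50), 1.45)),
          path, row.names = FALSE)

# Step 1: read every column as character, so no type is guessed from a prefix.
raw <- read.csv(path, colClasses = "character")

# Step 2: infer a type per column from all of its values.
# as.is = TRUE keeps text columns as character instead of factor.
typed <- type.convert(raw, as.is = TRUE)

# Step 3: promote integer columns to numeric, as the question asks.
is_int <- vapply(typed, is.integer, logical(1))
typed[is_int] <- lapply(typed[is_int], as.numeric)

str(typed)
```

Because every value is seen before a type is chosen, a column of zeros with a single 1.45 comes out as numeric, and only genuinely textual columns stay character.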

