r - How to calculate RMSE with missing values?


I have a huge dataset with 679 rows and 16 columns, with 30% missing values. I decided to impute the missing values with the function impute.knn from the package impute, and I got a dataset with 679 rows and 16 columns without missing values.

But now I want to check the accuracy using the RMSE, and I tried 2 options:

  1. load the package hydroGOF and apply the rmse function
  2. sqrt(mean((obs - sim)^2, na.rm = TRUE))

In both situations I get the error: Error in sim - obs: non-numeric argument to binary operator.

This is happening because the original dataset contains NA values (some values are missing).

How can I calculate the RMSE if I remove the missing values? Then obs and sim will have different sizes.

How about simply...

sqrt( sum( (df$model - df$measure)^2 , na.rm = TRUE ) / nrow(df) )

This obviously assumes your data frame is called df, and you have to decide on your N: nrow(df) includes the 2 rows with missing data. Do you want to exclude these from the N observations? I'd guess yes, so instead of nrow(df) you probably want to use sum( !is.na(df$measure) ), or, following @Joshua, just

sqrt( mean( (df$model - df$measure)^2 , na.rm = TRUE ) )
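
To make the choice of N concrete, here is a small sketch on made-up data. The data frame and its values are hypothetical; only the column names model and measure come from the answer above, and it assumes model has no NAs (as after imputation).

```r
# Hypothetical example data: `measure` holds the observations (with NAs),
# `model` holds the simulated/imputed values (assumed complete).
df <- data.frame(
  measure = c(1.0, NA, 3.0, 4.0, NA),
  model   = c(1.5, 2.0, 2.5, 4.0, 5.0)
)

# Squared errors; NA wherever `measure` is NA
sq_err <- (df$model - df$measure)^2

# N = all rows, including the rows with missing data
rmse_all_rows <- sqrt(sum(sq_err, na.rm = TRUE) / nrow(df))

# N = only the rows where `measure` was actually observed
rmse_observed <- sqrt(sum(sq_err, na.rm = TRUE) / sum(!is.na(df$measure)))

# mean() with na.rm = TRUE drops the NA pairs, so it uses the same N
# as rmse_observed (as long as `model` itself has no NAs)
rmse_mean <- sqrt(mean(sq_err, na.rm = TRUE))

print(c(all_rows = rmse_all_rows, observed = rmse_observed, mean = rmse_mean))
```

Dividing by nrow(df) gives a smaller value, because the missing rows add nothing to the sum but still count in N. Note also that na.rm = TRUE discards each NA pair after the subtraction, so obs and sim never end up with different sizes.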
