r - How to perform RMSE with missing values?
I have a huge dataset with 679 rows and 16 columns, with 30% missing values. I decided to impute the missing values with the function impute.knn from the package impute, and I got a dataset with 679 rows and 16 columns without missing values.
But I want to check the accuracy using RMSE, and I tried 2 options:
- load the package hydroGOF and apply the rmse function
- sqrt(mean((obs - sim)^2), na.rm = TRUE)
In both situations I get an error: Error in sim - obs: non-numeric argument to binary operator.
I think this is happening because the original data set contains NA values (some values are missing).
How can I calculate the RMSE if I remove the missing values? Then obs and sim will have different sizes.
How about simply...
sqrt( sum( (df$model - df$measure)^2 , na.rm = TRUE ) / nrow(df) )
Obviously this assumes your dataframe is called df, and you have to decide on your N (i.e. nrow(df) includes the 2 rows with missing data; do you want to exclude these from the N observations? I'd guess yes, so instead of nrow(df) you probably want to use sum( !is.na(df$measure) ) ). Or, following @Joshua, just
sqrt( mean( (df$model - df$measure)^2 , na.rm = TRUE ) )
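To make the choice of N concrete, here is a minimal sketch on made-up data. The data frame and its values are hypothetical; only the column names measure and model follow the answer above. It drops incomplete pairs first, so both vectors keep the same length, and then compares the two denominators.

```r
# Hypothetical example data: 'measure' is the observed value (with NAs),
# 'model' is the simulated or imputed value.
df <- data.frame(
  measure = c(1, 2, NA, 4, NA),
  model   = c(1.5, 2, 3, 3, 5)
)

# TRUE for rows where both values are present; filtering with the same
# index keeps obs and sim aligned and equal in length.
ok <- complete.cases(df$measure, df$model)
sq_err <- (df$model[ok] - df$measure[ok])^2

# N = number of complete pairs
rmse_complete <- sqrt(mean(sq_err))

# Equivalent one-liner: mean() with na.rm = TRUE also divides by the
# number of non-missing squared errors
rmse_narm <- sqrt(mean((df$model - df$measure)^2, na.rm = TRUE))

# N = all rows, including those with missing data; this gives a smaller value
rmse_nrow <- sqrt(sum((df$model - df$measure)^2, na.rm = TRUE) / nrow(df))

print(c(complete = rmse_complete, narm = rmse_narm, nrow = rmse_nrow))
```

Note that NA values in numeric vectors propagate as NA rather than raising an error, so the "non-numeric argument to binary operator" message may instead mean that obs or sim is not numeric (for example, a character column). That is a guess, since the original data is not shown; is.numeric() or str() on both objects would confirm it.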