Why do some Unicode characters display in matrices, but not data frames in R?


In at least some cases, Asian characters are printable if they are contained in a matrix or a vector, but not in a data.frame. Here is an example:

    q <- '天'

    q # works
    # [1] "天"

    matrix(q) # works
    #      [,1]
    # [1,] "天"

    q2 <- data.frame(q, stringsAsFactors=FALSE)

    q2 # does not work
    #          q
    # 1 <U+5929>

    q2[1,] # works again.
    # [1] "天"

Clearly, my device is capable of displaying the character, but when it is in a data.frame, it does not work.

Doing some digging, I found that the print.data.frame function runs format on each column. It turns out that if you run format.default directly, the same problem occurs:

    format(q)
    # "<U+5929>"
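To narrow this down, it may help to separate how the string itself is encoded from what the session locale can represent. The following is a minimal sketch using only base R functions. The idea that the locale is the deciding factor is my working assumption at this point, not something the output above proves.

```r
# Write the character as a Unicode escape so the example does not depend
# on the encoding of the source file.
q <- "\u5929"

Encoding(q)                # how R has marked the string
utf8ToInt(q)               # code point as an integer (0x5929)
Sys.getlocale("LC_CTYPE")  # locale that governs character handling
l10n_info()                # whether the session is multibyte / UTF-8

# The same string through the two code paths that behave differently:
print(q)
format(q)
```

If the string is a valid single character but LC_CTYPE names a single-byte code page such as 1252, that would be consistent with format falling back to the <U+5929> escape.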

Digging into format.default, I find that it calls the internal format, which is written in C.
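For anyone who wants to check this on their own installation, here is a rough sketch that searches the deparsed source of format.default for the hand-off to the internal routine. The variable names are mine, and the exact body of format.default may differ between R versions, so treat the result as a quick check rather than a guarantee.

```r
# Deparse the function and strip whitespace so the search does not depend
# on how the source happens to be line-wrapped.
src <- gsub("\\s+", "", paste(deparse(format.default), collapse = ""))

# TRUE if the body hands off to the internal (C-level) format routine.
calls_internal <- grepl(".Internal(format(", src, fixed = TRUE)
calls_internal
```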

Before I dig any further, I want to know if others can reproduce this behaviour. Is there some configuration of R that would allow me to display these characters within data.frames?

My sessionInfo(), if it helps:

    R version 3.0.1 (2013-05-16)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
    [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
    [5] LC_TIME=English_Canada.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    loaded via namespace (and not attached):
    [1] tools_3.0.1

I hate to answer my own question, but although the comments and answers helped, they weren't quite right. In Windows, it doesn't seem that you can set a generic 'UTF-8' locale. You can, however, set country-specific locales, which works in this case:

    Sys.setlocale("LC_CTYPE", locale="Chinese")

    q2 # works fine
    #  q
    #1 天
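Since Sys.setlocale changes the setting for the whole session, it may be preferable to scope the change to the one call that needs it. The sketch below does that with a hypothetical helper, with_ctype, which is my own name and not part of base R. It assumes the "Chinese" locale name used above is available, which is likely only on Windows.

```r
# Hypothetical helper (not part of base R): evaluate `expr` with LC_CTYPE
# temporarily set to `locale`, then restore the previous setting.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)

  # Sys.setlocale() returns "" (with a warning) when the request fails.
  new <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (identical(new, "")) {
    stop("locale not available on this system: ", locale)
  }

  # `expr` is a promise, so it is only evaluated here, under the new locale.
  invisible(force(expr))
}

q2 <- data.frame(q = "\u5929", stringsAsFactors = FALSE)

# "Chinese" is the Windows locale name used above; other platforms will
# most likely reject it, so report the failure instead of stopping.
tryCatch(
  with_ctype("Chinese", print(q2)),
  error = function(e) message(conditionMessage(e))
)
```

The print call has to happen inside the helper. Formatting there and printing afterwards would run the final print under the original locale again.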

But it does make me wonder why exactly format seems to use the locale, and whether there is a way to have it ignore the locale in Windows. I also wonder if there is some generic UTF-8 locale on Windows that I don't know about.

