Why do some Unicode characters display in matrices, but not data frames in R?
In at least some cases, Asian characters are printable if they are contained in a matrix or a vector, but not in a data.frame. Here is an example:
    q <- '天'
    q          # Works
    # [1] "天"
    matrix(q)  # Works
    #      [,1]
    # [1,] "天"
    q2 <- data.frame(q, stringsAsFactors = FALSE)
    q2         # Does not work
    #          q
    # 1 <U+5929>
    q2[1, ]    # Works again.
    # [1] "天"
Clearly, the device is capable of displaying the character, but when it is in a data.frame, it does not work.
Doing some digging, I found that the print.data.frame function runs format on each column. It turns out that if you run format.default directly, the same problem occurs:
    format(q)
    # "<U+5929>"
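To make the printing path easier to poke at, here is a minimal sketch that approximates what print.data.frame does: format the columns, convert the result to a character matrix, and print that matrix. The variable names are only for illustration, and the character is written as the escape "\u5929" so the snippet does not depend on the encoding of the script file. Whether the escaped form appears will depend on the platform and locale.

```r
# Build the same one-cell data frame as in the example above.
q  <- "\u5929"
q2 <- data.frame(q = q, stringsAsFactors = FALSE)

# Roughly the steps print.data.frame takes before anything reaches the console:
# every column is formatted to character, then the result is printed as a matrix.
m <- as.matrix(format.data.frame(q2, na.encode = FALSE))
print(m, quote = FALSE)

# Compare with the unformatted value, which takes a different route.
print(q2[1, ])
```

If the first print shows the escape and the second shows the character, the difference is introduced in the formatting step rather than by the console.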
Digging into format.default, I find that it calls the internal format, which is written in C.
Before I dig any further, I would like to know whether others can reproduce this behaviour. Is there some configuration of R that would allow me to display these characters within data.frames?
Here is my sessionInfo(), if it helps:
    R version 3.0.1 (2013-05-16)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
    [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
    [5] LC_TIME=English_Canada.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    loaded via a namespace (and not attached):
    [1] tools_3.0.1
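For anyone trying to reproduce this, the following sketch collects the encoding-related facts about a session that seem most likely to matter. It only uses base R functions; the variable names are illustrative, and which of these settings actually drives the behaviour is an open question at this point.

```r
# The same character as in the example, written as an escape so the
# snippet does not depend on how the script file itself is encoded.
q <- "\u5929"

enc    <- Encoding(q)                 # encoding mark R has put on the string
ctype  <- Sys.getlocale("LC_CTYPE")   # character-type locale currently in effect
is_utf <- l10n_info()[["UTF-8"]]      # TRUE when the session uses a UTF-8 locale

cat("Encoding mark: ", enc, "\n")
cat("LC_CTYPE:      ", ctype, "\n")
cat("UTF-8 locale:  ", is_utf, "\n")
```

Posting these three values along with the printed data.frame should make it easier to compare results across machines.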
I hate to answer my own question, but although the comments and answers helped, they weren't quite right. In Windows, it doesn't seem that you can set a generic 'UTF-8' locale. You can, however, set country-specific locales, which will work in this case:
    Sys.setlocale("LC_CTYPE", locale = "chinese")
    q2  # Works fine
    #   q
    # 1 天
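Changing LC_CTYPE affects the whole session, so it may be preferable to switch only around the call that needs it. The sketch below uses a hypothetical helper, with_ctype_locale (not part of base R), that sets the locale, evaluates an expression, and restores the previous setting afterwards. The locale name "chinese" is the Windows name used above; other platforms use different names, so the call is wrapped in try().

```r
# Hypothetical helper: evaluate `expr` with LC_CTYPE temporarily set to `locale`.
with_ctype_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  # Restore the previous locale when the function exits, even on error.
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) when the request cannot be honoured.
  res <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (identical(res, "")) {
    stop("could not set LC_CTYPE to '", locale, "'")
  }
  # `expr` is a lazily evaluated argument, so it runs here, under the new locale.
  force(expr)
}

q2 <- data.frame(q = "\u5929", stringsAsFactors = FALSE)

# Prints the data frame under the "chinese" locale where that name is available;
# elsewhere try() reports the failure and the session locale is left unchanged.
try(with_ctype_locale("chinese", print(q2)))
```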
But it does make me wonder why exactly format seems to use the locale; I wonder if there is a way to have it ignore the locale in Windows. I also wonder if there is some generic UTF-8 locale that I don't know about on Windows.
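One way to see why the locale could matter is to convert the character explicitly to a single-byte encoding. The sketch below is only an illustration of that idea, not a description of what format does internally: latin1 stands in for the Windows-1252 code page from the session info, and the sub = "Unicode" option of iconv is assumed to be available (it exists in recent R versions). Exact iconv results can vary between platforms.

```r
q <- "\u5929"

# The code point behind the <U+5929> escape seen in the data.frame output.
cp <- utf8ToInt(q)
label <- sprintf("U+%04X", cp)
label  # "U+5929"

# A single-byte encoding has no slot for this character, so a plain
# conversion is expected to fail (iconv signals failure with NA).
plain <- iconv(q, from = "UTF-8", to = "latin1")

# With sub = "Unicode", iconv substitutes an escape of the <U+xxxx> form
# for characters it cannot convert instead of giving up.
escaped <- iconv(q, from = "UTF-8", to = "latin1", sub = "Unicode")

print(plain)
print(escaped)
```

If the escaped result matches what the data.frame shows, that would be consistent with the text being pushed through the locale's encoding somewhere on the formatting path.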