TypeError: decoding Unicode is not supported (Python)


I am using lxml.html to parse an HTML file and get text from the page. But some strings contain the character ' (for example, florian's), due to which I get a traceback while printing the output:

parent_link_id_text = parent_link_id.xpath('./td[@width="400"]/text()')
print (sgs_mid[0]+";"+"external"+";"+str(link_id_num[0])+";"+parent_link_id_text[0]+";"+parent_link_link[0], file = log_file_1)

UnicodeEncodeError: 'ascii' codec can't encode characters in position 56-58: ordinal not in range(128)

Then I tried:

print (sgs_mid[0]+";"+"publicfreeurl"+";"+str(link_id_num[0])+";"+unicode(parent_link_id_text[0],"utf-8")+";"+parent_link_link[0], file = log_file_1) 

and got this traceback:

TypeError: decoding Unicode is not supported

How can I print a string that contains a Unicode character?

I am not sure if this is a solution to your problem, but perhaps it will guide you in the right direction.

Without seeing the code and data you have, I'm going to speculate and make a programmatic guess at how to solve the issue.

Please see the following code:

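import lxml.html as lh
import urllib2

url = 'http://loremipsum.net/about.html'

# Fetch the page and parse it into an lxml tree
doc = lh.parse(urllib2.urlopen(url))

# Take the first text node found under <p><strong>
value = doc.xpath('//p/strong/text()')[0]

print value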

Printed result:

What is 'Lorem Ipsum'?

By reading the page on the Lorem Ipsum site, you can see that the text returned does indeed have a ' in it.

I hope this helps point you in the right direction.
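As a further guess: the TypeError suggests that the value coming back from xpath() is already a Unicode string, so it needs to be encoded when it is written rather than decoded again. Here is a minimal sketch of that idea, assuming the log is an ordinary text file that can be opened with an explicit UTF-8 encoding. The write_line helper, the sample field values and the U+2019 apostrophe are my own placeholders, not taken from the question.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import io
import os
import tempfile

# Hypothetical stand-in for parent_link_id_text[0]. The actual character
# in the asker's page is unknown; U+2019 is assumed here for illustration.
text = u"florian\u2019s"


def write_line(path, fields):
    """Join Unicode fields with ';' and append the line to a UTF-8 file."""
    line = u";".join(fields) + u"\n"
    # io.open encodes Unicode text to UTF-8 on write, so no manual
    # unicode(...) or .encode(...) call is needed on the individual fields.
    with io.open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line)


# Placeholder values standing in for sgs_mid[0], link_id_num[0], etc.
path = os.path.join(tempfile.mkdtemp(), "log.txt")
write_line(path, [u"mid", u"external", u"42", text, u"http://example.com/"])

# Read the file back to confirm the character survived the round trip.
with io.open(path, encoding="utf-8") as log_file:
    print(repr(log_file.read()))
```

The point of opening the file with io.open and an encoding is that the conversion to bytes happens in one place, instead of relying on the default ASCII codec that produced the UnicodeEncodeError.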

