Using getElementsByTagName to get the tag <string name="id"> in Python
My XML file is:
    <list>
      <profiledefinition>
        <string name="id">ncghwaznpy6</string>
        <string name="name">02.11.2013 scott mobile</string>
        <decimal name="accountid">10954</decimal>
        <decimal name="timezoneid">-600</decimal>
      </profiledefinition>
      <profiledefinition>
        <string name="id">9jsg57bruu6</string>
        <string name="name">huggies us-en &amp; ca-en test town responsive - prod</string>
        <decimal name="accountid">10954</decimal>
        <decimal name="timezoneid">-600</decimal>
      </profiledefinition>
      <profiledefinition>
        <string name="id">i3cjq4gdkk6</string>
        <string name="name">huggies us-en brand desktop - prod</string>
        <decimal name="accountid">10954</decimal>
        <decimal name="timezoneid">-600</decimal>
      </profiledefinition>
My code is:
    import urllib2

    theurl = 'https://ws.webtrends.com/v2/reportservice/profiles/?format=xml'
    pagehandle = urllib2.urlopen(theurl)
    ##########################################################################
    from xml.dom.minidom import parseString

    file = pagehandle
    data = file.read()
    file.close()
    dom = parseString(data)
    xmltag = dom.getElementsByTagName('string name="id"')[0].toxml()
    xmldata = xmltag.replace('<string name="id">', '').replace('</string>', '')
    print xmltag
    print xmldata
I want the value of the element with tag name 'string name="id"',
but I get this error:
    Traceback (most recent call last):
      File "c:\users\vaibhav\desktop\webtrends\test.py", line 43, in <module>
        xmltag = dom.getElementsByTagName('string name="id"')[0].toxml()
    IndexError: list index out of range
If I replace

    dom.getElementsByTagName('string name="id"')[0].toxml()

with

    dom.getElementsByTagName('string')[0].toxml()

the output is

    "ncghwaznpy6"

since that is the first element of the list. The second element is

    "02.11.2013 scott mobile"

which is also stored in the list, and I don't want it.

There are two string tags, one with name="id" and one with name="name". How can I access only the string tag with name="id"?
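'string name="id"' is not a tag name; the tag name is just 'string'. getElementsByTagName matches on the tag name only, so it finds nothing and indexing the empty result with [0] raises the IndexError. You have to compare the name attribute value of each string tag.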
    ....
    dom = parseString(data)
    for s in dom.getElementsByTagName('string'):
        # Keep only the <string> elements whose name attribute is "id".
        if s.getAttribute('name') == 'id':
            print s.childNodes[0].data
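The attribute comparison above can also be tried without the web service. The following is a minimal, self-contained sketch written for Python 3 (print as a function); SAMPLE_XML is a shortened, hypothetical stand-in for the response in the question, and get_profile_ids is an illustrative helper name, not part of minidom.

```python
from xml.dom.minidom import parseString

# Hypothetical sample data, shortened from the XML in the question.
SAMPLE_XML = """<list>
  <profiledefinition>
    <string name="id">ncghwaznpy6</string>
    <string name="name">02.11.2013 scott mobile</string>
  </profiledefinition>
  <profiledefinition>
    <string name="id">i3cjq4gdkk6</string>
    <string name="name">huggies us-en brand desktop - prod</string>
  </profiledefinition>
</list>"""


def get_profile_ids(xml_text):
    """Return the text of every <string> element whose name attribute is 'id'."""
    dom = parseString(xml_text)
    ids = []
    for s in dom.getElementsByTagName('string'):
        # Match on the attribute value, and skip elements with no text node.
        if s.getAttribute('name') == 'id' and s.firstChild is not None:
            ids.append(s.firstChild.data)
    return ids


# Passing the attribute as part of the tag name matches nothing,
# which is why indexing the result with [0] fails in the question.
no_match = parseString(SAMPLE_XML).getElementsByTagName('string name="id"')
print(len(no_match))

profile_ids = get_profile_ids(SAMPLE_XML)
print(profile_ids)
```

Collecting the values into a list, rather than printing inside the loop, makes it easy to pick only the first id with profile_ids[0] once the list is known to be non-empty.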
I recommend using lxml or BeautifulSoup.
The following is equivalent code using lxml.
    import lxml.html

    dom = lxml.html.fromstring(data)
    # The CSS attribute selector picks only <string name="id"> elements.
    for s in dom.cssselect('string[name=id]'):
        print s.text
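If installing a third-party package is not an option, a similar attribute filter can be written with the standard library's xml.etree.ElementTree, which is a different module from the minidom and lxml code above. This is a sketch for Python 3 under the same assumptions as the question's data; SAMPLE_XML is a shortened, hypothetical copy of that XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, shortened from the XML in the question.
SAMPLE_XML = """<list>
  <profiledefinition>
    <string name="id">ncghwaznpy6</string>
    <string name="name">02.11.2013 scott mobile</string>
  </profiledefinition>
  <profiledefinition>
    <string name="id">i3cjq4gdkk6</string>
    <string name="name">huggies us-en brand desktop - prod</string>
  </profiledefinition>
</list>"""

root = ET.fromstring(SAMPLE_XML)

# The [@name='id'] predicate keeps only <string> elements whose
# name attribute equals "id", at any depth below the root.
etree_ids = [s.text for s in root.findall(".//string[@name='id']")]
print(etree_ids)
```

The path predicate does the same job as the CSS selector string[name=id]: the tag name and the attribute test are kept separate, which is what the original getElementsByTagName call was missing.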