Extract attributes and certain tag values from XML using a Python script
I want to parse XML content and return a dictionary keyed by each element's name attribute, with the element's value (text, nested dictionary, or list) as the dictionary value. Example:
    <ecmaarray>
        <number name="xyz1">123.456</number>
        <ecmaarray name="xyz2">
            <string name="str1">aaa</string>
            <number name="num1">55</number>
        </ecmaarray>
        <strictarray name="xyz3">
            <string>aaa</string>
            <number>55</number>
        </strictarray>
    </ecmaarray>
The output should be a dictionary like this:
dict:{ 'xyz1': 123.456, 'xyz2': {'str1':'aaa', 'num1': '55'}, 'xyz3': ['aaa','55'] }
Can anyone suggest a recursive solution?
Assuming that a situation like this, where named and unnamed children are mixed under one parent:
    <strictarray name="xyz4">
        <string>aaa</string>
        <number name="num1">55</number>
    </strictarray>
is not possible, here's some sample code using lxml:
    from lxml import etree

    tree = etree.parse('test.xml')
    result = {}

    # Walk the direct children of the root <ecmaarray> element.
    for element in tree.xpath('/ecmaarray/*'):
        name = element.attrib["name"]
        children = list(element)
        if not children:
            # Leaf element: store its text as-is (always a string).
            result[name] = element.text
            continue
        child_dict = {}
        child_list = []
        for child in children:
            child_name = child.attrib.get('name')
            if child_name:
                child_dict[child_name] = child.text
            else:
                child_list.append(child.text)
        # Named children become a dict; unnamed children become a list.
        result[name] = child_dict if child_dict else child_list

    print(result)
This prints (key order may differ depending on the Python version):
{'xyz3': ['aaa', '55'], 'xyz2': {'str1': 'aaa', 'num1': '55'}, 'xyz1': '123.456'}
You may want to improve the code; it's only a hint on where to go. Note that it handles just one level of nesting and is not recursive.
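Since the question asks for a recursive solution, here is a minimal sketch of one. It makes a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra installs, the function name to_python is made up for illustration, and, as above, named and unnamed children are assumed never to be mixed under one parent. Leaf values are returned as strings, with no conversion to numbers.

```python
import xml.etree.ElementTree as ET

XML = """<ecmaarray>
  <number name="xyz1">123.456</number>
  <ecmaarray name="xyz2">
    <string name="str1">aaa</string>
    <number name="num1">55</number>
  </ecmaarray>
  <strictarray name="xyz3">
    <string>aaa</string>
    <number>55</number>
  </strictarray>
</ecmaarray>"""


def to_python(element):
    """Recursively convert an element into a str, dict, or list.

    Hypothetical helper: the name and the dict-vs-list rule are
    illustrative choices, not part of any library API.
    """
    children = list(element)
    if not children:
        # Leaf node: return its text unchanged (a string, or None if empty).
        return element.text
    if all(child.get("name") is not None for child in children):
        # Every child is named: build a dict keyed by the name attribute.
        return {child.get("name"): to_python(child) for child in children}
    # Otherwise treat the children as an ordered list. Any name attributes
    # are ignored here, since the mixed case is assumed not to occur.
    return [to_python(child) for child in children]


result = to_python(ET.fromstring(XML))
print(result)
```

Because the function calls itself on every child, it handles arbitrarily deep nesting, for example a strictarray inside an ecmaarray inside another ecmaarray. If numeric values are needed, one option would be to convert the text of number elements with float() in the leaf branch.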
Hope that helps.