読者です 読者をやめる 読者になる 読者になる

peketaminの日記

その辺のプログラマーです。うーむ…。

How to replace tag name with namespace using lxml

If you want to change tag/node name in XML, the code below can do it.

from lxml import etree

class ElementsTagReplacer:
    """replace tag name with namespaces."""
    def __init__(self):
        self.encoding = 'utf-8'
        self.parser = etree.XMLParser(encoding=self.encoding)
        self.namespaces = {'media': "http://search.yahoo.com/mrss/"}

    def replace(self, xml):
        if isinstance(xml, str):
            xml = xml.encode(self.encoding, 'ignore')

        root = etree.fromstring(xml, parser=self.parser)

        for analytics_gn in root.iterfind('.//media:thumbnail', namespaces=self.namespaces):
            analytics_gn.tag = "{%s}image" % analytics_gn.nsmap['media']

        return etree.tostring(root, encoding='utf-8').decode('utf-8', 'ignore')


replaced_xml = ElementsTagReplacer().replace(xml)