I am currently working on an app to convert documents (specifically Open Document Text, at least for now) to EPUB format. I am using etree's ElementTree to parse the XML files extracted from the .odt file. Right now I'm working on content.xml (the file with all the text), and I'm having trouble reading an element's tag (element.tag) and attributes (element.attrib).
I understand that attrib is a dictionary. If I try print(element.attrib.keys())
it prints this: dict_keys(['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name'])
If I do print(element.attrib)
it prints this: {'{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name': 'T1'}
So it looks to me like the key is {urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name
and the value should be 'T1'. However, if I try print(element.attrib['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name'])
it fails, saying that it's not a key.
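In case it helps, here is a stripped-down sketch of the kind of lookup I'm attempting. The XML fragment (SAMPLE) and the name STYLE_KEY are made up for illustration; my real input is content.xml, and I can't say whether this small fragment behaves the same way as the full file. I use .get() here so that an element without the attribute gives None rather than raising an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for content.xml; only the text namespace is declared.
SAMPLE = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<text:span text:style-name="T1">Hello</text:span>'
    '</text:p>'
)

# The key exactly as it appears in element.attrib.keys() above.
STYLE_KEY = '{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name'

root = ET.fromstring(SAMPLE)
for element in root.iter():
    # dict.get() returns None when the element has no such attribute,
    # whereas element.attrib[STYLE_KEY] would raise for that element.
    print(element.tag, element.attrib.get(STYLE_KEY))
```

In this fragment the outer text:p element has no style-name attribute, so a plain attrib[...] lookup on it would fail even though the same lookup on the span would not.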
For a little more info, this is the tag I'm trying to get the info from: <text:span text:style-name="T1">. I've tried 'text' and 'text:style-name' as keys for the attribute dictionary, and both fail. I also noticed that the first tag in the file is this:
<office:document-content xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"... and it goes on.
You'll notice it has xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0". I believe this is a namespace declaration, though I'm not sure; I think it defines text as shorthand for that long string. I'm not exactly an XML expert, but I've been trying to digest this XML for a while. Any help is greatly appreciated. I hope my question hasn't been too confusing.
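If my guess about the namespace is right, then as far as I can tell ElementTree expands a prefixed name such as text:style-name into the form {namespace-uri}style-name, which matches the keys printed above. Here is a rough sketch of that mapping. The NS dictionary, the qname helper, and the SAMPLE fragment are my own hypothetical names and data, not anything from the ODF files or the library.

```python
import xml.etree.ElementTree as ET

# Assumed prefix-to-URI map, copied from the xmlns:text declaration above.
NS = {'text': 'urn:oasis:names:tc:opendocument:xmlns:text:1.0'}


def qname(prefixed, ns=NS):
    """Turn 'text:style-name' into '{uri}style-name' (hypothetical helper)."""
    prefix, local = prefixed.split(':', 1)
    return '{%s}%s' % (ns[prefix], local)


# Made-up fragment standing in for part of content.xml.
SAMPLE = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<text:span text:style-name="T1">Hello</text:span>'
    '</text:p>'
)

root = ET.fromstring(SAMPLE)
# find() accepts a prefix map, so the element can be located by its prefixed name.
span = root.find('text:span', NS)
print(qname('text:style-name'))
print(span.get(qname('text:style-name')))
```

This would explain why 'text' and 'text:style-name' are not keys: the prefix itself is never stored, only the expanded name.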
To clarify, what I need is a way to get the attribute so that I can read the style-name and convert it to a class for the HTML in the EPUB. I know that the attributes are a dictionary, but I can't find the key... :[
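To show what I'm ultimately aiming for, here is a minimal sketch of the conversion step, assuming the attribute lookup works. The function name span_to_html, the STYLE_KEY constant, and the sample element are all hypothetical, and the sketch only handles a span's direct text, not nested children.

```python
import xml.etree.ElementTree as ET
from html import escape

# Expanded attribute name as printed by element.attrib.keys().
STYLE_KEY = '{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name'


def span_to_html(element):
    """Render a text:span as an HTML span, using style-name as the class."""
    style = element.get(STYLE_KEY)        # None if the attribute is absent
    text = escape(element.text or '')     # escape &, <, > and quotes
    if style is None:
        return '<span>%s</span>' % text
    return '<span class="%s">%s</span>' % (escape(style), text)


# Made-up element standing in for one span from content.xml.
span = ET.fromstring(
    '<text:span xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" '
    'text:style-name="T1">Hello</text:span>'
)
print(span_to_html(span))
```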