Oh, sorry... what I should have asked for is advice on where to start with the extended list of XML formats that currently have no support.  As for Office files, I very much doubt that the .doc and .ppt files have been converted to OOXML formats, so these would obviously not qualify as XML.  For those, I have already taken your earlier advice and started working on the suggested solution.

As I've said elsewhere, since the verification is about malware
detection, esp. in regards to office files, I would highly recommend
looking into an export/import model where the upload is translated into
a benign format (like script-less html or xml). I'm sure there's a
community of doc-weenies out there that knows this stuff. getID3 is
focused on media.
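
For what it's worth, here is a rough sketch of the "script-less html" half of that export/import idea, written in Python with only the standard library. It assumes the upload has already been converted to HTML by some other tool (that step is not shown), and the names `BenignExporter`, `export_benign` and the tag allowlist are made up for illustration, not taken from getID3 or any existing project. The idea is to re-emit only allowlisted tags with no attributes and to drop script and style content entirely; treat it as a starting point, not a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: tags the benign output format is permitted to contain.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "h1", "h2", "h3", "table", "tr", "td", "th"}
# Elements whose entire content is discarded, not just the tag itself.
DROP_CONTENT = {"script", "style"}
# Allowlisted tags that never get a closing tag.
VOID_TAGS = {"br"}


class BenignExporter(HTMLParser):
    """Re-emit a document using only allowlisted tags and no attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            # All attributes are dropped, so onclick=, href=, style= never survive.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Text is re-escaped so it cannot reintroduce markup.
            self.out.append(escape(data))


def export_benign(html_text):
    """Return a script-less, attribute-less rendering of html_text."""
    parser = BenignExporter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>'
    print(export_benign(sample))
```

Tags outside the allowlist (links, images, embedded objects) are removed but their text is kept, so the readable content survives while anything that could carry active content does not.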



