Oh, sorry... what I should have asked is whether you have any advice on where to start with the extended list of XML formats that are not supported at the moment.  As for Office files, I very much doubt that existing .doc and .ppt files have been converted to OOXML formats, so these would obviously not qualify as XML.  On that front, I have already taken your earlier advice and started working on the suggested solution.

On 7/10/07, Victor Stone <fourstones.net@gmail.com> wrote:
On 7/10/07, M. David Peterson <xmlhacker@gmail.com> wrote:
> On 7/9/07, Jon Phillips <jon@rejon.org> wrote:
> Fortunately XML happens to be one of my specialties ;-)  I'll be happy to do
> the work in this space.  Again, will push everything through the dev lists
> to ensure proper coordination with everyone/everything else to ensure I am
> headed in the proper technical direction.
>
> Any advice on where I should start?

As I've said elsewhere, since the verification is about malware
detection, esp. in regard to Office files, I would highly recommend
looking into an export/import model where the upload is translated into
a benign format (like script-less HTML or XML). I'm sure there's a
community of doc-weenies out there that knows this stuff. getID3 is
focused on media.

VS



--
/M:D

M. David Peterson
http://mdavid.name | http://www.oreillynet.com/pub/au/2354 | http://dev.aol.com/blog/3155