#81 "Invalid XML" error in .xcat / .xzcat files

0.8
open
Marc Dutoo
kernel (7)
7
2008-02-21
2008-02-21
Marc Dutoo
No

(Experienced by Nirulfen)

"Invalid XML" error in .xcat / .xzcat files.

Detail :

[11/12/2007 15:34:36 TechnicalException <init> ERROR] Technical Exception : Read error
org.xml.sax.SAXParseException: character not allowed
at com.jclark.xml.sax.Driver.parse(Driver.java)
at medialib.io.file.XmlCodec.read(XmlCodec.java:175)
at medialib.io.file.GZIPXmlCodec.read(GZIPXmlCodec.java:75)
at medialib.io.file.GenericFileCodec.readFile(GenericFileCodec.java:101)
at medialib.gui.actions.LoadTask.runFileTask(LoadTask.java:49)
at medialib.gui.actions.FileTask.runTask(FileTask.java:56)

Analysis :

This "character not allowed" error means that, while opening the file, the parser found a character that is not allowed in XML at that position. The usual suspects are < and &, which must always be escaped; ', " and > can also cause trouble in some contexts (for example inside attribute values), as can control characters or characters written in the wrong encoding (which may be what happens with some Japanese file names etc. ??). Obviously the program should handle these characters properly, even if some files on your CDs have names that contain ' ... Since the character is in the catalog file, the program must have written it there earlier, and that is the real bug.
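
To illustrate what "handling these characters" on the writing side could look like, here is a minimal sketch of an escaping helper. It is not MediaLibrary's actual code: the class and method names (`XmlEscape`, `escape`, `isLegalXmlChar`) are made up for this example, and it assumes the XML 1.0 character rules.

```java
public class XmlEscape {

    /**
     * Escapes the five XML special characters and silently drops
     * characters that XML 1.0 does not allow at all (e.g. most
     * control characters). Hypothetical helper, for illustration only.
     */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    // Keep only code points in the XML 1.0 "Char" range.
                    if (isLegalXmlChar(cp)) {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    /** Character range allowed by the XML 1.0 specification. */
    public static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

With a helper like this, a file name such as `Tom's <CD>` would be written as `Tom&apos;s &lt;CD&gt;`. Ordinary non-ASCII letters pass through unchanged, so they are only a problem if the file is written in a different encoding than the one it declares.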

Workaround :

* .xzcat catalog files are merely compressed XML files (apparently gzip, judging by the GZIPXmlCodec class in the stack trace above). If you decompress one and open it in a good text editor, it is pretty readable. A simple way to find the invalid XML characters mentioned above is to open the file in an XML reader (e.g. Internet Explorer), which will tell you where the faulty characters are, allowing you to patch them.
* Worst case, you can still salvage big chunks by copying and pasting them into another "clean" file. It won't help you get back your Windows CD scan and its 36000 files, but it will save you from rescanning (reindexing) your 150 CDs that only contain 4 files each :)
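
If no XML reader is at hand, the first faulty position can also be located with a few lines of Java. This is only a sketch under assumptions: it uses the JDK's built-in SAX parser rather than the one MediaLibrary ships with, it assumes that .xzcat files are gzip-compressed and .xcat files are plain XML, and the class name `XmlErrorLocator` is made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlErrorLocator {

    /**
     * Returns null if the stream is well-formed XML, otherwise a
     * description (line, column, message) of the first fatal error.
     */
    public static String findFirstError(InputStream xml) throws IOException {
        try {
            // DefaultHandler rethrows fatal errors, so parsing stops at the first one.
            SAXParserFactory.newInstance().newSAXParser().parse(xml, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | ParserConfigurationException e) {
            return "parser failure: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XmlErrorLocator <catalog.xcat | catalog.xzcat>");
            return;
        }
        String name = args[0];
        try (InputStream raw = Files.newInputStream(Paths.get(name));
             // Assumption: .xzcat is gzip-compressed, anything else is plain XML.
             InputStream in = name.endsWith(".xzcat") ? new GZIPInputStream(raw) : raw) {
            String error = findFirstError(in);
            System.out.println(error == null ? "well-formed" : error);
        }
    }
}
```

The parser stops at the first fatal error, so after patching one character you may have to run it again to find the next one. Work on a copy of the catalog, not on the original.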

Other help :

* This won't help you solve this problem, but you should be using the 0.8pre1 release, which is of higher quality than the previous ones.
* When you save your catalog in MediaLibrary, it automatically keeps a backup of the old version, which may still be correct.

Discussion