XML parsing error with file names

2005-10-30
2013-05-30
  • Stefan Lucke
    2005-10-30

    Hi,
    I get an error message when going through my NFS filesystem.
    stefan@jarada:~> ls -d /net_data/video/\@Im_Auge_des_Sturms_\(2^B2\)_Sturmflut_und_Monsterwellen/
    /net_data/video/@Im_Auge_des_Sturms_(2?2)_Sturmflut_und_Monsterwellen/

    In vdr's recordings menu, this character is shown as '/' .

    XML Parsing Error: not well-formed
    Location: http://192.168.192.3:49152/content/interface?req_type=browse&object_id=2f6e65745f646174612f766964656f&starting_index=0&driver=2&requested_count=0&sid=52260da28ed184e8551ff1a17af164e5
    Line Number 68, Column 39:      <dc:title>@Im_Auge_des_Sturms_(2 2)_Sturmflut_und_Monsterwellen</dc:title>
    --------------------------------------^

     
    • Jin
      2005-10-30

      Hi,

      interesting thing, required some investigation :)
      first I did the following here:
      I pasted the directory name from your "ls" line and did an mkdir with that. However, when I then do ls, I get:
      @Im_Auge_des_Sturms_(2^B2)_Sturmflut_und_Monsterwellen/

      This is clearly not what you have, so I gave it one more thought, and I guess you really have ^B there (ASCII 0x02). The ASCII table says that this is a character called "start of text"; to be honest, I never came across it before, so I am not sure about its original purpose. Anyway, by looking at the XML spec I was only able to find the following:
      Character Range
      [2]        Char        ::=        #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]     /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

      and

      S (white space) consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs.
      White Space
      [3]        S        ::=        (#x20 | #x9 | #xD | #xA)+

      -----

      So in my understanding 0x02 should indeed not be in an XML document (please correct me if I'm wrong).

      I think we can take the following approach:
      we will replace all "out of range" characters with a user-configurable whitespace character.
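
A minimal sketch of the replacement approach described above, assuming file names are handled as byte strings. The names `isForbiddenXmlControl` and `sanitizeForXml` are hypothetical and not taken from the project's code. It only covers the control characters below 0x20 that the Char production excludes; it does not check for FFFE, FFFF, or surrogates.

```cpp
#include <string>

// True for control bytes that the XML 1.0 Char production excludes:
// everything below 0x20 except tab (0x09), LF (0x0A) and CR (0x0D).
bool isForbiddenXmlControl(unsigned char c)
{
    return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
}

// Hypothetical helper: returns a copy of `input` in which every forbidden
// control byte is replaced by `replacement` (a space unless configured).
std::string sanitizeForXml(const std::string& input, char replacement = ' ')
{
    std::string out = input;
    for (char& ch : out) {
        if (isForbiddenXmlControl(static_cast<unsigned char>(ch)))
            ch = replacement;
    }
    return out;
}
```

Applied to the reported directory name, the 0x02 byte between the two digits would become a space (or whatever replacement character is configured), while tabs and line breaks are left alone.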

      One thing that I will have to look up is the fs driver implementation. I think we use full paths as object IDs; that would be a problem, since we obviously cannot transfer the "out of range" chars in that way. But we will figure something out :)
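
One possible way around the object ID concern, sketched under an assumption: the object_id in the URL from the error report, read as hex, decodes to "/net_data/video", which suggests that paths can be carried as hex strings. Whether the fs driver really works this way is not confirmed in this thread. The helpers `hexEncode`, `hexDecode`, and `hexValue` below are hypothetical names.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Encodes every byte as two lowercase hex digits, so control characters
// such as 0x02 never appear literally in the resulting ID.
std::string hexEncode(const std::string& raw)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(raw.size() * 2);
    for (unsigned char c : raw) {
        out.push_back(digits[c >> 4]);
        out.push_back(digits[c & 0x0F]);
    }
    return out;
}

// Converts one hex digit to its numeric value; throws on anything else.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Reverses hexEncode; throws if the input has an odd length.
std::string hexDecode(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");
    std::string out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2)
        out.push_back(static_cast<char>((hexValue(hex[i]) << 4) | hexValue(hex[i + 1])));
    return out;
}
```

With an encoding like this, only the displayed title would need sanitizing; the ID itself can round-trip the original bytes unchanged.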

      Thanks for the bug report!
      Jin

       
    • Stefan Lucke
      2005-10-30

      Thanks for your response. Now I have looked at that more deeply, and it seems that vdr ( http://www.cadsoft.de/people/kls/vdr/index.htm ) does its own translation of some characters for whatever reason. In the sources (recording.c) I found:
      struct tCharExchange { char a; char b; };
      tCharExchange CharExchange[] = {
        { '~',  '/'    },
        { ' ',  '_'    },
        { '\'', '\x01' },
        { '/',  '\x02' },
        { 0, 0 }
        };
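
To illustrate how a table like this could be applied, here is a sketch and not vdr's actual code. It assumes that `a` is the character in the recording title and `b` is the character stored in the directory name, which matches the observation that the 0x02 byte is shown as '/' in the recordings menu. The function names `toFileSystemName` and `fromFileSystemName` are hypothetical.

```cpp
#include <string>

// Table as quoted from recording.c above. Assumption: `a` is the title
// character, `b` is its replacement in the on-disk directory name.
struct tCharExchange { char a; char b; };
static const tCharExchange CharExchange[] = {
    { '~',  '/'    },
    { ' ',  '_'    },
    { '\'', '\x01' },
    { '/',  '\x02' },
    { 0, 0 }
};

// Hypothetical helper: maps a recording title to its directory name.
std::string toFileSystemName(std::string title)
{
    for (char& ch : title) {
        for (const tCharExchange* ce = CharExchange; ce->a && ce->b; ++ce) {
            if (ch == ce->a) { ch = ce->b; break; }
        }
    }
    return title;
}

// Hypothetical helper: maps a directory name back to the displayed title.
std::string fromFileSystemName(std::string name)
{
    for (char& ch : name) {
        for (const tCharExchange* ce = CharExchange; ce->a && ce->b; ++ce) {
            if (ch == ce->b) { ch = ce->a; break; }
        }
    }
    return name;
}
```

Under that assumption, the directory name from the original report would map back to a title containing "(2/2)" with spaces in place of the underscores.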

      Hope that info helps.

      Stefan

       
      • Jin
        2005-10-30

        Ah, I see.. so they are working around the same problem. Thanks for the additional info!

        Greetings,
        Jin

         
        • Stefan Lucke
          2005-10-30

          So it would be nice to set a char translation per directory. As vdr is affected, support for its directory structure would be nice. I don't know if that could be done by an external script. The following is part of vdr's base recording tree:
          Enterprise/%Canamar/2003-12-12.20:09.99.99.rec
          Enterprise/%Canamar/2003-12-12.20:09.99.99.rec/marks.vdr
          Enterprise/%Canamar/2003-12-12.20:09.99.99.rec/summary.vdr
          Enterprise/%Canamar/2003-12-12.20:09.99.99.rec/index.vdr
          Enterprise/%Canamar/2003-12-12.20:09.99.99.rec/001.vdr
          Enterprise/%Carbon_Creek
          Enterprise/%Carbon_Creek/2003-09-12.20:12.99.99.rec
          Enterprise/%Carbon_Creek/2003-09-12.20:12.99.99.rec/marks.vdr
          Enterprise/%Carbon_Creek/2003-09-12.20:12.99.99.rec/summary.vdr
          Enterprise/%Carbon_Creek/2003-09-12.20:12.99.99.rec/index.vdr
          Enterprise/%Carbon_Creek/2003-09-12.20:12.99.99.rec/001.vdr

          The main video is stored in xxx.vdr files. By default vdr splits recordings at 2GB. On playback the parts should play consecutively. For seeking, the index.vdr files should be used, as each I-frame is stored there with its offset and file number. The directory structure is recording_name/episode_name/rec_date.rec_time.rec_priority.life_time/
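
As a rough illustration of the naming scheme, the last path component could be split into its fields as follows. This is a sketch based only on the pattern rec_date.rec_time.rec_priority.life_time and the ".rec" suffix seen in the listing above; the struct `VdrRecordingDir` and the function `parseRecordingDir` are hypothetical names.

```cpp
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the fields of a "*.rec" directory name.
struct VdrRecordingDir {
    std::string date;   // e.g. "2003-12-12"
    std::string time;   // e.g. "20:09"
    int priority;
    int lifetime;
};

// Splits a name such as "2003-12-12.20:09.99.99.rec" on '.' and fills `out`.
// Returns false if the name does not match the expected five-part layout.
bool parseRecordingDir(const std::string& name, VdrRecordingDir& out)
{
    std::vector<std::string> parts;
    std::istringstream stream(name);
    std::string part;
    while (std::getline(stream, part, '.'))
        parts.push_back(part);

    if (parts.size() != 5 || parts[4] != "rec")
        return false;

    try {
        out.priority = std::stoi(parts[2]);
        out.lifetime = std::stoi(parts[3]);
    } catch (const std::exception&) {
        return false;  // priority or lifetime was not a number
    }
    out.date = parts[0];
    out.time = parts[1];
    return true;
}
```

A check like this could help an import step tell recording directories apart from ordinary ones, since names such as "summary.vdr" do not match the layout.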

          Stefan

           
          • Jin
            2005-10-30

            > So it would be nice to set a char translation per directory. As vdr is affected, support for its directory structure would be nice.

            sorry, I do not quite get the idea here: why would you want character translation "per directory"? what for?

            and by the way, what is vdr? :) could you point me to a page or something?

            >The main video is stored in xxx.vdr files. By default vdr does a split at 2GB. From playback they should
            > play consecutive. For seeking information index.vdr files should be used, as each I-frame is stored with
            > offset and file number. The directory structure is recording_name/episode_name/rec_date.rec_time.rec_priority.life_time/

            that is actually not related to character translation, right?

            question: what UPnP device is playing back this data, what are you using?

            The thing is that nothing in the UPnP standard would allow me to say: use this file as an index file and use the other resources as video data. Yes, it would work with resources, but the device would have to know which resource to take and so on. So those vdr-structured directories must be recognized upon import, and the single video file that will be served should be created by some external script. A problem that we currently have: the SDK cannot handle files >2GB (I think we had a thread about that), so we will need to fix that too.

            Anyway.. you got me a little confused, because those issues are not really related to character translation, right? :) it's more like a feature request?

            maybe you could outline in a separate thread how you would want those vdr directories to be imported, so I could understand what is required and whether it can be done

            thanks!
            Jin