#329 Invalid XML document using localized NFO files.

v1.0 (example)
open
nobody
None
5
2020-08-04
2020-07-29
Luis Backup
No

I have problems with some NFO files that use a non-English language.

Situation:
DLNA clients ignore the whole XML document because it is invalid.

Description:
On the server side, strings are trimmed to a fixed length, and the trim sometimes lands in the middle of a multibyte character, so the last byte of the string is a control character or some other non-ASCII byte.

When this string reaches the client, the INVALID last byte of the string sits immediately before the closing tag,
for example: <opentag>blahblahbl{INVALIDCHAR}
so the client's XML parser sees: <opentag>blahblahblP/closingtag>, and there is NO valid closing tag.
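
A minimal sketch of how a byte-limited trim could avoid splitting a multibyte character, assuming the string is valid UTF-8. The helper name `utf8_safe_len` is hypothetical and is not part of minidlna; it only computes a safe cut length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of minidlna: returns the largest length
 * <= max_bytes at which s can be cut without splitting a UTF-8 sequence.
 * Assumes s is valid, NUL-terminated UTF-8. */
static size_t utf8_safe_len(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);
    if (len <= max_bytes)
        return len;

    size_t cut = max_bytes;
    /* s[cut] is the first byte that would be dropped. UTF-8 continuation
     * bytes look like 10xxxxxx; if s[cut] is one, the cut falls inside a
     * character, so step back until s[cut] is the start of a character. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

With a helper like this, the existing `%.384s` format could in principle become `%.*s` with `(int)utf8_safe_len(comment, 384)` as the precision, so the limit stays at most 384 bytes but never ends on a partial character.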

I tested this with PHP's XML parser; I'm working on a simple PHP-based client, a kind of helper, to browse my movies from minidlna.

In the attached file, take a look at the plot string: minidlna cuts it at the word "creación", exactly in the middle of the accented "ó".

This happens with the current git version.

Thanks.

1 Attachments

Discussion

  • Luis Backup

    Luis Backup - 2020-08-04

    Another example of an XML document that is problematic for some XML parsers: look at the last &quot, which is not terminated properly because the string was truncated.

    <dc:description>Londres, 1593, reinado de Isabel I Tudor. William Shakespeare, joven dramaturgo de gran talento, necesita urgentemente poner fin a la mala racha por la que está pasando su carrera. Por mucho que lo intenta, y a pesar de la presión de los productores y de los dueños de salas de teatro, no consigue concentrarse en su nueva obra: "Romeo y Ethel, la hija del pirata&quot</dc:description>
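
    A minimal sketch of how a cut could be moved back so it never ends inside an entity reference such as the `&quot` above. It assumes the text is already XML-escaped, so a bare `&` can only begin an entity. The helper name `entity_safe_len` is hypothetical and is not part of minidlna.

```c
#include <stddef.h>

/* Hypothetical helper, not part of minidlna: given a proposed cut length,
 * returns a length that does not end inside an unterminated "&...;" entity.
 * Assumes s is XML-escaped text and cut <= strlen(s). */
static size_t entity_safe_len(const char *s, size_t cut)
{
    size_t i = cut;
    while (i > 0) {
        char c = s[i - 1];
        if (c == ';')
            break;          /* the last entity before the cut is complete */
        if (c == '&')
            return i - 1;   /* entity opened but not closed: cut before it */
        i--;
    }
    return cut;
}
```

    Applied to the description above, the cut would land just before `&quot`, so the element content would end at "pirata".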

     
  • Luis Backup

    Luis Backup - 2020-08-04

    I think a good method would be to walk back from the end to the first space, provided at least 3 chars are skipped, and then append three dots.

    Example: Hi how are you?, alongword anotherlongword
    Walk back from the end, counting chars, up to the first space found (that is, the last space in the string). If at least 3 chars were counted, cut at that position (the position of the space) and append three dots.
    Result: Hi how are you?, alongword...
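
    The rule above can be sketched as a standalone function. The name `truncate_at_word`, its parameters, and the limit of 35 bytes in the usage note are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies s into out. If s is longer than limit bytes,
 * walks back from s[limit] to a space that drops at least 3 bytes, cuts
 * there, and appends "...". If no such space exists, only "..." is written. */
static void truncate_at_word(const char *s, size_t limit,
                             char *out, size_t out_size)
{
    size_t len = strlen(s);
    if (len <= limit) {
        snprintf(out, out_size, "%s", s);
        return;
    }

    size_t pos = limit;
    size_t skipped = 0;
    while (pos > 0 && !(s[pos] == ' ' && skipped >= 3)) {
        pos--;
        skipped++;
    }
    /* "%.*s" copies exactly pos bytes of s; snprintf bounds the output. */
    snprintf(out, out_size, "%.*s...", (int)pos, s);
}
```

    With a limit of 35, the example string is cut at the space before "anotherlongword" and yields "Hi how are you?, alongword...".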

     
  • Luis Backup

    Luis Backup - 2020-08-04

    Because my PHP client stops parsing the XML document almost every time I add a new movie, I made a horrible modification (worst code ever written). Please adapt it to your coding style and conventions, or take it as an example. It was not tested with strings of 383, 384 or 385 bytes, only with shorter and longer ones, but it tested OK with words and HTML entities like &quot; in the middle of the cut.

    File: upnpsoap.c
    near Line: 1011

        if( comment && (passed_args->filter & FILTER_DC_DESCRIPTION) ) {
            //ret = strcatf(str, "&lt;dc:description&gt;%.384s&lt;/dc:description&gt;", comment);
    
    
    
            ret = strcatf(str, "&lt;dc:description&gt;");
    
            size_t sourceLen = strlen(comment);
            if(sourceLen > 384)
            {
                /* Start at byte 384 and walk back to a space that drops at least 3 bytes. */
                const char * pEndPosition = comment + 384;
                size_t skippedChars = 0;
                while(pEndPosition > comment)
                {
                    //a space is the ideal place to cut, but only if we skipped at least 3 chars
                    if(' ' == *pEndPosition && skippedChars >= 3)
                        break;
                    skippedChars++;
                    pEndPosition--;
                }
                /* Let strcatf copy the prefix and append the dots in one call,
                   instead of a manual strncat plus str->off bookkeeping. */
                ret = strcatf(str, "%.*s...", (int)(pEndPosition - comment), comment);
            }
            else
            {
                ret = strcatf(str, "%.384s", comment);
            }
            ret = strcatf(str, "&lt;/dc:description&gt;");
    
    
    
    
    
        }
        if( creator && (passed_args->filter & FILTER_DC_CREATOR) ) {
    
     

    Last edit: Luis Backup 2020-08-04
