
#329 Invalid XML document using localized NFO files.

Luis Backup

I have problems with some NFO files written in languages other than English.

DLNA clients ignore the whole XML document because it's invalid.

On the server side, strings are trimmed to a fixed length, and sometimes the trim lands in the middle of a multibyte char, so the byte left in the last position is a control char or some other non-ASCII byte that is not a complete character.

When this string goes to the client, the last INVALID char of the string directly precedes the closing tag,
for example: <opentag>blahblahbl{INVALIDCHAR}</closingtag>
so the XML parser of the client sees <opentag>blahblahblP/closingtag>: there is NO valid closing tag.

I've tested this with PHP's XML parser; I'm working on a simple PHP-based client to browse my movies from minidlna, as a kind of helper.

In the attached file, take a look at the plot string: minidlna cuts it at the word "creación", exactly in the middle of the accented "ó".

Happens with the current git version.
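
The multibyte cut described above can be avoided without changing the length limit. The following is only a minimal sketch: `utf8_safe_len` is a hypothetical helper, not a minidlna function, and it assumes the input string is valid UTF-8. It moves a byte limit back until it no longer falls inside a multibyte sequence.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of minidlna.
 * Returns the largest length <= max_bytes that does not end inside a
 * UTF-8 multibyte sequence. Assumes s is valid UTF-8. */
static size_t utf8_safe_len(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);
    if (len <= max_bytes)
        return len;                 /* nothing to cut */

    size_t cut = max_bytes;
    /* s[cut] is the first byte that would be dropped. If it is a
     * continuation byte (bit pattern 10xxxxxx), the cut falls inside a
     * character, so step back until s[cut] starts a character. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

Under the same assumptions, the result could be passed as a precision to a printf-style call, for example `"%.*s"` with `(int)utf8_safe_len(comment, 384)` and `comment` as arguments, in place of the fixed `"%.384s"`.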


  • Luis Backup

    Luis Backup - 2020-08-04

    Another example of a problematic XML document for some XML parsers: look at the last &quot, which is not terminated properly because the string was truncated.

    <dc:description>Londres, 1593, reinado de Isabel I Tudor. William Shakespeare, joven dramaturgo de gran talento, necesita urgentemente poner fin a la mala racha por la que está pasando su carrera. Por mucho que lo intenta, y a pesar de la presión de los productores y de los dueños de salas de teatro, no consigue concentrarse en su nueva obra: "Romeo y Ethel, la hija del pirata&quot</dc:description>
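
    A cut that lands inside an entity, as with the &quot above, could be detected by looking back from the cut position. This is only a sketch: `entity_safe_len` is a hypothetical helper, it assumes the text has a single level of XML escaping (so every '&' starts an entity), that `cut` is not larger than the string length, and that entities are at most 10 bytes long.

    ```c
    #include <stddef.h>

    /* Hypothetical helper, not part of minidlna.
     * If cutting s at `cut` bytes would leave an unterminated entity such
     * as "&quot", returns the position of its '&' so the whole entity is
     * dropped. Otherwise returns cut unchanged.
     * Assumes cut <= strlen(s) and entities of at most 10 bytes. */
    static size_t entity_safe_len(const char *s, size_t cut)
    {
        size_t limit = cut > 10 ? cut - 10 : 0;
        size_t i = cut;

        while (i > limit) {
            i--;
            if (s[i] == ';')
                return cut;     /* the last entity before the cut is complete */
            if (s[i] == '&')
                return i;       /* unterminated entity: cut before it */
        }
        return cut;             /* no entity near the cut */
    }
    ```

    With the string `pirata&quot;` and a cut after 11 bytes, this returns 6, so the output would end at "pirata" instead of at a broken &quot.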

  • Luis Backup

    Luis Backup - 2020-08-04

    I think a good method would be to walk back from the end to the nearest space, giving up at least 3 chars, then cut there and append three dots.

    Example: Hi how are you?, alongword anotherlongword
    Walk back from the end, counting chars, until you reach a space (the last space before the limit). If at least 3 chars were counted, cut at that position (the position of the space) and append three dots.
    Result: Hi how are you?, alongword...
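
    The method described above can be sketched as a standalone function. This is an illustration under stated assumptions, not minidlna code: `truncate_at_word` is a hypothetical name, the caller must supply a `dst` buffer of at least `max_bytes + 1` bytes, and when no suitable space exists the result is just the three dots.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper, not part of minidlna.
     * Copies src into dst using at most max_bytes bytes of text. When src
     * is longer, it is cut at a space and "..." is appended, still within
     * max_bytes. dst must hold max_bytes + 1 bytes.
     * Returns the number of bytes written, excluding the terminator. */
    static size_t truncate_at_word(const char *src, size_t max_bytes, char *dst)
    {
        size_t len = strlen(src);
        if (len <= max_bytes) {
            memcpy(dst, src, len + 1);      /* fits: copy unchanged */
            return len;
        }
        if (max_bytes < 3) {                /* no room for the dots */
            dst[0] = '\0';
            return 0;
        }

        /* Walk back from the limit. A space is accepted only after at
         * least 3 bytes have been given up, which leaves room for "...". */
        size_t cut = max_bytes;
        size_t skipped = 0;
        while (cut > 0 && !(src[cut] == ' ' && skipped >= 3)) {
            cut--;
            skipped++;
        }

        memcpy(dst, src, cut);
        memcpy(dst + cut, "...", 4);        /* includes the terminator */
        return cut + 3;
    }
    ```

    Called with the example string and a limit of 32 bytes, this produces "Hi how are you?, alongword...". Because a space is a single ASCII byte, a cut at a space cannot fall inside a UTF-8 multibyte char or inside an entity such as &quot;.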

  • Luis Backup

    Luis Backup - 2020-08-04

    Because almost every time I add a new movie my PHP client stops parsing the XML document, I made a horrible modification (worst code ever written). Please adapt it to your coding style and conventions, or take it as an example. It was not tested with strings of 383, 384 or 385 bytes, only with shorter and longer ones, but it tested OK with words and HTML entities like &quot; in the middle of the cut.

    File: upnpsoap.c
    near Line: 1011

        if( comment && (passed_args->filter & FILTER_DC_DESCRIPTION) ) {
            //ret = strcatf(str, "&lt;dc:description&gt;%.384s&lt;/dc:description&gt;", comment);
            ret = strcatf(str, "&lt;dc:description&gt;");
            size_t sourceLen = strlen(comment);
            if(sourceLen > 384) {
                //start at the 384 byte limit and walk back looking for a space
                char * pEndPosition = comment + 384;
                uint16_t skippedChars = 0;
                uint8_t bQuit = 0;
                while(0 == bQuit && pEndPosition > comment) {
                    if(' ' == *pEndPosition) {
                        //we found a space, ideal to cut, but check if we skip at least 3 chars
                        if(skippedChars >= 3) {
                            //ok to cut
                            bQuit = 1;
                        }
                    }
                    if(0 == bQuit) {
                        //not a usable cut position, keep walking back
                        pEndPosition--;
                        skippedChars++;
                    }
                }
                //append only the bytes before the cut, then the three dots;
                //go through strcatf instead of writing into str->data directly
                int resultingCount = (int)(pEndPosition - comment);
                ret = strcatf(str, "%.*s...", resultingCount, comment);
            } else {
                //short enough, copy as before
                ret = strcatf(str, "%.384s", comment);
            }
            ret = strcatf(str, "&lt;/dc:description&gt;");
        }
        if( creator && (passed_args->filter & FILTER_DC_CREATOR) ) {

    Last edit: Luis Backup 2020-08-04

