I have problems with some .nfo files that contain non-English text.
Situation:
DLNA clients ignore the whole XML document because it is invalid.
Description:
On the server side, when strings are trimmed, the cut sometimes falls in the middle of a multibyte char, so the byte left in the last position is a control char or some other non-ASCII byte.
When this string goes to the client, the INVALID last char of the string sits immediately before the closing tag:
for example: <opentag>blahblahbl{INVALIDCHAR}</closingtag>
so the client's XML parser sees: <opentag>blahblahblP/closingtag>, and there is NO valid closing tag.
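A minimal sketch of one way to avoid the split, assuming the strings are valid UTF-8 before trimming: step the cut position back while the byte at the cut is a continuation byte (bit pattern 10xxxxxx). The names utf8_safe_cut and utf8_truncate are hypothetical and are not taken from the minidlna source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the largest length <= max_bytes at which
 * s can be cut without splitting a UTF-8 sequence.
 * Assumption: s is valid, NUL-terminated UTF-8. */
static size_t utf8_safe_cut(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);
    size_t cut;

    if (len <= max_bytes)
        return len;             /* nothing to trim */

    cut = max_bytes;
    /* s[cut] would be the first byte dropped. If it is a continuation
     * byte (10xxxxxx), the cut is inside a character, so move back to
     * the lead byte of that character and drop it as a whole. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}

/* Hypothetical helper: truncate s in place to at most max_bytes bytes. */
static void utf8_truncate(char *s, size_t max_bytes)
{
    s[utf8_safe_cut(s, max_bytes)] = '\0';
}
```

With a limit of 7 bytes, a string such as "creación" (where the accented letter takes two bytes) would be cut to "creaci" instead of ending in half a character.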
I tested this with PHP's XML parser; in fact, I'm working on a simple PHP-based client to browse my movies from minidlna, as a kind of helper.
In the attached file, take a look at the plot string: minidlna cuts it at the word "creación", exactly in the middle of the accented "ó".
This happens with the current git version.
Thanks.
Another example of an XML document that some XML parsers reject: look at the last &quot; entity, which is not terminated properly because the string was truncated.
<dc:description>Londres, 1593, reinado de Isabel I Tudor. William Shakespeare, joven dramaturgo de gran talento, necesita urgentemente poner fin a la mala racha por la que está pasando su carrera. Por mucho que lo intenta, y a pesar de la presión de los productores y de los dueños de salas de teatro, no consigue concentrarse en su nueva obra: &quot;Romeo y Ethel, la hija del pirata&quot</dc:description>
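The same kind of guard can be sketched for entities, assuming the string has already been XML-escaped, so that every '&' starts an entity and every entity ends with ';'. The function name entity_safe_cut and the look-back limit MAX_ENTITY_LEN are hypothetical choices for illustration, not taken from the minidlna source.

```c
#include <stddef.h>

/* Assumption: no entity we care about is longer than 10 bytes
 * (for example "&#x10FFFF;"), so looking back 10 bytes is enough. */
#define MAX_ENTITY_LEN 10

/* Hypothetical helper: given a proposed cut position (cut <= strlen(s)),
 * return a position that does not fall inside an entity such as "&quot;".
 * Assumption: s is already XML-escaped, so a bare '&' never appears. */
static size_t entity_safe_cut(const char *s, size_t cut)
{
    size_t i = cut;

    while (i > 0 && cut - i < MAX_ENTITY_LEN) {
        i--;
        if (s[i] == ';')
            return cut;     /* the nearest entity is complete */
        if (s[i] == '&')
            return i;       /* unterminated entity: cut before it */
    }
    return cut;             /* no entity near the cut */
}
```

Applied to a string ending in "pirata&quot;" with the cut one byte short of the end, this would move the cut back to just after "pirata" rather than leaving "&quot" behind.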
I think a good method would be to go back from the end of the trimmed string to the last space and, if that skips more than 3 chars, cut there and append three dots.
Example: Hi how are you?, alongword anotherlongword
Count chars back from the end up to the first space found (that is, the last space in the string). If more than 3 chars were counted, cut at the position of that space and append three dots.
Result: Hi how are you?, alongword...
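The rule above can be sketched in C as follows. The function name ellipsize_at_last_space is hypothetical, and leaving the string unchanged when there is no space or when the tail is 3 chars or shorter is an assumption, since the description does not cover those cases.

```c
#include <string.h>

/* Hypothetical helper: s is the already-trimmed, NUL-terminated string.
 * If more than 3 bytes follow the last space, replace the space and
 * everything after it with "...". The result is always shorter than
 * the input, so it fits in the same buffer. */
static void ellipsize_at_last_space(char *s)
{
    char *sp = strrchr(s, ' ');     /* last space, or NULL */

    if (sp == NULL)
        return;                     /* assumption: leave as is */
    if (strlen(sp + 1) > 3)
        strcpy(sp, "...");          /* overwrite space + partial word */
}
```

Cutting at a space is also safe for multibyte text, because the byte 0x20 never occurs inside a UTF-8 multibyte sequence.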
Because my PHP client stops parsing the XML document almost every time I add a new movie, I made a horrible modification (worst code ever written); please adapt it to your coding style and conventions, or just take it as an example. It was not tested with strings of exactly 383, 384 or 385 bytes, only with shorter and longer ones, but it tested OK with words and with HTML entities like &quot; in the middle of the cut.
File: upnpsoap.c
near Line: 1011
Last edit: Luis Backup 2020-08-04