#1069 broken rendering in tt-rss feeds

Milestone: v1.10
Status: closed-fixed
Owner: Lars Windolf
Labels: None
Priority: 7
Updated: 2013-06-11
Created: 2013-03-26
Creator: Pavel
Private: No

All of my tt-rss feeds are showing this rendering error:

"This page contains the following errors:

error on line 23 at column 3190: Opening and ending tag mismatch: br line 0 and p
Below is a rendering of the page up to the first error."

Liferea: 1.10-RC1 (same issue also in 1.8.12)
TT-Rss: 1.7.5
Distro: Ubuntu 12.04.2

Discussion

  • I don't see this on every single item but most of them. I haven't yet determined whether tt-rss really is sending out bad markup.

     
  • Jonathan
    2013-04-11

    I confirm this same bug.

    -Affects some feeds, not others.
    -Affects 100% of items in affected feeds.
    -Affected feeds from various sources.
    -Affected feeds originate from the same sources as unaffected feeds.

    Liferea 1.8.12
    TT-RSS 1.7.8
    Ubuntu 12.04

     
  • Brian
    2013-04-13

    Confirming the same bug. It seems universal on feeds that have embedded images, and falls over after the first image. Both Liferea and TT-RSS can display the feed correctly on their own, but Liferea displaying it from a TT-RSS instance has the bug.

    Liferea 1.8.12; Slackware 14 64 bit
    TT-RSS 1.7.8 on Synology NAS running DSM 4.2

     
  • Jonathan
    2013-04-17

    Okay, I have found out where the problem is. It is choking on the <br /> tag. Wherever this appears, which is in 90+% of my feeds, Liferea chokes parsing the XML.

    Is <br /> not a valid tag?

    Liferea 1.8.12
    TT-RSS 1.7.8
    Ubuntu 12.04

     
  • Not just <br /> but also <img />, and that probably means any singular tag. <br /> is XHTML syntax while <br> is HTML. My guess is it's being parsed too strictly as the latter.

     
    Last edit: James Le Cuirot 2013-04-17
  • David Smith
    2013-05-12

    Also getting this problem with Liferea 1.10rc1 and tt-rss 1.7.8.
    I'd say about 40% of all news articles aren't loading properly (if at all), including the Debian news feed.

     
    Last edit: David Smith 2013-05-12
  • Seeing the problem also on current git master. For reference, here's the simplest possible test case: RSS. It's valid RSS, works well in TTRSS and Liferea when added as a normal subscription, but fails as described above when viewed as part of a Liferea TTRSS source.

     
  • Hi again. Looking at this in spare minutes... I put some printfs in the code to see what happens. When my test RSS above is loaded as a regular subscription in Liferea, the following is passed from parseRSSItem (in rss_item.c) to item_set_description:

    <div xmlns="http://www.w3.org/1999/xhtml"><p>This triggers a bug in Liferea TTRSS sources: <br/>
          This content is not seen.</p></div>
    

    I.e., the xhtml content from the description tag of the item, wrapped in a <div> and a <p>.

    But when the same RSS is loaded through a TTRSS source, this is what gets passed to item_set_description (from ttrss_feed_subscription_process_update_result in ttrss_source_feed.c):

    <html><body><p>This triggers a bug in Liferea TTRSS sources: <br>
          This content is not seen.</p></body></html>
    

    This content comes, I believe, straight from the TTRSS API's JSON. So it seems TTRSS transforms content to straight HTML. The parser then expects XHTML, and chokes because the <br> isn't closed. (So it's the opposite from James' suggestion above – strictly expecting xhtml, getting html.)
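The mismatch can be reproduced outside Liferea. The following is only a minimal sketch in Python, assuming a strict XML parser (here the standard library's `xml.etree.ElementTree`, standing in for whatever parser Liferea uses); the helper name `parses_as_xml` and the shortened sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two strings quoted above (illustrative only).
XHTML_CONTENT = (
    '<div xmlns="http://www.w3.org/1999/xhtml"><p>This triggers a bug: <br/>\n'
    'This content is not seen.</p></div>'
)
HTML_CONTENT = (
    '<html><body><p>This triggers a bug: <br>\n'
    'This content is not seen.</p></body></html>'
)


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An unclosed <br> followed by </p> is a tag mismatch in XML.
        return False


if __name__ == "__main__":
    print("XHTML accepted:", parses_as_xml(XHTML_CONTENT))
    print("HTML accepted:", parses_as_xml(HTML_CONTENT))
```

The self-closed `<br/>` is well-formed XML, while the bare `<br>` leaves an element open when `</p>` arrives, which is the same kind of "Opening and ending tag mismatch" reported in the ticket.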

    Two possible paths emerge:

    1. Investigate why TTRSS transforms XHTML to HTML and see if the behavior can be changed.
    2. Make Liferea convert content from TTRSS (back) to xhtml, or somehow make the parser accept unclosed tags such as <br>. (Don't know how this stuff works.)
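As a rough illustration of the second path, here is a sketch in Python that re-serializes HTML so that void elements come out self-closed. It is not the code from any Liferea patch; the names `XhtmlSerializer`, `html_to_xhtml` and `VOID_ELEMENTS` are hypothetical, and it only handles void elements (it does not repair other unclosed tags, comments or doctypes).

```python
from html import escape
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class XhtmlSerializer(HTMLParser):
    """Re-emit parsed HTML with void elements written as <tag />."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _emit_open(self, tag, attrs, self_close):
        parts = [tag]
        for name, value in attrs:
            # Valueless attributes (e.g. "disabled") get their name as value.
            text = value if value is not None else name
            parts.append('%s="%s"' % (name, escape(text, quote=True)))
        self.out.append("<" + " ".join(parts) + (" />" if self_close else ">"))

    def handle_starttag(self, tag, attrs):
        self._emit_open(tag, attrs, tag in VOID_ELEMENTS)

    def handle_startendtag(self, tag, attrs):
        self._emit_open(tag, attrs, True)

    def handle_endtag(self, tag):
        # Void elements were already closed when they were opened.
        if tag not in VOID_ELEMENTS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again for XML.
        self.out.append(escape(data, quote=False))


def html_to_xhtml(markup):
    parser = XhtmlSerializer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(html_to_xhtml('<p>first line<br>second line<img src="x.png"></p>'))
```

Output produced this way can then be handed to a strict XML parser, at least for content whose only problem is unclosed `<br>` or `<img>` tags.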

    Regards, Simon

     
  • Hello again,

    I now have a working fix! It should probably be cleaned up a bit before going into the master code, but I wanted to post it here before going to bed and not knowing when I'll be able to hack further on this. It would be nice to get it tested. Applies against git master or 1.10RC2 with patch -p1.

    This makes the item content from TTRSS go through the same conversion as regular RSS item descriptions do.

    Regards,
    Simon Kågedal Reimer

     
  • Here's a cleaner patch that does the same job with less duplicated code. I've tested with a couple of feeds that previously didn't work, and they now seem to parse correctly. Would be great if somebody else could try it out.

    Hope this can be included in RC3. No code except the TTRSS module should be affected.

    Regards, Simon

     
  • Brian
    2013-05-15

    I have installed patch version 2 against 1.10-RC2, and everything appears to be working perfectly now.

    Thanks!
    Brian

     
  • Jonathan
    2013-05-17

    FYI, Patch not working for 1.8.13

    Upgraded to 1.10-RC2 - working fine. Thanks

     
  • Lars Windolf
    2013-05-18

    Patch for 1.8.13 merged with git master

     
  • Lars Windolf
    2013-05-18

    Patch for 1.10-RC2 merged with git master

    Thanks for the work! I hope to release them soon, but I want to test for a few days first.

     
  • Lars Windolf
    2013-05-18

    • status: open --> open-fixed
    • assigned_to: Lars Windolf
    • Priority: 5 --> 7
     
  • Thank you for accepting, Lars!

    There was a stupid bug in my code pointed out on liferea-devel by Emilio Pozuelo Monfort - thank you! Here's a follow-up patch that fixes that and another thing (shouldn't use g_return_val_if_fail if NULL is considered valid input). It applies to either 1.8.13 or 1.10RC2 after first applying the appropriate VERSION2 patch, or to git master.

     
  • Lars Windolf
    2013-05-18

    Thanks for the update! I'm glad Emilio spotted it. Applied to both branches.

     
  • Lars Windolf
    2013-06-11

    • status: open-fixed --> closed-fixed
     
  • Lars Windolf
    2013-06-11

    Fixed with recent releases.