#948 Feeds with DTD generates IOExceptions in UrlResolver

CVS_version
closed
Dare Obasanjo
7
2012-09-23
2007-02-07
No

From our mail conversation:

It looks like we can (sometimes) handle feeds with DTDs correctly, while IE7 simply says
"IE does not support feeds with DTD's", e.g. for this one:
http://www.webmasterworld.com/index.rss

While debugging (with CLR 2.0 running) I now get a DirectoryNotFoundException (which inherits
from IOException) at line 1837, file RssParser.cs, class ProxyXmlUrlResolver::GetEntity().
For the above feed, the method receives the parameter
absoluteUri = {file:///D:/My Projects/DOT.NET/Sourceforge.RssBandit/CurrentWork/Source/RssBandit/bin/Debug2/-/Netscape Communications/DTD RSS 0.91/EN}

So how do we handle this? Catch that exception and delegate to base.GetEntity()?

Or is the relative DTD file reference being wrongly expanded to a local file URL?

Discussion

  • Dare Obasanjo
    2007-02-10

    Logged In: YES
    user_id=24549
    Originator: NO

    Couldn't reproduce this error.

     
  • Logged In: YES
    user_id=714452
    Originator: YES

    Did you test on top of CLR 2.0? Here it reproduces every time...

     
  • Logged In: YES
    user_id=714452
    Originator: YES

    Another one, this time the exception

    "Cannot resolve external DTD subset - public ID = '-//W3C//DTD XHTML 1.0 Strict//EN', system ID = 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'"

    for the feed https://addons.mozilla.org/rss/?application=firefox&type=E&list=popular

     
  • Logged In: YES
    user_id=714452
    Originator: YES

    You can repro it this way: in the VS 8.0/2005 IDE, press Ctrl-Alt-E to show the debug exceptions window and enable break on all CLR exceptions (or at least IOException/DirectoryNotFoundException), or just put a breakpoint in the IOException catch clause, then refresh the above feed(s).

    I guess it is another framework issue, because we already get a badly resolved URL as the parameter (it is already a file URL, but it should not be one).
    Could it be that the CLR assumes a local file whenever there is a "System ID" in the DTD URL?
    I cannot examine this with my Fiddler; it does not yet support "https:".

     
  • Logged In: YES
    user_id=714452
    Originator: YES

    My bad: the provided feed samples seem to be gone (no DTD anymore). I will look to see if I can find another one.

     
  • Logged In: YES
    user_id=714452
    Originator: YES

    Now I seem to get it with all the DTD feeds I have. Here is one of the more popular ones, which now shows up in my Errors node every time:

    http://feedvalidator.org/check?url=http://www.xml.com/xml/news.rss

    Looks like a more serious CLR 2.0 issue: the absoluteUri parameter always contains the DOCTYPE part '-//Netscape Communications//DTD RSS 0.91//EN' (the public identifier, resolved/expanded to a local filesystem URI relative to the executable), not the DOCTYPE part 'http://my.netscape.com/publish/formats/rss-0.91.dtd' (the system identifier) that we expect (and get with CLR 1.1).

    xml news feed head:

    ...
     
  • Dare Obasanjo
    2007-02-17

    Logged In: YES
    user_id=24549
    Originator: NO

    I can now repro this issue. I've fixed it by retrieving the RSS 0.91 DTD from a local file instead of trying to fetch it from the provided absoluteUri. Since the RSS 0.91 DTD is the only DTD that is regularly referenced in RSS feeds, this should solve the problem in almost all cases.
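
For illustration only, here is a minimal sketch of the kind of fix described above. It is not the actual RssParser.cs change. The class name LocalDtdResolver, the helper LooksLikeRss091 and the localDtdPath parameter are hypothetical; only the public and system identifiers are taken from this report. The sketch assumes a copy of the RSS 0.91 DTD ships with the application.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical resolver; NOT the actual ProxyXmlUrlResolver from RssParser.cs.
public class LocalDtdResolver : XmlUrlResolver
{
    // Assumed location of a bundled copy of the RSS 0.91 DTD.
    private readonly string localDtdPath;

    public LocalDtdResolver(string localDtdPath)
    {
        this.localDtdPath = localDtdPath;
    }

    // Returns true when the URI looks like either DOCTYPE part quoted in
    // this report: the public identifier expanded to a file URI, or the
    // system identifier ending in rss-0.91.dtd.
    public static bool LooksLikeRss091(Uri absoluteUri)
    {
        if (absoluteUri == null) return false;
        // Undo percent-escaping (e.g. %20) before matching on the text.
        string text = Uri.UnescapeDataString(absoluteUri.AbsoluteUri);
        return text.IndexOf("DTD RSS 0.91", StringComparison.OrdinalIgnoreCase) >= 0
            || text.EndsWith("rss-0.91.dtd", StringComparison.OrdinalIgnoreCase);
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (LooksLikeRss091(absoluteUri))
        {
            // Serve the bundled DTD instead of following the given URI.
            return File.OpenRead(localDtdPath);
        }
        // Anything else is resolved the default way.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```

To try it, an instance could be assigned to XmlReaderSettings.XmlResolver before creating the reader. Matching on the text of the URI is a deliberately loose heuristic chosen because, as reported above, the URI that arrives under CLR 2.0 is the expanded public identifier rather than the system identifier.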