#19 XML through the parser not behaving as expected


Please bear with me, as I'm just starting to learn PHP and playing around a bit with parsing (thank you, by the way, for this excellent tool).

I'm having some problems parsing the following URL:

http://www.dn.se/m/rss/senaste-nytt

As far as I understand, this page is an RSS feed written in XML. Now, when I try the following:

echo file_get_html('http://www.dn.se/m/rss/senaste-nytt');

I don't get the output I expect. Here's what turns up (not all of it, but enough to show the pattern):

http://www.dn.se/m/rss/senaste-nytt 1.0 http://www.dn.se/ekonomi/ericssons-partner-lamnar-st-ericsson Mon, 10 Dec 2012 09:02:00 GMT 2012-12-10T09:02:00Z http://www.dn.se/ekonomi/ericssons-partner-lamnar-st-ericsson http://www.dn.se/kultur-noje/viktig-grav-kan-bli-begravd Mon, 10 Dec 2012 08:20:00 GMT 2012-12-10T08:20:00Z http://www.dn.se/kultur-noje/viktig-grav-kan-bli-begravd 

What I would expect to get is what I see when I look at this URL in a browser: something more orderly, with tags and so on:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ex="http://www.mobiletech.no/rss/ex10" version="2.0">
    <title><![CDATA[DN.se - Nyheter - Senaste nytt]]></title>
    <description><![CDATA[DN.se - Nyheter - Senaste nytt]]></description>
      <title><![CDATA[Ericssons partner lämnar ST Ericsson]]></title>
      <description><![CDATA[Ericssons partner i hälftenägda ST Ericsson, ST Microelectronics, lämnar sitt ägande. ]]></description>
      <pubDate>Mon, 10 Dec 2012 09:02:00 GMT</pubDate>

Could anyone explain why this is happening? I expected the parser's output to be fairly "clean", in the sense that it should match what I see when viewing the page source in the browser. Am I misunderstanding how the parser works?
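
One thing I could try, to check whether the tags are actually still there and are only being hidden when the browser renders the echoed text as HTML (this is a guess on my part, not something I have confirmed), is to escape the output before echoing it. Below is a minimal sketch of that idea. The helper name escape_markup_for_display and the $sample string are made up for illustration and are not part of the parser; only htmlspecialchars() is a standard PHP function.

```php
<?php
// Hypothetical helper (not part of the parser library): escape markup so a
// browser prints the tags literally instead of interpreting them as HTML.
function escape_markup_for_display($markup)
{
    // The cast works for plain strings and for objects that define
    // __toString(); htmlspecialchars() turns <, >, & and quotes into entities.
    return htmlspecialchars((string) $markup, ENT_QUOTES, 'UTF-8');
}

// Small stand-in for the feed so the sketch runs without network access.
$sample = '<rss version="2.0"><title><![CDATA[DN.se]]></title></rss>';

// Wrapping in <pre> keeps line breaks and indentation visible in the browser.
echo '<pre>' . escape_markup_for_display($sample) . '</pre>';
```

With the real feed I would presumably pass the result of file_get_html('http://www.dn.se/m/rss/senaste-nytt') instead of $sample, assuming the returned object can be cast to a string.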

Thanks in advance.

