From: Dave P. <da...@da...> - 2002-09-24 09:51:31

As of some time around 2002-09-18 some code that I've had running for a long
time now (written around cllib's url and xml functions) has started to fail.
The code <URL:http://www.davep.org/lisp/slashdot.lisp> (along with other
bits) connects to slashdot, grabs the headline file and adds the headlines
to a local database.

As of the 18th the connection no longer seems to work. For example:

-- cut here ----------------------------------------------------------------
[10]> (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml")
[1] Connecting to slashdot.org:80 [timeout 86,400 sec]...done:
 [#<io unbuffered socket-stream character slashdot.org:80>]
Connection to <http://slashdot.org:80/slashdot.xml> dropped:
 - read: input stream #<io unbuffered socket-stream character slashdot.org:80> has reached its end
-- cut here ----------------------------------------------------------------

whereas a connection to a copy of the XML file, held locally on my machine,
works fine. Likewise, if I use something like wget to grab a copy of the xml
file everything works fine.

I don't know what has changed here (it wasn't my code) but I'm wondering if
some change has taken place at slashdot.org which can't be handled by
cllib's url oriented code?

--
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-09-24 17:51:38

> * In message <200...@ha...>
> * On the subject of "Loading xml from an URL and slashdot.org"
> * Sent on Tue, 24 Sep 2002 10:51:26 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> As of some time around 2002-09-18 some code that I've had running for
> a long time now (written around cllib's url and xml functions) has
> started to fail.

I modified url.lisp - it appears to work now again.
I think they changed something at slashdot.org - I did not touch cllib
for months.

--
Sam Steingold (http://www.podval.org/~sds) running RedHat7.3 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Hard work has a future payoff. Laziness pays off NOW.
From: Sam S. <sd...@gn...> - 2002-09-24 17:51:38

> * In message <200...@ha...>
> * On the subject of "Loading xml from an URL and slashdot.org"
> * Sent on Tue, 24 Sep 2002 10:51:26 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> As of some time around 2002-09-18 some code that I've had running for a long
> time now (written around cllib's url and xml functions) has started to fail.
> The code <URL:http://www.davep.org/lisp/slashdot.lisp> (along with other
> bits) connects to slashdot, grabs the headline file and adds the headlines
> to a local database.
>
> As of the 18th the connection no longer seems to work. For example:
>
> -- cut here ----------------------------------------------------------------
> [10]> (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml")
> [1] Connecting to slashdot.org:80 [timeout 86,400 sec]...done:
>  [#<io unbuffered socket-stream character slashdot.org:80>]
> Connection to <http://slashdot.org:80/slashdot.xml> dropped:
>  - read: input stream #<io unbuffered socket-stream character slashdot.org:80> has reached its end
> -- cut here ----------------------------------------------------------------
>
> whereas a connection to a copy of the XML file, held locally on my machine,
> works fine. Likewise, if I use something like wget to grab a copy of the xml
> file everything works fine.
>
> I don't know what has changed here (it wasn't my code) but I'm wondering if
> some change has taken place at slashdot.org which can't be handled by
> cllib's url oriented code?

I have not changed anything there for a long long time.
I will look into this...

--
Sam Steingold (http://www.podval.org/~sds) running RedHat7.3 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Hard work has a future payoff. Laziness pays off NOW.
From: Dave P. <da...@da...> - 2002-09-25 10:00:43

* Sam Steingold <sd...@gn...> [2002-09-24 13:39:14 -0400]:

> > As of some time around 2002-09-18 some code that I've had running for a
> > long time now (written around cllib's url and xml functions) has started
> > to fail.
>
> I modified url.lisp - it appears to work now again.

Confirmed here. Many thanks.

> I think they changed something at slashdot.org - I did not touch cllib for
> months.

I suspect that it was something at slashdot.org that url.lisp wasn't taking
into account rather than url.lisp having been modified (indeed, this had to
be the case 'cos I've not done an update of clocc in many months).

Can I ask what the problem was?

--
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-09-25 13:43:19

> * In message <200...@ha...>
> * On the subject of "Re: Loading xml from an URL and slashdot.org"
> * Sent on Wed, 25 Sep 2002 10:37:01 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> * Sam Steingold <sd...@gn...> [2002-09-24 13:39:14 -0400]:
> > I modified url.lisp - it appears to work now again.
> Confirmed here. Many thanks.

great!

> Can I ask what the problem was?

slashdot wanted a more verbose request.
You can see the request and the headers if you pass a stream as the :ERR
argument to WITH-OPEN-URL.

--
Sam Steingold (http://www.podval.org/~sds) running RedHat7.3 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
In the race between idiot-proof software and idiots, the idiots are winning.
From: Dave P. <da...@da...> - 2002-10-24 08:56:21

* Dave Pearson <da...@da...> [2002-09-24 10:51:26 +0100]:

> I don't know what has changed here (it wasn't my code) but I'm wondering
> if some change has taken place at slashdot.org which can't be handled by
> cllib's url oriented code?

This "problem" seems to have happened again:

,----
| [12]> (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml")
| [1] Connecting to slashdot.org:80 [timeout 86,400 sec]...done:
|  [#<io unbuffered socket-stream character slashdot.org:80>]
| [cllib:open-url]GET /slashdot.xml HTTP/1.0
| [cllib:open-url]User-Agent: CLOCC/CLLIB (CLISP)
| [cllib:open-url]Host: slashdot.org
| [cllib:open-url]Accept: */*
| [cllib:open-url]Connection: close
| [cllib:open-url]<terpri>
`----

At that point clisp "hangs" until I Ctrl-C to get back to a prompt.

Something like wget is still working ok:

,----
| davep@hagbard:~$ wget http://slashdot.org/slashdot.xml
| --09:55:26--  http://slashdot.org:80/slashdot.xml
|            => `slashdot.xml'
| Connecting to slashdot.org:80... connected!
| HTTP request sent, awaiting response... 200 OK
| Length: 3,790 [text/xml]
|
|     0K -> ...                                                  [100%]
|
| 09:55:27 (10.89 KB/s) - `slashdot.xml' saved [3790/3790]
`----

--
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-10-24 14:11:23

> * In message <200...@ha...>
> * On the subject of "Re: Loading xml from an URL and slashdot.org"
> * Sent on Thu, 24 Oct 2002 09:56:09 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> * Dave Pearson <da...@da...> [2002-09-24 10:51:26 +0100]:
>
> > I don't know what has changed here (it wasn't my code) but I'm wondering
> > if some change has taken place at slashdot.org which can't be handled by
> > cllib's url oriented code?
>
> This "problem" seems to have happened again:
>
> ,----
> | [12]> (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml")
> | [1] Connecting to slashdot.org:80 [timeout 86,400 sec]...done:
> |  [#<io unbuffered socket-stream character slashdot.org:80>]
> | [cllib:open-url]GET /slashdot.xml HTTP/1.0
> | [cllib:open-url]User-Agent: CLOCC/CLLIB (CLISP)
> | [cllib:open-url]Host: slashdot.org
> | [cllib:open-url]Accept: */*
> | [cllib:open-url]Connection: close
> | [cllib:open-url]<terpri>
> `----
>
> At that point clisp "hangs" until I Ctrl-C to get back to a prompt.

WFM:

> (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml")
[WITH-XML-FILE]
 * [/usr/local/src/clocc/src/cllib/entities.xml 20,432 bytes]...done [entities(%/&): 0/251] [bytes: 20,432] [run: 0.088 sec] [real: 0.104 sec]
[1] Connecting to slashdot.org:80 [timeout 86,400 sec]...done:
 [#<io unbuffered socket-stream character slashdot.org:80>]
[open-url]GET /slashdot.xml HTTP/1.0
[open-url]User-Agent: CLOCC/CLLIB/url.lisp (CLISP)
[open-url]Host: slashdot.org
[open-url]Accept: */*
[open-url]Connection: close
[open-url]<terpri>
[xml-read-from-url]"Date: Thu, 24 Oct 2002 13:52:29 GMT"
[xml-read-from-url]"Server: Apache/1.3.26 (Unix) mod_gzip/1.3.19.1a mod_perl/1.27 mod_ssl/2.8.10 OpenSSL/0.9.6g"
[xml-read-from-url]"X-Powered-By: Slash 2.003000"
[xml-read-from-url]"X-Fry: There's a lot about my face you don't know."
[xml-read-from-url]"Last-Modified: Thu, 24 Oct 2002 13:43:11 GMT"
[xml-read-from-url]"ETag: \"336705-ecd-3db7f8ef\""
[xml-read-from-url]"Accept-Ranges: bytes"
[xml-read-from-url]"Content-Length: 3789"
[xml-read-from-url]"Connection: close"
[xml-read-from-url]"Content-Type: text/xml"
 * added XML namespace: #<xml-namespace "http://slashdot.org/backslash.dtd" "ns492" 0 #x203ECD89> [prefix "backslash"]
 * new XML name: "backslash"
 * new XML name: "story"
 * new XML name: "title"
 * new XML name: "url"
 * new XML name: "time"
 * new XML name: "author"
 * new XML name: "department"
 * new XML name: "topic"
 * new XML name: "comments"
 * new XML name: "section"
 * new XML name: "image"
(#<xml-decl xml [version="1.0"] #x203E3319>
 #<xml-obj backslash [] 302/21 objects 1,821/2,326 chars #x203E33C1>)

--
Sam Steingold (http://www.podval.org/~sds) running RedHat8 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
UNIX is as friendly to you as you are to it. Windows is hostile no matter what.
From: Dave P. <da...@da...> - 2002-10-24 15:36:28

* Sam Steingold <sd...@gn...> [2002-10-24 10:00:04 -0400]:

> > At that point clisp "hangs" until I Ctrl-C to get back to a prompt.
>
> WFM:

Should "WFM" mean something to me?

--
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-10-24 15:47:37

> * In message <200...@ha...>
> * On the subject of "Re: Loading xml from an URL and slashdot.org"
> * Sent on Thu, 24 Oct 2002 16:36:20 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> * Sam Steingold <sd...@gn...> [2002-10-24 10:00:04 -0400]:
>
> > > At that point clisp "hangs" until I Ctrl-C to get back to a prompt.
> >
> > WFM:
>
> Should "WFM" mean something to me?

Sorry, I thought it was a standard abbreviation for "Works For Me".
(the first site that google returns for "WFM abbreviation" says that)

--
Sam Steingold (http://www.podval.org/~sds) running RedHat8 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Those who don't know lisp are destined to reinvent it, poorly.
From: Dave P. <da...@da...> - 2002-10-24 17:17:42

* Sam Steingold <sd...@gn...> [2002-10-24 11:49:18 -0400]:

> > Should "WFM" mean something to me?
>
> Sorry, I thought it was a standard abbreviation for "Works For Me".

Ahh, ok.

Hmm, can you think of anything useful I could do to try and debug this?

> (the first site that google returns for "WFM abbreviation" says that)

Very probably, but Google isn't always available when you're reading email.

--
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-10-24 19:50:26

> * In message <200...@ha...>
> * On the subject of "Re: Loading xml from an URL and slashdot.org"
> * Sent on Thu, 24 Oct 2002 18:17:32 +0100
> * Honorable Dave Pearson <da...@da...> writes:
>
> Hmm, can you think of anything useful I could do to try and debug
> this?

1. Try to go "low level": "telnet slashdot.org www", then type the
   commands that CLISP sends, i.e.:

        GET /slashdot.xml HTTP/1.0 <RET>
        ...
        <RET>

   and see what happens.

2. Try the same using CLISP REPL, i.e., open the socket, write the
   messages, don't forget 2 EOLs at the end of the request; don't forget
   that EOL is <CR><LF>.

3. Do not despair. Yesterday I observed that Mozilla hangs on my own
   homepage (while w3c, links, lynx, wget work fine). I restarted Mozilla
   and it worked again. You never know...

--
Sam Steingold (http://www.podval.org/~sds) running RedHat8 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
The difference between theory and practice is that in theory there isn't any.
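
Suggestion 2 in the message above (open the socket by hand from the CLISP REPL and write the request, ending every line with <CR><LF> and finishing with an empty line) might look roughly like the sketch below. It is not taken from cllib's url.lisp. MAKE-HTTP-REQUEST and FETCH-RAW are hypothetical helper names, the User-Agent value is made up, and the remaining headers simply mirror the ones shown in the traces earlier in the thread. The socket part assumes CLISP's SOCKET:SOCKET-CONNECT and EXT:MAKE-ENCODING; other Lisps will skip it.

```lisp
;; Sketch only: MAKE-HTTP-REQUEST and FETCH-RAW are hypothetical helpers,
;; not part of cllib.

(defun make-http-request (host path)
  "Return an HTTP/1.0 GET request for PATH on HOST as a string.
Every line ends in CR LF and an empty line terminates the request."
  (let ((eol (coerce (list #\Return #\Linefeed) 'string)))
    (with-output-to-string (out)
      (dolist (line (list (format nil "GET ~A HTTP/1.0" path)
                          "User-Agent: raw-socket-test" ; made-up value
                          (format nil "Host: ~A" host)
                          "Accept: */*"
                          "Connection: close"
                          ""))      ; the empty line gives the second EOL
        (write-string line out)
        (write-string eol out)))))

;; CLISP-only part. The :UNIX line terminator is requested explicitly so
;; that the linefeed we write is sent as a bare LF and the CR is not
;; doubled on platforms whose default line terminator is CR LF.
#+clisp
(defun fetch-raw (host path &key (port 80))
  "Send the request to HOST and return the response as a list of lines.
READ-LINE blocks for as long as the server stays silent, which is
the kind of hang described in this thread."
  (with-open-stream
      (sock (socket:socket-connect
             port host
             :external-format (ext:make-encoding
                               :charset charset:iso-8859-1
                               :line-terminator :unix)))
    (write-string (make-http-request host path) sock)
    (finish-output sock)
    (loop for line = (read-line sock nil nil)
          while line
          collect (string-right-trim '(#\Return) line))))
```

At the CLISP prompt, (fetch-raw "slashdot.org" "/slashdot.xml") would then return the status line and headers followed by the body, or sit there waiting if the server never answers.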
From: Dave P. <da...@da...> - 2002-10-25 12:29:59

* Sam Steingold <sd...@gn...> [2002-10-24 15:52:13 -0400]:

> > Hmm, can you think of anything useful I could do to try and debug
> > this?
>
> 1. Try to go "low level": "telnet slashdot.org www", then type the
> commands that CLISP sends, i.e.:
>
> GET /slashdot.xml HTTP/1.0 <RET>
> ...
> <RET>
>
> and see what happens.

No response.

You'll be glad to know that it isn't a clocc issue, it seems to be something
lower level and odder. For example, "lynx -head http://slashdot.org/" hangs
in a similar way (although not when pointing at other sites).

As I've said before, other methods of downloading (wget, Netscape) work
fine. Yet, at the same time, w3m and lynx fail along with my clocc based
code.

Quite odd.

> 3. Do not despair.

I think despair is a long way off, I mean, it's only slashdot and some silly
code I like to amuse myself with. <g>

> Yesterday I observed that Mozilla hangs on my own
> homepage (while w3c, links, lynx, wget work fine). I restarted Mozilla
> and it worked again. You never know...

Sadly, in this case, there's no "Mozilla" to restart.

Anyway, sorry for the bogus "heads up".

--
Dave Pearson
http://www.davep.org/