From: Dave P. <da...@da...> - 2002-03-27 10:48:17
|
I'm experiencing an odd problem with `xml-read-from-url' and the slashdot
headline file. I wonder if someone can suggest a way of resolving the
problem, or even finding the cause?

As of some time yesterday, code similar to:

,----
| (cllib:xml-read-from-url "http://slashdot.org/slashdot.xml" :max-retry 0)
`----

seems to no longer work; it just hangs "forever". Whereas similar code,
pulling data out of a slashdot headline file on my local web server, works
just fine:

,----
| [11]> (cllib:xml-read-from-url "http://hagbard/slashdot.xml" :max-retry 0)
| [1/0] Connecting to hagbard:80 [timeout 86,400 sec]...done:
|  [#<io unbuffered socket-stream character hagbard:80>]
| [*] "Date: Wed, 27 Mar 2002 10:45:04 GMT"
| [*] "Server: Apache/1.3.22 (Unix) (Red-Hat/Linux) PHP/3.0.17"
| [*] "Last-Modified: Tue, 26 Mar 2002 11:14:51 GMT"
| [*] "ETag: \"180d-e65-3ca0582b\""
| [*] "Accept-Ranges: bytes"
| [*] "Content-Length: 3685"
| [*] "Connection: close"
| [*] "Content-Type: text/xml"
| (#<cllib::xml-decl xml [version="1.0"] #x20464D19>
|  #<cllib:xml-obj backslash [] 302/21 objects 1,717/2,222 chars #x20464D85>)
`----

I know that slashdot.org is listening and working because I can, for
example, wget the file I'm interested in:

,----
| davep@hagbard:~$ wget http://slashdot.org/slashdot.xml
| --10:47:37-- http://slashdot.org:80/slashdot.xml
|            => `slashdot.xml'
| Connecting to slashdot.org:80... connected!
| HTTP request sent, awaiting response... 200 OK
| Length: 3,614 [text/xml]
|
|     0K -> ... [100%]
|
| 10:47:43 (12.17 KB/s) - `slashdot.xml' saved [3614/3614]
`----

Can someone suggest what the cause of the problem might be, or how I might
figure out the cause?

-- 
Dave Pearson
http://www.davep.org/
From: Sam S. <sd...@gn...> - 2002-03-30 04:36:34
|
> * In message <200...@ha...>
> * On the subject of "`xml-read-from-url' and http://slashdot.org/slashdot.xml"
> * Sent on Wed, 27 Mar 2002 10:48:13 +0000
> * Honorable Dave Pearson <da...@da...> writes:
>
> I'm experiencing an odd problem with `xml-read-from-url' and the
> slashdot headline file. I wonder if someone can suggest a way of
> resolving the problem, or even finding the cause?

(dump-url "http://cnn.com") works (same for all hosts but /.), but
(dump-url "http://slashdot.org") hangs.

$ telnet slashdot.org 80
GET / HTTP1.0
$

works just fine (for all hosts).
i.e., when I open a socket to /. and write-line the GET (with 2 newlines
and finish-output), I do not get the page (I do on all other pages).
This is with both CMUCL and CLISP.

I am stumped and I would greatly appreciate it if some person
knowledgeable in IP (or whatever is relevant here! :-) could explain what
is going on here.

Thanks.

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.2 GNU/Linux
Keep Jerusalem united! <http://www.onejerusalem.org/Petition.asp>
Read, think and remember! <http://www.iris.org.il> <http://www.memri.org/>
The difference between genius and stupidity is that genius has its limits.
From: Dave P. <da...@da...> - 2002-03-30 09:07:31
|
* Sam Steingold <sd...@gn...> [2002-03-29 23:36:47 -0500]:

> (dump-url "http://cnn.com") works (same for all hosts but /.), but
> (dump-url "http://slashdot.org") hangs.
>
> $ telnet slashdot.org 80
> GET / HTTP1.0
> $
> works just fine (for all hosts).
> i.e., when I open a socket to /. and write-line the GET (with 2 newlines
> and finish-output), I do not get the page (I do on all other pages)
> This is with both CMUCL and CLISP.
>
> I am stumped and I would greatly appreciate it if some person
> knowledgeable in IP (or whatever is relevant here! :-) could explain what
> is going on here.

I'm guessing here, partly from testing with some other code too. I've got
this for messing around in Emacs:

-- cut here ----------------------------------------------------------------
(defun get-http-url-as-string (host location)
  (with-temp-buffer
    (let ((s (open-network-stream "http-connection" nil host "http")))
      (when s
        (set-process-filter s #'(lambda (p o) (insert o)))
        (erase-buffer)
        ;; Note: the request is terminated with a bare "\n".
        (process-send-string s (format "GET /%s\n" location))
        (while (eq (process-status s) 'open)
          (sit-for 0.01))
        (delete-process s))
      (buffer-string))))
-- cut here ----------------------------------------------------------------

and it too works with just about everything but /.. Remembering similar
problems when writing nntp.lisp I wondered if, instead of simply ending the
request with a "\n", I should be ending with "\r\n". That didn't work but
ending with "\r\n\r\n" did. Doubtless someone who fully understands HTTP (I
don't) will be able to comment.

I've not had the chance to try this with cllib yet but, perhaps, a similar
change there will make it work (I did look for a central "transmit HTTP
request" function but couldn't find one; it seems to be done using formats
throughout the code).

-- 
Dave Pearson:                   | lbdb.el - LBDB interface.
http://www.davep.org/           | sawfish.el - Sawfish mode.
Emacs:                          | uptimes.el - Record emacs uptimes.
http://www.davep.org/emacs/     | quickurl.el - Recall lists of URLs.
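The three endings Dave tried (a bare "\n", a single "\r\n", and "\r\n\r\n") are consistent with a server that keeps reading until it sees an empty CRLF-terminated line. The thread does not confirm how slashdot.org's server actually behaved, so the Python sketch below only models that hypothesis. `request_is_complete` is a made-up helper for illustration and is not part of cllib or Emacs.

```python
# Hypothetical model of a strict server: the request is considered complete
# only once the last line's CRLF is followed by an empty CRLF-terminated line.
HEADER_END = b"\r\n\r\n"


def request_is_complete(received: bytes) -> bool:
    """Return True once the bytes received so far contain the blank line."""
    return HEADER_END in received


# The three request endings described in the message above.
attempts = (
    b"GET /slashdot.xml\n",
    b"GET /slashdot.xml\r\n",
    b"GET /slashdot.xml\r\n\r\n",
)

for attempt in attempts:
    state = "complete" if request_is_complete(attempt) else "still waiting"
    print(repr(attempt), "->", state)
```

Under this model only the third ending lets the server answer; the first two leave it waiting for more input, which from the client's side looks like a hang.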
From: Doug M. <do...@wi...> - 2002-03-30 13:09:50
|
Dave Pearson <da...@da...> writes:

> Remembering similar
> problems when writing nntp.lisp I wondered if, instead of simply ending the
> request with a "\n", I should be ending with "\r\n". That didn't work but
> ending with "\r\n\r\n" did. Doubtless someone who fully understands HTTP (I
> don't) will be able to comment.

"\r\n" is the canonical line-terminator for HTTP, SMTP, NNTP and most
other text-oriented protocols. Many servers will accept a bare "\n",
but some won't.

Reading specs can be useful sometimes. ;)

-Doug <-- not much of a LISPer, but does know protocols
-- 
Doug McNaught       Wireboard Industries      http://www.wireboard.com/
  Custom software development, systems and network consulting.
  Java PostgreSQL Enhydra Python Zope Perl Apache Linux BSD...
From: Sam S. <sd...@gn...> - 2002-03-30 17:17:31
|
> * In message <m38...@va...>
> * On the subject of "Re: `xml-read-from-url' and http://slashdot.org/slashdot.xml"
> * Sent on 30 Mar 2002 08:09:37 -0500
> * Honorable Doug McNaught <do...@wi...> writes:
>
> "\r\n" is the canonical line-terminator for HTTP, SMTP, NNTP and most
> other text-oriented protocols. Many servers will accept a bare "\n",
> but some won't.

indeed.
I just fixed the problem in the CVS.
thanks.

> Reading specs can be useful sometimes. ;)

:-)

> -Doug <-- not much of a LISPer, but does know protocols

nice to have suck people here :-)

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.2 GNU/Linux
Keep Jerusalem united! <http://www.onejerusalem.org/Petition.asp>
Read, think and remember! <http://www.iris.org.il> <http://www.memri.org/>
Between grand theft and a legal fee, there only stands a law degree.
From: John D. <de...@ma...> - 2002-03-30 20:27:39
|
At 8:09 AM -0500 3/30/02, Doug McNaught wrote:
>> Remembering similar
>> problems when writing nntp.lisp I wondered if, instead of simply ending the
>> request with a "\n", I should be ending with "\r\n". That didn't work but
>> ending with "\r\n\r\n" did. Doubtless someone who fully understands HTTP (I
>> don't) will be able to comment.
>
>"\r\n" is the canonical line-terminator for HTTP, SMTP, NNTP and most
>other text-oriented protocols. Many servers will accept a bare "\n",
>but some won't.
>

By \n, I hope you don't mean Lisp #\newline. In the spec it is CRLF
(ascii 13, ascii 10) or #\return #\linefeed in Lisp. #\newline is different
in many Lisp environments based on the operating system (and rightly so
according to the spec). On the Mac it is typically ascii 13 and on Windows
it has to be converted to CRLF (two characters), at least when writing to
files.

I'm not trying to be picky, but I have been working on the MCL port for
AllegroServe. One of the big problems I ran into was the use of #\newline
instead of #\linefeed in all of the header parsing code.

Best,

John DeSoi, Ph.D.
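The same pitfall can be sketched outside Lisp. In the Python example below, the `newline` argument of `io.TextIOWrapper` stands in for a platform's newline convention, loosely analogous to what #\newline means in a given Lisp environment. The helper names `write_line_text` and `write_line_wire` are invented for this illustration.

```python
import io

CRLF = b"\r\n"  # ASCII 13, ASCII 10: the terminator the protocol asks for


def write_line_text(line: str, platform_newline: str) -> bytes:
    """Write line + "\\n" through a text layer and return the bytes produced.

    platform_newline plays the role of the operating system's convention:
    "\\n" (Unix), "\\r" (classic Mac OS) or "\\r\\n" (Windows).
    """
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="ascii", newline=platform_newline)
    text.write(line + "\n")
    text.flush()
    return raw.getvalue()


def write_line_wire(line: str) -> bytes:
    """Append an explicit CRLF, independent of any platform convention."""
    return line.encode("ascii") + CRLF


for convention in ("\n", "\r", "\r\n"):
    print(repr(convention), "->", write_line_text("GET / HTTP/1.0", convention))
print("explicit ->", write_line_wire("GET / HTTP/1.0"))
```

Only the explicit version yields the same bytes everywhere, which is why protocol code is usually safer spelling out the two characters than relying on whatever "newline" means locally.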
From: Sam S. <sd...@gn...> - 2002-03-31 19:33:50
|
> * In message <m3y...@gn...>
> * On the subject of "Re: `xml-read-from-url' and http://slashdot.org/slashdot.xml"
> * Sent on 30 Mar 2002 12:17:57 -0500
> * I write:
>
> > -Doug <-- not much of a LISPer, but does know protocols
> nice to have suck people here :-)

"such"!!!! :-)
sorry!

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.2 GNU/Linux
Keep Jerusalem united! <http://www.onejerusalem.org/Petition.asp>
Read, think and remember! <http://www.iris.org.il> <http://www.memri.org/>
(let((a'(list'let(list(list'a(list'quote a)))a)))`(let((a(quote ,a))),a))
From: Dave P. <da...@da...> - 2002-04-03 08:30:20
|
* Sam Steingold <sd...@gn...> [2002-03-30 12:17:57 -0500]:

> > "\r\n" is the canonical line-terminator for HTTP, SMTP, NNTP and most
> > other text-oriented protocols. Many servers will accept a bare "\n", but
> > some won't.
>
> indeed.
> I just fixed the problem in the CVS.
> thanks.

And my code is now working just fine again. Thanks.

-- 
Dave Pearson
http://www.davep.org/
From: Doug M. <do...@wi...> - 2002-03-30 20:42:15
|
John DeSoi <de...@ma...> writes:

> >"\r\n" is the canonical line-terminator for HTTP, SMTP, NNTP and most
> >other text-oriented protocols. Many servers will accept a bare "\n",
> >but some won't.
> >
>
> By \n, I hope you don't mean Lisp #\newline. In the spec it is CRLF
> (ascii 13, ascii 10) or #\return #\linefeed in Lisp.

You're right, I was being Unix-centric. CRLF it is.

-Doug
-- 
Doug McNaught       Wireboard Industries      http://www.wireboard.com/
  Custom software development, systems and network consulting.
  Java PostgreSQL Enhydra Python Zope Perl Apache Linux BSD...