From: Lachlan A. <lh...@ee...> - 2002-09-30 02:19:19
Greetings again Gilles,

Thanks for your feedback. I've fixed up some of the
sloppiness in the patch (patch.kurl, relative to
htdig-3.2.0b4-20020922) and done some fairly thorough
testing on it. It now only removes leading slashes if
there are at least the prescribed number. It also stores
the slash count as '0'+slashcount rather than
'\1'+slashcount. That retains the efficiency of using a
single character, but means that naive reading of the
string gives the correct result (provided slashcount < 10).
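
To illustrate the encoding described above, here is a minimal sketch. It is not the code from patch.kurl: the helper names (encodeSlashCount, decodeSlashCount, stripLeadingSlashes) are hypothetical, and it assumes that exactly the prescribed number of slashes is removed when enough are present.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: store a small slash count as one printable
// character. '0' + 2 gives '2', whereas '\1' + 2 would give the
// unprintable character '\3'.
char encodeSlashCount(int slashcount)
{
    assert(slashcount >= 0 && slashcount < 10);  // single digit only
    return static_cast<char>('0' + slashcount);
}

// Hypothetical helper: recover the count from the stored character.
int decodeSlashCount(char c)
{
    return c - '0';
}

// Hypothetical helper: remove `slashcount` leading slashes, but only
// if at least that many are actually present; otherwise leave the
// string untouched.
std::string stripLeadingSlashes(const std::string &s, int slashcount)
{
    std::size_t present = 0;
    while (present < s.size() && s[present] == '/')
        ++present;
    if (present < static_cast<std::size_t>(slashcount))
        return s;
    return s.substr(slashcount);
}
```

With these definitions, stripLeadingSlashes("//host/path", 2) gives "host/path", while stripLeadingSlashes("/path", 2) leaves "/path" unchanged because only one slash is present.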

I still haven't found out the "right" way to call test/url,
but I have called it by hand, and the results match those
of the original, with one exception. It now normalises the
host name (e.g. resolves aliases) for *all* protocols with
an IP host, rather than just http://. It still only
removes "index.html" from "http://".

I have also expanded the tests in test/url.cc to test the
new functionality, and to test alias resolution. The patch
is in "patch.test". It would be good to have a file of the
"correct" output of this file so that changes to URL.cc
could be tested simply with diff, but I didn't know how
to put that in the patch.

In the process of testing, I noticed that double slashes
('//') are normalised out *after* resolving '../'. This
means that '/foo//../' resolves to '/foo/', when UNIX would
resolve it to '/'. Is the current behaviour deliberate, or
an oversight?
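
The effect of the ordering can be reproduced with a small sketch. This is not the logic in URL.cc: collapseSlashes and resolveDotDot are hypothetical stand-ins, and resolveDotDot is assumed to treat the empty segment between two slashes as an ordinary path segment.

```cpp
#include <string>

// Hypothetical helper: collapse every run of '/' into a single '/'.
std::string collapseSlashes(const std::string &path)
{
    std::string out;
    for (char c : path) {
        if (c == '/' && !out.empty() && out.back() == '/')
            continue;  // skip the repeated slash
        out += c;
    }
    return out;
}

// Hypothetical helper: resolve each "/../" by deleting it together
// with the segment in front of it. An empty segment (from "//")
// counts as a segment here, which is what causes the difference.
std::string resolveDotDot(std::string path)
{
    std::string::size_type pos;
    while ((pos = path.find("/../")) != std::string::npos) {
        // Find the slash that starts the preceding segment.
        std::string::size_type start =
            (pos == 0) ? 0 : path.rfind('/', pos - 1);
        if (start == std::string::npos)
            start = 0;
        // Erase the segment plus "/..", keeping the trailing slash.
        path.erase(start, pos + 3 - start);
    }
    return path;
}
```

With these definitions, collapseSlashes(resolveDotDot("/foo//../")) yields "/foo/" (the order described above), whereas resolveDotDot(collapseSlashes("/foo//../")) yields "/", matching UNIX.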

Finally, what is the schedule/criterion for the release of
3.2.0b4? The KDE team won't base their code on snapshots,
and so I'd like to expedite this... Which of the items in
Geoff's regular "current status" posts have to be addressed
before the release?

Thanks!
Lachlan

On Wed, 18 Sep 2002 08:58, Gilles Detillieux wrote:
> According to Lachlan Andrew:
> > Here is a patch (url-format.patch) to allow the format
> > of external_protocol URLs to be <protocol>:<path>, rather
> > than <protocol>://<host>/<path>. It seems to work in the
> > cases I've tested, but I'm not sure how to try it on
> > the test suite, so I hope I haven't broken anything
> > else...
>
> For the 1st one, I don't have time to test it myself, but
> I'd like to wait for comments/testing from other
> developers if any is forthcoming.
>
> To answer some of the questions you ask in the code of
> your patch, the .get() after a sub seemed to be needed
> with some compilers, to avoid warnings. For whatever
> reason, we couldn't just assign the result of a .sub() to
> another String, even though .sub() is supposed to return
> a String. The .get() gives us a (char *) from that, and
> everything seems to work well that way.
>
> On line 324 of URL.cc, you say "(should also check the
> slashes are actually there...)". I agree, the code
> should do this.
>
> The way the slash count is encoded in the Dictionary
> entries is kludgy, but I think there are similar kludges
> elsewhere in the code. It's also self-contained in one
> method of the URL class, so I don't have a problem with
> it.
--
Lachlan Andrew Phone: +613 8344-3816 Fax: +613 8344-6678
Dept of Electrical and Electronic Engg CRICOS Provider Code
University of Melbourne, Victoria, 3010 AUSTRALIA 00116K