From: Joe R. J. <jj...@cl...> - 2002-03-16 00:13:39
On Fri, 15 Mar 2002, Gilles Detillieux wrote:
> Date: Fri, 15 Mar 2002 16:34:36 -0600 (CST)
> From: Gilles Detillieux <gr...@sc...>
> To: jj...@cl...
> Cc: "ht://Dig developers list" <htd...@li...>
> Subject: Re: [htdig-dev] "file name.html" -> "filename.html";(
>
> No, the code below does two things: 1) if allow_space_in_url is not
> set, the code works like the standard 3.1.x code does, i.e. it strips
> out all white space characters, and 2) if allow_space_in_url is set,
> the code strips out all white space characters other than the space
> itself - for the space character (ASCII 20 hex) it strips leading and
> trailing spaces and converts the spaces within the URL to %20. The name
> allow_space_in_url is correct, because if the attribute is false,
> no spaces are allowed - they're stripped out, just as the currently
> released code does, in accordance with RFC 2396. However, if you prefer
> encode_space_in_url we can go with that. We're not going to start putting
> all sorts of weird punctuation characters like "%" in attribute names.
>
> > >     static int allowspace = config.Boolean("allow_space_in_url", 0);
> > >     String temp;
> > >     while (*ref)
> > >     {
> > >         if (*ref == ' ' && temp.length() > 0 && allowspace)
> > >         {
> > >             // Replace space character with %20 if there's more non-space
> > >             // characters to come...
> > >             char *s = ref+1;
> > >             while (*s && isspace(*s))
> > >                 s++;
> > >             if (*s)
> > >                 temp << "%20";
> > >         }
> > >         else if (!isspace(*ref))
> > >             temp << *ref;
> > >         ref++;
> > >     }
>
> Maybe my description of the code above helps you see the rationale more
> clearly. The attribute selects both behaviours, not just the encoding.
> The reason to make it a user-selectable option is that some users may
> actually prefer htdig to follow the standards rather than ignore them
> like MS/AOL do.
>
> I'm not sure what you mean by integrating my option in the entire patch.
> The code above should be complete on its own, as a change to vanilla
> 3.1.6 URL.cc code. You don't need to integrate it with earlier proposed
> changes - just put it in both URL methods you were changing before and
> make a patch out of it.
I misunderstood. Here is the patch:
-------------------------------------8<-------------------------------------
*** htlib/URL.cc.031202 Thu Feb 7 17:15:38 2002
--- htlib/URL.cc Fri Mar 15 15:25:27 2002
***************
*** 74,82 ****
//
URL::URL(char *ref, URL &parent)
{
! String temp(ref);
! temp.remove(" \r\n\t");
! ref = temp;
_host = parent._host;
_port = parent._port;
--- 74,97 ----
//
URL::URL(char *ref, URL &parent)
{
! static int allowspace = config.Boolean("allow_space_in_url", 0);
! String temp;
! while (*ref)
! {
! if (*ref == ' ' && temp.length() > 0 && allowspace)
! {
! // Replace space character with %20 if there's more non-space
! // characters to come...
! char *s = ref+1;
! while (*s && isspace(*s))
! s++;
! if (*s)
! temp << "%20";
! }
! else if (!isspace(*ref))
! temp << *ref;
! ref++;
! }
_host = parent._host;
_port = parent._port;
***************
*** 243,255 ****
}
//*****************************************************************************
! // void URL::parse(char *u)
// Given a URL string, extract the service, host, port, and path from it.
//
! void URL::parse(char *u)
{
! String temp(u);
! temp.remove(" \t\r\n");
char *nurl = temp;
//
--- 258,286 ----
}
//*****************************************************************************
! // void URL::parse(char *ref)
// Given a URL string, extract the service, host, port, and path from it.
//
! void URL::parse(char *ref)
{
! static int allowspace = config.Boolean("allow_space_in_url", 0);
! String temp;
! while (*ref)
! {
! if (*ref == ' ' && temp.length() > 0 && allowspace)
! {
! // Replace space character with %20 if there's more non-space
! // characters to come...
! char *s = ref+1;
! while (*s && isspace(*s))
! s++;
! if (*s)
! temp << "%20";
! }
! else if (!isspace(*ref))
! temp << *ref;
! ref++;
! }
char *nurl = temp;
//
-------------------------------------8<-------------------------------------
But it failed to follow any links ;( I must have misread your instructions ;)
Any ideas?
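
To try the loop outside htdig, here is a minimal standalone sketch. It is
not the htdig code itself: it assumes std::string in place of the ht://Dig
String class and a plain bool argument in place of
config.Boolean("allow_space_in_url", 0), and the function name
normalize_url_spaces is made up for illustration. One thing it makes
visible: the loop walks ref all the way to the terminating NUL, so whatever
runs afterwards has to read the cleaned copy rather than ref. The old
constructor code did that with "ref = temp;", and the first hunk of the
patch no longer has that line, which may be worth checking.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <cctype>
#include <string>

// Hypothetical standalone version of the loop from the patch.
// allowspace stands in for the allow_space_in_url attribute.
std::string normalize_url_spaces(const char *ref, bool allowspace)
{
    std::string temp;
    for (; *ref; ++ref)
    {
        unsigned char c = static_cast<unsigned char>(*ref);
        if (c == ' ' && !temp.empty() && allowspace)
        {
            // Encode the space as %20 only if a non-space character
            // still follows; trailing spaces are dropped.
            const char *s = ref + 1;
            while (*s && std::isspace(static_cast<unsigned char>(*s)))
                ++s;
            if (*s)
                temp += "%20";
        }
        else if (!std::isspace(c))
        {
            // Leading spaces and all other white space are stripped.
            temp += *ref;
        }
    }
    // The caller must use this copy; ref now points at the end of the input.
    return temp;
}
```

With allowspace set, normalize_url_spaces("file name.html", true) gives
"file%20name.html"; with it unset, the same input gives "filename.html",
as the 3.1.x code does.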
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jj...@cl...