I had the following URL in a page
[Twin Cities search entry |
http://twincities.citysearch.com/profile/35716516?cslink=roundup_name_cust&ulink=rounduproundupentity2-5_1_profile_2_1]
The page itself says:
BAD URL -- remove all of <, >, "
The URL contains no <, >, or ". If I strip the query string,
leaving
[Twin Cities search entry |
http://twincities.citysearch.com/profile/35716516]
the warning goes away.
Either the parser or the error message should change.
I would track it down, but I don't have time this minute.
Dan
Logged In: YES
user_id=417594
This is based on 1.3.9, by the way.
Logged In: YES
user_id=245140
I know what's going on here. The pairs of double underscores
in your url are being interpreted as bold tags in the wiki
markup language, and the url is parsed for markup prior to
the check for whether it's a safe url (i.e., whether it
contains any of <, >, "). Since the markup parser has
inserted an HTML bold tag, the url now contains <, and so
it's rejected.
So, that's a diagnosis of the problem. I haven't had a
chance to dig any deeper to see where to correct it. I'll
see about that in the next few days.
(This bug is still in the version in CVS right now, so it
has been carried along from at least 1.3.9.)
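
The ordering problem described above can be sketched in a few
lines. This is a Python illustration, not PhpWiki's actual PHP
code: the function names, the example URL, and the choice of
<strong> as the inserted tag are all assumptions made for the
sketch.

```python
import re

UNSAFE_CHARS = '<>"'

def convert_bold(text):
    # Old-style markup: __text__ becomes bold.
    # The <strong> tag is an assumption for this sketch.
    return re.sub(r"__(.*?)__", r"<strong>\1</strong>", text)

def is_safe_url(url):
    # Reject any URL containing <, >, or ".
    return not any(ch in url for ch in UNSAFE_CHARS)

# Hypothetical URL with a pair of double underscores in the query.
url = "http://example.com/profile?a=x__y__z"

# Buggy order: markup conversion runs first, so the safety check
# sees the inserted tag rather than the URL the user typed.
converted = convert_bold(url)
buggy_result = is_safe_url(converted)

# Checking the raw URL before any markup conversion avoids this.
fixed_result = is_safe_url(url)
```

With the conversion applied first, the check fails even though
the user never typed <, >, or ", which matches the misleading
"BAD URL" message in the report.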
Logged In: YES
user_id=245140
Here's what I came up with as a solution:
Look at ConvertOldMarkup() in lib/stdlib.php. Immediately
before the line
try adding the following:
and let me know if this appears to work for you. I'm not
very familiar with this part of the codebase; it works for
my test case, but it might break things of which I'm not aware.
One side-effect this has is that it changes the appearance
of unnamed external links containing pairs of "__" or pairs
of "''" (single quotes). Right now they don't work at all,
so this is probably an improvement, nonetheless.
Also, I'm not sure that this does the proper thing with
internal links containing paired underscores or single
quotes. E.g., [foo__bar__] ends up as 'foo%5F%5Fbar%5F%5F'.
Actually, I'm not sure whether it's even legal to have a
page with that name.
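
The side effects described here are consistent with
percent-encoding the markup characters inside bracketed links
before the bold/italic conversion runs. A minimal sketch of that
idea follows, in Python rather than PhpWiki's PHP. It is a guess
at the general approach, not the actual change to
ConvertOldMarkup(); protect_links and convert_bold are
hypothetical names, and the <strong> tag is an assumption.

```python
import re

# Matches a [bracketed link] with no nested brackets.
LINK = re.compile(r"\[([^\[\]]*)\]")

def protect_links(text):
    """Percent-encode '__' and two single quotes inside links."""
    def encode(match):
        body = match.group(1)
        # %5F is '_' and %27 is the single quote.
        body = body.replace("__", "%5F%5F").replace("''", "%27%27")
        return "[" + body + "]"
    return LINK.sub(encode, text)

def convert_bold(text):
    # Assumed old-markup rule: __text__ becomes bold.
    return re.sub(r"__(.*?)__", r"<strong>\1</strong>", text)

line = "__bold__ and [foo__bar__]"
result = convert_bold(protect_links(line))
```

Text outside the brackets is still converted, while the link
body survives as foo%5F%5Fbar%5F%5F, which is the behavior noted
above for internal links.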
Logged In: YES
user_id=417594
Joel,
Thanks for your suggestion. I will not have a chance to try
it out for a few weeks, probably, but I would like to try it.
A couple of thoughts:
It would be quite valuable to clarify exactly which pagenames
are legal. We are trying to allow any characters in our
pagenames. We import Amazon book titles, and they contain
spaces, colons, and so on. Thus, I hope PhpWiki ends up
supporting any characters.
The way this is normally done is with quoting schemes (e.g.,
"foo~~_bar~~_" or "q(foobar)"). This can be ugly, but
makes things possible.
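
As a rough illustration of such a quoting scheme, here is a
minimal Python sketch in which a tilde makes the next character
literal. The set of special characters and the function names
are assumptions for the example; this is not PhpWiki's escaping
code, and the exact scheme meant above may differ.

```python
# Characters that would otherwise be read as markup (assumed set).
SPECIAL = "~_'[]"
ESCAPE = "~"

def quote_name(name):
    # Prefix every special character with the escape character.
    return "".join(ESCAPE + ch if ch in SPECIAL else ch for ch in name)

def unquote_name(quoted):
    # Drop each escape character and keep the character after it.
    out = []
    chars = iter(quoted)
    for ch in chars:
        if ch == ESCAPE:
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)

quoted = quote_name("foo__bar__")
restored = unquote_name(quoted)
```

Because the escape character is itself escaped, any name
round-trips unchanged, at the cost of the ugliness mentioned
above.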
In general I've found PhpWiki's parsing to be fragile and
ad hoc. Apologies to whoever wrote it; I'm sure it was a lot
of work.
Your suggestion, I'm afraid, looks a lot like a patch on top
of a shaky parser. Please correct me if I'm wrong.
Dan
Logged In: YES
user_id=245140
Fixed in current CVS.
http://sourceforge.net/cvs/?group_id=6121
or http://phpwiki.sf.net/nightly/phpwiki.nightly.tar.gz
patch for double underscores in link URLs (old markup)
Logged In: YES
user_id=245140
I'm attaching a patch which is a more robust solution to
the problem than the one I suggested earlier. The affected
part of lib/stdlib.php is identical between 1.3.9 and
current CVS, so it should just work. If you can't get it to
apply, email me.
The underscore problem you're having shows up only with
the old markup syntax. The link you give as an example works
as is with the new markup syntax. In general, I'd suggest
using the new markup syntax.
You might want to ask on the talk list for clarification
of which pagenames are legal. I'm not sure myself: I was
able to get most punctuation to work in pagenames, but not
all (e.g., colons don't work, but spaces and underscores
do). Maybe open a new bug on that point, as well.