Heritrix: Internet Archive Web Crawler

Tracker: Bugs

"&amp" converted to "&" in preselector override regex - ID: 1023929
Last Update: Comment added ( karl-ia )

When trying to add the string "&amp" (without the HTML-entity-ending
";") to a regular expression in a Preselector override, the "&amp"
gets quietly converted to a single "&" character.

Multiple "&amp"s are converted into multiple "&"s.


Submitted: Dan Avery ( danavery ) - 2004-09-07 19:38
Priority: 5
Status: Closed
Resolution: Fixed
Assigned: Gordon Mohr
Category: None
Group: None
Visibility: Public


Comments ( 4 )

Date: 2007-03-14 00:16
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-242 -- please add further
comments at that location.


Date: 2005-03-02 03:09
Sender: gojomo (Project Admin)

Logged In: YES
user_id=144912

Really fixed now (hadn't committed previously).


Date: 2005-02-12 00:01
Sender: gojomo (Project Admin)

Logged In: YES
user_id=144912

It is a browser issue -- the '&amp' gets up to the crawler
OK, but then when sent back down, is rendered as '&'. (You
can look in the source and see the original '&amp'.) So far,
so good.

Real problem arises if any other setting is changed, and
plain '&' is not corrected to '&amp' before submission.
Then, rendered '&' clobbers crawler-side '&amp'.

(Further complication: '&ampamp' is not rendered in browser
as '&amp' -- perhaps it only presumes the ';' if at the end
of a value?)

Sufficient fix appears to be converting all '&' in
input-field values to '&amp;' before returning to browser.

Commit comment:
Fix for [ 1023929 ] "&amp" converted to "&" in preselector
override regex
* jobconfigure.jsp
patch '&' to '&amp;' in strings destined to be displayed
as input-field values
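
The round trip described in this comment can be illustrated with a small Java sketch. It is not the code committed to jobconfigure.jsp: the class and method names are hypothetical, and simulateBrowserDecode is only a rough stand-in for how a browser reads an attribute value (it decodes "&amp;" always, and a bare "&amp" only when no letter, digit, or "=" follows, which is consistent with the '&ampamp' observation above). It shows that an unescaped '&amp' comes back as '&', while replacing each '&' with '&amp;' before rendering preserves the stored value.

```java
/** Hypothetical helper for illustration; not the actual jobconfigure.jsp code. */
final class InputValueEscaper {
    private InputValueEscaper() {}

    /** Replace every '&' with "&amp;" before writing a value into an input field. */
    static String escapeForInputValue(String raw) {
        return raw.replace("&", "&amp;");
    }

    /**
     * Rough approximation (an assumption) of browser attribute decoding:
     * "&amp;" always becomes '&'; a bare "&amp" becomes '&' only when it is
     * not followed by a letter, digit, or '='.
     */
    static String simulateBrowserDecode(String html) {
        return html.replaceAll("&amp(?:;|(?![A-Za-z0-9=]))", "&");
    }

    public static void main(String[] args) {
        String stored = "foo&amp"; // value held crawler-side

        // Without escaping, the browser shows (and later resubmits) a clobbered value.
        System.out.println(simulateBrowserDecode(stored));

        // With escaping, the displayed value matches what the crawler stored.
        System.out.println(simulateBrowserDecode(escapeForInputValue(stored)));
    }
}
```

A production version would normally also escape quotes and angle brackets in attribute values; only the '&' case is covered here because it is the one this report concerns.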





Date: 2004-09-22 22:43
Sender: gojomo (Project Admin)

Logged In: YES
user_id=144912

might be a browser issue?


Changes ( 4 )

Field          Old Value  Date              By
assigned_to    nobody     2005-03-02 03:09  gojomo
status_id      Open       2005-02-12 00:01  gojomo
resolution_id  None       2005-02-12 00:01  gojomo
close_date     -          2005-02-12 00:01  gojomo