Heritrix: Internet Archive Web Crawler

Tracker: Bugs

5 User Agent is tolowercased - ID: 1113977
Last Update: Comment added ( karl-ia )

The method getUserAgent in CrawlOrder isn't correct.
The returned UA string has toLowerCase applied to it, so mobile
sites won't recognize the crawler's UA. The code below
corrects the problem.


From CrawlOrder (working):

    public String getUserAgent(CrawlURI curi) {
        if (caseFlattenedUserAgent == null) {
            try {
                caseFlattenedUserAgent = ((String)
                    httpHeaders.getAttribute(ATTR_USER_AGENT, curi));
            } catch (AttributeNotFoundException e) {
                logger.severe(e.getMessage());
            }
        }
        return caseFlattenedUserAgent;
    }

Regards,

Henrik Appelfeldt
System Specialist
3


Submitted: Nobody/Anonymous ( nobody ) - 2005-02-01 13:31
Priority: 5
Status: Closed
Resolution: Fixed
Assigned: Nobody/Anonymous
Category: None
Group: None
Visibility: Public


Comments ( 2 )

Date: 2007-03-14 00:20
Sender: karl-ia


This issue is now discussed in the new JIRA tracker at
http://webteam.archive.org/jira/browse/HER-349 -- please add further
comments at that location.


Date: 2005-02-03 23:05
Sender: stack-sf (Project Admin)

Logged In: YES
user_id=924942

Thank you for filing the issue and for the patch. I chatted
with Brad and Igor here and we all agree.

Below is what I committed and the commit message used. Closing.

Fix '[ 1113977 ] User Agent is tolowercased'
Patch contributed by Henrik Appelfeldt.
* src/java/org/archive/crawler/datamodel/CrawlOrder.java
(getUserAgent): Removed toLowerCase on found user agent.




Index: src/java/org/archive/crawler/datamodel/CrawlOrder.java
===================================================================
RCS file: /cvsroot/archive-crawler/ArchiveOpenCrawler/src/java/org/archive/crawler/datamodel/CrawlOrder.java,v
retrieving revision 1.43
diff -u -r1.43 CrawlOrder.java
--- src/java/org/archive/crawler/datamodel/CrawlOrder.java  4 Jan 2005 02:24:57 -0000  1.43
+++ src/java/org/archive/crawler/datamodel/CrawlOrder.java  3 Feb 2005 23:02:21 -0000
@@ -320,10 +320,8 @@
     public String getUserAgent(CrawlURI curi) {
         if (caseFlattenedUserAgent == null) {
             try {
-                caseFlattenedUserAgent =
-                    ((String) httpHeaders
-                        .getAttribute(ATTR_USER_AGENT, curi))
-                        .toLowerCase();
+                caseFlattenedUserAgent = ((String) httpHeaders.
+                    getAttribute(ATTR_USER_AGENT, curi));
             } catch (AttributeNotFoundException e) {
                 logger.severe(e.getMessage());
             }
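
To illustrate why the removed toLowerCase() call mattered, here is a minimal standalone sketch. It is not Heritrix code: the class UserAgentCaseDemo, the method serverRecognizes, the token "Nokia6600" and the sample UA string are all made up for the example, and it assumes a server that matches the User-Agent header case-sensitively, as the report suggests some mobile sites do.

```java
import java.util.Locale;

public class UserAgentCaseDemo {
    // Hypothetical token a mobile site might look for (assumption, not from Heritrix).
    static final String EXPECTED_TOKEN = "Nokia6600";

    // Hypothetical server-side check: a plain case-sensitive substring match.
    static boolean serverRecognizes(String userAgent) {
        return userAgent.contains(EXPECTED_TOKEN);
    }

    public static void main(String[] args) {
        // Made-up configured UA string, standing in for the operator's setting.
        String configured =
            "Mozilla/5.0 (compatible; ExampleCrawler/1.0 +http://example.org/bot) Nokia6600";

        // What the old getUserAgent effectively returned: the case-flattened form.
        String flattened = configured.toLowerCase(Locale.ROOT);

        // The configured string still contains the mixed-case token.
        System.out.println("as configured: " + serverRecognizes(configured));
        // The flattened string contains "nokia6600" instead, so the match fails.
        System.out.println("lowercased:    " + serverRecognizes(flattened));
    }
}
```

With the patch, the crawler sends the string exactly as the operator configured it, so a case-sensitive check like the one sketched here sees the expected token.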



Changes ( 4 )

Field          Old Value   Date              By
close_date     -           2005-02-03 23:05  stack-sf
status_id      Open        2005-02-03 23:05  stack-sf
resolution_id  None        2005-02-03 23:05  stack-sf
summary        User Agent  2005-02-03 23:05  stack-sf