From: <Mic...@ss...> - 2001-11-09 15:53:23
Thanks for your initial response. I have tried running htdig with -vvvv (again), as you suggest. An extract of the result is below. Can anyone confirm for me what it is saying?

My specific problem is trying to return results based on the 'postcode' written into the meta tag. If you go to www.edinburgh.gov.uk, choose search from the top menu and enter 'postcode', you will not get back any results _for which the only match is from the META tag_. (You will get lots of results where postcode appears in the body.)

Thanks in advance for any help that can be provided,

Mike

pick: www.edinburgh.gov.uk, # servers = 1
28:28:2:http://www.edinburgh.gov.uk/CEC/Recreation/Libraries/Local_Organisations/local_Action_Group.html:
Retrieval command for http://www.edinburgh.gov.uk/CEC/Recreation/Libraries/Local_Organisations/local_Action_Group.html:
GET http://www.edinburgh.gov.uk/CEC/Recreation/Libraries/Local_Organisations/local_Action_Group.html HTTP/1.0
User-Agent: htdig/3.1.2 ('')
Referer: http://www.edinburgh.gov.uk/CEC/Recreation/Libraries/Local_Organisations/local_A.html
Host: www.edinburgh.gov.uk

Header line: HTTP/1.1 200 OK
Header line: Date: Fri, 09 Nov 2001 09:40:15 GMT
Header line: Server: Apache/1.3.14 (Win32)
Header line: Last-Modified: Mon, 05 Nov 2001 17:32:57 GMT
Translated Mon, 05 Nov 2001 17:32:57 GMT to 05 Nov 2001 (101)
And converted to Mon, 05 Nov 2001
Header line: ETag: "0-fe1-3be6cd49"
Header line: Accept-Ranges: bytes
Header line: Content-Length: 4065
Header line: Connection: close
Header line: Content-Type: text/html
Header line:
returnStatus = 0
Read 4065 from document
Read a total of 4065 bytes
Tag: HTML>, matched -1
Tag: HEAD>, matched -1
Tag: META name='htdig-keywords' content='EH7 postcode 5QY postcode EH75QY postcode disabled community care disability mentally handicapped Learning disabilities PEOPLE WITH DISABILITIES Learning disabilities'>, matched 20
Tag: TITLE>, matched 0

According to Mic...@ss...:
> Can anyone tell me how HtDig is supposed to use the keywords that it finds
> in the Meta tags of a document?
> I was under the impression that HtSearch would return a document even if the
> only match for the search term was found in a Meta tag, i.e. not in the body
> text, but on our site it doesn't seem to be doing this. Have I got the
> config wrong, or have I misunderstood how meta tags work?

The accepted forms of meta keywords tags that htdig recognizes are:

<meta keywords="words">
<meta htdig-keywords="words">
<meta name="keywords" content="words">
<meta name="htdig-keywords" content="words">

The last two forms are controlled by the keywords_meta_tag_names attribute, so any names listed there are allowed in this way. The default ones are keywords and htdig-keywords. The first two forms above are not configurable.

As long as your documents are using one of these forms of the tag, and you haven't set max_keywords to 0, keywords should behave as you expect. If they don't, try running htdig with at least four -v options to see what words are being indexed for a given document.

Meta description tags are also indexed, but only to the maximum length allowed by max_meta_description_length (512 by default).

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
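
As a quick way to check a page against the four tag forms listed above, independently of htdig, something like the following Python sketch might help. It is not htdig's own parser and only approximates the description in the reply: the class and function names (MetaKeywordCollector, meta_keywords) are made up for illustration, and the default name set is assumed from the stated defaults of keywords_meta_tag_names. It does not model max_keywords or anything htsearch does at query time.

```python
from html.parser import HTMLParser

# Assumed defaults, taken from the reply above; this set only mirrors the
# keywords_meta_tag_names attribute, it is not read from any htdig config.
KEYWORDS_META_TAG_NAMES = {"keywords", "htdig-keywords"}


class MetaKeywordCollector(HTMLParser):
    """Collect words from the four meta keyword tag forms (hypothetical helper)."""

    def __init__(self, tag_names=KEYWORDS_META_TAG_NAMES):
        super().__init__()
        self.tag_names = {name.lower() for name in tag_names}
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr = {key: (value or "") for key, value in attrs}
        # Forms 1 and 2: <meta keywords="..."> and <meta htdig-keywords="...">
        for fixed in ("keywords", "htdig-keywords"):
            if fixed in attr:
                self.keywords.extend(attr[fixed].split())
        # Forms 3 and 4: <meta name="..." content="...">, where the name
        # must be one of the configured names.
        if attr.get("name", "").lower() in self.tag_names:
            self.keywords.extend(attr.get("content", "").split())


def meta_keywords(html, tag_names=KEYWORDS_META_TAG_NAMES):
    """Return the list of words found in recognised meta keyword tags."""
    parser = MetaKeywordCollector(tag_names)
    parser.feed(html)
    parser.close()
    return parser.keywords


if __name__ == "__main__":
    page = ("<html><head><meta name='htdig-keywords' "
            "content='EH7 postcode 5QY'><title>t</title></head></html>")
    print(meta_keywords(page))  # ['EH7', 'postcode', '5QY']
```

If a page's tag shows up here but the word still cannot be found through the search form, the tag form itself is probably not the problem, which matches the "matched 20" line in the log extract.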