Pending a longer-term solution that stops htdig from producing these
duplicates, you could pass its output through a script that scans for them.
Sorting the list of URLs by character order (alphanumeric,
reverse-alphanumeric, etc.) puts the duplicates next to, or at least very
near, each other. A script could then compare neighboring lines and eliminate
one if it differs from the other only by a trailing "/".
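
Here is a rough sketch of such a script in Python. It is only an
illustration, not anything htdig provides: it assumes the URLs arrive one per
line on standard input, and the function name drop_slash_duplicates is made
up for this example. Instead of sorting, it does a set lookup, so it does not
depend on the two forms ending up on adjacent lines (something like
".../foo-bar" can sort between ".../foo" and ".../foo/"). When both forms
are present it keeps the one with the trailing slash, which is an arbitrary
choice on my part.

```python
import sys


def drop_slash_duplicates(urls):
    """Remove URLs that duplicate another entry except for a trailing "/".

    Assumption: when both "X" and "X/" are present, "X/" is the one kept.
    Exact repeats are also collapsed; first-seen order is preserved.
    """
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()  # drop newlines and surrounding whitespace
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    # Drop "X" only when "X/" is also in the list.
    return [u for u in unique if u.endswith("/") or (u + "/") not in seen]


if __name__ == "__main__":
    for url in drop_slash_duplicates(sys.stdin):
        print(url)
```

Saved under a name of your choosing (say dedupe_urls.py), it could be run as
"python dedupe_urls.py < urls.txt > urls.deduped.txt", where both file names
are placeholders.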
> You might use global search and replace in your site creation
> software to replace instances of "http://.../cr/reneezellweger"
> with "http://.../cr/reneezellweger/". If the site is dynamic, you
> might be able to change entries in your DB in a similar way.
> -- Duke
> Jim wrote:
> > On Sat, 28 Aug 2004 ianevans@... wrote:
> >>> For the first case, I am not certain what is happening. I suspect there
> >>> is an issue with the way the web server is configured. Typically a web
> >>> server will respond with some sort of "moved" status code (e.g. 301) and
> >>> a pointer to a new location when a URL ending with a directory name is
> >>> provided without a trailing slash. For example, a request for
> >>> http://www.digitalhit.com/cr/reneezellweger
> >> I don't know if this helps, but a) we're not using mod_rewrite and b)
> >> 'cr' is actually a php file that's taking 'reneezellweger' as a database
> >> variable.