I've just been through the painful process of creating a urlalias.[site].txt file from an existing history log and, although this no doubt saved me a lot of time, it was also fraught with problems.
Firstly, there was the usual problem of having hundreds of URLs in the original log file that weren't actually pages. I would have thought the .pdf extension on these was a clear enough indicator that they should be skipped (a rough pre-filtering sketch follows this list).
Secondly, the list is incomplete. I know I could just use the -site= parameter, but I tried that in the first instance, which is when I noticed all the invalid page URLs getting hit.
Thirdly, pages requiring login were not indexed, even though I was already logged in on the same machine.
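
For what it's worth, here is a minimal sketch of the kind of pre-filtering I had in mind, written in Python. The file names, the extension list, and the one-URL-per-line log format are all assumptions on my part, not anything the tool itself provides, so adjust to match your actual log:

    #!/usr/bin/env python3
    """Sketch: drop non-page URLs (e.g. .pdf) from a history log
    before building a urlalias file. Assumes one URL per line."""

    from urllib.parse import urlparse

    # Extensions that are clearly not indexable pages (assumed list;
    # extend as needed for your site).
    SKIP_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".gif",
                       ".css", ".js", ".zip"}

    def is_page(url: str) -> bool:
        """Return True if the URL looks like a page rather than an asset."""
        path = urlparse(url).path.lower()
        return not any(path.endswith(ext) for ext in SKIP_EXTENSIONS)

    # Hypothetical input/output file names.
    with open("history.log") as src, open("urlalias.site.txt", "w") as dst:
        seen = set()
        for line in src:
            url = line.strip()
            if not url or url in seen or not is_page(url):
                continue  # skip blanks, duplicates, and non-page URLs
            seen.add(url)
            dst.write(url + "\n")

Something along these lines would at least have kept the hundreds of .pdf hits out of the alias file from the start.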
So, some questions:
All replies are greatly appreciated.
Thanks in advance.