From: Duncan B. <dun...@su...> - 2006-04-10 08:58:00
Mikko Koivunen wrote:
> Is it possible, at all, to make Plone's livesearch (or normal search)
> somehow index the pages that Apache serves from the DMS system?

I did something like this by running a spider to extract the relevant information from the flat site and then adding it to Plone's catalog.

I didn't actually add any proxy objects inside Plone: it is quite possible to index external content without a persistent proxy object (you can just create a non-persistent object with the appropriate fields), although if you rebuild the catalog you will wipe out the index entries for the dummy objects. For the dummy objects I set the path to the real absolute URL and hotfixed a couple of points in the Plone code to allow the brains to work with absolute URLs as well as internal paths.

Unfortunately I never got around to wrapping it up as a releasable product: the main problem is that the spidering code needed to know quite a bit about the structure of the site it was spidering so as to avoid indexing the boilerplate code round the outside of each page.

We've now switched to using GoogleSA wherever possible for our indexing and searching.
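
For anyone wanting to try the non-persistent-object approach, here is a rough sketch of the idea. It is not the code from the setup described above. The class name ExternalPage, the function index_external_page, and the choice of fields are made up for illustration, and the fields assume the usual Plone indexes (Title, Description, SearchableText, portal_type). The only catalog call used is ZCatalog's catalog_object(obj, uid).

```python
class ExternalPage(object):
    """Plain (non-persistent) stand-in for a page that lives outside Plone.

    Hypothetical class: it only exposes the attributes the catalog
    indexes are assumed to look up by name.
    """

    # Assumed type name; pick whatever your search results should show.
    portal_type = "External Page"

    def __init__(self, url, title, description, text):
        self.url = url
        self._title = title
        self._description = description
        self._text = text

    def Title(self):
        return self._title

    def Description(self):
        return self._description

    def SearchableText(self):
        # Full-text index input: title, description and body together.
        return " ".join([self._title, self._description, self._text])


def index_external_page(catalog, page):
    """Add the dummy object to the catalog, keyed by its absolute URL.

    Using the absolute URL as the uid mirrors "set the path to the real
    absolute URL"; the brains will then carry that URL as their path.
    """
    catalog.catalog_object(page, uid=page.url)
```

Inside Plone this would be called as something like index_external_page(portal.portal_catalog, page) for each page the spider returns. Two caveats from the description above still apply: the entries vanish on a catalog rebuild, so the spider has to be rerun afterwards, and stock Plone code that turns a brain's path into a URL will need patching to cope with absolute URLs.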