From: Robert R. <ri...@li...> - 2004-07-30 20:10:07
htd...@li... wrote:

>Message: 1
>Date: Thu, 29 Jul 2004 12:55:37 -0700 (PDT)
>From: Julia Richter <jri...@sb...>
>To: htd...@li...
>Subject: [htdig-dev] Is using htdig with Frontpage 2003 possible?
>
>Hi:
>
>I have been trying to find a search engine that will
>return items and not pages. I built a website using
>Frontpage 2003 and is hosted on IIS. Do I need to get
>a host with a Linux server to use your product?
>
>The engine that came with Frontpage only returns
>pages.
>
>I have over 5000 books in word tables on four pages.
>
>Will your search engine work for me?
>
>Everyone I talk to has a different answer from make a
>database in Access to get Atom search for $10,000!
>
>I would so appreciate any help.
>
>
>Regards,
>
>Julia

Hello Julia,

Htdig is a 'solution' that has two components:

a) An indexing 'component'. Like a robot, it will follow documents (usually accessed through the hypertext transfer protocol), boil them down to their 'keywords', and store these keywords in a (Berkeley DB) database for searching. Through the use of 'helper applications' it becomes possible to index non-HTML content (among other things: Word files).

b) A search component, usually including a search form (HTML) and a CGI script returning customizable (HTML-style) result pages. This only requires a web server (e.g. IIS) and returns its result pages.

On the other hand, I am not completely clear about what you have:

- You have '5000 books as Word documents' and want a web-based full-text search over those documents? First of all, the Word document format is not ideal for searching (since it's basically closed). Then you'll have a set of 'results' (text passages) that you need to transform (from the closed, proprietary Word format into something a little better documented, e.g. PDF) and make fit for web display (forget PDF: it requires a plugin and breaks the meaning of the web).

- Or do you have 4 pages with references (books), written as a Word table? If that's the case I'd convert them into a proper database, allowing for a proper search. That is a task for a database engineer and has nothing to do with htdig. When you convert the data you should keep an eye on using open formats, so that a big company from Redmond does not dictate your future choice of operating system. In that case you might want to look at open databases like MySQL (http://www.mysql.com), and at building a search interface to the database, e.g. with PHP (http://www.php.net).

- Or finally, do you have the full text of 5k books as Word documents, and want to search through that? That is doable with htdig, but I'd personally prefer a more open format, e.g. DocBook.

So please enlighten us.

Robert Ribnitz
Debian Maintainer of HT://Dig
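
As an illustration of the two components described in the reply above (an indexer that boils documents down to keywords, and a search step that looks those keywords up), here is a minimal sketch in Python. It is not htdig's code: an in-memory dictionary stands in for the Berkeley DB database, and the function names `build_index` and `search` and the sample documents are made up for this example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Indexing step: map each keyword to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Lower-case the text and split it into simple alphanumeric keywords.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Search step: return the documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    # Start with the matches for the first word, then narrow down word by word.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample data standing in for crawled pages.
documents = {
    "page1.html": "Moby Dick by Herman Melville",
    "page2.html": "Bleak House by Charles Dickens",
}
index = build_index(documents)
print(sorted(search(index, "herman melville")))
```

Note that a keyword index like this returns the names of matching documents, not individual rows inside them, which is why thousands of book entries on only four pages would still come back as page-level hits.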