From: Geoff H. <ghu...@ws...> - 2002-01-29 14:34:43
At 6:37 PM -0700 1/28/02, Neal Richter wrote:
>The libhtdig_htdig.cc is a callable replacement for the htdig executable.
>The libhtdig_htmerge.cc is a callable replacement for the htmerge executable.

I'm not sure why you'd want htmerge and not htpurge in this case (not to
mention htdump and htload). I would think that in a library, you're less
likely to need to merge whole databases together and more likely to want
to delete URLs, etc.

> sprintf(htdig_params.configFile, "/etc/htdig/htdig.conf");
> strcpy(htdig_params.credentials,"");
> strcpy(htdig_params.max_hops, ""); //9 digit limit
> strcpy(htdig_params.minimalFile, "");
> strcpy(htdig_params.URL, ""); //stdin HTTP addrs

Maybe this is just me, but I don't see how this is any more useful than
using brackets (e.g. in the defaults.cc code) or simply:

    config["authorization"] = "";
    config["max_hops"] = "-1";

etc.

>to mix and match them. Currently the parser classes receive a Retriever
>object as a parameter and issue callback-style calls to the Retriever
>object.

This seems a little strange to me. Clearly you could do this, but since
the Parser and Retriever classes are tied tightly in any model, you would
have to write completely new Parsers for any new Retriever you wrote.

If there's something that can be improved in the current system, that's
great and let's do that. It probably needs updating and/or refactoring
given the new database model we're using. But as many of us can attest,
getting the parsers to work "correctly" on all the strange files out
there is a difficult task. No one is going to want to write an HTML
parser by themselves.

Perhaps it would make more sense to me if you had an example of what
sort of Retriever you're thinking about.

>would be more elegant to have a set of PHP wrappers written in C that
>provide an interface back and forth to the core searching code.

It would be more elegant to have htsearch code that supported such an
API. At the moment, Torsten has done what he can do--massage the output
from htsearch. This is one reason for a new query parser. (And yes, the
new query parser that Quim donated could be styled to have different
query syntax if you want.)

-Geoff
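
To make the comparison of the two parameter styles above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `FixedParams` only loosely imitates the quoted `htdig_params` struct (the buffer sizes are guesses), and `Config` is a plain `std::map` standing in for a string-keyed configuration object, not the actual ht://Dig class. `get_or` and `to_fixed` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical fixed-buffer parameter block, loosely modelled on the quoted
// htdig_params snippet. The buffer sizes are assumptions.
struct FixedParams {
    char configFile[256];
    char max_hops[10];  // room for 9 digits plus the terminator
};

// Stand-in for a string-keyed configuration object. This is std::map used
// for illustration only, not the real ht://Dig configuration class.
using Config = std::map<std::string, std::string>;

// Bracket-style defaults, as in the two assignments shown in the message.
Config make_defaults() {
    Config config;
    config["authorization"] = "";
    config["max_hops"] = "-1";
    return config;
}

// Look up a key without inserting it; return `fallback` when it is absent.
std::string get_or(const Config& config, const std::string& key,
                   const std::string& fallback) {
    auto it = config.find(key);
    return it == config.end() ? fallback : it->second;
}

// Copy one value into the fixed-buffer form. The size check is the cost of
// this style: every field carries its own length limit.
FixedParams to_fixed(const Config& config) {
    FixedParams p{};
    const std::string hops = get_or(config, "max_hops", "");
    assert(hops.size() < sizeof p.max_hops);
    std::strcpy(p.max_hops, hops.c_str());
    return p;
}
```

With the keyed form, adding a new attribute needs no new struct field and no per-field length limit, which is roughly the point being made about the bracket syntax.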