From: Igor S. <oz...@gr...> - 2000-09-07 17:14:07
To Wagner:

In the context of the email that I sent to the grub-database list titled "Pre-computed ranking vs. ranking on-the-fly": storing a pre-computed rank value has one significant disadvantage compared with generating it on-the-fly, namely that it takes a lot of effort to rebuild the database once you change the ranking parameters. Even so, I think we should go with it for now, as we get a large performance gain. Therefore, I think your module should implement the second type -- CUMULATIVE.

However, to ensure that your module will still be used in the future, I think it needs to be modular enough that we could use it just for getting the words from pages and figuring out their types and positions, without ranking or weighing them. Here is why.

Initially, we want the Ranker to be located at the Server. The Clients will pass back the full contents of pages to the Server, and the Server will use the Ranker to get the words out and figure out their type, position, and weight/rank. This way a cumulative rank will be generated, upon which the searches will be done.

In later stages of the project, we may actually move the Ranker (your module) to the Client, but its responsibility will be somewhat limited -- it will NOT rank the pages, but only "preprocess" them. This means it will get the words, associate the appropriate type with each of them (REGULAR, ANCHOR, META, TITLE, ...), record the position, and so on, and send them to the Server. The Server will do the ranking on the partially processed data, and we thereby make more use of the processing power on the Clients. I have actually included this capability in the Client/Server protocol.

Another option would be to have the Clients do the ranking, in which case the Server would be able to configure which parameters they use for ranking and what they rank upon.

But let's not worry too much about the later stages. Let's just keep them in mind so that we won't get into too much trouble rewriting code when we get there.

Cheers,

ozra.
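To make the modularity requirement above concrete, here is a minimal sketch in Python of a Ranker split into two independent steps: a "preprocess" step that only extracts words with their type and position, and a separate "rank" step that turns those into a cumulative per-word score. This is not Grub code. The names `Token`, `preprocess`, `rank`, and `WEIGHTS`, the weight values, and the rule that cumulative rank is the sum of per-type weights are all assumptions made for illustration; only the type names (REGULAR, ANCHOR, META, TITLE) come from the message.

```python
import re
from dataclasses import dataclass
from enum import Enum
from html.parser import HTMLParser
from typing import Dict, List


class WordType(Enum):
    REGULAR = "REGULAR"
    ANCHOR = "ANCHOR"
    META = "META"
    TITLE = "TITLE"


@dataclass(frozen=True)
class Token:
    word: str
    type: WordType
    position: int  # running word index within the page


# Hypothetical weights; real ranking parameters are not given in the message.
WEIGHTS: Dict[WordType, int] = {
    WordType.REGULAR: 1,
    WordType.ANCHOR: 2,
    WordType.META: 3,
    WordType.TITLE: 5,
}

_TAG_TYPES = {"title": WordType.TITLE, "a": WordType.ANCHOR}


class _Extractor(HTMLParser):
    """Collects words and tags each one with a type and a position."""

    def __init__(self) -> None:
        super().__init__()
        self.tokens: List[Token] = []
        self._open: List[str] = []  # currently open <title>/<a> tags

    def _emit(self, text: str, word_type: WordType) -> None:
        for word in re.findall(r"\w+", text.lower()):
            self.tokens.append(Token(word, word_type, len(self.tokens)))

    def handle_starttag(self, tag, attrs):
        if tag in _TAG_TYPES:
            self._open.append(tag)
        elif tag == "meta":
            content = dict(attrs).get("content")
            if content:
                self._emit(content, WordType.META)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        word_type = _TAG_TYPES[self._open[-1]] if self._open else WordType.REGULAR
        self._emit(data, word_type)


def preprocess(html: str) -> List[Token]:
    """Step 1: extract words, types, and positions. No ranking happens here."""
    extractor = _Extractor()
    extractor.feed(html)
    extractor.close()
    return extractor.tokens


def rank(tokens: List[Token], weights: Dict[WordType, int] = WEIGHTS) -> Dict[str, int]:
    """Step 2: cumulative score per word (assumed: sum of per-type weights)."""
    scores: Dict[str, int] = {}
    for token in tokens:
        scores[token.word] = scores.get(token.word, 0) + weights[token.type]
    return scores
```

With this split, the Server-side arrangement calls `rank(preprocess(page))`, while a later Client-side arrangement would call only `preprocess(page)` and send the resulting tokens to the Server, which then calls `rank` on them.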
--------------------------------------------------------------
Igor Stojanovski                 Grub.Org Inc.
Chief Technical Officer          5100 N. Brookline #830
                                 Oklahoma City, OK 73112
oz...@gr...                      Voice: (405) 917-9894
http://www.grub.org              Fax:   (405) 848-5477
--------------------------------------------------------------