
#206 Searching won't scale

Status: open-later
Owner: nobody
Labels: core (102)
Priority: 4
Updated: 2000-10-26
Created: 2000-10-11
Private: No

Brokers are currently hacked to search every content tracker available.

Discussion

  • Gregory P. Smith

    We're well aware of this, and it's not a simple fix, so I'm lowering the priority while a real solution is worked on.

    In the interim, stats on how much goodness you've gotten from content trackers are now kept and used when sorting by preference, so hopefully this causes people to gravitate towards more useful trackers.
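
    The preference-sorting idea can be sketched roughly like this (a hypothetical illustration; the actual stat-keeping code isn't shown in this ticket, and the names here are made up):

    ```python
    # Hypothetical sketch: each tracker accumulates a "goodness" score
    # (e.g. useful results returned so far), and the Broker queries
    # trackers in descending score order.

    def rank_trackers(stats):
        """Sort tracker ids by accumulated goodness, best first.

        stats: dict mapping tracker id -> count of useful results seen.
        """
        return sorted(stats, key=lambda t: stats[t], reverse=True)

    stats = {"tracker-a": 3, "tracker-b": 11, "tracker-c": 7}
    print(rank_trackers(stats))  # ['tracker-b', 'tracker-c', 'tracker-a']
    ```

    Trackers that have produced little goodness drift to the end of the list, so they are tried last (or not at all), which is the gravitation effect described above.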

  • Gregory P. Smith

    • priority: 7 --> 5
    • labels: 103276 --> core
    • status: open --> open-later
  • Gregory P. Smith

    • priority: 5 --> 4
  • Jukka Santala - 2000-12-27

    The code now uses the X "best" content-trackers, but the selection could still be better: new trackers aren't tried or rotated in much, and there seems to be some odd behaviour where the content-trackers you publish to get removed from the pool, etc.

    Over the longer range, this form of content-tracker use won't scale either, since the number of content-trackers published to and searched will need to keep increasing. There are quite a few possible solutions; things currently seem to be geared towards dividing the content-trackers by "content type", but I've expressed before my opinion that the current content-types aren't a perfect option.

    One alternative would be to allow arbitrary content-types, which would then be divided into X-letter substrings starting at every letter. The resulting substrings would be used as IDs to distribute the content entries in the usual manner. Longer content types would make files harder to locate (as their entries are spread out over more trackers), which could be desirable. But searching for substrings shorter than X letters would also be harder, and the hash space wouldn't be used very evenly with the default content types. Still, this would be more finely tuned than basing things on straight content classes, and could perhaps be improved upon.
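
    The substring-ID scheme could look something like this (a minimal sketch of the idea as described, not code from the project; the function name is made up):

    ```python
    # Hypothetical sketch: split a content type into every X-letter
    # substring; each substring becomes an ID used to place the entry
    # on a tracker, so one entry lands on several trackers.

    def substring_ids(content_type, x):
        """Return the X-letter substrings starting at every position."""
        return [content_type[i:i + x] for i in range(len(content_type) - x + 1)]

    print(substring_ids("music", 3))  # ['mus', 'usi', 'sic']
    ```

    A longer content type yields more substrings, hence more trackers holding pieces of its index, which is exactly the spreading-out effect (and the locate cost) noted above.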

    Another possible approach would be to attack the problem from the user's requirement, which is producing a match. Some relief for scalability could be attained simply by stopping the search after X matches have been returned, since non-matches usually don't consume many resources, while searching for a popular file on the current system can be relatively stressing.
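
    The early-stopping idea can be sketched as follows (a hypothetical illustration; the tracker objects and query protocol here stand in for the real Broker machinery):

    ```python
    # Hypothetical sketch of "stop after X matches": query trackers in
    # preference order and return as soon as enough results are in hand,
    # so popular files never cause every tracker to be searched.

    def search(trackers, query, max_matches):
        """Query each tracker in turn, stopping once max_matches
        results have been collected."""
        matches = []
        for tracker in trackers:
            for entry in tracker(query):
                matches.append(entry)
                if len(matches) >= max_matches:
                    return matches  # enough matches: skip remaining trackers
        return matches

    # Two fake trackers standing in for remote queries:
    fast = lambda q: ["hit-1", "hit-2"]
    slow = lambda q: ["hit-3"]
    print(search([fast, slow], "popular-file", 2))  # ['hit-1', 'hit-2']
    ```

    For a popular file the first tracker alone satisfies the request, so the slow tracker is never queried; for a rare file the search still degrades to checking every tracker, as today.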

