
#1472 solr update responsiveness

someday
open
nobody
nobody
General
None
2013-01-24
2011-02-08
Mark Ramm
No

Oftentimes, after a ticket is changed, it still shows up in searches and bins where it no longer belongs, and it can stay that way for a few minutes. If this is already happening at today's scale, it is likely to get worse as we migrate projects over to the beta.

We should investigate the cause of these delays and determine what we can do to scale up Solr write performance.

Probably our approach will be something like the following:

For better search performance, we should use two Solr cores: a 'hot' core and a 'cold' core. The hot core receives all online updates and should be committed frequently. Periodically (maybe once a day), all the indexing operations applied to the hot core will be replayed to the cold core and the hot core will be cleared out, keeping its index small. Queries will need to hit both cores, and for any document that comes from the cold core they will need to verify that its ID is not in a 'deleted' list.
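A rough sketch of how the hot/cold scheme could behave, using plain in-memory dicts as stand-ins for the two Solr cores. Nothing here talks to Solr, and every name (`Core`, `HotColdIndex`, `oplog`, `stale`, `replay`) is hypothetical and chosen only for illustration. It also assumes the 'deleted' list covers any ID that was updated or deleted since the last replay, so an outdated cold copy never leaks into results.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Stand-in for a Solr core: maps document ID to its indexed fields."""
    docs: dict = field(default_factory=dict)

    def search(self, predicate):
        # Return every document whose fields satisfy the predicate.
        return {doc_id: doc for doc_id, doc in self.docs.items() if predicate(doc)}


@dataclass
class HotColdIndex:
    hot: Core = field(default_factory=Core)
    cold: Core = field(default_factory=Core)
    oplog: list = field(default_factory=list)  # operations since the last replay
    stale: set = field(default_factory=set)    # the 'deleted' list for cold-core IDs

    def update(self, doc_id, doc):
        # Online updates only ever touch the (small) hot core.
        self.hot.docs[doc_id] = doc
        self.oplog.append(("update", doc_id, doc))
        self.stale.add(doc_id)  # any cold copy of this ID is now outdated

    def delete(self, doc_id):
        self.hot.docs.pop(doc_id, None)
        self.oplog.append(("delete", doc_id, None))
        self.stale.add(doc_id)

    def search(self, predicate):
        # Query both cores; drop cold hits whose ID is in the 'deleted' list.
        results = {
            doc_id: doc
            for doc_id, doc in self.cold.search(predicate).items()
            if doc_id not in self.stale
        }
        results.update(self.hot.search(predicate))
        return results

    def replay(self):
        # Periodic job: replay the hot core's operations onto the cold core,
        # then clear the hot core so its index stays small.
        for op, doc_id, doc in self.oplog:
            if op == "update":
                self.cold.docs[doc_id] = doc
            else:
                self.cold.docs.pop(doc_id, None)
        self.oplog.clear()
        self.hot.docs.clear()
        self.stale.clear()
```

With this model, a ticket that was open at the last replay and is closed afterwards drops out of an "open" search immediately, because its cold copy is filtered by the `stale` set while the hot core holds the new state.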

Related

Tickets: #1523
Tickets: #5265
Forge: Site Support: #2548

Discussion

  • I think it's likely that our solution will involve keeping two Solr cores around: one that receives incremental index updates, and one that receives daily dumps from the incremental one (at which time the incremental one will be cleared out). We can then search both of them and, for any result that appears only in the 'old' core, scan the ArtifactReference collection to make sure it is still relevant.

     
    • Description has changed:

    Diff:

    --- old 
    +++ new 
    @@ -1,3 +1,7 @@
     Often times when a ticket is changed, it still shows up in searches and bins where it is no longer supposed to be -- and can stay in that state for a few minutes.    If this is showing up now with the scale we have today, it's likely to grow worse as we migrate projects over to the beta. 
    
     We should investigate the cause of these delays, and determine what we can do to scale up SOLR write performance. 
    +
    +Probably our approach will be something like the following:
    +
    +For better search performance, we should have two solr cores we use, a 'hot' core and a 'cold' core.  The hot core will be the one all online updates go to, and it should be committed frequently. Periodically (maybe once a day), all the indexing operations on the hot core will be replayed to the cold core and the hot core will be cleared out, keeping its index small.  Queries will need to query both cores, and will need to verify each document ID is not in a 'deleted' list if the document comes from the cold core.
    
     

