How do collapsing fields work?

Help
2013-12-24
2014-01-13
  • Taka Muraoka
    2013-12-24

    The documentation says that a result set A A A B A B will be collapsed down to A A B A B (with maxDocs=2).

    This seems to imply that what OSS returns depends on the order in which the search results come back, i.e. if we sort on a different field, the results arrive in a different order and hence different documents get collapsed away.
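
To make the order dependence concrete, here is a minimal sketch of adjacent collapsing as the documentation example describes it. It is not OSS code: the function name `collapse_adjacent` and its `max_docs` parameter are hypothetical, and the rule "keep at most `max_docs` documents from each consecutive run" is inferred only from the A A A B A B example above.

```python
def collapse_adjacent(values, max_docs):
    """Keep at most max_docs items from each run of consecutive equal values.

    Hypothetical model of the documented example, not the OSS implementation.
    Returns (kept, removed), where removed holds (original_index, value) pairs.
    """
    kept = []
    removed = []
    run_value = object()  # sentinel that never equals a real field value
    run_length = 0
    for index, value in enumerate(values):
        if value == run_value:
            run_length += 1
        else:
            # A different value starts a new run.
            run_value, run_length = value, 1
        if run_length <= max_docs:
            kept.append(value)
        else:
            removed.append((index, value))
    return kept, removed


# Same six documents, two different sort orders.
print(collapse_adjacent(list("AAABAB"), 2))
print(collapse_adjacent(list("AAAABB"), 2))
```

Under this model the first ordering loses one A and the second loses two, so the set of collapsed documents does change with the sort order.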

    Is there any way to retrieve the documents that have been removed due to collapsing?

    I want to group results on a field, regardless of sort criteria e.g. A A A B A B comes back as A A A A B B. Is there any way to do this?

     
  • Did you try the "cluster" collapsing? It will do what you expect. If you set the "collapse.max" value to 0, all documents are returned, grouped.
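
As a rough illustration of the grouping described in this reply, the sketch below clusters results by field value while keeping the order in which each value first appears. The name `cluster_group` and the `max_per_group` parameter are hypothetical stand-ins, not OSS identifiers. Treating 0 as "no limit" follows the reply; how a non-zero value behaves in OSS is an assumption here.

```python
def cluster_group(values, max_per_group=0):
    """Group equal values together, ordered by first appearance.

    Hypothetical model of "cluster" collapsing, not the OSS implementation.
    max_per_group == 0 means every document is returned (per the reply above);
    truncating each group for non-zero values is an assumption.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for value in values:
        groups.setdefault(value, []).append(value)
    result = []
    for members in groups.values():
        if max_per_group == 0:
            result.extend(members)
        else:
            result.extend(members[:max_per_group])
    return result


print(cluster_group(list("AAABAB")))
```

With the six documents from the question this yields A A A A B B, which is the ordering asked for.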

     
  • Taka Muraoka
    2014-01-13

    Thanks. This seems to work, sorta.

    How exactly does OSS decide to cluster documents together? It doesn't seem to be on an exact match of the field. When I collapsed documents on title, documents with a title of "hello" got clustered correctly. When I added some new documents with a title of "hello hello", they got collapsed together with the "hello" documents. Changing the query to collapse on titleExact stopped clustering from working at all.

    Also, is the documentation correct?
    - full collapsing = applies only to the displayed results
    - optimized collapsing = applies to the entire result set.
    This seems to be the wrong way around.