#1 "cut" applies too early

Milestone: TODO-4.0
Status: closed-fixed
Labels: CQP engine
Priority: 5
Updated: 2017-07-01
Created: 2006-05-15
Private: No

When using CWB for parallel corpora, the "cut" keyword
does not give the correct results: it is applied to the
first corpus, and does not take into account that there
can be restrictions on the aligned regions as well,
thus returning too few hits.
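
The effect can be reproduced with a toy model. The Python sketch below is not CQP code: the hit list, the `aligned_ok` predicate and the function name `cut_then_filter` are hypothetical and chosen only for illustration. It assumes that "cut N" keeps the first N matches.

```python
# Toy model of the reported behaviour (not actual CQP code).
# Each source hit is a corpus position; aligned_ok is a hypothetical
# stand-in for the restriction on the aligned region.

def cut_then_filter(source_hits, aligned_ok, n):
    """Apply the cut to the first-corpus hits, then check alignment."""
    truncated = source_hits[:n]              # cut applied too early
    return [h for h in truncated if aligned_ok(h)]

source_hits = list(range(10))                # 10 matches in the first corpus
aligned_ok = lambda h: h % 3 == 0            # only hits 0, 3, 6, 9 pass

result = cut_then_filter(source_hits, aligned_ok, 4)
print(result)  # [0, 3]
```

Four hits satisfy the alignment restriction and four were requested, yet only two are returned.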

Discussion

  • Stefan Evert

    Stefan Evert - 2009-08-17
    • labels: --> CQP engine
    • assigned_to: nobody --> schtepf
     
  • Stefan Evert

    Stefan Evert - 2009-08-17

    One of the many problems of the CQP query evaluation engine, with its multiple-pass architecture.

    If the CQP engine is rewritten to evaluate even complex queries in a single pass (which has many other advantages, e.g. incremental query result sets that can be serialised to disk if they do not fit into RAM), this problem will be rather trivial to solve.
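
Why a single-pass design would make this easy can be sketched in a few lines of Python. This is only an illustration under assumptions, not a description of any planned CQP implementation: `single_pass`, `source` and the alignment predicate are hypothetical names. Every hit is fully checked before it is emitted, so the cut can simply stop the evaluation once enough validated hits have been produced.

```python
from itertools import islice

def single_pass(source_hits, aligned_ok):
    """Yield only hits that satisfy every constraint, one at a time."""
    for hit in source_hits:
        if aligned_ok(hit):
            yield hit

examined = []                                # records how far evaluation got

def source():
    """Hypothetical lazy stream of first-corpus matches."""
    for pos in range(1000):
        examined.append(pos)
        yield pos

# "cut 4" becomes: take the first 4 fully validated hits, then stop.
hits = list(islice(single_pass(source(), lambda h: h % 3 == 0), 4))
print(hits)           # [0, 3, 6, 9]
print(len(examined))  # 10
```

In this sketch the cut yields the requested number of hits and still bounds the work: only 10 of the 1000 candidate positions are examined.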

     
  • Andrew Hardie

    Andrew Hardie - 2011-07-31
    • milestone: --> TODO-4.0
     
  • Stefan Evert

    Stefan Evert - 2017-07-01

    The only solution given the current implementation of CQP is to run every aligned query to completion and then apply the cut after all alignment constraints have been checked.

    This means that "cut" can no longer be used to limit the runtime of aligned queries, but we consider it more important to produce correct results.
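
The trade-off can be illustrated with a minimal Python sketch. It is a toy model rather than the actual CQP code, and `filter_then_cut` and the alignment predicate are hypothetical names. All candidates are checked against the alignment constraint before the cut is applied, so the result is correct but the whole hit list is processed.

```python
def filter_then_cut(source_hits, aligned_ok, n):
    """Check the alignment constraint for every hit, then apply the cut."""
    checked = 0
    validated = []
    for hit in source_hits:                  # query runs to completion
        checked += 1
        if aligned_ok(hit):
            validated.append(hit)
    return validated[:n], checked            # cut applied last

hits, checked = filter_then_cut(range(1000), lambda h: h % 3 == 0, 4)
print(hits)     # [0, 3, 6, 9]
print(checked)  # 1000
```

The cut now returns the requested four hits, but all 1000 candidates are examined, so it no longer shortens the runtime.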

     
  • Stefan Evert

    Stefan Evert - 2017-07-01
    • status: open --> closed-fixed
     
