
#73 CQPweb: Collocation analysis with POS filter is inconsistent

Milestone: TODO-3.5
Status: open
Owner: nobody
Labels: CQPweb (22)
Priority: 6
Updated: 2022-03-31
Created: 2021-03-30
Private: No

If a POS filter is applied in collocation analysis, the co-occurrence frequency counts are reduced (to instances of the collocate that are tagged accordingly), but the expected frequency is not adjusted (it should also be restricted to the respective POS tag). This results in mathematically inconsistent contingency tables.
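A minimal sketch of the mismatch, using made-up numbers and the common approximation that the expected frequency is window size times collocate frequency divided by corpus size. All names and figures here are hypothetical illustrations, not CQPweb's actual code or data.

```python
# Hypothetical figures for one collocate word form (not real corpus data).
corpus_size = 1_000_000      # N: tokens in the corpus
window_tokens = 20_000       # R1: tokens inside all collocation spans
freq_all_tags = 1_000        # corpus frequency of the word form, any POS
freq_noun = 300              # corpus frequency of the word form tagged as noun
observed_noun = 10           # co-occurrences in the spans, noun instances only


def expected_frequency(window, collocate_freq, n):
    """Expected co-occurrence frequency under independence (assumed formula)."""
    return window * collocate_freq / n


# Reported behaviour: observed count is POS-filtered, marginal frequency is not.
e_unfiltered = expected_frequency(window_tokens, freq_all_tags, corpus_size)
# Consistent table: both counts refer to the same POS-restricted population.
e_filtered = expected_frequency(window_tokens, freq_noun, corpus_size)

ratio_mixed = observed_noun / e_unfiltered       # below 1: looks like repulsion
ratio_consistent = observed_noun / e_filtered    # above 1: looks like attraction
```

With these numbers the same observed count reads as under-represented in the mixed table and as over-represented in the consistent one, so any association measure computed from the mixed table is biased against filtered collocates.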

The behaviour appears to be inherited from BNCweb and might be difficult to fix because the table of marginal frequencies has to be broken down by POS tag (and aggregated if no POS restriction is specified).
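One way to picture the required change is a marginal frequency table keyed by word form and POS tag, with aggregation over tags when no restriction is given. This is only a sketch with hypothetical names and counts; CQPweb's real frequency tables are stored differently and are not reproduced here.

```python
from collections import Counter

# Hypothetical marginal frequencies, broken down by (word form, POS tag).
marginals = Counter({
    ("run", "NN"): 300,
    ("run", "VB"): 700,
    ("walk", "VB"): 400,
})


def marginal_freq(word, pos_filter=None):
    """Corpus frequency of `word`, restricted to the tags in `pos_filter`.

    With pos_filter=None the counts are aggregated over all POS tags,
    which reproduces the unrestricted marginal frequency.
    """
    return sum(
        freq
        for (form, tag), freq in marginals.items()
        if form == word and (pos_filter is None or tag in pos_filter)
    )
```

Here `marginal_freq("run")` aggregates both tags, while `marginal_freq("run", {"NN"})` returns only the noun count that a noun-filtered contingency table would need.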

Discussion

  • Stephanie Evert - 2021-03-30

    An alternative solution would be to reduce the co-occurrence spans to token positions with the correct POS tag(s), but this is a less intuitive and less obvious interpretation of the POS filter (more similar to word sketches than to a post-hoc filter).

     
  • Stephanie Evert - 2022-03-31
    • labels: CQPweb --> CQP
    • status: open --> closed-fixed
     
  • Stephanie Evert - 2022-03-31
    • labels: CQP --> CQPweb
    • status: closed-fixed --> open
     
