#73 Colligation causes crash

Status: open-works-for-me
Labels: Client (37)
Priority: 5
Updated: 2008-08-01
Created: 2007-02-26
Private: No

The colligation function appears to be broken for all the corpora that I have indexed here.

This is with the BNC written sampler, FLOB, and one or two other smallish corpora. When I go to the Collocation window and open the Colligation pane, if I first click on the tick-box and then click on POS, the following happens:

1) when I click on the tick-box the collocation-box refreshes, but just re-displays the wordform collocations;

2) then when I click on POS, the client crashes.

If I do it the other way around (i.e. first POS, then the tick-box), then I get the refresh on pressing POS and the crash on pressing the tick-box.

However, I also tried it out on the remote access BNC beta test, and that seemed to work fine.

So I'm not sure whether this is a problem with the client or with the indexes that I've got here...

Discussion

  • Lou Burnard

    Lou Burnard - 2007-02-26

    Logged In: YES
    user_id=1021146
    Originator: NO

    Does the problem go away if you reindex the corpora? As you note, it doesn't affect the current version of BNC XML, nor can I reproduce the error on a couple of corpora I have here which were indexed with versions 1.20.

    Corpora indexed earlier than this are crashing on start up for me, which is a different bug.

     
  • Andrew Hardie

    Andrew Hardie - 2007-02-27

    Logged In: YES
    user_id=1460495
    Originator: YES

    I reindexed twice, once with the wizard and once without. The problem didn't go away.

     
  • Tony Dodd

    Tony Dodd - 2007-03-01

    Logged In: YES
    user_id=1036552
    Originator: NO

    The reason this occurs is inconsistent case folding in the indexer. While the headword lists are folded correctly, the indexes themselves record unfolded values. Helpfully, the line in question in Dict.cpp is commented 'This is wrong'.

    We don't see this in BNC because POS tags there are case sensitive.
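
A minimal sketch of the kind of mismatch described above, assuming the headword list holds lower-cased keys while the index is keyed on the values exactly as they occur in the corpus. Every name here (`fold`, `PosIndex`, `buildUnfolded`, `buildFolded`, `hits`) is hypothetical and is not taken from Dict.cpp; the sketch only illustrates why a folded key can fail to match an unfolded index entry.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: lower-case an ASCII string.
std::string fold(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical index: maps a POS key to the token positions where it occurs.
using PosIndex = std::map<std::string, std::vector<int>>;

// Inconsistent build: keys are stored exactly as they appear in the corpus.
PosIndex buildUnfolded(const std::vector<std::string>& tags) {
    PosIndex index;
    for (std::size_t i = 0; i < tags.size(); ++i) {
        index[tags[i]].push_back(static_cast<int>(i));
    }
    return index;
}

// Consistent build: keys are folded the same way as the headword list.
PosIndex buildFolded(const std::vector<std::string>& tags) {
    PosIndex index;
    for (std::size_t i = 0; i < tags.size(); ++i) {
        index[fold(tags[i])].push_back(static_cast<int>(i));
    }
    return index;
}

// Look up a headword and return the number of hits, or 0 if the key is absent.
std::size_t hits(const PosIndex& index, const std::string& headword) {
    const auto it = index.find(headword);
    return it == index.end() ? 0 : it->second.size();
}
```

With tags such as `NN1`, `VVB`, `NN1`, looking up the folded headword `nn1` finds nothing in the unfolded index but two entries in the folded one. A caller that assumes every headword has an index entry and does not check for the miss is one plausible route to a crash, though that is an assumption about the client rather than something confirmed in this thread.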

     
  • Lou Burnard

    Lou Burnard - 2008-08-01
    • status: open --> open-works-for-me
     
  • Lou Burnard

    Lou Burnard - 2008-08-01

    Logged In: YES
    user_id=1021146
    Originator: NO

    I can't repeat this with BNC Sampler. Tony's explanation doesn't make much sense to me: in general, attribute values should never be case-folded in XML (@hw in BNC and other lemma forms may be considered an exception, perhaps) surely?

     