#21 Mosaic does not highlight collocates followed by quotes or punctuation

Milestone: 1.0
Status: open
Owner: Shane
Labels: None
Updated: 2021-09-21
Created: 2019-08-28
Creator: S Luz
Private: No

I’m working hard on my paper and just encountered what I think is a fairly serious problem (for my analysis – I think I might have to go back a few steps and treat the Mosaic more cautiously).

Basically I’m looking at the word refugee in the internet corpus, and the Mosaic tells me (Local MI3 EXP Scale, though that is not important for our purposes here) that crisis is the most significant item to the left. So I clicked on crisis and it duly highlighted the lines that include the combination refugee crisis. Or rather, I thought it had highlighted them all. I proceeded to delete the lines that were not highlighted so that I could focus on the relevant lines, did my counts, etc., but because I had done some of this work a few months ago without using Mosaic, I found that a couple of the examples I was interested in were not there. To cut a long story short, I did the same exercise starting with crisis instead of refugee and noticed immediately that there were many more instances than I had seen highlighted in the Mosaic output for refugee. It turns out that Mosaic does not highlight any lines in which single or double quotes immediately precede or follow the string in question. See attached. So I now have to do the whole thing again without relying on Mosaic, which is a shame.

Can this be fixed?

Mona Baker

Update:

Just reporting further on this problem – it’s worse than I first realised, because it’s not just single or double quotes – it’s any kind of punctuation, including commas and full stops. See the last three examples below (the first and last lines are not highlighted because of the comma and full stop).

Otherwise the Mosaic is proving extremely helpful – more helpful than I anticipated. So it would be really good to fix this. At the moment I’m having to work carefully through all the lines to make sure I don’t lose all these examples and end up with unreliable analysis.

1 Attachments

Discussion

  • S Luz - 2019-08-28
    • status: open --> closed
     
  • S Luz - 2019-08-28

    Shane fixed the regexp used to highlight lines. Version e7d427... containing the fix has been uploaded to the server. Closing it.

     
  • S Luz - 2021-09-21
    • status: closed --> open
     
  • S Luz - 2021-09-21

    This problem seems to have returned. I've just added an identical bug report to the main modnlp tickets list. Reopening it.
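
For anyone trying to reproduce or regression-test this, the symptom can be illustrated outside the tool. The regexp actually used by Mosaic is not shown in this ticket, so the following is only a minimal sketch in Python (not the project's own code), assuming the failure comes from comparing whitespace-delimited tokens exactly. The function names `naive_highlight` and `tolerant_highlight` and the sample lines are hypothetical.

```python
import re

# Hypothetical concordance lines covering the cases reported in this ticket:
# plain token, quoted token, token followed by a comma, token followed by a full stop,
# plus a near-miss ("crises") that should never be highlighted.
LINES = [
    "the refugee crisis deepened",
    "a so-called 'refugee crisis' in Europe",
    "responses to the refugee crisis, and beyond",
    "this is the refugee crisis.",
    "refugee crises elsewhere",
]


def naive_highlight(line: str, collocate: str) -> bool:
    """Assumed buggy behaviour: exact match on whitespace-delimited tokens.

    A token such as "crisis," or "crisis'" is not equal to "crisis",
    so lines with adjacent punctuation are silently skipped.
    """
    return collocate.lower() in line.lower().split()


def tolerant_highlight(line: str, collocate: str) -> bool:
    """Match the collocate as a whole word, whatever punctuation surrounds it.

    The lookarounds only require that the neighbouring characters are not
    word characters, so quotes, commas and full stops are all accepted.
    """
    pattern = r"(?<!\w)" + re.escape(collocate) + r"(?!\w)"
    return re.search(pattern, line, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    for line in LINES:
        print(naive_highlight(line, "crisis"), tolerant_highlight(line, "crisis"), line)
```

Comparing the two functions over a fixed set of lines like this is one way to catch a regression: any line where the tolerant matcher fires but the highlighter does not is a candidate instance of this bug.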

     
