Hi, I am opening this ticket because CQPweb hangs in a specific scenario.
- Input. A standard query in a corpus with 12 p-attributes (in a corpus with no extra p-attributes everything works as expected).
- Output when there is no match. No results, as expected.
- Output when there is a match. The platform hangs and consumes all the CPU; I have to restart the web server to stop this behavior.
I attach the VRT file (the test corpus only has one VRT) and the data corpus directory as a ZIP file.
I forgot to include the versions:
(a) Does this happen for a specific query, or for every query that returns some matches? In the latter case there's probably something in the corpus that confuses CQPweb.
(b) Are there also problems if you run the query directly in CQP?
(c) You can't possibly have installed this corpus in CQPweb because it's lacking the mandatory
<text id="...">
elements!
Last edit: Stephanie Evert 2023-07-21
Let's see:
Re. 3: If you installed this particular corpus in CQPweb, you must have ignored all the error messages that it shot at you. It should have outright refused to install the corpus, but perhaps it went far enough to get its database into an inconsistent state that causes the lock-up.
I didn't get any errors. I attach the steps I am following for the corpus installation. Thanks!
I can confirm that the problem is caused by the missing
<text>
tag. I ran a test that includes the tag, and it no longer hangs. But now I have a question: what does
[UNREADABLE]
mean in the query result? I attach the VRT test file and a screenshot of the result.
OK, I can now confirm that the
[UNREADABLE]
appeared because the text tag requires the id attribute. With the id attribute everything works as expected. Thanks @schtepf for your patience. The only odd thing left is that I don't receive an error message or warning when I omit the text tag.
[UNREADABLE] means that some word could not be read from the data returned by CQP. In this case, it happened because the absence of text_id mucked up the processes that break up that data for formatting.
You'll note that on your installation form screenshot, there is a notice at the top of the XML table saying that <text> and its id="..." are "... compulsory".
So they are added to the corpus definition even if you don't specify them on the form.
Unfortunately cwb-encode doesn't issue errors for declared tags that don't appear. So CQPweb doesn't know there is an error. I'd better add a check for correct text tags in input files. But this will be in 3.3 not 3.2 as I don't add new features there.
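Until such a check exists, a pre-flight check on the VRT file can be run by hand before installing. The following is only a rough sketch, not CQPweb or CWB code: the function name `check_vrt_text_tags` is hypothetical, and it assumes the usual VRT layout of one token or one XML tag per line. It flags `<text>` tags without an `id` attribute and tokens that fall outside any `<text>` element.

```python
import re

# Hypothetical helper for illustration; not part of CQPweb or CWB.
# Assumes one token or one XML tag per line, as in a typical VRT file.
TEXT_OPEN = re.compile(r'^<text(\s[^>]*)?>$')
ID_ATTR = re.compile(r'\sid="([^"]+)"')


def check_vrt_text_tags(lines):
    """Return a list of (line_number, message) problems for a list of VRT lines."""
    problems = []
    inside_text = False
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        opening = TEXT_OPEN.match(line)
        if opening:
            if inside_text:
                problems.append((number, "nested <text> element"))
            inside_text = True
            attrs = opening.group(1) or ""
            if not ID_ATTR.search(attrs):
                problems.append((number, "<text> without id attribute"))
        elif line == "</text>":
            if not inside_text:
                problems.append((number, "</text> without opening <text>"))
            inside_text = False
        elif line.startswith("<") and line.endswith(">"):
            continue  # some other tag, e.g. <s> or </s>
        elif not inside_text:
            problems.append((number, "token outside any <text> element"))
    if inside_text:
        problems.append((len(lines), "unclosed <text> element"))
    return problems


if __name__ == "__main__":
    sample = ["<s>", "Hello\tUH", "</s>"]  # no <text id="..."> wrapper
    for number, message in check_vrt_text_tags(sample):
        print(f"line {number}: {message}")
```

An empty result only means the file passed these two checks; it says nothing about the other requirements CQPweb places on a corpus.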