The (meaningless) query "<s> * </s>" (which seems quite
natural to try for people who access CQP through a Web
interface and don't know its query language) gets
caught in an infinite loop, i.e. CQP runs forever at
full CPU load.
What happens is that in the FSA constructed from the query,
<s> is a transition that does not "consume" a token, so <s>*
(Kleene star over <s>) effectively generates an eps-loop at
the start state of the FSA. Normally, such errors are
caught because the start state is also a final state (the
query "<s> *" would result in such an error message), but in
this case, the additional constraint </s> (which can never
be satisfied) inserts another transition.
There should probably be some test for eps-loops in the FSA
simulation, since such loops can never do anything useful
(unless I'm mistaken).
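To illustrate the kind of test meant here, a minimal sketch in Python (not CQP's actual code, whose FSA data structures are not shown in this ticket). It assumes a hypothetical representation where each transition is a tuple (source, label, target, consumes_token), and it assumes that </s> is zero-width like <s>. It reports whether any cycle consists only of transitions that do not consume a token:

```python
from collections import defaultdict

def has_eps_loop(transitions):
    """Return True if some cycle uses only zero-width transitions.

    transitions: iterable of (source, label, target, consumes_token).
    This tuple layout is a hypothetical model, not CQP's internal format.
    """
    # Keep only the transitions that do not consume a token.
    eps = defaultdict(list)
    for src, _label, dst, consumes_token in transitions:
        if not consumes_token:
            eps[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on current DFS path / finished
    colour = defaultdict(int)

    def visit(state):
        colour[state] = GREY
        for nxt in eps[state]:
            if colour[nxt] == GREY:            # back edge: zero-width cycle
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[state] = BLACK
        return False

    # list(...) because visit() may add keys to the defaultdict.
    return any(colour[s] == WHITE and visit(s) for s in list(eps))


# "<s> * </s>": a zero-width self-loop on the start state, plus a
# zero-width </s> transition to a second state (assumed shape).
looping = [(0, "<s>", 0, False), (0, "</s>", 1, False)]
# Same shape, but the loop consumes a token each time round.
harmless = [(0, "[]", 0, True), (0, "</s>", 1, False)]
```

With these two toy automata, has_eps_loop(looping) is true and has_eps_loop(harmless) is false. A check like this would run once on the constructed FSA, before simulation, rather than during it.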
This is especially a problem for Web interfaces, which
should try to catch queries of this type until the problem
is fixed.
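As a rough idea of what such a check in a Web interface could look like, here is a heuristic sketch in Python. It does not parse the CQP query language; it only flags an XML tag that is directly followed by an unbounded repetition operator, and the function name and the regular expression are my own assumptions. It will miss grouped forms such as (<s>)* and may give false positives when tag-like text appears inside a quoted regular expression.

```python
import re

# An XML start or end tag (<s>, </s>, <np> ...), optional whitespace,
# then an unbounded quantifier: *, + or {n,}.
_QUANTIFIED_TAG = re.compile(r"</?\w+[^<>]*>\s*(?:[*+]|\{\d+,\})")

def looks_like_empty_loop(query: str) -> bool:
    """Heuristic pre-filter: True if a tag is quantified with *, + or {n,}."""
    return _QUANTIFIED_TAG.search(query) is not None
```

For example, looks_like_empty_loop("<s> * </s>") is true, while looks_like_empty_loop("<s> []* </s>") is false because the star applies to a token pattern, not to the tag.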
A partial fix for this bug has been in place for some time (since before v3.0.0). In the specific example listed in the bug report, CQP will abort with this error message:
CQP Error:
Infinite loop detected: did you quantify over a zero-width element (XML tag or lookahead)?
If you are reasonably sure that your query is valid, please contact the CWB development team and file a bug report!
Query execution aborted.
However, due to the messy internals of the query implementation, it may still be possible to write CQP queries that trigger an infinite loop.
A clean solution will be possible when the query evaluation mechanism is completely overhauled for CWB 4.0. We should keep the ticket pending until then so we remember to consider the case of such "empty loop" queries.
I'm closing this bug and moving it to a feature request.
A postscript: Here are two examples of queries that still run into infinite loops (work in most corpora)
Most such queries are likely to be intentional malicious attacks.