This could be done by catching ExecutionException in SerialAnalyserController and ConditionalSerialAnalyserController and printing the offending document there. Then either re-throw the exception so that it is caught higher up, which currently terminates processing of the corpus, or just print the stack trace and continue processing the corpus.
In some cases the latter might be preferable: it is frustrating to have a corpus left almost fully processed because one document did not work with a PR (e.g. a missing sentence annotation might throw an exception in the minipar plugin). On the other hand, a serious condition might then cause hundreds or thousands of exceptions to be logged. A compromise might be to accept at most a dozen such errors before terminating processing of the corpus.
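A rough sketch of the compromise, in plain Java so it runs without GATE on the classpath. Everything here is hypothetical and only illustrates the idea: `TolerantCorpusRunner`, `DocumentProcessor`, `TooManyFailuresException` and the `maxFailures` parameter are made-up names, documents are represented by their names only, and the real change would presumably go into the controllers' own document loop and catch ExecutionException rather than Exception.

```java
import java.util.ArrayList;
import java.util.List;

public class TolerantCorpusRunner {

    /** Hypothetical stand-in for a PR; this is not a GATE interface. */
    public interface DocumentProcessor {
        void process(String documentName) throws Exception;
    }

    /** Hypothetical exception raised once too many documents have failed. */
    public static class TooManyFailuresException extends RuntimeException {
        public TooManyFailuresException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Processes every document, reporting failures instead of stopping at the
     * first one. Gives up once more than maxFailures documents have failed.
     * Returns the names of the documents that failed.
     */
    public static List<String> run(List<String> documentNames,
                                   DocumentProcessor processor,
                                   int maxFailures) {
        List<String> failed = new ArrayList<>();
        for (String name : documentNames) {
            try {
                processor.process(name);
            } catch (Exception e) {
                // Report which document caused the problem, then carry on.
                failed.add(name);
                System.err.println("Error while processing document '" + name + "'");
                e.printStackTrace();
                if (failed.size() > maxFailures) {
                    // Something is seriously wrong: stop instead of logging
                    // hundreds or thousands of stack traces.
                    throw new TooManyFailuresException(
                        "Giving up after " + failed.size()
                            + " failed documents; last was '" + name + "'", e);
                }
            }
        }
        return failed;
    }
}
```

With a limit of 12, a single bad document is logged and skipped while the rest of the corpus is still processed; a PR that fails on every document aborts the run at the 13th failure.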
Moved to GitHub: https://github.com/GateNLP/gate-core/issues/46