The indent on the log file entries keeps getting bigger. It appears to be caused by exceptions that are thrown and then caught by a TryProcessor: the processors between the one that threw and the TryProcessor are left on the stack. I think the call to scraper.finishExecutingProcessor() in BaseProcessor needs to check that it is removing the same processor that it added, instead of blindly removing the one at the top of the stack; i.e. it should pop entries from the top of the stack until it reaches the one it added.
I'm running the following change to BaseProcessor.java, which I hope will work.
int level = scraper.getRunningLevel();
scraper.setExecutingProcessor(this);
Variable result = execute(scraper, context);
long executionTime = System.currentTimeMillis() - startTime;
setProperty(Constants.EXECUTION_TIME_PROPERTY_NAME, new Long(executionTime));
setProperty(Constants.VALUE_PROPERTY_NAME, result);
while (scraper.getRunningLevel() > level) {
    BaseProcessor proc = scraper.getRunningProcessor();
    if (proc == this) {
        scraper.processorFinishedExecution(proc, proc.properties);
    }
    scraper.finishExecutingProcessor();
}
The modification above is most probably invalid: if execute throws, the workaround code will not even be run.
Essentially, the call to BaseProcessor.execute in BaseProcessor.java:115 can throw exceptions that are not caught there. When this happens, the stack is not unwound as it should be, leading to the "broken indent", then to 100% CPU load and eventually to exhausted memory. This is because getRunningLevel just keeps growing, putting an ever-increasing strain on CommonUtil.replicate (which, by the way, is an inefficient implementation).
The fix should probably be to wrap Variable result = execute(scraper, context); in try..finally, so that the exception only propagates after the stack has been unwound.
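A minimal, self-contained sketch of the try..finally idea follows. The Scraper and Processor classes here are hypothetical stand-ins written only for illustration, not the real classes from the project; only the method names getRunningLevel, setExecutingProcessor and finishExecutingProcessor are borrowed from the snippet above, and TryLike merely imitates what a TryProcessor is described as doing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins for the scraper and processor classes; not the real API.
class UnwindSketch {

    /** Minimal stand-in for the scraper's stack of running processors. */
    static class Scraper {
        private final Deque<Processor> running = new ArrayDeque<>();

        int getRunningLevel() { return running.size(); }
        void setExecutingProcessor(Processor p) { running.push(p); }
        void finishExecutingProcessor() { running.pop(); }
    }

    /** Minimal stand-in for BaseProcessor. */
    static abstract class Processor {
        abstract String execute(Scraper scraper);

        String run(Scraper scraper) {
            int level = scraper.getRunningLevel(); // depth before this processor is pushed
            scraper.setExecutingProcessor(this);
            try {
                return execute(scraper);
            } finally {
                // Runs on normal return and on exception: pop this processor and
                // anything left above it, back to the depth recorded at entry.
                while (scraper.getRunningLevel() > level) {
                    scraper.finishExecutingProcessor();
                }
            }
        }
    }

    /** Always throws, like a processor hitting an error. */
    static class Failing extends Processor {
        String execute(Scraper s) { throw new IllegalStateException("boom"); }
    }

    /** Passes through to a nested processor without catching anything. */
    static class Wrapper extends Processor {
        private final Processor body;
        Wrapper(Processor body) { this.body = body; }
        String execute(Scraper s) { return body.run(s); }
    }

    /** Catches exceptions from its body, imitating a try-style processor. */
    static class TryLike extends Processor {
        private final Processor body;
        TryLike(Processor body) { this.body = body; }
        String execute(Scraper s) {
            try {
                return body.run(s);
            } catch (RuntimeException e) {
                return "recovered";
            }
        }
    }

    /** Runs TryLike -> Wrapper -> Failing and reports the stack depth afterwards. */
    static int levelAfterCaughtFailure() {
        Scraper scraper = new Scraper();
        new TryLike(new Wrapper(new Failing())).run(scraper);
        return scraper.getRunningLevel();
    }

    public static void main(String[] args) {
        System.out.println(levelAfterCaughtFailure());
    }
}
```

With the finally block in place, each processor restores the depth it saw on entry even when an exception passes through it, so the level no longer grows after a caught failure. Note that no explicit rethrow is needed: the exception continues to propagate once the finally block completes.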