|
From: David K. <dav...@al...> - 2003-09-30 14:50:10
|
McDonald, Bruce wrote:
> David,
Hi Bruce!
> It seems that the document is always null. Is this correct from your understanding of the pipeline/data?
Heh...the debug output is junk, really...it's supposed to echo the
*name* of the document, which isn't set, so outputs "null".
(Note to self: Strip own debugging stuff before submitting mail
next time. Less confusing. ;-) )
> this might be issue here.
Nope, that's not it. The document is set alright. It's parsed and
goes through several stages before bombing with the NPE.
> regards,
> Bruce.
David
|
|
From: Leech, J. <jl...@vi...> - 2003-09-30 15:53:52
|
I poked around the code a little bit. I didn't see where the PipelineStage
gets created, or where the config options get set, but it's possible that
more than one thread is setting the config options (suboptions) in the
HashMap at the same time. None of the access to the ConfigOption.suboptions
HashMap is synchronized (at least in the version of the code I'm looking at;
I haven't done an update in a while). That's where I would start.
-Jonathan
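To illustrate the kind of guard suggested above, here is a minimal sketch of a suboptions map whose every access goes through a synchronized wrapper. The class GuardedOption and its methods are hypothetical names made up for this example; this is not Babeldoc's ConfigOption, and whether this is the actual cause of the NPE is not established in this thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an option that owns suboptions (not Babeldoc code).
class GuardedOption {
    private final String name;
    // All reads and writes go through the synchronized wrapper.
    private final Map<String, GuardedOption> suboptions =
            Collections.synchronizedMap(new HashMap<String, GuardedOption>());

    GuardedOption(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void addSuboption(GuardedOption option) {
        suboptions.put(option.getName(), option);
    }

    GuardedOption getSuboption(String key) {
        return suboptions.get(key);
    }

    int suboptionCount() {
        return suboptions.size();
    }

    // Adds perThread distinct suboptions from each of several threads
    // and returns how many ended up in the map.
    static int hammer(int threads, final int perThread) {
        final GuardedOption root = new GuardedOption("root");
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    root.addSuboption(new GuardedOption("t" + id + "-" + i));
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return root.suboptionCount();
    }
}
```

With the wrapper in place, GuardedOption.hammer(3, 1000) should report all 3000 entries; with a bare HashMap the count is not guaranteed.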
-----Original Message-----
From: David Kinnvall [mailto:dav...@al...]
Sent: Tuesday, September 30, 2003 8:27 AM
To: bab...@li...
Subject: [Babeldoc-devel] Multithreading problems...
Hi guys,
I am still struggling with getting my pipeline(s) going together
with multiple threads. I have changed my approach from using the
threadpool pipeline processor to simply use the asynchronous
scanner feeder with a poolSize=x, where x > 1, config as follows:
feeder/config.properties:
# Scanner feeder implementation
scanner.type=asynchronous
scanner.queue=disk
scanner.queueDir=scanner/queue
scanner.queueName=scanner
scanner.poolSize=3
which gets loaded and configured by Babeldoc, as expected.
Now, to trigger the problem I just have to supply the scanner
with two or more documents to scan and submit to the pipeline
for parallel processing and the processing dies _almost_ every
time, with an NPE in VariableProcessor.mustExpand, for some
reason.
When it doesn't die, it does strange things further down the
pipeline, indicating corrupted data payload in the document,
that messes things up, albeit not causing an NPE this time.
My particular processing consists of three pipelines, where
two of them scan documents from separate sources, apply
source-specific initial processing and then call a common
main pipeline for the remaining processing tasks.
I have had the processing fail in both the initial pipelines
as well as in the later, common, one. The processing fails
most commonly in an XpathExtractPipelineStage, but now and
then it also fails in stages of other types.
Example stacktrace (sorry for the formatting):
(Oh, and the extra text after extract_fid: is just a little
debug printout I added *after* observing the problem, to
aid in my searching for the cause. It's not part of the
problem, that is.)
<2003-09-30 16:05:27,895> INFO [Thread-3] : extract_fid:processStage(ticket:1064930727756,document:null)
java.lang.NullPointerException
    at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source)
    at com.babeldoc.core.VariableProcessor.expandString(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
    at com.babeldoc.core.pipeline.stage.XpathExtractPipelineStage.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source)
    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:536)
As you see, the PooledExecutor gets going, calling the Async Feeder,
which calls the underlying Sync Feeder and off the Pipeline goes. Up
to the NPE, that is... Several stages have already been successfully
executed, in parallel, up to this point. And, as I said, it doesn't
*always* fail, and when it does, it isn't *always* in the extract_fid
stage of type XpathExtractPipelineStage.
Hmm...I managed to catch one of the other ones as well. Here:
<2003-09-30 16:25:10,800> INFO [Thread-2] : dl_router:processStage(ticket:1064931905930,document:null)
java.lang.NullPointerException
    at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source)
    at com.babeldoc.core.VariableProcessor.expandString(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
    at com.babeldoc.core.pipeline.stage.RouterPipelineStage.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.stage.CallStagePipelineStage.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
    at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source)
    at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source)
    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:536)
Any ideas?
I can provide almost any file from my configuration setup, if it
can be of any aid in tracking this down. I am currently stumped.
The reason I need the threading support to work is that a few
of the later pipeline stages can cause substantial delays, in
which case it would certainly be nice if the documents that
don't cause any delays could be happily processed in parallel, but
that's kinda obvious, I know. :-) Would be nice, though.
My personal guess at this time (I have done quite some digging in
the code, but obviously not yet enough) is that there seems to be
some kind of threading race in the code supporting the options.
Then again, that might be totally off, since I don't understand
it fully, yet.
Help?
Regards,
David Kinnvall
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
Babeldoc-devel mailing list
Bab...@li...
https://lists.sourceforge.net/lists/listinfo/babeldoc-devel
|
|
From: Leech, J. <jl...@vi...> - 2003-09-30 16:28:27
|
Hmmm. My copy of ConfigInfo doesn't have that method -- I haven't done an
update in a long while -- and cvs.sourceforge.net is refusing connections
from me at the moment. At some point in the not-too-distant future I will
update everything and put it through the multi-threaded babeldoc
pressure-cooker that I've got going.
-Jonathan
-----Original Message-----
From: McDonald, Bruce [mailto:Bru...@ba...]
Sent: Tuesday, September 30, 2003 10:17 AM
To: Leech, Jonathan; David Kinnvall; bab...@li...
Subject: RE: [Babeldoc-devel] Multithreading problems...
Jonathan,
The pipeline stage gets created in the pipelinestagefactory code. But...
There is a place in the configdata/configinfo code that actually creates
suboptions when data is found that does not have a corresponding config
option. I suspect that this is involved because the pipeline stage types
that we are dealing with here are those with suboptions - just the kind
that will be doing this kind of creating.
The entry method for this is ConfigInfo.applyConfigData. This method takes
the configuration data and applies it to the configuration options. It will
create options if necessary. It will be necessary to synchronize either
this method or on the data being fed to this method.
Please experiment with this and report back.
regards,
Bruce.
-----Original Message-----
From: Leech, Jonathan [mailto:jl...@vi...]
Sent: Tuesday, September 30, 2003 11:51 AM
To: 'David Kinnvall'; bab...@li...
Subject: RE: [Babeldoc-devel] Multithreading problems...
I poked around the code a little bit. I didn't see where the PipelineStage
gets created, or where the config options get set, but it's possible that
more than one thread is setting the config options (suboptions) in the
HashMap at the same time. None of the access to the ConfigOption.suboptions
HashMap is synchronized (at least in the version of the code I'm looking at;
I haven't done an update in a while). That's where I would start.
-Jonathan
|
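As a rough sketch of the two locking choices Bruce describes for an "apply config data" step (synchronizing the method itself, or synchronizing on the data being fed to it), something like the following could serve as a starting point. OptionStore and its methods are hypothetical names invented here, not Babeldoc's ConfigInfo, and David reports elsewhere in this thread that synchronizing applyConfigData alone did not fix his case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical option store (not Babeldoc code).
class OptionStore {
    private final Map<String, String> options = new HashMap<String, String>();

    // Variant 1: the method holds the store's own lock for the whole apply,
    // so readers using get() never observe a half-applied configuration.
    synchronized void apply(Map<String, String> data) {
        for (Map.Entry<String, String> entry : data.entrySet()) {
            // Entries are created if missing, overwritten otherwise.
            options.put(entry.getKey(), entry.getValue());
        }
    }

    // Variant 2: additionally lock the data being fed in, so that other
    // threads that also synchronize on it cannot change it mid-apply.
    void applyLockingData(Map<String, String> data) {
        synchronized (data) {
            apply(data);
        }
    }

    synchronized String get(String key) {
        return options.get(key);
    }
}
```

Variant 2 only helps if every thread that touches the data map also synchronizes on it; locking is a convention, not something the map enforces.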
|
From: David K. <dav...@al...> - 2003-09-30 17:41:57
|
Jonathan, Bruce, list,
Leech, Jonathan wrote:
> Hmmm. My copy of ConfigInfo doesn't have that method -- I haven't done an
> update in a long while -- and cvs.sourceforge.net is refusing connections
> from me at the moment. At some point in the not-too-distant future I will
> update everything and put it through the multi-threaded babeldoc
> pressure-cooker that I've got going.
My copy has it, based on 1.2.0-RC1 (but no different behavior on 1.1.9 -
haven't tested older versions).
Synchronizing the ConfigInfo.applyConfigData method unfortunately does not
help. Same result, same error. Synchronizing *all* (yeah...silly) methods
in ConfigInfo makes no difference.
Been working too long now - my brain hurts, can't think clearly. If anyone
comes up with any idea(s) or things for me to try with my config, please
yell. I'll try to get a better understanding of the thread usage of
Babeldoc in the coming days.
I find it a tiny bit strange, though, that no one else has experienced the
same or similar problems in this area. I can't be the only one running
multithreaded, right? Could there be something really *wrong* with my
setup? Strange, since it works *very* well single-threaded and has done so
for several days now, processing live data in test mode without any
problems.
Oh, well, it's solvable, it will just need some additional
thinking/testing/tinkering. ;-P
> -Jonathan
/David
|
|
From: David K. <dav...@al...> - 2003-11-04 14:24:51
|
Hi everybody!
David Kinnvall (me) wrote at a previous time:
[stuff about my multithreading problems and the lack of datapoints]
I now have a very small setup that triggers the problem, suggesting
that there is nothing really particularly strange about my setup,
but rather that it should be reproducible (debuggable) for others.
Interesting datapoint: The *first* thread seems to always survive.
It is the remaining one(s) that crash, sooner or later.
Relevant config snippets to (hopefully) be able to reproduce:
Set up a pipeline called whatever and a directory scanner that
feeds it, using the configs outlined below. Adjust as necessary
for your environment(s).
feeder/config.properties:
=========================
# Scanner feeder implementation
scanner.type=asynchronous
#scanner.type=synchronous
scanner.queue=disk
scanner.queueDir=scanner/queue
scanner.queueName=scanner
scanner.poolSize=3
pipeline/whatever/whatever.properties:
======================================
entryStage=entry
entry.stageType=Null
entry.nextStage=enricher
entry.tracked=false
enricher.stageType=XpathExtract
enricher.nextStage=writer
enricher.XPath.foo=/foo/@name
writer.stageType=FileWriter
writer.nextStage=null
writer.outputFile=output/whatever/${ticket.getValue()}_${document.get("foo")}.xml
Input document(s) - create a couple of these, named <anything>.xml:
===================================================================
<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>
<foo name="bar1"/>
Now, armed with an active Babeldoc instance, using config as
above, give a bunch (say 4) of input xml documents of the above
form to the scanner. For me it bombs (almost) every time for the
second and third threads in the enricher stage. You might have to
retry a few times, but for me it is very consistent, with an NPE
in VariableProcessor.mustExpand().
Important: You need to give multiple input documents to the
scanner at once, to trigger the problem *quickly* ...but...
However: It is certainly possible to trigger it with one single
document at a time as well! Just keep feeding the scanner+pipeline
one of the documents, watch it process it (successfully at least
once), and keep going, until a different thread from the first
gets to work on it. Boom.
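For anyone who wants to generate the batch of input documents in one go, a small helper along these lines might do. InputDocs, the file names input1.xml, input2.xml, ... and the varying name attribute (bar1, bar2, ...) are assumptions made for this sketch; only the document shape comes from the description above, and the target directory has to be whatever your scanner is configured to watch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for producing test input (not part of Babeldoc).
class InputDocs {
    // Builds one document of the form shown above; the index only
    // varies the name attribute (an assumption for this sketch).
    static String document(int index) {
        return "<?xml version=\"1.0\" encoding=\"iso-8859-1\" standalone=\"no\" ?>\n"
                + "<foo name=\"bar" + index + "\"/>\n";
    }

    // Writes count documents into dir, creating the directory if needed.
    static void writeAll(Path dir, int count) {
        try {
            Files.createDirectories(dir);
            for (int i = 1; i <= count; i++) {
                Path file = dir.resolve("input" + i + ".xml");
                Files.write(file, document(i).getBytes(StandardCharsets.ISO_8859_1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience for trying the helper out without touching a real
    // scanner directory: writes into a fresh temporary directory.
    static Path writeAllToTempDir(int count) {
        try {
            Path dir = Files.createTempDirectory("babeldoc-input");
            writeAll(dir, count);
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling InputDocs.writeAll(scannerDir, 4) drops four documents at once, which matches the "give a bunch (say 4)" step.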
The first thread seems to correctly get its configuration for
pipeline stages et al, but the subsequent configurations do
have a tendency to lack values, particularly, as Bruce noticed,
the ones of "complex" nature. A dump of the config data for an
XpathExtract stage, as configured in the snippet above, during
exception handling of one of the NPEs shows this (formatted):
com.babeldoc.core.option.ConfigData@11c2b67[
name=enricher,value=<null>,children={
stageType=com.babeldoc.core.option.ConfigData@659db7[
name=stageType,value=XpathExtract,children=<null>
],
XPath=com.babeldoc.core.option.ConfigData@1556d12[
name=XPath,value=<null>,children={
foo=com.babeldoc.core.option.ConfigData@edf389[
name=foo,value=<null>,children=<null>
]
}
],
nextStage=com.babeldoc.core.option.ConfigData@16be68f[
name=nextStage,value=writer,children=<null>
]
}
]
from which we learn that what should have been
enricher.XPath.foo=/foo/@name
has instead become
enricher.XPath.foo=<null>
which wasn't expected by the pipeline stage. Hence the NPE.
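One way to spot this condition before it turns into an NPE would be to walk the config tree and report every leaf whose value is null. The sketch below assumes a simplified node with the same three fields as the dump (name, value, children); ConfigNode is a hypothetical class written for this example, not Babeldoc's ConfigData.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mirror of the name/value/children shape in the dump.
class ConfigNode {
    final String name;
    final String value;
    final Map<String, ConfigNode> children = new LinkedHashMap<String, ConfigNode>();

    ConfigNode(String name, String value) {
        this.name = name;
        this.value = value;
    }

    // Adds a child and returns this node, so trees can be built inline.
    ConfigNode add(ConfigNode child) {
        children.put(child.name, child);
        return this;
    }

    // Returns the dotted paths of all leaf nodes that have no value.
    static List<String> nullLeaves(ConfigNode root) {
        List<String> out = new ArrayList<String>();
        collect(root, root.name, out);
        return out;
    }

    private static void collect(ConfigNode node, String path, List<String> out) {
        if (node.children.isEmpty()) {
            // Only leaves are expected to carry a value.
            if (node.value == null) {
                out.add(path);
            }
            return;
        }
        for (ConfigNode child : node.children.values()) {
            collect(child, path + "." + child.name, out);
        }
    }
}
```

Run against a tree shaped like the dump above, this reports enricher.XPath.foo and nothing else, since enricher and XPath have children and are not treated as missing.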
As before, I need help in resolving this. I have so far been
utterly unable to find a solution on my own. I hope that the
additional data in the above text might be of some assistance
for others to have a go at the problem as well, or at least
verify that the problem exists outside of my particular setup.
Best regards,
David
|
|
From: McDonald, B. <Bru...@ba...> - 2003-09-30 16:57:18
|
David,
Ok, let's find out exactly what the issue is - is it always the same
configuration option that is failing, or is it some random option? Can you
surround the getOptionList call in XPathExtract with a try/catch and then
dump the configuration data from the pipeline stage and the attributes on
the document?
regards,
Bruce.
-----Original Message-----
From: David Kinnvall [mailto:dav...@al...]
Sent: Tuesday, September 30, 2003 10:50 AM
To: McDonald, Bruce; bab...@li...
Subject: Re: [Babeldoc-devel] Multithreading problems...
McDonald, Bruce wrote:
> David,
Hi Bruce!
> It seems that the document is always null. Is this correct from your understanding of the pipeline/data?
Heh...the debug output is junk, really...it's supposed to echo the
*name* of the document, which isn't set, so outputs "null".
(Note to self: Strip own debugging stuff before submitting mail
next time. Less confusing. ;-) )
> this might be issue here.
Nope, that's not it. The document is set alright. It's parsed and
goes through several stages before bombing with the NPE.
> regards,
> Bruce.
David
|
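The try/catch-and-dump approach suggested above could look roughly like this generic wrapper. DiagnosticCall and withDump are hypothetical names for this sketch; the real call to wrap would be the stage's getOptionList, and the state map would be filled from whatever the stage and document actually expose.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical diagnostic wrapper (not part of Babeldoc).
class DiagnosticCall {
    // Runs the call; on a runtime failure, prints the supplied state
    // and rethrows the original exception unchanged.
    static <T> T withDump(String label, Map<String, ?> state, Supplier<T> call) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            System.err.println(label + " failed with " + e + "; state dump: " + state);
            throw e;
        }
    }
}
```

Rethrowing keeps the original stack trace intact, so the dump adds information without changing how the pipeline reacts to the failure.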
|
From: David K. <dav...@al...> - 2003-09-30 16:06:11
Attachments:
babeldoc_2.txt
babeldoc_1.txt
|
McDonald, Bruce wrote:
> David,
>
> Ok, Lets find out exactly what the issue is - is it always the same configuration option that is failing or is some random option. Can you surround the getOptionList in XPathExtract with a try/catch and then dump the configuration data from the pipelinestage stage and the attributes on the document.
Hmm...let's see...I managed to trigger a failure twice in the same
stage with the same two input documents. The stage configuration is:
# XPath Extract - get the file_id attribute
extract_fid.stageType=XpathExtract
extract_fid.nextStage=file_writer
extract_fid.failOnError=false
extract_fid.errorStage=error
extract_fid.XPath.orig_file_id=/newsitem/@file_id
And the (horribly unformatted) dump of doc attributes and pipeline
stage configuration are attached for both runs.
Initial observation (I think):
- failOnError is null, as is
- errorStage, as well as
- XPath.orig_file_id
in both cases. Correct? Any clues from that?
> regards,
> Bruce.
David
|
|
From: McDonald, B. <Bru...@ba...> - 2003-09-30 17:06:03
|
David,
It seems that the document is always null. Is this correct from your understanding of the pipeline/data?
This might be the issue here.
regards,
Bruce.
-----Original Message-----
From: David Kinnvall [mailto:dav...@al...]
Sent: Tuesday, September 30, 2003 10:27 AM
To: bab...@li...
Subject: [Babeldoc-devel] Multithreading problems...
Hi guys,
I am still struggling with getting my pipeline(s) going together
with multiple threads. I have changed my approach from using the
threadpool pipeline processor to simply use the asynchronous
scanner feeder with a poolSize=x, where x > 1, config as follows:
feeder/config.properties:
# Scanner feeder implementation
scanner.type=asynchronous
scanner.queue=disk
scanner.queueDir=scanner/queue
scanner.queueName=scanner
scanner.poolSize=3
which gets loaded and configured by Babeldoc, as expected.
Now, to trigger the problem I just have to supply the scanner
with two or more documents to scan and submit to the pipeline
for parallel processing and the processing dies _almost_ every
time, with an NPE in VariableProcessor.mustExpand, for some
reason.
When it doesn't die, it does strange things further down the
pipeline, indicating corrupted data payload in the document,
that messes things up, albeit not causing an NPE this time.
My particular processing consists of three pipelines, where
two of them scan documents from separate sources, apply
source-specific initial processing and then call a common
main pipeline for the remaining processing tasks.
I have had the processing fail in both the initial pipelines
as well as in the later, common, one. The processing fails
most commonly in an XpathExtractPipelineStage, but now and
then it also fails in stages of other types.
Example stacktrace (sorry for the formatting):
(Oh, and the extra text after extract_fid: is just a little
debug printout I added *after* observing the problem, to
aid in my searching for the cause. It's not part of the
problem, that is.)
<2003-09-30 16:05:27,895> INFO [Thread-3] : extract_fid:processStage(ticket:1064930727756,document:null)
java.lang.NullPointerException
at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source)
at com.babeldoc.core.VariableProcessor.expandString(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
at com.babeldoc.core.pipeline.stage.XpathExtractPipelineStage.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source)
at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source)
at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Thread.java:536)
As you see, the PooledExecutor gets going, calling the Async Feeder,
which calls the underlying Sync Feeder and off the Pipeline goes. Up
to the NPE, that is... Several stages have already been successfully
executed, in parallel, up to this point. And, as I said, it doesn't
*always* fail, and when it does, it isn't *always* in the extract_fid
stage of type XpathExtractPipelineStage.
Hmm...I managed to catch one of the other ones as well. Here:
<2003-09-30 16:25:10,800> INFO [Thread-2] : dl_router:processStage(ticket:1064931905930,document:null)
java.lang.NullPointerException
at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source)
at com.babeldoc.core.VariableProcessor.expandString(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source)
at com.babeldoc.core.pipeline.stage.RouterPipelineStage.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.stage.CallStagePipelineStage.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResult(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStageResults(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipelineStage(Unknown Source)
at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source)
at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source)
at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source)
at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Thread.java:536)
Any ideas?
I can provide almost any file from my configuration setup, if it
can be of any aid in tracking this down. I am currently stumped.
The reason I need the threading support to work is that a few
of the later pipeline stages can incur substantial delays, in
which case it would certainly be nice if the documents that
don't cause any delays could be happily processed in parallel, but
that's kinda obvious, I know. :-) Would be nice, though.
My personal guess at this time (I have done quite some digging in
the code, but obviously not yet enough) is that there seems to be
some kind of threading race in the code supporting the options.
Then again, that might be totally off, since I don't understand
it fully, yet.
Help?
Regards,
David Kinnvall
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
Babeldoc-devel mailing list
Bab...@li...
https://lists.sourceforge.net/lists/listinfo/babeldoc-devel
|
|
From: McDonald, B. <Bru...@ba...> - 2003-09-30 18:22:13
|
Jonathan is onto something here... Let me respond to his email.
|
|
From: McDonald, B. <Bru...@ba...> - 2003-09-30 18:53:44
|
Jonathan,
The pipeline stage gets created in the PipelineStageFactory code. But...
There is a place in the configdata/configinfo code that creates suboptions when data is found that has no corresponding config option. I suspect this is involved, because the pipeline stage types we are dealing with here are the ones with suboptions - just the kind that would be doing this kind of creating.
The entry method for this is ConfigInfo.applyConfigData. This method takes the configuration data and applies it to the configuration options, creating options if necessary. It will be necessary to synchronize either this method or on the data being fed to it. Please experiment with this and report back.
regards,
Bruce.
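As a rough illustration of the suggestion above, here is a minimal sketch of guarding lazily created suboptions with a lock. It assumes the suboptions live in a plain HashMap shared between worker threads. SuboptionStore and runDemo are invented for the example, and this applyConfigData is only a stand-in that borrows the name, not the real ConfigInfo or ConfigOption code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an option holder with lazily created suboptions.
// It is NOT the real Babeldoc ConfigOption/ConfigInfo code.
class SuboptionStore {
    private final Map<String, String> suboptions = new HashMap<>();

    // Applies config data, creating suboptions that do not exist yet.
    // synchronized: only one thread at a time may mutate the shared map.
    synchronized void applyConfigData(Map<String, String> data) {
        for (Map.Entry<String, String> e : data.entrySet()) {
            if (!suboptions.containsKey(e.getKey())) {
                suboptions.put(e.getKey(), e.getValue());
            }
        }
    }

    // Reads take the same lock so they never observe the map mid-update.
    synchronized String get(String name) {
        return suboptions.get(name);
    }

    synchronized int size() {
        return suboptions.size();
    }

    // Small driver: several threads apply disjoint option sets concurrently.
    static SuboptionStore runDemo(int threads, int optionsPerThread) {
        SuboptionStore store = new SuboptionStore();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                Map<String, String> data = new HashMap<>();
                for (int i = 0; i < optionsPerThread; i++) {
                    data.put("XPath.opt_" + id + "_" + i, "/newsitem/@file_id");
                }
                store.applyConfigData(data);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return store;
    }
}
```

Synchronizing the whole method is the bluntest option. Locking on the data object being fed in, as also suggested above, would narrow the critical section, but whether that is enough depends on where else the map is read.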
|
|
From: McDonald, B. <Bru...@ba...> - 2003-09-30 19:00:22
|
Wow - that sounds like a great plan - you are the master of things multithreaded!! -----Original Message----- From: Leech, Jonathan [mailto:jl...@vi...] Sent: Tuesday, September 30, 2003 12:26 PM To: McDonald, Bruce; David Kinnvall; bab...@li... Subject: RE: [Babeldoc-devel] Multithreading problems... Hmmm. My copy of ConfigInfo doesn't have that method -- I haven't done an update in a long while -- and cvs.sourceforge.net is refusing connections from me at the moment. At some point in the not-too-distant future I will update everything and put it through the multi-threaded babeldoc pressure-cooker that I've got going. -Jonathan -----Original Message----- From: McDonald, Bruce [mailto:Bru...@ba...] Sent: Tuesday, September 30, 2003 10:17 AM To: Leech, Jonathan; David Kinnvall; bab...@li... Subject: RE: [Babeldoc-devel] Multithreading problems... Jonathan, The pipeline stage gets created in the pipelinestagefactory code. But... There is a place in the configdata/configinfo code that actually creates suboptions when data is found that does not have a corresponding config option. I suspect that this is involved because the pipeline stage type that we are dealing with here are those with suboptions - just the kind that will be doing this kind of creating. The entry method for this is ConfigInfo.applyConfigData. This method takes the configuration data and applies it to the configuration options. It will create options if necessary. It will be necessary to synchronize either this method or on the data being fed to this method. Please experiment with this and report back. regards, Bruce. -----Original Message----- From: Leech, Jonathan [mailto:jl...@vi...] Sent: Tuesday, September 30, 2003 11:51 AM To: 'David Kinnvall'; bab...@li... Subject: RE: [Babeldoc-devel] Multithreading problems... I poked around the code a little bit. 
I didn't see where the PipelineStage gets created, or the config options get set, but its possible that more than one thread is setting the config options (suboptions) at the same time in the HashMap. None of the access to the ConfigOption.suboptions HashMap is synchronized (at least in the version of code I'm looking at, haven't done an update in a while). That's where I would start. -Jonathan -----Original Message----- From: David Kinnvall [mailto:dav...@al...] Sent: Tuesday, September 30, 2003 8:27 AM To: bab...@li... Subject: [Babeldoc-devel] Multithreading problems... Hi guys, I am still struggling with getting my pipeline(s) going together with multiple threads. I have changed my approach from using the threadpool pipeline processor to simply use the asynchronous scanner feeder with a poolSize=x, where x > 1, config as follows: feeder/config.properties: # Scanner feeder implementation scanner.type=asynchronous scanner.queue=disk scanner.queueDir=scanner/queue scanner.queueName=scanner scanner.poolSize=3 which gets loaded and configured by Babeldoc, as expected. Now, to trigger the problem I just have to supply the scanner with two or more documents to scan and submit to the pipeline for parallel processing and the processing dies _almost_ every time, with an NPE in VariableProcessor.mustExpand, for some reason. When it doesn't die, it does strange things further down the pipeline, indicating corrupted data payload in the document, that messes things up, albeit not causing an NPE this time. My particular processing consists of three pipelines, where two of them scan documents from separate sources, applies source-specific initial processing and then call a common main pipeline for the remaining processing tasks. I have had the processing fail in both the initial pipelines as well as in the later, common, one. The processing fails most commonly in an XpathExtractPipelineStage, but now and then it also fails in stages of other types. 
Example stacktrace (sorry for the formatting): (Oh, and the extra text after extract_fid: is just a little debug printout I added *after* observing the problem, to aid in my searching for the cause. It's not part of the problem, that is.) <2003-09-30 16:05:27,895> INFO [Thread-3] : extract_fid:processStage(ticket:1064930727756,document:null) java.lang.NullPointerException at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source) at com.babeldoc.core.VariableProcessor.expandString(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source) at com.babeldoc.core.pipeline.stage.XpathExtractPipelineStage.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source) 
at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source) at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source) at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source) at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source) at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Thread.java:536) As you see, the PooledExecutor gets going, calling the Async Feeder, which calls the underlying Sync Feeder and of the Pipeline goes. Up to the NPE, that is... Several stages have already been successfully executed, in parallel, up to this point. And, as I said, it doesn't *always* fail, and when it does, it isn't *always* in the extract_fid stage of type XpathExtractPipelineStage. Hmm...I managed to catch one of the other ones as well. Here: <2003-09-30 16:25:10,800> INFO [Thread-2] : dl_router:processStage(ticket:1064931905930,document:null) java.lang.NullPointerException at com.babeldoc.core.VariableProcessor.mustExpand(Unknown Source) at com.babeldoc.core.VariableProcessor.expandString(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.templatize(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.getOptionList(Unknown Source) at com.babeldoc.core.pipeline.stage.RouterPipelineStage.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel 
ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source) at com.babeldoc.core.pipeline.stage.CallStagePipelineStage.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineStage.processStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at 
com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResult(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStageResults(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.processPipel ineStage(Unknown Source) at com.babeldoc.core.pipeline.processor.SyncPipelineStageProcessor.process(Unkn own Source) at com.babeldoc.core.pipeline.PipelineStageFactory.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineFactory.process(Unknown Source) at com.babeldoc.core.pipeline.PipelineFactoryFactory.process(Unknown Source) at com.babeldoc.core.pipeline.feeder.SynchronousFeeder.process(Unknown Source) at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder.actuallyProcess(Unknown Source) at com.babeldoc.core.pipeline.feeder.AsynchronousFeeder$1.run(Unknown Source) at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Thread.java:536) Any ideas? 
I can provide almost any file from my configuration setup, if it can be of any aid in tracking this down. I am currently stumped. The reason I need the threading support to work is that in a few of the later pipeline stages there can be substantial delays, in case of which it would certainly be nice if the documents that don't cause any delays can be happily processed in parallel, but that's kinda obvious, I know. :-) Would be nice, though. My personal guess at this time (I have done quite some digging in the code, but obviously not yet enough) is that there seems to be some kind of threading race in the code supporting the options. Then again, that might be totally off, since I don't understand it fully, yet. Help? Regards, David Kinnvall ------------------------------------------------------- This sf.net email is sponsored by:ThinkGeek Welcome to geek heaven. http://thinkgeek.com/sf _______________________________________________ Babeldoc-devel mailing list Bab...@li... https://lists.sourceforge.net/lists/listinfo/babeldoc-devel |
|
From: McDonald, B. <Bru...@ba...> - 2003-11-04 14:39:52
|
David,
Thank you - I will be working on this. I would appreciate any help that we can get here.
THIS IS THE LAST BIG ISSUE BEFORE BABELDOC 1.2 GOES FINAL.
PLEASE HELP.
regards,
Bruce.
-----Original Message-----
From: David Kinnvall [mailto:dav...@al...]
Sent: Tuesday, November 04, 2003 9:25 AM
To: bab...@li...
Subject: Re: [Babeldoc-devel] Multithreading problems...
Hi everybody!
David Kinnvall (me) wrote at a previous time:
[stuff about my multithreading problems and the lack of datapoints]
I now have a very small setup that triggers the problem, suggesting
that there is nothing really particularly strange about my setup,
but rather that it should be reproducible (debuggable) for others.
Interesting datapoint: The *first* thread seems to always survive.
It is the remaining one(s) that crash, sooner or later.
Relevant config snippets to (hopefully) be able to reproduce:
Set up a pipeline called whatever and a directory scanner that
feeds it, using the configs outlined below. Adjust as necessary
for your environment(s).
feeder/config.properties:
=========================
# Scanner feeder implementation
scanner.type=asynchronous
#scanner.type=synchronous
scanner.queue=disk
scanner.queueDir=scanner/queue
scanner.queueName=scanner
scanner.poolSize=3
pipeline/whatever/whatever.properties:
======================================
entryStage=entry
entry.stageType=Null
entry.nextStage=enricher
entry.tracked=false
enricher.stageType=XpathExtract
enricher.nextStage=writer
enricher.XPath.foo=/foo/@name
writer.stageType=FileWriter
writer.nextStage=null
writer.outputFile=output/whatever/${ticket.getValue()}_${document.get("foo")}.xml
Input document(s) - create a couple of these, named <anything>.xml:
===================================================================
<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>
<foo name="bar1"/>
Now, armed with an active Babeldoc instance, using config as
above, give a bunch (say 4) of input xml documents of the above
form to the scanner. For me it (almost) every time bombs for the
second and third threads in the enricher stage. You might have to
retry a few times, but for me it is very consistent, with an NPE
in VariableProcessor.mustExpand().
Important: You need to give multiple input documents to the
scanner at once, to trigger the problem *quickly* ...but...
However: It is certainly possible to trigger it with one single
document at a time as well! Just keep feeding the scanner+pipeline
one of the documents, watch it process it (successfully at least
once), and keep going, until a different thread from the first
gets to work on it. Boom.
The first thread seems to correctly get its configuration for
pipeline stages et al, but the subsequent configurations do
have a tendency to lack values, particularly, as Bruce noticed,
the ones of "complex" nature. A dump of the config data for an
XpathExtract stage, as configured in the snippet above, during
exception handling of one of the NPEs shows this (formatted):
com.babeldoc.core.option.ConfigData@11c2b67[
name=enricher,value=<null>,children={
stageType=com.babeldoc.core.option.ConfigData@659db7[
name=stageType,value=XpathExtract,children=<null>
],
XPath=com.babeldoc.core.option.ConfigData@1556d12[
name=XPath,value=<null>,children={
foo=com.babeldoc.core.option.ConfigData@edf389[
name=foo,value=<null>,children=<null>
]
}
],
nextStage=com.babeldoc.core.option.ConfigData@16be68f[
name=nextStage,value=writer,children=<null>
]
}
]
from which we learn that what should have been
enricher.XPath.foo=/foo/@name
has instead become
enricher.XPath.foo=<null>
which wasn't expected by the pipeline stage. Hence the NPE.
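[Editor's note] The failure mode described above, a template check handed a null option value, can be sketched roughly as follows. This is only an illustration under assumptions: NullOptionDemo, mustExpand and mustExpandChecked are hypothetical stand-ins written for this note, and the real VariableProcessor.mustExpand may well look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not Babeldoc's classes.
class NullOptionDemo {

    /** Naive check for a "${...}" template; assumes value is never null. */
    static boolean mustExpand(String value) {
        return value.indexOf("${") >= 0; // NullPointerException if value is null
    }

    /** Same check, but names the missing option instead of failing with a bare NPE. */
    static boolean mustExpandChecked(String optionName, String value) {
        if (value == null) {
            throw new IllegalStateException("option '" + optionName + "' has no value");
        }
        return value.indexOf("${") >= 0;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("enricher.XPath.foo", null); // the broken state from the dump above

        try {
            mustExpand(options.get("enricher.XPath.foo"));
        } catch (NullPointerException e) {
            System.out.println("naive check: NullPointerException");
        }

        try {
            mustExpandChecked("enricher.XPath.foo", options.get("enricher.XPath.foo"));
        } catch (IllegalStateException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

A check of the second kind would not cure the underlying race, but it would make the log say which option arrived empty.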
As before, I need help in resolving this. I have so far been
utterly unable to find a solution on my own. I hope that the
additional data in the above text might be of some assistance
for others to have a go at the problem as well, or at least
verify that the problem exists outside of my particular setup.
Best regards,
David
-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
Babeldoc-devel mailing list
Bab...@li...
https://lists.sourceforge.net/lists/listinfo/babeldoc-devel
|
|
From: McDonald, B. <Bru...@ba...> - 2003-11-04 15:19:59
Attachments:
threadtest.zip
|
Here is a complete example - ready to test.
Start the scanner (babeldoc scanner)
and then copy the data?.xml files into the input directory and watch it
break.
We need some help here - let's just call it giving back for however Babeldoc might have helped you in the past.
|
|
From: McDonald, B. <Bru...@ba...> - 2003-11-04 16:02:54
|
Another data point...
If I synchronize the getConfig methods on the ConfigService class (as opposed to relying on the internal synchronization), then when I run the scanner
only the first and second groups of data files (each group has 4 files) cause an exception. All subsequent runs are perfect.
Go figure...
Bruce.
|
|
From: David K. <dav...@al...> - 2003-11-04 19:26:22
|
Hi Bruce!
McDonald, Bruce wrote:
> Another data point...
Great! (Well, I am happy to see that the problem wasn't only occurring
in my own setup, since I would then be even *more* stumped than I am
currently. :-) Now that the problem has been confirmed elsewhere I feel
much better about the whole thing. Then again, I feel sorry for causing
a delay in the 1.2 release. Oh, well, the thing is pretty serious...)
> If I synchronize the getConfig methods on the ConfigService class (as opposed to the internal synchronization) then I can run the scanner
> with only the first and second group (each group has 4 files) of data files causing an exception. All subsequent runs are perfect.
Hmm. How many threads are doing work in the subsequent runs?
> go figure...
Indeed...
As an added data point (not exactly new, but anyway), I can have the
problem occur in a multi-pipeline setup as well, in which I have two
pipelines doing initial processing for two separate incoming
data-sources, and when done they both call a third pipeline for common
processing. I have had the crash occur in the third, common, pipeline.
So, the problem travels across Call-stages as well. FWIW.
I am available (during Swedish working hours, at least) for running
more or less whatever tests anybody can come up with that are worth
running, if it helps. I can do and test code changes as well, as long
as I can grasp what they intend to do. :-)
Right now, I feel that I have put in synchronization both here and
there to no avail, so I am *obviously* missing the target. Hints to
get me on the right track are welcome!
> Bruce.
/David
|
|
From: McDonald, B. <Bru...@ba...> - 2003-11-04 21:59:56
|
It looks like I might have a fix here.
It is:
1. Make the following two methods synchronized (at the method level):
SimplePipelineStageFactory.getResolver
PipelineFactory.getPipelineStageFactory
2. Add the line to: ConfigInfo.applyConfigValue(IConfigData data, ConfigOption configoption):
After the line: subconfigoption = createDynamicOption(subdata, key);
add: subconfigoption.setConfigData(subdata);
Please confirm!
regards,
Bruce.
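[Editor's note] The two parts of the fix above can be sketched in isolation as follows. This is a minimal sketch under assumptions, not the Babeldoc code: FixSketch, FactoryCache, Option and applyValues are hypothetical names invented here. Part 1 shows why a check-then-create accessor is safer with method-level synchronization; part 2 shows a freshly created sub-option being handed its data, which is the kind of step the added setConfigData call performs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not Babeldoc's classes.
class FixSketch {

    /** Part 1: a lazily filled cache whose accessor is synchronized at the method level. */
    static class FactoryCache {
        private final Map<String, Object> factories = new HashMap<>();
        private int created = 0;

        synchronized Object getFactory(String pipelineName) {
            Object factory = factories.get(pipelineName);
            if (factory == null) {                     // check ...
                factory = new Object();
                created++;
                factories.put(pipelineName, factory);  // ... then act, under the same lock
            }
            return factory;
        }

        synchronized int createdCount() {
            return created;
        }
    }

    /** Part 2: an option node whose data stays null unless it is set explicitly. */
    static class Option {
        final String name;
        String data;
        final Map<String, Option> subOptions = new HashMap<>();

        Option(String name) {
            this.name = name;
        }

        void setData(String data) {
            this.data = data;
        }
    }

    static void applyValues(Option parent, Map<String, String> values) {
        for (Map.Entry<String, String> entry : values.entrySet()) {
            Option sub = new Option(entry.getKey());
            sub.setData(entry.getValue()); // without this step the sub-option keeps a null value
            parent.subOptions.put(entry.getKey(), sub);
        }
    }

    /** Calls getFactory from several threads and reports how many factories were built. */
    static int createdAfterConcurrentCalls(int threadCount) {
        FactoryCache cache = new FactoryCache();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> cache.getFactory("whatever"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.createdCount();
    }

    public static void main(String[] args) {
        System.out.println("factories created: " + createdAfterConcurrentCalls(3));

        Option xpath = new Option("XPath");
        Map<String, String> values = new HashMap<>();
        values.put("foo", "/foo/@name");
        applyValues(xpath, values);
        System.out.println("XPath.foo = " + xpath.subOptions.get("foo").data);
    }
}
```

Method-level locking is the bluntest option and serializes every lookup; it is used here because it matches the fix as described, not because it is the only way to close such a race.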
|
|
From: David K. <dav...@al...> - 2003-11-05 12:59:12
|
Bruce, list,
McDonald, Bruce wrote:
> It looks like I might have a fix here.
I am inclined to agree! See below.
> It is:
>
> 1. Make the following two methods synchronized (at the method level):
>
> SimplePipelineStageFactory.getResolver
> PipelineFactory.getPipelineStageFactory
Done.
> 2. Add the line to: ConfigInfo.applyConfigValue(IConfigData data, ConfigOption configoption):
>
> After the line: subconfigoption = createDynamicOption(subdata, key);
> add: subconfigoption.setConfigData(subdata);
Done. (But changed "subdata" to "option" in my local copy.)
> Please confirm!
Success! I am unable to reproduce the aforementioned threading problems
with the above changes in place. Great work, Bruce!!
(My tests so far have been with the isolated test-case code. I will
now continue with my "real" setup with multiple pipes and call-stages
etc. and verify that this also works, but I feel confident that it
will. Will report back if it doesn't.)
> regards,
> Bruce.
Regards,
David
|
|
From: McDonald, B. <Bru...@ba...> - 2003-11-05 13:53:48
|
You got it. It was two problems - one from the unsynchronized methods and another from the subdata configuration not getting set. |
|
From: David K. <dav...@al...> - 2003-11-05 14:27:14
|
McDonald, Bruce wrote:
> You got it. It was two problems - one from the unsynchronized methods and another from the subdata configuration not getting set.
I have run some initial tests on my internal production setup, with no
ill effects so far, I am very happy to say. The fix you made looks to
be *it* as far as I am concerned. Great!
For a completely different question/issue: I have noticed, partly when
trying to aid in debugging this particular issue, that not very many
of the logging points in Babeldoc use the possibility to provide a
name through the LogService. This makes the log output sometimes hard
to follow, as in finding out who exactly logged a particular message.
I think it could be a useful task to review the use of the LogService
and in particular the instances that look like this
  LogService.getInstance().logFoo(bar);
and try to invent, say "baz", for each LogService user, and instead of
the above, do this
  LogService.getInstance(baz).logFoo(bar);
Some kind of naming hierarchy perhaps? E.g. for pipelines and their
respective stages a name like "PipelineName"."PipelineStageName" would
lead to log output resembling
  <time> INFO [PipelineName.PipelineStageName] : <message>
instead of the current
  <time> INFO [Thread-x] : <message>
and so on, for the different parts of Babeldoc.
To make it easy to use, I think direct LogService.getInstance() calls
should be discouraged in most of the code and instead there could be
made available a utility method getLogService() in which the proper
setup for each class and instance is made. (getLogService would
preferably be responsible for selecting a suitable log-name, and
whatever else might be necessary, and could likely be made into a
base-class method in many cases, e.g. for PipelineStage
implementations to use.)
What do you think? Overkill, maybe?
/David
|
|
From: McDonald, B. <Bru...@ba...> - 2003-11-05 14:37:59
|
Good. Great news about the bug. This is/was a big one.

Logging. There is a method called getLog which is peppered all over the code which gets a particular logger for the code. This is structured largely as a hierarchy based on the babeldoc package layout. Now, here is the thing:

The base getInstance() actually tries to get a logger based on the name of the class that is calling it. It does this by inspecting the call stack until it can get the Fully Qualified Class Name (FQCN) (see LogService line 161). This and the log4j properties file can specify the exact logging requirements for your class - you just need to add the appender (named either as the FQCN or a parent package) with the particular pattern that you want to log with.

Bruce.

-----Original Message-----
From: David Kinnvall [mailto:dav...@al...]
Sent: Wednesday, November 05, 2003 9:25 AM
To: bab...@li...
Cc: McDonald, Bruce
Subject: Re: [Babeldoc-devel] Multithreading problems...

McDonald, Bruce wrote:
> You got it. It was two problems - one from the unsynchronized methods and another from the subdata configuration not getting set.

I have run some initial tests on my internal production setup, with no ill effects so far, I am very happy to say. The fix you made looks to be *it* as far as I am concerned. Great!

For a completely different question/issue: I have noticed, partly when trying to aid in debugging this particular issue, that not very many of the logging points in Babeldoc use the possibility to provide a name through the LogService. This leads to the log output being sometimes hard to follow, as in finding out who exactly logged a particular message.

I think it could be a useful task to review the use of the LogService and in particular the instances that look like this

    LogService.getInstance().logFoo(bar);

and try to invent, say "baz", for each LogService user, and instead of the above, do this

    LogService.getInstance(baz).logFoo(bar);

Some kind of naming hierarchy perhaps? E.g. for pipelines and their respective stages a name like "PipelineName"."PipelineStageName" would lead to log output resembling

    <time> INFO [PipelineName.PipelineStageName] : <message>

instead of the current

    <time> INFO [Thread-x] : <message>

and so on, for the different parts of Babeldoc.

To make it easy to use, I think direct LogService.getInstance() calls should be discouraged in most of the code and instead there could be made available a utility method getLogService() in which the proper setup for each class and instance is made. (getLogService would preferably be responsible for selecting a suitable log-name, and whatever else might be necessary, and could likely be made into a base-class method in many cases, e.g. for PipelineStage implementations to use.)

What do you think? Overkill, maybe?

/David |
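The getLogService() idea proposed in the mail above could look roughly like the sketch below. It is only an illustration under stated assumptions: NamedLog, NamedStage and EchoStage are hypothetical names invented here, not Babeldoc classes, and the real LogService API is not reproduced. The point is that a base class picks the hierarchical name "PipelineName.PipelineStageName" once, so individual stages never choose a log name themselves.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a logging facade; NOT Babeldoc's LogService. */
final class NamedLog {
    private final String name;
    private final List<String> sink;

    NamedLog(String name, List<String> sink) {
        this.name = name;
        this.sink = sink;
    }

    /** Records a line as "INFO [name] : message" (timestamp omitted for brevity). */
    void logInfo(String message) {
        sink.add("INFO [" + name + "] : " + message);
    }
}

/** Hypothetical base class: selects the log name once, on behalf of subclasses. */
abstract class NamedStage {
    private final NamedLog log;

    protected NamedStage(String pipelineName, String stageName, List<String> sink) {
        // Hierarchical name in the form "PipelineName.PipelineStageName".
        this.log = new NamedLog(pipelineName + "." + stageName, sink);
    }

    /** The utility accessor suggested in the mail. */
    protected NamedLog getLogService() {
        return log;
    }

    abstract void process(String document);
}

/** Example stage that only logs the document it was given. */
final class EchoStage extends NamedStage {
    EchoStage(String pipelineName, String stageName, List<String> sink) {
        super(pipelineName, stageName, sink);
    }

    void process(String document) {
        getLogService().logInfo("processing " + document);
    }
}

final class NamedLogDemo {
    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        new EchoStage("orders", "transform", lines).process("doc-1");
        System.out.println(lines.get(0)); // INFO [orders.transform] : processing doc-1
    }
}
```

Collecting lines in a list instead of writing to a real logger keeps the sketch self-contained; in practice the name would be handed to whatever logging backend is in use.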
|
From: David K. <dav...@al...> - 2003-11-05 15:07:12
|
McDonald, Bruce wrote:
> Good. Great news about the bug. This is/was a big one.

Yeah, it feels much better now. This allows me to get around the delay issues I have in some of the subsequent pipeline stage processing, which is very nice indeed.

> Logging. There is a method called getLog which is peppered all over the code which gets a particular logger for the code. This is structured largely as a hierarchy based on the babeldoc package layout. Now, here is the thing:
>
> The base getInstance() actually tries to get a logger based on the name of the class that is calling it. It does this by inspecting the call stack until it can get the Fully Qualified Class Name (FQCN) (see LogService line 161). This and the log4j properties file can specify the exact logging requirements for your class - you just need to add the appender (named either as the FQCN or a parent package) with the particular pattern that you want to log with.

Hmm. Seems I haven't really understood how Babeldoc logs. I will take a second, closer look and try to fix it the configuration way. Thanks for the pointer(s).

> Bruce.

/David |
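The call-stack inspection Bruce describes can be illustrated with a small sketch. It shows the general technique only: CallerAwareLog and SampleStage are hypothetical names, the sketch assumes Throwable.getStackTrace() (available since Java 1.4), and the actual LogService code around line 161 is not shown in this thread and may work differently.

```java
/** Hypothetical facade; illustrates the technique, NOT Babeldoc's LogService code. */
final class CallerAwareLog {
    private CallerAwareLog() {}

    /** Returns the fully qualified name of the first class on the stack outside this facade. */
    static String callerClassName() {
        String self = CallerAwareLog.class.getName();
        StackTraceElement[] frames = new Throwable().getStackTrace();
        for (int i = 0; i < frames.length; i++) {
            String cls = frames[i].getClassName();
            if (!cls.equals(self)) {
                return cls; // first frame that does not belong to the facade itself
            }
        }
        return self; // fallback when no outside caller is visible
    }

    /** Prefixes the message with the caller's class name, as a logger name would be. */
    static String format(String message) {
        return "[" + callerClassName() + "] " + message;
    }
}

/** Example caller; its class name ends up in the formatted line. */
final class SampleStage {
    String run() {
        return CallerAwareLog.format("stage started");
    }
}
```

With a name derived this way, a log4j configuration can address loggers by class or by parent package without each call site passing a name explicitly. The cost is a stack walk per lookup, which is one reason to cache the result per class.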
|
From: Michael A. <mic...@ze...> - 2003-11-05 17:45:34
|
Does this mean that we're going to branch 1.2 maintenance soon?

Cheers...

On Wed, 2003-11-05 at 15:06, David Kinnvall wrote:
> McDonald, Bruce wrote:
>
> > Good. Great news about the bug. This is/was a big one.
>
> Yeah, it feels much better now. This allows me to get around
> the delay issues I have in some of the subsequent pipeline
> stage processing, which is very nice indeed.
>
> > Logging. There is a method called getLog which is peppered all over the code which gets a particular logger for the code. This is structured largely as a hierarchy based on the babeldoc package layout. Now, here is the thing:
> >
> > The base getInstance() actually tries to get a logger based on the name of the class that is calling it. It does this by inspecting the call stack until it can get the Fully Qualified Class Name (FQCN) (see LogService line 161). This and the log4j properties file can specify the exact logging requirements for your class - you just need to add the appender (named either as the FQCN or a parent package) with the particular pattern that you want to log with.
>
> Hmm. Seems I haven't really understood how Babeldoc logs.
> I will take a second, closer look and try to fix it the
> configuration way. Thanks for the pointer(s).
>
> > Bruce.
>
> /David
>
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> Does SourceForge.net help you be more productive? Does it
> help you create better code? SHARE THE LOVE, and help us help
> YOU! Click Here: http://sourceforge.net/donate/
> _______________________________________________
> Babeldoc-devel mailing list
> Bab...@li...
> https://lists.sourceforge.net/lists/listinfo/babeldoc-devel |
|
From: David K. <dav...@al...> - 2003-11-07 08:21:56
|
David Kinnvall (me) wrote:
> McDonald, Bruce wrote:
>> Logging. There is a method called getLog which is peppered all over
>> the code which gets a particular logger for the code. This is
>> structured largely as a hierarchy based on the babeldoc package
>> layout. Now, here is the thing:

Hmm...searching through the code there seem to be about 300 direct uses of LogService.getInstance(), as compared to about 60 uses of a getLog() method. It seems to me that getLog() is not used consistently. (I know it isn't in the code that I have written myself, so far.) Perhaps there needs to be some kind of policy and/or cleanup here?

On the other hand, it seems to work pretty fine anyway, so I might very well be out on a limb here. I am willing to be educated, as always. :-)

>> The base getInstance() actually tries to get a logger based on the
>> name of the class that is calling it. It does this by inspecting the
>> call stack until it can get the Fully Qualified Class Name (FQCN)
>> (see LogService line 161). This and the log4j properties file can
>> specify the exact logging requirements for your class - you just
>> need to add the appender (named either as the FQCN or a parent
>> package) with the particular pattern that you want to log with.

> Hmm. Seems I haven't really understood how Babeldoc logs.
> I will take a second, closer look and try to fix it the
> configuration way. Thanks for the pointer(s).

Second, closer look taken. Reconfiguration of the LogService successfully made and more context is now available to me in my logs. Great!

However, I can't seem to find a way to make my changes effective unless I actually do them in the log config inside the babeldoc_core.jar. I did initially expect to be able to do so by creating either a log4j.properties (or perhaps a log/config.properties) which would then override the ones in the core.jar. They didn't.

Which leads me to ask:

- Which one is actually controlling logging behavior? log4j.properties or log/config.properties? They are very similar but it can't be both, can it? (I would suspect log4j.properties myself...?)

- Is it supposed to be possible to override default logging config via local configuration outside the core.jar? I can't get that to work currently. Am I missing something?

Oh, and yeah, a more massive test-run using the multithread fix from Bruce worked beautifully yesterday. 5500 documents processed in parallel by four threads with no hiccups. Took about four hours to complete and exposed no problems. (And it exercised my SftpScanner as well, transferring 60 MB of document data over 10 parallel sftp channels. Sweet. :-) )

Regards,

David. |
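The override behavior David expected (a local file taking precedence over defaults bundled in the jar) can be sketched with plain java.util.Properties. This is an assumption-laden illustration, not a description of how Babeldoc loads its configuration, which is exactly the open question in the mail above. LayeredConfig is a hypothetical helper, and strings stand in for the bundled and local files. The keys follow log4j 1.x naming, and the patterns rely on PatternLayout's %t (thread name) and %c (logger name) conversion characters.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper: local settings layered over bundled defaults. */
final class LayeredConfig {
    private LayeredConfig() {}

    /** Parses properties text; lookups that miss fall back to the given defaults. */
    static Properties parse(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail; rethrow unchecked just in case.
            throw new IllegalStateException("could not parse properties", e);
        }
        return props;
    }
}

final class LayeredConfigDemo {
    static final String PATTERN_KEY = "log4j.appender.stdout.layout.ConversionPattern";

    public static void main(String[] args) {
        // Stand-in for the defaults shipped inside the jar (thread name in brackets).
        Properties bundled = LayeredConfig.parse(
            "log4j.rootLogger=INFO, stdout\n"
            + PATTERN_KEY + "=%d %-5p [%t] : %m%n\n", null);

        // Stand-in for a local file that changes only the pattern (logger name in brackets).
        Properties local = LayeredConfig.parse(
            PATTERN_KEY + "=%d %-5p [%c] : %m%n\n", bundled);

        System.out.println(local.getProperty("log4j.rootLogger")); // inherited from bundled
        System.out.println(local.getProperty(PATTERN_KEY));        // overridden locally
    }
}
```

Only keys present in the local text are overridden; everything else is inherited, which is the behavior a user would typically expect from a local log4j.properties placed outside the jar.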
|
From: McDonald, B. <Bru...@ba...> - 2003-11-05 18:01:45
|
Yes - unless there are objections.

-----Original Message-----
From: Michael Ansley [mailto:mic...@ze...]
Sent: Wednesday, November 05, 2003 12:39 PM
To: bab...@li...; McDonald, Bruce
Subject: Re: [Babeldoc-devel] Multithreading problems...

Does this mean that we're going to branch 1.2 maintenance soon?

Cheers... |
|
From: McDonald, B. <Bru...@ba...> - 2003-11-05 22:13:00
|
Yes - we are going to be releasing 1.2 and then making that the stable branch. We will open up the development branch as 1.3. This will eventually be released as 1.4. After that we will be shooting for 2.0.

-----Original Message-----
From: Michael Ansley [mailto:mic...@ze...]
Sent: Wednesday, November 05, 2003 12:39 PM
To: bab...@li...; McDonald, Bruce
Subject: Re: [Babeldoc-devel] Multithreading problems...

Does this mean that we're going to branch 1.2 maintenance soon?

Cheers... |