Hello,
I am trying to tune the thread pools, but I can't get the parameters to take effect. This is the idea:
I have an sql-scanner which scans a database table for records; the number of records it returns can vary (1 to 50?) depending on how I configure the system. This scanner feeds a pipeline named "AttestGenerator", which generates a PDF document for each record.
As my sql-scanner can return multiple rows, I use the XpathSplitter to split the rows into separate babeldoc documents. After the XpathSplitter, I would like to create multiple processes which treat the documents simultaneously.
Now I would like to increase the number of parallel processes, but I can't figure out how (I have put some debug statements in the logging, but the poolSize and maxThreads never change). Any hints would be welcome.
Jan
feeder definition:
async.type=asynchronous
async.queue=memory
async.poolSize=20
async.maxThreads=20
pipeline definition:
AttestGenerator.type=xml
AttestGenerator.configFile=attestB3/attest/attestGenerator.xml
AttestGenerator.processor.type=threadpool
AttestGenerator.processor.poolSize=20
In my pipeline stages I have defined maxThreads, but it does not make any difference:
<pipeline-name>AttestGenerator</pipeline-name>
<dynamic>
  <entry-stage>entry</entry-stage>
  <!-- STAGES: Defines the stages -->
  <stage-inst>
    <stage-name>entry</stage-name>
    <stage-desc>This does nothing</stage-desc>
    <stage-type>Null</stage-type>
  </stage-inst>
  <stage-inst>
    <stage-name>splitRows</stage-name>
    <stage-desc>several records might be returned; this will split them into separate documents</stage-desc>
    <stage-type>XpathSplitter</stage-type>
    <option>
      <option-name>XPath</option-name>
      <option-value>/document/docgen</option-value>
    </option>
    <option>
      <option-name>tracked</option-name>
      <option-value>false</option-value>
    </option>
    <option>
      <option-name>xmlOmitDecl</option-name>
      <option-value>false</option-value>
    </option>
    <option>
      <option-name>xmlIndent</option-name>
      <option-value>true</option-value>
    </option>
    <option>
      <!-- maxThreads does not seem to work -->
      <option-name>maxThreads</option-name>
      <option-value>10</option-value>
    </option>
  </stage-inst>
...
After the XpathSplitter, use CallStage to call another pipeline that has all the async settings.
Sherman
Sherman,
I think I am missing something. I have put the option in the stage which follows the splitter (an XpathExtract), and in my logging I still see 5 threads for the extract. This seems to be the default, which I found in the class AsyncPipelineStageProcessor (DEFAULT_MAX_THREADS). So how do I configure this stage to get more threads, or do I have to configure something in the processor?
regards,
Jan
This is the definition of the XpathExtract which follows the splitter:
<stage-inst>
  <stage-name>extract</stage-name>
  <stage-desc>this extracts stuff from the xml document</stage-desc>
  <stage-type>XpathExtract</stage-type>
  <option>
    <option-name>XPath</option-name>
    <option-value></option-value>
    <sub-option>
      <option-name>id</option-name>
      <option-value>/docgen/id/text()</option-value>
    </sub-option>
    <!-- more suboptions here -->
  </option>
  <option>
    <option-name>tracked</option-name>
    <option-value>false</option-value>
  </option>
  <option>
    <option-name>maxThreads</option-name>
    <option-value>10</option-value>
  </option>
</stage-inst>
Put the XpathExtract in a completely new pipeline with the threading options defined, not in the same pipeline as the XpathSplitter.
Sherman
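To make the idea of "split first, then hand each document to a separately configured worker pool" concrete, here is a minimal sketch in plain java.util.concurrent. It is not BabelDoc code: the class FanOutSketch, the method names and the "processed:" prefix are all hypothetical, and process() merely stands in for whatever the called pipeline does with one document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutSketch {

    // Hypothetical stand-in for the per-document work (e.g. generating one PDF).
    static String process(String record) {
        return "processed:" + record;
    }

    // Submits every record to a fixed-size pool and returns the results in input order.
    static List<String> processAll(List<String> records, int maxThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String record : records) {
                Callable<String> task = () -> process(record);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until this record has been processed
            }
            return results;
        } finally {
            pool.shutdown(); // no new tasks; already submitted ones have finished by now
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        System.out.println(processAll(rows, 10));
    }
}
```

The pool size is the only knob here; in the setup discussed in this thread that role would presumably be played by the threading options of the called pipeline.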
OK, I found it. I left everything as it was, except that for the pipeline stage I changed the processor.type to async, and in my XpathSplitter stage I added the option:
<option>
  <option-name>maxThreads</option-name>
  <option-value>10</option-value>
</option>
However, the value of 10 was then reduced to 5 because of the globalMaxThreads parameter in the AsyncPipelineStageProcessor.java file, which I changed to 10 (and recompiled).
Is there a way to specify this in the config files, and if so, where?
Jan
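The capping described above (a requested maxThreads of 10 ending up as 5) can be illustrated with a small sketch. This is not the BabelDoc source: ThreadLimitSketch, effectiveThreads and the property name example.globalMaxThreads are hypothetical, and the thread does not say whether BabelDoc offers a configuration key for the global limit. The sketch only shows the behaviour reported here, plus one way such a ceiling could be made overridable without recompiling.

```java
public class ThreadLimitSketch {

    // Hypothetical default ceiling, standing in for the hard-coded global limit of 5.
    static final int DEFAULT_GLOBAL_MAX_THREADS = 5;

    // Returns the thread count actually used: the requested value capped at the global ceiling.
    static int effectiveThreads(int requested, int globalMax) {
        return Math.min(requested, globalMax);
    }

    // Reads the ceiling from a system property (hypothetical name), falling back to the default
    // when the property is unset or not a valid integer.
    static int globalMaxFromProperty(String propertyName) {
        return Integer.getInteger(propertyName, DEFAULT_GLOBAL_MAX_THREADS);
    }

    public static void main(String[] args) {
        int globalMax = globalMaxFromProperty("example.globalMaxThreads");
        System.out.println("global ceiling: " + globalMax);
        System.out.println("stage asks for 10, gets: " + effectiveThreads(10, globalMax));
    }
}
```

Run with, for example, java -Dexample.globalMaxThreads=10 ThreadLimitSketch to see the ceiling lifted; without the flag the requested 10 is capped at 5.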