All references in this wiki to "multi-thread models" refer to this background discussion and the diagrams on this page. See also the definition of "ThreadPool".
Syslogd2 supports more than one TYPE of threadpool (the threadpool type is defined by the code that the threads in that threadpool execute). Each threadpool has one or more threads assigned to it that share the same set of processing instructions -- using the same input sources and the same output targets, but operating in parallel. In addition to the single (one-off) parent thread, Syslogd2 supports the following threadpool types:
`Reader threads` (with sub-types): This thread-type reads data from an IP or Unix socket and either queues the data into a worker-thread FIFO queue or calls the worker-thread subroutine directly.
* `DRT read-thread type`: An ultra-high-speed socket-reading thread-type that is limited to reading a single UDP or Linux datagram socket.
* `PSelect read-thread type`: The 'generic' (but slower) socket-reading thread-type that can simultaneously monitor multiple sockets for input. It is named after the system pselect() function call that is the core of this routine. A pselect read thread can listen on multiple datagram sockets simultaneously with multiple stream listen sockets and multiple stream data-connection sockets.
* `Tailfile read-thread type`: This thread-type is explicitly designed to read ASCII tailfiles using a polling algorithm, as opposed to relying on the system to determine when data is ready to read.
* `Kernel read-thread type`: This thread-type is explicitly designed to read kernel messages from either the /proc/kmsg file or directly from the kernel's ring-buffer (without needing the /proc filesystem to be mounted). This thread-type is still under development and not available in version 1.0.0.
`Worker threads`: This thread-type runs in a loop: dequeueing a message and its envelope data, parsing the message elements, applying input filters, and either queueing the result into one or more output-threadpool FIFO queues or directly comparing the message facility & priority against each output and calling the applicable output subroutine.
`Output threads`: This thread-type runs in a loop: dequeuing a message and its envelope data, then comparing facility & priority against each destination in its member list (by comparing against each selector mask). Each time the facility & priority match a selector mask, an output filter is applied (if defined) before the message is formatted and written to its destination (or spool file if the destination is unreachable).
`User threads`: This threadpool is really just a FIFO queue and a group of threads that are dedicated to writing each queued message to a list of users IF those users are currently logged into the host.
`Housekeeping threads`: This 'threadpool' is really a thread**group** that runs background processes as assigned by the parent thread. Their purpose is to offload what would otherwise be an overwhelming processing load on the parent thread.
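Worker and output threads both depend on the message's facility & priority values. As a minimal sketch of the parsing step (illustrative only, not Syslogd2's actual code), the standard syslog PRI prefix encodes both values in one number, facility * 8 + severity:

```python
# Hedged sketch: decode a syslog '<PRI>' prefix into facility and severity.
# The encoding (facility * 8 + severity) comes from RFC 3164/5424; the
# function name and defaults here are illustrative.

def parse_pri(message: str):
    """Return (facility, severity, text) for a '<PRI>text' message."""
    if message.startswith("<"):
        end = message.find(">")
        if 0 < end <= 4 and message[1:end].isdigit():
            pri = int(message[1:end])
            return pri // 8, pri % 8, message[end + 1:]
    # Common fallback when no valid PRI is present: user.notice (PRI 13)
    return 1, 5, message

facility, severity, text = parse_pri("<165>myhost app: something happened")
# 165 = 20 * 8 + 5 -> facility 20 (local4), severity 5 (notice)
```

A worker thread would perform this decode once per dequeued message, then compare the (facility, severity) pair against each output's selector mask.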
Since threadpool model 1 consists of only a single threadpool, the configuration file only allows specification of a positive value for the number of threads to create for the default threadpool (which is assigned a non-changeable threadpool-id value of 0). If CAP_SINGLETHREAD is defined, the threadcount is also fixed at a value of 1.
The diagram on the left below (model 1) represents the general program design of both single-threaded and most (if not all) current multi-threaded syslog servers available today. If you assume that the single threadpool shown in model 1 contains a single thread, you have a single-thread application such as the 1970s syslogd daemons.
If you allow multiple threads in that threadpool, you get a model that represents MOST multi-threaded approaches to solving performance problems -- that is, to simply create more threads that all do exactly the same thing.
Syslogd2 can implement this model if no optional thread-types are defined (other than the default socket or *input* type threadpool) -- i.e., no CAP_WORKERTHREADS, CAP_OUTPUTTHREADS, CAP_HOUSEKEEPING, CAP_KERNELTHREADS, CAP_USERTHREADS or CAP_TAILFILES.
To make Syslogd2 emulate threadpool model 1 rather than threadpool model 2, define CAP_SINGLEPOOL to restrict the number of threadpools to a maximum of 1.
To get single-thread performance, also define CAP_SINGLETHREAD which limits the number of threads in any threadpool to a maximum of 1.
Threadpool model 1 is the configuration most familiar to system admins today, but it has limited performance potential.
Let us define 'speed' not as the time it takes to process one message through the system, but as a measure of throughput (presented with infinite input, how many messages can a given system design process per unit of time ?).
The reason threadpool model 1 is so limited is that each thread performs all processing steps for each message, and there is no way to support any input mechanism that cannot be read by the single data-input method. Also, no matter how many threads are allocated to that single threadpool, sooner or later it is highly likely that all of them will be waiting on some system response (DNS response, slow output connection, IP timeout, etc), leaving no threads available to accept incoming traffic. Create too many threads and you quickly reach a point of diminishing returns -- especially when reading input from TCP connections, where multiple (even partial) messages may be read from a socket at one time and must be buffered, then parsed into separate events by the application. Additionally, attempting to write to a broken connection causes each thread in turn to time out before the next thread may make an attempt (resulting in yet another timeout).
Having each thread process every step in each message means that if/when delays occur in processing a message, that thread is not available to read the next message from the network buffer before it is over-written. Network buffers are not infinite. They typically hold approximately 6 to 10 incoming packets per input socket. If the application (overall) does not read the data from the network buffer(s) before incoming packets overwrite the data currently in the network buffer(s), the existing (unread) data is lost.
Since UDP is connectionless, any data lost due to a slow-responding application will not be / cannot be retransmitted and is lost for good.
There are many points in processing a syslog message where a thread must wait on a system-call response. (While waiting on a system call response, the thread cannot be reading more input data.) Some of these wait times can become quite lengthy. For example, waiting on DNS response to resolve host-names or attempting to open a pipe or network connection that cannot be resolved or opened due to network outage or DNS being unavailable. These delays may cause multiple threads to "stack up" waiting for responses while input data is being lost at the input socket.
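One general-purpose mitigation for the limited network buffering described above (a standard socket technique, not a description of Syslogd2 internals) is to enlarge the kernel's per-socket receive buffer so a slow reader has more headroom before incoming datagrams are dropped:

```python
import socket

# Hedged sketch: inspect and enlarge a UDP socket's kernel receive buffer
# (SO_RCVBUF).  The 1 MiB request is illustrative; the kernel may cap the
# granted size (on Linux, at net.core.rmem_max) and typically doubles the
# requested value to account for bookkeeping overhead.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
default_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
granted_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
s.close()
```

Even with a larger buffer, a design whose threads block on DNS or broken outputs will eventually fall behind; buffer tuning only buys time.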
Model 2 (on the right below) represents a daemon design that allows for multiple threadpools, possibly invoking different processing algorithms (referred to here as `thread-types`). This model allows for multiple threadpools reading different sockets (Linux vs IP, or high-speed vs low-speed) as well as for threads reading non-socket input (such as tailing files via a polling mechanism or directly reading the kernel's syslog ring-buffer).
This model has the potential to be faster than model 1 designs, since the additional threadpools can be assigned to 'isolate' the input load and can even be dedicated to high-speed sockets for performance while lower-speed traffic shares a single input threadpool. Model 2 also allows for threadpools dedicated to non-traditional inputs, such as direct access to the kernel's ring-buffer (so the /proc filesystem need not be mounted for syslog purposes) or polling of multiple files mounted from remote systems where Linux efficiencies cannot be used. Finally, model 2 allows for isolation of input traffic to avoid interference between inputs (IP vs Linux, or TCP vs UDP, for example).
In practice, this threadpool model also quickly reaches a point of diminishing returns. The advantages gained by parallel read-threadpools are quickly neutralized because the processing threads may start blocking each other as they "pile up" behind a TCP host connection that is timing out, or they may all be waiting on DNS responses while incoming network packets go unaddressed and are over-written by newer input. Adding more active threads to such a design will not prove effective past a certain point, because the processing delays downstream of reading the input data, combined with the delays caused by one thread blocking one or more of the others, will far outstrip the gains of adding more threads (or even threadpools) into the mix.
In the configuration file, each threadpool (in model 2) is associated with a non-negative threadpool-id and a positive number of threads to create in that threadpool.
With multithread model 3, Syslogd2 moves out of the realm of host syslog server and into the realm of network/host-management edge data collector. The diagram below shows the resulting multi-thread model when CAP_WORKERTHREADS is defined in Syslogd2.
Note that the reader threads are now isolated into their own thread-group and responsible only for reading data. A separate (considerably larger) set of threads (referred to here as `worker threads`) is responsible for processing the data and writing to outputs after the data has been moved out of the network buffers and into application buffers (FIFOs). As a result of the load-reduction on reader threads and their (new) focus on moving data from network buffers to application buffers, reader threads can be reduced to a nominal range of 4-8 threads. The two sets of threads are linked by a FIFO (First In First Out) buffer designated in the diagram by the "Input Queue" block. This FIFO queue (buffer) may be as large (or small) as the situation dictates. For extremely heavy input or exceptionally bursty data, larger buffers (on the order of several thousand lines) are suggested. Fewer than about 600-1000 lines will likely result in data loss during traffic bursts as the worker threads are unable to keep up with the sudden demand.
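The reader/worker hand-off can be sketched with a bounded FIFO. This is a minimal illustration of the pattern, assuming illustrative names and sizes rather than Syslogd2's actual values:

```python
import queue
import threading

# queue.Queue stands in for the "Input Queue" FIFO linking the two pools.
input_queue = queue.Queue(maxsize=1000)   # the 'lines' (storage slots)
results = []

def reader(messages):
    # Reader threads only move data into the FIFO as fast as possible.
    for msg in messages:
        input_queue.put(msg)              # blocks only if the FIFO is full

def worker():
    # Worker threads dequeue and do the slow work: parse, filter, output.
    while True:
        msg = input_queue.get()
        if msg is None:                   # shutdown sentinel
            break
        results.append(msg.upper())       # stand-in for parse/filter/write

w = threading.Thread(target=worker)
w.start()
reader([f"msg{i}" for i in range(5)])     # reader fills the FIFO
input_queue.put(None)                     # signal the worker to stop
w.join()
# results now holds MSG0..MSG4 in FIFO order
```

The design point is that a stalled worker (slow DNS, broken output) no longer stalls the reader; the FIFO absorbs the burst up to its configured depth.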
In this model, the reader threads have no responsibilities for parsing messages or writing them to their multiple destinations. This allows them to concentrate on moving data from network buffers to application buffers as fast as possible. Reader-thread processing for any given message ends when that message has been queued to the worker threads' FIFO input-queue buffer.
It's important to have the reader threads be as fast as possible in high-volume UDP environments (such as firewall monitoring) because UDP has no flow-control and any data not retrieved into application buffers before being overwritten by new network packets is lost.
Syslogd2 seeks to optimize performance by automatically selecting the high-volume DRT-type read thread if there is only one UDP (or Unix datagram) socket assigned to a given threadpool. Otherwise a (default) PSelect-type read thread is used for socket input.
PSelect-type read threads involve more overhead than DRT-type threads. A PSelect-type thread must prepare a list of candidate sockets as input to the system call. The system call returns an indicator of which sockets are ready to be read. Each ready socket must then be located in the list of available sockets to retrieve per-source settings, the data must be read from the socket, the receipt time obtained, and all this information queued for the worker threads. All ready-to-read sockets need to be processed before the next system call is made to avoid data loss in the application, since the system call will block (not return) until at least one socket has new data ready to read (and that socket may be an unread socket from the previous call).
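That readiness loop can be sketched as follows. Python's `select.select` stands in for the pselect() system call, and pipes stand in for sockets; the fd-to-settings mapping and names are illustrative:

```python
import os
import select

# Two pipes simulate two input sockets monitored by one PSelect-type thread.
r1, w1 = os.pipe()
r2, w2 = os.pipe()
sources = {r1: "socketA", r2: "socketB"}   # fd -> per-source settings

os.write(w1, b"hello from A")
os.write(w2, b"hello from B")

received = []
# Step 1: block until at least one candidate descriptor is readable.
ready, _, _ = select.select(list(sources), [], [])
# Step 2: drain EVERY ready descriptor before calling select again,
# since the next call blocks and unread data would sit (or be lost).
for fd in ready:
    data = os.read(fd, 4096)
    received.append((sources[fd], data))   # plus receipt time, in practice

for fd in (r1, w1, r2, w2):
    os.close(fd)
```

A real loop would repeat indefinitely and also handle listen sockets and stream-connection sockets in the same candidate list, as described above.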
It's often a good idea to separate UDP and TCP connections into separate threadpools or to put TCP connections into a threadpool with slower (low-volume) UDP traffic which can tolerate slight processing delays while multiple messages from the data-block read by streaming sockets are being processed between read-cycles.
Currently, reading IP or Linux sockets (blocking on a system call) and polling text files to check for new data cannot be done (efficiently or otherwise) in the same threadpool, since blocking (or excessive processing) on one source-type means potentially missing data on the other. If both are desired, multiple (different) reader threadpools must be activated (not possible in threadpool model 1). This may change when Syslogd2 implements the Linux inotify() functionality and is able to have Linux notify it via socket when a file has grown or changed. Even then, the polling tailfile thread-type will likely be kept to support ports to other operating systems or to support tailing files on mounted filesystems.
In threadpool model 3, Syslogd2 isolates the input (read) functions from the processing functions. In the configuration file, each threadpool is associated with a non-negative threadpool-id, a positive number of `reader` threads for reading input data into the buffer (as in threadpool model 2), a 2nd positive value indicating the number of 'processing' threads (or `worker` threads), and a 3rd positive number designating the number of queued-message `lines` (storage slots) to be allocated in the connecting FIFO buffer. These 4 values (`id`, `readers`, `workers` and `lines`) are the calculated maximum of the compiled-in values, run-time global over-rides, values given in the `--threadmaps` keyword, or values provided on individual input-source configuration lines.
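The "calculated maximum" rule can be sketched as a merge across configuration layers. Layer names and values below are illustrative, not Syslogd2's actual defaults:

```python
# Hedged sketch: each of readers/workers/lines takes the largest value
# supplied by any configuration layer (compiled-in, global over-ride,
# --threadmaps, per-source line).

def resolve_pool(*layers):
    """Merge per-layer dicts, keeping the maximum for each parameter."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            merged[key] = max(merged.get(key, 0), value)
    return merged

compiled_in     = {"readers": 2, "workers": 4, "lines": 500}
global_override = {"workers": 8}
source_line     = {"readers": 4, "lines": 2000}

pool = resolve_pool(compiled_in, global_override, source_line)
# pool == {"readers": 4, "workers": 8, "lines": 2000}
```

Taking the maximum means any single layer can raise, but never lower, a pool's resources.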
Multithread model 4 in Syslogd2 may be more appropriate for larger hosts (servers). In this model, the input threads are responsible for reading data AND processing it up to the point where the output destinations are chosen. Instead of checking each individual output destination and taking the time to write to that output if necessary, multithread model 4 compares against a `summary threadpool mask` that is calculated for each threadpool. This mask is automatically created by combining (merging) the selection mask for each destination that is assigned to a given threadpool. This results in a significant reduction in processing delays (and a subsequent ability for the reader threads to more quickly return and read the next network-buffered input) by 'handing off' the bulk of the output processing to threadgroups dedicated to that purpose. Unlike the `worker threads` that are tightly coupled to `reader threads` in multi-thread model 3, these `output threads` are tied to (and exclusively 'feed') their designated set of outputs.
Conventional wisdom states that syslog configuration files should have as few destinations as possible, since writing to too many outputs degrades performance -- especially slow outputs or broken network links. With the introduction of *output threadpools*, Syslogd2 rejects that 'wisdom' by allowing outputs to be grouped, assigning each *group* a unique id number and a set of threads dedicated to reading from the threadpool's queue and writing to each destination assigned to the threadpool. This allows the processing threads to inspect and **log to** each **group** of destinations (based on the output-threadpool summary mask), which (as a memory operation) is much faster than inspecting and **writing to** each **individual** destination -- especially when considering that output threads can be blocked by other (slow-responding) output threads already writing to a given destination.
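The summary-mask idea can be sketched with bitmasks. The bit layout below (one bit per facility/severity pair, as in classic syslog selector tables) is illustrative, not Syslogd2's actual internal representation:

```python
# Hedged sketch: OR together each destination's selector mask so an input
# thread can test a message against ONE combined mask instead of every
# destination individually.

def selector_bit(facility, severity):
    """One bit per (facility, severity) pair."""
    return 1 << (facility * 8 + severity)

# Each destination's mask covers the (facility, severity) pairs it accepts.
dest_masks = [
    selector_bit(0, 3) | selector_bit(0, 4),   # kern.err + kern.warning
    selector_bit(4, 6),                        # auth.info
]

# Merge once, at configuration time, into the threadpool summary mask.
summary_mask = 0
for mask in dest_masks:
    summary_mask |= mask

def pool_accepts(facility, severity):
    # One memory-only test in the input thread's fast path; per-destination
    # selector checks (and the actual writes) happen in the output threads.
    return bool(summary_mask & selector_bit(facility, severity))
```

If the summary test fails, the message is never queued to that output threadpool at all; if it succeeds, the output threads repeat the comparison per destination.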
For those configurations that have a need for lots of output destinations or complex output filters, multithread model 4 should be considered. If you have a mix of reliable high-speed (disk) and unreliable or low-speed (pipes, socket-processes, unreliable network) output destinations you may wish to isolate the high-speed output from the low-speed devices to avoid threads getting blocked behind network time-outs or slow devices and thus being unable to process incoming traffic.
In the configuration file, model 4 allows the specification of input threadpools as in model 2, by specifying a non-negative number for the threadpool-id and a positive thread-count for the number of `readers` desired. With this model, however, Syslogd2 also keeps a separate sequence of threadpool-ids for output threads (again, the default output threadpool-id is zero (0), with additional threadpools created by assigning positive id-values). To distinguish between the input-threadpool series of threadpool-ids and the output-threadpool series, a type-designator is used in the `--threadmaps` option and slightly different syntax is used for global over-ride values. To designate a non-negative threadpool-id, a positive threadcount and a positive line-count for the output threadpool's FIFO buffer, the parameter names are as for the input threadpools in threadpool model 3, but applied to configuration-file destination lines (`id=`, `workers=` and `lines=`, with `readers=` being superfluous and invalid).
Please note that the reasons for isolating input threads (by type -- kernel vs files vs sockets -- or high-speed vs low-speed) are still valid. What changes here is that the input threads save time by doing one comparison against an aggregated selection mask for each threadpool and queueing the message for each match, versus comparing against each destination's individual selectors and then actually writing to the output device (with associated delays while waiting for other threads already using that output connection, for network timeouts, or for slow process responses).
Multithread model 4 may be applicable to larger Linux servers where maintaining (and logging to) numerous log files is desirable.
For truly high-speed syslog processing, Syslogd2 allows combining multi-thread models 3 and 4, creating multi-thread model 5. This model provides the highest throughput and performance for machines where system resources can be dedicated to the purpose of syslog data collection. It also requires the highest resource commitment and memory footprint. In this model, we get the benefit of dedicated reader threads for high-speed input and dedicated output threads for isolating high-speed vs low-speed outputs and for distributing the load of writing to large numbers of output files or external processes. We gain an additional benefit by being able to allocate (concentrate) thread resources where needed most (the read phase, processing phase, or output phase of message processing). (Note: input filters are executed by worker threads while output filters are executed by output threads, so if you use both filter types, be sure to provide adequate processing resources. See the filters page.)
In this model, Syslogd2 accepts the combination of the threadpool options of multi-thread model 3 (`id`, `readers`, `workers`, `lines`) for input sources and multithread model 4 (`id`, `workers`, `lines`) for output destinations. Reminder: There is no conflict between input-threadpool-0 and output-threadpool-0 because they are of different `threadpool types`.
Threadpool model 5 has been obsoleted in favor of model 6, described below. The difference is that model 6 has decoupled the reader and worker threadgroups. They are now two separate data structures coupled via the reader-threadpool "queueid" parameter.
There is one more recently-conceived multi-threading model that is really more of a refinement to model 5 above. Model 6 is shown below. Model 6 has already been integrated into Syslogd2. It is also the multi-thread model chosen for the DBD2 companion project (also on sourceforge.net).
This model has all the flexibility and speed of model 5, but in model 6, the input thread-groups and worker thread-pools (FIFO + threadgroup) are decoupled allowing multiple groups of reader threads to share a common queue and worker threadgroup in an effort to conserve resources where possible.
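The queueid decoupling can be sketched as an indirection table. The dict-based layout and the group names below are illustrative, not Syslogd2's actual structures:

```python
import queue

# Hedged sketch of model 6: reader thread-groups reference a worker pool's
# FIFO by a 'queueid', so several reader groups can share one queue (and
# therefore one worker threadgroup) to conserve resources.

queues = {0: queue.Queue()}               # queueid -> shared worker FIFO

reader_groups = [
    {"name": "udp-fast", "queueid": 0},
    {"name": "tcp-slow", "queueid": 0},   # shares the same worker pool
]

# Each reader group enqueues into the FIFO its queueid points at.
for group in reader_groups:
    queues[group["queueid"]].put(f"msg from {group['name']}")

# A worker threadgroup attached to queueid 0 would drain both groups' input.
drained = []
q = queues[0]
while not q.empty():
    drained.append(q.get())
```

Because the coupling is by id rather than by ownership, adding a third reader group to the same worker pool is a configuration change, not a structural one.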