Syslogd2 Wiki

High capacity syslog data collection, filtering, and management.

Brought to you by: efreesmeyer

BaseLine Features in Syslogd2

User-Definable Interrupts
Connection Management and Delayed Resolution
Extra Facilities
Network Friendly
Sources, Selectors and Destinations
Test Stubs and Utilities

User-Definable Interrupts

Syslogd2 supports 4 interrupts, each of which can be set to any of 10 different functions at compile-time. These 4 interrupt definitions can then be overridden at run-time. The interrupts can also be queried and redefined using the Syslogd2 command-tool (tsucX).

To override an interrupt at run-time, use the --defaults= command-line option. Each sub-option name is the name of an interrupt, and its value is the name of the function to assign to that interrupt: --defaults= sigusr2=rotatefiles

The interrupts supported are:

1. SIGHUP
2. SIGINT
3. SIGUSR1
4. SIGUSR2

There are two ways to call the interrupts:

kill -HUP <process-id> --> Executes the function assigned to the SIGHUP interrupt
killall -USR1 <process-name> --> Executes the function assigned to the SIGUSR1 interrupt

The 10 functions that the user-definable interrupts can be set to are:

1. NoOp -- do nothing
2. CheckSources
3. CheckDestinations
4. Mark
5. ResetCache
6. RotateFiles
7. CheckReconfig
8. CheckFilters
9. DisplayConfig
10. FlushSpoolFiles
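
As a rough illustration of the override mechanism described above, the following Python sketch keeps a table of interrupt-to-function assignments and applies --defaults style sub-options to it. It is not taken from the Syslogd2 source: the all-NoOp starting table, the comma-separated sub-option list, and the case-insensitive matching are assumptions made for the example.

```python
# Illustrative sketch only; not Syslogd2 code.
INTERRUPTS = ("sighup", "sigint", "sigusr1", "sigusr2")
FUNCTIONS = (
    "noop", "checksources", "checkdestinations", "mark", "resetcache",
    "rotatefiles", "checkreconfig", "checkfilters", "displayconfig",
    "flushspoolfiles",
)

# Hypothetical compile-time table; the real compiled-in defaults are
# not listed on this page.
COMPILED_DEFAULTS = {name: "noop" for name in INTERRUPTS}


def apply_overrides(table, suboptions):
    """Return a copy of `table` with 'interrupt=function' pairs applied.

    `suboptions` is assumed to be a comma-separated list such as
    'sigusr2=rotatefiles'. Sub-options that do not name an interrupt
    are ignored by this sketch.
    """
    updated = dict(table)
    for item in suboptions.split(","):
        name, sep, value = item.strip().partition("=")
        name, value = name.strip().lower(), value.strip().lower()
        if name not in INTERRUPTS:
            continue
        if not sep or value not in FUNCTIONS:
            raise ValueError(f"unknown function for {name}: {value!r}")
        updated[name] = value
    return updated


if __name__ == "__main__":
    print(apply_overrides(COMPILED_DEFAULTS, "sigusr2=rotatefiles"))
```

The function returns a new table rather than modifying the compiled-in one, so the original assignments remain available for comparison or reset.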

Connection Management and Delayed Resolution

Connection Management

Traditional syslog algorithms were developed for individual host logging with minimal (and unreliable) forwarding of traffic via UDP. These rather simple requirements assumed that all output goes either to local devices (disk, console, user-terminals) or out via UDP. UDP is sometimes referred to as "spray and pray" because there is no expectation that delivery will be acknowledged, and there is no handshake to maintain, so handshake timeouts cannot interrupt processing.

Unfortunately, UDP traffic (especially through firewalls and some CPU-based routers) can become CPU-intensive in large volumes and can adversely impact performance and throughput for those devices. This issue, combined with unreliable connections and even unreliable networks (such as the internet or modem links), creates a demand for transmitting syslog traffic over TCP. Transmission of syslog data over TCP requires new connection-management techniques:

1. When reading from a TCP (stream) connection, unlike UDP where each network-data-packet is a single, complete syslog message, individual messages must be isolated from each other in the incoming byte stream. This implies a delimiter be used between messages. For Syslogd2, this delimiter is the Linux/Unix new-line character (a line-feed character with binary value of 10 decimal). The results of reading from a TCP stream may be a partial message (without finding a delimiter), a full message (with a delimiter) or more than a single full message (bytes are read in excess of those contained in the first message). Input code must be prepared to buffer this extra input and use it in satisfying future read requests. One of the unaddressed connection-maintenance tasks of Linux socket input is to close and reopen the socket (thus creating a new filesystem entry) whenever the filesystem entry for that socket gets deleted.
2. When writing to a TCP (stream) connection, the remote host or intermediate network may go down at any time, resulting in a "broken pipe" error to the originator. This condition should trigger a shutdown of the (now useless) connection by the sender and re-establishment of the output connection from scratch. This differs from a UDP (datagram) write operation where such errors are simply ignored if they are received at all (they are only reported to the sending application for Linux/Unix datagram sockets).
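
The buffering requirement in item 1 can be sketched in a few lines of Python. This is an illustration of newline-delimited framing in general, not the Syslogd2 input code; the class name LineFramer and its feed() method are invented for the example, and dropping empty messages is a choice made by the sketch.

```python
# Illustrative sketch only; not Syslogd2 code.
class LineFramer:
    """Split a TCP byte stream into newline-delimited messages."""

    def __init__(self):
        self._pending = b""  # bytes read but not yet terminated by "\n"

    def feed(self, chunk):
        """Accept bytes from one read; return the complete messages."""
        data = self._pending + chunk
        # Everything before the last newline is complete; whatever
        # follows it is kept for the next read.
        *complete, self._pending = data.split(b"\n")
        return [msg for msg in complete if msg]

    @property
    def pending(self):
        return self._pending


if __name__ == "__main__":
    framer = LineFramer()
    print(framer.feed(b"<13>first mess"))           # partial message
    print(framer.feed(b"age\n<14>second\n<15>th"))  # more than one message
    print(framer.pending)
```

A single read may therefore yield zero, one, or several messages, which matches the three outcomes described above.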

Because data-delivery (write) errors are usually not reported at all in UDP communications, it is safe to continue writing to UDP (datagram) connections even when the network (or Linux/Unix socket) is down: there is no handshake or receipt confirmation, and the network/host will discard any undeliverable traffic. As a result, no real "connection maintenance" is required for UDP (datagram) output connections.

Connection-maintenance of a TCP connection is more complicated. Each write attempt must be prepared to detect and handle a network or connection failure. Upon detection of a failure, the output socket must be closed and re-opened before the next write attempt can be made. Attempting to re-establish a broken TCP (stream) connection however, can result in excessive delays if the remote host has become non-responsive (network or host is down or unresponsive). In an environment such as a syslog daemon, the traditional approach of attempting to re-open a down TCP (stream) connection for each new outgoing message can usually be expected to result in all threads of a multi-threaded application backing up behind (waiting on) the single thread that is in a wait-state, waiting on a connection-handshake response that will never come from a non-responding host. Once that wait-state completes (the timeout is over), the next thread will attempt to write, fail, then attempt to re-open the connection and start the time-out timer all over again. The overall effect of sequential timeouts can be a virtually complete cessation of processing.

Syslogd2 takes a fundamentally different approach. In Syslogd2 each connection (input and output) keeps a set of boolean state-flags indicating the state of a connection. Before each write-attempt, these flags are consulted. Based on the connection-state, different actions are taken.

1. If the connection is up and connected (as indicated by a positive file-descriptor), the write attempt proceeds. On write-error, the connection state is set to "down" and an attempt is made to re-open the connection. If this re-open attempt succeeds, the write is re-attempted and (if successful), processing continues. If the attempt to re-open fails or the write attempt fails a 2nd time, the connection is marked "invalid".
2. If the connection is marked "down", an attempt to open the connection is made. If the open attempt succeeds, a write-attempt is made. If the open attempt fails, the connection is marked "invalid" and the write is aborted. If the open attempt succeeds, but the write fails, an attempt is made to reopen the connection. If the reopen attempt succeeds, a 2nd write attempt is made. If either the reopen attempt or the 2nd write attempt fails, the connection is marked "invalid".
3. Invalid connections are simply skipped. No attempt is made to re-open them for each new message.
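
The three rules above amount to a small state machine. The Python sketch below is one way to express it, under the assumption that opening and writing can each be reduced to a call that reports success or failure; the Connection class, the send() function, and the callables passed in are hypothetical and do not reflect the actual Syslogd2 data structures.

```python
# Illustrative sketch only; not Syslogd2 code.
UP, DOWN, INVALID = "up", "down", "invalid"


class Connection:
    def __init__(self, opener, writer, state=DOWN):
        self.opener = opener  # callable() -> True if the open succeeded
        self.writer = writer  # callable(message) -> True if the write succeeded
        self.state = state


def send(conn, message):
    """Try to deliver one message; return True on success."""
    if conn.state == INVALID:
        return False                  # rule 3: skip, no reopen attempt
    if conn.state == DOWN:            # rule 2: open before writing
        if not conn.opener():
            conn.state = INVALID
            return False
        conn.state = UP
    if conn.writer(message):          # rule 1: normal write
        return True
    # The write failed: allow exactly one reopen and one retry.
    conn.state = DOWN
    if conn.opener() and conn.writer(message):
        conn.state = UP
        return True
    conn.state = INVALID
    return False


if __name__ == "__main__":
    dead = Connection(opener=lambda: False, writer=lambda m: False)
    print(send(dead, "hello"), dead.state)
    print(send(dead, "again"), dead.state)
```

The point of the design is visible in the second call: once a connection is "invalid", send() returns immediately, so no sending thread waits on a connection timeout.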

Periodically, Syslogd2 runs a housekeeping routine that walks the list of either all input or all output connections, attempting to re-open all connections marked "invalid". On success, the connection is marked as up and connected. On failure, the connection's "schedule index" is incremented and the next "invalid" connection is addressed. These housekeeping routines are either CheckSources or CheckDestinations, depending on whether input or output connections are being scanned.
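
A housekeeping pass of this kind might look like the following Python sketch. It is only an outline: the Conn class and check_connections() are invented names, and two details are assumptions that this page does not spell out, namely that the schedule index selects a retry delay from an interval schedule (such as "SourceCheckIntervals") and that the index is reset after a successful re-open.

```python
# Illustrative sketch only; not Syslogd2 code.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Conn:
    name: str
    reopen: Callable[[], bool]  # returns True if the re-open succeeded
    state: str = "invalid"
    schedule_index: int = 0


def check_connections(connections):
    """One housekeeping pass; returns the names of recovered connections."""
    recovered = []
    for conn in connections:
        if conn.state != "invalid":
            continue                  # only "invalid" connections are retried
        if conn.reopen():
            conn.state = "up"
            conn.schedule_index = 0   # assumption: reset on success
            recovered.append(conn.name)
        else:
            conn.schedule_index += 1  # move on to the next connection
    return recovered


def next_delay(conn, intervals):
    """Assumed use of the index: pick a delay, holding at the last entry."""
    return intervals[min(conn.schedule_index, len(intervals) - 1)]


if __name__ == "__main__":
    conns = [Conn("loghost", lambda: False), Conn("backup", lambda: True)]
    print(check_connections(conns))
    print([(c.name, c.state, c.schedule_index) for c in conns])
```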

Connection-Maintenance is configured at runtime using:

1. The --defaults command-line option can be used to configure interrupts to call either CheckSources or CheckDestinations on request.
2. The --defaults command-line option can be used to re-define the time-interval schedules for "SourceCheckIntervals" or "DestinationCheckIntervals".

Delayed Resolution

The concept of Delayed Resolution is Syslogd2's response to a common problem faced by all system administrators attempting to set up a syslog-based reporting / host-forwarding system.
Simply stated, the system's syslog daemon can either be started before or after the network-stack is initialized.

If started BEFORE the network stack, any IP connections specified for forwarded hosts or IP input will time-out during initialization and be unusable by the time the network is initialized, effectively defeating the purpose of attempting to forward messages to a central management host.
If started AFTER the network stack, any errors or log events (especially from the kernel or application-initialization) that occur before the network is initialized will not be captured and cannot be forwarded for action.

Syslogd2's approach is to move all IP host-resolution de-conflicting and initialization processing to a pair of re-resolution subroutines called by the CheckReconfig housekeeping routine. CheckReconfig is called by the connection-maintenance routines to periodically test the network during Syslogd2's operation. When it detects changes in network status (external interfaces become available or ALL external interfaces go down), Syslogd2 then calls the re-resolution subroutines to re-define which requested connections (input or output) are valid under the new network configuration.

If the network status does NOT change from one check to the next, Syslogd2 will still call the re-resolution routines based on the assumption that even though the network status has not changed, the configured DNS host(s) content may have recently become available or been modified allowing additional connections to resolve.

This approach to IP resolution preserves the administrator's ability to use DNS to resolve destination addresses regardless of when the DNS startup occurs or whether Syslogd2 is started before or after the network-stack. Taking this logic one step further, it is highly likely that applications that write to Syslogd2 input sockets, or that create files for Syslogd2 consumption via --tailfile input, may not yet be started (or may not have attempted output). The combination of connection-maintenance and delayed resolution vastly improves Syslogd2's ability to self-correct for (respond to and recover from) network and application failures while eliminating the performance bottlenecks caused by network-timeouts when establishing connections to non-responsive hosts.

Many (especially home-grown) applications can be expected to simply quit if unable to establish TCP or stream connections to their output connections at startup. Applications that use UDP (datagram) connections should have no problem, but this could become an issue for stream connections. Having the system log daemon already running and able to listen for these applications before the IP network stack becomes available (especially for laptops that may or may not ever go on-line) addresses these issues as well.

Delayed Resolution is configured at runtime using:

1. The --defaults command-line option can be used to configure interrupts to call CheckReconfig on request.
2. The --defaults command-line option can be used to re-define the time-interval schedule for "ReconfigCheckIntervals".

NOTE: CheckReconfig is called by, and as a precursor to, the connection-maintenance routines CheckSources and CheckDestinations. When the scheduled time interval for CheckReconfig expires, it will not actually run until the next execution of one of those connection-maintenance routines.

See the discussions on CAP_RECONFIG and CAP_NETWORK for more advanced dynamic configuration options.

Extra Facilities

Syslogd2 supports a number of facilities over-and-above the 24 that are currently used. These facilities are named "extra<NNN>" where "NNN" is a number with a range of 0 to 999. The default number of "extra" facilities can be set at compile-time. It defaults to 16 (extra0 - extra15).
Additionally, Syslogd2 provides access to the 4 "reserved" facilities in the range 12-15. Syslogd2 uses the names "reserved0" through "reserved3" for this purpose.
Syslogd2 makes these facilities available to provide network management administrators a sufficient number of "clean" facilities that can be used to sort and route traffic from multiple sources to multiple destinations. Syslogd2 provides multiple methods for accessing these facilities and full routing support once the facilities are set into events.

1. Use the "facility=" or "priority=" settings on individual input connection-specs to set or over-ride the facility value for all messages entering that input.
2. Use the transformation feature of Syslogd2 input filters to reset the facility value on selected messages then route the message based on the "extra" facilities to desired destinations.
3. Use the transformation feature of Syslogd2 output filters to (re-)set the facility value of selected messages as they are being written to remote devices or services.
4. Read the "extra" facilities directly (as part of the message string) from any application that supports extended facilities.
5. Use the --Defaults= suboptions "UserFacility=" or "KernelFacility=" to change the default facility/priority values for any events that do not contain facility/priority values.

The above methods may also be used in combination.

Syslogd2's support for "extra" and "reserved" facilities is fully integrated into all aspects of the code. As the "reserved" or "extra" facilities receive names in the system header files, Syslogd2 will recognize those names in addition to the names it is using now. These "extra" (and "reserved") facilities are also passed to other applications and hosts if so configured, so care should be taken to ensure the remote systems can handle these values.

Calculating transmission values for facilities and priorities

The standard syslog specifications (both rfc3164, the current version 0, and rfc5424, version 1) define the facility field as '<' + up to 3 digits + '>'. Syslogd2 has to extend this from 3 digits to 4 digits to accommodate facilities above extra100. The numeric value is calculated as (numeric_facility * 8) + numeric_priority. For example, the user facility, with a numeric value of 1 (one), gives a range of formula values between 8 and 15. Likewise local7, with a numeric facility value of 23, has a range of values from 8 x 23 to (8 x 23) + 7, or 184 - 191.

Syslogd2's "extra" facilities start at a numeric value of 24, so 1000 facilities (extra0 - extra999) produce a maximum facility numeric value of 1023.

The syslog specification arbitrarily constrains the number of "extra" facilities to 101 (extra0 -> extra100) by specifying a 3-digit formula result: 999/8 = 124 with a remainder of 7, giving 124 as the highest-numbered facility that "fits" within the specification. 124 - 24 (the existing facilities 0-23) = 100 as the highest possible "extra" facility number that "fits" in the current syslog definition.
Checking this: 101 "extra" facilities (numbered 0-100) + 24 base facilities (0-23) is 125 total facilities numbered 0-124. (8 x 124) + 7 (for 'debug' priority) = 992 + 7 gives a formula value of 999 -- the highest possible 3-digit value that is "currently legal" for "extra100.debug".

On the high end of Syslogd2's range: 1000 extra facilities (extra0 -> extra999) + 24 base facilities (0 - 23) gives 1024 total facilities with numeric values of 0-1023. 8 x 1023 = 8184 for the base value of the 1024th facility (extra999 with numeric value of 1023). Adding a value of 7 for the "debug" priority value to this gives a maximum (4-digit) formula value of 8191 for "extra999.debug" using Syslogd2's limit of 1000 "extra" facilities.
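
The arithmetic above can be checked with a short Python sketch. The function names are invented for the illustration; only the formula (numeric_facility * 8) + numeric_priority and the numbering of the "extra" facilities from 24 come from the text.

```python
# Illustrative sketch only; not Syslogd2 code.
BASE_FACILITIES = 24  # facilities 0-23, including the 4 "reserved" ones
DEBUG = 7             # highest priority value


def extra(n):
    """Numeric facility value of extra<n>."""
    return BASE_FACILITIES + n


def formula_value(facility, priority):
    """Value transmitted between '<' and '>'."""
    if not 0 <= priority <= DEBUG:
        raise ValueError("priority must be in the range 0-7")
    return facility * 8 + priority


def max_formula_value(extra_count):
    """Largest formula value when `extra_count` extra facilities exist."""
    return formula_value(extra(extra_count - 1), DEBUG)


if __name__ == "__main__":
    print(formula_value(1, 0), formula_value(1, DEBUG))    # user:   8 15
    print(formula_value(23, 0), formula_value(23, DEBUG))  # local7: 184 191
    print(formula_value(extra(100), DEBUG))                # extra100.debug: 999
    print(max_formula_value(1000))                         # extra999.debug: 8191
```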

This calculation of the syslog priority field is no secret. Any application that sends a character string to Syslogd2 that starts with the string "'<' + 1-4 digits + '>'" will have specified a valid facility and priority for the purpose of syslog traffic, as long as the numeric value is in the range 0 - 8191 and Syslogd2 has been configured to support sufficient "extra" facilities to include that value.
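
Going the other way, a receiver can recover the facility and priority from the leading field. The Python sketch below shows one possible decoding; the function name split_pri() is invented, and returning None for a missing or out-of-range field is a choice made for the example, since this page does not say how Syslogd2 treats such messages.

```python
# Illustrative sketch only; not Syslogd2 code.
import re

BASE_FACILITIES = 24
_PRI = re.compile(r"<(\d{1,4})>")  # '<' + 1-4 digits + '>'


def split_pri(message, extra_facilities=16):
    """Return (facility, priority, rest) or None if no usable field."""
    match = _PRI.match(message)  # the field must start the message
    if match is None:
        return None
    facility, priority = divmod(int(match.group(1)), 8)
    if facility >= BASE_FACILITIES + extra_facilities:
        return None              # facility not configured
    return facility, priority, message[match.end():]


if __name__ == "__main__":
    print(split_pri("<13>disk almost full"))
    print(split_pri("<8191>top of the range", extra_facilities=1000))
    print(split_pri("<320>needs more than 16 extra facilities"))
```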

Syslogd2 defaults to 16 "extra" facilities (extra0-extra15) with a default maximum formula-value of: (16+24=40 facilities numbered 0-39) ==> (39 x 8) +7 = 312 + 7 or 319.

Network Friendly

Syslogd2 is able to auto-detect and recover from many network issues that cause failure in previous syslog daemons. Among these are:

1. At startup, Syslogd2 will scan the host's network interfaces.

If it finds an IPv4 address configured on at least one interface (even if only on the loopback interface), it will auto-enable IPv4 support. IPv4 support can be manually disabled at run-time.
If it finds an IPv6 address configured on at least one interface (even if only on the loopback interface), it will auto-enable IPv6 support. IPv6 support can be manually disabled at run-time.
It will set the network state to: down if no IP addresses are found, local if only the loopback interface is configured with an address, other if at least one external interface has a configured IP address.

2. The scanning of the network interfaces can be manually disabled at run-time by disabling the "interfacequery" global boolean value.
3. If Syslogd2 is unable to detect an IP address that is known to be present (or if the IP address is not 'findable' by the "ifconfig" command), Syslogd2 allows manual specification of IP addresses via the selfaddress= sub-option to the --defaults command-line option.
4. If the network state is initially detected to be "down", IP port resolution and initialization will be delayed until periodic testing indicates the network state has changed to either "local" or "other".
5. If a connection cannot be initially established or is broken for any reason and cannot be immediately re-established, re-establishing the connection will be attempted by background threads rather than imposing timeout delays on operational threads.
6. Syslogd2 will attempt to resolve destinations and then combine all output connection-specs that reference each destination to avoid duplicating messages while allowing parameter variation (primarily filter specifications) for different selector-string values.
7. Syslogd2 allows multiple input threadpools which allows for isolation of (and different processing for) traffic coming from different interfaces on multi-homed hosts.
8. Syslogd2 allows most global values to be over-ridden at the individual socket-or-file level (DNS enable, Cache enable, AllMessages, filters, binary-to-printable, address-family, etc.)
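
The start-up classification in item 1 can be approximated as follows. This Python sketch works from a plain list of address strings rather than querying real interfaces, and it treats "only the loopback interface" as "only loopback addresses"; both simplifications, and the function name classify_network(), are assumptions of the example rather than a description of the checknet library.

```python
# Illustrative sketch only; not Syslogd2 code.
import ipaddress


def classify_network(addresses):
    """Return (state, ipv4_enabled, ipv6_enabled) for configured addresses."""
    ipv4 = ipv6 = external = False
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 4:
            ipv4 = True
        else:
            ipv6 = True
        if not addr.is_loopback:
            external = True
    if not (ipv4 or ipv6):
        state = "down"   # no IP addresses found
    elif external:
        state = "other"  # at least one non-loopback address
    else:
        state = "local"  # loopback only
    return state, ipv4, ipv6


if __name__ == "__main__":
    print(classify_network([]))
    print(classify_network(["127.0.0.1", "::1"]))
    print(classify_network(["127.0.0.1", "192.0.2.10"]))
```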

Sources, Selectors and Destinations

The way that Syslogd2 handles Sources, Selectors and Destinations has been inspired and derived from a variety of sources and has changed significantly over time, but has converged on a standard that (I believe) represents the simplest, most common-sense approach I've been able to devise to date.

Starting with the concept that Syslogd2 should retain as much backward-compatibility as possible with traditional syslog configuration files and performance characteristics, every effort has been made to be able to use the traditional syslog file format(s). This drove the decision to retain the familiar "<selector-string> <whitespace> <destination>" syntax for defining an output destination. The configuration file comment policy (summarized and described in more detail elsewhere in this wiki) was modified to allow for in-line comments and to allow for configuration-file lines that Syslogd2 will read but that other daemons will ignore.

In the simple case (where Syslogd2 will be the only application using the config file), the only change needed is to add a 2nd comment character to the start of any line with a valid output line that has been made inactive by prepending a comment-character in column one. In order to avoid undue format changes while adding features and options to the traditional configuration file output lines, all "new" configuration is added as a list of comma-separated options that follow the destination specification.

Test Stubs and Utilities

For those that wish to test or evaluate Syslogd2 or for those that wish to examine and "play" with the code (within the GPL license), Syslogd2 provides a set of test-stubs and utilities in the ./tools subdirectory.
All test stubs can be recognized by their 3-letter name. In general, a name such as XYZ.c translates as:

X: Either a 't' to transmit command-line parameters over a connection or an 'r' to receive data from a connection and display it to the screen.
Y: Either a 's' for a "stream" connection (TCP) or a 'd' for a "datagram" connection (UDP).
Z: One of:

4: Connection is over IPv4.
6: Connection is over IPv6.
p: Connection is over a named pipe.
u: Connection is over a Linux/Unix socket.

As a variation on a theme, some test stubs and utilities add a 4th letter interpreted as follows:

f: File. This test stub takes a file as input instead of command-line input.
c: Command. This identifies the 'command-utility client'. A suffix string added during compilation identifies the binary version that a particular command-client is compiled to be used with.
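
The naming scheme can be read mechanically, as the Python sketch below shows. It only decodes names according to the letters listed above; the function name describe_stub() is invented, and apart from the command-tool name tsucX mentioned earlier on this page, the example names are constructed from the scheme and have not been checked against the contents of ./tools.

```python
# Illustrative sketch only; not Syslogd2 code.
DIRECTION = {"t": "transmit", "r": "receive"}
TRANSPORT = {"s": "stream (TCP)", "d": "datagram (UDP)"}
MEDIUM = {"4": "IPv4", "6": "IPv6", "p": "named pipe", "u": "Linux/Unix socket"}
VARIANT = {"f": "file input", "c": "command-utility client"}


def describe_stub(name):
    """Decode a test-stub name such as 'rd4.c' or 'tsucX'."""
    stem = name[:-2] if name.endswith(".c") else name
    if (len(stem) < 3 or stem[0] not in DIRECTION
            or stem[1] not in TRANSPORT or stem[2] not in MEDIUM):
        raise ValueError(f"{name!r} does not follow the XYZ scheme")
    info = {
        "direction": DIRECTION[stem[0]],
        "transport": TRANSPORT[stem[1]],
        "medium": MEDIUM[stem[2]],
        "variant": None,
        "suffix": "",
    }
    if len(stem) > 3:
        if stem[3] not in VARIANT:
            raise ValueError(f"{name!r} has an unknown 4th letter")
        info["variant"] = VARIANT[stem[3]]
        info["suffix"] = stem[4:]  # e.g. the version suffix of a command-client
    return info


if __name__ == "__main__":
    print(describe_stub("rd4.c"))
    print(describe_stub("tsucX"))
```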

The remaining applications are:

checknet.c: This is essentially a "wrapper" around the "checknet" library that Syslogd2 uses to test the network state. It prints out the state that was found, and returns a numeric value that shell scripts can access via the "$?" shell variable after running the code. One direction for further development is to create a project that would allow administrators to test for various conditions to distinguish one network from another, and that could be called by Syslogd2 as well as shell scripts and other code to provide automated configuration of laptops as they move from home networks to coffee shops to work networks, with network-outages in between. The basics for such a library have already been designed into Syslogd2 via the CAP_NETWORK capability.

popen.c: This is a sample "post-processor" that is still incomplete. The two include files (tools/*.inc) 'belong' to this code. The sample is heavily documented and simplified in an attempt to illustrate the use of threadpools, libraries and settings in the Syslogd2 codebase that can ease development of companion projects.

Executing any of these stub routines without parameters will produce usage information.

For those that wish to activate the "development mode" of Syslogd2 for learning or code modifications (again within the GPL license), the following guidelines are provided:

--> In the main directory are two shell scripts: debug and undebug. The "debug" script uncomments and activates all development debug statements in the code. The "undebug" script reverses this and comments the debug lines back out, effectively removing them from the binaries.
--> Enabling the "development mode" enables the following additional settings:

1. --debug (-D) command-line option to specify a third file (a log-spec) for debug output.
2. --screen (-S) screen output level. (Screen output via printf() is also enabled). The parameter to -S is a simple integer.
3. Additional sub-options to the CAP_WHATIF --testconfig option to write to screen or debug file.
4. Some additional (debug) output provided in configuration display output.
