Syslogd2 Wiki

High capacity syslog data collection, filtering, and management.

Brought to you by: efreesmeyer

HistoryDesign

The Case For Syslogd2

Syslogd2 was conceived as a complete re-evaluation of the syslog data-collection needs of today's corporate network environments. It is based on experience with the limitations and shortcomings of using 1970s-era algorithms (even in multi-threaded wrappers) for network-management data-collection tasks, a purpose for which traditional syslog collection systems are manifestly unsuited. Syslogd2 is the result of my recognition that an entirely new, re-imagined syslog data-collection mechanism is required for today's networks, not just for network-management purposes but also for the management of any sizeable deployment of Linux desktop hosts.

In today's networks:

TCP protocol support is vital for reliably delivering syslog traffic over unreliable links or through firewalls.
Spooling is vital for preventing the loss of remote forensic data during network-link failures.
Filtering is critical to reducing or eliminating the massive volume of unusable, unwanted syslog traffic that is currently transmitted over network links because of limitations in today's syslog processors.

Linux on laptops universally fails to respond to laptop-specific network "events", such as a laptop moving from one wireless environment to another, especially when the laptop is configured to forward syslog information to collection stations on one or more of those networks.
Another common laptop "event" is loss of network signal or being put to sleep, both of which result in closed network interfaces from which current syslog processors are unable to recover.
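
The spooling point above can be illustrated with a small, generic sketch. It is not Syslogd2's implementation: the class name `SpoolingForwarder` and the `send` callback are hypothetical, and the spool here lives only in memory, whereas a real collector would presumably persist it to disk. The idea shown is simply that a message is removed from the spool only after it has been delivered, so a link failure delays messages rather than losing them.

```python
from collections import deque
from typing import Callable, Deque, List


class SpoolingForwarder:
    """Forward messages through `send`; hold them in a spool while the link is down.

    Hypothetical illustration only, not Syslogd2 code.
    """

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                    # assumed transport callback; raises OSError on failure
        self._spool: Deque[str] = deque()    # oldest message on the left

    def forward(self, message: str) -> None:
        # Queue first so ordering is preserved, then try to drain the spool.
        self._spool.append(message)
        self.flush()

    def flush(self) -> int:
        """Try to deliver spooled messages in order; return how many were sent."""
        sent = 0
        while self._spool:
            try:
                self._send(self._spool[0])
            except OSError:
                break                        # link still down: keep the rest spooled
            self._spool.popleft()            # drop a message only after a successful send
            sent += 1
        return sent

    @property
    def pending(self) -> List[str]:
        """Messages still waiting for delivery, oldest first."""
        return list(self._spool)
```

With a callback that raises `ConnectionError` while the link is down, calls to `forward()` accumulate messages in `pending`; the first call made after the link returns delivers the backlog in its original order.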

Syslogd2 Design Departures from Traditional Syslog Processors

The design of Syslogd2 has undergone several structural changes over its years of development as new concepts and techniques have been integrated into its overall capability set.

The most radical departures from traditional syslog daemons in the overall design and architecture of Syslogd2 have to do with Syslogd2's extensive use of threadpools. See here for an overview of the possible architectures of Syslogd2 under various compile-time declarations and for a brief discussion of the pros and cons of each architecture.
Many of the most radical departures from traditional syslog daemons in the area of program-code design have to do with Syslogd2's extensive use of compile-time declarations. Syslogd2 uses several categories of compile-time declarations for configuration purposes in addition to the more common uses of parameter-setting, feature-inclusion and determining the availability of system libraries and services.

Several other fairly radical departures from traditional syslog daemons stem from the technical concepts implemented in Syslogd2. It is these extensions and innovations that change Syslogd2 from "just another syslog daemon" to a "host- and network-management edge data-collector and pre-processor". For some of the more impactful improvements, see here for an overview of base-line features, here for filters, here for reconfiguration options, here for the internal name-cache and here for connection management.

The modular design of Syslogd2 allows for (relatively) easy implementation of additional threadpool types and feature "modules" in future releases.

To summarize, Syslogd2's implementation departs from traditional syslog-processing daemons because of the development philosophies I chose and the experience I gained attempting to implement syslog-dependent host- and network-management systems on production networks while limited to the constraints of existing syslog-processing daemons on Unix and Linux. That experience, combined with developing the programming skills to build what I really needed, has resulted in Syslogd2 and its "new designation" as a host- and network-management syslog collection tool rather than "just another multi-threaded syslog daemon".

The Evolution and History of Syslogd2

The prototype of what would eventually become Syslogd2 was actually hardcoded as a multi-thread-model 5 system. It was only in the 4th or 5th re-design that the idea of using compile-time declarations to modularize and reduce the size of the overall binaries was conceived. From that realization, it was only a small step to "carve out" the additional non-traditional features into optional CAP_*-abilities. The process of carving out CAP_*-abilities continues.

As features were added over time, the configuration and command line grew too complex, and the option list grew too long for even the 4-kilobyte Linux command line.

Option proliferation prompted the consolidation of boolean options from individual command-line values into the (simplified) --enable / --disable options of today.
Option proliferation also prompted the collection of miscellaneous independent values into the single --defaults command-line option of today.
Even the ultimate decision to allow the command line to be moved (or extended) into the configuration file was a response to option proliferation and the growing length of the command line required for even reasonably complex configurations.

To simplify administration and encourage acceptance, a conscious decision was made to remain as true as possible to traditional configuration-file syntax and traditional operation. The default values in Syslogd2 have been chosen so that it emulates traditional behavior when given a traditional configuration file and parameters.

For the most part, this effort has succeeded.

The current comment policy is not yet perfected: Syslogd2 will not ignore traditional configuration-file entries that are commented out with a single '#' in the first column. The short-term fix is to convert the single comment character to a double comment character, thereby converting the entry to a Syslogd2 comment.
Syslogd2 also defaults to basic connection management mode instead of the less-capable (but more compatible) traditional connection management mode.
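
The short-term comment fix described above can be sketched as a small script. This is only an illustration under stated assumptions: the function name `double_leading_comments` is hypothetical, and the sketch assumes the input is a purely traditional configuration file in which every line starting with a single '#' in the first column is meant to be a comment. If a file already contains Syslogd2-specific lines that begin with a single '#', blanket conversion like this may not be appropriate.

```python
def double_leading_comments(lines):
    """Return a copy of `lines` with a single first-column '#' rewritten as '##'.

    Hypothetical helper, not part of Syslogd2. Lines that already start
    with '##', and lines where '#' is not in the first column, are left
    untouched.
    """
    converted = []
    for line in lines:
        if line.startswith("#") and not line.startswith("##"):
            line = "#" + line    # single '#' in column one becomes '##'
        converted.append(line)
    return converted


if __name__ == "__main__":
    sample = [
        "#kern.*    /dev/console\n",       # traditional commented-out entry
        "## already a double comment\n",
        "*.info     /var/log/messages\n",  # active entry, left alone
    ]
    for out in double_leading_comments(sample):
        print(out, end="")
```

Keeping the conversion as a pure function over a list of lines makes it easy to review the result before writing it back over a live configuration file.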

There are several performance settings and differences in Syslogd2 that are part of the "base code" and cannot be "turned off". Examples include automatic detection and support for IPv4 and IPv6, user-assignable input and output sockets for both Linux and IP, connection management of both input and output connections, and the ability to extend the command line into the configuration file.

As a computer engineer and administrator, I believe it is better to provide more options to control system services (even if it adds complexity) than it is to "dumb down" configurations at the expense of flexibility or capability. If configuration default values are set to emulate the "dumbed down" system performance, the best of both approaches can be realized. Syslogd2 parameters are pre-configured to emulate traditional syslog processing results so busy administrators can deploy it with default parameters and get reasonable results, then (when time or need permits) do more detailed configuration analysis and tuning.

This "better to have the option and not need it than not have it when you need it" approach has driven the plethora of parameters in Syslogd2 to cover as many (reasonable) use-cases as I can conceive of.

When deploying Syslogd2 as network-management "edge" data-collectors, Linux platforms (like OSX and Unix platforms) are well suited to multi-homed configurations. Linux platforms have the edge over OSX and Unix due to overall cost of deployment. It was multi-homed configurations (especially those in DMZ environments) that drove the requirement to allow multiple, independent IP socket connections in the prototype code.

Once formed, the concept of using connection-spec options fed on itself.
Once input (command-line) connection-spec entries could be put into the configuration file, adding further options to tailor other aspects of the connection became a "no-brainer".
