This is the first post in the development journal. The intent is to provide more insight into how the sausage is made. For those who'd rather not know how their sausage is made, that preference likely extends to the content of these writeups.
Confluent is intended in the short term to supersede conserver, and in the longer term to take over many tasks currently done in xcatd. I thought I'd take a little time to explain the current state and the plan. This post focuses on the relationship between conserver and confluent, though some points apply to the larger situation as well.
conserver has served us well, but there has been some awkwardness around using it, and some limiting factors on the reasonable addition of capabilities we had not yet added.
On the performance front, conserver is currently all but certain to require a relatively expensive process per active console. confluent folds IPMI into the category of consoles that are 'cheap' to hold open. Additionally, on logging, confluent takes measures to better aggregate writes to disk; disk activity has been a large contributor to conserver's scaling problems. This should make full-time logging feasible for a larger set of users.
On the security front, we have long carried out-of-tree patches to make conserver pay rough attention to our CA, coarsely flagging a client as 'a legitimate user'. This means that all console sessions bear the weight of SSL. It also means that we never reworked the authentication scheme away from being host based, inconveniently restricting the client hosts while not providing an adequately specific measure of the user. In confluent, security is user-centric. When the client runs on the same host, it uses kernel facilities to authenticate a cheaper, unencrypted unix domain socket rather than applying key-based authentication. Ultimately this means the most common case is dead simple and cheap (authenticated, no cryptography, no configuration or keys to get straight) and the remote case is easier (no need to mess about with trusted hosts when you really intend to track users; it can be password based rather than certificate based if desired, and can be linked to PAM if that is preferred over the built-in user database).
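For the local case, this kind of kernel-level authentication can be done on Linux with the `SO_PEERCRED` socket option, which reports the pid, uid, and gid of the process on the other end of a connected unix domain socket. The sketch below is illustrative only and is not confluent's actual code:

```python
import os
import socket
import struct

def peer_credentials(sock):
    """Return (pid, uid, gid) of the process at the other end of a
    connected AF_UNIX socket, via the Linux SO_PEERCRED option."""
    creds = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize('3i'))
    return struct.unpack('3i', creds)

# Demonstration with a socketpair: both ends belong to this process,
# so the reported uid matches our own.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(a)
print(uid == os.getuid())  # True
a.close()
b.close()
```

Because the kernel itself vouches for the peer's identity, no handshake, key material, or encryption is needed for local clients.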
With respect to features, one oft-requested capability has been web-based access. Others have done this through rather unattractive approaches, like Java applets calling out to ssh as a workaround for the underlying capability not being there. confluent provides an http interface usable by a normal browser running normal javascript, with long-lived polls for reasonable performance and without complicating things with respect to proxies. A console widget implemented in javascript is available for developers to reuse in their own web applications. This interface is rather respectable and in many cases can't be distinguished from a conventional socket connection. It does have limits (a bit 'chunky' over high-latency links), so the command line uses a different channel to access the data, but the channels act much the same. In fact, each client session is notified of other sessions the same way whether the peer is connected over http, a unix socket, or a remote direct socket. This is a critical theme of the confluent architecture: the backend knows little about the details, and the respective connection handler abstracts away all distinctions between the communication channels. Backend changes naturally manifest across the different access methods, ensuring that no access method is neglected in the face of development.
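The long-lived poll pattern described above can be sketched in a few lines: the server holds each HTTP request open until console output is available (or a timeout elapses), and the client re-requests immediately after each response. This is a minimal illustration using the standard library, not confluent's actual interface; the handler, URL path, and JSON shape are invented for the example:

```python
import http.server
import json
import queue
import threading
import urllib.request

events = queue.Queue()  # stands in for a console's output stream

class LongPollHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Hold the request open until output arrives or a short
        # timeout elapses; the client re-polls right after each reply.
        try:
            data = events.get(timeout=5)
        except queue.Empty:
            data = ''
        body = json.dumps({'data': data}).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = http.server.HTTPServer(('127.0.0.1', 0), LongPollHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

events.put('hello from the console')
with urllib.request.urlopen(f'http://127.0.0.1:{port}/console') as resp:
    print(json.loads(resp.read())['data'])  # hello from the console
server.shutdown()
```

Since each poll is an ordinary GET that completes normally, intermediate proxies need no special handling, at the cost of some added latency per round trip.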
Another feature is enhanced logging data, which can be used to build a faster and more sophisticated 'replaycons' utility. A great deal of critical metadata is now stored in a binary format designed for machine parsing. However, plain text log data still has a great deal of value, so the metadata is stored alongside a plaintext file, with indexes into the bulk content. Where appropriate, the binary metadata intentionally skips portions of the text file, and portions of the metadata are replicated in plain text within those 'holes'. The primary example is timestamps: the metadata carries a machine-readable timecode, while the plaintext carries a string representation of that exact same data, keeping the plain text useful on its own. Some information is present only in the metadata; for example, the precise timing of every piece of output lives only in the metadata, since that verbosity would ruin the plaintext in terms of readability and file size. The critical design decision is that no information currently available in plaintext is removed, though it may be duplicated in both places.
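The idea of binary metadata indexing into a plaintext log can be shown with a toy sketch: each metadata record holds a timestamp plus the offset and length of the corresponding run of text, so a replay tool can seek straight to any moment. The record layout here (little-endian double plus two unsigned ints) is an assumption for the example; confluent's actual on-disk format is not shown:

```python
import io
import struct
import time

TEXT = io.BytesIO()  # stands in for the plaintext console log
META = io.BytesIO()  # stands in for the binary metadata file
RECORD = struct.Struct('<dII')  # timestamp, offset, length (assumed layout)

def log_console(data, ts=None):
    """Append console output to the text log and index it in metadata."""
    ts = time.time() if ts is None else ts
    TEXT.seek(0, io.SEEK_END)
    offset = TEXT.tell()
    TEXT.write(data.encode())
    META.write(RECORD.pack(ts, offset, len(data)))

def replay():
    """Walk the metadata and pull each indexed span out of the text log."""
    META.seek(0)
    out = []
    while True:
        rec = META.read(RECORD.size)
        if not rec:
            break
        ts, offset, length = RECORD.unpack(rec)
        TEXT.seek(offset)
        out.append((ts, TEXT.read(length).decode()))
    return out

log_console('booting kernel\n', ts=100.0)
log_console('login prompt\n', ts=101.5)
print(replay())
```

Because the text file is untouched by the indexing, `cat` or `grep` on the plaintext keeps working exactly as it would without the metadata.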
The configuration engine bears no small resemblance to a hybrid of the 'objdef' and 'table.column' styles found in xCAT. Compared to xCAT today, a greater emphasis is placed on being reactive (e.g. change a password and all relevant activity is retried or disconnected). Inheritance with formulas is there as in xCAT, though the formula language is both more straightforward and much faster. Confidential data (e.g. passwords) gets more specialized treatment, with redaction and encryption, and hooks coming for more sophisticated tricks (including sealing to a password or to a TPM with a backup recovery password). The important part is that configuration is structured in a way more consistent with typical xCAT practices and no longer requires the service to be restarted to pick up a change. Restarting the service is of course far cheaper than with conserver, but it is still better not to need it at all.
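The inheritance-with-formulas idea can be illustrated with a toy resolver: a node attribute falls back to its group when unset, and a value may be a formula that references other node fields. The syntax here (`{name}` via Python's `str.format`) and the attribute names are invented for the example and are not confluent's actual formula language:

```python
nodes = {
    'n1': {'groups': ['compute'], 'num': '1'},
    'n2': {'groups': ['compute'], 'num': '2', 'bmc': 'special-bmc'},
}
groups = {
    'compute': {'bmc': '{name}-bmc', 'console.method': 'ipmi'},
}

def resolve(name, attribute):
    """Look up an attribute on a node, falling back to its groups,
    then evaluate simple {field} formula references."""
    node = nodes[name]
    value = node.get(attribute)
    if value is None:
        for group in node['groups']:
            if attribute in groups[group]:
                value = groups[group][attribute]
                break
    if value is None:
        return None
    fields = {k: v for k, v in node.items() if k != 'groups'}
    return value.format(name=name, **fields)

print(resolve('n1', 'bmc'))  # n1-bmc  (inherited group formula)
print(resolve('n2', 'bmc'))  # special-bmc  (node-level override)
```

One formula on the group yields a distinct value per node, which is what keeps large, regular clusters down to a handful of configuration entries.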
There is probably more that is pertinent to conserver specifically, but I think that more than covers the big parts.
Anonymous