[Ssic-linux-announce] OpenSSI 1.2.0 released for Debian Sarge (testing)
From: Gopalakrishna NM <go...@in...> - 2004-12-23 07:29:00
The OpenSSI 1.2.0 stable release for Debian testing, 3.1 (Sarge), is now available from OpenSSI.org. The release notes are below.

Regards,
Gopal

---

OpenSSI 1.2.0 is a stable release, suitable for production use. These release notes are a compilation of the notes for 1.1.0 and 1.1.1, for the convenience of users who have not upgraded since 1.0.0. A description of what's changed since 1.1.1 can be found at the bottom.

This release has versions for Fedora Core 2 ("FC2"), Debian testing, and Red Hat 9 ("RH9"). The OpenSSI 1.2.x series will be the last set of releases for RH9, so you should only use RH9 if you're upgrading from 1.0.0 or 1.1.x for RH9 and are unable to do a fresh install on FC2 or Debian.

The OpenSSI kernel is now based on the most recent Fedora Core 1 ("FC1") kernel (2.4.22-1.2199.nptl). This is true for all distributions, since it is difficult to maintain multiple versions of the OpenSSI kernel. It might seem strange that OpenSSI 1.2.0 runs on FC2 but is based on the FC1 kernel; this is because the FC1 kernel is based on the Linux 2.4 kernel, whereas FC2 is based on 2.6. There is currently a project to port OpenSSI to the 2.6 kernel, but it will not be ready for a few more months. When it is ready, it will be part of OpenSSI 1.9/2.0.

Several features have been developed for OpenSSI 1.2 since the last stable release, 1.0.0. One of them is a set of performance enhancements for the Cluster File System ("CFS"). CFS now caches not only remote reads but also remote writes, while still maintaining a coherent view of the filesystem across the cluster. Furthermore, CFS now does asynchronous remote read-ahead of data blocks that programs are likely to want, so that they are already cached locally by the time the program asks for them. Hopefully you will notice a significant performance improvement in your filesystem-intensive applications!

Another new feature is atomic migration of a group of processes.
The group can be either a POSIX process group or the "threads" of a multi-threaded application (on Linux, each thread is a full process). To migrate a POSIX process group, call the migrate command with the negative PID of the process group leader; these semantics are very similar to signaling a process group. To migrate a thread group, call the migrate command with the PID of any "thread" in that group. With both forms of group migration, either the entire group migrates or none of it does: if any process in the group is unable to migrate for any reason, the entire group remains on the old node.

Another process-migration enhancement is the ability to migrate a process while it is holding file record locks. These locks continue to be held during the migration and after the process resumes running on the new node.

LVS-NAT can now be used with OpenSSI. Linux Virtual Server ("LVS") is a third-party open-source project that load-balances TCP connections among the nodes in a cluster. LVS has long been integrated with OpenSSI, making it easier to manage than an LVS cluster without OpenSSI, but only the Direct Routing ("DR") feature of LVS was supported. DR allows load balancing in a cluster where every node has a direct connection to the external network (as well as a network connection to a private switch for the cluster interconnect, as recommended in the OpenSSI installation instructions). Unfortunately, a security feature in the Linux kernel prevents DR from being used in a cluster where only some nodes are connected to the external network. For these situations, the Network Address Translation ("NAT") feature of LVS should be used, and it is now supported on OpenSSI. Note that LVS-NAT is different from the NAT you would use for making outbound connections from a private IP address.
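The group-migration semantics described above can be sketched in shell. This is a hedged illustration, not a verbatim transcript: the migrate command and the negative-PID convention come from these notes, but the exact argument order ("migrate <pid> <node>") and the PIDs/node number are assumptions, so check migrate's man page on your cluster. A dry-run wrapper is used so the sketch is safe to run anywhere.

```shell
# Dry-run wrapper: prints the command instead of executing it.
# Remove the wrapper on a real OpenSSI cluster to act for real.
run() { echo "would run: $*"; }

PGID=1234        # hypothetical PID of a process group leader
TID=5678         # hypothetical PID of any "thread" in a thread group
TARGET_NODE=2    # hypothetical destination node number

# Negative PID => migrate the whole POSIX process group atomically,
# analogous to signaling a process group with e.g. kill -TERM -1234.
run migrate -$PGID $TARGET_NODE

# PID of any thread => migrate the whole thread group atomically.
run migrate $TID $TARGET_NODE
```

Either the whole group lands on the target node or, if any member cannot migrate, the whole group stays put.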
LVS-NAT is for load balancing inbound connections from a public IP address among a cluster of machines that are all connected to a private network, such as an OpenSSI cluster interconnect. Of course, LVS-NAT requires that potential director nodes be connected to both the external network and the cluster interconnect, so that traffic can move between the two networks.

The 'fast' and 'fastnode' commands have been added. 'fastnode' returns the node number of the least-loaded node in the cluster, as determined by the process load-leveling algorithm. 'fast' executes a command on the least-loaded node. Read the man pages for these commands for more information.

Several files were added to /proc/cluster/: nm_rate, nm_log_threshold, and nm_nodedown_disabled. nm_rate can be used to alter how often node-monitoring messages are exchanged (default 1 per second) and how long before a node is declared down (default 10 seconds). nm_log_threshold indicates how many monitoring messages can be missed before a kernel warning is generated (default 2). nm_nodedown_disabled can be set to disable nodedown detection, which is useful if you need to enter the kernel debugger on one of the nodes. Previously, you had to recompile the OpenSSI kernel to change one of these values; now you can do it by simply writing a new value into one of these /proc/cluster files.

The top command was enhanced for this release by Roopa Prabhu. By default, it adds an execution-node-number column and displays only clusterwide information in the header. The node-number column replaces the mem % column, which is potentially confusing. When top is run in localview mode (i.e., `localview top'), it limits the list of processes to just those running on the local node, and it displays all the same information as the base version of top.

To improve performance, the init failover state file was moved from /etc/initstate to /cluster/init/initstate. This avoids the need to constantly hit the large /etc directory.
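The fast/fastnode commands and the /proc/cluster tunables above can be sketched as shell one-liners. The file names and defaults come from these notes, but the value formats written to the files are assumptions (read each file first to see what your kernel expects), so this uses the same dry-run wrapper rather than touching a live cluster.

```shell
# Dry-run wrapper: prints commands instead of executing them.
run() { echo "would run: $*"; }

# Print the node number of the least-loaded node, then run a build there.
run fastnode
run fast make -j4

# Raise the missed-message warning threshold from the default 2 to 3.
# (Value format assumed; on a real node, cat the file first to confirm.)
run sh -c 'echo 3 > /proc/cluster/nm_log_threshold'

# Disable nodedown detection before entering the kernel debugger on a node,
# so the paused node is not declared down by its peers.
run sh -c 'echo 1 > /proc/cluster/nm_nodedown_disabled'
```

Before these files existed, each of these changes required recompiling the OpenSSI kernel.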
Since the last development release (1.1.1), OpenSSI has gained the ability to automatically move load-leveled processes off a node that is gracefully shutting down due to a clusternode_shutdown call. Another new feature is an interactive 'e' command for top, which prompts the user for a node number and displays only processes on that node. This new feature is only available when top is run in defaultview (not localview).

A README was added for configuring clusterwide NFS client mounts.

Several interface changes were made to HA-LVS, including a new /proc/cluster/lvs_internal_gw file and changes to make /proc/cluster/lvs_routing use the seq_file interface. The files /usr/sbin/clusterip.sh and /etc/default/lvs_routing no longer exist.

There have been many bug fixes. See the ChangeLog for more details.
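Draining a node via the graceful shutdown described above might look like the following. The clusternode_shutdown command is named in these notes, but its argument syntax is not; the "<node>" argument here is purely an assumption, so consult its man page, and the dry-run wrapper keeps this sketch from acting on a real cluster.

```shell
# Dry-run wrapper: prints the command instead of executing it.
run() { echo "would run: $*"; }

NODE=3   # hypothetical node to take down for maintenance

# Gracefully shut the node down; with 1.2.0, load-leveled processes are
# automatically migrated off it rather than being killed.
# (Argument form assumed -- check clusternode_shutdown's man page.)
run clusternode_shutdown $NODE
```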