[Ssic-linux-announce] OpenSSI 1.2.0 released for Debian Sarge (testing)
From: Gopalakrishna NM <go...@in...> - 2004-12-23 07:29:00
The OpenSSI 1.2.0 stable release for Debian testing, 3.1 (Sarge), is now available from OpenSSI.org. The release notes are below.

Regards,
Gopal

---

OpenSSI 1.2.0 is a stable release, suitable for production use. These release notes are a compilation of the notes for 1.1.0 and 1.1.1, for the convenience of users who have not upgraded since 1.0.0. A description of what's changed since 1.1.1 can be found at the bottom.

This release has versions for Fedora Core 2 ("FC2"), Debian testing, and Red Hat 9 ("RH9"). The OpenSSI 1.2.x series will be the last set of releases for RH9, so you should only use RH9 if you're upgrading from 1.0.0 or 1.1.x for RH9 and are unable to do a fresh install on FC2 or Debian.

The OpenSSI kernel is now based on the most recent Fedora Core 1 ("FC1") kernel (2.4.22-1.2199.nptl). This is true for all distributions, since it is difficult to maintain multiple versions of the OpenSSI kernel. It might seem strange that OpenSSI 1.2.0 runs on FC2 but is based on the FC1 kernel; this is because the FC1 kernel is based on the Linux 2.4 kernel, whereas FC2 is based on 2.6. There is currently a project to port OpenSSI to the 2.6 kernel, but it will not be ready for a few more months. When it is ready, it will be part of OpenSSI 1.9/2.0.

Several features have been developed for OpenSSI 1.2 since the last stable release, 1.0.0. One of them is a set of performance enhancements for the Cluster File System ("CFS"). CFS now caches not only remote reads but also remote writes, while still maintaining a coherent view of the filesystem across the cluster. Furthermore, CFS now does asynchronous remote read-ahead of data blocks that programs are likely to want, so that they are already cached locally by the time the program asks for them. Hopefully you will notice a significant performance improvement in your filesystem-intensive applications!

Another new feature is atomic migration of a group of processes.
The group can be either a POSIX process group or the "threads" of a multi-threaded application (on Linux, each thread is a full process). To migrate a POSIX process group, call the migrate command with the negative PID of the process group leader; these semantics are very similar to signaling a process group. To migrate a thread group, call the migrate command with the PID of any "thread" in that group. With both forms of group migration, either the entire group migrates or none of it does: if any process in the group is unable to migrate for any reason, the entire group remains on the old node.

Another process-migration enhancement is the ability to migrate a process while it is holding file record locks. These locks continue to be held during the migration and after the process resumes running on the new node.

LVS-NAT can now be used with OpenSSI. Linux Virtual Server ("LVS") is a third-party open-source project that load-balances TCP connections among the nodes in a cluster. LVS has long been integrated with OpenSSI, making it easier to manage than an LVS cluster without OpenSSI, but only the Direct Routing ("DR") feature of LVS was supported. DR allows load balancing in a cluster where every node has a direct connection to the external network (as well as a network connection to a private switch for the cluster interconnect, as recommended in the OpenSSI installation instructions). Unfortunately, a security feature in the Linux kernel prevents DR from being used in a cluster where only some nodes are connected to the external network. For these situations, the Network Address Translation ("NAT") feature of LVS should be used, and it is now supported on OpenSSI. Note that LVS-NAT is different from the NAT you would use for making outbound connections from a private IP address.
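The group-migration semantics described above can be sketched in shell. This is a hedged illustration, not a verbatim transcript: the migrate command and the negative-PID convention come from these notes, but the exact argument order ("migrate <pid> <node>") and the PIDs/node number are assumptions, so check migrate's man page on your cluster. A dry-run wrapper is used so the sketch is safe to run anywhere.

```shell
# Dry-run wrapper: prints the command instead of executing it.
# Remove the wrapper on a real OpenSSI cluster to act for real.
run() { echo "would run: $*"; }

PGID=1234        # hypothetical PID of a process group leader
TID=5678         # hypothetical PID of any "thread" in a thread group
TARGET_NODE=2    # hypothetical destination node number

# Negative PID => migrate the whole POSIX process group atomically,
# analogous to signaling a process group with e.g. kill -TERM -1234.
run migrate -$PGID $TARGET_NODE

# PID of any thread => migrate the whole thread group atomically.
run migrate $TID $TARGET_NODE
```

Either the whole group lands on the target node or, if any member cannot migrate, the whole group stays put.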
LVS-NAT is for load balancing inbound connections from a public IP address among a cluster of machines that are all connected to a private network, such as an OpenSSI cluster interconnect. Of course, LVS-NAT requires that potential director nodes be connected to both the external network and the cluster interconnect, so that traffic can move between the two networks.

The 'fast' and 'fastnode' commands have been added. 'fastnode' returns the node number of the least-loaded node in the cluster, as determined by the process load-leveling algorithm. 'fast' executes a command on the least-loaded node. Read the man pages for these commands for more information.

Several files were added to /proc/cluster/: nm_rate, nm_log_threshold, and nm_nodedown_disabled. nm_rate can be used to alter how often node-monitoring messages are exchanged (default 1 per second) and how long before a node is declared down (default 10 seconds). nm_log_threshold indicates how many monitoring messages can be missed before a kernel warning is generated (default 2). nm_nodedown_disabled can be set to disable nodedown detection, which is useful if you need to enter the kernel debugger on one of the nodes. Previously, you had to recompile the OpenSSI kernel to change one of these values; now you can do it by simply writing a new value into one of these /proc/cluster files.

The top command was enhanced for this release by Roopa Prabhu. By default, it adds an execution-node-number column and displays only clusterwide information in the header. The node-number column replaces the mem % column, which is potentially confusing. When top is run in localview mode (i.e., `localview top'), it limits the list of processes to just those running on the local node, and it displays all the same information as the base version of top.

To improve performance, the init failover state file was moved from /etc/initstate to /cluster/init/initstate. This avoids the need to constantly hit the large /etc directory.
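The fast/fastnode commands and the /proc/cluster tunables above can be sketched as shell one-liners. The file names and defaults come from these notes, but the value formats written to the files are assumptions (read each file first to see what your kernel expects), so this uses the same dry-run wrapper rather than touching a live cluster.

```shell
# Dry-run wrapper: prints commands instead of executing them.
run() { echo "would run: $*"; }

# Print the node number of the least-loaded node, then run a build there.
run fastnode
run fast make -j4

# Raise the missed-message warning threshold from the default 2 to 3.
# (Value format assumed; on a real node, cat the file first to confirm.)
run sh -c 'echo 3 > /proc/cluster/nm_log_threshold'

# Disable nodedown detection before entering the kernel debugger on a node,
# so the paused node is not declared down by its peers.
run sh -c 'echo 1 > /proc/cluster/nm_nodedown_disabled'
```

Before these files existed, each of these changes required recompiling the OpenSSI kernel.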
Since the last development release (1.1.1), OpenSSI has gained the ability to automatically move load-leveled processes off a node that is gracefully shutting down due to a clusternode_shutdown call. Another new feature is an interactive 'e' command for top, which prompts the user for a node number and displays only processes on that node. This new feature is only available when top is run in defaultview (not localview).

A README was added for configuring clusterwide NFS client mounts.

Several interface changes were made to HA-LVS, including a new /proc/cluster/lvs_internal_gw file and changes to make /proc/cluster/lvs_routing use the seq_file interface. The files /usr/sbin/clusterip.sh and /etc/default/lvs_routing no longer exist.

There have been many bug fixes. See the ChangeLog for more details.
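Draining a node via the graceful shutdown described above might look like the following. The clusternode_shutdown command is named in these notes, but its argument syntax is not; the "<node>" argument here is purely an assumption, so consult its man page, and the dry-run wrapper keeps this sketch from acting on a real cluster.

```shell
# Dry-run wrapper: prints the command instead of executing it.
run() { echo "would run: $*"; }

NODE=3   # hypothetical node to take down for maintenance

# Gracefully shut the node down; with 1.2.0, load-leveled processes are
# automatically migrated off it rather than being killed.
# (Argument form assumed -- check clusternode_shutdown's man page.)
run clusternode_shutdown $NODE
```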