Thread: [SSI] process migration
From: Mathias N. <mn...@os...> - 2002-11-13 09:56:06
|
Hi,

I was wondering whether it is possible with SSIC to migrate an entire process, so that the origin node (or home node, as in MOSIX) can be shut down? I wrote some test programs and figured out that there are still some dependencies on the home node. What kind of dependencies? Aneesh talked about some pseudo processes.

What about network connections via TCP/IP or UDP after a process migrates? How does SSIC handle open network connections while the process is migrating?

Regards,
Mathias
|
From: Aneesh K. K.V <ane...@di...> - 2002-11-13 10:46:38
|
I checked the program below. I started it on node2 and, immediately after the fork, rebooted node2.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	FILE *fp;

	if ((fp = fopen("/tmp/myfile", "w+")) == NULL) {
		perror("Error:");
		exit(1);
	}
	getchar();
	if (fork() != 0)         /* parent exits; the child carries on */
		exit(0);
	close(0);                /* detach from the controlling terminal */
	close(1);
	close(2);
	setsid();
	fwrite("SSIC:", 5, 1, fp);
	fwrite("SSIC:", 5, 1, fp);
	fflush(fp);              /* just to make sure it is flushed properly */
	migrate(1);              /* libcluster: move this process to node 1 */
	sleep(30);
	fwrite("After migrate SSIC:", 19, 1, fp);
	fwrite("SSIC:", 5, 1, fp);
	fwrite("SSIC:", 5, 1, fp);
	fwrite("SSIC:", 5, 1, fp);
	fflush(fp);
	return 0;
}

The content of /tmp/myfile was:

aneesh@dhavalgiri:~$ more /tmp/myfile
SSIC:SSIC:After migrate SSIC:SSIC:SSIC:SSIC:

That means a.out ran on node1 even after node2 was rebooted. So for a simple application like the above it will work.

On Wed, 2002-11-13 at 15:25, Mathias Noack wrote:
> Hi,
>
> I was wondering whether it is possible with SSIC to migrate an entire
> process, so that the origin node (or home node, as in MOSIX) can be
> shut down? I wrote some test programs and figured out that there are
> still some dependencies on the home node.

Have you detached the app from the controlling terminal?

> What kind of dependencies? Aneesh talked about some pseudo processes.

??????

> What about network connections via TCP/IP or UDP after a process
> migrates? How does SSIC handle open network connections while the
> process is migrating?

All network connections are routed to the node on which the connection originated, more or less the same as the MOSIX home-node concept. This is because the socket is bound to an IP that is local to that node, and you cannot rebind it on the node to which the process migrated (that node does not have the IP). But once we have full clusterwide IP support (with the full DNET code), this should go away.

Now the interesting part (I am not sure about the details below): what happens to SysV IPC? I don't think the message queue migrates with the application when the application migrates. Because we have a clusterwide message queue, it is not needed: all message-queue activity still happens on the node where the queue was created. (Here we could do a performance optimization by migrating the queue if reads/writes are happening frequently from some other node. The IBM DLM code does something like that for locks: it moves a lock to the node doing the most lock/unlock activity.)

Maybe someone else can elaborate on how IPC is handled?

> Regards,
> Mathias

-aneesh
|
From: Bruce W. <br...@ka...> - 2002-11-13 18:28:01
|
snip
> That means a.out ran on node1 even after node2 was rebooted. So for a
> simple application like the above it will work.
snip

> > What kind of dependencies? Aneesh talked about some pseudo processes.

As I responded in another message, there is no pseudo process on the process's creation node. The creation node does keep track of the existence of the process and of the location where it is running, for two reasons: first, so it doesn't reuse the pid; second, so it can help other processes locate the process in order to send signals, etc. (all done transparently in the kernel, of course). If the process creation node (called the origin node in the code) goes away, however, another node (the surrogate origin node) transparently takes over the tracking of the process. Thus the loss of the creation node is transparent. Note that when the creation node rejoins the cluster, it resumes the tracking of old processes, for lots of good reasons.

There are currently other object dependencies, however. Objects like sockets, ptys, physical devices, pipes, SysV IPC, etc. are always created on the node where the process is running at the time it creates them (if an object is created before a migrate, it is thus on the node the process is migrating away from; if it is created after the migrate (or rexec or rfork), it is created on the new node). The architectural plan is that objects will be able to move, just like processes. When that is complete, processes will be even more highly available. Some notes about the various objects:

a. pipes, fifos, semaphores, message queues, shared memory, Unix domain sockets, ... (all IPC objects that stay within the cluster) - these are typically created to communicate between processes, so if the processes themselves are distributed, the IPC object can at best be located on one of the execution nodes. If that node crashes, the object and some of the processes using it would be lost. Our goal, for availability, is to move all processes working together, and all the objects they are using, to the same node, when appropriate.

b. internet sockets - connections that come into the cluster via the LVS VIP (Virtual IP) can be moved (the LVS director would have to be told the new location to redirect to). To the extent that the socket goes to telnetd and on to a pty, telnetd and the pty master and slave would also have to move. All this is possible but not done yet. Outgoing connections currently use the IP address of the local card as part of the name of the socket, so these cannot be successfully migrated. Solving that would involve either using the VIP as the outgoing IP address (I believe TruClusters does it this way) or doing local device IP address failover (which can get out of hand and is mostly unnecessary when you have a VIP).

c. ptys - the master and slave have to move together.

bruce

> > What about network connections via TCP/IP or UDP after a process
> > migrates? How does SSIC handle open network connections while the
> > process is migrating?

See above.

> All network connections are routed to the node on which the connection
> originated, more or less the same as the MOSIX home-node concept. This
> is because the socket is bound to an IP that is local to that node, and
> you cannot rebind it on the node to which the process migrated. But
> once we have full clusterwide IP support (with the full DNET code),
> this should go away.

I think I just stated this a different way above.

> Now the interesting part (I am not sure about the details below): what
> happens to SysV IPC?
> snip
> Maybe someone else can elaborate on how IPC is handled?

I think you covered it. I'm surprised to hear that lock management of active locks moves in the DLM. I'll have to look into that.

bruce
|
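To make the outgoing-connection point in (b) concrete: in ordinary sockets code, the source address fixed at bind()/connect() time becomes part of the connection's name. A minimal sketch of the "use the VIP for the outgoing address" idea follows; the VIP 10.0.0.100 is a made-up example and nothing here is OpenSSI code:

#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int connect_via_vip(const char *dest_ip, unsigned short dest_port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in vip = { 0 }, dst = { 0 };

    if (fd < 0)
        return -1;

    /* Bind the source end to the cluster-wide VIP (10.0.0.100 is a
     * made-up address) instead of letting the kernel pick the local
     * card's address, so the connection's name contains no node-local
     * IP that a migration would invalidate. */
    vip.sin_family = AF_INET;
    vip.sin_addr.s_addr = inet_addr("10.0.0.100");
    vip.sin_port = 0;                     /* any source port */
    if (bind(fd, (struct sockaddr *)&vip, sizeof(vip)) < 0) {
        close(fd);
        return -1;
    }

    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = inet_addr(dest_ip);
    dst.sin_port = htons(dest_port);
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

With the default behavior (no bind before connect), the kernel stamps the socket with the local card's address, which is exactly why such connections cannot survive a move to a node that lacks that address.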
From: Aneesh K. K.V <ane...@di...> - 2002-11-14 05:47:32
|
On Wed, 2002-11-13 at 23:57, Bruce Walker wrote:
> snip
> > That means a.out ran on node1 even after node2 was rebooted. So for a
> > simple application like the above it will work.
> snip
> > > What kind of dependencies? Aneesh talked about some pseudo processes.

It was not me who talked about pseudo processes. I was replying to a mail from the linux-kernel mailing list; Prasad was talking about a pseudo process that he was planning in his implementation.

> As I responded in another message, there is no pseudo process on the
> process's creation node. The creation node does keep track of the
> existence of the process and of the location where it is running, for
> two reasons: first, so it doesn't reuse the pid; second, so it can help
> other processes locate the process in order to send signals, etc. (all
> done transparently in the kernel, of course). If the process creation
> node (called the origin node in the code) goes away, however, another
> node (the surrogate origin node) transparently takes over the tracking
> of the process. Thus the loss of the creation node is transparent. Note
> that when the creation node rejoins the cluster, it resumes the
> tracking of old processes, for lots of good reasons.

How do I get to know about the surrogate node? From the pid I can find the creation node (right?). But if the creation node goes down, which subsystem gives me information about the surrogate origin node? Is this the master node? In that case, what happens if the master node also goes down? Is this information rebuilt by asking all the nodes?

Any hint on the code? (Is it VPROC I should be looking at?)

NOTE regarding the DLM: the lock manager documentation can be found at
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/opendlm/opendlm/docs/lockmgr.txt?rev=HEAD&content-type=text/plain

-aneesh
|
From: Bruce W. <br...@ka...> - 2002-11-15 18:57:20
|
> How do I get to know about the surrogate node? From the pid I can find
> the creation node (right?). But if the creation node goes down, which
> subsystem gives me information about the surrogate origin node? Is this
> the master node? In that case, what happens if the master node also
> goes down? Is this information rebuilt by asking all the nodes?
>
> Any hint on the code? (Is it VPROC I should be looking at?)

How it works:

a. The pid has the creation node encoded in it (and processes keep the same pid when they migrate).

b. The creation node (the code refers to it as the origin node) tracks whether a given pid is in use and where it is currently running.

c. There is one designated "surrogate origin" node in the cluster. It is by definition the same as the current CLMS node (the node which runs the membership protocol). All nodes know who the current surrogate origin node is, so if a creation node is down, they ask the surrogate.

d. When a node reboots, the surrogate purges its tracking information for pids that have that node as their creation node, and all nodes repopulate the creation node with information on any processes that have that node as their creation node.

e. If the surrogate crashes, the new surrogate is the new CLMS master, and all nodes repopulate the new surrogate with information about any processes running on them whose origin node is not up.

There is appropriate "locking" so that it is never the case (even with an arbitrary number of simultaneous failures) that a process is reported as not existing when in fact it does.

The code would be in the vproc nodedown and nodeup handling. Laura can point you to it specifically.

bruce

> NOTE regarding the DLM: the lock manager documentation can be found at
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/opendlm/opendlm/docs/lockmgr.txt?rev=HEAD&content-type=text/plain

thanks

> -aneesh
|
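Point (a) can be illustrated with a small sketch. The 16-bit split below is an invented field width for illustration only, not necessarily OpenSSI's actual pid layout:

/* Illustrative only: assumes pids are partitioned so that the creation
 * (origin) node number lives in the high bits of the pid. The 16-bit
 * field width here is an assumption, not OpenSSI's real encoding. */
#include <stdio.h>

#define NODE_SHIFT 16   /* hypothetical: node number in bits 16 and up */

static unsigned long make_pid(int node, unsigned long local_pid)
{
    return ((unsigned long)node << NODE_SHIFT) | local_pid;
}

static int pid_to_origin_node(unsigned long pid)
{
    return (int)(pid >> NODE_SHIFT);   /* recover the creation node */
}

int main(void)
{
    unsigned long pid = make_pid(2, 1234);   /* a pid created on node 2 */
    printf("pid %lu was created on node %d\n", pid, pid_to_origin_node(pid));
    return 0;
}

Under a scheme like this, any node can derive the creation node from the pid alone; only when that node is down does it need to consult the surrogate origin, per (c).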
From: Christian L. <ly...@po...> - 2003-01-24 21:14:31
|
Hi,

	Does SSI have automatic process migration? The help for config_mosix_ll seems to suggest that it does, but I couldn't see it working, nor could I migrate processes with the migrate command. I tried to run a little shell script with a long loop inside, just to waste CPU, but even running several instances, none of them migrated. Forcing a migration with the migrate command doesn't work either!

	The kernel was compiled with all "cluster" options.

	Hm.... I just saw a "cluster_start" command here.... what does it do? It asks for an /etc/cluster.conf that I don't have!

--
Christian Lyra
POP-PR - RNP

Thus spake the master programmer:
"A well-written program is its own heaven; a poorly-written program is its own hell."
					The Tao Of Programming
|
From: John B. <joh...@hp...> - 2003-01-24 21:35:17
|
Christian Lyra wrote:
> Hi
> Does SSI have automatic process migration? The help for config_mosix_ll
> seems to suggest that it does, but I couldn't see it working, nor could
> I migrate processes with the migrate command. I tried to run a little
> shell script with a long loop inside, just to waste CPU, but even
> running several instances, none of them migrated. Forcing a migration
> with the migrate command doesn't work either!

As to automatic migration: certain steps must be taken first to enable and set it up. I just peeked, and it does not seem that anyone has updated the documentation on mosixll. I will beat someone up about this.

As to the migrate command: what process did you try to migrate? There are some things that cannot be migrated at this time (threaded processes; processes that use obscure features).

Try this, assuming you are logged into node 1:

$ clusternode_num
1
$ migrate 2 $$
$ clusternode_num
2

If the second clusternode_num doesn't produce a "2" (or whatever node you used), then I am puzzled.

> The kernel was compiled with all "cluster" options.
>
> Hm.... I just saw a "cluster_start" command here.... what does it do?
> It asks for an /etc/cluster.conf that I don't have!

Something for CI, not SSI. Don't worry about it.

John Byrne
|
From: Laura R. <lra...@ka...> - 2003-01-25 02:08:14
|
> Christian Lyra wrote:
> > Hi
> > Does SSI have automatic process migration? The help for
> > config_mosix_ll seems to suggest that it does, but I couldn't see it
> > working, nor could I migrate processes with the migrate command.
> snip
>
> As to automatic migration: certain steps must be taken first to enable
> and set it up. I just peeked, and it does not seem that anyone has
> updated the documentation on mosixll. I will beat someone up about this.

Sorry for not updating the documentation... I have just finished updating it and will check it in. A quick answer to your questions, with a sample transcript after the steps.

If you have several applications/programs that you want automatically loadleveled:

1. Add the absolute pathname of each program to the /proc/cluster/loadlevellist file.
2. Run "loadlevel -a on" (this turns loadleveling on for all the nodes).
3. Start your application.

If you just want to loadlevel one program:

1. loadlevel -a on
2. loadlevel <application name>

The "loads" command should tell you whether loadleveling is on or off. (An * next to the load means that it is off.)

Remember that anything that is in /proc/cluster/loadlevellist, or was started with the loadlevel command, is eligible to move, as are any children it forks. The children inherit the loadlevel characteristic from their parent.

laura
|
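As a concrete transcript of the first recipe, in the style of the earlier examples. The program path is hypothetical, and it assumes /proc/cluster/loadlevellist accepts appends via echo:

$ loadlevel -a on                                # loadleveling on, cluster-wide
$ echo /usr/local/bin/burnit.sh >> /proc/cluster/loadlevellist
$ /usr/local/bin/burnit.sh &                     # now eligible to move, as are its children
$ loads                                          # no '*' next to the loads: loadleveling is on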
From: <mar...@me...> - 2003-01-27 08:13:37
|
Quoting John Byrne <joh...@hp...>:
>
> $ clusternode_num
> 1
> $ migrate 2 $$
> $ clusternode_num
> 2

Any reason why this only works as root? When running under my own uid, I get no errors and no migration.

I'm running the binary 0.8.0r1 on top of Red Hat 7.3. Users are authenticated through YP/NIS.

Martin
--
"Computer science is not about computers any more than astronomy is about telescopes." -- EW Dijkstra
|
From: John B. <joh...@hp...> - 2003-01-27 19:16:42
|
Martin Høy wrote:
> Quoting John Byrne <joh...@hp...>:
> >
> > $ clusternode_num
> > 1
> > $ migrate 2 $$
> > $ clusternode_num
> > 2
>
> Any reason why this only works as root? When running under my own uid,
> I get no errors and no migration.

The migrate command as currently defined simply sends a signal to the process, so there is no convenient way to feed back any status information. A process can migrate itself using the migrate() call in libcluster, and it will receive any status information (a sketch follows below).

> I'm running the binary 0.8.0r1 on top of Red Hat 7.3. Users are
> authenticated through YP/NIS.

In order for the process to migrate successfully, all its files and directories must be on clusterized filesystems, which means GFS, or CFS stacked. CFS will currently stack automatically only on ext2; the code can be modified to allow others. However, NFS is currently a problem.

Since you are using NIS, are you using NFS by any chance?

John

> Martin
> --
> "Computer science is not about computers any more than astronomy
> is about telescopes." -- EW Dijkstra
|
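A minimal sketch of that self-migration path. The migrate() prototype shown is an assumption (check libcluster's header for the real signature), and the target node number is arbitrary:

/* Sketch: a process moves itself with libcluster's migrate() and,
 * unlike the signal-based migrate command, sees the result directly.
 * The prototype below is assumed, not copied from libcluster. */
#include <stdio.h>
#include <errno.h>
#include <string.h>

extern int migrate(int node);           /* assumed libcluster prototype */

int main(void)
{
    if (migrate(2) < 0) {               /* try to move ourselves to node 2 */
        fprintf(stderr, "migrate failed: %s\n", strerror(errno));
        return 1;
    }
    /* On success, execution continues here on node 2. */
    printf("migrated successfully\n");
    return 0;
}

Link against libcluster when building (e.g. -lcluster, assuming that is the library name).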
From: <mar...@me...> - 2003-01-28 07:46:36
|
Quoting John Byrne <joh...@hp...>:
>
> In order for the process to migrate successfully, all its files and
> directories must be on clusterized filesystems, which means GFS, or
> CFS stacked. CFS will currently stack automatically only on ext2; the
> code can be modified to allow others. However, NFS is currently a
> problem.
>
> Since you are using NIS, are you using NFS by any chance?

Ah, so that's the problem. Yes, /home is mounted via NFS from a file server.

Are there plans to enable CFS on NFS? And if so, when?

Regards,
Martin
--
"Computer science is not about computers any more than astronomy is about telescopes." -- EW Dijkstra
|
From: John B. <joh...@hp...> - 2003-01-24 22:09:22
|
Christian Lyra wrote:
> > As to the migrate command: what process did you try to migrate? There
> > are some things that cannot be migrated at this time (threaded
> > processes; processes that use obscure features).
> >
> > Try this, assuming you are logged into node 1:
> >
> > $ clusternode_num
> > 1
> > $ migrate 2 $$
> > $ clusternode_num
> > 2
>
> it works... but why doesn't the following?
>
> c3sl2:~# cat burnit.sh
> #!/bin/sh
> for i in $(seq 1 2000000); do foo=$i; done;
>
> c3sl2:~# ./burnit.sh &
> [1] 65906
> c3sl2:~# migrate 2 65906
> c3sl2:~# where_pid 65906
> 1
>
> ???????

I don't know. It just worked for me on my Itanium test systems. I'll try it on ia32 and see if things are any different. I don't have those systems up at the moment, nor do I have a kernel built, so it will be a while.

John Byrne
|
From: John B. <joh...@hp...> - 2003-01-27 23:35:42
|
John Byrne wrote:
> Christian Lyra wrote:
> > it works... but why doesn't the following?
> >
> > c3sl2:~# cat burnit.sh
> > #!/bin/sh
> > for i in $(seq 1 2000000); do foo=$i; done;
> >
> > c3sl2:~# ./burnit.sh &
> > [1] 65906
> > c3sl2:~# migrate 2 65906
> > c3sl2:~# where_pid 65906
> > 1
> >
> > ???????

This is causing me trouble on ia32. I'm not quite sure why, yet.

John
|
From: Christian L. <ly...@po...> - 2003-01-28 12:31:20
|
> > it works... but why doesn't the following?
> >
> > c3sl2:~# cat burnit.sh
> > #!/bin/sh
> > for i in $(seq 1 2000000); do foo=$i; done;
> >
> > c3sl2:~# ./burnit.sh &
> > [1] 65906
> > c3sl2:~# migrate 2 65906
> > c3sl2:~# where_pid 65906
> > 1
> >
> > ???????

> This is causing me trouble on ia32. I'm not quite sure why, yet.

	Ow... bug hunting season initiated :-). Well... I don't know why, but the demo2002 program works very well; processes migrate, etc.

	I tried again with the script above, echoed it into the loadlevellist, and ran various instances. Some of them migrated automatically to other nodes, but the migrate program still doesn't work!

> John

--
Christian Lyra
POP-PR - RNP

Something mysterious is formed, born in the silent void. Waiting alone and unmoving, it is at once still and yet in constant motion. It is the source of all programs. I do not know its name, so I will call it the Tao of Programming.
					The Tao Of Programming
|
From: John B. <joh...@hp...> - 2003-02-05 01:37:06
|
I checked in a fix to the CVS repository so your "burnit" test migrates reliably. (At least for me.) John |
From: Christian L. <ly...@po...> - 2003-01-25 17:51:29
|
Hi,

	I think all the problems I'm facing are somehow kernel related. I couldn't get the CVS patches to work, and even the ssic-0.8.0 patches, although they apply cleanly, generate a lot of warnings while compiling.

	So I ask, if possible, that someone build a kernel-image .deb (Aneesh?), with as much as possible built as modules. I'm using Athlon machines and would appreciate a kernel compiled for the Athlon architecture, but I know that is too much to ask :), so i586 will do. (But netfilter is essential!)

	BTW, I didn't see any warnings in the documentation about possible bad interactions between various kernel components/configurations, like (for example) turning on ACPI/APM breaking something else, and so on.

	Please let me know if there is something else I can do to help fix this problem (kernel programming is too much for me, but I can do other things :-) ).

	Christian Lyra

On Friday 24 January 2003 20:09, you wrote:
> Christian Lyra wrote:
> > > As to the migrate command: what process did you try to migrate?
> > > There are some things that cannot be migrated at this time
> > > (threaded processes; processes that use obscure features).
> > >
> > > Try this, assuming you are logged into node 1:
> > >
> > > $ clusternode_num
> > > 1
> > > $ migrate 2 $$
> > > $ clusternode_num
> > > 2
> >
> > it works... but why doesn't the following?
> >
> > c3sl2:~# cat burnit.sh
> > #!/bin/sh
> > for i in $(seq 1 2000000); do foo=$i; done;
> >
> > c3sl2:~# ./burnit.sh &
> > [1] 65906
> > c3sl2:~# migrate 2 65906
> > c3sl2:~# where_pid 65906
> > 1
> >
> > ???????
>
> I don't know. It just worked for me on my Itanium test systems. I'll
> try it on ia32 and see if things are any different. I don't have those
> systems up at the moment, nor do I have a kernel built, so it will be a
> while.
>
> John Byrne
|
From: Aneesh K. K.V <ane...@di...> - 2003-01-27 13:05:18
|
On Sat, 2003-01-25 at 23:20, Christian Lyra wrote:
> Hi,
>
> I think all the problems I'm facing are somehow kernel related. I
> couldn't get the CVS patches to work, and even the ssic-0.8.0 patches,
> although they apply cleanly, generate a lot of warnings while
> compiling.

Can you send me your .config file?

> So I ask, if possible, that someone build a kernel-image .deb
> (Aneesh?), with as much as possible built as modules. I'm using Athlon
> machines and would appreciate a kernel compiled for the Athlon
> architecture, but I know that is too much to ask :), so i586 will do.
> (But netfilter is essential!)

I will try to build .deb files.

> BTW, I didn't see any warnings in the documentation about possible bad
> interactions between various kernel components/configurations, like
> (for example) turning on ACPI/APM breaking something else, and so on.
>
> Please let me know if there is something else I can do to help fix
> this problem (kernel programming is too much for me, but I can do
> other things :-) ).
>
> Christian Lyra
|
From: stevie m. <ste...@ho...> - 2003-09-29 12:20:50
|
I have read that OpenSSI uses the load-balancing algorithm from openMosix but not the migration technology. Is the load balancing based on the resource costing method from openMosix? What about the migration technology used?

Stephen McKibbin
MPhil, Sheffield Hallam University
|
From: Laura R. <lra...@ka...> - 2003-09-30 01:11:19
|
OpenSSI is currently using the openMosix kernel code, version 1.5.2. As for the resource costing method from openMosix, which uses CPU, memory, and I/O resources to determine which node is the best to go to (i.e., the lowest cost): OpenSSI uses everything except the I/O cost information to select the best node.

The OpenSSI migration model is quite different from openMosix's. OpenSSI uses vprocs for process management, which allows us to move the entire process, both user and kernel context, to the new node, so nothing remains at the "home" node. The chosen process calls the mosix algorithm to find the best node, then migrates itself to the new node and continues running. Bruce can give a better history of vprocs.

laura

> I have read that OpenSSI uses the load-balancing algorithm from
> openMosix but not the migration technology. Is the load balancing based
> on the resource costing method from openMosix? What about the migration
> technology used?
>
> Stephen McKibbin
> MPhil, Sheffield Hallam University
|
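The flow Laura describes — compute the lowest-cost node, then move the whole process — might look roughly like this. find_best_node() is a hypothetical stand-in for the openMosix-derived costing code (CPU and memory, no I/O in OpenSSI), and the migrate() prototype is assumed; nothing here is actual OpenSSI kernel code:

/* Rough sketch only, under the assumptions stated above. */
extern int migrate(int node);                 /* assumed libcluster call */

struct node_cost { int node; long cost; };    /* CPU+memory derived cost */

static int find_best_node(const struct node_cost *c, int n)
{
    int best = c[0].node;
    long min = c[0].cost;
    for (int i = 1; i < n; i++)               /* lowest cost wins */
        if (c[i].cost < min) { min = c[i].cost; best = c[i].node; }
    return best;
}

static void loadlevel_self(const struct node_cost *costs, int n, int self)
{
    int best = find_best_node(costs, n);
    if (best != self)                         /* only move if it helps */
        migrate(best);                        /* the whole process moves;
                                                 nothing stays behind */
}

The key contrast with the MOSIX home-node model is in the last call: after migrate() returns on the new node, there is no residual stub on the old one.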