From: henken <he...@se...> - 2002-01-18 01:39:05
|
I have tracked down an error message that shows up every time bpsh hangs:

Jan 29 12:57:53 master bpmaster: write(ghost): missing process for message type 14 req; to=3,27978 from=1,27978 result=0
Jan 29 16:20:06 master bpmaster: write(ghost): missing process for message type 14 req; to=3,31128 from=1,31128 result=11

Nic
--
Nicholas Henke
Undergraduate - Engineering 2002 -- Senior Architect and Developer
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
From: henken <he...@se...> - 2002-01-17 15:44:02
|
Hey -- I am running 2.4.16 with bproc-3.1.5. I am still using the infamous 'noop.sh' test script to try and break things. I have noticed that the stability is much better, but I have noticed a few things.

The first is that bpsh will seem to hang indefinitely, and if the bpsh is kill -9'd, the program it was executing goes into zombie state. I didn't see any errors in the logs on the nodes or master for that.

The second error I saw did give me an error message, but there was no core dump from bpmaster. Here is the message:

Jan 29 11:19:29 master /usr/sbin/bpmaster: FATAL: assoc_find: invalid pid -11553

After that, bpmaster was dead and of course bpsh would give the usual errors. I have tried using -m on bpmaster to get message traces, but I don't seem to have enough hard drive space to watch those, as it takes around 350K iterations of noop in parallel on 4 processors (1.2 million procs total) to trigger. Any ideas on how to give you more information on what is happening?

Nic
--
Nicholas Henke
Undergraduate - Engineering 2002 -- Senior Architect and Developer
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
From: Erik A. H. <er...@he...> - 2002-01-09 19:33:00
|
I just uploaded BProc version 3.1.5 to SourceForge.

http://sourceforge.net/project/showfiles.php?group_id=24453&release_id=69133

Release notes and change log are attached below.

- Erik

3.1.5
---------------------------------------------------------------------
This release deals with an x86 VMADump problem migrating between machines with different FPU types. This caused a kernel oops when the process touched the FPU (and caused a restore) on the second machine. VMADump now does the restore itself and traps exceptions. In the case of an exception, the moving process will lose its FPU state. This needs some testing with different combinations of FPU types. I say this mostly because the Intel documentation for frstor and fxrstor doesn't list any exceptions or say anything about what is supposed to happen when restoring bogus data.

This release also adds a first take on vrfork (vector rfork), which will make creation of large numbers of processes in a cluster more efficient. vrfork uses a tree-based scheme to distribute the process image while creating a flat process tree.

Changes from 3.1.4 to 3.1.5
* Fixed VMADump FPU restore to trap restore failures and restore a clean FPU state in those cases. This should address (although not really solve) problems migrating between FPU architectures.
* Fixed a master daemon bug that could result in move responses getting lost.
* Added a first take on vrfork (vector rfork) to make creation of large numbers of child processes efficient.
* Cleaned up some lingering goofiness in rfork related to error handling and signals. |
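[Editorial sketch] The announcement above describes vrfork only at a high level, so the call below is a hypothetical sketch: the bproc_vrfork() prototype is assumed (the release notes do not give a signature), and the node list is a placeholder. It only illustrates the idea of creating one child per node in a single call.

    /* Hypothetical sketch of the vector rfork idea described above.
     * The bproc_vrfork() prototype is an assumption, not taken from
     * the release notes; only <sys/bproc.h> and the "one child per
     * node" concept come from the announcement. */
    #include <stdio.h>
    #include <sys/bproc.h>

    int main(void) {
        int nodes[4] = {0, 1, 2, 3};   /* example target node numbers */
        int pids[4];

        /* Assumed interface: create one child on each listed node in a
         * single call, filling pids[] with the children's process IDs.
         * The convention for telling parent and children apart is not
         * described in the announcement and is omitted here. */
        if (bproc_vrfork(4, nodes, pids) < 0) {
            perror("bproc_vrfork");
            return 1;
        }
        return 0;
    }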
From: Erik A. H. <er...@he...> - 2002-01-08 23:49:29
|
On Mon, Jan 07, 2002 at 05:13:06PM -0500, henken wrote:
> I am playing around with some of the bproc C functions, and I have noticed
> that most of them do not work on the nodes if the program is started
> locally on the node instead of through bpsh. Here is the sample program:
> #include <stdio.h>
> #include <sys/bproc.h>
>
> int main() {
>     printf("currnode: %d\n", bproc_currnode());
> }
>
> When invoked via bpsh:
> # bpsh <node> test
> the output is as expected: the right node number is printed.
> However, when I ssh to the node and run it:
> # ssh node<node>
> # ./test
>
> I always get -1 as the output. Is this the correct behavior? If so, how
> would I get the information I am looking for without using bpsh?

This is the correct behavior. The issue here is the process space that test ends up in. The master is the machine that answers current node requests. For the bpsh case, test's request is answered by the master you started it from, and you'll get whatever node number it's on.

If you ssh to the node, test is just another (not remotely managed) process on that node. test will be running in the slave's own process space in that case. In that sense it's running on the front end (node -1) of the process space that machine controls. There's no way for local processes on a slave node to make requests of a master they're not being managed by. Keep in mind that a slave can run many slave daemons, so it wouldn't even be clear who to ask if you could ask a master you weren't being managed by.

I hope that made some sense.

- Erik |
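[Editorial sketch] A small illustration of the behavior Erik describes: the same bproc_currnode() call, with the -1 ("front end of this process space") result handled explicitly. Only bproc_currnode() comes from the thread; the surrounding handling is an added sketch.

    /* Minimal sketch of interpreting bproc_currnode()'s result as
     * described above; bproc_currnode() itself comes from the thread,
     * the handling around it is illustrative. */
    #include <stdio.h>
    #include <sys/bproc.h>

    int main(void) {
        int node = bproc_currnode();

        if (node == -1)
            /* Either we are on the front end, or we are a local (not
             * remotely managed) process on a slave node, e.g. started
             * via ssh rather than bpsh. */
            printf("running on the front end of this process space\n");
        else
            printf("running on node %d\n", node);
        return 0;
    }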
From: henken <he...@se...> - 2002-01-08 22:14:04
|
I am playing around with some of the bproc C functions, and I have noticed that most of them do not work on the nodes if the program is started locally on the node instead of through bpsh. Here is the sample program:

#include <stdio.h>
#include <sys/bproc.h>

int main() {
    printf("currnode: %d\n", bproc_currnode());
}

When invoked via bpsh:

# bpsh <node> test

the output is as expected: the right node number is printed. However, when I ssh to the node and run it:

# ssh node<node>
# ./test

I always get -1 as the output. Is this the correct behavior? If so, how would I get the information I am looking for without using bpsh?

Nic
--
Nicholas Henke
Undergraduate - SEAS '02
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
From: Erik A. H. <er...@he...> - 2001-12-19 20:01:23
|
BProc 3.1.4 is up on SourceForge.

http://sourceforge.net/project/showfiles.php?group_id=24453

Attached below are the change log and release notes...

- Erik

Release notes:

3.1.4
---------------------------------------------------------------------
Lots more bugfixes here. Most of them come from seeing problems running under high load (i.e. 10000+ processes). There are a couple of minor new features and changes as well. See the change log for details.

Change log:

Changes from 3.1.3 to 3.1.4
* Added some code to work around a Linux TCP bug. The symptom of this one is two machines where the machine doing the connect has a TCP connection in the ESTABLISHED state and the other machine has no connection at all. This bug exists in 2.4.13 and 2.4.16. I noticed it when migrating 10000 jobs off the front end machine simultaneously. About 10 or so would end up in that state every time. This was not detected since the connecting end was blocking on a read.
* Added some code to retry connections for "Connection timed out" during moves. This seems necessary under really high load... or maybe if your network sucks.
* Fixed problems where signals would sometimes not be forwarded to remote processes.
* Fixed a possible race in ghost status updates which could lead to not seeing stopped children with wait().
* Cleaned up some locking in the ghost code.
* Backed out the signal bypass stuff. I'm not sure it's broken, but I've seen some weird stuff going on under high load so it's out for now.
* Fixed a possible kernel oops in bproc_rfork.
* Fixed a bpsh IO forwarding race condition. bpsh could fail to close stdin for the remote process if it received EOF on its stdin before the connection from the remote process AND the size of input on bpsh's stdin is zero bytes.
* Added "pingtimeout" to the configuration language... finally. The configuration affects the master and the slave and can be changed at runtime via SIGHUP to the master.
* Added a sysctl interface. This is mostly for debugging right now, but I expect more will go in there in the future.
* Added non-blocking versions of reboot, halt, poweroff. The blocking versions of those calls are now interruptible.
* Fixed a deadlock problem in VMADump. The symptom of this was 'ps' or anything else that read /proc getting hung up.
* Fixed a tiny bug in bplib that kept -d from working. |
From: Erik A. H. <er...@he...> - 2001-12-18 15:08:43
|
On Tue, Dec 18, 2001 at 10:55:39AM +0100, Carlos J. Garcia Orellana wrote:
> One question more: Is it necessary to have libbeostat and libbeomap to use mpi
> with Erik's patch?

No. The patch doesn't address scheduling at all. It's up to mpirun to choose nodes and place the right number of jobs on each node. mpirun can use anything at all to make those decisions.

The mpirun I wrote (which I will probably get out some time after xmas) doesn't do any resource allocation at all. We were hoping to trick somebody... err, I mean encourage somebody to write an open scheduler for a system like this. We'll probably try to allocate a student to that problem some time in the future.

> Last weekend, I've modified libbeostat to run with the 2.4.13 kernel and now
> I have a working mpi 1.2.2 build as an rpm package (using the spec file
> provided by Scyld in the utils dir of the mpich 1.2.2 distribution). It looks
> to work fine.
>
> Are there any problems in using libbeostat?

Nope. Use whatever you want.

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment |
From: Carlos J. G. O. <ca...@ne...> - 2001-12-18 09:56:09
|
One question more: Is it necessary to have libbeostat and libbeomap to use mpi with Erik's patch?

Last weekend, I've modified libbeostat to run with the 2.4.13 kernel and now I have a working mpi 1.2.2 build as an rpm package (using the spec file provided by Scyld in the utils dir of the mpich 1.2.2 distribution). It looks to work fine.

Are there any problems in using libbeostat?

Thanks a lot:
Carlos.

----- Original Message -----
From: "Nicholas Henke" <he...@se...>
To: "Erik Arjan Hendriks" <er...@he...>
Cc: <bpr...@so...>
Sent: Monday, December 17, 2001 11:11 PM
Subject: Re: [BProc] Clustermatic and MPI

> Thanks!! This makes much more sense... and it works nicely without having
> to execute from the master node. I am taking a month off for semester
> break, so I will check for your mpirun script when I get back.
>
> Thanks a ton -- keep up the good work. It is nice to see bproc based stuff
> without seeing it based specifically on Scyld.
>
> Nic
>
> On Mon, 17 Dec 2001, Erik Arjan Hendriks wrote:
> > [SNIP]
> >
> > We saw that here and ended up making our own little modification.
> > Attached below is the MPICH patch. The down side is you need a
> > special mpirun to use this. Unfortunately, since that's a separate
> > piece of code that I wrote from scratch here, there's a procedure to
> > go through to release that.
> >
> > It's a very simple program though. Somebody could rewrite it a LOT
> > faster than I can get it released if they're feeling impatient.
> >
> > The patch just creates a new "external execer" facility. For the
> > program "app" and -np 4, mpirun would fork and bproc_execmove the
> > following:
> >
> > rank 0: app -p4execer 0 4 n-1 45541 ;n5,0;n6,1;n7,1;n10,1
> > rank 1: app -p4execer 1 4 n5 41922
> > rank 2: app -p4execer 2 4 n5 41922
> > rank 3: app -p4execer 3 4 n5 41922
> >
> > -p4execer is the magic argument and it works like this:
> >
> > for rank 0:
> > -p4execer rank jobsize mpirunhost mpirunport procgroup
> >
> > for rank 1+:
> > -p4execer rank jobsize rank0host rank0port
> >
> > The reason rank 0 is special is because it is the job that all the
> > others must connect to in MPI_Init. In order to do that, the others
> > must know what host and port rank 0 is on. mpirun won't know what
> > port to tell the others unless rank 0 tells it. That's why rank zero
> > connects to mpirun and sends its port number. Then mpirun can start
> > all the other jobs with appropriate arguments.
> >
> > The format of the process group argument is:
> >
> > ;host0,0;host1,1;host2,1;host3,1
> >
> > You could just wait for me to get our simple mpirun released, but it
> > probably won't be for a while since I probably can't do it before xmas
> > and the lab is closed for a week then.
> >
> > - Erik
> >
>
> --
> Nicholas Henke
> Undergraduate - SEAS '02
> Liniac Project - University of Pennsylvania
> http://clubmask.sourceforge.net
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> work: 215-873-5149
> cell/home: 215-681-2705
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> There's nothing like good food, good beer, and a bad girl.
>
> _______________________________________________
> BProc-users mailing list
> BPr...@li...
> https://lists.sourceforge.net/lists/listinfo/bproc-users
> |
From: Nicholas H. <he...@se...> - 2001-12-17 22:12:09
|
Thanks!! This makes much more sense... and it works nicely without having to execute from the master node. I am taking a month off for semester break, so I will check for your mpirun script when I get back.

Thanks a ton -- keep up the good work. It is nice to see bproc based stuff without seeing it based specifically on Scyld.

Nic

On Mon, 17 Dec 2001, Erik Arjan Hendriks wrote:
[SNIP]
> We saw that here and ended up making our own little modification.
> Attached below is the MPICH patch. The down side is you need a
> special mpirun to use this. Unfortunately, since that's a separate
> piece of code that I wrote from scratch here, there's a procedure to
> go through to release that.
>
> It's a very simple program though. Somebody could rewrite it a LOT
> faster than I can get it released if they're feeling impatient.
>
> The patch just creates a new "external execer" facility. For the
> program "app" and -np 4, mpirun would fork and bproc_execmove the
> following:
>
> rank 0: app -p4execer 0 4 n-1 45541 ;n5,0;n6,1;n7,1;n10,1
> rank 1: app -p4execer 1 4 n5 41922
> rank 2: app -p4execer 2 4 n5 41922
> rank 3: app -p4execer 3 4 n5 41922
>
> -p4execer is the magic argument and it works like this:
>
> for rank 0:
> -p4execer rank jobsize mpirunhost mpirunport procgroup
>
> for rank 1+:
> -p4execer rank jobsize rank0host rank0port
>
> The reason rank 0 is special is because it is the job that all the
> others must connect to in MPI_Init. In order to do that, the others
> must know what host and port rank 0 is on. mpirun won't know what
> port to tell the others unless rank 0 tells it. That's why rank zero
> connects to mpirun and sends its port number. Then mpirun can start
> all the other jobs with appropriate arguments.
>
> The format of the process group argument is:
>
> ;host0,0;host1,1;host2,1;host3,1
>
> You could just wait for me to get our simple mpirun released, but it
> probably won't be for a while since I probably can't do it before xmas
> and the lab is closed for a week then.
>
> - Erik

--
Nicholas Henke
Undergraduate - SEAS '02
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
work: 215-873-5149
cell/home: 215-681-2705
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
From: Erik A. H. <er...@he...> - 2001-12-17 18:09:20
|
On Sun, Dec 16, 2001 at 05:21:39PM -0500, Nicholas Henke wrote:
> Yes, but unfortunately it is specific to Scyld's use of BProc. They have
> hacked in dependencies on beomap and beostatus. To use mpich requires you
> to use these programs instead of another resource manager.

We saw that here and ended up making our own little modification. Attached below is the MPICH patch. The down side is you need a special mpirun to use this. Unfortunately, since that's a separate piece of code that I wrote from scratch here, there's a procedure to go through to release that.

It's a very simple program though. Somebody could rewrite it a LOT faster than I can get it released if they're feeling impatient.

The patch just creates a new "external execer" facility. For the program "app" and -np 4, mpirun would fork and bproc_execmove the following:

rank 0: app -p4execer 0 4 n-1 45541 ;n5,0;n6,1;n7,1;n10,1
rank 1: app -p4execer 1 4 n5 41922
rank 2: app -p4execer 2 4 n5 41922
rank 3: app -p4execer 3 4 n5 41922

-p4execer is the magic argument and it works like this:

for rank 0:
-p4execer rank jobsize mpirunhost mpirunport procgroup

for rank 1+:
-p4execer rank jobsize rank0host rank0port

The reason rank 0 is special is because it is the job that all the others must connect to in MPI_Init. In order to do that, the others must know what host and port rank 0 is on. mpirun won't know what port to tell the others unless rank 0 tells it. That's why rank zero connects to mpirun and sends its port number. Then mpirun can start all the other jobs with appropriate arguments.

The format of the process group argument is:

;host0,0;host1,1;host2,1;host3,1

You could just wait for me to get our simple mpirun released, but it probably won't be for a while since I probably can't do it before xmas and the lab is closed for a week then.

- Erik |
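[Editorial sketch] A rough sketch of how a launcher might build those argument vectors, following the -p4execer layout Erik describes. bproc_execmove is named in the mail but its prototype is not shown there, so the four-argument form below is an assumption; the program name, node list, hosts, and ports are placeholders. The real mpirun also waits for rank 0 to report its listen port before starting ranks 1+, which this sketch skips.

    /* Illustrative only: builds -p4execer argv's in the layout above.
     * The bproc_execmove() prototype is assumed, and "app", the node
     * list, hosts, and ports are placeholders. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/bproc.h>

    int main(void) {
        int np = 4;
        int nodes[4] = {5, 5, 6, 7};              /* placeholder node list */

        for (int r = 0; r < np; r++) {
            if (fork() == 0) {                    /* child: becomes rank r */
                char rank[16], size[16];
                snprintf(rank, sizeof(rank), "%d", r);
                snprintf(size, sizeof(size), "%d", np);

                /* rank 0: app -p4execer rank jobsize mpirunhost mpirunport procgroup */
                char *argv0[] = {"app", "-p4execer", rank, size,
                                 "n-1", "45541", ";n5,0;n5,1;n6,1;n7,1", NULL};
                /* rank 1+: app -p4execer rank jobsize rank0host rank0port
                 * (in the real scheme the port comes from rank 0's report) */
                char *argvN[] = {"app", "-p4execer", rank, size,
                                 "n5", "41922", NULL};

                /* Assumed prototype: bproc_execmove(node, path, argv, envp). */
                if (bproc_execmove(nodes[r], "app",
                                   r == 0 ? argv0 : argvN, NULL) < 0) {
                    perror("bproc_execmove");
                    _exit(1);
                }
            }
        }
        return 0;
    }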
From: Nicholas H. <he...@se...> - 2001-12-16 22:22:29
|
Yes, but unfortunately it is specific to Scyld's use of BProc. They have hacked in dependencies on beomap and beostatus. To use mpich requires you to use these programs instead of another resource manager.

Nic

...snip...
> The changes introduced in Scyld MPI are now in standard MPICH. So get
> the newest mpich from
> http://www-unix.mcs.anl.gov/mpi/mpich
> and build it.
>
> In other words, standard mpich now has support for bproc. |
From: J.A. M. <jam...@ab...> - 2001-12-16 22:09:56
|
On 20011216 Jag wrote:
> On Sun, 16 Dec 2001, Carlos J. Garcia Orellana wrote:
>
>> Hello,
>>
>> Can I use mpich with clustermatic?
>
> Clustermatic uses a setup very similar to Scyld's, and Scyld ships a
> version of mpich modified to work nicely with BProc. You might need to
> make some minor changes as Scyld's mpich was modified for BProc 2.2 and
> clustermatic uses BProc 3 (which has a slightly different API), but
> other than that Scyld's MPICH should compile and work on a clustermatic
> system.
>
> If you want a supported and out of the box system using BProc, you can
> also look at Scyld Beowulf, http://www.scyld.com/

The changes introduced in Scyld MPI are now in standard MPICH. So get the newest mpich from

http://www-unix.mcs.anl.gov/mpi/mpich

and build it. In other words, standard mpich now has support for bproc.

--
J.A. Magallon                          # Let the source be with you...
mailto:jam...@ab...
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.17-rc1-beo #1 SMP Fri Dec 14 09:58:53 CET 2001 i686 |
From: Jag <ag...@li...> - 2001-12-16 15:26:07
|
On Sun, 16 Dec 2001, Carlos J. Garcia Orellana wrote:
> Hello,
>
> Can I use mpich with clustermatic?

Clustermatic uses a setup very similar to Scyld's, and Scyld ships a version of mpich modified to work nicely with BProc. You might need to make some minor changes, as Scyld's mpich was modified for BProc 2.2 and clustermatic uses BProc 3 (which has a slightly different API), but other than that Scyld's MPICH should compile and work on a clustermatic system.

If you want a supported and out of the box system using BProc, you can also look at Scyld Beowulf, http://www.scyld.com/ |
From: Carlos J. G. O. <ca...@ne...> - 2001-12-16 01:28:00
|
Hello, Can I use mpich with clustermatic? Carlos. |
From: Nicholas H. <he...@se...> - 2001-12-14 21:44:09
|
We are writing a pam_bproc module here that would allow us to control ssh/rsh logins via the bproc access control. We are pretty sure that this is a straightforward application, as you just check the nodeinfo for the current node and compare the user and group. Has anyone already done this? Any feature requests?

Nic
--
Nicholas Henke
Undergraduate - SEAS '02
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
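[Editorial sketch] A hypothetical sketch of the pam_bproc idea. The PAM entry point and pam_get_user() are standard; the bproc_nodeinfo() call, its structure name and fields, and the permission comparison are assumptions made only to illustrate "check the nodeinfo for the current node and compare the user and group".

    /* Hypothetical pam_bproc sketch; bproc_nodeinfo() and its fields
     * are assumed for illustration and not taken from the thread. */
    #define PAM_SM_ACCOUNT
    #include <pwd.h>
    #include <security/pam_modules.h>
    #include <sys/bproc.h>

    int pam_sm_acct_mgmt(pam_handle_t *pamh, int flags, int argc, const char **argv)
    {
        const char *user;
        struct passwd *pw;
        struct bproc_node_info_t info;      /* assumed structure name */
        int node = bproc_currnode();
        (void)flags; (void)argc; (void)argv;

        if (pam_get_user(pamh, &user, NULL) != PAM_SUCCESS || !(pw = getpwnam(user)))
            return PAM_USER_UNKNOWN;
        if (node < 0)                       /* not a bproc-managed node */
            return PAM_SUCCESS;

        if (bproc_nodeinfo(node, &info) == 0 &&    /* assumed call */
            (info.user == pw->pw_uid || info.group == pw->pw_gid || pw->pw_uid == 0))
            return PAM_SUCCESS;             /* node owner, group, or root */

        return PAM_PERM_DENIED;
    }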
From: Nicholas H. <he...@se...> - 2001-12-14 21:39:37
|
On Fri, 14 Dec 2001, Erik Arjan Hendriks wrote:
> The bpsh errors are probably just a symptom of other stuff going
> wrong. The bpmaster crash is probably the best place to start here.
> It should produce a core dump at the point where it dies with that
> message. That core file would be useful to look at.
>
> 1) Build bpmaster with debugging turned on (-g).
> 2) Run it w/ core dumps enabled.
> 3) Run it w/ message trace enabled ("-m filename").
>
> If you can reproduce the crash with all of that, we should have a good
> point to start from to try and figure out what happened.
>
> I've been running your test script for a few hours here and haven't
> had any trouble.
>
> - Erik

I am running the test again here, with all of the suggestions above. I do believe it took several hours to fail last time, somewhere around 200K iterations per processor, or around 1 million total. I will email you the results if/when it fails again.

Nic |
From: Erik A. H. <er...@he...> - 2001-12-14 20:12:24
|
On Thu, Dec 13, 2001 at 09:22:40PM -0500, henken wrote:
..snip..
> Iteration: 205859 on node: 1
> bpsh: invalid pid -29611
> bpsh: invalid pid -29611
> bpsh: invalid pid -29611
> bpsh: Child process exit abnormally.
> Iteration: 207649 on node: 0
> bproc_nodelist: Input/output error

This is very weird. Since bpsh is getting that bogus PID back, it implies that the remote process got that PID assigned to it. Which says "problem with move" to me. A very, very weird problem though. Like maybe two procs with the same pid are trying to move to the same node at the same time. Either that, or data corruption, or the slave malfunctioning in some weird way.

> I also found this in /var/log/messages:
> Dec 13 16:05:39 master /usr/sbin/bpmaster: FATAL: assoc_find: invalid pid
> -29611
>
> Am I expecting too much of bproc?

Nope. Absolutely not. This kind of stress testing is how you weed out the hard-to-reproduce and harder-to-find bugs.

> I know I am being ridiculously evil in running this test script, but
> we have seen situations similar to this when our users run jobs that
> fail at the onset and they do not exit cleanly.

The bpsh errors are probably just a symptom of other stuff going wrong. The bpmaster crash is probably the best place to start here. It should produce a core dump at the point where it dies with that message. That core file would be useful to look at.

1) Build bpmaster with debugging turned on (-g).
2) Run it w/ core dumps enabled.
3) Run it w/ message trace enabled ("-m filename").

If you can reproduce the crash with all of that, we should have a good point to start from to try and figure out what happened.

I've been running your test script for a few hours here and haven't had any trouble.

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment |
From: henken <he...@se...> - 2001-12-14 02:23:07
|
Hello -- I have upgraded to the ~stock kernel shipped on the clustermatic CD, and have re-run my stress test. In case they are not remembered, here are the test programs:

[root@master /tmp]# more /home/henken/cvs/jobs/noop.c
#include <unistd.h>
#include <stdio.h>

int main() {
}
[root@master /tmp]#

and the job script:

[root@master /tmp]# more /home/henken/cvs/jobs/noop.sh
#!/bin/bash
JOBID=${0##/*/}
/bin/echo "JOBID:$JOBID"
NODES=`/usr/local/clubmask-0.5a2/bin/getnodes $JOBID`
/bin/echo "NODES:$NODES"
for node in $NODES; do
    (
    let count=0
    while [ $count -le $1 ]; do
        /bin/echo "Iteration: $count on node: $node"
        bpsh $node /home/henken/cvs/jobs/bin/noop
        let count=count+1
    done
    ) &
done

I have been running these with 2 SMP nodes, which each get 2 processes, for a total of 4 instances of the while loop at the same time. I have been running with a $1 (the number of iterations) around 1 million. I have seen other kernel problems related to RH's patching of the kernel, but now with the stock kernel I am getting bpmaster, bpslave, and bpsh failures. Here are the captures of the messages I could find related to the error.

In the stdout capture from the job script:

[SNIP]
Iteration: 205859 on node: 1
bpsh: invalid pid -29611
bpsh: invalid pid -29611
bpsh: invalid pid -29611
bpsh: Child process exit abnormally.
Iteration: 207649 on node: 0
bproc_nodelist: Input/output error

I also found this in /var/log/messages:

Dec 13 16:05:39 master /usr/sbin/bpmaster: FATAL: assoc_find: invalid pid -29611

Am I expecting too much of bproc? I know I am being ridiculously evil in running this test script, but we have seen situations similar to this when our users run jobs that fail at the onset and they do not exit cleanly.

Thanks for any and all help, and I am fully able to send more info / test other options.

Nic
--
Nicholas Henke
Undergraduate - SEAS '02
Liniac Project - University of Pennsylvania
http://clubmask.sourceforge.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
work: 215-873-5149
cell/home: 215-681-2705
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's nothing like good food, good beer, and a bad girl. |
From: Nicholas H. <he...@se...> - 2001-12-11 02:54:34
|
I wouldn't mind helping out with the packaging and testing of the new versions, as I would be using them a bunch. I am hoping to do this, as I would like to have a standard platform of SRPMS to use when testing new versions of BProc. As of now, we are running on top of RH 7.1, and you can see the mess that creates, as per the kernel OOPS I have had with that.

Nic

On Mon, 10 Dec 2001, Erik Arjan Hendriks wrote:
> On Mon, Dec 10, 2001 at 08:42:02PM -0500, Nicholas Henke wrote:
> > Will the SRPMS/RPMS that are on the CD be available in updated form on
> > the web? That is to ask, will you post new kernel and bproc RPMS when there
> > are new versions available?
>
> Yes - when new versions actually get packaged up. I don't plan on
> doing this for every kernel or every patch level of BProc. That would
> easily consume all of my time. The hope is to do something like this
> every few (3 or 4?) months.
>
> Building and testing the kernel packages in particular is hugely time
> consuming given the multiple platforms.
>
> I can probably (need to check policies) post binaries here if anybody
> else wants to get involved in updating packages. It's not much fun
> though...
>
> - Erik |
From: Erik A. H. <er...@he...> - 2001-12-11 02:43:32
|
On Mon, Dec 10, 2001 at 08:42:02PM -0500, Nicholas Henke wrote:
> Will the SRPMS/RPMS that are on the CD be available in updated form on
> the web? That is to ask, will you post new kernel and bproc RPMS when there
> are new versions available?

Yes - when new versions actually get packaged up. I don't plan on doing this for every kernel or every patch level of BProc. That would easily consume all of my time. The hope is to do something like this every few (3 or 4?) months.

Building and testing the kernel packages in particular is hugely time consuming given the multiple platforms.

I can probably (need to check policies) post binaries here if anybody else wants to get involved in updating packages. It's not much fun though...

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment |
From: Nicholas H. <he...@se...> - 2001-12-11 01:42:49
|
Will the SRPMS/RPMS that are on the CD be available in updated form on the web? That is to ask, will you post new kernel and bproc RPMS when there are new versions available?

Nic

On Mon, 10 Dec 2001, Erik Arjan Hendriks wrote:
> FYI,
>
> The ISO image and files from the clustermatic CD that the LANL people
> and I were handing out at ALS and SC '01 are available online now at:
>
> http://www.clustermatic.org/
>
> The download links are at the bottom of the page. Oh, and I know the
> MIME types are a little screwed up on our web server :)
>
> - Erik |
From: Erik A. H. <er...@he...> - 2001-12-11 01:07:20
|
FYI,

The ISO image and files from the clustermatic CD that the LANL people and I were handing out at ALS and SC '01 are available online now at:

http://www.clustermatic.org/

The download links are at the bottom of the page. Oh, and I know the MIME types are a little screwed up on our web server :)

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment |
From: Erik A. H. <er...@he...> - 2001-12-10 00:42:27
|
On Mon, Dec 10, 2001 at 01:29:38AM +0100, J.A. Magallon wrote:
> Hi all...
>
> I am trying to modify the patch included with bproc to apply on 2.4.16
> and 17-pre. All is offset correction except the places where bproc touches
> ptrace (kernel/ptrace.c, arch/i386/kernel/ptrace.c). There is a new
> function ptrace_check_attach() and the changes introduced by bproc seem to
> go inside it, but need vars from outside.
>
> Any suggestions?

Yup. Patch attached. Functionally, it should be identical to the old code...

Basically, the extra information check_attach needs goes in current->bproc.arg. It's a bit of scratch space I've used a few times to pass information down through a few function calls. It's definitely an ugly hack, but it prevents me from having to change a whole slew of function prototypes.

I haven't beaten on this one too much yet (no alpha testing) but it seems ok on x86.

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment |
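[Editorial sketch] A rough illustration of the scratch-space trick Erik describes. Only the current->bproc.arg field comes from his mail; the function names, the value being stashed, and its type are invented here to show the shape of the pattern, not the actual patch.

    /* Sketch of passing a value "down" to a callee without changing its
     * prototype, per the current->bproc.arg idea above. Everything
     * except the bproc.arg field itself is invented for illustration. */
    #include <linux/sched.h>

    static int callee_deep_in_the_chain(void)
    {
        /* Reads the stashed value instead of receiving it as a
         * parameter, so intermediate prototypes stay unchanged. */
        return current->bproc.arg != 0;
    }

    long hypothetical_entry_point(long flag)
    {
        current->bproc.arg = flag;     /* stash before descending */
        return callee_deep_in_the_chain() ? 0 : -1;
    }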
From: J.A. M. <jam...@ab...> - 2001-12-10 00:30:00
|
Hi all...

I am trying to modify the patch included with bproc to apply on 2.4.16 and 17-pre. All is offset correction except the places where bproc touches ptrace (kernel/ptrace.c, arch/i386/kernel/ptrace.c). There is a new function ptrace_check_attach() and the changes introduced by bproc seem to go inside it, but need vars from outside.

Any suggestions?

--
J.A. Magallon                          # Let the source be with you...
mailto:jam...@ab...
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.17-pre6 #1 SMP Sun Dec 9 21:55:48 CET 2001 i686 |
From: J.A. M. <jam...@ab...> - 2001-12-09 23:58:08
|
Hi.

This patch changes the includes of linux/malloc.h to linux/slab.h, to make the bproc modules compliant with the new kernel policy. It is backwards compatible, so no ifdef'ing is needed. Please apply.

(NOTE: it also adds a couple of MODULE_LICENSEs.)

--
J.A. Magallon                          # Let the source be with you...
mailto:jam...@ab...
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.17-pre6 #1 SMP Sun Dec 9 21:55:48 CET 2001 i686 |
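[Editorial sketch] For readers unfamiliar with the change being described, this is roughly what it looks like in a module source file. The module below is a generic placeholder, not the actual bproc sources; only the slab.h include and the MODULE_LICENSE tag reflect the patch described in the mail.

    /* Generic example of the two changes described above; the
     * surrounding module is a placeholder, not actual bproc code. */
    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/errno.h>
    #include <linux/slab.h>        /* was: #include <linux/malloc.h> */

    MODULE_LICENSE("GPL");         /* newly added MODULE_LICENSE tag */

    static char *buf;

    static int __init example_init(void)
    {
        buf = kmalloc(64, GFP_KERNEL);   /* kmalloc() now comes via slab.h */
        return buf ? 0 : -ENOMEM;
    }

    static void __exit example_exit(void)
    {
        kfree(buf);
    }

    module_init(example_init);
    module_exit(example_exit);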